Bug 290318 - need a means to represent dead symbolic links in resource tree
Summary: need a means to represent dead symbolic links in resource tree
Status: REOPENED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Resources (show other bugs)
Version: 3.4.2   Edit
Hardware: All Linux
: P3 normal (vote)
Target Milestone: ---   Edit
Assignee: Platform-Resources-Inbox CLA
QA Contact:
URL:
Whiteboard: stalebug
Keywords:
: 443409 (view as bug list)
Depends on:
Blocks: 380337
  Show dependency tree
 
Reported: 2009-09-23 15:48 EDT by John Camelon CLA
Modified: 2020-05-11 19:07 EDT (History)
10 users (show)

See Also:


Attachments
Patch_v01 (25.42 KB, patch)
2009-10-19 12:00 EDT, Pawel Pogorzelski CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description John Camelon CLA 2009-09-23 15:48:00 EDT
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3 GTB5 (.NET CLR 3.5.30729)
Build Identifier: 3.4.2 Build id: M20090211-1700

In providing support for symbolic in links in Jazz Source Control, we have encountered the following problem:  if a symbolic link does not resolve in the file system, the Eclipse resource model skips over it.  This leads to problems in manipulating the containing folder using the Resource API due to the underlying skew between the resource model and the filesystem. 

What we would need is a way of describing that a resource is a dead link so that operations which affect the resource tree can work consistently.  

Reproducible: Always
Comment 1 Pawel Pogorzelski CLA 2009-10-19 12:00:36 EDT
Created attachment 149890 [details]
Patch_v01

A rough POC showing the idea of the proposed solution.

I've changed the implementation of IResource.exists() which now returns true for broken symlinks. This way resource layer matches disk structure more accurately and no resources are hidden from clients.

The proposal changes the contract on relation between IResource.exists() and IResource.isAccessible(). As of now these methods return always (but for projects) the same value. With the patch applied broken symlinks are returned as existing but not accessible. This change introduces problems since clients often relay only on IResource.exists() and in such case they get an exception while reaching for the bytes. Fortunately the exceptions are usually handled well by clients since it's an IO operation.

The other way to handle the problem is to return false from IResource.exists() and carry information about the symlink in a different manner. I talked to Szymon and we dropped the idea as it brings confusion and also requires clients to adapt.

The proposed change would also encourage clients to distinguish existing resources from accessible ones. This way potentially other cases where file's content can't be loaded would be easy to handle (on Resources internal side).

Please comment, however keep in mind that the patch is not completed by any means.
Comment 2 John Arthorne CLA 2009-10-19 13:59:16 EDT
I haven't had a chance to look at the patch, but the direction sounds promising. Although this redefines the meaning of "accessible" for resources, it is a reasonable description of these resources: the link exists but the target is not accessible because the target is missing. Clearly clients need some way to distinguish the dead symbolic link case in case they want to do some special handling. Clients could detect this already by querying the underlying IFileInfo for ATTRIBUTE_LINK_TARGET and checking existence of the target themselves, but some easier way to detect this would be useful.

A similar case is linked resources for which the link target is undefined (generally because it is relative to an undefined path variable). I expect dead symlinks would behave much link broken linked resources behave today: they exist in the tree but attempts to perform I/O on the resource will fail.

Adding Martin to CC because he has a lot of experience with symlink issues. Any experience with handling "dead" symbolic links Martin, or concerns about making them present in the resource model?
Comment 3 Szymon Brandys CLA 2009-10-20 05:07:35 EDT
(In reply to comment #2)
I discussed it with Pawel and it seems we can do this in two steps. 

First is to have dead symlinks in the tree. They will be similar to broken linked resources (exist=true, accessible=true, getContent says that file cannot be found). I guess our client would not be affected by this change.

Second step is to redefine "accessible" and maybe "local" to give (as John wrote) clients some way to distinguish the dead symbolic link or broken linked resources and do some special handling.
Comment 4 Martin Oberhuber CLA 2009-10-20 08:03:24 EDT
This is not an easy topic to think about.

On the low level (plain C) OS side, there are two separate functions to check existence of a symbolic link or its target respectively:
   lstat() -- returns properties of the link
   stat()  -- returns properties of the item pointed to by the link.
The former would return true for exists, the latter returns false for exists.

In most cases, (java.io.File and most applications), the stat() semantics is being used. People like to think about symbolic links as being fully transparent: that is, any operation on a link such as reading its contents or reading its attributes actually read from the item pointed to by the link. If you look at "symbolic links" in Wikipedia, they also highlight this transparency aspect of the link: unless you look at a link on the lowest level, it should be considered fully transparent by higher level applications. The only notable exceptions are the "move" and "delete" operations which typically move or delete the link itself (because they really operate on the folder containing the link, and not the link itself).

In my understanding, Eclipse has also been using stat() semantics from its very beginning, consistently everywhere.

Because of this, I am concerned about changing the semantics of IResource#exists() to return existence of the link rather than the item it points to. Making broken links visible in the resource tree has quite wide impact -- loading such items in an editor must be handled, semantics of update / refresh must be considered (what if the item pointed to by the link is created? The "exists" property would not change, how to report the resourceDelta?), what happens when exporting into an archive, what would the icon for broken symbolic links look like (see also bug 209190) etc etc.

I wouldn't want to completely push the current direction off the table immediately, and unfortunately I don't have time at the moment to think about this in all its details. On first thought, I'd be more comfortable with new API for explicitly querying link properties versus the link target properties similar to the lstat() versus stat() calls so I'm wondering whether this has been explored already.

Long story short, rather than fiddeling with the semantics of IResource#exists() I'd prefer exploring other solutions in more depth. Changing API semantics after the fact is never a good idea. But, again, I haven't thought this through to the end.
Comment 5 Martin Oberhuber CLA 2009-10-20 08:05:24 EDT
PS Reading up "Windows symbolic links" on Wikipedia is also quite informative, since it talks about how the Windows programmers managed to mess up the topic.
Comment 6 Pawel Pogorzelski CLA 2009-10-20 09:56:44 EDT
(In reply to comment #4)
> This is not an easy topic to think about.

Right.

> In most cases, (java.io.File and most applications), the stat() semantics is
> being used. People like to think about symbolic links as being fully
> transparent: that is, any operation on a link such as reading its contents or
> reading its attributes actually read from the item pointed to by the link.

True. Eclipse however (in contrast to java.io) doesn't handle symbolic links transparently. We have ResourceAttributes.isSymbolicLink() on Resources layer and information on link's target in IFileInfo. Moreover we have finer information when querying resources provided by IResource.exists() and IResource.isAccessible() (and IResource.isLocal() which is deprecated).

The way we handle broken links now isn't very graceful. Copying, removing or renaming a folder with broken links ends with an exception. Removing and restoring a target causes markers loss. Couldn't we handle such files like OS itself by marking them inaccessible (since it's usually a temporary state).

Martin, thanks for your valuable comment. I'll investigate other issues you mentioned.
Comment 7 Martin Oberhuber CLA 2009-10-20 12:01:27 EDT
(In reply to comment #6)
> True. Eclipse however (in contrast to java.io) doesn't handle symbolic links
> transparently. We have ResourceAttributes.isSymbolicLink() on Resources layer
> and information on link's target in IFileInfo.

I think that you mix up "transparent" with "distinguishable" here. The two APIs that you mention have been added by me in Eclipse 3.4 -- before that, nobody would know in Eclipse whether an IResource happens to be a link or an actual file. And even with these two additions of mine, accessing the IResource behaves exactly like accessing the file pointed to by it. It *is* still transparent and was always transparent -- what I added in 3.4 made it distinguishable.

My point is that a client who receives "true" for IResource.exists() may safely assume that either isFile() or isFolder() is true. Furthermore, if isFile() is true that he can open a Stream on it and if that throws an Exception, it is because there is a permission denied error. If you break this existing API contract, you risk getting clients that show confusing error messages to the user such as "Failed to open foo: permission denied" although foo happened to be a broken symbolic link.

I am all in favor of showing the actual state of the file system in UI views such as the Project Explorer -- including overlay icons for symbolic links, including the link target in Properties, and including special markup for broken symbolic links. Also, I may still be convinced that changing the semantics of exists() is the right way to go. But before I am convinced I would like to see other options explored in more detail.

> The way we handle broken links now isn't very graceful. Copying, removing or
> renaming a folder with broken links ends with an exception. 

This should be easy enough to fix without changing API.

> Removing and restoring a target causes markers loss. 

I dont understand what you mean by this, can you reference an existing bug for this issue?
Comment 8 John Arthorne CLA 2009-10-20 13:41:30 EDT
Martin, are you specifically concerned about the semantics of IResource#exists(), or about broken links appearing in the workspace at all? To me "exists" means "it is in the workspace/resource tree". I.e., exists() returns true for a resource iff methods like IContainer#findMember and members() will find it. I certainly wouldn't want to change the semantics of IResource#exists() at all. There are various scenarios today where IResource#exists() doesn't match the file system state - non-local resoures, broken linked resources, or most commonly out of sync resources where the file was deleted on disk but we haven't noticed yet. I don't think we've ever interpreted IResource#exists() to mean "exists on disk".

I think the question is whether broken links should be represented as resources at all. Currently they are not - when scanning the local file system we ignore them completely today. This makes our life easier because we hide their problematic behaviour from clients. However the request being made here is that we do represent them in the resource tree so that clients who do care about them can operate on them (hopefully without breaking clients who don't care about them).

While we do have to tread lightly here and consider the issues Martin raised, I'm less worried because we've been through many of the same issues before with broken linked resources. For example we added IResourceDelta#LOCAL_CHANGED for broadcasting a change in existence of link targets and clients already live with cases where IResource#exists doesn't agree with the file system state.
Comment 9 Pawel Pogorzelski CLA 2009-10-20 14:35:58 EDT
(In reply to comment #7)

> I think that you mix up "transparent" with "distinguishable" here.

This is certainly a better world. Thanks.

> My point is that a client who receives "true" for IResource.exists() may safely
> assume that either isFile() or isFolder() is true. Furthermore, if isFile() is
> true that he can open a Stream on it and if that throws an Exception, it is
> because there is a permission denied error.

This is not the only case. Like John wrote there could be an out of sync exception or a regular IO exception. We would just expand the list of cases when it happens.

> Also, I may still be convinced that changing the semantics of exists() is the
> right way to go. But before I am convinced I would like to see other options
> explored in more detail.

I'll explore alternative solutions. As of now this way seems the most consistent because non existing resources are not included in the tree.

> This should be easy enough to fix without changing API.

True.

> > Removing and restoring a target causes markers loss. 
>
> I dont understand what you mean by this, can you reference an existing bug for
> this issue?

There is no bug for this. I just wondered if it's a proper behavior.
Comment 10 Martin Oberhuber CLA 2009-10-21 07:23:42 EDT
(In reply to comment #8)
> To me "exists" means "it is in the workspace/resource tree". I.e., exists()
> returns true for a resource iff methods like IContainer#findMember and
> members() will find it.

Now this is a very good argument and essentially addresses all my concerns. I have been thinking too much on the file system level. In some sense, a broken link can be seen as a resource of different type than folder or file. 

Reading up on the current Javadocs of IResource#exists(), this makes a lot of sense. The current javadocs are very vague with respect to what "exists" means but this interpretation definitely makes sense. In fact, the Javadocs highlight the "type" of resource quite clearly. Making sure that broken links are neither counted as type file nor as type folder, I could imagine that everything falls into place really nicely.
Comment 11 Szymon Brandys CLA 2010-08-30 08:09:44 EDT
Moving the bug to the inbox. Pawel does not work in this area anymore.
Comment 12 John Arthorne CLA 2011-06-30 10:05:05 EDT
Just recording for reference that we ended up fixing a specific instance of the problem, where IResource#move failed on dead sym links. See bug 289637.
Comment 13 Szymon Ptaszkiewicz CLA 2014-09-24 06:06:04 EDT
*** Bug 443409 has been marked as a duplicate of this bug. ***
Comment 14 Andrey Loskutov CLA 2015-02-12 03:20:38 EST
Just wondering if we can get some progress on this one. I've stumbled upon this bug in bug 405326 comment 8.

Any comments on the possible solution? Is it something like patch from Pawel from comment 1 what we want?
Comment 15 Eclipse Genie CLA 2020-02-16 09:11:38 EST
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're closing this bug.

If you have further information on the current state of the bug, please add it and reopen this bug. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.
Comment 16 Dean Roberts CLA 2020-04-27 10:11:46 EDT
We just had a new defect raised against our product, Jazz Source Control, aka RTC, aka EWM.

This is still an issue, and I see from the comments here it causes problems with other Eclipse based projects.

Symbolic links are common for some customers as is not necessarily having the targets loaded.
Comment 17 Lars Vogel CLA 2020-05-11 15:53:01 EDT
Please reopen if the problem still persists.
Comment 18 Dean Roberts CLA 2020-05-11 19:07:54 EDT
The problem still exists on Mac and Linux.  Presumably on Windows as well.