Bug 177994 - Support remote projects using EFS
Summary: Support remote projects using EFS
Status: NEW
Alias: None
Product: CDT
Classification: Tools
Component: cdt-core (show other bugs)
Version: 4.0   Edit
Hardware: PC Windows XP
: P3 enhancement (vote)
Target Milestone: ---   Edit
Assignee: Project Inbox CLA
QA Contact: Jonah Graham CLA
URL:
Whiteboard:
Keywords:
: 142092 186170 196425 206044 (view as bug list)
Depends on: 142092 194578
Blocks:
  Show dependency tree
 
Reported: 2007-03-19 01:39 EDT by Doug Schaefer CLA
Modified: 2020-09-04 15:24 EDT (History)
21 users (show)

See Also:


Attachments
Patch for API changes (57.72 KB, patch)
2008-03-28 15:39 EDT, Chris Recoskie CLA
no flags Details | Diff
EFS work in progress, patch 1 (293.37 KB, patch)
2008-03-28 15:52 EDT, Chris Recoskie CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Doug Schaefer CLA 2007-03-19 01:39:40 EDT
We are getting increasing requests to support remote development where the files associated with a project are located on a remote machine from the host running the CDT. My plan has been to use EFS to do this. This bug captures the discussion regarding the issues that this brings up.

There are cases where parts of the CDT assume a local file system. These would have to be changed to ensure they use IResource. For external files like include files, it should be possible to use EFS directly to get at them. Although it might be nice to have everyting as a IResource.

Include paths are interesting in that they generally point to places outside of the project. Generating the proper URIs based on data from the build system, EFS should be able to handle this for us.

There are a number of options for getting at the remote files. RSE is working on EFS support. Or we could write an EFS adapter directly for something like sftp. But I think the most promising idea would be to have a small remote agent that acts like sftp but adds compression of the files to improve network performance. We could then write something like the extssh cvs client in Eclipse to talk the remote agent. This is something that would be useful for RSE too so the point may be mute.

Here's some numbers that should predict the performance impact of remote files. Using firefox as our standard example. Firefox has 40 Meg of compressed source. Over a conservative 100 KB/sec Internet connection, that would take 7 minutes to download. That number is also conservative since we do not index every file in firefox. Our current reindexing time for Firefox is in the 16 minute range with caching turned on. So that would put us at 23 minutes to reindex firefox. A significant impact, but not intolerable. And the beauty is that this would only need to be done on a reindex. If we write the EFS filesystem corrrectly, we should be able to index incrementally out of a local cache and keep incremental speeds the same as they are today.

Another issue will be remote builds. These should be fairly simple to accomplish with Makefile projects. Simply ssh, or use the aforementioned agent to kick off the build. We would need to tell the error parsers how to deal with remote file paths. We may also be able to get the internal builder for managed projects to work in this environment as well.

Remote debugging would be the same. You could run gdb localling talking to gdbserver remotely, or remotely launch gdb and talk MI over the wire as the HP guys have done. Solutions for other debuggers may vary.

I think this would make a cool environment and should blend into the CDT with minimal change.
Comment 1 Doug Schaefer CLA 2007-03-19 01:41:55 EDT
Obviously, this won't make 4.0. But something we should work towards for 4.1. I have a feeling that this may stretch EFS farther than anyone has done up until now and may need some changes in the next Eclipse release.
Comment 2 Tianchao Li CLA 2007-03-19 05:38:55 EDT
I would like to remind you of my previous patch that address the same issue, which is available from here:
https://bugs.eclipse.org/bugs/show_bug.cgi?id=142092. 
Comment 3 Warren Paul CLA 2007-03-19 12:11:54 EDT
I'm not familiar with the details of the EFS, but I assume it doesn't change the fact that IResource's have to be under a project root.  If so, then I don't think external headers or sources can be IResource's.  I assume we'll still need to use ExternalTranslationUnit's?  So are we just talking about opening project's and/or linking to files/folders on a remote machine?
Comment 4 Doug Schaefer CLA 2007-03-19 12:14:02 EDT
(In reply to comment #3)
> I'm not familiar with the details of the EFS, but I assume it doesn't change
> the fact that IResource's have to be under a project root.  If so, then I don't
> think external headers or sources can be IResource's.  I assume we'll still
> need to use ExternalTranslationUnit's?  So are we just talking about opening
> project's and/or linking to files/folders on a remote machine?

I had sometimes wondered whether we could link in the include path directories into a project. But there are probably a lot of issues with that. So yeah, we would probably continue to use ExternalTranslationUnits for these.
Comment 5 David Inglis CLA 2007-03-19 12:24:28 EDT
(In reply to comment #3)
> I'm not familiar with the details of the EFS, but I assume it doesn't change
> the fact that IResource's have to be under a project root.  If so, then I don't
> think external headers or sources can be IResource's.  I assume we'll still
> need to use ExternalTranslationUnit's?  So are we just talking about opening
> project's and/or linking to files/folders on a remote machine?
> 
EFS is basically I replacement of java.io.File (which Resources now use to access the real files now) so there is nothing stopping EFS from accessing a filesystem outside of the workspace, as long as a URI can be used to access it and a supported EFS implementation is install.

Comment 6 Doug Schaefer CLA 2007-03-19 15:04:44 EDT
Another thing we're going to have to deal with is the binary parser which searches for binaries in the project. We will probably have to turn this off for remote files and make sure the debugger can debug any file.
Comment 7 Doug Schaefer CLA 2007-03-19 15:06:36 EDT
I also need to correct my firefox timings. I just tried the CDT HEAD and we are back down to 7 minutes (woo hoo). With the remote file timing assumption, that would make us at 14 minutes which is still well below the 20 minute requirement.
Comment 8 Martin Oberhuber CLA 2007-07-13 07:27:06 EDT
*** Bug 196425 has been marked as a duplicate of this bug. ***
Comment 9 Martin Oberhuber CLA 2007-07-13 07:28:52 EDT
For background info on C/C++ Remote Development see also
http://dev.eclipse.org/mhonarc/lists/dsdp-tm-dev/msg01369.html
Comment 10 Martin Oberhuber CLA 2007-10-11 12:16:12 EDT
*** Bug 206044 has been marked as a duplicate of this bug. ***
Comment 11 Martin Oberhuber CLA 2007-10-11 12:20:03 EDT
*** Bug 186170 has been marked as a duplicate of this bug. ***
Comment 12 Doug Schaefer CLA 2007-10-15 07:39:44 EDT
*** Bug 142092 has been marked as a duplicate of this bug. ***
Comment 13 Chris Recoskie CLA 2008-03-28 15:39:25 EDT
Created attachment 94048 [details]
Patch for API changes

In response to my discussions with Markus at EclipseCon, I'm attaching a patch which details some changes to public API to allow for EFS support.  Specifically, this patch adds new interfaces in the IPathEntry hierarchy (IPathEntry2 and friends), as well as a new ASTFileLocation2.  The idea behind all these new interfaces is to add the notion of resources having a location URI.  The idea is to get some buy-in for these API changes for the API freeze and flesh out the details later.

I will separately attach my work in progress on converting the scanner/parser/indexer to support EFS.
Comment 14 Chris Recoskie CLA 2008-03-28 15:52:45 EDT
Created attachment 94052 [details]
EFS work in progress, patch 1

I'm attaching my work in progress on converting CDT's scanner/parser/indexer to handle EFS so that people can examine/review it.

Right now the patch is rough around the edges and still requires some cleanup (e.g. I was playing with some different timeout numbers on the testcases at one point and haven't changed everything back yet).  However I wanted to give people a look at it, even in its rough state so that the basic ideas could be validated, and to hopefully give the HSR guys a head start on figuring out what would need to change in the refactoring framework.

The one major change that is going to happen compared to what is in the patch is that Markus and I have agreed that it would be better to not encode URIs as strings in various APIs.  The old APIs had strings for the "filename" that was being handled and it was ambiguous as to what was being handled (a workspace path?  an absolute path? a project relative path?).  So in the patch there is some backwards compatibility broken (especially IASTFileLocation.getFilename()) that will be put back right after I make those changes.  These changes will help make sure that the standalone indexer continues to function as well.

You will notice some of the core tests fail due to deadlocks or other errors in the project description code.  The testcases themselves pass other than the ASTRewriter tests, which are known failures for now.
Comment 15 Markus Schorn CLA 2008-03-31 09:05:46 EDT
The API suggested in the first patch is wrong in that it confuses workspace-paths (IResource.getFullPath()) and locations. Only paths denoting locations need to be changed to URIs.

Please do not commit this patch!

Because the proposal comes very late (last day of API-freeze) and contains major flaws, I don't see how you could make this happen in 5.0.
Comment 16 Chris Recoskie CLA 2008-03-31 13:33:28 EDT
(In reply to comment #15)
> The API suggested in the first patch is wrong in that it confuses
> workspace-paths (IResource.getFullPath()) and locations. Only paths denoting
> locations need to be changed to URIs.
> Please do not commit this patch!
> Because the proposal comes very late (last day of API-freeze) and contains
> major flaws, I don't see how you could make this happen in 5.0.

So if I understand you correctly, you are saying that IPathEntry.getPath() denotes a workspace path?  If so, does it not make sense then to just document that in the javadoc, and then change the API patch so that it does not deprecate this method (since the API would still be valid)?  That seems simple to do...
Comment 17 Markus Schorn CLA 2008-04-01 03:32:30 EDT
(In reply to comment #16)
> So if I understand you correctly, you are saying that IPathEntry.getPath()
> denotes a workspace path?  If so, does it not make sense then to just document
> that in the javadoc, and then change the API patch so that it does not
> deprecate this method (since the API would still be valid)?  That seems simple
> to do...
Documenting the entire path-entry stuff certainly makes sense. Looking on how those path-entries are constructed I conclude that IPathEntry.getPath() returns a workspace-path (not a location).

However, for supporting EFS we do need to make API changes (e.g.: IIncludePathEntry, IASTFileLocation). Because we are beyond API-freeze and the API is not even figured out at this point, my thinking is that these changes should no longer be done in 5.0.

It'd be good to here other committers opinon on this.
Comment 18 Emanuel Graf CLA 2008-04-01 08:17:29 EDT
I applied your patches to my local workspace and was looking for the reason for the failing ASTRewriter tests.
The reason I found for the failing tests is that IASTNode.getContainingFilename() now returns an URI string with a leading file:/ and no longer a path. Is this intended? because the documentation of this method is still the same.
Comment 19 Anton Leherbauer CLA 2008-04-01 08:47:35 EDT
(In reply to comment #16)
> So if I understand you correctly, you are saying that IPathEntry.getPath()
> denotes a workspace path?  If so, does it not make sense then to just document
> that in the javadoc, and then change the API patch so that it does not
> deprecate this method (since the API would still be valid)?  That seems simple
> to do...
> 

It is a bit more difficult. You should also remove IPathEntry2.getLocationURI() and add URI getters for those entry types that can refer to file system locations, like e.g. IIncludeEntry.

One missing piece is also an extension interface for IScannerInfo to allow for URIs. It is contained in the second patch (IExtendedScannerInfo2), but not in the API patch. 
This would be required to properly feed the parser with URIs instead of opaque Strings (to avoid problems like in comment 18). 

But, anyway, I agree with Markus that it is very late for such a change.
Comment 20 Doug Schaefer CLA 2008-04-01 09:05:43 EDT
BTW, we are not past API freeze. The M6 build has moved to Wednesday.
Comment 21 Chris Recoskie CLA 2008-04-01 13:04:55 EDT
(In reply to comment #18)
> I applied your patches to my local workspace and was looking for the reason for
> the failing ASTRewriter tests.
> The reason I found for the failing tests is that
> IASTNode.getContainingFilename() now returns an URI string with a leading
> file:/ and no longer a path. Is this intended? because the documentation of
> this method is still the same.

This was a side effect of what I was doing.  I think we will need new methods here.  Given that the javadoc for this method says it is intended to be lightweight, we may need to give some careful thought as to what we do here.  It may or may not be too heavyweight to have methods returning IPath or URI here, I don't yet know.

Since as far as I know the AST location (IASTNode.getFileLocation()) is already calculated though, I wonder if we can just use that and stay away from getContainingFilename() other than for lightweight equality checks and the like.  It seems to me that these two methods should be referencing the same file, but the comment on getContainingFilename() isn't clear on that.
Comment 22 Doug Schaefer CLA 2008-04-01 13:36:32 EDT
(In reply to comment #21)
> This was a side effect of what I was doing.  I think we will need new methods
> here.  Given that the javadoc for this method says it is intended to be
> lightweight, we may need to give some careful thought as to what we do here. 
> It may or may not be too heavyweight to have methods returning IPath or URI
> here, I don't yet know.
> 
> Since as far as I know the AST location (IASTNode.getFileLocation()) is already
> calculated though, I wonder if we can just use that and stay away from
> getContainingFilename() other than for lightweight equality checks and the
> like.  It seems to me that these two methods should be referencing the same
> file, but the comment on getContainingFilename() isn't clear on that.

This episode points out a serious problem. The APIs we do have are in worse shape than I thought yet people are using them and making assumptions based on the way they work now. I have the fear in me that the changes being proposed could produce a big impact on the rest of the DOM and we could end up fighting bugs for the rest of this release and I'm not sure we have the resource committments in place to do that.

I think it's time to take a step back and learn from our "New Project Model" experience and not rush in this change at the end of the cycle. The DOM is an area where we want more plug-in providers to be participating so we need to make sure that we have the right APIs in place. And we need to do it with the some rigor and community participation. And we need to make sure we are in agreement before proceeding. And unfortunately that will take time.
Comment 23 Chris Recoskie CLA 2008-04-01 13:56:58 EDT
In reply to comment #19)
> It is a bit more difficult. You should also remove IPathEntry2.getLocationURI()
> and add URI getters for those entry types that can refer to file system
> locations, like e.g. IIncludeEntry.
> One missing piece is also an extension interface for IScannerInfo to allow for
> URIs. It is contained in the second patch (IExtendedScannerInfo2), but not in
> the API patch. 
> This would be required to properly feed the parser with URIs instead of opaque
> Strings (to avoid problems like in comment 18). 
> But, anyway, I agree with Markus that it is very late for such a change.

Ok, based on the comments of Markus and Toni and Emanuel, I think we should wait too and do this right.  It will take time to hammer out all the discussion, and it's a bad idea to rush it and make a mistake.

For IBM's part, we don't need the parsing stuff to work with EFS as we will be using the standalone indexer, so this doesn't really impact us.  I was working on this solely for the community's benefit.  I think it's still important for the community so I think we should not give up on it entirely.

I think if this is to happen though that we need to keep progressing as time goes on, and the best catalyst for discussion seems to be code.  So I think the thing to do is to put my work in progress in a sandbox branch and then we can hack at it over the summer when we have more time, and figure out how things would need to change before later promoting it into HEAD sometime after 5.0 (for 5.1/6.0 or whatever you want to call it) when we are happy with it.  The discussion needs to continue early in the release cycle though or else we will be a victim of time again next year.

I still plan on making any changes we need to enable creating a project on an EFS filesystem without major errors derailing the project creation (which has started happening again lately even though I had it working before), and then being able to open the files.  I am not really seeing a need in these cases for any API breakage.  These changes are going to be my immediate priority, as we at IBM still are going to need that much working for Ganymede.
Comment 24 Emanuel Graf CLA 2008-04-01 15:01:40 EDT
(In reply to comment #23)

> Ok, based on the comments of Markus and Toni and Emanuel, I think we should
> wait too and do this right.  It will take time to hammer out all the
> discussion, and it's a bad idea to rush it and make a mistake.
> 
> I think if this is to happen though that we need to keep progressing as time
> goes on, and the best catalyst for discussion seems to be code.  So I think the
> thing to do is to put my work in progress in a sandbox branch and then we can
> hack at it over the summer when we have more time, and figure out how things
> would need to change before later promoting it into HEAD sometime after 5.0
> (for 5.1/6.0 or whatever you want to call it) when we are happy with it.  The
> discussion needs to continue early in the release cycle though or else we will
> be a victim of time again next year.
> 

I'd like to help with this and I'll try to get the permission to spend some time on this during the summer semester break.
Comment 25 Chris Recoskie CLA 2008-04-04 14:05:51 EDT
I checked in some changes to get EFS projects working and to allow EFS resources to work better in the editor.  You can now create managed or makefile projects successfully on a non-local EFS filesystem and open the files.

- fixed a bunch of NPEs in the build system when IProject.getLocation() returned null
- created a new IStorage implementation (EFSFileStorage) that can be backed with an EFS file.
- added some utility methods that deal with URIs
Comment 26 Martin Oberhuber CLA 2008-04-04 14:32:10 EDT
Cool! What's the branch name?
Comment 27 Chris Recoskie CLA 2008-04-04 15:11:55 EDT
(In reply to comment #26)
> Cool! What's the branch name?

The stuff I just checked in was to HEAD.  It does not include the indexer stuff.  I haven't yet made a branch for the indexer stuff but plan to next week.

Please note, just to be clear, that the builders will not work yet if you actually try to do a build.  They just won't crash when you create the project.
Comment 28 Martin Oberhuber CLA 2008-04-04 15:24:37 EDT
So your checkin fixes the implementation but doesn't change any API, right?
Does that help with EDITING EFS-backed files in any way? I suppose I wouldn't get an Outline view, right?

Getting Editor + Outline view working would be helpful already even if the rest of the indexer wouldn't work properly.
Comment 29 Chris Recoskie CLA 2008-04-04 15:36:36 EDT
(In reply to comment #28)
> So your checkin fixes the implementation but doesn't change any API, right?
> Does that help with EDITING EFS-backed files in any way? I suppose I wouldn't
> get an Outline view, right?
> Getting Editor + Outline view working would be helpful already even if the rest
> of the indexer wouldn't work properly.

There are no breaking API changes.

Editing and outline view both work.