Bug 500776 - Improve smart import API
Summary: Improve smart import API
Status: NEW
Alias: None
Product: Platform
Classification: Eclipse Project
Component: UI (show other bugs)
Version: 4.6   Edit
Hardware: All All
: P3 enhancement (vote)
Target Milestone: ---   Edit
Assignee: Platform-UI-Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-09-02 14:59 EDT by Stefan Oehme CLA
Modified: 2016-09-16 18:44 EDT (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Stefan Oehme CLA 2016-09-02 14:59:20 EDT
Hi everyone,

following up on my thread on the mailing list as well as the discussion on the Buildship Smart Import request, I'd like to suggest some improvements to the smart import API. Please forgive this rather broad enhancement request, but many of the suggestions tightly interact with each other, so I thought it would be best to present them together. We can break them down into individual ones later if needed. 

I hope that these improvements will make smart import

- simpler
- more powerful
- more efficient

Let's start with simplicity. I understand that the smart import API has evolved over a long time, so some of the methods don't really make sense without knowing the history. In the beginning there was no findConfigurableLocations method and instead the smart importer traversed the file tree itself, using methods like shouldBeAnEclipseProject and getFoldersToIgnore to make itself more efficient.

Later the findConfigurableLocations method was added, which makes it much easier for configurators to declare up-front what they consider to be Eclipse projects. My first proposal would thus be:

1. Deprecate shouldBeAnEclipseProject and getFoldersToIgnore and stop calling them.

If a configurator returns something from findConfigurableLocations, it will be considered a "primary configurator". If it returns nothing there, it will be considered a "secondary" configurator. The smart importer currently allows multiple primary configurators. This is problematic, for instance when a project migrates from Maven to Gradle. They will have both Maven and Gradle build files in their project. But m2e and Buildship are mutually exclusive and will shoot each other in the foot really hard :)

That's why my second proposal is:

2. If a location is returned by several primary configurators, tell the user about the situation, so the user can choose which one to use. 

Now on to power and efficiency: The findConfigurableLocations method currently returns a Set<File>. This is very unfortunate. The configurators probably read a lot of useful metadata, but have to throw it away because of this return type. This can easily be seen in the Maven implementation, which loads a Maven model, only to throw it away again, even though it would be useful in other methods later:

https://github.com/eclipse/m2e-core/blob/master/org.eclipse.m2e.importer/src/org/eclipse/m2e/importer/internal/MavenProjectConfigurator.java

This is a blocker for Buildship, because loading the Gradle model is more expensive than just parsing a pom.xml and Gradle's project layout is much more flexible. Finding the subprojects given a root project is easy. But finding the root project if you only have a sub-project is generally not possible.

That's why I would love to see a more structured return type that allows configurators to attach additional metadata to the configurable locations. For instance, the Buildship configurator could attach the Gradle model that it loaded to each location. That would make it super cheap to reuse the model later in the configure() method.

3. Let primary configurators return a Set<ProjectDescriptor> that they can subclass, adding additional information. Pass that ProjectDescriptor back into the configure() method later.

This will make the API more powerful, but also more efficient, because we don't need to recompute information in canConfigure() and configure().

Finally, when importing large projects it is often more efficient to configure all of them in one large transaction. This avoids unnecessary classpath updates and full builds on JDT projects for instance. That's why I'd like to propose a final performance feature:

4. Provide a configureBatch method that takes a Set<IProject> after the basic import is complete for all of them, allowing the configurator to configure them in a batch.

Thank you for staying with me :) I'm excited to hear your comments on this proposal. I'll be on vacation for the next 3 weeks though, so please don't expect a response before September 24th. I just wanted to write this down to get the discussion started as early as possible, so that hopefully we can make something happen for Oxygen.

Cheers,
Stefan