Community
Participate
Working Groups
In some cases, indexing takes a significant amount of time. There is not much that can be done to significantly improve the performance except for having a pre-built index.
Satyam, Please investigate.
I will not be able to finish it by M3, so moving it to M4.
Here is the overall approach I am taking for this in JDT/Core. . This solution is targeted to have indexes for only jars, which includes plugin jars. It will not be targeted for class folders or source folders. Source folders and class folders are likely to change and hence tracking could be difficult. . The index file can be generated either through an ant task or direct function call or an headless application. The index file can be generated only for a jar. To make it simple, all of these just take two parameters, a jar file to index and the path to the index file that needs to be written. . The path to the index file can be provided through the classpath attribute. It should be like a URL string, so that the path to the indexFile can be easily mentioned even if it is in a jar. Here is the javadoc for attribute in IClasspathAttribute ####### /** * Constant for the name of the index location attribute. * * <p>The value for this attribute has to be the string representation of a URL. * It should point to an existing index file in a folder or a jar.</p> * * @since 3.8 */ String INDEX_LOCATION_ATTRIBUTE_NAME = "index_location"; //$NON-NLS-1$ ####### . If the index file specified is not present or if the indexer thinks it is corrupt, the library is reindexed. I haven't planned yet to see if the index file is really in sync with the jar. If there are multiple projects depending on the same jar, but if their index_location classpath attributes are different, the last one that sets wins. I am assuming that this will generally not happen and hence it is not worth taking care of this. Please let me know if you have any alternative view on the above points.
Created attachment 205241 [details] First cut This patch is not complete. I have marked TODOs for all the incomplete tasks.
(In reply to comment #3) > The path to the index file can be provided through the classpath attribute. > It should be like a URL string Would that be an absolute URL, or a relative one? When sharing project(s) in a team, it can happen that different team members have the project (and thus the index) in a different physical location. So the "index_location" should either support specifying the location relative (eg relative to the plugin root, or relative to the JAR being indexed; Javadocs would need to specify what it is relative to) or some sort of variable substitution mechanism is needed.
(In reply to comment #5) Thanks for this comment. This should better support both an absolute and a relative path. Do you think variable substitution mechanism is needed? I don't plan to. I will update the Javadoc accordingly. Thanks.
I think this sounds good. In our case, we have a WST server target which defines many jars. If these jars can be pre-indexed, we'll get a big boost in performance. It sounds like we should be able to: - generate these index files before shipping our product - ensure that the classpath containers point to the index files (which would be plugin resource URLs). Is this true? Can I set up index file references for a classpath container? If the above is all true, then I'm good with the design.
(In reply to comment #7) > If the above is all true, then I'm good with the design. Satyam, for the record, please document where we stand with respect to Grant's observations. Jay, please review when a close to final form version as indicated by Satyam becomes available.
(In reply to comment #7) Grant, Thanks for your comments. You should be able to specify the index file from a classpath container.
Created attachment 207155 [details] Proposed patch
(In reply to comment #10) I still have to improve in the Javadocs and comments. Jay, please review.
(In reply to comment #5) Here is the new javadoc for attribute in IClasspathAttribute. Hope this is good enough. I wanted to refer to the javadoc of the platform spec but couldn't find a javadoc. ####### /** * Constant for the name of the index location attribute. * * <p>The value for this attribute has to be the string representation of a URL. * It should point to an existing index file in a folder or a jar. * The URL can also be of platform protocol.</p> * * @since 3.8 */ String INDEX_LOCATION_ATTRIBUTE_NAME = "index_location"; //$NON-NLS-1$ #######
(In reply to comment #11) > (In reply to comment #10) > I still have to improve in the Javadocs and comments. Jay, please review. All stake holders, you are welcome for an early test drive even as this patch is being reviewed and tested from our side. Let us know of any issues asap, TIA.
While discussing with Srikanth, he has pointed out that I haven't taken care of reporting a build error/warning/info when the index file attribute is wrong or not present or corrupt. Should the classpath validator report a warning or should there be a different API to tell whether the index file looks good?
Satyam, a couple of nit picks about usage of the command line application which you specified as " Usage: eclipse -application org.eclipse.jdt.core.JavaCodeGenerateIndex [ OPTIONS ] -output <indexFile> <jarfile> -output <indexFile> The index file to be generated. <jarfile> Jar file for which index needs to be generated. " 1. Since the output file is specified via -output option, I assumed that there would be a default behavior in case this is not specified, i.e. an index file would be created in the same location as the jar file. Is it possible to have this default behavior? 2. There is no success message once the application is run. If I specify verbose option I get a 'Generating Index .... " however a similar message should also be there when verbose option is not specified. 3. When an index file already exists, it is 'silently' overwritten. I suppose this is deliberate?
(In reply to comment #7) > I think this sounds good. In our case, we have a WST server target which > defines many jars. If these jars can be pre-indexed, we'll get a big boost in > performance. It sounds like we should be able to: > - generate these index files before shipping our product > - ensure that the classpath containers point to the index files (which would be > plugin resource URLs). Is this true? Can I set up index file references for a > classpath container? (In reply to comment #9) > (In reply to comment #7) > Grant, Thanks for your comments. > You should be able to specify the index file from a classpath container. Wouldn't a better/right solution be to have an extension point similar to the ones for Javadoc and source attachments - http://help.eclipse.org/indigo/index.jsp?topic=%2Forg.eclipse.pde.doc.user%2Freference%2Fextension-points%2Forg_eclipse_pde_core_javadoc.html - http://help.eclipse.org/indigo/index.jsp?topic=%2Forg.eclipse.pde.doc.user%2Freference%2Fextension-points%2Forg_eclipse_pde_core_source.html
(In reply to comment #14) > While discussing with Srikanth, he has pointed out that I haven't taken care of > reporting a build error/warning/info when the index file attribute is wrong or > not present or corrupt. > Should the classpath validator report a warning or should there be a different > API to tell whether the index file looks good? A wrong index results in very bad behavior of the whole IDE and hence needs to be reported to the user. I'd says this should be reported as build path problem.
> Wouldn't a better/right solution be to have an extension point similar to the > ones for Javadoc and source attachments > - > http://help.eclipse.org/indigo/index.jsp?topic=%2Forg.eclipse.pde.doc.user%2Freference%2Fextension-points%2Forg_eclipse_pde_core_javadoc.html > - > http://help.eclipse.org/indigo/index.jsp?topic=%2Forg.eclipse.pde.doc.user%2Freference%2Fextension-points%2Forg_eclipse_pde_core_source.html I *think* this is the old way for providing the source and no longer in use. We now normally ship the source in a bundle called <bundleName> + ".source." + <version> + ".jar" and the manifest of the source bundle has something like: Eclipse-SourceBundle: <bundleName>;version="<version>";roots:="." Curtis, please correct me if I'm wrong. PDE needs to provide a similar mechanism for the index.
(In reply to comment #18) > I *think* this is the old way for providing the source and no longer in use. We > now normally ship the source in a bundle called <bundleName> + ".source." + > <version> + ".jar" and the manifest of the source bundle has something like: > > Eclipse-SourceBundle: <bundleName>;version="<version>";roots:="." > > Curtis, please correct me if I'm wrong. PDE needs to provide a similar > mechanism for the index. Yes, source is now provided by generated source bundles which have an OSGi bundle structure but use that header to specify the source roots and what they provide source for. Bug 364290 covers adding support for this in the PDE classpath container, but the discussion on how to contribute the index belongs here. Couple of questions: - Does the ant task produce individual files for each jar? Is there any way an index file will belong to more than one jar? - Are we expecting index files to typically be included inside the binary jars, in the source jars or independently? PDE wants to address our side of this in M5. As there won't be PDE Build support for index generation or creating a custom manifest header I'm not sure where the process for contributing will be specified. If a custom header is used, when we read the bundles of the target platform we will store the information and later add it to the classpath entry similar to how we specify source or javadoc locations.
> 1. Since the output file is specified via -output option, I assumed that there > would be a default behavior in case this is not specified, i.e. an index file > would be created in the same location as the jar file. Is it possible to have > this default behavior? Users would have to do something with the generated index file and hence I thought it will be better to force them to specify it. Otherwise, I am fine to do this. If we do that I think it can be location of the jar file itself and may be the jarname appended with index could do. > > 2. There is no success message once the application is run. If I specify > verbose option I get a 'Generating Index .... " however a similar message > should also be there when verbose option is not specified. I think this should behave something like javac and hence it better not generate. At the same time, I think verbose should be improved. > > 3. When an index file already exists, it is 'silently' overwritten. I suppose > this is deliberate? Yes and I think this is OK, again taking inspiration from javac.
(In reply to comment #17) > (In reply to comment #14) > > While discussing with Srikanth, he has pointed out that I haven't taken care of > > reporting a build error/warning/info when the index file attribute is wrong or > > not present or corrupt. > > Should the classpath validator report a warning or should there be a different > > API to tell whether the index file looks good? > > A wrong index results in very bad behavior of the whole IDE and hence needs to > be reported to the user. I'd says this should be reported as build path > problem. Sorry, I didn't mention that in this case the indexer will generate the index file and use it and hence there will be loss of performance but no change in search results.
> - Does the ant task produce individual files for each jar? Is there any way an > index file will belong to more than one jar? Yes, this provides individual files for each jar. No, there is 1-1 mapping between a jar and the index file. > > - Are we expecting index files to typically be included inside the binary jars, > in the source jars or independently? I think this could go in the source jar or be independent but not in the binary jar. I haven't done yet but I am contemplating putting some signature in the index file which could tell if the index file is generated from a different jar file. If this is done, the index files cannot be part of the binary jar. > > PDE wants to address our side of this in M5. As there won't be PDE Build > support for index generation or creating a custom manifest header I'm not sure > where the process for contributing will be specified. If a custom header is > used, when we read the bundles of the target platform we will store the > information and later add it to the classpath entry similar to how we specify > source or javadoc locations.
I am part way through the review and even as I go through the rest of the files, here are some review comments for you to chew on: I. The changes in org.eclipse.jdt.core.tests.model/src/org/eclipse/jdt/core/tests/model/JavaSearchScopeTests.java org.eclipse.jdt.core.tests.model/src/org/eclipse/jdt/core/tests/model/RunJavaSearchTests.java org.eclipse.jdt.core/META-INF/MANIFEST.MF org.eclipse.jdt.core/antadapter/org/eclipse/jdt/internal/antadapter/messages.properties org.eclipse.jdt.core/build.properties org.eclipse.jdt.core/plugin.xml org.eclipse.jdt.core/model/org/eclipse/jdt/core/IPackageFragmentRoot.java org.eclipse.jdt.core/model/org/eclipse/jdt/core/IClasspathAttribute.java look ok. II. Here are some comments: org.eclipse.jdt.core.tests.compiler/src/org/eclipse/jdt/core/tests/util/Util.java: (1) The current call stacks are using external paths only, but should we conservatively include the call to path.replace('\\', '/') as done in other Util.zipFiles methods ? (2) What explains the absence of try-finally in this method compared to org.eclipse.jdt.core.tests.util.Util.zip(File, String) ? org.eclipse.jdt.core/antadapter/org/eclipse/jdt/core/BuildJarIndex.java (3) javadoc: introduce a blank line above @since. (4) what is the setProject(getProject()); doing ? Is this a cut & paste issue ? org.eclipse.jdt.core/search/org/eclipse/jdt/core/index/JavaIndexer.java (5) Provide Javadoc (as acknowledged in TODO) org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/search/indexing/DefaultJavaIndexer.java (6) FileNotFoundException ctor call missing white spaces. (7) Is the condition in if below always true ? if (org.eclipse.jdt.internal.compiler.util.Util.isClassFileName(entryDocument.getPath())) { new BinaryIndexer(entryDocument).indexDocument(); } I could not see how it can be false. (8) Also why is one isClassFileName call fully qualified while the other is not ? (9) Is the field JAR_SEPARATOR really needed ? org.eclipse.jdt.core/search/org/eclipse/jdt/core/index/JavaIndexerApplication.java (10) Fix javadoc (as acknowledged in TODO) (11) org.eclipse.jdt.core.index.JavaIndexerApplication.displayHelp() would print wrong application name ? (12) There is a mix of System.err.println and System.out.println - is this intentional ? (13) Messages.CommandLineProcessing: arguments are passed in the wrong order. Do we have tests to elicit the various messages ? (14) CommandLineOnlyOutputError is better renamed CommandLineOnlyOneOutputError (15) If the jar file does not exist, we will print a message but will continue to attempt to index it ? (16) if (this.indexFile != null || index == argCount) { displayError(Messages.bind(Messages.CommandLineOnlyOutputError)); return false; } block would generate "Only one jar file need to be specified" if -output is the last token in the command line ? org.eclipse.jdt.core/search/org/eclipse/jdt/core/index/messages.properties (17) "need to be specified" should be "needs to be specified" (in 2 places) org.eclipse.jdt.core/model/org/eclipse/jdt/internal/core/PackageFragmentRoot.java (18) org.eclipse.jdt.core.IPackageFragmentRoot.getIndexPath() is declared to throw JME, while org.eclipse.jdt.internal.core.PackageFragmentRoot.getIndexPath() deliberately drops the exception. Is this intentional ? org.eclipse.jdt.core/model/org/eclipse/jdt/internal/core/ClasspathEntry.java (19) extra attrributes could be null ? I see some of the old code guarding for null and at other places not - so may be it is safe. (20) getLibraryIndexLocation: Can this silently swallow malformed urls ? (21) Please get rid of the noise change at the end of the file. III. I am yet to review the following files: org.eclipse.jdt.core/model/org/eclipse/jdt/core/IJavaElementDelta.java org.eclipse.jdt.core/model/org/eclipse/jdt/internal/core/ClasspathChange.java org.eclipse.jdt.core/model/org/eclipse/jdt/internal/core/DeltaProcessor.java org.eclipse.jdt.core/model/org/eclipse/jdt/internal/core/JavaElementDelta.java org.eclipse.jdt.core/search/org/eclipse/jdt/core/search/SearchParticipant.java org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/index/DiskIndex.java org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/index/FlatFileIndex.java org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/index/Index.java org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/index/IndexLocation.java org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/index/JarEntryIndex.java org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/search/IndexSelector.java org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/search/JavaSearchParticipant.java org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/search/PatternSearchJob.java org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/search/indexing/AddJarFileToIndex.java org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/search/indexing/IndexAllProject.java org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/search/indexing/IndexBinaryFolder.java org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/search/indexing/IndexManager.java org.eclipse.jdt.core.tests.model/src/org/eclipse/jdt/core/tests/model/JavaIndexTests.java After I finish all, I'll share overall comments.
Hi Satyam, Here is another of comments: These files: org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/search/indexing/IndexBinaryFolder.java org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/search/indexing/IndexAllProject.java org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/index/IndexLocation.java org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/index/JarEntryIndex.java org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/index/FlatFileIndex.java org.eclipse.jdt.core/search/org/eclipse/jdt/core/search/SearchParticipant.java org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/search/JavaSearchParticipant.java org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/search/IndexSelector.java look ok. Comments: --------- org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/search/indexing/AddJarFileToIndex.java (22) org.eclipse.jdt.internal.core.search.indexing.AddJarFileToIndex.execute(IProgressMonitor) when we fall back and regenerate the index, we do it silently, should be report/log here ? org.eclipse.jdt.core/model/org/eclipse/jdt/core/IJavaElementDelta.java org.eclipse.jdt.core/model/org/eclipse/jdt/internal/core/JavaElementDelta.java (23) Do we need these new deltas at all ? Who would be the listeners to these "index attached" and "index detached" deltas and what do we think they will do with these ? I still need to review the following: org.eclipse.jdt.core/model/org/eclipse/jdt/internal/core/ClasspathChange.java org.eclipse.jdt.core/model/org/eclipse/jdt/internal/core/DeltaProcessor.java org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/index/DiskIndex.java org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/index/Index.java org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/search/PatternSearchJob.java org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/search/indexing/IndexManager.java org.eclipse.jdt.core.tests.model/src/org/eclipse/jdt/core/tests/model/JavaIndexTests.java (a) I am still trying to wrap my hands around this functionality and don't have a full picture yet. After I finish with the remaining 7 files, I may be in a better position (the caveat still being there that I am not familiar with this area of JDT/Core) (b) I think without delay, we should work towards validating the performance objectives - Can you work with the requesting project so that you can gather some numbers on their application so we know where we stand.
(In reply to comment #24) > (b) I think without delay, we should work towards validating the performance > objectives - Can you work with the requesting project so that you can gather > some numbers on their application so we know where we stand. Grant, is it possible for you to use this patch and see it works and the performance is good?
(In reply to comment #25) > (In reply to comment #24) > > (b) I think without delay, we should work towards validating the performance > > objectives - Can you work with the requesting project so that you can gather > > some numbers on their application so we know where we stand. > Grant, is it possible for you to use this patch and see it works and the > performance is good? Please ask for any help you may need, Grant, Thanks for your help with this exercise. Independently, Satyam and I will put together some benchmarking scenarios to see what the performance numbers look like. In particular, the current implementation supports multiple packaging/shipping/distribution models. While that may offer convenience to consumers of this capability, it is not clear that the performance characteristics would be appealing across the solutions: For example, we can support a zip file of index files, but does it make sense from a performance point of view which after all is the raison_d'être for this exercise, or are we better off generating the indexes in the usual way. So, before we go any further, we need some data collected. Stay tuned.
Here is the next batch of line level review comments: org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/index/Index.java Looks ok. org.eclipse.jdt.core/model/org/eclipse/jdt/internal/core/DeltaProcessor.java (24) Third diff in this file reads: URL indexLocation = ((ClasspathEntry)entries[j]).getLibraryIndexLocation(); if (indexLocation != null) { // force reindexing, this could be faster rather than maintaining the list this.manager.indexManager.indexLibrary(entryPath, project.getProject(), ((ClasspathEntry)entries[j]).getLibraryIndexLocation()); } Eliminate the latter call to getLibraryIndexLocation by reusing the result of the earlier call. (25) 5th diff - Remove extra white spaces. org.eclipse.jdt.core/model/org/eclipse/jdt/internal/core/ClasspathChange.java (26) Second diff marked "TODO SATYAM: Shouldn't the check be opposite?" I agree this existing code looks suspicious. Please work with Jay to understand this issue and raise a separate defects as needed. (27) It is better to extract a local for the expression (ClasspathEntry)newResolvedClasspath[i]) to avoid repeared array indexing and dynamic casts. org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/search/PatternSearchJob.java (28) Fix white space: length =paths.length; should be length = paths.length; org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/index/DiskIndex.java (29) org.eclipse.jdt.internal.core.index.DiskIndex.readHeaderInfo(InputStream) Line 802 changed from if (size > fileLength) { to Line 808 after the patch which reads: if (length != -1 && this.numberOfChunks > length) { This certainly looks like a bug. (should have been (size > length)) org.eclipse.jdt.core/search/org/eclipse/jdt/internal/core/search/indexing/IndexManager.java (30) Line 258 reads: if (currentIndexState != REBUILDING_STATE || currentIndexState != REUSE_STATE) { // rebuild index if existing file is corrupt, unless the index is already being rebuilt This looks wrong, as one or the other constituent conditions will always be true Should this be: if (currentIndexState != REBUILDING_STATE && currentIndexState != REUSE_STATE) { // rebuild index if existing file is corrupt, unless the index is already being rebuilt (31) In org.eclipse.jdt.internal.core.search.indexing.IndexManager.removeIndexPath(IPath) a call to indexFile.exists() seems to have been deleted. Is this inadvertant or intentional ? (32) The last change in org.eclipse.jdt.internal.core.search.indexing.IndexManager.removeIndexPath(IPath) looks structurally different from the version prior to patch - is this intentional ? (33) org.eclipse.jdt.internal.core.search.indexing.IndexManager.scheduleDocumentIndexing(SearchDocument, IPath, IndexLocation, SearchParticipant) Fold the two lines: Index index; index = getIndex(this.containerPath, indexLocation, true, /*reuse index file*/ true /*create if none*/); into one line as before.
I have only one file JavaIndexTests.java which has all the new tests for this feature left. After I finish with that early next week, I'll summarize my overall views/observations.
(In reply to comment #27) > (25) 5th diff - Remove extra white spaces. > > org.eclipse.jdt.core/model/org/eclipse/jdt/internal/core/ClasspathChange.java > (33) > org.eclipse.jdt.internal.core.search.indexing.IndexManager.scheduleDocumentIndexing(SearchDocument, > IPath, IndexLocation, SearchParticipant) > Fold the two lines: > > Index index; > index = getIndex(this.containerPath, indexLocation, true, /*reuse index file*/ > true /*create if none*/); > > into one line as before. Don't you guys use common formatter settings and have Save Actions enabled to auto format edited lines? Just saying...
Srikanth, Thanks for your detailed review. I agree with many of the comments. (In reply to comment #23) > (9) Is the field JAR_SEPARATOR really needed ? It is not required in this path, but I think it is better to keep. > (12) There is a mix of System.err.println and System.out.println - > is this intentional ? What should be the right thing to do? Should error's go in System.err and messages go in System.out? Should everything go in System.out? > (13) Messages.CommandLineProcessing: arguments are passed in the wrong order. Actually this is correct except the order is confusing. I have fixed this though. > (18) org.eclipse.jdt.core.IPackageFragmentRoot.getIndexPath() is declared to > throw JME, while > org.eclipse.jdt.internal.core.PackageFragmentRoot.getIndexPath() > deliberately drops the exception. Is this intentional ? Not intentional, changed at one place but forgot the other place. > (20) getLibraryIndexLocation: Can this silently swallow malformed urls ? This is internal function and it better swallow. Will try to add some verbose. > (23) Do we need these new deltas at all ? Who would be the listeners to > these "index attached" and "index detached" deltas and what do we think > they will do with these ? We probably don't need the deltas. Removing them as of now. Will add when we feel necessary. > > (31) In > org.eclipse.jdt.internal.core.search.indexing.IndexManager.removeIndexPath(IPath) > a call to indexFile.exists() seems to have been deleted. Is this inadvertant or > intentional ? Yes it is intentional, because exists() is costly for the one in jar. and probably not necessary too. > > (32) The last change in > org.eclipse.jdt.internal.core.search.indexing.IndexManager.removeIndexPath(IPath) > looks structurally different from the version prior to patch - is this > intentional ? Yes, the earlier code is not correct and hence.
(In reply to comment #26) > Independently, Satyam and I will put together some benchmarking scenarios > to see what the performance numbers look like. In particular, the current > implementation supports multiple packaging/shipping/distribution models. > While that may offer convenience to consumers of this capability, it is > not clear that the performance characteristics would be appealing across > the solutions: For example, we can support a zip file of index files, but > does it make sense from a performance point of view which after all is the > raison_d'être for this exercise, or are we better off generating the indexes > in the usual way. So, before we go any further, we need some data collected. > Stay tuned. I have tried some tests which seem to tell that there is no performance impact. I still need to run some more tests to confirm that.
(In reply to comment #30) > > (12) There is a mix of System.err.println and System.out.println - > > is this intentional ? > What should be the right thing to do? Should error's go in System.err and > messages go in System.out? Should everything go in System.out? Any reasoned approach (may be the current one already is) as opposed to inadvertent mixed usage is fine.
(In reply to comment #31) [...] > I have tried some tests which seem to tell that there is no performance impact. > I still need to run some more tests to confirm that. While initial experiments show that there is no search performance degradation while using a jar of index files, just to convince ourselves we are not violating some laws of thermodynamics or laws against time travel and such, we will work with the requesting project to get some voluminous jars and devise some searches that would use index data from the beginning, middle and end of the zipped stream to gauge the best case, average case and worst case times and compare the times against the search done against a plain vanilla index file.
Deepak, I presume some high level operations in the UI such as refactoring, result in multiple searches being performed for a single high level operation. Can you suggest some suitable operation that would result in a stream of searches being done against a binary package fragment root, so that it can be used as a benchmarking scenario ?
(In reply to comment #5) > So the "index_location" should either support specifying the location relative > (eg relative to the plugin root, or relative to the JAR being indexed; Javadocs > would need to specify what it is relative to) or some sort of variable > substitution mechanism is needed. Satyam, was this part addressed in the patch ? The javadoc isn't for sure and I don't recall whether the implementation supports relative location specification.
(In reply to comment #31) > I have tried some tests which seem to tell that there is no performance impact. > I still need to run some more tests to confirm that. I was under the assumption that jar file automatically implies compressed data, turns out this is not true. So in these experiments, was there compression involved ? In our experiments, let us make sure we include both the compressed and uncompressed archives.
The current plan is to complete the review, testing and performance characterizations over the next two weeks and release this for the first integration build past M4. Changing target milestone accordingly.
Comments on JavaIndexTests: (34) Comment above testDeleteIndexedFile() is not clear about what it is supposed to test. (35) testNonExistentIndex() : In these kind of scenarios, what should be our strategy ? should we issue a warning/info message that the index file referenced by the class path entry is missing ? Should we log this ? Currently we are silently recovering - is this what we want ? (36) Fix comment above testDeleteProject() (37) In this file and in all files in general, complete TODO items. That completes the line level review.
Overall, I didn't find any major issues as such, here are some observations: (1) We should make very careful performance characterizations for the case where the index file appears as a compressed entity in a jar file to make sure that search performance is not degraded. We should devise experiments against large, real life jars (Grant is helping here - Thanks) that "touch" various parts of the jar to make assessments of average and worst case measurements. (2) The experiments should also involve high level operations that translate into a stream of consults against the compressed index file and preferably touch different parts of the index file. (3) It appears the model where the index files appear as entries in a folder would be the best case win-win situation from a indexing and searching point of view. But supporting only this would mean we don't enable some natural use cases such as index files being packaged inside the source jar or the binary jar itself. I would like to hear what the stake holders think about supporting such a folder only model. (Of course this becomes a moot point if experiments show there is no major degradation) (4) The code change is fairly complex with the meat of the implementation interspersed with "supporting changes" which resulted from some types being refactored (IPath -> IndexLocation, FileInputStream -> InputStream). These supporting changes are so pervasive that the signal to noise ratio is adversely impacted and it is difficult to form a clear opinion about the correctness and completeness about the implementation. (5) It appears much of the implementation complexity arises from supporting index files inside jars. If we do go ahead with the full scope of what has been prototyped, I would recommend splitting the patch into core + supporting changes + tests so that just the core set of changes can be reviewed quickly in another pass. (6) Should the class path validation step validate the existence of the index file and issue info/warning if something is amiss. (7) What about the cases where the index file generation/version information is different from the one supported by JDT/Core ?
(In reply to comment #35) > (In reply to comment #5) > > > So the "index_location" should either support specifying the location relative > > (eg relative to the plugin root, or relative to the JAR being indexed; Javadocs > > would need to specify what it is relative to) or some sort of variable > > substitution mechanism is needed. > > Satyam, was this part addressed in the patch ? The javadoc isn't for sure and > I don't recall whether the implementation supports relative location > specification. As it supports the platform url specification, relative to plugin root and relative to project will be supported, but not relative to an external jar.
(In reply to comment #39) > (3) It appears the model where the index files appear as entries in a > folder would be the best case win-win situation from a indexing and searching > point of view. But supporting only this would mean we don't enable some > natural use cases such as index files being packaged inside the source jar > or the binary jar itself. I would like to hear what the stake holders think > about supporting such a folder only model. (Of course this becomes a moot > point if experiments show there is no major degradation) [...] > (5) It appears much of the implementation complexity arises from supporting > index files inside jars. [...] In informal chat with the requesting project (represented by Grant), it emerged that for their project, they would only be shipping index files as folder entries i.e not inside jars. Grant, for posterity's sake, please confirm/record here your project's preferences and the rationale. Dani, Markus, Curtis (even as we wait for performance data), any strong feelings , for/against jars with compressed index files as constituents ? We need to finalize on this by this week preferably.
(In reply to comment #33) > While initial experiments show that there is no search performance > degradation while using a jar of index files, just to convince > ourselves we are not violating some laws of thermodynamics or laws > against time travel and such, we will work with the requesting > project to get some voluminous jars and devise some searches that > would use index data from the beginning, middle and end of the zipped > stream to gauge the best case, average case and worst case times and > compare the times against the search done against a plain vanilla index > file. I have run the tests with some big jars and could not see any degradation or improvement by using the jar for index files. While trying to understand the rationale behind this, I found that the compressed jars will need lesser IO and hence reading could be faster. Well, I could not see a performance benefit either :(. However, this explains that there should not be a loss in performance.
As we think we can keep the performance issue at bay, I could see one more point to consider the decision. Shipping Indexes in jar could be easier and could take lesser space. For eg: An index file for a big jar takes around 12MB, when compressed in a jar takes only 5MB. This probably is not an issue as generally the whole product will be zipped.
(In reply to comment #42) > (In reply to comment #33) > > While initial experiments show that there is no search performance > > degradation while using a jar of index files, just to convince > > ourselves we are not violating some laws of thermodynamics or laws > > against time travel and such, we will work with the requesting > > project to get some voluminous jars and devise some searches that > > would use index data from the beginning, middle and end of the zipped > > stream to gauge the best case, average case and worst case times and > > compare the times against the search done against a plain vanilla index > > file. > I have run the tests with some big jars and could not see any degradation or > improvement by using the jar for index files. While trying to understand the > rationale behind this, I found that the compressed jars will need lesser IO and > hence reading could be faster. Well, I could not see a performance benefit > either :(. However, this explains that there should not be a loss in > performance. There seems to be quite a bit of anecdotal evidence if you go by various forums, community discussion boards etc to suggest that compressed jars need not imply performance degradation and based on the CPU, I/O and other configuration parameters may actually translate into performance improvement. See the thread http://dev.eclipse.org/mhonarc/lists/platform-core-dev/msg00691.html for some historic discussion on this topic in eclipse' very own mailing lists. Will dig up the references cited there and see what comes up.
> Dani, Markus, Curtis (even as we wait for performance data), any strong > feelings , for/against jars with compressed index files as constituents ? I think we should support to ship it as archive. If it's easier for us to process a folder, we can still extract it to a folder on first access.
(In reply to comment #44) > (In reply to comment #42) > > (In reply to comment #33) [...] > > hence reading could be faster. Well, I could not see a performance benefit > > either :(. However, this explains that there should not be a loss in > > performance. [...] > very own mailing lists. Will dig up the references cited there and see what > comes up. I dug up my copy of "Java Performance Tuning" by Jack Shirazi- there isn't a whole lot more than what we have discussed here: basically improvements result from the need for several open/close operations being obviated. Also the point Satyam made earlier about volume of I/O being reduced due to compression also helps some cases based on configuration. In summary, our experiments so far show that compressed indexes inside jars can be viably supported.
>>Grant, for posterity's sake, please confirm/record here your project's preferences and the rationale. Our product primarily wants to use this feature for our server target classpath container. The jars for this container come from the a different team than the Eclipse tooling. We have two cases as well: the server target classpath container may point to a real local server, or "stubs" that represent the APIs of a server. In both cases, we don't want to change the actual jars. For the stubs, the reason is mainly organizational. A different team creates these and they are not responsible for things like tooling performance. For a stand-alone server, packing indexes in the Jars would cause every server install to be larger, even if it's not being pointed to by our tooling. Hence, our usage of this feature will be to pre-create the indexes for the known targets (stubs, server v1.0, server, v1.1, etc) and add register the index files with the classpath container.
(In reply to comment #47) > >>Grant, for posterity's sake, please confirm/record here your project's preferences and the rationale. > > Our product primarily wants to use this feature for our server target classpath > container. The jars for this container come from the a different team than the > Eclipse tooling. We have two cases as well: the server target classpath > container may point to a real local server, or "stubs" that represent the APIs > of a server. In both cases, we don't want to change the actual jars. For the > stubs, the reason is mainly organizational. A different team creates these and > they are not responsible for things like tooling performance. For a > stand-alone server, packing indexes in the Jars would cause every server > install to be larger, even if it's not being pointed to by our tooling. > > Hence, our usage of this feature will be to pre-create the indexes for the > known targets (stubs, server v1.0, server, v1.1, etc) and add register the > index files with the classpath container. Grant, Thanks for your reply. I think there is a slight misunderstanding. To make sure we are on in the same page, we don't mean the index should go in to the actual jar, but we do mean that the indexes could be packaged into any jar, may be a special jar which could just include a group of indexes.
Ok, I understand. So my comment, obviously, was that we don't want the indexes packaged with the jars that contain the indexes classes. As for flat files vs jars for separately packaged indexes, I would say that ideally we'd package the flat files in a jar'd plugin. I'm not sure if this would be supported. If not, I don't think it's a big deal for us to ship a new, unjar'd plugin that only contains the index files (jar'd or flat).
Created attachment 207847 [details] Patch with core changes (without index in jar support) Srikanth, Here is the first part of the patch. This patch is without the support for the index in a jar. - Incorporated much of the review components - Removed the getIndexPath() from IPackageFragmentRoot. Basically there is no API as of now. I will incorporate it anybody wants it. - This patch doesn't validate the class path as such. Filed bug 365446 to take care of this part.
(In reply to comment #13) > (In reply to comment #11) > > (In reply to comment #10) > > I still have to improve in the Javadocs and comments. Jay, please review. > > All stake holders, you are welcome for an early test drive even as this patch > is being reviewed and tested from our side. Let us know of any issues asap, > TIA. We have confirmation from the requesting team that they have tested the patch in both the indexes-in-a-folder-mode and compressed-indexes-in-a-jar mode and that performance improvements are inline with expectations. So this feature looks ready for release for next week's integration build after M4 is out of the way.
Created attachment 208084 [details] Patch for supporting the index file in a jar Here is the second part of the jar which has support for the index file to be in the jar.
(In reply to comment #50) > Created attachment 207847 [details] > Patch with core changes (without index in jar support) > > Srikanth, Here is the first part of the patch. This patch is without the > support for the index in a jar. > - Incorporated much of the review components > - Removed the getIndexPath() from IPackageFragmentRoot. Basically there is no > API as of now. I will incorporate it anybody wants it. > - This patch doesn't validate the class path as such. Filed bug 365446 to > take care of this part. Satyam, thanks a lot for breaking down the patches - it made the job of reviewing these core set of changes a lot easier. The changes look good. One comment: Could you restore your first cut of changes to org.eclipse.jdt.internal.core.ClasspathChange.requestIndexing() from the https://bugs.eclipse.org/bugs/attachment.cgi?id=207155 rather than go with the revised version you have in this patch. Your first cut changes to this method are much more readable and provable to a reviewer's satisfaction than this latest version. (Remember to get rid of the TODO of course.)
(In reply to comment #51) > (In reply to comment #13) > > (In reply to comment #11) > > > (In reply to comment #10) > > > I still have to improve in the Javadocs and comments. Jay, please review. > > > > All stake holders, you are welcome for an early test drive even as this patch > > is being reviewed and tested from our side. Let us know of any issues asap, > > TIA. > > We have confirmation from the requesting team that they have tested the patch > in both the indexes-in-a-folder-mode and compressed-indexes-in-a-jar mode > and that performance improvements are inline with expectations. > > So this feature looks ready for release for next week's integration build > after M4 is out of the way. Satyam, we should ask the requesting project to not tear down any setup they have invested in to validate this, as we will ask for a revalidation after it released and an integration build becomes available.
(In reply to comment #53) > The changes look good. A further editorial comment that we should not piggy back fixes for unconnected issues discovered during implementation. The current patch has 3-4 such fixes. It is terrific that these issues are being discovered, but let us keep the separate concerns separate in future to simplify the review process as well future exercise in software archaeology.
(In reply to comment #52) > Created attachment 208084 [details] > Patch for supporting the index file in a jar > > Here is the second part of the jar which has support for the index file to be > in the jar. Thanks, this patch looks good. After incorporating the comment about requestIndexing (see comment#53) and retesting, let us target to release it early next week.
Released on HEAD through commit 47da1e1f50b1360a7b560d74dfe1c60303fec552
Do we have some example JARs and indices somewhere? If not, please create some and attach them here so that the UI can be tested.
(In reply to comment #58) > Do we have some example JARs and indices somewhere? If not, please create some > and attach them here so that the UI can be tested. Sorry, this was meant for bug 364287.
Verified for 3.8M5 using build I20120122-2000
Grant et al, this feature has been released in 3.8 M5 available at http://download.eclipse.org/eclipse/downloads/drops/S-3.8M5-201201251800/index.php. Thanks for testing this from an end user standpoint. Please raise fresh defects for any issues found. Thanks.