
[egit-dev] re-indexing running for nothing

Hey,

 

I did some profiling to find out why our Eclipse is quite slow at times, and discovered a major problem (for us) with re-indexing. Basically, it looks like the indexing job is triggered far too often, even when nothing changes at all. The simplest example is a clean build of all projects in the workspace: more than half of the CPU time during a build of the whole workspace is spent calculating SHA-1 checksums, even though there are zero changes to any files!

 

Do you have any hints on what could be the cause?

 

I enabled tracing for EGit, and I see that Eclipse seems to generate deltas for:

 

1)      All the ignored .class files. From the logs I see that the index update job (IndexDiffFilter) does check that they are ignored but I’m not sure whether it still calculates checksums for them somehow?

2)      All (or most) feature.xml files (must be some internal trick to force re-validation or so...?)

3)      Some MANIFEST.MF files (why? I can think of things that Eclipse could do there to cause this...)

4)      Some .java files (WHY? I cannot think of ANY reason why a build should cause a delta in a .java file?! And it’s not any generated file – it’s purely, 100% hand-written files)

5)      A lot of other files (.zip, .project)

6)      .gitignore files. This is especially bad, as it triggers a full re-index of the repository! The .gitignore files that cause the re-index are all files under /bin/ directories – they have been copied from the /src/ directories by the build... :(

 

For testing purposes I implemented filtering of ignored paths based on the “old” index state in the GitResourceDeltaVisitor. I also have to filter changes to .gitignore files if they are beneath an already ignored path. Otherwise these files, which the build copies into the /bin/ directories, cause a full re-index for each and every plugin that has /some/ .gitignore while building (one full re-index when cleaning the plugin, one full re-index when copying the .gitignore back into place). I’m not sure whether this is OK. I think yes, but please help me think ;) Just for your reading pleasure, I also pushed the change to Gerrit: https://git.eclipse.org/r/#/c/37880/ . There is a test failing, but since I just want first opinions I haven’t had a closer look at that yet ;)
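To illustrate the filtering idea (this is not the code from the Gerrit change, just a minimal plain-Java sketch): the class name IgnoredDeltaFilter, its methods, and the use of plain repository-relative path strings instead of Eclipse resource deltas are all assumptions made for the example. The ignored set stands in for the ignored paths remembered from the previous index state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of EGit: drops changed paths that are
// already known to be ignored according to the previous index state.
class IgnoredDeltaFilter {

    // True if the path is an ignored entry itself or lies beneath an ignored directory.
    static boolean isIgnored(String path, Set<String> ignoredPaths) {
        for (String ignored : ignoredPaths) {
            String dir = ignored.endsWith("/") ? ignored : ignored + "/";
            if (path.equals(ignored) || path.startsWith(dir)) {
                return true;
            }
        }
        return false;
    }

    // Returns only the changed paths that still need to be looked at, in input order.
    static List<String> relevantChanges(List<String> changedPaths, Set<String> ignoredPaths) {
        List<String> relevant = new ArrayList<>();
        for (String path : changedPaths) {
            if (!isIgnored(path, ignoredPaths)) {
                relevant.add(path);
            }
        }
        return relevant;
    }
}
```

With plugin.a/bin in the ignored set, a delta for plugin.a/bin/Foo.class would be dropped, while plugin.a/src/Foo.java would be kept. A path such as plugin.a/binary/x.txt is also kept, because the match is on whole path segments rather than on a raw string prefix.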

 

It is still just a hackish attempt, but it changes the CPU times dramatically :) I /think/ that this check is valid, because changes to a (not ignored) .gitignore force a full re-index anyway, so the “IgnoredNotInIndex” set should remain valid across invocations. The only pitfall is that adding a .gitignore to a directory that is itself ignored will require an explicit “refresh” button press in the staging view...
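The reasoning above can be written down as a small decision function. Again, this is only a sketch under assumptions: ReindexDecision, Action and decide are made-up names, paths are plain repository-relative strings, and the real EGit job is certainly more involved than this.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not EGit code: decides how much re-indexing a set of changes needs.
class ReindexDecision {

    enum Action { NONE, PARTIAL, FULL }

    // True if the path is an ignored entry itself or lies beneath an ignored directory.
    static boolean isBeneathIgnored(String path, Set<String> ignoredPaths) {
        for (String ignored : ignoredPaths) {
            String dir = ignored.endsWith("/") ? ignored : ignored + "/";
            if (path.equals(ignored) || path.startsWith(dir)) {
                return true;
            }
        }
        return false;
    }

    static Action decide(List<String> changedPaths, Set<String> ignoredPaths) {
        Action action = Action.NONE;
        for (String path : changedPaths) {
            if (isBeneathIgnored(path, ignoredPaths)) {
                continue; // includes .gitignore copies under an ignored /bin/ directory
            }
            if (path.equals(".gitignore") || path.endsWith("/.gitignore")) {
                return Action.FULL; // a non-ignored .gitignore changed: ignore rules may differ
            }
            action = Action.PARTIAL; // an ordinary non-ignored file changed
        }
        return action;
    }
}
```

In this sketch a build that only touches files beneath ignored directories yields NONE, an edit to a tracked source file yields PARTIAL, and an edit to a .gitignore outside any ignored directory yields FULL, which is what keeps the remembered ignored set trustworthy between runs.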

 

Do you think that such a change makes sense? For me it eliminates all unnecessary re-indexing jobs during a normal full build or when launching an application, that is, essentially all the runs where I did not actually change a (not-ignored) file. The profiler also shows that SHA-1 checksum calculation is no longer among the top entries.

--

Mit freundlichen Grüßen / Best regards

 

Markus Duft | Software Architect

SSI SCHÄFER | Salomon Automation GmbH | Friesachstraße 15 | 8114 Friesach bei Graz | Austria

Phone +43 3127 200-575 | Fax +43 3127 200-22

markus.duft@xxxxxxxxxxxxxxxx


 

Salomon Automation GmbH | Friesachstrasse 15 | 8114 Friesach bei Graz | Austria
Registered Office: Friesach bei Graz | Commercial Register: 49324 K | VAT no. ATU28654300
Commercial Court: Landesgericht für Zivilrechtssachen Graz
