[equinox-dev] interning strings in RegistryCacheReader
I've been looking at Eclipse startup in YourKit 3.0 beta, and about half of the memory used is taken up by Strings. When I looked at them, I found the same strings repeated over and over again, for example "org.eclipse.ui.defaultAcceleratorConfiguration". I traced this back to the org.eclipse.core.internal.registry.RegistryCacheReader class. It has two methods, readString() and readCachedString(), which take an 'intern' boolean parameter. When that parameter is true, String.intern() is called on the string, which eliminates the duplicates. Only a few callers pass true to these methods, though.
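To illustrate the effect I mean, here is a minimal standalone sketch. It is not the actual RegistryCacheReader code: the class InternSketch, the helpers cacheWith() and sameInstance(), and the readString() signature shown here are all made up for the example. It only assumes that the cached strings are read with DataInputStream.readUTF(), which hands back a new String object on every call.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class InternSketch {

    // Hypothetical stand-in for a cache-reader method with an 'intern' flag.
    // The real RegistryCacheReader method may look different.
    public static String readString(DataInputStream in, boolean intern) throws IOException {
        String s = in.readUTF();          // a fresh String object on every call
        return intern ? s.intern() : s;   // intern() returns the canonical pooled instance
    }

    // Builds an in-memory "cache" holding the same value several times.
    public static DataInputStream cacheWith(String value, int copies) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        for (int i = 0; i < copies; i++) {
            out.writeUTF(value);
        }
        out.flush();
        return new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
    }

    // Reads the same value twice and reports whether both reads share one object.
    public static boolean sameInstance(boolean intern) throws IOException {
        DataInputStream in = cacheWith("org.eclipse.ui.defaultAcceleratorConfiguration", 2);
        String a = readString(in, intern);
        String b = readString(in, intern);
        return a == b;                    // identity comparison, deliberately not equals()
    }

    public static void main(String[] args) throws IOException {
        System.out.println("intern=false, same instance: " + sameInstance(false));
        System.out.println("intern=true,  same instance: " + sameInstance(true));
    }
}
```

Without interning, the two reads yield two separate String objects with identical contents, which is the duplication showing up in the profiler. With interning, both reads return the same object, at the cost of a lookup in the JVM's intern table on every read.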
http://www.eclipse.org/eclipse/development/performance/bloopers.html talks about this a little. It says: "On some JVM implementations the performance of intern() degrades dramatically. Interning the registry strings eagerly and early seeds the intern() table increasing the collision rate". This makes it sound as if, at some point in the past, somebody tried using intern() all the time and didn't like the results. Can anybody shed some light on the design decision not to use intern(), and on whether this caveat still holds?