James Blackburn wrote on 10/16/2008 05:42:40 AM:
> Eclipse then has a org.eclipse.core.internal.utils.StringPoolJob which
> calls 'shareStrings(StringPool)' on participating
> IStringPoolParticipants. Given that StringPool.add() uses a HashMap
> to reimplement String.intern() I wonder what the performance
> difference is between StringPool and String.intern()...
As Markus mentions, String.intern() performance was
known to degrade as the interned pool got very large. This may be improved
in newer VMs. The other problem is that interned strings were traditionally
not garbage collected. There is some debate in the VM community over whether
garbage collection of interned strings is legal according to the language
spec. The Sun VM now garbage collects interned strings, but this isn't
true for other VMs. This lack of garbage collection makes String#intern()
inappropriate for any strings that may not be around forever (such as resource
names, marker attributes, etc). The StringPool concept works a bit differently
from interning: it is used in a background task that periodically walks
over all strings and uniquifies them in the pool. After the pass is completed,
the pool object is discarded. This avoids the extra memory overhead that
a long-lived pool built on weak references would incur.
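
To make the idea concrete, here is a rough sketch of the pass described above. It is not the actual Eclipse code: the names StringPoolSketch, Pool, Participant, Marker and runPass are made up for illustration, and only the general shape (a HashMap-backed add() plus a shareStrings callback) follows the discussion in this thread.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; all names here are hypothetical, not Eclipse API.
final class StringPoolSketch {

    /** Throwaway pool mapping each distinct string value to one canonical instance. */
    static final class Pool {
        private final Map<String, String> canonical = new HashMap<>();
        private int savedChars;

        /** Returns the canonical instance equal to s, registering s if it is the first. */
        String add(String s) {
            if (s == null) {
                return null;
            }
            String existing = canonical.putIfAbsent(s, s);
            if (existing == null) {
                return s; // first occurrence becomes the canonical copy
            }
            if (existing != s) {
                savedChars += s.length(); // a duplicate instance can now be collected
            }
            return existing;
        }

        int getSavedChars() {
            return savedChars;
        }
    }

    /** Anything holding strings that are worth sharing. */
    interface Participant {
        void shareStrings(Pool pool);
    }

    /** Example participant with a single string field. */
    static final class Marker implements Participant {
        String type;

        Marker(String type) {
            this.type = type;
        }

        @Override
        public void shareStrings(Pool pool) {
            type = pool.add(type); // swap our copy for the canonical one
        }
    }

    /** One pass: the pool lives only for the duration of this method. */
    static int runPass(List<? extends Participant> participants) {
        Pool pool = new Pool();
        for (Participant p : participants) {
            p.shareStrings(pool);
        }
        return pool.getSavedChars();
        // pool is unreachable after returning, so it pins nothing in memory
    }

    public static void main(String[] args) {
        // new String(...) forces two distinct instances with equal contents
        Marker a = new Marker(new String("org.example.problem"));
        Marker b = new Marker(new String("org.example.problem"));
        int saved = runPass(Arrays.asList(a, b));
        System.out.println("shared instance: " + (a.type == b.type));
        System.out.println("characters saved: " + saved);
    }
}
```

The point of the design is that the HashMap is garbage as soon as runPass returns. Strings stay alive only through the objects that actually use them, so nothing is pinned the way a permanent intern table would pin it.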
As for the multiple copies, the class is so trivial
it didn't deserve promoting to API. Some better technique may appear down
the road, or String#intern may become a better option, at which point we
can get rid of the copies.