Bug 210993 - [prov] Uniquify strings when parsing
Summary: [prov] Uniquify strings when parsing
Status: RESOLVED FIXED
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: Incubator
Version: 3.4
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: 3.4 M4
Assignee: equinox.incubator-inbox
QA Contact:
URL:
Whiteboard:
Keywords: performance
Depends on:
Blocks:
 
Reported: 2007-11-26 17:17 EST by John Arthorne
Modified: 2007-11-26 17:32 EST

Description John Arthorne 2007-11-26 17:17:43 EST
Our p2 metadata contains a very large number of strings: ids of IUs, required/provided capabilities, property keys/values, touchpoint types, etc. Many of these values are repeated, so simply uniquifying strings while parsing our data files would save a lot of memory.
Comment 1 John Arthorne 2007-11-26 17:32:47 EST
I have released a simple canonicalization of strings in XMLParser. I did a quick benchmark of the admin UI application with a single metadata repository and a profile registry with two profiles, each containing the Eclipse SDK. Before the change, there were 8,909,576 bytes worth of strings (transitive size), and after the change there were 5,081,024 bytes of strings, a reduction of 3,828,552 bytes (about 43%). There is lots more room for optimization, but this was an easy first step (only a couple of lines of code).
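
For readers unfamiliar with the technique, here is a minimal sketch of how string canonicalization during parsing might look. It is not the code released in XMLParser: the class name StringPool and its methods are hypothetical, and it assumes a map-based pool rather than String.intern(). The idea is that every parsed value is looked up in the pool, and equal strings are replaced by the first instance seen, so duplicates can be garbage collected.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not the actual p2 XMLParser code.
final class StringPool {
    // Maps each distinct string value to its single canonical instance.
    private final Map<String, String> pool = new HashMap<>();

    // Returns the canonical instance equal to 'value', adding it if unseen.
    String canonicalize(String value) {
        if (value == null) {
            return null;
        }
        String existing = pool.putIfAbsent(value, value);
        return existing != null ? existing : value;
    }

    // Number of distinct strings currently held by the pool.
    int size() {
        return pool.size();
    }
}
```

A parser would call canonicalize on each attribute value or element text before storing it, for example `id = pool.canonicalize(attributes.getValue("id"))`, where `pool` and `attributes` are assumed to be a StringPool and a SAX Attributes object. Once parsing is done the pool itself can be discarded, which leaves only the shared instances reachable from the parsed model.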