210993 – [prov] Uniquify strings when parsing

Summary: [prov] Uniquify strings when parsing

Status:	RESOLVED FIXED

Alias:	None

Product:	Equinox
Classification:	Eclipse Project
Component:	Incubator
Version:	3.4
Hardware:	PC Windows XP

Importance:	P3 normal
Target Milestone:	3.4 M4
Assignee:	equinox.incubator-inbox
QA Contact:

URL:
Whiteboard:
Keywords:	performance

Depends on:
Blocks:

Reported:	2007-11-26 17:17 EST by John Arthorne
Modified:	2007-11-26 17:32 EST
CC List:	0 users

See Also:

Description John Arthorne

2007-11-26 17:17:43 EST

Our p2 metadata contains a very large number of strings: ids of IUs, required/provided capabilities, property keys/values, touchpoint types, and so on. Simply uniquifying strings while parsing our data files, so that equal values share a single String instance, would save a lot of memory.

Comment 1 John Arthorne

2007-11-26 17:32:47 EST

I have released a simple canonicalization of strings in XMLParser. I did a quick benchmark of the admin UI application with a single metadata repository and a profile registry with two profiles, each containing the Eclipse SDK. Before the change, there were 8,909,576 bytes worth of strings (transitive size), and after the change there were 5,081,024 bytes of strings, a saving of 3,828,552 bytes (roughly 43%). There is lots more room for optimization, but this was an easy first step (only a couple of lines of code).
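
For readers unfamiliar with the technique, here is a minimal Java sketch of string canonicalization during parsing. It is not the code that was released in XMLParser; the class name StringPool and the method canonicalize are hypothetical and used only for illustration. The idea is that every string read from the data file is passed through a pool, so equal values end up sharing one String instance.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not the actual p2 XMLParser code.
final class StringPool {
    // Maps each distinct string value to the single instance kept for it.
    private final Map<String, String> pool = new HashMap<>();

    /**
     * Returns the canonical instance equal to the given value,
     * registering the value itself the first time it is seen.
     */
    String canonicalize(String value) {
        if (value == null) {
            return null;
        }
        String existing = pool.putIfAbsent(value, value);
        return existing != null ? existing : value;
    }

    /** Number of distinct strings currently held by the pool. */
    int size() {
        return pool.size();
    }
}
```

A parser could call canonicalize on each attribute value or element text before storing it in the model. The JDK's String.intern() achieves a similar effect in a single call; a parser-local pool like the one sketched here can instead be discarded once parsing is finished. Which approach the released change takes is not stated in this bug.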