Summary: | Format of variablesAndContainers.dat doesn't scale well | ||||||
---|---|---|---|---|---|---|---|
Product: | [Eclipse Project] JDT | Reporter: | Keith W. Campbell <keithc> | ||||
Component: | Core | Assignee: | Jerome Lanneluc <jerome_lanneluc> | ||||
Status: | CLOSED FIXED | QA Contact: | |||||
Severity: | major | ||||||
Priority: | P3 | CC: | keithc | ||||
Version: | 3.1 | Keywords: | performance | ||||
Target Milestone: | 3.2 M5 | ||||||
Hardware: | All | ||||||
OS: | All | ||||||
Whiteboard: | |||||||
Description
Keith W. Campbell
2005-07-14 11:52:04 EDT
Sorry, I was mistaken: there are (only) N copies of JRE_CONTAINER, not N+1, so we only have to remove N-1 copies. :-)
Created attachment 29895
proposed patch
Here is a patch that improves the format of variablesAndContainers.dat.
Rather than storing as (much repeated) XML, the data are prefixed by
keys that identify repeating values. The result is a much more compact
file that can be read and written much more quickly. Data gathered
from medium-sized sample workspaces follows.
Workspace A with 148 plugin projects:
Load: 5,089 ms -> 114 ms (44.5 times faster)
Save: 1,219 ms -> 67 ms (18.1 times faster)
File: 4,927,707 bytes -> 112,746 bytes (43.7 times smaller)
Workspace B with 323 plugin projects:
Load: 5,891 ms -> 78 ms (75.5 times faster)
Save: 1,308 ms -> 52 ms (25.0 times faster)
File: 5,115,547 bytes -> 155,509 bytes (32.9 times smaller)
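For readers who want to see the general idea behind "keys that identify repeating values", here is a minimal sketch. It is not the code from the attached patch, whose actual layout is not shown in this report. The class name SharedValueCodec, the tag bytes, and the record layout are all assumptions made for illustration. Each distinct string is written in full once; every later occurrence is written as a short numeric key.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not the format used by the actual patch.
public class SharedValueCodec {
    // Tag bytes are assumptions chosen for this sketch.
    private static final byte NEW_VALUE = 0;      // full string follows
    private static final byte REPEATED_VALUE = 1; // int key of an earlier string follows

    // Writes the values, replacing each repeat with the key of its first occurrence.
    public static byte[] write(List<String> values) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            Map<String, Integer> ids = new HashMap<>();
            out.writeInt(values.size());
            for (String value : values) {
                Integer id = ids.get(value);
                if (id == null) {
                    // First occurrence: assign the next key and store the text once.
                    ids.put(value, ids.size());
                    out.writeByte(NEW_VALUE);
                    out.writeUTF(value);
                } else {
                    // Repeat: store only the key.
                    out.writeByte(REPEATED_VALUE);
                    out.writeInt(id);
                }
            }
        }
        return bytes.toByteArray();
    }

    // Reads the values back, resolving keys against strings seen so far.
    public static List<String> read(byte[] data) throws IOException {
        List<String> values = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            List<String> seen = new ArrayList<>();
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                byte tag = in.readByte();
                if (tag == NEW_VALUE) {
                    String value = in.readUTF();
                    seen.add(value);
                    values.add(value);
                } else if (tag == REPEATED_VALUE) {
                    values.add(seen.get(in.readInt()));
                } else {
                    throw new IOException("Unknown tag: " + tag);
                }
            }
        }
        return values;
    }
}
```

With N projects that all reference the same container path, a scheme like this stores the path text once and N-1 small keys, which is consistent with the kind of size reduction reported above, though the real savings depend on the actual format.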
Let's consider inclusion for M5.
Going even further: sharing individual data within entries might be even more valuable. Thinking, for instance, of access rules.
Access rules for a given project should all be distinct, and projects normally don't have overlapping package names, so there should be no duplicate AccessRule objects to worry about. I did have a mechanism at one point to handle things like that, but it affected performance, and if my assumptions above are valid there would be no benefit.
Thanks very much for the patch. Released it to HEAD with minor edits (in particular, I added back the check that avoids leaking containers for projects that no longer exist).
Code verified for 3.2 M5 using build I20060214-0010. Closing.