Bug 512913 - [newindex] Compress strings containing common prefixes
Status: NEW
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core
Version: 4.6
Hardware: PC Linux
Importance: P3 enhancement
Target Milestone: ---
Assignee: JDT-Core-Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-03-01 14:15 EST by Stefan Xenos CLA
Modified: 2018-05-10 13:43 EDT
CC List: 2 users

See Also:


Attachments

Description Stefan Xenos CLA 2017-03-01 14:15:26 EST
Currently about 30% of the index database consists of short strings. Most of those strings are Java identifiers that have common prefixes. We should use some sort of compression scheme that avoids storing the common prefixes more than once.

I'd suggest using a trie search structure. This would embed the strings implicitly within the search structure itself and would merge common prefixes, so each shared prefix is stored only once.
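
To illustrate the idea, here is a minimal in-memory sketch of a trie that stores each shared prefix once. It is not the index's actual on-disk format: the class name PrefixTrieSketch and its methods are hypothetical, and a real implementation would have to account for per-node overhead in the database, which this sketch ignores.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of JDT or its index database.
class PrefixTrieSketch {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean terminal; // true if a stored string ends at this node
    }

    private final Node root = new Node();
    private int nodeCount;  // nodes created, excluding the root
    private int totalChars; // characters passed to insert(), for comparison

    // Adds a string, creating nodes only for the part not already shared.
    void insert(String s) {
        Node current = root;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            Node next = current.children.get(c);
            if (next == null) {
                next = new Node();
                current.children.put(c, next);
                nodeCount++;
            }
            current = next;
        }
        current.terminal = true;
        totalChars += s.length();
    }

    // Returns true only if exactly this string was inserted earlier.
    boolean contains(String s) {
        Node current = root;
        for (int i = 0; i < s.length(); i++) {
            current = current.children.get(s.charAt(i));
            if (current == null) {
                return false;
            }
        }
        return current.terminal;
    }

    int nodeCount() {
        return nodeCount;
    }

    int totalChars() {
        return totalChars;
    }

    public static void main(String[] args) {
        PrefixTrieSketch trie = new PrefixTrieSketch();
        trie.insert("getName");
        trie.insert("getNamespace");
        trie.insert("getValue");
        System.out.println(trie.nodeCount() + " nodes for "
                + trie.totalChars() + " characters");
    }
}
```

With these three identifiers the trie needs 17 character nodes for 27 characters of input, because "getName" and "get" are each stored once. Whether that translates into real savings depends on how many bytes each node costs in the database.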

Assuming this saves half the space used by short strings, it would reduce the total database size by about 15% (half of the 30% the strings occupy now).

Note: this is a low priority item. We should only look into it after we've fixed all the index-related UI freezes and fragmentation issues.
Comment 1 Stefan Xenos CLA 2017-03-01 14:16:27 EST
I'll assign a milestone after all the higher-priority work is done.