512913 – [newindex] Compress strings containing common prefixes

Status:	NEW

Alias:	None

Product:	JDT
Classification:	Eclipse Project
Component:	Core
Version:	4.6
Hardware:	PC Linux

Importance:	P3 enhancement
Target Milestone:	---
Assignee:	JDT-Core-Inbox
QA Contact:

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2017-03-01 14:15 EST by Stefan Xenos
Modified:	2018-05-10 13:43 EDT
CC List:	2 users

See Also:

Description Stefan Xenos

2017-03-01 14:15:26 EST

Currently about 30% of the index database consists of short strings. Most of those strings are Java identifiers that have common prefixes. We should use some sort of compression scheme that avoids storing the common prefixes more than once.

I'd suggest using a trie as the search structure. This would embed the strings implicitly within the search structure and merge their common prefixes, so each shared prefix is stored only once.
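
To illustrate how a trie merges common prefixes, here is a minimal in-memory sketch. It is not the index database's actual on-disk layout: the class name PrefixTrie and its methods add, contains and storedChars are hypothetical and exist only for this example. It stores one node per character, so it demonstrates the sharing but not a compact encoding.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not part of the JDT index code. */
public class PrefixTrie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean terminal; // true if a stored string ends at this node
    }

    private final Node root = new Node();
    private int storedChars = 0; // characters actually stored (one per node)

    /** Inserts a string, reusing the nodes of any prefix already present. */
    public void add(String s) {
        Node current = root;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            Node next = current.children.get(c);
            if (next == null) {
                next = new Node();
                current.children.put(c, next);
                storedChars++; // only unshared characters cost space
            }
            current = next;
        }
        current.terminal = true;
    }

    /** Returns true only for complete strings that were added. */
    public boolean contains(String s) {
        Node current = root;
        for (int i = 0; i < s.length(); i++) {
            current = current.children.get(s.charAt(i));
            if (current == null) {
                return false;
            }
        }
        return current.terminal;
    }

    /** Number of characters stored after prefix merging. */
    public int storedChars() {
        return storedChars;
    }
}
```

Adding "getName", "getNamespace" and "getNumber" (28 characters in total) stores only 17 characters here, because "getName" and "getN" are shared. A real implementation would also need a compact node representation in the database, since one object per character would cost far more than it saves.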

Assuming this saves half the memory used by short strings, it would save us about 15% of the total database size (half of the roughly 30% that short strings occupy).

Note: this is a low priority item. We should only look into it after we've fixed all the index-related UI freezes and fragmentation issues.

Comment 1 Stefan Xenos

2017-03-01 14:16:27 EST

I'll assign a milestone after all the higher-priority work is done.