Bug 564327

Summary: Position handling in document is a bottle neck for large docs with many positions
Product: [Eclipse Project] Platform Reporter: Sebastian Zarnekow <sebastian.zarnekow>
Component: TextAssignee: Platform-Text-Inbox <platform-text-inbox>
Status: NEW --- QA Contact:
Severity: normal    
Priority: P3    
Version: 4.16   
Target Milestone: ---   
Hardware: PC   
OS: Mac OS X   
Whiteboard:
Attachments:
Description Flags
Test case none

Description Sebastian Zarnekow CLA 2020-06-16 05:34:06 EDT
Created attachment 283296 [details]
Test case

Large documents that a large number of positions will suffer from poor editing experience due to the position handling in the document.

The document maintains two sorted lists (by start offset, by end offset) of positions what are populated one-by-one and the lookup has basically a fall-back to a complete scan of the lists.
While these lists / the algorithm serves its purpose, it imposes also some performance penalties that can be avoided. The attached test illustrates the runtime behaviour with large documents (numbers below).

I'd like to propose an extension to the IPositionUpdater that basically shifts the entire responsibility for storing / maintaining the positions of a given category into the updater. Implementations, that guarantee non-overlapping positions (e.g. the highlighter in JDT) can exploit more efficient data structures and algorithms to update the positions on document changes.
It would also be worthwhile to have IDocument.addPositions(String, List<Position>) and IDocument.removePositions respectively to allow to trigger bulk changes and avoid the unnecessary costs of repetitive insertion sorting into the positions lists.

I'm willing to work on this and provide a Gerrit, if you consider this interesting.

The promised numbers to illustrate the urgency of this.

#of positions|category|action
50000|highlighting|Add (ms): 31
50000|highlighting|Replace (ms): 9
50000|highlighting|Remove (ms): 256
290164|highlighting|Add (ms): 70
290164|highlighting|Replace (ms): 7
290164|highlighting|Remove (ms): 11497
50000|__dflt_position_category|Add (ms): 20
50000|__dflt_position_category|Replace (ms): 102
50000|__dflt_position_category|Remove (ms): 518
290164|__dflt_position_category|Add (ms): 58
290164|__dflt_position_category|Replace (ms): 4009
290164|__dflt_position_category|Remove (ms): 15886
50000|highlighting|Add (ms): 13
50000|highlighting|Replace (ms): 5
50000|highlighting|Re-Add (ms): 10
290164|highlighting|Add (ms): 100
290164|highlighting|Replace (ms): 3
290164|highlighting|Re-Add (ms): 32

Please note that the position update is triggered synchronously with the document event on the UI thread.
- Add is a loop that adds # positions.
- Replace is a document change that deletes 50% of the document content.
- Remove is a loop over all positions to remove the deleted positions from the document.
- Re-Add: same effect as "Remove", but remove the entire category and re-add only the positions, that are not deleted. I put this here to show the potential that even the most naive, custom position management has on the runtime.