Bug 389281 - Creating linked folder with many files in flat hierarchy is slow.
Summary: Creating linked folder with many files in flat hierarchy is slow.
Status: VERIFIED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Resources (show other bugs)
Version: 3.7.2   Edit
Hardware: All All
: P3 normal (vote)
Target Milestone: 4.3 M2   Edit
Assignee: John Arthorne CLA
QA Contact:
URL:
Whiteboard:
Keywords: performance
Depends on:
Blocks:
 
Reported: 2012-09-11 11:51 EDT by Thomas Wolf CLA
Modified: 2012-09-28 11:07 EDT (History)
0 users

See Also:


Attachments
Patch for a proposed fix (4.20 KB, patch)
2012-09-11 11:51 EDT, Thomas Wolf CLA
no flags Details | Diff
Patch against master fixing a minor oversight in a condition (1.62 KB, patch)
2012-09-17 02:47 EDT, Thomas Wolf CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Thomas Wolf CLA 2012-09-11 11:51:27 EDT
Created attachment 220939 [details]
Patch for a proposed fix

Noticed when we added a linked folder containing two subdirectories, one with 20,000+ files, the other with 5,000+ files. UI is blocked in folder refresh for several (about 5 to 6) seconds. That may not appear too much, but is already not acceptable for our customers.

Analysis turned up that Eclipse calls

AbstractDataTreeNode.assembleWith(AbstractDataTreeNode[] oldNodes, AbstractDataTreeNode[] newNodes, boolean keepDeleted)

repeatedly for each file, with oldNodes the files already added to the tree, and newNodes containing exactly one file: the next one to be added. That gives an overall performance of O((n+1)*n/2), i.e. quadratic.

The attached patch fixes this by not going linearly through oldNodes. Instead, it figures out where the next node from newNodes would need to be inserted, and the bulk copies the "skipped" part of oldNodes, resulting in O(n*log(n)) for this case.

Tested this with a linked folder containing two subfolders with 10,000 empty files each. The original algorithm ends up doing about 100,000,000 string comparisons in assembleWith() during the refresh of that linked folder (as expected). The modified algorithm brings this down by a factor of 400 to 250,000 (also as expected). The UI is no longer blocked, refresh of that linked folder runs through smoothly in less than a second.

As I see that AbstractDataTreeNode.assembleWith() hasn't changed in master with respect to 3.7, I suspect this applies to all Eclipse versions.
Comment 1 John Arthorne CLA 2012-09-11 15:07:40 EDT
Thank you for the bug report and patch. Your analysis confused me at first because it is still a linear copy operation. However I realize now you're just looking at complexity of string comparisons. I applied the patch. There were no test failures, and I could notice a 3-4 second time improvement with a wall-clock measurement of creating a link to a folder with 10,000 files in it.

Released to master:

http://git.eclipse.org/c/platform/eclipse.platform.resources.git/commit/?id=3e00d5c7b5f2ea672faf60c276d981015525bf79
Comment 2 Thomas Wolf CLA 2012-09-17 02:47:20 EDT
Created attachment 221135 [details]
Patch against master fixing a minor oversight in a condition

There was a minor oversight in my original patch; nothing serious, but it makes it abandon the binary search branch a bit too early. This patch fixes that.
Comment 3 Thomas Wolf CLA 2012-09-17 02:51:50 EDT
Reopened for consideration of the additional patch.
Comment 4 John Arthorne CLA 2012-09-18 11:19:17 EDT
Pushed to master. Thanks again for your patch.

http://git.eclipse.org/c/platform/eclipse.platform.resources.git/commit/?id=1ee7648e5251c89243cc020a6f22e0cf3d1b73cb
Comment 5 John Arthorne CLA 2012-09-28 11:07:45 EDT
Verified in Kepler M2.