Community
Participate
Working Groups
Created attachment 136631 [details] WordRule performance patch The WordRule class is used by various text editors for partitioning and colouring. It is also used by some incremental builders for parsing the files. I've found a major performance problem when using the class in case-insensitive mode. The problem arises from this block of code: if (token == null && fIgnoreCase) { Iterator iter= fWords.keySet().iterator(); while (iter.hasNext()) { String key= (String)iter.next(); if(buffer.equalsIgnoreCase(key)) { token= (IToken)fWords.get(key); break; } } } So for every word in the rule .equalsIgnoreCase() is called, which will convert the key chars to lowercase before comparing. As you can imagine, this is a very slow piece of code when you have a lot of words to iterate over, and does not scale well compared to a map. In the attached patch, I convert any added words to lower case as they are initially added, and then convert the buffer to lower case (just once). This means that we can remove the above code altogether, in favour of using a lookup on the hashmap as is done for normal case-sensitive words. I have also taken the opportunity to replace StringBuffer with the more efficient StringBuilder. In my testing, this simple change increased the performance of my document parsing by a factor of 30!! That time includes reading the file, converting to a document, applying the rules and then printing out to the console on a sample of 300 source files. I admit that my case may be unusual as I have a couple of hundred case-insensitive words, but I definitely think this is a patch worth applying pre-3.5 release, as it does not affect the API anyway and the performance gain is so enormous.
>I have also taken the opportunity to replace StringBuffer with the more >efficient StringBuilder. Unfortunately this is a no go as the bundle must run on 1.4. Can you please attach a performance test case for this.
Created attachment 136639 [details] Performance test
Its not a test as such, but you can see how to make it into one. I called the new implementation MyWordRule. You can vary the parameters, the current settings are: - 100,000 line document with every line the being same 6-character word that will be found - 200 words added to the rule - 10 iterations to average over With these settings, I get the following result on my PC: Average of 10 iterations: Existing implementation: 3827ms New implementation: 307ms i.e. the new implementation is easily an order of magnitude faster.
Created attachment 136640 [details] Test class Corrected a mistake in the word detector. It should find the word, so allow both letters and digits. It is interesting to vary the wordInDocument paramenter from AtEsT0 to AtEsT+(numWords-1). This makes a huge difference to the existing implementation, but little difference to the new implementation because it uses a map. "AtEsT0": Existing implementation: 917ms New implementation: 85ms "AtEsT199": Existing implementation: 1610ms New implementation: 105ms
Doug, please make sure to attach a patch that doesn't use StringBuilder and that the org.eclipse.jface.text.tests.rules.WordRuleTest (in 'org.eclipse.jface.text.tests' plug-in) pass.
Created attachment 136645 [details] WordRule performance patch Attached version without using StringBuilder. Sorry, can't run the tests as I have an old environment (3.3) and it will be a real pain to install all the latest Eclipse at the moment. The patch is very small though so you should be able to try it yourself very easily if you have the right env.
Just checking, what is the progress with this? I can think of one very popular language that might get quite a performance boost from this patch! Thanks.
It's tagged 3.6, so I'll look at it sometime during this release. However, currently I'm busy doing other stuff.
Good patch Doug! I committed it to HEAD with some minor changes: - removed unused java.util.Iterator import - removed space before assignment operator - added you to the copyright notice: Doug Satchwell <doug.satchwell@ymail.com> - [implementation] Performance issue with jface text WordRule - http://bugs.eclipse.org/277299 and added 'org.eclipse.jdt.text.tests.performance.WordRulePerformanceTest' based on your Test class.