By accident we were creating a job with a long name (3000 chars). This dramatically slowed down the system. The culprit is ProgressManagerUtil.shortenText(). The algorithm to compute the short text assumes that the text is already short and performs an expensive gc.textExtent() in a loop.
As per http://wiki.eclipse.org/Platform_UI/Bug_Triage_Change_2009
Wow, yeah. The algorithm in shortenText looks really inefficient. I can think of a few options for speeding it up:
1. Do a binary search on the string size rather than a linear search.
2. Use textExtent to measure the character widths (rather than the width of the entire string) and estimate the string size by taking their sum. Then correct the result afterward.
Note that the inner loop also calls linear-time operations like substring. We should also try to avoid that.
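Option 1 can be sketched as follows. This is only an illustration, not the actual patch: measureWidth and CHAR_WIDTH are hypothetical stand-ins for gc.textExtent(...).x (assuming a fixed per-character width), and the class name ShortenSketch is made up.

```java
public class ShortenSketch {
    static final int CHAR_WIDTH = 7; // hypothetical uniform pixel width per char

    // Stand-in for gc.textExtent(s).x; real code would query SWT here.
    static int measureWidth(String s) {
        return s.length() * CHAR_WIDTH;
    }

    /** Largest character count whose rendered width fits in maxWidth. */
    static int maxCharsThatFit(String text, int maxWidth) {
        int lo = 0, hi = text.length();
        while (lo < hi) {
            int mid = (lo + hi + 1) / 2; // bias upward so the loop terminates
            if (measureWidth(text.substring(0, mid)) <= maxWidth) {
                lo = mid;  // prefix of mid chars fits: search the upper half
            } else {
                hi = mid - 1; // too wide: search the lower half
            }
        }
        return lo; // O(log n) measurement calls instead of O(n)
    }
}
```

The point of the sketch is the call count: shortening a string of n characters takes O(log n) width measurements instead of one per removed character.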
Mass move to 4.7 as M7 is approaching. Please move back in case you are planning to fix it for Neon.
Mikael, something for you?
Ok, the algorithm is not the best I've ever seen, but the initial report was about a significant slowdown when the job name is long. With the current version of shortenText, as soon as the name is longer than what can be displayed, the cost is more or less the same (be the name 10x too long or 1000x too long).

(In reply to Stefan Xenos from comment #2)
> I can think of a few options for speeding it up:
> 1. Do a binary search on the string size rather than a linear search.

Agreed, but is it worth the cost?

> 2. Use textExtent to measure the character widths (rather than the width of
> the entire string) and estimate the string size by taking their sum. Then
> correct the result afterward.

The character widths can also be measured with gc.getFontMetrics().getAverageCharWidth(). Also, the average char width is already approximated to determine the number of characters that will need to be dropped (charsToClip). Retaining only 95% of this number may be a bit too conservative. We can tweak it to 98 or 99, and it will prevent a lot of the loops.

> Note that the inner loop also calls linear-time operations like substring.
> We should also try to avoid that.

The best thing I can think of would be to do the concatenation and the substring at once. I don't know how to avoid a linear operation here, as gc.textExtent takes a string. That is, change

    String s1 = textValue.substring(0, start);
    String s2 = textValue.substring(end, length);
    String s = s1 + ellipsis + s2;

to

    String s = new StringBuilder(start + ellipsis.length() + (length - end) + 1)
            .append(textValue, 0, start)
            .append(ellipsis)
            .append(textValue, end, length)
            .toString();
> Agreed, but is it worth the cost?

What sort of cost concerns you? It should be faster in CPU time.
I'm skeptical about whether or not the suggestions in comment 5 would be sufficient to fix this... but a benchmark would answer the question unambiguously.
(In reply to Stefan Xenos from comment #7)
> I'm skeptical about whether or not the suggestions in comment 5 would be
> sufficient to fix this... but a benchmark would answer the question
> unambiguously.

I think this is not sufficient; here is my little test:

    package org.eclipse.ui.internal.progress;

    import org.eclipse.swt.SWT;
    import org.eclipse.swt.widgets.Display;
    import org.eclipse.swt.widgets.Shell;
    import org.eclipse.swt.widgets.Text;

    public class ShortPerformanceTest {
        public static void main(String[] args) {
            Shell shell = new Shell(Display.getDefault());
            Text text = new Text(shell, SWT.NONE);
            StringBuilder b = new StringBuilder();
            for (int i = 0; i < 20000; i++) {
                b.append("abcdef"); //$NON-NLS-1$
            }
            for (int i = 0; i < 5; i++) {
                long start = System.currentTimeMillis();
                ProgressManagerUtil.shortenText(b.toString(), text);
                long end = System.currentTimeMillis();
                System.out.println("Duration: " + (end - start)); //$NON-NLS-1$
            }
        }
    }

Here are the results; the suggested change did not improve the runtime:

Old:
Duration: 4914
Duration: 4220
Duration: 3996
Duration: 4285
Duration: 4660

Mikael's suggestion:
Duration: 4277
Duration: 5047
Duration: 4312
Duration: 4547
Duration: 4538
Lars, replacing

    int charsToClip = Math.round(0.95f * length * (1 - ((float) maxWidth / maxExtent)));

by

    int charsToClip = Math.round(0.98f * length * (1 - ((float) maxWidth / maxExtent)));

alone gives me the following results:

Before:
Duration: 2781
Duration: 2669
Duration: 2666
Duration: 2671
Duration: 2674

After:
Duration: 676
Duration: 562
Duration: 560
Duration: 563
Duration: 554

5x faster, I call it an improvement ;) However, I agree that using the StringBuilder does not really help.
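To see why nudging the factor matters so much, here is a small arithmetic sketch of the charsToClip formula. The input numbers are invented for illustration, and ClipEstimate is a hypothetical helper, not code from the patch.

```java
public class ClipEstimate {
    // charsToClip estimates how many characters must be dropped so that the
    // remainder fits. length: string length in chars; maxWidth: available
    // pixels; maxExtent: pixel width of the full string.
    static int charsToClip(float factor, int length, int maxWidth, int maxExtent) {
        return Math.round(factor * length * (1 - ((float) maxWidth / maxExtent)));
    }

    public static void main(String[] args) {
        // Hypothetical numbers: a 120000-char string, 500px of room,
        // 840000px total extent.
        int length = 120000, maxWidth = 500, maxExtent = 840000;
        int clip95 = charsToClip(0.95f, length, maxWidth, maxExtent);
        int clip98 = charsToClip(0.98f, length, maxWidth, maxExtent);
        // The gap is the number of extra characters the old loop still has to
        // remove two at a time, i.e. thousands of extra textExtent calls.
        System.out.println(clip98 - clip95);
    }
}
```

Under these assumed numbers the 0.95 factor leaves several thousand characters for the loop to strip off one pair at a time, which is where the measured seconds go.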
(In reply to Mikaël Barbero from comment #9)
> alone gives me the following results:

Cool, can you upload a Gerrit for the change?
I've done more testing to make sure that the change does not change the behavior. So far it looks good, and we can even go further on performance improvements. By setting the ratio to 0.9995f, here are the results of my testing (stats are in milliseconds):

>> Test with string.length=20
Stats for v1 Stats{average=10.30, min=0, max=95}
Stats for v2 Stats{average=0.90, min=0, max=5}
>> Test with string.length=100
Stats for v1 Stats{average=0.50, min=0, max=1}
Stats for v2 Stats{average=0.30, min=0, max=1}
>> Test with string.length=200
Stats for v1 Stats{average=1.50, min=0, max=3}
Stats for v2 Stats{average=0.70, min=0, max=3}
>> Test with string.length=3000
Stats for v1 Stats{average=15.40, min=11, max=32}
Stats for v2 Stats{average=1.40, min=1, max=2}
>> Test with string.length=30000
Stats for v1 Stats{average=241.00, min=232, max=255}
Stats for v2 Stats{average=9.00, min=8, max=11}
>> Test with string.length=300000
Stats for v1 Stats{average=15870.30, min=14967, max=17590}
Stats for v2 Stats{average=95.20, min=82, max=117}
>> Test with WS string.length=14
Stats for v1 Stats{average=0.30, min=0, max=1}
Stats for v2 Stats{average=0.20, min=0, max=1}
>> Test with WS string.length=444
Stats for v1 Stats{average=1.80, min=1, max=3}
Stats for v2 Stats{average=0.80, min=0, max=1}
>> Test with WS string.length=2220
Stats for v1 Stats{average=10.30, min=8, max=14}
Stats for v2 Stats{average=2.20, min=2, max=3}
>> Test with WS string.length=44400
Stats for v1 Stats{average=589.70, min=502, max=695}
Stats for v2 Stats{average=14.00, min=13, max=15}
>> Test with WS string.length=444000
Stats for v1 Stats{average=36970.40, min=36076, max=38824}
Stats for v2 Stats{average=143.10, min=136, max=152}

As you can see, for very long strings, the improvement is up to 250x (without regressions in behavior). Patch is coming.
New Gerrit change created: https://git.eclipse.org/r/84824
Created attachment 265304 [details] Test class
The performance test provided by Lars produces artificially optimistic results since it uses a 0-width text box and a string that is very uniform. Here's a better test that uses an asymmetric string and a text box with nonzero size:

    package org.eclipse.ui.internal.progress;

    import org.eclipse.swt.SWT;
    import org.eclipse.swt.widgets.Display;
    import org.eclipse.swt.widgets.Shell;
    import org.eclipse.swt.widgets.Text;

    public class ShortPerformanceTest {
        static final int iterations = 300;

        public static void main(String[] args) {
            Shell shell = new Shell(Display.getDefault());
            Text text = new Text(shell, SWT.NONE);
            text.setBounds(0, 0, 1001, 1000);
            StringBuilder b = new StringBuilder();
            for (int i = 0; i < 20000; i++) {
                b.append("abcdef"); //$NON-NLS-1$
            }
            for (int i = 0; i < 20000; i++) {
                b.append("iiiii"); //$NON-NLS-1$
            }
            for (int i = 0; i < iterations; i++) {
                long start = System.currentTimeMillis();
                ProgressManagerUtil.shortenText(b.toString(), text);
                long end = System.currentTimeMillis();
                System.out.println("Duration(1): " + (end - start)); //$NON-NLS-1$
            }
        }
    }
New Gerrit change created: https://git.eclipse.org/r/84851
I've attached an alternative implementation that uses Newton's method to compute the size of the string. It has several benefits over the earlier attachment:
1. It converges much faster (3 iterations in my tests rather than 53).
2. It can handle strings where the initial guess is an underestimate (unlike the current implementation, which needs to overestimate on its first guess).
3. It can produce tighter bounds (since it grows by 1 character each iteration rather than 2).
Unfortunately, it's only about 6% faster than the first attachment since, in both cases, the majority of the runtime goes into the very first call to textExtent. We could eliminate that first expensive call by doing a one-time guess of the string size (based on the average character width in the font) and then measuring the string produced by that operation. After that, we could either go directly into Newton's method or repeatedly double the string length. I'll pick this up another night.
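The Newton-style iteration described above can be sketched roughly as follows. This is not the attached patch: measureWidth and CHAR_WIDTH are stand-ins for gc.textExtent with an assumed uniform character width, and NewtonShortenSketch is a hypothetical name. With a real variable-width font the loop would take a few iterations; with this uniform stub it converges almost immediately.

```java
public class NewtonShortenSketch {
    static final int CHAR_WIDTH = 7; // assumed uniform width for the stub

    // Stand-in for gc.textExtent(s).x; real code would query SWT here.
    static int measureWidth(String s) {
        return s.length() * CHAR_WIDTH;
    }

    /**
     * Newton-style refinement: measure the current guess, treat the observed
     * width-per-char as the local slope, and rescale the guess toward the
     * target width. Works whether the initial guess over- or underestimates.
     */
    static int charsThatFit(String text, int maxWidth) {
        int guess = Math.min(text.length(), Math.max(1, maxWidth / CHAR_WIDTH));
        for (int i = 0; i < 10; i++) { // expected to converge in a few steps
            int width = measureWidth(text.substring(0, guess));
            if (width == 0) {
                break;
            }
            // Rescale by the ratio target/actual (Newton step on width(n) ~ c*n).
            int next = Math.min(text.length(),
                    Math.max(1, guess * maxWidth / width));
            if (next == guess) {
                break; // converged
            }
            guess = next;
        }
        // Final downward correction so the result is guaranteed to fit.
        while (guess > 0 && measureWidth(text.substring(0, guess)) > maxWidth) {
            guess--;
        }
        return guess;
    }
}
```

The design point is that each iteration uses the measurement it just paid for to jump close to the answer, instead of stepping one or two characters at a time.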
(In reply to Stefan Xenos from comment #14)
> The performance test provided by Lars produces artificially optimistic
> results since it uses a 0-width text box and a string that is very uniform.

Agreed. The class provided in comment #13 was fixing it (by setting the shell size and using a FillLayout). Results in comment #11 were built with this class. Find below a comparison between your patch and mine on this benchmark:

>> Test with string.length=20
Stats for patch 84824 Stats{average=11.10, min=0, max=107} <- this one is weird. JIT warmup?
Stats for patch 84851 Stats{average=1.10, min=0, max=4}
>> Test with string.length=100
Stats for patch 84824 Stats{average=0.30, min=0, max=1}
Stats for patch 84851 Stats{average=0.40, min=0, max=1}
>> Test with string.length=200
Stats for patch 84824 Stats{average=0.70, min=0, max=2}
Stats for patch 84851 Stats{average=1.80, min=1, max=5}
>> Test with string.length=3000
Stats for patch 84824 Stats{average=3.00, min=2, max=5}
Stats for patch 84851 Stats{average=3.00, min=1, max=5}
>> Test with string.length=30000
Stats for patch 84824 Stats{average=10.70, min=9, max=13}
Stats for patch 84851 Stats{average=8.40, min=8, max=9}
>> Test with string.length=300000
Stats for patch 84824 Stats{average=87.70, min=82, max=99}
Stats for patch 84851 Stats{average=79.00, min=73, max=87}
>> Test with WS string.length=14
Stats for patch 84824 Stats{average=0.50, min=0, max=2}
Stats for patch 84851 Stats{average=0.30, min=0, max=1}
>> Test with WS string.length=444
Stats for patch 84824 Stats{average=0.90, min=0, max=2}
Stats for patch 84851 Stats{average=1.00, min=1, max=1}
>> Test with WS string.length=2220
Stats for patch 84824 Stats{average=1.50, min=1, max=2}
Stats for patch 84851 Stats{average=1.60, min=1, max=3}
>> Test with WS string.length=44400
Stats for patch 84824 Stats{average=15.70, min=14, max=18}
Stats for patch 84851 Stats{average=14.50, min=13, max=17}
>> Test with WS string.length=444000
Stats for patch 84824 Stats{average=146.00, min=138, max=164}
Stats for patch 84851 Stats{average=122.90, min=118, max=133}

Note that there is a small regression in all cases that use "Lorem ipsum" (Test with WS):

Before and with my patch:
"Lorem ipsum dolor sit amet, consectetur adipiscing elit, s...culpa qui officia deserunt mollit anim id est laborum"

With your patch:
"Lorem ipsum dolor sit amet, consectetur adipiscing elit, s... culpa qui officia deserunt mollit anim id est laborum"

Note the space after the ellipsis. I did not determine whether this extra space introduces glitches or whether it was a bad behavior of the initial algorithm. Besides that, your patch is much sounder, performs better, and is more open to improvement (with usage of the average char width). I will abandon mine.
Created attachment 265323 [details] Updated test class
Results with the latest patch:

>> Test with string.length=20
Stats for v1 Stats{average=8.20, min=0, max=81}
Stats for v2 Stats{average=0.10, min=0, max=1}
>> Test with string.length=100
Stats for v1 Stats{average=0.40, min=0, max=1}
Stats for v2 Stats{average=0.20, min=0, max=1}
>> Test with string.length=100
Stats for v1 Stats{average=0.10, min=0, max=1}
Stats for v2 Stats{average=0.20, min=0, max=1}
>> Test with string.length=200
Stats for v1 Stats{average=0.40, min=0, max=1}
Stats for v2 Stats{average=0.40, min=0, max=1}
>> Test with string.length=3000
Stats for v1 Stats{average=0.60, min=0, max=1}
Stats for v2 Stats{average=1.40, min=1, max=2}
>> Test with string.length=30000
Stats for v1 Stats{average=0.40, min=0, max=1}
Stats for v2 Stats{average=12.30, min=12, max=13}
>> Test with string.length=660000
Stats for v1 Stats{average=1.60, min=1, max=2}
Stats for v2 Stats{average=309.00, min=308, max=313}
>> Test with string.length=660000
Stats for v1 Stats{average=1.50, min=1, max=2}
Stats for v2 Stats{average=277.90, min=263, max=314}
>> Test with string.length=140000
Stats for v1 Stats{average=0.60, min=0, max=1}
Stats for v2 Stats{average=60.30, min=58, max=69}
>> Test with string.length=140000
Stats for v1 Stats{average=1.40, min=1, max=3}
Stats for v2 Stats{average=71.20, min=66, max=78}
>> Test with WS string.length=14
Stats for v1 Stats{average=0.10, min=0, max=1}
Stats for v2 Stats{average=0.10, min=0, max=1}
>> Test with WS string.length=444
Stats for v1 Stats{average=0.50, min=0, max=1}
Stats for v2 Stats{average=0.30, min=0, max=1}
>> Test with WS string.length=2220
Stats for v1 Stats{average=0.40, min=0, max=1}
Stats for v2 Stats{average=0.90, min=0, max=1}
>> Test with WS string.length=44400
Stats for v1 Stats{average=0.40, min=0, max=1}
Stats for v2 Stats{average=16.50, min=16, max=17}
>> Test with WS string.length=444000
Stats for v1 Stats{average=0.60, min=0, max=1}
Stats for v2 Stats{average=160.40, min=157, max=167}
Note that in comment 19 v1 is my new patch and v2 is the patch you attached earlier... so both are expected to outperform the code in master now. As you noted, the very first result should be discarded due to JIT issues. Even in the most extreme cases, this new algorithm runs in under 2ms, which beats the code in master by well over 100x.
Gerrit change https://git.eclipse.org/r/84851 was merged to [master]. Commit: http://git.eclipse.org/c/platform/eclipse.platform.ui.git/commit/?id=62b818369a1dfc551ef1ffdb102760b7087a8b6c