Community
Participate
Working Groups
Created attachment 228517 [details] Screen shot I ran a profiling to use the SaxContextParser and found "SimpleDateFormat" is created every time a start element is hit. That takes about 50% of the computing time for the SaxContextParser. See attached picture. This should be filed as a subtask of 356109
Created attachment 228518 [details] Patch for bug Fix for bug.
Hi Casey, thanks for the patch! Would you mind pushing it to Gerrit [1]? [1] http://wiki.eclipse.org/Mylyn_Contributor_Reference#Reviews
What version of Mylyn did you profile? I ask because I see "ca.ubc.cs.mylyn" in the screenshot. Also, note that SimpleDateFormat is not threadsafe so we'd need to make sure the field is only accessed by one thread.
I profiled the master from git.
Casey's review: https://git.eclipse.org/r/#/c/15386/ (In reply to comment #3) > Also, note that SimpleDateFormat is not threadsafe so > we'd need to make sure the field is only accessed by one thread. This could be accomplished by using org.apache.commons.lang.time.FastDateFormat, but only if used for formatting. See https://git.eclipse.org/r/#/c/15272/1/org.eclipse.mylyn.reviews.ui/src/org/eclipse/mylyn/internal/reviews/ui/providers/ReviewsLabelProvider.java for an example
(In reply to comment #3) > I see "ca.ubc.cs.mylyn" in > the screenshot. I am still very curious about this. I assume that was the package name used before Mylyn became an Eclipse project.
I am using mylyn core in my own Ph.D. research project. I am a student at UBC, so it is labeled very similar to the original project name. My research project needs to open about 5000 context files. It was taking too long. So I profiled and found the biggest sink and fixed it.
Do you want me to make this update? (In reply to comment #5) > Casey's review: https://git.eclipse.org/r/#/c/15386/ > > (In reply to comment #3) > > Also, note that SimpleDateFormat is not threadsafe so > > we'd need to make sure the field is only accessed by one thread. > > This could be accomplished by using > org.apache.commons.lang.time.FastDateFormat, but only if used for > formatting. See > https://git.eclipse.org/r/#/c/15272/1/org.eclipse.mylyn.reviews.ui/src/org/ > eclipse/mylyn/internal/reviews/ui/providers/ReviewsLabelProvider.java for an > example
I see, thanks for the explanation. It's great that this improves the performance; it would be a valuable contribution. But I don't think it's guaranteed to be accessed from only one thread at a time. It's not clear to me that we can use FastDateFormat because there's a "slight" incompatibility with SimpleDateFormat to do with time zones, and I don't know what "Only formatting is supported" means. Unless we can make sure that FastDateFormat will work here, I think we should either synchronize access to the formatter by using a synchronized get method or use a ThreadLocal field. The latter might be faster. Tomek, what do you think? I don't know how much this will impact the performance improvement but we'd need to make sure there was still an improvement after making that change. I wonder if there are other places where we could make the same optimization?
How do I show there is still an improvement after making the change? When I made the change to my own project, the profiler didn't show any percentage SimpleDateFormat.<init>. I assume was was sped up by 19%. (In reply to comment #9) > I see, thanks for the explanation. It's great that this improves the > performance; it would be a valuable contribution. > > But I don't think it's guaranteed to be accessed from only one thread at a > time. It's not clear to me that we can use FastDateFormat because there's a > "slight" incompatibility with SimpleDateFormat to do with time zones, and I > don't know what "Only formatting is supported" means. Unless we can make > sure that FastDateFormat will work here, I think we should either > synchronize access to the formatter by using a synchronized get method or > use a ThreadLocal field. The latter might be faster. Tomek, what do you > think? I don't know how much this will impact the performance improvement > but we'd need to make sure there was still an improvement after making that > change. > > I wonder if there are other places where we could make the same optimization?
(In reply to comment #9) > But I don't think it's guaranteed to be accessed from only one thread at a time. Sam, can you clarify which code path you are concerned about? I don't see any sharing of the SAX parser instance between threads.
I think you're right Steffen, the SaxContextContentHandler instance is not shared between threads. In that case, if we just make the dateFormat field non-static it should be fine.
(In reply to comment #8) > Do you want me to make this update? See my comments on the review and comment 12 from Sam. (In reply to comment #9) > and I don't know what "Only formatting is supported" means. You cannot use FastDateFormat for parsing a date, as you could do with SimpleDateFormat > I think we should either > synchronize access to the formatter by using a synchronized get method or > use a ThreadLocal field. The latter might be faster. Tomek, what do you > think? I don't know how much this will impact the performance improvement > but we'd need to make sure there was still an improvement after making that > change. Yes, that's why I asked for a simple performance test on the review. > I wonder if there are other places where we could make the same optimization? Casey, have you noticed any other bottle necks with SimpleDateFormat involved? (In reply to comment #10) > How do I show there is still an improvement after making the change? > > When I made the change to my own project, the profiler didn't show any > percentage SimpleDateFormat.<init>. I assume was was sped up by 19%. With a perf test in place we could assert the same without running the profiler, by executing the test without and with your patch applied.
How do I do a performance test? Are there examples of performance test in the test suite? Should I do something like: long startTime = System.currentTimeMillis(); doSomething(); long finishTime = System.currentTimeMillis(); System.out.println("That took: "+(finishTime-startTime)+ " ms"); (In reply to comment #13) > Casey, have you noticed any other bottle necks with SimpleDateFormat > involved? > > (In reply to comment #10) > > How do I show there is still an improvement after making the change? > > > > When I made the change to my own project, the profiler didn't show any > > percentage SimpleDateFormat.<init>. I assume was was sped up by 19%. > > With a perf test in place we could assert the same without running the > profiler, by executing the test without and with your patch applied.
(In reply to comment #14) > Are there examples of performance test in the test suite? Yes, e.g. https://git.eclipse.org/c/mylyn/org.eclipse.mylyn.incubator.git/tree/org.eclipse.mylyn.tests.performance/src/org/eclipse/mylyn/tasks/tests/performance/TaskListPerformanceTest.java#n44
I see how I can write a performance test. The strange part is how do I show the difference between the two implementations? Do you require only one performance test?
Yes, one perf test would be enough. We will run it twice, without and with the patch to see the diff and if it's worth it, the patch will go in. I doubt there is a point in having this test to be part of a bigger suite run regularly [1]. I don't even think we have such tests for Mylyn [2]. Even thoug it's going to be a one-off, that would be a great example for other perf improvements to come. [1] like we used for Eclipse Platform http://archive.eclipse.org/eclipse/downloads/drops/R-3.7.2-201202080800/performance/performance.php [2] I guess AllPerformanceTests are run manually, on demand.
I have made the change I mentioned in comment 12. With that, I think this is clearly a good change and we should merge it even though we don't have a performance test. Thanks very much, Casey, for the contribution!
I am going to still try getting the performance test out there. I am just having a few problems setting up the test environment.
That's good to hear! I would like to get this fix merged soon, though, so that we can bootstrap on it for a little while before the upcoming release. If the performance test is not ready then, we'll be happy to merge it later in a separate review.
I have merged the change. Feel free to submit a review with the performance test and add me and Tomasz as reviewers.
Created attachment 237052 [details] Performance Test Patch I have created a performance test. I have added the patch to the bug.
(In reply to C. Thompson from comment #22) > I have created a performance test. I have added the patch to the bug. Please push it to Gerrit.
Gerrit code review: https://git.eclipse.org/r/#/c/17909/
new Gerrit id: https://git.eclipse.org/r/#/c/17931/
The change-id in the commit message identifies the review. To push a new patch set to a review you need to amend the commit and keep the same change id. Please abandon one of the 2 reviews you pushed. Thanks!
What needs to be done to get this task closed? Should I make another patch and push it to gerrit?
Currently you have the following 2 reviews: 17909: https://bugs.eclipse.org/bugs/show_bug.cgi?id=403531 [Ia8298fea] https://git.eclipse.org/r/#/c/17909/ 17931: 403531: Made changes according to comments on gerrit Bug: 403531 Task-Url: https://bugs.eclipse.org/bugs/show_bug.cgi?id=403531 [Id1113b56] https://git.eclipse.org/r/#/c/17931/ The second depends on the first. Instead we should have one review which contains the changes from both and has a summary describing the change, e.g. "add a performance test for SaxContextParser" I suggest that you restore the first review, abandon the second review, and ammend the first review so it includes the changes from the second.