Bug 306170 - [perfs] Regression for FullSourceWorkspaceTypeHierarchyTests#testPerfAllTypes()
Summary: [perfs] Regression for FullSourceWorkspaceTypeHierarchyTests#testPerfAllTypes()
Status: VERIFIED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core (show other bugs)
Version: 3.6   Edit
Hardware: All All
: P3 normal (vote)
Target Milestone: 3.6 RC1   Edit
Assignee: Satyam Kandula CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 286955
  Show dependency tree
 
Reported: 2010-03-17 07:28 EDT by Frederic Fusier CLA
Modified: 2010-05-21 07:01 EDT (History)
1 user (show)

See Also:
frederic_fusier: review+
Olivier_Thomann: review+


Attachments
Proposed patch (6.67 KB, patch)
2010-03-30 02:53 EDT, Satyam Kandula CLA
no flags Details | Diff
Proposed patch (14.87 KB, patch)
2010-05-06 02:50 EDT, Satyam Kandula CLA
no flags Details | Diff
New proposed patch (18.21 KB, patch)
2010-05-07 10:50 EDT, Frederic Fusier CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Frederic Fusier CLA 2010-03-17 07:28:44 EDT
Verifying 36M6 and I201016-0859 perf results, I noticed a regression in FullSourceWorkspaceTypeHierarchyTests#testPerfAllTypes() perf test.

It's more visible on RHEL test machine but also visible on other ones. It seems to have been introduced between builds I20100305-1011 and I20100310-1300.
Comment 1 Frederic Fusier CLA 2010-03-18 07:41:53 EDT
Satyam, please investigate, thanks
Comment 2 Satyam Kandula CLA 2010-03-29 07:01:53 EDT
Fix to bug 289057 is responsible for this slowdown. 
In bug 289057, hashtable computation was changed to include 16 characters instead of 8. In this case, the time to compute the hash is considerably more than the time in case of conflicts. 

One way, I see to fix this is to use the hash according to the size of the hashtable. We do know the size of the hashtable affront. Hence, constructing the appropriate hashtable depending upon the size should help.
Comment 3 Satyam Kandula CLA 2010-03-30 02:53:13 EDT
Created attachment 163365 [details]
Proposed patch

This is the patch what I was thinking when I wrote my last comment. Frederic, let me know your thoughts.
Comment 4 Frederic Fusier CLA 2010-03-30 03:26:44 EDT
(In reply to comment #3)
> Created an attachment (id=163365) [details]
> Proposed patch
> 
> This is the patch what I was thinking when I wrote my last comment. Frederic,
> let me know your thoughts.

And do you have perf numbers with this patch?
Why did you choose 1024 for the threshold?
Comment 5 Satyam Kandula CLA 2010-03-30 04:35:05 EDT
> And do you have perf numbers with this patch?
I will try to get the numbers with this. 
> Why did you choose 1024 for the threshold?
1024 is a number I used randomly to show what I have in mind. I need to get a justifiable number.
Comment 6 Satyam Kandula CLA 2010-03-31 02:50:11 EDT
On thinking properly, here is my analysis. 

Using putUnsafely() instead of put() is actually reducing the need of increasing the hash size while reading the CategoryTable from the file. There could be one get() on the CategoryTable in addQueryResult(), if it is queried for EXACT_MATCH. I don't think this should impact much.  Next, there are many get()s' on CategoryTable from EntryResult. These get()s' could really benefit the 16 char hash size. However, these can actually be removed and if we could remove them, there should not be any considerable benefit with 16 char hash size for the CategoryTable. 
I will get some numbers for the offending jar - with EntryResult changes (8 and 16 char hashes). 

At the same time, I think 16 char hash size could help the ResultsTable when the results are many. As for each result, there will be a get/put, the number of conflicts could  make an impact. 
I will get the numbers here too. 

Meanwhile, let me know if you have any comments.
Comment 7 Frederic Fusier CLA 2010-03-31 05:05:51 EDT
(In reply to comment #6)
> On thinking properly, here is my analysis. 
> 
> Using putUnsafely() instead of put() is actually reducing the need of
> increasing the hash size while reading the CategoryTable from the file. There
> could be one get() on the CategoryTable in addQueryResult(), if it is queried
> for EXACT_MATCH. I don't think this should impact much.  Next, there are many
> get()s' on CategoryTable from EntryResult. These get()s' could really benefit
> the 16 char hash size. However, these can actually be removed and if we could
> remove them, there should not be any considerable benefit with 16 char hash
> size for the CategoryTable. 
> I will get some numbers for the offending jar - with EntryResult changes (8 
> and 16 char hashes). 
> 
> At the same time, I think 16 char hash size could help the ResultsTable when
> the results are many. As for each result, there will be a get/put, the number
> of conflicts could  make an impact. 
> I will get the numbers here too. 
> 
> Meanwhile, let me know if you have any comments.

If only using putUnsafely fixes the initial issue, then the safest solution (for the while) is to get rid of the change on the hashCode() as it has other performance slowdown impacts.

I'll wait for your numbers to make my thoughts more precise on this. Please try to be as clear as possible while putting these numbers to make the decision easier to take while reading them, TIA
Comment 8 Satyam Kandula CLA 2010-05-06 02:50:58 EDT
Created attachment 167267 [details]
Proposed patch

Reverted back to make use of 8 bytes instead of 16 bytes for the hashtable. Also modified the code by which some Hashtable get's are removed. 
This has improved the testPerfAllTypes() on some machines and doesn't see to on other machines. It also improved the testSearchAllTypeNames() considerably.
Comment 9 Olivier Thomann CLA 2010-05-06 08:15:13 EDT
Targeting RC1
Comment 10 Frederic Fusier CLA 2010-05-07 10:50:30 EDT
Created attachment 167490 [details]
New proposed patch

New patch based on previous proposed one.

It reverts all changes on HashtableOfObject as we do not want to add more characters when computing the hashCode which means back to initial behavior...

Of course, it keeps the declaration of the new putUnsafely(...) method and its usage in the rehash() method.
Comment 11 Frederic Fusier CLA 2010-05-07 10:57:18 EDT
(In reply to comment #10)
> Created an attachment (id=167490) [details]
> New proposed patch
> 
I ran performance tests using this patch (as well as this Satyam's previous one) and I do not observe any improvement on the offending test.

However, I also do not observe the same regression since v_A39 when we introduced the change in this area, hence I still do not know what really happens on the releng perf test machines showing a regression over 10%... :-(

Analyzing difference between v_A38 and v_A39 does not show any clues on this, but I guess this is due to the fact that the regression is not observable neither on my Linux test box nor on my Thinkpad!

So, the conclusion, here is that I'm going to release this patch and see if the test result will be better on the releng test perfs machines (I cross my fingers).

BTW, even if there's no improvement there, the patch will still be interesting to apply as it makes the results better for the search all type names test :-)
Which is not negligible as this is a real hot spot during the eclipse session start!
Comment 12 Frederic Fusier CLA 2010-05-07 12:40:18 EDT
(In reply to comment #10)
> Created an attachment (id=167490) [details]
> New proposed patch
> 
Released for 3.6RC1 in HEAD stream.

Satyam,

I have released this patch without coming back to you as I wanted to have the code released for the Sunday's I-build which will run the performance tests...

Thanks for the good job, it's not so often that we can have a +5% on search all type names tests :-)
Comment 13 Frederic Fusier CLA 2010-05-07 12:42:02 EDT
Olivier, as I add some code to the initial patch, can you also review?
TIA
Comment 14 Olivier Thomann CLA 2010-05-07 13:05:04 EDT
+1.
Comment 15 Frederic Fusier CLA 2010-05-11 09:36:26 EDT
The perf results for build I20100509-0800 clearly show that the TypeHierarchy test is now back to normal :-)

They also confirm the improvement for the search all type names tests which are now between +5.4% and +13.5% !

Great job Satyam !
Comment 16 Frederic Fusier CLA 2010-05-17 09:19:35 EDT
(In reply to comment #15)
> The perf results for build I20100509-0800 clearly show that the TypeHierarchy
> test is now back to normal :-)
> 
Same for perf results of RC1 build