Bug 114363 - [expressions] Provides equals and hashCode on Expression subclasses
Summary: [expressions] Provides equals and hashCode on Expression subclasses
Status: RESOLVED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: UI (show other bugs)
Version: 3.1   Edit
Hardware: All All
: P3 enhancement (vote)
Target Milestone: 3.2 M6   Edit
Assignee: Dirk Baeumer CLA
QA Contact:
URL:
Whiteboard:
Keywords: api, performance
Depends on:
Blocks: 126901
  Show dependency tree
 
Reported: 2005-10-31 09:41 EST by Douglas Pollock CLA
Modified: 2006-03-15 18:32 EST (History)
4 users (show)

See Also:


Attachments
Proposed Patch (24.86 KB, patch)
2006-02-22 12:57 EST, Douglas Pollock CLA
no flags Details | Diff
Removes checks for null (24.86 KB, patch)
2006-02-27 16:19 EST, Douglas Pollock CLA
no flags Details | Diff
Revised Patch with Tests (39.83 KB, patch)
2006-02-27 17:57 EST, Douglas Pollock CLA
no flags Details | Diff
Patch that adds the computeHashCode method (13.83 KB, patch)
2006-02-28 05:38 EST, Dirk Baeumer CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Douglas Pollock CLA 2005-10-31 09:41:29 EST
It would be nice to have equals(Object) and hashCode() on the subclasses of 
Expression provided by "org.eclipse.core.expressions".  This allows another 
optimization where equivalent expressions can be detected, thereby avoiding 
unnecessary duplicate computations.
Comment 1 Douglas Pollock CLA 2006-02-21 16:05:10 EST
I believe the solution for Bug 126901 relies on having an implementation of equals(Object) and hashCode() on all instances of Expression.  I can get most of the speed up by providing those methods on my own Expression subclasses, but it won't have any effect on handlers contributed through extension points.

I'll try to provide a patch....
Comment 2 Douglas Pollock CLA 2006-02-22 12:57:18 EST
Created attachment 35163 [details]
Proposed Patch

This patch adds equals() and hashCode() methods to Expression subclasses.  It adds a transient protected member variable to Expression called "hashCode", which subclasses can use to cache the result.  It also adds a few utility static methods for computing equals and hashCode.
Comment 3 Dirk Baeumer CLA 2006-02-23 09:06:40 EST
Douglas, thanks a lot for the patch.

Two quick questions: do you have numbers how this will help for bug 126901. And would it be possible to provide test cases as well.

I will look into releasing the patch tomorrow so that we have something in the next I-Build.
Comment 4 Douglas Pollock CLA 2006-02-23 11:37:26 EST
What are you looking for in the way of tests?  I must confess that I don't relish writing an exhaustive test of every possible expression pair using "equals()".
Comment 5 Dirk Baeumer CLA 2006-02-24 05:05:39 EST
Hi Douglas,

since most of the attributes in the expression are mandantory I was looking for:

- testing that two objects are not equal.

- and if two objects are equal that they produce the same hashcode value.

That would help ensuring the spec and detecting cases that someone adds additional attributes but forgets to add them to equals and hashcode method.
Comment 6 Dirk Baeumer CLA 2006-02-24 05:40:17 EST
Douglas, I looked at the patch and I have two questions:

- the code is written in a way that the fields can be null. However all fields
  except the expected value field in the test expression and the list of 
  children in composite expression can't be null. I would prefer code that takes 
  this into account. Otherwise someone who is looking at the code in the future 
  might be confused.

- Composite expression provides a hashCode which is reused in some sub classes. 
  Wouldn't it be better to copy the method down as well to always have the
  equal/hashcode pair together. For example the AND expession doesn't have
  additional attributes defines it own equals but inherits the hashcode.

An additional node to the fields not being null: the constructor that is used during normal Eclipse execution is the one which takes an IConfigurationElement. This constructor throws a core exception if an argument is null. The constructor which takes object arguments is for testing purpose only and some of the constructors don't have an assert in it which was simply laziness on my end. I will correct this.
Comment 7 Dirk Baeumer CLA 2006-02-27 14:08:33 EST
Douglas, what is the state here at your end. I can release something tomorrow morning, but if the support isn't needed I will hold it back.
Comment 8 Douglas Pollock CLA 2006-02-27 15:17:05 EST
(In reply to comment #6)
> - the code is written in a way that the fields can be null. However all
> fields except the expected value field in the test expression and the list
> of children in composite expression can't be null. I would prefer code that
> takes this into account. Otherwise someone who is looking at the code in the
> future might be confused.

Okay, I'll look at doing this now.


> - Composite expression provides a hashCode which is reused in some sub
> classes. Wouldn't it be better to copy the method down as well to always
> have the equal/hashcode pair together. For example the AND expession doesn't
> have additional attributes defines it own equals but inherits the hashcode.

OrExpression is not equivalent to AndExpression, even if they have the same fExpressions.  However, it is safe to use the same hashCode for the two of them.  It seems weird to write duplicate hashCode implementations for multiple CompositeExpression subclasses.
Comment 9 Douglas Pollock CLA 2006-02-27 16:19:03 EST
Created attachment 35436 [details]
Removes checks for null

I find it a bit odd to removing defence against null, but here you go.  As a note, several constructors do not defend against null.  At least one evaluate method checks for null, when null is theoretically not possible.

I've still not seen performance results since committing my part of the patch, so I have no answer as to whether equals and hashCode will part of the final solution for 3.2.
Comment 10 Douglas Pollock CLA 2006-02-27 17:57:50 EST
Created attachment 35445 [details]
Revised Patch with Tests

This patch builds upon the last patch.  It adds tests for equals and hashCode.  It tests whether two unequal expressions are detected as such, and it tests whether equal expressions have the same hashCode.

The tests discovered a bug in how the hashCode works for Object[] member variables, which has been corrected in this patch.
Comment 11 Dirk Baeumer CLA 2006-02-28 05:07:09 EST
Douglas, thanks a lot for the patch. I released it for todays integration build with the following modifications:

- reordered some of the equals tests to test for more significant attributes 
  first

- adapted the code to the naming and formatting settings used in 
  core.expressions. This resulted in one change which impacts subclasses: I 
  renamed hashCode to  fHashCode to make it consistent with the rest of the 
  naming. I checked that this doesn't break the build input from Platform/UI for 
  todays I-Build (currently all subclasses of Expression outside of 
  core.expressions have their own hashCode field).

I also updated the constructors to assert against null where appropriate.
Comment 12 Dirk Baeumer CLA 2006-02-28 05:14:45 EST
To avoid that subclasses are seeing fHashCode we can change the code in the following way: 

- we make fHashCode private
- add a final method hashCode to Expressions which calls a method 
  computeHashCode which default implementation forwards to 
  System#identityHashCode.

Let me know what you think and I will change the code if you prefer this solution.

And one additional note: I removed the final from hashCode and equals. Although not official supported their could be clients implementing subclasses of internal classes. In this case they should be able to provide their own equals and hashCode if necessary.
Comment 13 Dirk Baeumer CLA 2006-02-28 05:38:32 EST
Created attachment 35464 [details]
Patch that adds the computeHashCode method
Comment 14 Douglas Pollock CLA 2006-02-28 09:06:40 EST
(In reply to comment #12)
> - we make fHashCode private
> - add a final method hashCode to Expressions which calls a method 
>   computeHashCode which default implementation forwards to 
>   System#identityHashCode.

This sounds good.  If you release this change to HEAD today, I will adapt to the changes.
Comment 15 Dirk Baeumer CLA 2006-02-28 14:03:32 EST
Douglas, I have released the change with one minor change: hashCode isn't final to not break any clients that already provide a hashCode and equals method.
Comment 16 Douglas Pollock CLA 2006-02-28 14:36:39 EST
The static methods on Expression require @since tags, as does the computeHashCode method.
Comment 17 Douglas Pollock CLA 2006-02-28 14:39:13 EST
Similarly for the two protected static variables.  This probably also means this should have gotten PMC approval before it went in.  (argh)
Comment 18 John Arthorne CLA 2006-02-28 16:14:46 EST
Re comment #17 - afraid so ;)  Protected members in an API class are API.
Comment 19 John Arthorne CLA 2006-02-28 16:22:04 EST
The API change could be minimized by moving all the static helpers into a class in the internal package - they don't seem to be required as API, and they aren't used in the Expressions class.  Current list of new API:

protected static final boolean equals(final Object left, final Object right)
protected static final boolean equals(final Object[] leftArray, final Object[] rightArray)
protected static final int hashCode(final Object object)
protected static final int hashCode(final Object[] array)
protected static final int HASH_CODE_NOT_COMPUTED
protected static final int HASH_FACTOR
protected int computeHashCode()
Comment 20 Douglas Pollock CLA 2006-02-28 16:29:18 EST
(In reply to comment #19)
> The API change could be minimized by moving all the static helpers into a
> class in the internal package - they don't seem to be required as API, and
> they aren't used in the Expressions class.  Current list of new API:

They are helpful to downstream plug-ins trying to provide their own implementations of equals and hashCode.
Comment 21 Dirk Baeumer CLA 2006-03-01 10:51:45 EST
Sorry John I fully forgot that we already have API freeze. The minimal new API is:

protected int computeHashCode()

however I agree that the rest of the API is convenient for subclasses of expressions (which is allowed).

To continue I propose the following: Douglas, can you verify that this really helps you solving your performance problem. If so I will write the PMC request. If not we remove the API again (it only went into an I-Build so far). Can you do this today, so that I can react tomorrow. I am on vacation next week.

Reopening the bug report.
Comment 22 Boris Bokowski CLA 2006-03-01 11:57:00 EST
(JohnA is away, back Friday afternoon)
Comment 23 Douglas Pollock CLA 2006-03-01 16:30:22 EST
(In reply to comment #21)
> To continue I propose the following: Douglas, can you verify that this
> really helps you solving your performance problem. If so I will write the
> PMC request.

Please see Bug 126901 comment #8.  I'm seeing a reduction in elapsed process time of 56% using the patches in this bug and Bug 126901.
Comment 24 Dani Megert CLA 2006-03-02 04:48:34 EST
+1 for this API change.
Comment 25 Douglas Pollock CLA 2006-03-03 11:04:07 EST
Philippe:  I'm not sure if Dirk has gone on vacation yet.  If he has, then I would like to request PMC approval for these API changes on his behalf.  Also, as I don't have commit rights to org.eclipse.core.expressions, someone will need to go in and add the @since tags to the methods outlined in comment #19.
Comment 26 Dirk Baeumer CLA 2006-03-03 11:34:46 EST
Douglas, still in. I sent a request document to Philippe and Dani/Martin. The content is:

I am requesting new API for core.expressions. The corresponding bug reports are:

https://bugs.eclipse.org/bugs/show_bug.cgi?id=114363 - for the API request
https://bugs.eclipse.org/bugs/show_bug.cgi?id=126901 - for the performance
problem that revealed the need for the new API.

Short summary: to speed up context switching in the command framework the
expressions should provide equal and hashCode methods to avoid duplicate
evaluation of the expressions.

Since computing the hash code for expressions can be expensive as well
the hash value should be cached by the expression object (which basically
leads to the API request).

Minimal API to add:

protected int computeHashCode()

computes the hash code and remembers the value in a private field. For backward 
compatibility the method calls super.hashCode() as a default implementation.

Additional API to ease writing of HashCode and Equals in subclasses of
Expression (subclassing is allowed and wanted).

// constant for computing hash code.
protected static final int HASH_CODE_NOT_COMPUTED
protected static final int HASH_FACTOR

// Helper method to compute hash codes
protected static final int hashCode(final Object object)
protected static final int hashCode(final Object[] array)

// Helper methods to deal with equals
protected static final boolean equals(final Object left, final Object right)
protected static final boolean equals(final Object[] leftArray, final Object[] rightArray)

We can move these methods to an internal helper class in core.expression. However this would result in a copy of those methods in Platform/UI.

Unfortunately I already released the code to the repository (I simply forgot the API freeze).  However removing the code (or parts of it again) isn't a big issue.

Can you please approve the API request.
Comment 27 Douglas Pollock CLA 2006-03-06 13:54:02 EST
What is going on with this API request?
Comment 28 John Arthorne CLA 2006-03-08 17:17:54 EST
Philippe: could you approve this change?  It is a non-breaking API addition, it's a simple and safe fix, and it yields a noticeable performance improvement (see comment 23).
Comment 29 Dani Megert CLA 2006-03-15 02:35:33 EST
Any update on this one?
Comment 30 Philipe Mulet CLA 2006-03-15 05:22:58 EST
+1 for non-breaking API change
Comment 31 Dirk Baeumer CLA 2006-03-15 06:50:17 EST
Since we have two votes now I will close the bug as fixed.
Comment 32 Douglas Pollock CLA 2006-03-15 14:26:46 EST
@since tags are missing on the new protected static methods on Expression.
Comment 33 Dirk Baeumer CLA 2006-03-15 18:32:18 EST
Thanks Doug. I added the @since 3.2 tags