Bug 110060 - [plan][search] Add support for Camel Case search pattern
Status: VERIFIED FIXED
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core
Version: 3.1
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: 3.2 M3
Assignee: Frederic Fusier CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-09-20 12:54 EDT by Jerome Lanneluc CLA
Modified: 2005-10-31 06:45 EST

See Also:


Description Jerome Lanneluc CLA 2005-09-20 12:54:18 EDT
I20050920

Search should provide a match rule that is a Camel Case rule so that a user can
search for references to "NPE" in Camel Case mode, and references to
NullPointerException will be returned.
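
To make the requested behavior concrete, here is a minimal sketch of a camel case matcher. It is an illustration only and not the JDT implementation: the class name CamelCaseSketch and the exact matching rules (the first character must match exactly, and an upper-case pattern character may skip the lower-case remainder of the current word) are assumptions.

```java
// Illustrative sketch only: a simplified camel case matcher, not the JDT code.
final class CamelCaseSketch {

    /**
     * Returns true if every pattern character can be consumed in order.
     * A pattern character either equals the current name character, or,
     * if it is upper case, matches the start of the next word in the name.
     */
    static boolean camelCaseMatch(String pattern, String name) {
        if (pattern.isEmpty()) return true;
        // Assumption: the first character must match exactly.
        if (name.isEmpty() || pattern.charAt(0) != name.charAt(0)) return false;
        int p = 1;
        int n = 1;
        while (p < pattern.length()) {
            if (n >= name.length()) return false;
            char pc = pattern.charAt(p);
            if (pc == name.charAt(n)) {
                p++;
                n++;
                continue;
            }
            // Only an upper-case pattern character may jump to the next word.
            if (!Character.isUpperCase(pc)) return false;
            // Skip the lower-case remainder of the current word.
            while (n < name.length() && !Character.isUpperCase(name.charAt(n))) n++;
            if (n >= name.length() || name.charAt(n) != pc) return false;
            p++;
            n++;
        }
        return true;
    }
}
```

With this sketch, "NPE" matches NullPointerException but not NumberFormatException, and the all-lower-case pattern "npe" matches neither.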
Comment 1 Jerome Lanneluc CLA 2005-09-20 12:55:31 EDT
All types of search should support this: declaration, references, search all
types, etc ...
Comment 2 Frederic Fusier CLA 2005-10-14 09:18:51 EDT
Dirk,
I'm working on this new Camel Case search pattern.

Here's the additional constant value I'm thinking of adding to the SearchPattern API:
/**
 * Match rule: The search pattern contains a Camel Case expression.
 * For example, <code>NPE</code> type string pattern will match
 * <code>NullPointerException</code> type.
 * @see CharOperation#camelCaseMatch(char[], char[]) for a detailed explanation
 * of Camel Case matching.
 *<br>
 * Can be combined to {@link #R_PREFIX_MATCH} match rule. For example,
 * when prefix match rule is combined with Camel Case match rule,
 * <code>"npE"</code> pattern will match <code>npException</code>.
 *<br>
 * Match rule {@link #R_PATTERN_MATCH} may also be specified but will not be 
 * combined with Camel Case match rule as these two rules are mutually 
 * exclusive.
 * Used match rule depends on whether string pattern contains specific pattern 
 * characters (e.g. '*' or '?') or not. If it does not, then Camel Case match 
 * rule will override Pattern one, otherwise Pattern match rule will override 
 * Camel Case one.
 * For example, with <code>"NPE"</code> string pattern, search will only use
 * Camel Case match rule. With <code>Nu*Po*Ex*</code> string pattern, it will 
 * use only Pattern match rule.
 *<br>
 * Note that {@link #R_CASE_SENSITIVE} match rule will not have any effect 
 * combined with Camel Case one. This is due to the fact that Camel Case match 
 * rule already uses case sensitive comparison.
 *<br>
 * 
 * @since 3.2
 */
public static final int R_CAMELCASE_MATCH = 0x0080;

I think that this should match requirement for Open Type Dialog if you use
combined rule SearchPattern.R_CAMELCASE_MATCH | SearchPattern.R_PREFIX_MATCH.

Please confirm or comment if you think I'm missing something, thx
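
The override behavior described in the Javadoc above (Camel Case and Pattern rules being mutually exclusive) could be sketched as follows. This is a hedged illustration: the class MatchRuleSketch and the method effectiveRule are hypothetical, and the values used for R_PREFIX_MATCH and R_PATTERN_MATCH are assumptions that should be checked against SearchPattern; only 0x0080 comes from the proposal.

```java
// Illustrative sketch only; not part of the SearchPattern API.
final class MatchRuleSketch {
    static final int R_CAMELCASE_MATCH = 0x0080; // value from the proposal above
    static final int R_PREFIX_MATCH = 0x0001;    // assumed value
    static final int R_PATTERN_MATCH = 0x0002;   // assumed value

    /**
     * When both the Camel Case and Pattern bits are set, keeps only one of them:
     * Pattern if the string contains '*' or '?', Camel Case otherwise.
     * Any other bits (e.g. the prefix bit) are left untouched.
     */
    static int effectiveRule(String pattern, int matchRule) {
        boolean camel = (matchRule & R_CAMELCASE_MATCH) != 0;
        boolean wildcards = (matchRule & R_PATTERN_MATCH) != 0;
        if (!camel || !wildcards) return matchRule;
        boolean hasWildcard = pattern.indexOf('*') >= 0 || pattern.indexOf('?') >= 0;
        return hasWildcard
                ? matchRule & ~R_CAMELCASE_MATCH
                : matchRule & ~R_PATTERN_MATCH;
    }
}
```

Under these assumptions, "NPE" with both bits set resolves to the Camel Case rule only, while "Nu*Po*Ex*" resolves to the Pattern rule only.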
Comment 3 Frederic Fusier CLA 2005-10-14 09:19:54 EDT
Jim,

Could you review my Javadoc comment for this new API?

Thanks
Comment 4 Dirk Baeumer CLA 2005-10-17 09:15:42 EDT
Frederic, a couple of comments:

can you explain the following sentence:

 * Can be combined to {@link #R_PREFIX_MATCH} match rule. For example,
 * when prefix match rule is combined with Camel Case match rule,
 * <code>"npE"</code> pattern will match <code>npException</code>.

For me npE using *only* prefix match would already find npException.

From the PRs I got regarding camel case there are two additional requests where
we should interpret patterns as camel case as well:

o "NuPoEx": should be interpreted as a camel case pattern as well. I discussed 
  this already with Philippe and we decided to not support it in the first 
  place.

o "npe": should be interpreted as a camel case pattern as well. Although this
  is handy it will result in lots of matches since almost every pattern the 
  user types in will be more or less a camel case pattern. What is your take
  here ?

Additionally to be able to specify the camel case pattern, results should be
annotated to show if they are a camel case match or a normal prefix match. For
example if the user types "IMethod" as a pattern we will interpret it as a camel
case pattern and it will match InnerMethod as well. However since IMethod is an
exact match the match should appear in the open type dialog before any camel
case matches.
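
A rough sketch of that ordering, assuming the matches are available as plain type names: the class MatchOrderingSketch and the method exactFirst are hypothetical and only illustrate "exact match before camel case matches".

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: orders matches so that an exact name match comes first.
final class MatchOrderingSketch {

    /** Returns a copy of the matches with exact matches of the pattern moved to the front. */
    static List<String> exactFirst(String pattern, List<String> matches) {
        List<String> sorted = new ArrayList<>(matches);
        // false sorts before true, so exact matches (key false) come first;
        // List.sort is stable, so the remaining order is preserved.
        sorted.sort((a, b) -> Boolean.compare(!a.equals(pattern), !b.equals(pattern)));
        return sorted;
    }
}
```

For the pattern "IMethod", a result list containing InnerMethod and IMethod would be reordered so that IMethod is shown first.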

We would also need API to find out what kind a certain pattern is. Otherwise UI
has to implement this and the algorithm might end up being different from the
core one (especially for camel case). Something like

int SearchPattern#getPatternType(String pattern) 

The method would return R_CAMELCASE_MATCH for NPE, R_PATTERN_MATCH for N*P*E and
R_PREFIX_MATCH for all other strings.
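
Such a method might look like the sketch below. It is hypothetical: getPatternType does not exist on SearchPattern, the constant values other than 0x0080 are assumptions, and the camel case heuristic (at least two leading upper-case characters followed by an optional all-lower-case tail) is only one possible reading of the examples in this bug.

```java
// Illustrative sketch only; getPatternType is the hypothetical method suggested above.
final class PatternTypeSketch {
    static final int R_PREFIX_MATCH = 0x0001;    // assumed value
    static final int R_PATTERN_MATCH = 0x0002;   // assumed value
    static final int R_CAMELCASE_MATCH = 0x0080; // value from the proposal

    static int getPatternType(String pattern) {
        // Wildcards always win: N*P*E is a pattern match.
        if (pattern.indexOf('*') >= 0 || pattern.indexOf('?') >= 0) {
            return R_PATTERN_MATCH;
        }
        // Count the leading run of upper-case characters.
        int i = 0;
        while (i < pattern.length() && Character.isUpperCase(pattern.charAt(i))) i++;
        // Assumption: fewer than two leading upper-case characters is not camel case.
        if (i < 2) return R_PREFIX_MATCH;
        // Assumption: the tail, if any, must be all lower case.
        for (int j = i; j < pattern.length(); j++) {
            if (!Character.isLowerCase(pattern.charAt(j))) return R_PREFIX_MATCH;
        }
        return R_CAMELCASE_MATCH;
    }
}
```

With this heuristic, "NPE" and "IMethod" are classified as camel case, "N*P*E" as a pattern, and "npe" and "NuPoEx" fall back to prefix match.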
Comment 5 Frederic Fusier CLA 2005-10-17 10:29:33 EDT
(In reply to comment #4)
> can you explain the following sentence:
> 
>  * Can be combined to {@link #R_PREFIX_MATCH} match rule. For example,
>  * when prefix match rule is combined with Camel Case match rule,
>  * <code>"npE"</code> pattern will match <code>npException</code>.
> 
> For me npE using *only* prefix match would already find npException.
Sorry, my example was not good enough. Let's take "nPE" as the pattern. If you
combine the R_CAMELCASE_MATCH and R_PREFIX_MATCH match rule values, you can get
the "nullPointerException" and "nPException" results with only one search request.
The first result matches only the camel case pattern and the second matches only
the prefix one...

> 
> From the PRs I got regarding camel case there are two additional requests where
> we should interpret patterns as camel case as well:
> 
> o "NuPoEx": should be interpreted as a camel case pattern as well. I discussed 
>   this already with Philippe and we decided to not support it in the first 
>   place.
Agree
> 
> o "npe": should be interpreted as a camel case pattern as well. Although this
>   is handy it will result in lots of matches since almost every pattern the 
>   user types in will be more or less a camel case pattern. What is your take
>   here ?
Currently, for pattern match (i.e. with '*' or '?' characters), it is not
SearchPattern but the clients that look at string patterns and set the
R_PATTERN_MATCH bit accordingly. I think it should be the same for Camel Case
patterns: it is the clients' responsibility to let SearchPattern know that a
Camel Case search must be done, not SearchPattern's to infer it from the string.
So, if clients want to specify that "npe" is a Camel Case pattern, then the
R_CAMELCASE_MATCH bit must be set explicitly in the given matchRule parameter.

> 
> Additionally to be able to specify the camel case pattern, results should be
> annotated to show if they are a camel case match or a normal prefix match. For
> example if the user types "IMethod" as a pattern we will interpret it as a camel
> case pattern and it will match InnerMethod as well. However since IMethod is an
> exact match the match should appear in the open type dialog before any camel
> case matches.
Hmm, we have not implemented this support yet. Currently, CharOperation
implements camelCase methods which only return true or false. There's no way to
know, when these methods return true, whether the match was exact, prefix or
camel...
Modifying this behavior may have a performance impact and I'm not sure it is
absolutely necessary to have this result at first.
Perhaps this may be considered while implementing bug 79866...
> 
> We would also need API to find out what kind a certain pattern is. Otherwise UI
> has to implement this and the algorithm might end up being different from the
> core one (especially for camel case). Something like
> 
> int SearchPattern#getPatternType(String pattern) 
> 
> The method would return R_CAMELCASE_MATCH for NPE, R_PATTERN_MATCH for N*P*E and
> R_PREFIX_MATCH for all other strings.
OK, sounds like a good idea to centralize this in SearchPattern. Furthermore, I think
we need to verify that the match rule bits and the string are compatible.
For example, match rule R_PATTERN_MATCH set without any '*' or '?' character,
or R_CAMELCASE_MATCH and R_PATTERN_MATCH bits set simultaneously.
SearchPattern could simply return null when the string pattern and/or match rule
are invalid, or try the best rule which can be applied to the given string...
Comment 6 Dirk Baeumer CLA 2005-10-17 11:27:01 EDT
I have another question regarding CharOperation#camelCaseMatch: it is specified
that the tail after the sequence of upper case characters is matched case
sensitively. I would prefer matching it case insensitively, e.g. NPExCeption should
match NullPointerException. Any reason why it is defined to be case sensitive?
Comment 7 Philipe Mulet CLA 2005-10-17 11:55:09 EDT
The current behavior matches the existing implementation for the open type dialog.
Did you get many complaints about this behavior?
Comment 8 Frederic Fusier CLA 2005-10-17 12:19:07 EDT
Dirk, here's the summary of our conf'call (please let me know if I missed
something):

1) Clients will always set R_CAMELCASE_MATCH, R_PATTERN_MATCH and R_PREFIX_MATCH
bits for match rule. SearchPattern will then choose the best match depending on
string pattern contents. The reason to do so is to avoid algorithm duplication.
That means that there won't be new API method getPatternType() on SearchPattern.

2) "npe" will not be considered a Camel Case pattern at first (it currently
is not in the Open Type dialog implementation). Not sure if we really agreed on
this point, but it sounds more reasonable to defer this kind of behavior until
after the initial support is implemented and come back to it after having played
with it a little bit.

3) SearchEngine will return pattern rule used while finding the SearchMatch (ie.
R_CAMELCASE_MATCH, R_PREFIX_MATCH or R_PATTERN_MATCH). It will be stored in
SearchMatch.rule field and will be accessible using already existing API getter
getRule(). Note that this point is handled by bug 79886 and may be addressed
later (milestone M4 or M5).
Comment 9 Frederic Fusier CLA 2005-10-17 12:22:13 EDT
Of course, please read bug 79866 instead of 79886 in previous comment...
Comment 10 Dirk Baeumer CLA 2005-10-17 12:22:18 EDT
Here is what the open type dialog currently does: NPException matches
NullPointerExCeption; however, NPExCeption doesn't match NullPointerException. The
current implementation only treats a pattern as a camel case pattern plus a tail
if the tail is all lower case, but the actual matching is then done case
insensitively.

However, now that I think of it, xCeption could be interpreted as a tail as
well in the pattern NPExCeption. What do you think?
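
The open type dialog behavior described in this comment can be approximated by the sketch below. It is a reconstruction from the description only, not the dialog's actual code: the class TailSketch, the split into "humps" and "tail", and the fallback to a case-insensitive prefix match for non-camel patterns are all assumptions.

```java
// Illustrative sketch only, reconstructed from the description in this comment.
final class TailSketch {

    static boolean matches(String pattern, String typeName) {
        // Leading run of upper-case characters ("humps"), then the tail.
        int split = 0;
        while (split < pattern.length() && Character.isUpperCase(pattern.charAt(split))) split++;
        String humps = pattern.substring(0, split);
        String tail = pattern.substring(split);

        // Camel case pattern only if there are humps and the tail has no upper-case character.
        boolean camel = split > 0;
        for (int i = 0; camel && i < tail.length(); i++) {
            if (Character.isUpperCase(tail.charAt(i))) camel = false;
        }
        if (!camel) {
            // Assumed fallback: plain case-insensitive prefix match.
            return typeName.regionMatches(true, 0, pattern, 0, pattern.length());
        }

        int n = 0;
        for (int h = 0; h < humps.length(); h++) {
            // After the first hump, skip to the next upper-case character of the type name.
            if (h > 0) {
                while (n < typeName.length() && !Character.isUpperCase(typeName.charAt(n))) n++;
            }
            if (n >= typeName.length() || typeName.charAt(n) != humps.charAt(h)) return false;
            n++;
        }
        // The tail is compared case insensitively right after the last hump.
        return typeName.regionMatches(true, n, tail, 0, tail.length());
    }
}
```

Under these assumptions, NPException matches NullPointerExCeption (the tail "xception" is compared ignoring case), while NPExCeption is not treated as a camel case pattern and does not match NullPointerException.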
Comment 11 Dirk Baeumer CLA 2005-10-17 12:24:38 EDT
One little addition: the functionality described has to be provided for the all
types search as well, not only for reference and declaration search (which
already have the concept of a search match).
Comment 12 Frederic Fusier CLA 2005-10-18 13:16:41 EDT
Fixed and released in HEAD.

Dirk, I've added a reminder in bug 79866 to make sure we address the point you
highlighted in comment 11...
Comment 13 Jerome Lanneluc CLA 2005-10-31 06:45:24 EST
Verified for 3.2 M3 using build I20051031-0010