Bug 80661 - [find/replace] Regex find does not find next match after greedy *
Status: ASSIGNED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Text
Version: 3.1
Hardware: PC Windows XP
Importance: P3 minor
Target Milestone: ---
Assignee: Platform-Text-Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Duplicates: 327055
Depends on:
Blocks:
 
Reported: 2004-12-09 19:29 EST by Markus Keller CLA
Modified: 2019-09-06 15:30 EDT (History)
CC List: 2 users

See Also:


Attachments
Work in Progress (4.05 KB, patch)
2007-07-26 05:26 EDT, Markus Keller CLA

Description Markus Keller CLA 2004-12-09 19:29:46 EST
I20041207-0800

In a text file with contents "aaabab", invoke Edit > Find/Replace... and search
for regex "a*". The first match is "aaa", which is fine. But then, Edit > Find
Next (or pressing the Find button again) does not find the second match "a"
between the "b"s.
Comment 1 Dani Megert CLA 2007-07-17 05:32:52 EDT
Mhh. Now it's not even finding the first match anymore.
Comment 2 Markus Keller CLA 2007-07-17 06:29:19 EDT
Still works for me in HEAD for the first match but not for subsequent matches.
Comment 3 Dani Megert CLA 2007-07-17 07:06:54 EDT
Not for me. Probably depends on the VM.
Comment 4 Markus Keller CLA 2007-07-17 11:53:21 EDT
> Not for me. Probably depends on the VM.
... or on the initial caret position in the editor and the state of the Wrap Search and Regex options. FWIW, I'm running JDK 1.6.0_01-b06 right now.
Comment 5 Markus Keller CLA 2007-07-18 14:09:30 EDT
To find the first match, the text file must contain exactly "aaabab" (no whitespace in front of the characters). If there's a space between the caret and the first "a", then the regex "a*" effectively first matches at the caret position, with length 0. The second match would be "aaa".

The problem in Eclipse is that FindReplaceDocumentAdapter currently thinks a zero-length match is no match:
	if (found && fFindReplaceMatcher.group().length() > 0)
		return new Region(..);
	return null;

This is problematic, since it blocks all regex patterns that can match with length 0, e.g. "a*", "\b" (matches at word boundaries), etc.
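
To illustrate the effect of that check outside of Eclipse, here is a minimal sketch using only java.util.regex. The class ZeroLengthCheck and the helper findFrom are made up for this example; the helper merely mimics the condition quoted above and is not the actual FindReplaceDocumentAdapter code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ZeroLengthCheck {
	/**
	 * Hypothetical helper that mimics the quoted check: returns {start, end}
	 * of the first match at or after offset, or null when nothing is found
	 * or the match is empty.
	 */
	static int[] findFrom(String text, String regex, int offset) {
		Matcher matcher= Pattern.compile(regex).matcher(text);
		boolean found= matcher.find(offset);
		if (found && matcher.group().length() > 0)
			return new int[] { matcher.start(), matcher.end() };
		return null; // empty matches are reported as "not found"
	}

	public static void main(String[] args) {
		int[] first= findFrom("aaabab", "a*", 0);
		System.out.println(first[0] + "," + first[1]); // 0,3
		// Continuing after the first match, "a*" matches "" at offset 3 (before 'b')
		System.out.println(findFrom("aaabab", "a*", 3)); // null
		// A word boundary is always an empty match
		System.out.println(findFrom("foo bar", "\\b", 0)); // null
	}
}
```

With the length check in place, the search stops at the first empty match instead of moving on to the "a" between the "b"s.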

The fix is not trivial, since removing the 0-length check yields exceptions when the caret is at offset 0, and there's no protocol that allows the find(..) method to properly distinguish between the first search (needs offset) and consecutive searches (internal FIND_NEXT state).
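
To make the "first search vs. consecutive search" distinction concrete, here is a rough sketch of what such a protocol could look like. It is only an assumption for illustration, not the attached patch: the class StatefulFinder and the methods findFirst/findNext are hypothetical, and the sketch ignores that the document can change between two searches.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StatefulFinder {
	private final Matcher matcher;

	StatefulFinder(String text, String regex) {
		matcher= Pattern.compile(regex).matcher(text);
	}

	/** First search: needs an explicit offset (e.g. the caret position). */
	int[] findFirst(int offset) {
		return matcher.find(offset) ? region() : null;
	}

	/** Consecutive search: continues from the matcher's internal state. */
	int[] findNext() {
		return matcher.find() ? region() : null;
	}

	private int[] region() {
		return new int[] { matcher.start(), matcher.end() };
	}

	public static void main(String[] args) {
		StatefulFinder finder= new StatefulFinder("aaabab", "a*");
		// Prints [0,3] [3,3] [4,5] [5,5] [6,6], one region per line
		for (int[] r= finder.findFirst(0); r != null; r= finder.findNext())
			System.out.println("[" + r[0] + "," + r[1] + "]");
	}
}
```

Matcher.find() without arguments already steps past an empty match, so zero-length matches can be reported without getting stuck at the same offset.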


Here's a snippet to see what the regex package does:

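import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Try {
	public static void main(String[] args) {
		Pattern pattern= Pattern.compile("a*");
		// Note the leading space: the first match of "a*" is empty, at offset 0
		Matcher matcher= pattern.matcher(" aaabab");
		while (matcher.find())
			print(matcher);
	}

	// Prints the matched text and its [start,end] offsets
	private static void print(Matcher matcher) {
		String group= matcher.group();
		int start= matcher.start();
		int end= matcher.end();
		System.out.println("'"+group+"' at ["+start+","+end+"]");
	}
}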
Comment 6 Patrick Schulz CLA 2007-07-26 04:46:04 EDT
I'm not sure whether this matters here, but the Find Next and Find Previous actions seem to honor the options set in the Find/Replace dialog.

When the Regex option is activated, the "normal" select-some-text and Find Next / Previous doesn't work anymore until you deactivate the option.

I think it is a problem to make these two actions work in both cases:
- Find next based on the last find/replace request
- Find next based on the current selection


So in this case, if a* matches aaa and aaa gets selected, which expression is find next / previous looking for?
Comment 7 Markus Keller CLA 2007-07-26 05:26:55 EDT
Created attachment 74662
Work in Progress

I took a stab at fixing this, but the Regex & Backwards case needs more work (it already fails now with e.g. text "a" and pattern ".?").

(In reply to comment #6)
> As the Regex option is activated, the "normal" mark-a-text and find next /
> previous doesn't work anymore until you deactivate this option.
Works for me in I20070724-0800.

> So in this case, if a* matches aaa and aaa gets selected, which expression is
> find next / previous looking for?
It's looking for the regex (since the selection of the last match is remembered). In the normal case (matches with length > 0), this works fine, e.g. search for "a." in "aaabab", press Esc, and Ctrl+K finds the second match.
Comment 8 Patrick Schulz CLA 2007-08-23 10:15:15 EDT
Oh, I forgot to say that the behavior I described occurred in PDT's editor.

I selected a variable like $var.
Since $ is also a regex metacharacter, it didn't work.

I couldn't test it in I20070724-0800 as you mentioned, because I still want to use the stable Europa build.
But I think it behaves the same way as in my version.


I still think that it is not a good idea to apply the options set in the Find/Replace dialog to Find Next / Previous as well.
Or, if so, the user should be informed about what happened or which search options are in effect when using Find Next via Ctrl+K.


Comment 9 Markus Keller CLA 2007-08-23 13:32:53 EDT
(In reply to comment #8)
> I still think, that it is no good idea to take the preferences that are made in
> the find/replace dialog also for find next / previous.

This problem got fixed for 3.4 M1, see bug 44422 comment 2.

We chose not to disable regular expressions on Find Next, since that would make the 'Find' button different from the 'Find Next' command, which would be hard to explain. Instead, we now escape the selection, which has the added benefit that 'Edit > Find/Replace' also initializes the find field with a pattern that matches the selection.
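
For readers who want to see why escaping helps with a selection like "$var" from comment 8, here is a small sketch. It uses the JDK's Pattern.quote as a stand-in for the escaping; that is an assumption for illustration and not necessarily how the Eclipse fix escapes the selection. The class EscapeSelection and the helper indexOfRegex are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapeSelection {
	/** Returns the offset of the first match of regex in text, or -1. */
	static int indexOfRegex(String text, String regex) {
		Matcher matcher= Pattern.compile(regex).matcher(text);
		return matcher.find() ? matcher.start() : -1;
	}

	public static void main(String[] args) {
		String text= "echo $var;";
		String selection= "$var";
		// Unescaped, '$' means end of input, so "$var" can never match
		System.out.println(indexOfRegex(text, selection)); // -1
		// Escaped, the selection is matched literally
		System.out.println(indexOfRegex(text, Pattern.quote(selection))); // 5
	}
}
```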
Comment 10 Dani Megert CLA 2010-10-06 03:06:35 EDT
*** Bug 327055 has been marked as a duplicate of this bug. ***
Comment 11 Eclipse Webmaster CLA 2019-09-06 15:30:37 EDT
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.