Bug 417481 - Wikitext stuck in a loop
Summary: Wikitext stuck in a loop
Status: RESOLVED FIXED
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: Mylyn (show other bugs)
Version: unspecified   Edit
Hardware: All All
: P1 normal (vote)
Target Milestone: 2.0   Edit
Assignee: Jeremie Bresson CLA
QA Contact: David Green CLA
URL: http://fi.wikipedia.org/w/api.php?act...
Whiteboard:
Keywords:
: 402248 (view as bug list)
Depends on:
Blocks:
 
Reported: 2013-09-18 04:02 EDT by Calum L CLA
Modified: 2013-10-25 19:04 EDT (History)
3 users (show)

See Also:


Attachments
JUnit test which currently fails (1.41 KB, text/plain)
2013-09-18 04:03 EDT, Calum L CLA
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Calum L CLA 2013-09-18 04:02:40 EDT
Hello all,

I am using Wikitext 1.9.0.20130621-1305 to convert from Mediawiki to HTML.
With a certain piece of text, it gets stuck in a loop.
I think the text is valid. Either way, I don't think it should get stuck in a loop forever.
It comes from the Finnish Wikipedia page for "Dio".

The problem is the following line:
<!--              -->{{Yhtye/tietorivi|[[Universal Music Group]]}}<!--              -->{{Yhtye/tietorivi|Vertigo Records}}

When the second "part" of the line is brought down onto its own line, it works fine.
It also works fine as it is if it is the last line of the infobox.


I have included a JUnit test which demonstrates the problem.
Comment 1 Calum L CLA 2013-09-18 04:03:26 EDT
Created attachment 235587 [details]
JUnit test which currently fails
Comment 2 David Green CLA 2013-10-21 13:08:31 EDT
Thanks for the bug Calum.  Can you add the specific JVM version that you're using to this bug?  Sometimes that makes a difference since we're using regular expressions.
Comment 3 Calum L CLA 2013-10-22 06:23:29 EDT
Hey David,

JDK 1.6.0_25 seems to be what Eclipse is configured to run currently.

java version "1.6.0_25"
Java(TM) SE Runtime Environment (build 1.6.0_25-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.0-b11, mixed mode)

It looks like the Oracle JDK, which is strange, as I'd usually use the OpenJDK one.
I have this file in my Downloads, which must be what I used. jdk-6u25-linux-x64.bin
I think I had a problem in the OpenJDK one a while ago, and perhaps I switched to the Oracle one to resolve it.

I followed the code looping around, and it didn't seem to be to do with regex - but I didn't look too deeply.


Calum
Comment 4 Jeremie Bresson CLA 2013-10-22 07:12:37 EDT
Reduced example:

public void testParserShouldNotTimeout() {
	String input = "<!-- X -->Lorem<!-- Y -->Ipsum\n" + "<!-- Z -->";

	String html = parser.parseToHtml(input);
	TestUtil.println(html);
	assertTrue(html.contains("XXX")); //TODO fix the assert.
}

Added in class MediaWikiLanguageTest.

The assertTrue(..) line is never executed. 
I will try to debug it this evening, to see what is going wrong.
Comment 5 Jeremie Bresson CLA 2013-10-22 14:15:31 EDT
Here a Gerrit Change with the proposed JUnit test and a solution:
https://git.eclipse.org/r/17659

It does the job but I am not sure this is the solution.

Two questions:
1/ Are you sure that CommentBlock should be the ParagraphBreakingBlocks?
Why is there a NESTED_BLOCK_START_PATTERN after the ParagraphBreakingBlocks Check?
I wanted to have a look with a Code Coverage tool, because I do not see when this code is used.

2/ Are you sure that findCloseOffset can return -1? It seems to me, that by returning -1, the ParagraphBlock instance is never poped from the nestedBlocks stack.
=> Maybe with an other value, my change in canResume(..) would not be necessary.
Comment 6 David Green CLA 2013-10-22 22:36:34 EDT
Jeremie, thanks for your excellent work.  I've pushed an updated patchset to the review that includes your test but provides an alternate solution.  Please take a look and let me know what you think.
Comment 7 Jeremie Bresson CLA 2013-10-23 00:13:58 EDT
Keeping closed blocks in nestedBlocks is wrong. Your solution that pop them from the stack is good.

I still think that the nested-protocol in MediaWiki ParagraphBlock is not so well implemented: findCloseOffset(..) should not always return -1.

Your change makes MarkupLanguage more robust -> this is good.
Comment 8 David Green CLA 2013-10-23 11:20:15 EDT
I've merged review 17659, thanks Jeremie.

Calumn I'll close this one as resolved/fixed since the change fixes the problem in our tests.  If you can confirm that the fix works for you that would be great.  Feel free to reopen this bug if the problem comes up again.
Comment 9 David Green CLA 2013-10-25 19:04:47 EDT
*** Bug 402248 has been marked as a duplicate of this bug. ***