Bug 420258 - Test_BrowserSuite#Browser1() times out on Mac (takes almost 2h)
Summary: Test_BrowserSuite#Browser1() times out on Mac (takes almost 2h)
Status: RESOLVED WORKSFORME
Alias: None
Product: Platform
Classification: Eclipse Project
Component: SWT (show other bugs)
Version: 4.4   Edit
Hardware: PC Mac OS X
: P3 major (vote)
Target Milestone: 4.4 M3   Edit
Assignee: Markus Keller CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-10-24 07:29 EDT by Markus Keller CLA
Modified: 2016-10-20 14:31 EDT (History)
5 users (show)

See Also:


Attachments
System.err.flush() in EclipseTestRunner (1.11 KB, patch)
2013-10-25 09:28 EDT, Markus Keller CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Markus Keller CLA 2013-10-24 07:29:31 EDT
http://download.eclipse.org/eclipse/downloads/drops4/I20131022-1300/testresults/html/org.eclipse.swt.tests_macosx.cocoa.x86_5.0.html

org.eclipse.swt.tests.junit.browser.Test_BrowserSuite#Browser1() times out on Mac (takes 6910.674s). Was OK in I20131015-0800.

Consolelog for AllCocoaTests only says:
     [java] ImageIO: <ERROR>  PNG Invalid IHDR data

See bug 417928 comment 7 for a possible explanation why the test is still green (infrastructure kicks hanging test).
Comment 1 David Williams CLA 2013-10-24 11:47:57 EDT
Can this test be disabled for now? I think it is interfering with "overall Hudson" performance and therefore (might) be interfering with other projects.
Comment 2 Dani Megert CLA 2013-10-24 12:29:38 EDT
Please either fix this asap or disable the test. It delays our test and overloads Hudson. Thanks.
Comment 3 Markus Keller CLA 2013-10-25 06:53:15 EDT
Happened again in N20131024-2000.

Disabled Browser1 test on Mac with http://git.eclipse.org/c/platform/eclipse.platform.swt.git/commit/?id=8fbc959aef2669dd6e0968ec4b6f32974f4bcd56

See bug 420365 for a possible reason. However, that doesn't explain why all other browser tests complete in a reasonable time. And the timeout screenshots also don't show a dialog (actually no window at all).
Comment 4 Lakshmi P Shanmugam CLA 2013-10-25 08:13:25 EDT
I ran this locally several times and Browser1 test completes without any delay. Also, there has been no recent change in the area.
Comment 5 Markus Keller CLA 2013-10-25 09:28:12 EDT
Created attachment 236899 [details]
System.err.flush() in EclipseTestRunner

I also couldn't reproduce locally on 10.8 with 1.7.0_45. The test results page says it ran on 10.7.5 with 1.7.0_25.

The consolelog says after the "swt:" label:

     [java] INFO: timeoutScreenOutputDir: /Users/hudsonBuild/workspace/ep4-unit-mac64/workarea/N20131024-2000/eclipse-testing/results/macosx.cocoa.x86_5.0/timeoutScreens
     [java] INFO: timeout: 7200000
     [java] starting EclipseTestRunnerTimer with timeout=7080000 at 2013-10-24 22:07:03 -0400
     [java] ImageIO: <ERROR>  PNG Invalid IHDR data
     [java] ImageIO: <ERROR>  PNG Invalid IHDR data

This output should not be possible! The EclipseTestRunner prints "starting EclipseTestRunnerTimer [..]" and then starts the TimerTask that will dump stack traces and take screenshots. The dump(..) method should print "EclipseTestRunner almost reached timeout [..]", then print all stack traces, and finally do a syncExec and create screenshots.

The screenshots are actually created, so the only explanation I have is that the System.err buffer gets lost when OS X prints the "ImageIO: <ERROR>  [..]" line.

David or Dani, please release the attached patch, so that we hopefully get more information.
Comment 6 Dani Megert CLA 2013-10-25 10:05:19 EDT
(In reply to Markus Keller from comment #5)
> Created attachment 236899 [details] [diff]
> System.err.flush() in EclipseTestRunner

Fixed with http://git.eclipse.org/c/platform/eclipse.platform.releng.git/commit/?id=0b1239d15c7efaf4e407e6c905b15568339cf211

Sorry, I forgot to set you as author.
Comment 7 Markus Keller CLA 2013-10-25 10:38:00 EDT
(In reply to Dani Megert from comment #6)
> Sorry, I forgot to set you as author.

NP, as long as this doesn't lower my chances of becoming a platform.releng committer eventually...


I've re-enabled the test, so that we can see whether this reveals more infos: http://git.eclipse.org/c/platform/eclipse.platform.swt.git/commit/?id=bccc68c5828d2c053f6e7cbaa816077c400ed3cc
Comment 8 David Williams CLA 2013-10-25 11:50:19 EDT
(In reply to Markus Keller from comment #7)
> (In reply to Dani Megert from comment #6)
> > Sorry, I forgot to set you as author.
> 
> NP, as long as this doesn't lower my chances of becoming a platform.releng
> committer eventually...
> 
> 
> I've re-enabled the test, so that we can see whether this reveals more
> infos:
> http://git.eclipse.org/c/platform/eclipse.platform.swt.git/commit/
> ?id=bccc68c5828d2c053f6e7cbaa816077c400ed3cc

Googling around, a little, reminded me we had to specify a special location for "crash log" for Macs ... currently says:
 
# https://bugs.eclipse.org/bugs/show_bug.cgi?id=392213
# If this property is set, the actual location and file name
# is computed at test time, but this is assumed to be the
# VM argument required to specify that location.
loglocationarg=-XX:ErrorFile

Think that's now invalid, or anything? (On Oracle's u25?)
Comment 9 Markus Keller CLA 2013-10-25 12:46:33 EDT
(In reply to David Williams from comment #8)
> loglocationarg=-XX:ErrorFile
> 
> Think that's now invalid, or anything? (On Oracle's u25?)

I don't know, haven't had a VM crash on the Mac for quite a while. Bug 392594 comment 5 also doesn't crash any more.

For this bug, I'm pretty sure there's no VM crash involved:
- no message about a crash in the consolelog
- all tests terminate, including the hung browser test and subsequent tests
- screenshot files exist and are readable, which proves that dump(..) has been executed two times
Comment 10 Markus Keller CLA 2013-10-27 14:28:52 EDT
In N20131026-1500, the run time of Browser1 was back to normal (2.802s).

I've left it enabled for now, but will turn off as soon as we run into the timeout again.


Interestingly, the log still says:
     [java] starting EclipseTestRunnerTimer with timeout=7080000 at 2013-10-26 17:08:08 -0400
     [java] ImageIO: <ERROR>  PNG Invalid IHDR data
     [java] ImageIO: <ERROR>  PNG Invalid IHDR data

That means the last two lines have nothing to do with the timeout screenshots. They must have been created by other tests, probably Test_org_eclipse_swt_graphics_Image, which seems to test invalid images.
Comment 11 Markus Keller CLA 2013-10-28 06:44:36 EDT
Was normal again in I20131027-2130. Closing this bug until it appears again (hopefully with more information).
Comment 12 Markus Keller CLA 2016-10-20 14:31:20 EDT
(In reply to Markus Keller from comment #5)
> This output should not be possible! The EclipseTestRunner prints "starting
> EclipseTestRunnerTimer [..]" and then starts the TimerTask that will dump
> stack traces and take screenshots. The dump(..) method should print
> "EclipseTestRunner almost reached timeout [..]", then print all stack
> traces, and finally do a syncExec and create screenshots.
> 
> The screenshots are actually created, so the only explanation I have is that
> the System.err buffer gets lost when OS X prints the "ImageIO: <ERROR> 
> [..]" line.

At least on GTK, there are scenarios where an endless native loop in the main thread also blocks output to System.err (bug 502410). After bug 506304, the EclipseTestRunner now prints to both System.err and System.out.