Bug 241225 - [compiler] ECJ 3.4 fails to bootstrap icedtea6
Summary: [compiler] ECJ 3.4 fails to bootstrap icedtea6
Status: VERIFIED NOT_ECLIPSE
Alias: None
Product: JDT
Classification: Eclipse Project
Component: Core
Version: 3.4.1
Hardware: PC Linux
Importance: P3 normal
Target Milestone: 3.5 M2
Assignee: Olivier Thomann CLA
QA Contact:
URL: http://overlays.gentoo.org/proj/java/...
Whiteboard:
Keywords: vm
Depends on:
Blocks:
 
Reported: 2008-07-17 07:01 EDT by James Le Cuirot CLA
Modified: 2008-09-15 07:37 EDT (History)

See Also:


Attachments
configure (370.56 KB, text/plain)
2008-07-18 06:18 EDT, James Le Cuirot CLA
Test which sometimes fails when using GCJ (123.47 KB, application/bzip2)
2008-07-22 10:18 EDT, James Le Cuirot CLA
Test which almost always fails with Sun JDK 1.6.0.07 (1006.12 KB, application/bzip2)
2008-07-22 10:37 EDT, James Le Cuirot CLA

Description James Le Cuirot CLA 2008-07-17 07:01:55 EDT
Build ID: M20080716-0800

Steps To Reproduce:
1. Install ecj.jar 3.4 or 3.4.1 (M20080716-0800).
2. Attempt to bootstrap icedtea6-1.2 with ecj.jar and either Sun JDK 1.6 or GCJ 4.3.1.
3. Inconsistent crash occurs.


More information:
I have been helping to package icedtea6 on Gentoo Linux. It bootstraps just fine with ECJ 3.2.1 and 3.3.0, but 3.4 and 3.4.1 have failed for at least two people. The crash occurs with both Sun's JDK and GCJ, so the VM doesn't appear to be at fault.

The point at which the crash occurs can vary but it's generally quite early in the build. Here are three examples of the errors I've received.

    [javac] 1. ERROR in /var/tmp/portage/dev-java/icedtea6-1.2/work/icedtea6-1.2/openjdk-ecj/langtools/src/share/classes/com/sun/source/tree/BinaryTree.java (at line 0)
    [javac]     /*
    [javac]     ^
    [javac] Internal compiler error
    [javac] java.lang.NullPointerException
    [javac]    at org.eclipse.jdt.internal.compiler.ReadManager.run(ReadManager.java:160)
    [javac]    at java.lang.Thread.run(libgcj.so.9)
    [javac] 
    [javac] ----------
    [javac] 1 problem (1 error)

    [javac] 1. ERROR in /var/tmp/portage/dev-java/icedtea6-1.2/work/icedtea6-1.2/openjdk-ecj/langtools/src/share/classes/com/sun/javadoc/ConstructorDoc.java (at line 0)
    [javac] 	/*
    [javac] 	^
    [javac] Internal compiler error
    [javac] java.lang.NullPointerException
    [javac]    at org.eclipse.jdt.internal.compiler.ReadManager.run(ReadManager.java:160)
    [javac]    at java.lang.Thread.run(libgcj.so.9)
    [javac] 
    [javac] ----------
    [javac] 1 problem (1 error)

java.lang.NullPointerException
   at org.eclipse.jdt.internal.compiler.batch.CompilationUnit.getContents(CompilationUnit.java:71)
   at org.eclipse.jdt.internal.compiler.ReadManager.run(ReadManager.java:160)
   at java.lang.Thread.run(libgcj.so.9)
incorrect classpath: hotspot-tools/com/sun/codemodel/internal/ClassType.java
java.lang.NullPointerException
   at org.eclipse.jdt.internal.compiler.batch.CompilationUnit.getContents(CompilationUnit.java:71)
   at org.eclipse.jdt.internal.compiler.ReadManager.run(ReadManager.java:160)
   at java.lang.Thread.run(libgcj.so.9)

Despite this, I have used ECJ 3.4 to build the Ganymede SDK just fine and it appears to run okay too, though I haven't used it much yet.
Comment 1 Olivier Thomann CLA 2008-07-17 08:21:04 EDT
Could you please provide the steps to reproduce?
Thanks.
Comment 2 Philipe Mulet CLA 2008-07-17 09:01:23 EDT
Candidating for 3.4.1, assuming we can reproduce it by then.
Comment 3 James Le Cuirot CLA 2008-07-17 09:26:55 EDT
I was hoping that would be enough since Gentoo has not required me to build it manually. Your GCJ setup will probably vary from mine. If you don't have GCJ, point --with-libgcj-jar to Sun's rt.jar and --with-gcj-home to your Sun JDK home. You can probably leave out --with-java and --with-javah if you're using Sun.

# wget http://icedtea.classpath.org/download/source/icedtea6-1.2.tar.gz
# wget http://download.java.net/openjdk/jdk6/promoted/b09/openjdk-6-src-b09-11_apr_2008.tar.gz
# tar zxf icedtea6-1.2.tar.gz
# cd icedtea6-1.2
# unset JAVA_HOME JDK_HOME CLASSPATH JAVAC JAVACFLAGS
# ./configure --with-openjdk-src-zip=/path/to/openjdk-6-src-b09-11_apr_2008.tar.gz --with-ecj-jar=/path/to/ecj-M20080716-0800.jar --with-libgcj-jar=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.3.1/java/libgcj-4.3.1.jar --with-gcj-home=/usr/x86_64-pc-linux-gnu/gcc-bin/4.3.1 --with-java=/usr/x86_64-pc-linux-gnu/gcc-bin/4.3.1/gij --with-javah=/usr/x86_64-pc-linux-gnu/gcc-bin/4.3.1/gjavah
# make

I hadn't noticed this before but it seems to fail much more quickly with Sun's VM. I recommend you test it with both if possible.
Comment 4 James Le Cuirot CLA 2008-07-17 09:28:26 EDT
Oh yeah, and make sure the path you give for ecj.jar is absolute. Relative paths don't work.
Comment 5 Olivier Thomann CLA 2008-07-17 09:35:41 EDT
OK, I'll try to set up an Ant build script to compile it and see if I can reproduce this issue.

The only Linux box I have available runs RHEL 5.0.
Comment 6 Olivier Thomann CLA 2008-07-17 14:30:52 EDT
For now I am stuck with a missing package cups-devel and I cannot find the appropriate package anywhere.
So right now I don't even know if I can reproduce the failure.
Can you pass some system property to the running instance of ecj ?
I'd like you to try disabling the multi-threaded reads in the compiler. To do so, set the following Java system property to true:
-Djdt.compiler.useSingleThread=true

Let us know if this is feasible and, if so, whether it solves the issue.
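To confirm that the property actually reaches the VM that runs ecj, a small check like the following can be launched with the same -D option. This is only a sketch: the class name SingleThreadFlag is hypothetical, and it assumes the flag is treated as a plain boolean system property; how ECJ itself reads it is not shown in this report.

```java
// Hypothetical helper, not part of ECJ: reports whether the VM sees the flag.
final class SingleThreadFlag {
    static final String PROPERTY = "jdt.compiler.useSingleThread";

    // Boolean.getBoolean returns true only if the system property
    // exists and equals "true" (case-insensitive).
    static boolean useSingleThread() {
        return Boolean.getBoolean(PROPERTY);
    }

    public static void main(String[] args) {
        System.out.println(PROPERTY + " = " + useSingleThread());
    }
}
```

For example: java -Djdt.compiler.useSingleThread=true SingleThreadFlag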
Comment 7 Olivier Thomann CLA 2008-07-17 14:38:07 EDT
I think what I really need are all the source files once configure has run. For now I cannot set up the RHEL5 box so that configure runs to completion.
Comment 8 Olivier Thomann CLA 2008-07-17 14:45:58 EDT
I don't think I actually need cups-devel. The only thing I want is to compile the java files. I don't care about cpp files.
Do you know what I need to change to remove the dependency on cups-devel ?
Comment 9 James Le Cuirot CLA 2008-07-18 06:18:12 EDT
Created attachment 107812
configure

The dependencies aren't really optional but you shouldn't need the CUPS stuff for this test. I removed it from configure.ac and ran autoconf to regenerate configure. Here is that new configure for you. I would have removed the other native stuff but some of it is needed for the test.
Comment 10 James Le Cuirot CLA 2008-07-18 07:22:36 EDT
First of all, if you do want cups-devel, this CentOS 5 version should work for you. http://mirror.bytemark.co.uk/centos/5/os/i386/CentOS/cups-devel-1.2.4-11.18.el5.i386.rpm

I tried -Djdt.compiler.useSingleThread=true as you suggested and it worked for GCJ but not for Sun. Since Sun is failing in a different place as well, that might suggest a different problem. It could also be a bug in Sun's VM but remember that ECJ 3.3 worked fine.

If you want to try -Djdt.compiler.useSingleThread=true yourself then add it directly after @JAVA@ in javac.in BEFORE running configure. Running "make distclean" between builds would probably be a good idea to ensure that we're always starting from the same point.

This only matters once you get further in the build (after the GCJ error usually happens) so maybe this is irrelevant but the value I gave for --with-gcj-home when using GCJ is wrong. It varies from distro to distro. I can't tell what it's supposed to be for RHEL5. Maybe something like /usr/lib/jvm/java-1.4.2-gcj-1.4.2.0 but I'm not sure.
Comment 11 Olivier Thomann CLA 2008-07-21 13:13:27 EDT
Now configure is failing on another missing package:
xorg-x11-proto-devel

Clearly my Linux box is not set up to run configure as is. Would it be possible to get all the Java source files that need to be compiled by ecj, with a corresponding Ant script, once configure has run?

I don't have access to a gentoo build so I need to get it working on this RHEL-5 system.
Thanks.
Comment 12 Olivier Thomann CLA 2008-07-21 13:23:09 EDT
This one depends on mesa-libGL-devel, which requires libX11-devel, which in turn requires many other packages.
So without the data requested in comment 11, it can take a while before I can reproduce this issue.
Comment 13 Olivier Thomann CLA 2008-07-21 13:50:24 EDT
I am hoping to get a running Gentoo image by the beginning of next week. Once I get it, I'll try your original steps again.
Comment 14 Olivier Thomann CLA 2008-07-21 13:52:02 EDT
What version of Gentoo linux should I request? 2008 ?
Comment 15 James Le Cuirot CLA 2008-07-21 15:05:26 EDT
I highly recommend that you do not try to replicate this on Gentoo unless you already have it installed and know it well. It is not a trivial distribution to set up and can take most of a day to install. icedtea6 also hasn't been officially added to the tree and isn't straightforward to install at the moment.

I don't understand why you are having such a problem sourcing these packages. They should be readily available from your package manager and shouldn't require any digging around the web.
Comment 16 James Le Cuirot CLA 2008-07-21 15:09:08 EDT
I should add that I realise RHEL is not freely available but assuming you still have access to the packages, it shouldn't be a problem. RHEL5 should still be supported. If for some reason you do not have access to the packages anymore, CentOS is freely available and strives to be compatible with RHEL. Their packages should work fine.
Comment 17 Olivier Thomann CLA 2008-07-21 15:33:46 EDT
I don't have access to the packages and this is the biggest problem.
If I don't get a reproducible setup, I am not going to be able to do anything to investigate this issue.
If you see an easy way for me to get a test case, let me know.
Comment 18 James Le Cuirot CLA 2008-07-22 10:18:03 EDT
Created attachment 108066
Test which sometimes fails when using GCJ

Okay, I managed to package this into a nice little test. However, what I've found is that while it sometimes fails with GCJ, it always works with Sun. I hadn't noticed this before because Sun always failed earlier in the build for a seemingly different reason. Having said that, it does work with GCJ when using ECJ 3.3 and when specifying -Djdt.compiler.useSingleThread=true so it may not necessarily be a problem with GCJ. You may have to try as many as 10 times before it will fail. Adjust the following as necessary.

# For running with GCJ...
/usr/x86_64-pc-linux-gnu/gcc-bin/4.3.1/gij -cp /path/to/ecj.jar org.eclipse.jdt.internal.compiler.batch.Main -1.5 -nowarn -bootclasspath /usr/share/gcc-data/x86_64-pc-linux-gnu/4.3.1/java/libgcj-4.3.1.jar:/usr/share/gcc-data/x86_64-pc-linux-gnu/4.3.1/java/libgcj-tools-4.3.1.jar `find -name "*.java"`

# For running with Sun...
/opt/sun-jdk-1.6.0.07/bin/java -cp /path/to/ecj.jar org.eclipse.jdt.internal.compiler.batch.Main -1.5 -nowarn -bootclasspath /opt/sun-jdk-1.6.0.07/jre/lib/rt.jar:/opt/sun-jdk-1.6.0.07/lib/tools.jar `find -name "*.java"`
Comment 19 James Le Cuirot CLA 2008-07-22 10:37:04 EDT
Created attachment 108068
Test which almost always fails with Sun JDK 1.6.0.07

This test deals with the Sun issue. It almost always fails. It does not build with GCJ due to missing classes. Specifying -Djdt.compiler.useSingleThread=true does not help but using ECJ 3.3 does. Adjust as necessary.

/opt/sun-jdk-1.6.0.07/bin/java -cp /path/to/ecj.jar org.eclipse.jdt.internal.compiler.batch.Main -1.5 -nowarn -bootclasspath /opt/sun-jdk-1.6.0.07/jre/lib/rt.jar:/opt/sun-jdk-1.6.0.07/lib/tools.jar `find -name "*.java"`

A crash log was (accidentally) included, which may help.
Comment 20 Olivier Thomann CLA 2008-07-22 13:08:25 EDT
It works fine using a 32-bit VM.
I wonder if this is not a VM issue on a 64-bit platform. Can you try disabling the JIT?
-Djava.compiler=NONE in the command line should disable the JIT.
Comment 21 Olivier Thomann CLA 2008-07-22 14:08:36 EDT
How often do you get the NPE that you reported in the first comment?
I could not get it using a Sun JDK6_07 build on a 32-bit platform.
Comment 22 James Le Cuirot CLA 2008-07-22 18:47:55 EDT
Sorry if I have not been clear. As I said, I realised after making the original report that there may be two issues here. Sun's JDK fails at a different point (hence the separate test) and does not return an NPE. It actually crashes, as indicated by the included crash log.

When I attempted to run the second test again, it seemed to fail less frequently, but still failed nevertheless. Please try to run it at least 10 times. Adding -Djava.compiler=NONE does seem to avert the problem. I have looped the test 60 times and it has not failed once with that option.

To reiterate, it is GCJ that returns an NPE in the first test. I imagine that problem will be easier to diagnose.
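Since both failures are intermittent, looping the test and counting failures gives a more useful number than a single run. The following is a minimal sketch of such a loop; the class name RepeatRun is hypothetical, and it assumes that any non-zero exit status (compile error or VM crash) counts as a failure.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntSupplier;

// Hypothetical harness: runs a command repeatedly and counts failed runs.
final class RepeatRun {
    // Calls the supplier 'runs' times and counts non-zero results.
    static int countFailures(int runs, IntSupplier exitCode) {
        int failures = 0;
        for (int i = 0; i < runs; i++) {
            if (exitCode.getAsInt() != 0) {
                failures++;
            }
        }
        return failures;
    }

    // Wraps an external command; returns its exit status, or -1 if it
    // could not be started or the wait was interrupted.
    static IntSupplier command(List<String> argv) {
        return () -> {
            try {
                Process p = new ProcessBuilder(argv).inheritIO().start();
                return p.waitFor();
            } catch (IOException e) {
                return -1;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
        };
    }

    public static void main(String[] args) {
        int runs = Integer.parseInt(args[0]);
        List<String> argv = Arrays.asList(args).subList(1, args.length);
        int failed = countFailures(runs, command(argv));
        System.out.println(failed + " of " + runs + " runs failed");
    }
}
```

Because the test commands rely on shell backticks, they would need to be wrapped in a shell, for example: java RepeatRun 60 sh -c '<compile command from the attachment comment>'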
Comment 23 Andrew John Hughes CLA 2008-07-23 05:47:37 EDT
FWIW, I was unable to replicate this with GCJ.  GCJ + ecj 3.4 builds IcedTea6 fine for me on amd64.  I also just tried the testcase and that didn't produce any issues.

$ gij -cp ecj.jar org.eclipse.jdt.internal.compiler.batch.Main -1.5 -nowarn `find -name '*java'`
andrew@omega ~/ecj_icedtea_test $ 
Comment 24 Andrew John Hughes CLA 2008-07-23 05:48:39 EDT
Maybe this is a case of a bad ecj build?
Comment 25 James Le Cuirot CLA 2008-07-23 06:01:46 EDT
I've tried with a Gentoo-built ecj 3.4 and M20080716-0800. I've also tried with a binary version of M20080716-0800 downloaded from eclipse.org. If there's a problem anywhere, it's probably in GCJ but that doesn't explain the Sun problem. I'll try rebuilding GCJ.
Comment 26 Olivier Thomann CLA 2008-07-23 09:54:06 EDT
If it works with the Sun VM when you disable the JIT, then this is a JIT issue on the 64-bit Sun VM.
You should report it to Sun directly. I don't have a 64-bit install so this might explain why it doesn't crash for me.

I'll try again to reproduce the NPE using GCJ. It could simply be a problem with the VM itself again when running the compiler in multi-threaded mode.
Comment 27 James Le Cuirot CLA 2008-08-03 11:57:11 EDT
Okay, I waited till I had replaced my dodgy SATA cable before trying any of this again. I rebuilt GCC. When I tried rebuilding ECJ, I saw the same problem pop up. I've now found that I can often reproduce it just by trying to build ECJ. I have an x86 laptop so I built GCJ on that and tried building ECJ 50 times. No problem. I then tried using the bootstrapped ecj.jar from that machine to build ECJ on this machine. The same problem occurred.

Andrew, is your amd64 machine SMP? Does ECJ only run in a single thread if you only have one core?
Comment 28 James Le Cuirot CLA 2008-08-07 12:40:22 EDT
Now I'm almost certain there's a problem somewhere. I installed Ubuntu Intrepid Ibex Alpha 3 in a virtual machine (KVM) and tried to replicate the issue there. As before, I told it to build ECJ 50 times and it failed four times.

java.lang.NullPointerException
   at org.eclipse.jdt.internal.compiler.ReadManager.run(ReadManager.java:160)
   at java.lang.Thread.run(libgcj.so.90)
java.lang.NullPointerException
   at org.eclipse.jdt.internal.compiler.ReadManager.run(ReadManager.java:160)
   at java.lang.Thread.run(libgcj.so.90)
java.lang.NullPointerException
   at org.eclipse.jdt.internal.compiler.util.Util.getFileCharContent(Util.java:226)
   at org.eclipse.jdt.internal.compiler.batch.CompilationUnit.getContents(CompilationUnit.java:71)
   at org.eclipse.jdt.internal.compiler.ReadManager.run(ReadManager.java:160)
   at java.lang.Thread.run(libgcj.so.90)
java.lang.NullPointerException
   at org.eclipse.jdt.internal.compiler.util.Util.getFileCharContent(Util.java:226)
   at org.eclipse.jdt.internal.compiler.batch.CompilationUnit.getContents(CompilationUnit.java:71)
   at org.eclipse.jdt.internal.compiler.ReadManager.run(ReadManager.java:160)
   at java.lang.Thread.run(libgcj.so.90)

That time, I ran KVM with "-smp 4" which is the same number of cores that the host machine has. I then tried the same thing again without that option and no errors occurred. It therefore seems that threading only occurs within ECJ if you have more than one core.
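One way to check what the guest VM actually sees with and without "-smp 4" is to print the processor count reported by the Java runtime. This is a sketch only: the class name CoreCheck is hypothetical, and whether ECJ bases its threading decision on this value is an inference from the behaviour above, not something confirmed in this report.

```java
// Hypothetical helper: prints the processor count the running VM reports.
final class CoreCheck {
    static int reportedCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("availableProcessors = " + reportedCores());
    }
}
```

Running it under the same VM binary used for the build (gij or java) on both KVM configurations shows whether the two runs differ in the number of processors visible to Java.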
Comment 29 Olivier Thomann CLA 2008-08-07 12:43:11 EDT
I'll try to get access to a multi-core Linux box.
Comment 30 Olivier Thomann CLA 2008-08-07 12:43:40 EDT
Do you see a problem using a Sun VM on the same machine ?
Comment 31 Kent Johnson CLA 2008-08-07 13:14:30 EDT
The most likely explanation is a VM issue.

2 of the NPEs come from here :

public static char[] getFileCharContent(File file, String encoding) {
  InputStream stream = null;
  try {
    stream = new FileInputStream(file);
    return getInputStreamAsCharArray(stream, (int) file.length(), encoding);
  } finally {
    if (stream != null) {
      try {
        stream.close(); // <<< NPE ????
      } catch (IOException e) {}
    }
  }
}

How can stream be null on line 226 even with multiple threads ?
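Because stream is a local variable, no other thread can change it between the null check and the close() call, so a conforming VM should never throw there. A standalone harness along the following lines can exercise the same try/finally shape from several threads on a suspect VM. It is a sketch, not ECJ code: the class name ConcurrentReadCheck, the thread count and the read count are all assumptions, and it reads bytes rather than decoding characters.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stress test for concurrent file reads.
final class ConcurrentReadCheck {
    // Same try/finally structure as the snippet above.
    static byte[] readFile(File file) throws IOException {
        InputStream stream = null;
        try {
            stream = new FileInputStream(file);
            byte[] buffer = new byte[(int) file.length()];
            int offset = 0;
            while (offset < buffer.length) {
                int read = stream.read(buffer, offset, buffer.length - offset);
                if (read < 0) {
                    break; // file shrank while reading
                }
                offset += read;
            }
            return buffer;
        } finally {
            if (stream != null) {
                try {
                    stream.close();
                } catch (IOException e) {
                    // ignored, as in the original snippet
                }
            }
        }
    }

    // Reads the file from several threads at once and counts every
    // exception (including NullPointerException) seen by any thread.
    static int countFailures(File file, int threads, int readsPerThread)
            throws InterruptedException {
        AtomicInteger failures = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < readsPerThread; j++) {
                    try {
                        readFile(file);
                    } catch (Exception e) {
                        failures.incrementAndGet();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return failures.get();
    }

    public static void main(String[] args) throws Exception {
        File file = new File(args[0]);
        System.out.println("failures: " + countFailures(file, 4, 1000));
    }
}
```

If this reports failures on one VM but not on another with the same input file, that points at the VM rather than at the calling code.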
Comment 32 Olivier Thomann CLA 2008-08-07 13:28:29 EDT
Is it possible to disable the JIT on the KVM vm?
Comment 33 James Le Cuirot CLA 2008-08-07 19:33:51 EDT
First, to answer those questions...

I tried using Ubuntu and Sun's VM to build ECJ just to make sure it worked. It did, which further suggests that the Sun issue is different. I was able to reproduce the Sun issue using the second test I uploaded. As before, disabling JIT fixes the Sun issue but has no effect on the GCJ issue. Does GCJ even use the java.compiler property?

I suspect you're right about these being VM issues. If I'd realised from the start that the issues were different, I would have contacted the GCC/Sun guys first. I have just filed a GCC bug report now. See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37051.
Comment 34 Olivier Thomann CLA 2008-08-08 15:03:15 EDT
Closing as NOT_ECLIPSE.
Please reopen if this ends up being an issue with the compiler implementation.
Comment 35 Jerome Lanneluc CLA 2008-09-15 07:37:43 EDT
Verified for 3.5M2