The machine is a recently built RH9 box, running KDE, and Eclipse using GTK. Trying to build platform-ui from CVS (importing the other pieces as binary projects), I get a failure that complains: "The project was not built since the source file /org.eclipse.ui.workbench/Eclipse UI/org/eclipse/ui/internal/misc/StringMatcher.java could not be read." When I first tried to open the file, it complained that it was not valid UTF-8. I switched to ASCII, and now it opens fine. The build still fails.

STEPS TO REPRODUCE:
1.) Install the latest I20030717 build on a RH9 box under KDE.
2.) Open Eclipse, add a CVS perspective, and add a CVS resource pointing to dev.eclipse.org (anonymous)
3.) Checkout platform-ui
4.) Import all the other Eclipse stuff as binary projects.
5.) Rebuild all.

OBSERVED RESULTS:
29 problems (24 errors). The errors are all traced back to the error mentioned above. Opening StringMatcher at this point should cause problems.
Found this in the log. There are multiple entries, all the same. Destroying the project and checking it out again does nothing. Neither does updating. I am now using the M2 build; problem is still present.

!STACK 1
org.eclipse.core.internal.resources.ResourceException: Resource is out of sync with the file system: /org.eclipse.ui.workbench/Eclipse UI/org/eclipse/ui/CVS/Root.
    at java.lang.Throwable.<init>(Throwable.java)
    at java.lang.Throwable.<init>(Throwable.java)
    at org.eclipse.core.runtime.CoreException.<init>(CoreException.java:35)
    at org.eclipse.core.internal.resources.ResourceException.<init>(ResourceException.java:30)
    at org.eclipse.core.internal.localstore.FileSystemResourceManager.read(FileSystemResourceManager.java:406)
    at org.eclipse.core.internal.resources.File.getContents(File.java:214)
    at org.eclipse.core.internal.resources.File.getContents(File.java:204)
    at org.eclipse.team.internal.ccvs.core.util.SyncFileWriter.readFirstLine(SyncFileWriter.java:398)
    at org.eclipse.team.internal.ccvs.core.util.SyncFileWriter.readFolderSync(SyncFileWriter.java:171)
    at org.eclipse.team.internal.ccvs.core.resources.EclipseSynchronizer.cacheFolderSync(EclipseSynchronizer.java)
    at org.eclipse.team.internal.ccvs.core.resources.EclipseSynchronizer.getFolderSync(EclipseSynchronizer.java)
    at org.eclipse.team.internal.ccvs.core.resources.EclipseFolder.isCVSFolder(EclipseFolder.java)
    at org.eclipse.team.internal.ccvs.core.resources.EclipseFolder.isIgnored(EclipseFolder.java)
    at org.eclipse.team.internal.ccvs.core.resources.EclipseFolder.members(EclipseFolder.java)
    at org.eclipse.team.internal.ccvs.core.resources.EclipseFolder.calculateAndSaveChildModificationStates(EclipseFolder.java:390)
    at org.eclipse.team.internal.ccvs.core.resources.EclipseFolder.isModified(EclipseFolder.java:359)
    at org.eclipse.team.internal.ccvs.core.resources.EclipseFolder.calculateAndSaveChildModificationStates(EclipseFolder.java:394)
    at org.eclipse.team.internal.ccvs.core.resources.EclipseFolder.isModified(EclipseFolder.java:359)
    at org.eclipse.team.internal.ccvs.core.resources.EclipseFolder.calculateAndSaveChildModificationStates(EclipseFolder.java:394)
    at org.eclipse.team.internal.ccvs.core.resources.EclipseFolder.isModified(EclipseFolder.java:359)
    at org.eclipse.team.internal.ccvs.core.resources.EclipseFolder.calculateAndSaveChildModificationStates(EclipseFolder.java:394)
    at org.eclipse.team.internal.ccvs.core.resources.EclipseFolder.isModified(EclipseFolder.java:359)
    at org.eclipse.team.internal.ccvs.ui.CVSLightweightDecorator.isDirty(CVSLightweightDecorator.java:99)
    at org.eclipse.team.internal.ccvs.ui.CVSLightweightDecorator.isDirty(CVSLightweightDecorator.java:112)
    at org.eclipse.team.internal.ccvs.ui.CVSLightweightDecorator.decorate(CVSLightweightDecorator.java:189)
    at org.eclipse.ui.internal.decorators.LightweightDecoratorDefinition.decorate(LightweightDecoratorDefinition.java:158)
    at org.eclipse.ui.internal.decorators.LightweightDecoratorManager$LightweightRunnable.run(LightweightDecoratorManager.java:54)
    at org.eclipse.core.internal.runtime.InternalPlatform.run(InternalPlatform.java)
    at org.eclipse.core.runtime.Platform.run(Platform.java)
    at org.eclipse.ui.internal.decorators.LightweightDecoratorManager.decorate(LightweightDecoratorManager.java)
    at org.eclipse.ui.internal.decorators.LightweightDecoratorManager.getDecorations(LightweightDecoratorManager.java)
    at org.eclipse.ui.internal.decorators.DecorationScheduler$1.run(DecorationScheduler.java)
    at org.eclipse.core.internal.jobs.Worker.run(Worker.java:58)
!ENTRY org.eclipse.core.resources 4 274 Jul 21, 2003 09:49:58.584
!MESSAGE Resource is out of sync with the file system: /org.eclipse.ui.workbench/Eclipse UI/org/eclipse/ui/CVS/Root.
Created attachment 5547: Eclipse Log. A log file showing the actual UTF-8 conversion failure.
The StringMatcher.java file contains the hexadecimal byte values 0x91 and 0x92 in multiple positions. I don't believe these to be valid UTF-8 encoded characters. For example, the following sequence of bytes can be seen in vi:

 * pattern which may contain <91>*<92> for 0 and many characters and
 * <91>?<92> for exactly one character.
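For illustration, a minimal sketch of how one might confirm that a lone 0x91 or 0x92 byte is not well-formed UTF-8, using a strict decoder from `java.nio.charset`. The class name `Utf8Check` and the sample bytes are hypothetical, and `StandardCharsets` assumes a Java 7 or later runtime (on a 1.4 VM, `Charset.forName("UTF-8")` would be the equivalent).

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Eclipse.
class Utf8Check {
    // Returns true only if the whole byte array is well-formed UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The byte sequence <91>*<92> as seen in vi.
        byte[] asFoundInFile = { (byte) 0x91, 0x2A, (byte) 0x92 };
        // The same text with the quotes encoded as real UTF-8 (U+2018, U+2019).
        byte[] asUtf8 = {
            (byte) 0xE2, (byte) 0x80, (byte) 0x98, 0x2A,
            (byte) 0xE2, (byte) 0x80, (byte) 0x99
        };
        System.out.println(isValidUtf8(asFoundInFile));
        System.out.println(isValidUtf8(asUtf8));
    }
}
```

In UTF-8, bytes in the range 0x80 to 0xBF are continuation bytes and can only follow a lead byte, which is why 0x91 and 0x92 on their own are rejected.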
From bash, executing "rm StringMatcher.java; cvs update -d -C StringMatcher.java" still leaves the strange hexadecimal values in the file.
I've confirmed this on a Debian box. This file displays this way under Linux. The characters appear as left and right quotes under Windows, as well as in the Mozilla browser on Linux. However, on a Linux terminal they display as invalid UTF-8 characters (both uxterm and xterm). In Eclipse, it complains that the file is not valid UTF-8.

The last person to edit this file must have used Windows and inserted these characters, which Windows happens to encode as 0x91 and 0x92. However, 0x91 and 0x92 are not valid UTF-8 characters, and hence Linux complains. Why does Mozilla display it properly? (font? special handler code?)

This is really two problems. CVS seems to contain a file that is not valid UTF-8. The Eclipse core should escape those bytes before storing them to the file system. (But wouldn't Eclipse core use Java's IO libraries to do this anyway?) You could probably also point a finger at Linux's UTF-8 locale implementation, but it does seem to match the specification.
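As a sketch of why the same two bytes look like quotes in some environments and not in others, the snippet below decodes them under two single-byte charsets. It assumes the Windows encoding in question is windows-1252 (the thread does not name the code page) and that the JRE ships that charset, which is common but not guaranteed by the Java platform. The class name `QuoteBytes` is hypothetical.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of Eclipse.
class QuoteBytes {
    // Decodes the bytes with the named charset.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        byte[] quotes = { (byte) 0x91, (byte) 0x92 };

        // Assumed Windows code page: 0x91 and 0x92 map to U+2018 and U+2019,
        // the left and right single quotation marks.
        String win = decode(quotes, "windows-1252");
        System.out.printf("windows-1252: U+%04X U+%04X%n",
                (int) win.charAt(0), (int) win.charAt(1));

        // ISO-8859-1 maps the same bytes to C1 control characters
        // U+0091 and U+0092, which have no visible glyph.
        String latin1 = decode(quotes, "ISO-8859-1");
        System.out.printf("ISO-8859-1:   U+%04X U+%04X%n",
                (int) latin1.charAt(0), (int) latin1.charAt(1));
    }
}
```

Under this assumption, a viewer that treats the file as windows-1252 would show quotes, while one that treats it as ISO-8859-1 would show nothing visible, and one that insists on UTF-8 would reject the bytes.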
Moving to Platform UI since they own that particular copy of the string matcher.
The problem occurs in a second (duplicate?) StringMatcher class located in "org.eclipse.ui.views". I'm supplying patches for both projects. Note that this does not fix the problem of how 0x91 and 0x92 ended up in CVS in the first place.
Created attachment 5564: Patch for org.eclipse.ui.views
Created attachment 5565: Patch for org.eclipse.ui.workbench
Note: There are several more instances of the StringMatcher class with different owners.
As a note, it looks like the code generating patches is also affected. Text from the original is not included in the patch file starting at the first offending character. It looks like the patch generator doesn't like including unrecognized characters, and doesn't recover as well as it might from such an error. (arg!)
Moving to VCM.
Maybe this is a VM problem? Did you try using another VM?
Under Sun's 1.4.2 VM, the code will compile. When the source is viewed in an editor, it will display, but missing the 0x91 and 0x92 characters. Editing the file and then saving it will overwrite the 0x91 and 0x92 characters with their UTF-8 equivalents. So, there are still files in CVS that are not valid UTF-8. Sun's VM is tolerant of these oddities, but IBM's VM that I was using is not (pj9xia32131-20030714a). Somehow, invalid UTF-8 can be written to a CVS repository using Eclipse. It wasn't Sun's 1.4.2 VM that wrote them to CVS (see above). Further testing with other VMs?
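The difference between a tolerant and a strict reader can be sketched with the three error policies that `java.nio.charset.CharsetDecoder` offers. This is only an illustration of the possible behaviours; it makes no claim about what either the Sun or the IBM VM actually does internally. The class name `DecodePolicies` is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Eclipse or either VM.
class DecodePolicies {
    // Decodes bytes as UTF-8, applying the given action to malformed input.
    static String decodeUtf8(byte[] bytes, CodingErrorAction onError)
            throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(onError)
                .onUnmappableCharacter(onError)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    public static void main(String[] args) {
        byte[] sample = { (byte) 0x91, 0x2A, (byte) 0x92 }; // <91>*<92>
        try {
            // IGNORE silently drops the bad bytes, so only "*" survives.
            System.out.println(decodeUtf8(sample, CodingErrorAction.IGNORE));
            // REPLACE substitutes U+FFFD for each bad byte.
            System.out.println(decodeUtf8(sample, CodingErrorAction.REPLACE).length());
            // REPORT refuses to decode and throws.
            decodeUtf8(sample, CodingErrorAction.REPORT);
        } catch (CharacterCodingException e) {
            System.out.println("strict decode failed: " + e);
        }
    }
}
```

A reader using the IGNORE or REPLACE policy would let the file open and compile, while one using REPORT would fail in the way described at the top of this report.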
I'm not a VM guy, so I don't know if there's a spec for this i.e. which VM behavior is the one we can expect.
The CVS plugin transfers bytes to the server and is agnostic about the encoding used in the platform. The stack trace relates to the CVS decorators and the fact that some of the projects in your workspace were out of sync with the file system. This is not related to the Java builder not compiling the class. Eclipse uses the default OS encoding or uses the overridden setting under Preferences > Workbench > Editors. There shouldn't be a plugin that assumes UTF-8 as the default. To conclude, this is not a CVS problem but a problem with the Java compiler. However, I'm not sure what encoding scheme it should use to parse the source files when two developers are using different OS encodings and committing the files to CVS.
On Windows I can't open the files in question either when selecting UTF-8 encoding. The default encoding was 8859 anyway so this wasn't a problem. With the default encoding the questionable characters are not shown at all. I.e., they are 0 length characters. The build works fine because the Java compiler still uses 8859. The build would probably fail on my Windows box as well if I specified UTF-8 encoding on the command line (file.encoding property). Not sure where these bogus characters come from. Removed the offending characters in the three Platform UI StringMatcher files. Suggest Team, Search and JDT Debug and JDT UI teams do the same.
See Sections 3.1 and 3.3 of the Java Language Specification. ("http://java.sun.com/docs/books/jls/second_edition/html/lexical.doc.html#95413" [Section 3.1]). "Programs are written using the Unicode character set." It's not a valid Java program if it isn't written in Unicode.
Fixed for Search and JDT UI.
Jean-Michel, what makes you think there is a Java compiler bug here? If the specified encoding is incorrect, then how could we process it without any errors?
Let me take that back. What I was trying to say is that if the java spec says that Java source files must be encoded as Unicode then either the VM (as Doug has observed) or the JDT Java Editor is not ensuring that the file is written as UTF-8? BTW, I've also fixed the StringMatcher in Team/CVS.
Many apologies, but I don't think that I read it closely enough the first time. There is an "except". Any character (e.g., 0x91) is allowed in comments, string/character literals and identifiers. Only keywords, separators, and operators need to be in low ASCII (or escaped using "\uXXXX" sequences). There is no problem using Sun's JDK 1.4.2. I'm beginning to think this is a VM bug.
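As an aside, one way to keep such characters out of the raw bytes of a source file entirely is the Unicode escape form mentioned above. A minimal sketch follows, with a hypothetical class name and message text. It assumes the intended characters were the left and right single quotation marks (U+2018 and U+2019), which the thread suggests but does not confirm.

```java
// Hypothetical example; the constant is illustrative, not taken from StringMatcher.
class EscapedQuotes {
    // The quotes are written as ASCII-only Unicode escapes, so the source
    // file compiles the same way regardless of the encoding used to read it.
    static final String PATTERN_HELP =
            "\u2018*\u2019 for 0 and many characters and "
            + "\u2018?\u2019 for exactly one character";

    public static void main(String[] args) {
        // Print the code point of the first character to show it is U+2018.
        System.out.println(Integer.toHexString(PATTERN_HELP.charAt(0)));
    }
}
```

Because the file itself then contains only ASCII bytes, it is valid under UTF-8, ISO-8859-1 and windows-1252 alike.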
This is late in the game, but I encountered this problem today on a new Linux install with the IBM 1.4.1 VM. The problem disappeared, without any other changes, when using the Sun 1.4.2 VM.
Closing as JRE issue.