This is a regression caused by the fix to bug 67211. Currently, we swallow IOExceptions thrown while a describer reads the stream. That was introduced when we used to read a block of bytes right at the beginning, so that high-level IOExceptions (such as those for bad encoding) would not cause content type determination to fail. Now we read lazily, so we may hit real low-level IOExceptions later, and those get swallowed too. A fix would be to have LazyReader/LazyInputStream throw a specialized IOException wrapping the original IOException, and have the framework rethrow the inner exception. This way the framework could easily tell which exceptions are interesting and which ones are not.
Fixed. Test case added. Released to HEAD.
The fix actually implemented for this bug (in the context of bug 62443) was:
- the content description framework always lets IOExceptions through;
- XMLRootElementContentDescriber (which uses a SAX parser) swallows IOExceptions related to bad encodings (subclasses of CharConversionException) and lets other IOExceptions flow to the caller, interrupting the content description loop.

This does not always work because there is no single official encoding-related IOException class. For instance, Xerces on the latest IBM VM will throw a bare IOException if the following string is to be parsed:

<?xml version='1.0' encoding='us-ascii'?>
<!-- áéíóú -->
<org.eclipse.core.runtime.tests.root/>

When running with the mentioned VM, the test case added for this bug fails with the following (root) exception:

java.io.IOException: Byte "225" is not a member of the (7-bit) ASCII character set.
    at org.apache.xerces.impl.io.ASCIIReader.read(Unknown Source)
    at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
    at org.apache.xerces.impl.XMLEntityScanner.skipSpaces(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentScannerImpl$PrologDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(Unknown Source)
    at org.eclipse.core.internal.content.XMLRootHandler.parseContents(XMLRootHandler.java:176)
    at org.eclipse.core.runtime.content.XMLRootElementContentDescriber.checkCriteria(XMLRootElementContentDescriber.java)
    at org.eclipse.core.runtime.content.XMLRootElementContentDescriber.describe(XMLRootElementContentDescriber.java:114)
    at org.eclipse.core.internal.content.ContentType.describe(ContentType.java:189)
    at org.eclipse.core.internal.content.ContentTypeManager.internalFindContentTypesFor(ContentTypeManager.java:288)
    at org.eclipse.core.internal.content.ContentTypeManager.findContentTypesFor(ContentTypeManager.java:172)
    at org.eclipse.core.tests.runtime.content.IContentTypeManagerTest.testIOException(IContentTypeManagerTest.java)
    ...

It seems the best way of reliably telling real low-level IOExceptions apart from high-level IOExceptions is to do something along the lines of what was originally suggested in this bug's description.
Fixed again, using a special LowLevelIOException wrapper. It lets us distinguish exceptions generated by the stream that a LazyInputStream/LazyReader reads from, from exceptions generated by streams/readers that read from our streams.
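
For readers unfamiliar with the wrapper approach, here is a minimal sketch of the idea, not the code that was actually released. Only the name LowLevelIOException comes from this bug. WrappingInputStream (a simplified stand-in for LazyInputStream, with no lazy buffering), Describer, DescriptionLoop and getActualException() are hypothetical names made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;

/** Marks an IOException as coming from the underlying (real) stream. */
class LowLevelIOException extends IOException {
    private final IOException actual;

    LowLevelIOException(IOException actual) {
        super(actual.getMessage());
        this.actual = actual;
    }

    /** Hypothetical accessor for the original exception. */
    IOException getActualException() {
        return actual;
    }
}

/** Simplified stand-in for LazyInputStream: wraps failures of the source stream. */
class WrappingInputStream extends InputStream {
    private final InputStream in;

    WrappingInputStream(InputStream in) {
        this.in = in;
    }

    @Override
    public int read() throws IOException {
        try {
            return in.read();
        } catch (IOException e) {
            // Failure of the real stream: tag it so callers can recognize it.
            throw new LowLevelIOException(e);
        }
    }
}

/** Hypothetical describer contract, much smaller than the real content describer API. */
interface Describer {
    boolean describe(InputStream contents) throws IOException;
}

/** Hypothetical stand-in for the framework code that invokes describers. */
class DescriptionLoop {
    static boolean describe(Describer describer, InputStream raw) throws IOException {
        try {
            return describer.describe(new WrappingInputStream(raw));
        } catch (LowLevelIOException e) {
            // Real I/O problem: unwrap and let the caller see the original exception.
            throw e.getActualException();
        } catch (IOException e) {
            // High-level problem (e.g. bad encoding reported by a parser):
            // treat the contents as not matching and carry on.
            return false;
        }
    }
}
```

With this arrangement the framework does not need to know which exception class a given parser uses for encoding problems: anything that is not a LowLevelIOException was raised above our stream and can be swallowed safely.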