[smila-dev] org.eclipse.smila.utils.XMLHelper / XML 1.1

Hi,

I tried to index some documents containing special control characters and
got an error when the resulting SMILA Record was read from the Queue:

	[Fatal Error] :223:6: Character reference "&#26" is an invalid XML character.

This happens because XMLHelper adds an XML 1.0 header when converting
Records to XML, and likewise when creating a Queue Message from a Record.

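The XML 1.0 standard doesn't allow control characters below #x20 (other
than tab, LF and CR), not even escaped as character references, whereas
XML 1.1 allows them as character references (all except #x0).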

Xerces 2.9 (which we use in SMILA) supports XML 1.1. So after changing
the version in the header constant XML_HEADER_UTF8 in XMLHelper to 1.1,
everything works fine.
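
To illustrate the difference, here is a minimal standalone sketch. It does
not use XMLHelper or any SMILA class: the class name XmlVersionDemo, the
<Val> element and the sample text are made up for the example, and it uses
the JDK's built-in JAXP parser rather than our Xerces 2.9 build, so treat
it as an approximation of what happens when the Record is read back.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

class XmlVersionDemo {

    // Hypothetical stand-in for a serialized record; only the header version varies.
    static String buildXml(String version) {
        return "<?xml version=\"" + version + "\" encoding=\"UTF-8\"?>"
            + "<Val>text&#26;</Val>";
    }

    // Returns the parsed text content, or null if the parser rejects the document.
    static String parseText(String version) throws Exception {
        byte[] bytes = buildXml(version).getBytes(StandardCharsets.UTF_8);
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes));
            return doc.getDocumentElement().getTextContent();
        } catch (SAXException e) {
            // With an XML 1.0 header, the reference to U+001A is a fatal error.
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("1.0 accepted: " + (parseText("1.0") != null));
        System.out.println("1.1 accepted: " + (parseText("1.1") != null));
    }
}
```

Only the version in the header differs between the two calls, which
mirrors the one-line change to XML_HEADER_UTF8 described above.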

Another workaround might be to use a CDATA section instead of a text node
for record attribute values (see
org.eclipse.smila.datamodel.record.dom.RecordBuilder.appendTextElement()).
Hmm, that didn't work in my test case: the CDATA section isn't correctly
wrapped around the control character at the end of the text, but maybe
I'm doing something wrong... ;)

What do you think?

Best regards,
 Andreas













