AW: [smila-dev] org.eclipse.smila.utils.XMLHelper / XML 1.1

Hi all,

thanks for your feedback.
I'm afraid the solution won't be as easy as I first thought...

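Setting the XML header to 1.1 will reduce the problem but not solve it.
The reason is that XML 1.1 still doesn't allow the following characters:
- unescaped control characters (other than tab, line feed and carriage
  return); they must be written as character references
- the control character #x0, even when escaped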

So, if we have one of those in an imported document, there's still a chance
to get the error I described before.
(The problem is not academic: the #x0 character, for example, can be found in our HTML test documents.)
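
To make the two rules above concrete, here is a minimal sketch of a text escaper that follows the XML 1.1 character ranges: code points outside the Char production (which includes #x0) are dropped, and restricted control characters are written as numeric character references. The class and method names are hypothetical and not part of XMLHelper, and silently dropping #x0 is only one possible policy, not something decided in this thread.

```java
final class Xml11TextSanitizer {
  private Xml11TextSanitizer() {}

  // XML 1.1 "Char": every code point except #x0, surrogates, #xFFFE and #xFFFF.
  static boolean isXml11Char(int cp) {
    return (cp >= 0x1 && cp <= 0xD7FF)
        || (cp >= 0xE000 && cp <= 0xFFFD)
        || (cp >= 0x10000 && cp <= 0x10FFFF);
  }

  // XML 1.1 "RestrictedChar": legal only when written as a character reference.
  static boolean isRestricted(int cp) {
    return (cp >= 0x1 && cp <= 0x8)
        || cp == 0xB || cp == 0xC
        || (cp >= 0xE && cp <= 0x1F)
        || (cp >= 0x7F && cp <= 0x84)
        || (cp >= 0x86 && cp <= 0x9F);
  }

  // Returns text that is safe to place in an XML 1.1 text node.
  static String escapeText(String input) {
    StringBuilder out = new StringBuilder(input.length());
    for (int i = 0; i < input.length(); ) {
      int cp = input.codePointAt(i);
      i += Character.charCount(cp);
      if (!isXml11Char(cp)) {
        continue; // assumption: drop #x0 and other illegal code points
      }
      if (isRestricted(cp)) {
        out.append("&#x").append(Integer.toHexString(cp).toUpperCase()).append(';');
      } else if (cp == '<') {
        out.append("&lt;");
      } else if (cp == '>') {
        out.append("&gt;");
      } else if (cp == '&') {
        out.append("&amp;");
      } else {
        out.appendCodePoint(cp);
      }
    }
    return out.toString();
  }
}
```

Whether to drop, replace or reject such characters is a design choice; the sketch only shows where #x0 has to be treated differently from the other control characters.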

So it looks like we have to find a more general solution for this. (CDATA?)
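
Before settling on CDATA, it may be worth probing how the parser treats these characters. As far as I read the XML specification, a CDATA section is built from the same Char production as ordinary text, so it is not obvious that it would admit #x0. The following is a small probe using the standard JAXP API; the class name XmlCharProbe is hypothetical, and the results depend on the parser that JAXP picks up (assumed here to support XML 1.1, as Xerces 2.9 is said to do in this thread).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class XmlCharProbe {
  private XmlCharProbe() {}

  // Returns true if the given document parses without a fatal error.
  static boolean parses(String xml) {
    try {
      DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
      // DefaultHandler rethrows fatal errors and keeps stderr quiet.
      builder.setErrorHandler(new DefaultHandler());
      builder.parse(new InputSource(new StringReader(xml)));
      return true;
    } catch (Exception e) {
      return false;
    }
  }

  public static void main(String[] args) {
    String v10 = "<?xml version=\"1.0\"?>";
    String v11 = "<?xml version=\"1.1\"?>";
    System.out.println("1.0, escaped #x1:     " + parses(v10 + "<a>&#x1;</a>"));
    System.out.println("1.1, escaped #x1:     " + parses(v11 + "<a>&#x1;</a>"));
    System.out.println("1.1, escaped #x0:     " + parses(v11 + "<a>&#x0;</a>"));
    System.out.println("1.1, #x0 inside CDATA: " + parses(v11 + "<a><![CDATA[\u0000]]></a>"));
  }
}
```

If the last two cases are rejected, a CDATA wrapper alone would not be enough, and the offending characters would have to be removed or encoded before serialization.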

Best regards,
 Andreas


> -----Original Message-----
> From: smila-dev-bounces@xxxxxxxxxxx [mailto:smila-dev-bounces@xxxxxxxxxxx] On Behalf Of
> Juergen.Schumacher@xxxxxxxxxxx
> Sent: Thursday, 30 April 2009 15:05
> To: smila-dev@xxxxxxxxxxx
> Subject: RE: [smila-dev] org.eclipse.smila.utils.XMLHelper / XML 1.1
> 
> Hi,
> 
> > The XML 1.0 standard doesn't allow (escaped) control characters in XML,
> > whereas XML 1.1 does.
> >
> > Xerces 2.9 (which we use in SMILA) supports XML 1.1. So, when replacing
> > the version in the header constant XML_HEADER_UTF8 in XMLHelper everything
> > works fine.
> 
> The only "problem" with this approach I can think of is that there could
> be some non-SMILA message listener that is absolutely not able to read
> XML 1.1 ... that seems quite esoteric to me.
> 
> > Another workaround may be to use a CDATA section instead of a text node
> > for record attribute values (see
> > org.eclipse.smila.datamodel.record.dom.RecordBuilder.appendTextElement()).
> > Hmm, that didn't work with my test case: the CDATA section isn't correctly
> > wrapped around the control character at the end of the text, but maybe I'm
> > doing something wrong...
> 
> Even if this worked, it would add quite some overhead to the created XML
> (at least in a string serialization) to surround each attribute value
> with a CDATA section, or a performance overhead for first checking the
> string for invalid characters.
> 
> > What do you think?
> 
> I suppose I'd rather vote for the XML 1.1 approach, then.
> 
> Cheers,
> Jürgen.
> 
> 

