When you have source like this: <foo> data</foo> and you run the formatter, you get this: <foo>data</foo> i.e. the white space within the tags is removed. Additionally, this: <foo><bar>&lt;data</bar></foo> yields this: <foo> <bar>&lt;data</bar></foo>
Shaun, are you saying that in your first example the white space shouldn't be removed?
(In reply to comment #1)
> Shaun, are you saying that in your first example the white space shouldn't be removed?
That's right. The data between the tags (even though it is white space) is valid and should not be removed when formatting. White space outside tags, yes; inside, no.
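
To illustrate the point that white space inside the tags is data, here is a minimal sketch using Python's standard `xml.etree.ElementTree` parser. It is not related to the Eclipse formatter; it only shows that a parser hands the leading space to the application as part of the element's character data, so the two documents from the report are not equivalent to a consumer that reads the text.

```python
import xml.etree.ElementTree as ET

# The two documents from the report differ only by one space inside <foo>.
original = ET.fromstring("<foo> data</foo>")
formatted = ET.fromstring("<foo>data</foo>")

# The parser reports the leading space as part of the element's text.
print(repr(original.text))
print(repr(formatted.text))

# True when formatting changed the content an application would read.
content_changed = original.text != formatted.text
print(content_changed)
```

Whether that difference matters depends on the consuming application, which is what the rest of this thread is about.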
Do you have a grammar for the first example that explicitly states that the whitespace should be preserved? The second example that you provide looks like the problem reported in bug 238026 where text after an entity reference is not formatted.
(In reply to comment #3)
> Do you have a grammar for the first example that explicitly states that the whitespace should be preserved?
> The second example that you provide looks like the problem reported in bug 238026 where text after an entity reference is not formatted.
I agree. I searched for both issues but found nothing. Should I alter the description?
If you're satisfied with bug 238026 describing the second problem, sure. But as Nick asked, what does your grammar say to do when handling the white space?
(In reply to comment #5)
> If you're satisfied with bug 238026 describing the second problem, sure. But as Nick asked, what does your grammar say to do when handling the white space?
I am not sure what you mean by 'what does your grammar say to do when handling the white space'. However, it seems problematic that formatting an XML document would remove white space between tags.
The XML specification has a section on white space in XML documents (http://www.w3.org/TR/REC-xml/#sec-white-space). There is essentially "readability" white space and "significant" white space. For the most part, the XML formatter assumes that all white space in the document is simply there for readability. And that's what the formatter is there for: to improve readability. However, applications (like our formatter) can be notified that the white space within an element is significant using the xml:space attribute. So, for your example, you could have <foo xml:space="preserve"> data</foo> which would end up preserving the white space. You could also define the grammar of an element using an XML Schema (XSD) or through DTDs to explicitly state that the white space is significant.
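
As a rough illustration of how an application can honor xml:space, here is a toy sketch in Python using the standard `xml.etree.ElementTree` module. It is not the Eclipse formatter's implementation; the function names `strip_readability_whitespace` and `format_xml` are hypothetical, and the "formatting" is reduced to trimming text so the effect of the attribute is easy to see. It assumes the attribute is written directly on the element rather than supplied by a DTD or schema.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:space under the reserved XML namespace.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def strip_readability_whitespace(elem, preserve=False):
    """Trim text in place, except where xml:space="preserve" is in effect."""
    setting = elem.get(XML_SPACE)
    if setting == "preserve":
        preserve = True
    elif setting == "default":
        preserve = False
    # With no attribute, the setting is inherited from the parent element.

    if not preserve and elem.text is not None:
        elem.text = elem.text.strip()
    for child in elem:
        strip_readability_whitespace(child, preserve)
        # A child's tail is content of this element, so this element's
        # setting decides whether it may be trimmed.
        if not preserve and child.tail is not None:
            child.tail = child.tail.strip()


def format_xml(source):
    """Hypothetical stand-in for a formatter: parse, trim, serialize."""
    root = ET.fromstring(source)
    strip_readability_whitespace(root)
    return ET.tostring(root, encoding="unicode")


plain = format_xml("<foo> data</foo>")
kept = format_xml('<foo xml:space="preserve"> data</foo>')
print(plain)
print(kept)
```

In this sketch the first call drops the leading space and the second keeps it, which mirrors the behavior described above for the xml:space attribute.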
(In reply to comment #7)
> The XML specification has a section on white space in XML documents (http://www.w3.org/TR/REC-xml/#sec-white-space). There is essentially "readability" white space and "significant" white space.
> For the most part, the XML formatter assumes that all white space in the document is simply there for readability. And that's what the formatter is there for: to improve readability. However, applications (like our formatter) can be notified that the white space within an element is significant using the xml:space attribute.
> So, for your example, you could have <foo xml:space="preserve"> data</foo> which would end up preserving the white space.
> You could also define the grammar of an element using an XML Schema (XSD) or through DTDs to explicitly state that the white space is significant.
I do appreciate your explanation. Unfortunately, our message structure cannot be changed for this purpose. I would be happy to implement my own formatter as an Eclipse plugin, but I have been unable to locate the proper hooks to override this action. Can you provide any help with this? Lastly, given that this is the intended functionality, I will change this issue to a 'feature request': it would be really nice to tell the formatter to honor whitespace in Window -> Preferences instead of having to alter the message.
There has been some work here: there is now a "clear all blank lines" preference and a "preserve whitespace in tags with PCDATA content" preference. We probably still need a more general preference for preserving whitespace entirely.