When we have a web page with Unicode content, for example:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> ...
it would be useful to convert escaped special symbols (character references) such as:
title="Eclipse RCP. Файловый менеджер" />
into the Unicode characters themselves, such as:
"Eclipse RCP. Файловый менеджер"
This reduces page size and makes the code more readable. Oracle JDeveloper's HTML editor already has this feature, if you want to try it.
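As a rough illustration of the requested cleanup, here is a minimal Python sketch. It assumes the page uses numeric character references (for example `&#1060;` for "Ф"). The function name `decode_numeric_refs` is hypothetical and is not part of any Eclipse or JDeveloper API. References to markup-significant characters are deliberately left escaped so the page stays well-formed.

```python
import re

# Characters that must stay escaped so the markup remains well-formed.
_KEEP_ESCAPED = {"<", ">", "&", '"', "'"}

# Matches decimal (&#1060;) and hexadecimal (&#x424;) character references.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def decode_numeric_refs(markup: str) -> str:
    """Replace numeric character references with the characters they name."""

    def _replace(match: "re.Match[str]") -> str:
        hex_digits, dec_digits = match.group(1), match.group(2)
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Leave invalid code points and surrogates untouched.
        if code_point == 0 or code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
            return match.group(0)
        char = chr(code_point)
        # Keep markup-significant and control characters escaped.
        if char in _KEEP_ESCAPED or (code_point < 0x20 and char not in "\t\n\r"):
            return match.group(0)
        return char

    return _NUMERIC_REF.sub(_replace, markup)
```

Each decimal reference to a Cyrillic letter takes seven bytes, while the same letter takes two bytes in UTF-8, which is consistent with the size reduction this request describes. The cleanup is only safe when the page is actually saved and served as UTF-8, as the `charset=utf-8` declaration above indicates.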
Sorry, but Bugzilla converted the Unicode symbols to codes :(
Perhaps you could attach an example?
Created attachment 50023: Unicode pages cleanup samples. The sample contains two versions of the same Unicode HTML page. After cleanup in Oracle JDeveloper, the page is almost half the size and the code is readable. The original page was generated from a DocBook XML file.
I'll leave this as a feature request, but I have to ask: how do you convert from the DocBook XML file? Shouldn't that be the point where the characters are encoded as UTF-8?
I converted the DocBook XML files with an external program (XMLmind XML Editor). I know it is possible to write my own XSL files for the transformation, but I don't have much time to learn XSL. It may also be possible to generate the documents directly in UTF-8, but I don't know how to do that right now.
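To illustrate why the output encoding chosen at conversion time matters, here is a small sketch using Python's standard `xml.etree.ElementTree`. It is only an analogy and does not show how XMLmind or the DocBook stylesheets behave. The helper name `serialize` and the sample title are made up for the example. In XSLT, the comparable setting is the `encoding` attribute of `xsl:output`.

```python
import xml.etree.ElementTree as ET


def serialize(title: str, encoding: str) -> bytes:
    """Serialize a one-element document using the given output encoding."""
    img = ET.Element("img", {"title": title})
    return ET.tostring(img, encoding=encoding)


title = "\u0424\u0430\u0439\u043b"  # a sample Cyrillic word

# An ASCII serializer cannot represent Cyrillic letters directly,
# so it falls back to numeric character references.
ascii_bytes = serialize(title, "us-ascii")

# A UTF-8 serializer writes the letters themselves.
utf8_bytes = serialize(title, "utf-8")

print(len(ascii_bytes), len(utf8_bytes))
```

The same document comes out larger and less readable when the serializer targets ASCII, so selecting UTF-8 at the transformation step would avoid the need for a later cleanup.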
This is a great idea, but we don't have people to work on it just now. There is still a chance it could fit into 3.0, and we would accept high-quality patches regardless.