Bug 225434 - Faster XML escaping in XMLWriter
Summary: Faster XML escaping in XMLWriter
Status: RESOLVED FIXED
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: p2
Version: 3.4
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: 3.4 M7
Assignee: P2 Inbox
Keywords: performance
Reported: 2008-04-02 17:18 EDT by Simon Kaegi
Modified: 2008-04-03 12:32 EDT

Attachments
proposed patch (1.57 KB, patch)
2008-04-02 17:18 EDT, Simon Kaegi

Description Simon Kaegi 2008-04-02 17:18:48 EDT
Created attachment 94630
proposed patch

Some IUs contain large amounts of text that we end up escaping as we write it as XML; licensing info is one example. We're currently using String concatenation but should use a StringBuffer instead.
Comment 1 Pascal Rapicault 2008-04-02 20:31:39 EDT
What is the gain, given that this forces us to keep another copy of the string in memory?
Comment 2 Simon Kaegi 2008-04-02 21:30:54 EDT
For the tptp_min testcase this is saving a little over 1s (on my laptop). Writing manifests and especially licenses are the main culprits. Also, this is just for serializing the IUs in the profile but would also help whenever we persist any repo changes during generation or mirroring. 

If there are no characters to escape we don't allocate a replacement buffer and eventually return the original string. When there are characters to escape I think the memory footprint should be similar to the current implementation.
Comment 3 John Arthorne 2008-04-02 21:37:02 EDT
This change looks good to me. The old code was really inefficient:

 txt = txt.substring(0, i) + replace + txt.substring(i + 1);

This creates at least two garbage strings, and a string buffer, for each character that is escaped.  The new code just creates a single string buffer for the whole escaping operation.
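For illustration, the per-character concatenation pattern described above might have looked roughly like the sketch below. Only the quoted `txt = ...` line comes from this thread; the surrounding loop, the class and method names, and the set of escaped characters are assumptions, not the actual XMLWriter source.

```java
// Hypothetical sketch of the old approach; not the actual XMLWriter code.
final class ConcatEscapeSketch {

    // Assumed replacement table: the five predefined XML entities.
    static String replacementFor(char c) {
        switch (c) {
            case '&': return "&amp;";
            case '<': return "&lt;";
            case '>': return "&gt;";
            case '"': return "&quot;";
            case '\'': return "&apos;";
            default: return null; // no escaping needed
        }
    }

    static String escapeByConcat(String txt) {
        for (int i = 0; i < txt.length(); i++) {
            String replace = replacementFor(txt.charAt(i));
            if (replace != null) {
                // Every escaped character rebuilds the whole text:
                // two substrings plus the temporary buffer behind "+".
                txt = txt.substring(0, i) + replace + txt.substring(i + 1);
                // Skip past the inserted entity so its '&' is not escaped again.
                i += replace.length() - 1;
            }
        }
        return txt;
    }
}
```

With n escaped characters in a text of length m, this pattern copies on the order of n * m characters, which is why long license texts are the expensive case.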
Comment 4 John Arthorne 2008-04-03 09:58:43 EDT
The only tweak I would make is to initialize the string buffer so it is large enough to fit the entire original text, plus some extra space for escaped characters. Otherwise the buffer may need to grow several times and create more garbage.
Comment 5 Simon Kaegi 2008-04-03 12:32:35 EDT
Patch released with John's suggestion: using the string length + 16 as the initial buffer size.
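A minimal sketch of the approach discussed in this bug, assuming the behavior described in the comments: the buffer is allocated only once a character needs escaping, it is sized to the text length + 16, and the original string is returned when nothing was escaped. The class name, method name, and the set of escaped characters are hypothetical; this is not the released patch.

```java
// Hypothetical sketch of the buffer-based approach; not the released patch.
final class BufferedEscapeSketch {

    static String escape(String txt) {
        StringBuffer buffer = null; // stays null until an escape is needed
        for (int i = 0; i < txt.length(); i++) {
            char c = txt.charAt(i);
            String replace;
            switch (c) { // assumed set: the five predefined XML entities
                case '&': replace = "&amp;"; break;
                case '<': replace = "&lt;"; break;
                case '>': replace = "&gt;"; break;
                case '"': replace = "&quot;"; break;
                case '\'': replace = "&apos;"; break;
                default: replace = null;
            }
            if (replace == null) {
                // Ordinary character: copy it only if a buffer already exists.
                if (buffer != null) {
                    buffer.append(c);
                }
                continue;
            }
            if (buffer == null) {
                // First escaped character: allocate once, with room for the
                // original text plus some slack, then copy the clean prefix.
                buffer = new StringBuffer(txt.length() + 16);
                buffer.append(txt, 0, i);
            }
            buffer.append(replace);
        }
        // No escapes found: return the original string, no extra copy.
        return buffer == null ? txt : buffer.toString();
    }
}
```

In this sketch a text without special characters costs one scan and no allocation, which addresses the memory concern raised in comment 1; a text with escapes costs one buffer plus the final string.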