225434 – Faster XML escaping in XMLWriter

Summary: Faster XML escaping in XMLWriter

Status:	RESOLVED FIXED

Alias:	None

Product:	Equinox
Classification:	Eclipse Project
Component:	p2
Version:	3.4
Hardware:	PC Windows XP

Importance:	P3 normal
Target Milestone:	3.4 M7
Assignee:	P2 Inbox
QA Contact:

URL:
Whiteboard:
Keywords:	performance

Depends on:
Blocks:

Reported:	2008-04-02 17:18 EDT by Simon Kaegi
Modified:	2008-04-03 12:32 EDT
CC List:	1 user

See Also:

Attachments
proposed patch (1.57 KB, patch) 2008-04-02 17:18 EDT, Simon Kaegi

Description Simon Kaegi

2008-04-02 17:18:48 EDT

Created attachment 94630
proposed patch

Some IUs contain large amounts of text, such as licensing info, that we end up escaping as we write it out as XML. We're currently using String concatenation but should use a StringBuffer instead.

Comment 1 Pascal Rapicault

2008-04-02 20:31:39 EDT

What is the gain, given that this forces us to hold another copy of the string in memory?

Comment 2 Simon Kaegi

2008-04-02 21:30:54 EDT

For the tptp_min testcase this saves a little over 1s (on my laptop). Writing manifests and especially licenses are the main culprits. This is just for serializing the IUs in the profile, but it would also help whenever we persist repo changes during generation or mirroring.

If there are no characters to escape, we don't allocate a replacement buffer and simply return the original string. When there are characters to escape, I think the memory footprint should be similar to the current implementation.

Comment 3 John Arthorne

2008-04-02 21:37:02 EDT

This change looks good to me. The old code was really inefficient:

 txt = txt.substring(0, i) + replace + txt.substring(i + 1);

This creates at least two garbage strings, and a string buffer, for each character that is escaped.  The new code just creates a single string buffer for the whole escaping operation.
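
For illustration, here is a minimal sketch of the concatenation pattern described above. Only the `substring` line comes from the old code; the surrounding loop, the class name `OldEscapeSketch`, the helper `replacementFor`, and the set of escaped characters are assumptions made for the example, not the actual XMLWriter source.

```java
class OldEscapeSketch {
    // Hypothetical helper: returns the XML entity for a character,
    // or null when the character needs no escaping.
    static String replacementFor(char c) {
        switch (c) {
            case '&': return "&amp;";
            case '<': return "&lt;";
            case '>': return "&gt;";
            case '"': return "&quot;";
            case '\'': return "&apos;";
            default: return null;
        }
    }

    // Concatenation-based escaping: each escaped character rebuilds the
    // entire string from two substrings plus the replacement.
    static String escapeByConcatenation(String txt) {
        for (int i = 0; i < txt.length(); i++) {
            String replace = replacementFor(txt.charAt(i));
            if (replace != null) {
                txt = txt.substring(0, i) + replace + txt.substring(i + 1);
                i += replace.length() - 1; // skip past the inserted entity
            }
        }
        return txt;
    }
}
```

Each pass through the `if` branch copies the whole text again, so a long license with many escapable characters costs roughly (text length) x (number of escapes) character copies.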

Comment 4 John Arthorne

2008-04-03 09:58:43 EDT

The only tweak I would make is to initialize the string buffer so it is large enough to fit the entire original text, plus some extra space for escaped characters. Otherwise the buffer may need to grow several times and create more garbage.

Comment 5 Simon Kaegi

2008-04-03 12:32:35 EDT

Patch released with John's suggestion - using the string length + 16 as the initial buffer size.
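
As a rough sketch of the approach discussed in this bug (not the released patch itself), the escaping could look like the following. The class name `BufferedEscapeSketch`, the helper `replacementFor`, and the exact set of escaped characters are assumptions for illustration. The lazy allocation and the `length + 16` initial capacity follow comments 2 and 5.

```java
class BufferedEscapeSketch {
    // Hypothetical helper: returns the XML entity for a character,
    // or null when the character needs no escaping.
    static String replacementFor(char c) {
        switch (c) {
            case '&': return "&amp;";
            case '<': return "&lt;";
            case '>': return "&gt;";
            case '"': return "&quot;";
            case '\'': return "&apos;";
            default: return null;
        }
    }

    static String escape(String txt) {
        StringBuffer buffer = null; // allocated only once an escape is needed
        for (int i = 0; i < txt.length(); i++) {
            char c = txt.charAt(i);
            String replace = replacementFor(c);
            if (replace != null) {
                if (buffer == null) {
                    // Room for the original text plus some escaped characters.
                    buffer = new StringBuffer(txt.length() + 16);
                    buffer.append(txt, 0, i); // copy the unescaped prefix once
                }
                buffer.append(replace);
            } else if (buffer != null) {
                buffer.append(c);
            }
        }
        // No escapes found: return the original string, no copy made.
        return buffer == null ? txt : buffer.toString();
    }
}
```

With this shape, text that needs no escaping costs one scan and no allocation, and text that does need escaping is copied once into a buffer that rarely has to grow.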