Bug 424552 - Improve the performance of the markup validator
Summary: Improve the performance of the markup validator
Status: RESOLVED FIXED
Alias: None
Product: WTP Source Editing
Classification: WebTools
Component: wst.xml
Version: 3.5.1
Hardware: All All
Importance: P3 enhancement
Target Milestone: 3.6 M5
Assignee: Nick Sandonato CLA
QA Contact: Nick Sandonato CLA
URL:
Whiteboard:
Keywords: performance
Depends on:
Blocks:
 
Reported: 2013-12-20 15:26 EST by Nick Sandonato CLA
Modified: 2014-01-04 01:07 EST

See Also:

Description Nick Sandonato CLA 2013-12-20 15:26:48 EST
The markup validator currently loads the file's model into memory. This can cause problems for users who have enabled markup validation, since the build validator may come across a large file without the user having ever opened it.
Comment 1 Nick Sandonato CLA 2013-12-20 16:32:54 EST
http://git.eclipse.org/c/sourceediting/webtools.sourceediting.git/commit/?id=b9d8adb0866fe6db674b4d569f6889ce9ca640b7

I've pushed the changes to master.

The markup validator now tokenizes an XML file instead of using the structured model to perform validation. This has yielded a 95-99% decrease in the time it takes to perform validation on larger files (tested on a ~2MB file). Memory usage has also improved dramatically. With the same ~2MB file:

Garbage Collection using the old markup validator:
[GC 83210K->67716K(129468K), 0.0038218 secs]
[GC 83489K->76683K(129468K), 0.0256200 secs]
[GC 93707K->90081K(129468K), 0.0147162 secs]
[GC 90081K(129468K), 0.0051320 secs]
[GC 107105K->104152K(129468K), 0.0672853 secs]
[GC 121176K->117141K(134844K), 0.0448868 secs]
[GC 134165K->126963K(144824K), 0.0283667 secs]
[GC 143987K->132660K(150776K), 0.0151360 secs]
[GC 132944K(150776K), 0.0040487 secs]
[GC 149607K->138446K(236592K), 0.0109819 secs]
[GC 155470K->145282K(236592K), 0.0171411 secs]
[GC 162306K->146878K(236592K), 0.0089203 secs]
[GC 163902K->145559K(236592K), 0.0020895 secs]
[GC 162583K->145361K(236592K), 0.0021159 secs]
[GC 162385K->145311K(236592K), 0.0021374 secs]
[GC 162335K->145303K(236592K), 0.0020723 secs]
[GC 162327K->145303K(236592K), 0.0020560 secs]
[GC 162327K->145305K(236592K), 0.0021291 secs]
[GC 162329K->145303K(236592K), 0.0021311 secs]
[GC 162327K->145303K(236592K), 0.0020654 secs]
[GC 162327K->145303K(236592K), 0.0021938 secs]
[GC 162327K->145305K(236592K), 0.0021320 secs]
[GC 162329K->145303K(236592K), 0.0021453 secs]
[GC 162327K->145301K(236592K), 0.0022555 secs]
[GC 162325K->145305K(236592K), 0.0020959 secs]
[GC 162329K->145303K(236592K), 0.0020837 secs]
[GC 162327K->145303K(236592K), 0.0020251 secs]
[GC 162327K->145301K(236592K), 0.0020939 secs]
[GC 162325K->145305K(236592K), 0.0020974 secs]
[GC 162329K->145303K(236592K), 0.0020893 secs]
[GC 162327K->145303K(236592K), 0.0022085 secs]
[GC 162327K->145303K(236592K), 0.0021111 secs]
[GC 162327K->145305K(236592K), 0.0020748 secs]
[GC 162329K->145303K(236592K), 0.0022079 secs]
[Full GC 157249K->67775K(236592K), 0.4674900 secs]

Garbage Collection with the new validator:
[GC 83272K->67968K(129608K), 0.0037404 secs]
[GC 84992K->75629K(129608K), 0.0268719 secs]
[GC 92653K->76503K(129608K), 0.0063339 secs]
[GC 93527K->75898K(129608K), 0.0024181 secs]
[GC 92922K->75691K(129608K), 0.0024072 secs]
[Full GC 89762K->67665K(129608K), 0.4362108 secs]
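
For readers unfamiliar with the approach, here is a minimal sketch of token-based validation, as opposed to building a full in-memory model. It is not the code from the commit above and does not use the WTP tokenizer. The class name StreamingTagCheck and its check method are hypothetical, and the sketch only checks that start and end tags balance. It does not handle comments, CDATA sections, or attribute values that contain '<' or '>'.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical streaming tag-balance check; an illustration, not the WTP validator. */
final class StreamingTagCheck {

    /** Reads markup one character at a time and returns a list of problem messages. */
    static List<String> check(Reader in) {
        Deque<String> open = new ArrayDeque<>();     // names of currently open elements
        List<String> problems = new ArrayList<>();
        try {
            int c;
            while ((c = in.read()) != -1) {
                if (c != '<') {
                    continue;                        // character data is skipped, never stored
                }
                // Collect the tag text up to the closing '>'.
                StringBuilder tag = new StringBuilder();
                while ((c = in.read()) != -1 && c != '>') {
                    tag.append((char) c);
                }
                if (c == -1) {
                    problems.add("Unterminated tag: <" + tag);
                    break;
                }
                String text = tag.toString();
                if (text.startsWith("?") || text.startsWith("!")) {
                    continue;                        // declarations and PIs are ignored here
                }
                if (text.startsWith("/")) {
                    String name = text.substring(1).trim();
                    if (!open.isEmpty() && open.peek().equals(name)) {
                        open.pop();                  // end tag matches the innermost open element
                    } else {
                        problems.add("Unexpected end tag </" + name + ">");
                    }
                } else if (!text.endsWith("/")) {
                    // Start tag: the name is everything before the first whitespace or '/'.
                    open.push(text.split("[\\s/]", 2)[0]);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Anything still on the stack was never closed.
        for (String name : open) {
            problems.add("Missing end tag for <" + name + ">");
        }
        return problems;
    }
}
```

Memory in this sketch grows with the nesting depth (the stack of open element names) plus the length of the longest single tag, not with the file size. That is the general reason a tokenizing pass can stay far below the heap usage of a full model, in line with the GC output above.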
Comment 2 Nick Sandonato CLA 2013-12-20 17:01:59 EST
Tested with a 16MB XML file.

Validation with new validator: 2511ms
Validation with old validator: (I don't know. I gave up after five minutes)