57137 – [content type] investigate content type registry performance

Status:	RESOLVED FIXED

Alias:	None

Product:	Platform
Classification:	Eclipse Project
Component:	Runtime
Version:	3.0
Hardware:	PC Windows XP

Importance:	P3 normal
Target Milestone:	3.1 RC1
Assignee:	Rafael Chaves
QA Contact:

URL:
Whiteboard:
Keywords:	performance

Depends on:
Blocks:

Reported:	2004-04-01 17:13 EST by Rafael Chaves
Modified:	2005-07-16 15:01 EDT
CC List:	4 users

See Also:

Attachments
patch for org.eclipse.core.tests.runtime (12.04 KB, patch) 2005-02-15 13:53 EST, Rafael Chaves
patch for org.eclipse.core.runtime (4.56 KB, patch) 2005-05-16 17:44 EDT, Rafael Chaves

Description Rafael Chaves

2004-04-01 17:13:57 EST

Consider 1000 content types a reasonable worst-case scenario. How much memory is
consumed? How long does it take to find content types based on file names or
contents?

Comment 1 Rafael Chaves

2004-06-15 12:41:30 EDT

The following improvements were performed so far:

- reduced footprint for content description objects (bug 61976)
- caching content descriptions when IFile#getContentDescription is used; default
content descriptions are shared (bug 58726)

Keeping this PR open to track any further work on performance.

Comment 2 Rafael Chaves

2004-10-07 12:09:45 EDT

Another performance fix: bug 61975.

Comment 3 Rafael Chaves

2005-02-15 13:53:53 EST

Created attachment 17958
patch for org.eclipse.core.tests.runtime

The patch contributes some performance tests for the content type
infrastructure. Running the tests, I could not see any excessive performance
costs while doing regular operations such as content type matching (by name or
content) and content type inheritance checking when the platform contains more
than 1000 content types.
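
For readers without the attachment, the kind of measurement described above can be sketched roughly as follows. This is not the attached patch: NameMatchBenchmark, its methods, and the synthetic "typeN"/"extN" names are all made up for illustration, and a plain hash map stands in for the real content type registry. It only shows the shape of a name-matching timing loop over 1000 registered types.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a content type registry; not the Eclipse implementation.
final class NameMatchBenchmark {
    // Maps a file extension to the ids of the content types associated with it.
    private final Map<String, List<String>> byExtension = new HashMap<>();

    void register(String contentTypeId, String extension) {
        byExtension.computeIfAbsent(extension, k -> new ArrayList<>()).add(contentTypeId);
    }

    // Returns the ids of the content types whose extension matches the file name.
    List<String> findByFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Collections.emptyList();
        }
        List<String> found = byExtension.get(fileName.substring(dot + 1));
        return found == null ? Collections.<String>emptyList() : found;
    }

    // Builds a registry with `count` synthetic types: type0/ext0, type1/ext1, ...
    static NameMatchBenchmark withSyntheticTypes(int count) {
        NameMatchBenchmark registry = new NameMatchBenchmark();
        for (int i = 0; i < count; i++) {
            registry.register("type" + i, "ext" + i);
        }
        return registry;
    }

    public static void main(String[] args) {
        NameMatchBenchmark registry = withSyntheticTypes(1000);
        int lookups = 100_000;
        int matched = 0;
        long start = System.nanoTime();
        for (int i = 0; i < lookups; i++) {
            matched += registry.findByFileName("file.ext" + (i % 1000)).size();
        }
        long elapsedMicros = (System.nanoTime() - start) / 1_000;
        System.out.println(matched + " matches in " + elapsedMicros + " us");
    }
}
```

A real test would run against the actual registry and use the performance test harness rather than System.nanoTime; the loop above only illustrates what is being timed.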

Once the patch gets released to HEAD, perf_30 and perf_301 branches, I will
close this PR.

Comment 4 Rafael Chaves

2005-02-15 13:56:15 EST

DJ, these are changes that only affect the performance tests (test addition).
Should I still wait until M5 is out or can I release it now?

Comment 5 DJ Houghton

2005-02-15 14:33:50 EST

If the core code is already released and it's only a test change, then feel free
to release.

Comment 6 Rafael Chaves

2005-02-15 18:24:04 EST

Ok. Released to HEAD/M5, 3.0 and 3.0.1 performance testing branches. Closing.

Comment 7 Rafael Chaves

2005-04-01 13:59:54 EST

This could use more work during M7. Need to:

- investigate why performance tests are failing (measuring done wrong in the
3.0.1 tests?)
- add performance monitoring using new PerformanceStats API
- make sure new features (bug 69640 and bug 87447) don't cause performance
regressions

See also related bug 89287.

Comment 8 Rafael Chaves

2005-04-05 11:58:15 EDT

Things to look into:
- consider merging the validation/description stages
- at least optimize the case where one single content type is eligible
- improve XMLRootElementContentDescriber (stop using XML parser)

See also bug 89195 comment 23.

Comment 9 Rafael Chaves

2005-04-25 11:49:29 EDT

Minor improvement (memory footprint): XML files with explicit UTF-8 encoding
manifested in the <?xml?> processing instruction are treated as if no encoding
was explicitly specified. This increases the chances the same default content
description object can be used for most XML files.
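
The idea can be illustrated with a small sketch. Everything here is hypothetical (the SharedXmlDescription class, its Description type, and the regular expression are not taken from the Eclipse sources); it only shows how mapping an explicit UTF-8 declaration to the "no explicit encoding" case lets more files share a single default object.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical model of sharing one default description object; not the Eclipse code.
final class SharedXmlDescription {
    // Immutable description carrying only the charset; null means "no explicit encoding".
    static final class Description {
        final String charset;

        Description(String charset) {
            this.charset = charset;
        }
    }

    // Single shared instance used for every file that needs no specific description.
    static final Description DEFAULT = new Description(null);

    // Finds the encoding pseudo-attribute of an <?xml ...?> declaration.
    private static final Pattern ENCODING =
        Pattern.compile("<\\?xml[^>]*\\bencoding\\s*=\\s*[\"']([^\"']+)[\"']");

    // Returns the shared default when no encoding is declared or UTF-8 is declared.
    static Description describe(String xmlText) {
        Matcher m = ENCODING.matcher(xmlText);
        if (!m.find() || "UTF-8".equalsIgnoreCase(m.group(1))) {
            return DEFAULT;
        }
        return new Description(m.group(1));
    }
}
```

With this scheme only files declaring some other encoding allocate their own description object.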

Comment 10 Rafael Chaves

2005-05-11 10:17:56 EDT

Ran out of time for M7. If the required changes end up not being very extensive
and the performance improvements prove to be worthwhile, I will try to get
approval to do this after M7.

Comment 11 Rafael Chaves

2005-05-16 17:44:23 EDT

Created attachment 21231
patch for org.eclipse.core.runtime

This patch implements the idea of doing a single pass of validation/description
if only one content type matches a file name (*.java case). It also will
short-circuit the case where no content type matches a file name specification
(we were doing some unnecessary work in this case).
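
As a rough illustration of the control flow described above (not the patch itself): SinglePassLookup, ContentType, and Describer below are invented names, and matching by extension stands in for the real file name specifications. The call counter exists only to make the number of passes over the contents visible.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the lookup flow; names and types are illustrative only.
final class SinglePassLookup {
    // Inspects the contents once; returns a description, or null if they are invalid.
    interface Describer {
        String describe(String contents);
    }

    static final class ContentType {
        final String id;
        final String extension;
        final Describer describer;
        int describerCalls; // how many times the contents were inspected for this type

        ContentType(String id, String extension, Describer describer) {
            this.id = id;
            this.extension = extension;
            this.describer = describer;
        }

        String run(String contents) {
            describerCalls++;
            return describer.describe(contents);
        }
    }

    private final List<ContentType> types = new ArrayList<>();

    void add(ContentType type) {
        types.add(type);
    }

    // Returns "<type id>:<description>", or null when nothing matches.
    String describe(String fileName, String contents) {
        List<ContentType> candidates = new ArrayList<>();
        for (ContentType t : types) {
            if (fileName.endsWith("." + t.extension)) {
                candidates.add(t);
            }
        }
        if (candidates.isEmpty()) {
            return null; // short-circuit: the contents are never inspected
        }
        if (candidates.size() == 1) {
            // Single pass: one describer call both validates and describes.
            ContentType only = candidates.get(0);
            String description = only.run(contents);
            return description == null ? null : only.id + ":" + description;
        }
        // General case: validate candidates first, then describe the selected one.
        ContentType selected = null;
        for (ContentType t : candidates) {
            if (selected == null && t.run(contents) != null) {
                selected = t;
            }
        }
        return selected == null ? null : selected.id + ":" + selected.run(contents);
    }
}
```

In the single-candidate branch the describer runs once; in the general branch the selected type is inspected twice, which is the cost the single-pass case avoids.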

Comment 12 Rafael Chaves

2005-05-16 17:47:04 EDT

DJ, do you approve the changes to org.eclipse.core.runtime? They are very
localized and there are no relevant changes to behavior.

Comment 13 DJ Houghton

2005-05-16 21:58:05 EDT

+1 for RC1

Comment 14 Rafael Chaves

2005-05-17 12:40:22 EDT

Released patch. 

Also, I opened bug 92617 to address the fact we use a full-fledged SAX parser
for content type matching, but I don't think that will be addressed for 3.1.
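
For context, SAX-based root element detection generally looks something like the sketch below. This uses only the standard JDK SAX API and is not the XMLRootElementContentDescriber source; RootElementSniffer and rootElementOf are hypothetical names. It shows why the approach is heavyweight: a complete parser is created and started just to read the first start tag.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative only: finds the root element name using a full SAX parser.
final class RootElementSniffer {
    private static final class Handler extends DefaultHandler {
        String rootName;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            rootName = qName;
            // Abort parsing: nothing after the first start tag is needed.
            throw new SAXException("root element found");
        }
    }

    // Returns the root element name, or null if no root element could be read.
    static String rootElementOf(String xmlText) {
        Handler handler = new Handler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xmlText)), handler);
        } catch (Exception e) {
            // Either the deliberate abort above or a genuine parse error;
            // rootName is still null in the latter case.
        }
        return handler.rootName;
    }
}
```

For example, rootElementOf("<project><target/></project>") yields "project" without the rest of the document being processed, but the parser setup cost is still paid on every call.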

Will close this PR as soon as the remaining issues (failure in workspace content
description performance tests) are investigated.

Comment 15 Rafael Chaves

2005-05-18 18:20:20 EDT

The test failures are due to the fact that an unrelated test project contributes
many more content types in 3.1 than it used to in 3.0. Running without that
project actually shows that we are slightly faster in 3.1 (~7% in the cold
test, ~3.5% in the warmed-up test). I am disabling those tests for now. Need to
devise a way to run them separately, as a session test for instance.

Closing. All the work planned for 3.1 is done. Any specific issues should be
addressed in a separate PR.