Bug 57137 - [content type] investigate content type registry performance
Status: RESOLVED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Runtime
Version: 3.0
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: 3.1 RC1
Assignee: Rafael Chaves CLA
QA Contact:
URL:
Whiteboard:
Keywords: performance
Depends on:
Blocks:
 
Reported: 2004-04-01 17:13 EST by Rafael Chaves CLA
Modified: 2005-07-16 15:01 EDT

See Also:


Attachments
patch for org.eclipse.core.tests.runtime (12.04 KB, patch)
2005-02-15 13:53 EST, Rafael Chaves CLA
patch for org.eclipse.core.runtime (4.56 KB, patch)
2005-05-16 17:44 EDT, Rafael Chaves CLA

Description Rafael Chaves CLA 2004-04-01 17:13:57 EST
Consider 1000 content types a reasonable worst-case scenario. How much memory is
consumed? How long does it take to find content types based on file names or
contents?
Comment 1 Rafael Chaves CLA 2004-06-15 12:41:30 EDT
The following improvements have been made so far:

- reduced footprint for content description objects (bug 61976)
- content descriptions are cached when IFile#getContentDescription is used, and
default content descriptions are shared (bug 58726)

Keeping this PR open to track any further work on performance.
Comment 2 Rafael Chaves CLA 2004-10-07 12:09:45 EDT
Another performance fix: bug 61975.
Comment 3 Rafael Chaves CLA 2005-02-15 13:53:53 EST
Created attachment 17958
patch for org.eclipse.core.tests.runtime

The patch contributes some performance tests for the content type
infrastructure. Running the tests, I could not see any excessive performance
costs while doing regular operations such as content type matching (by name or
content) and content type inheritance checking when the platform contains more
than 1000 content types.

Once the patch gets released to HEAD, perf_30 and perf_301 branches, I will
close this PR.
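
For readers without the attachment at hand, here is a rough sketch of the kind
of measurement described above: build a registry of 1000 content types and time
lookups by file name. This is not the attached patch. The class
RegistryLookupSketch, its methods, and the extension-keyed map are all
hypothetical stand-ins and do not use the real content type API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a content type registry; not the Eclipse API.
class RegistryLookupSketch {

    // Builds a registry of `count` made-up content types, one per file extension.
    static Map<String, List<String>> buildRegistry(int count) {
        Map<String, List<String>> byExtension = new HashMap<>();
        for (int i = 0; i < count; i++) {
            byExtension.computeIfAbsent("ext" + i, k -> new ArrayList<>()).add("type" + i);
        }
        return byExtension;
    }

    // Returns the names of the content types associated with the file's extension.
    static List<String> findByFileName(Map<String, List<String>> registry, String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return Collections.emptyList();
        }
        List<String> hit = registry.get(fileName.substring(dot + 1));
        return hit == null ? Collections.<String>emptyList() : hit;
    }

    public static void main(String[] args) {
        Map<String, List<String>> registry = buildRegistry(1000);
        long start = System.nanoTime();
        int found = 0;
        for (int i = 0; i < 100_000; i++) {
            found += findByFileName(registry, "file.ext" + (i % 1000)).size();
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(found + " matches in " + elapsedMs + " ms");
    }
}
```

A real test would go through the platform's content type manager and a
performance test harness rather than System.nanoTime; the sketch only shows the
shape of the scenario.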
Comment 4 Rafael Chaves CLA 2005-02-15 13:56:15 EST
DJ, these are changes that only affect the performance tests (test addition).
Should I still wait until M5 is out or can I release it now?
Comment 5 DJ Houghton CLA 2005-02-15 14:33:50 EST
If the core code is already released and it's only a test change, then feel free
to release.
Comment 6 Rafael Chaves CLA 2005-02-15 18:24:04 EST
Ok. Released to HEAD/M5, 3.0 and 3.0.1 performance testing branches. Closing.
Comment 7 Rafael Chaves CLA 2005-04-01 13:59:54 EST
This could use more work during M7. Need to:

- investigate why performance tests are failing (measuring done wrong in the
3.0.1 tests?)
- add performance monitoring using new PerformanceStats API
- make sure new features (bug 69640 and bug 87447) don't cause performance
regressions

See also related bug 89287.
Comment 8 Rafael Chaves CLA 2005-04-05 11:58:15 EDT
Things to look into:
- consider merging the validation/description stages
- at least optimize the case where one single content type is eligible
- improve XMLRootElementContentDescriber (stop using XML parser)

See also bug 89195 comment 23.
Comment 9 Rafael Chaves CLA 2005-04-25 11:49:29 EDT
Minor improvement (memory footprint): XML files with explicit UTF-8 encoding
manifested in the <?xml?> processing instruction are treated as if no encoding
was explicitly specified. This increases the chances the same default content
description object can be used for most XML files.
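
A minimal sketch of the idea, assuming a describer that only looks at the XML
declaration. The names SharedDescriptionSketch, Description and DEFAULT are
hypothetical and the regular expression is a simplification; this is not the
code that was released.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not the actual describer implementation.
class SharedDescriptionSketch {

    // Stand-in for a content description; a null charset means "none specified".
    static final class Description {
        final String charset;

        Description(String charset) {
            this.charset = charset;
        }
    }

    // Single shared instance used whenever nothing differs from the defaults.
    static final Description DEFAULT = new Description(null);

    // Simplified match of the encoding pseudo-attribute in the XML declaration.
    private static final Pattern ENCODING =
            Pattern.compile("^<\\?xml[^>]*?encoding\\s*=\\s*[\"']([^\"']+)[\"']");

    static Description describe(String xmlStart) {
        Matcher m = ENCODING.matcher(xmlStart);
        if (!m.find()) {
            return DEFAULT; // no explicit encoding
        }
        String encoding = m.group(1);
        if ("UTF-8".equalsIgnoreCase(encoding)) {
            return DEFAULT; // explicit UTF-8 is treated as if unspecified
        }
        return new Description(encoding); // anything else needs its own object
    }

    public static void main(String[] args) {
        Description a = describe("<?xml version=\"1.0\"?><root/>");
        Description b = describe("<?xml version=\"1.0\" encoding=\"UTF-8\"?><root/>");
        System.out.println("same instance: " + (a == b));
    }
}
```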
Comment 10 Rafael Chaves CLA 2005-05-11 10:17:56 EDT
Ran out of time for M7. If the required changes end up not being very extensive
and the performance improvements prove to be worthwhile, I will try to get
approval to do this after M7.
Comment 11 Rafael Chaves CLA 2005-05-16 17:44:23 EDT
Created attachment 21231
patch for org.eclipse.core.runtime

This patch implements the idea of doing a single pass of validation/description
if only one content type matches a file name (the *.java case). It also
short-circuits the case where no content type matches a file name specification
(we were doing some unnecessary work in this case).
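
The control flow described above can be sketched roughly as follows. This is
not the attached patch. SinglePassSketch, Type, describe and find are
hypothetical names, the "describer" is reduced to a prefix check, and returning
null for "no match" is an assumption made only for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the lookup flow; not the Eclipse implementation.
class SinglePassSketch {

    // Stand-in for a content type with a trivial describer rule.
    static final class Type {
        final String name;
        final String extension;
        final String marker; // contents must start with this to be accepted

        Type(String name, String extension, String marker) {
            this.name = name;
            this.extension = extension;
            this.marker = marker;
        }
    }

    // Counts describer invocations so the saved work is visible.
    static int describerRuns = 0;

    static boolean describe(Type type, String contents) {
        describerRuns++;
        return contents.startsWith(type.marker);
    }

    static List<Type> byFileName(List<Type> all, String fileName) {
        List<Type> result = new ArrayList<>();
        for (Type t : all) {
            if (fileName.endsWith("." + t.extension)) {
                result.add(t);
            }
        }
        return result;
    }

    static String find(List<Type> all, String fileName, String contents) {
        List<Type> candidates = byFileName(all, fileName);
        if (candidates.isEmpty()) {
            return null; // short-circuit: never touch the contents
        }
        if (candidates.size() == 1) {
            Type only = candidates.get(0);
            return describe(only, contents) ? only.name : null; // one combined pass
        }
        // General case: a validation pass picks a type, a second pass describes it.
        Type selected = null;
        for (Type t : candidates) {
            if (describe(t, contents)) {
                selected = t;
                break;
            }
        }
        if (selected == null) {
            return null;
        }
        describe(selected, contents);
        return selected.name;
    }

    public static void main(String[] args) {
        List<Type> types = Arrays.asList(
                new Type("java", "java", ""),
                new Type("xmlDecl", "xml", "<?xml"),
                new Type("pluginXml", "xml", "<plugin"));
        System.out.println(find(types, "Foo.java", "class Foo {}")
                + " after " + describerRuns + " describer run(s)");
    }
}
```

In the sketch, the single-candidate path runs the describer once, while the
general path runs it once per candidate tried plus once more for the selected
type.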
Comment 12 Rafael Chaves CLA 2005-05-16 17:47:04 EDT
DJ, do you approve the changes to org.eclipse.core.runtime? They are very
localized and there are no relevant changes to behavior.
Comment 13 DJ Houghton CLA 2005-05-16 21:58:05 EDT
+1 for RC1
Comment 14 Rafael Chaves CLA 2005-05-17 12:40:22 EDT
Released patch. 

Also, I opened bug 92617 to address the fact that we use a full-fledged SAX
parser for content type matching, but I don't think that will be addressed for
3.1.

Will close this PR as soon as the remaining issues (failures in the workspace
content description performance tests) are investigated.
Comment 15 Rafael Chaves CLA 2005-05-18 18:20:20 EDT
The test failures are due to the fact that an unrelated test project contributes
many more content types in 3.1 than it used to in 3.0. Running without that
project actually shows that we are slightly faster in 3.1 (~7% in the cold
test, ~3.5% in the warmed-up test). I am disabling those tests for now. Need to
devise a way to run them separately, as a session test for instance.

Closing. All the work planned for 3.1 is done. Any specific issues should be
addressed in a separate PR.