Bug 306573 - Add Performance tests for fast project import from snapshot data
Summary: Add Performance tests for fast project import from snapshot data
Status: RESOLVED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Resources
Version: 3.6
Hardware: All
OS: All
Importance: P3 enhancement
Target Milestone: 3.6 M7
Assignee: Martin Oberhuber CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on: 301563
Blocks: 305717 305718
 
Reported: 2010-03-19 15:10 EDT by Martin Oberhuber CLA
Modified: 2010-04-01 06:46 EDT
CC List: 3 users

See Also:


Attachments
performance tests v1 (3.44 KB, patch)
2010-03-19 17:03 EDT, Martin Oberhuber CLA
Quick hack to avoid snapshot file copy (3.86 KB, patch)
2010-03-19 17:37 EDT, Martin Oberhuber CLA
perf_35x version of patch (2.14 KB, patch)
2010-03-22 07:50 EDT, Martin Oberhuber CLA

Description Martin Oberhuber CLA 2010-03-19 15:10:58 EDT
+++ This bug was initially created as a clone of Bug #301563 +++

Since performance was the main driver behind the "import from snapshot" functionality in bug 301563, we should add performance tests to continue monitoring this operation.

I would like to add two new performance tests:
  1.) close/open project to monitor speed of writing/reading .tree snapshots,
      related to the idea of storing these in ZIP format as per bug 305717.
  2.) importFromSnapshot to monitor speed of importing a large project from
      a refresh snapshot.

I'll use this bug to coordinate with the maintainers of the performance test suite, to ensure that we don't cause ripples through unexpected timing differences when the new tests are added.
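
For illustration, here is a rough sketch of what the two tests could look like when written directly against the generic org.eclipse.test.performance API (the actual tests will use the PerformanceTestRunner from the resources test harness, and the real setup and iteration counts will differ). The class name, iteration counts and project setup below are placeholders; the snapshot calls assume the new 3.6 M6 IProject#saveSnapshot()/#loadSnapshot() API:

  import java.net.URI;
  import org.eclipse.core.resources.IProject;
  import org.eclipse.core.runtime.CoreException;
  import org.eclipse.test.performance.PerformanceTestCase;

  public class ProjectSnapshotPerfSketch extends PerformanceTestCase {
      private IProject project;      // assumed: created and populated in setUp()
      private URI snapshotLocation;  // assumed: a file: URI outside the workspace

      // 1.) close/open project: measures writing and reading the .tree snapshot
      public void testCloseOpenProject() throws CoreException {
          for (int i = 0; i < 10; i++) {
              startMeasuring();
              project.close(null);
              project.open(null);
              stopMeasuring();
          }
          commitMeasurements();
          assertPerformance();
      }

      // 2.) importFromSnapshot: measures loading a refresh snapshot
      public void testLoadSnapshot() throws CoreException {
          project.saveSnapshot(IProject.SNAPSHOT_TREE, snapshotLocation, null);
          for (int i = 0; i < 10; i++) {
              project.close(null);  // placeholder reset; the real test may delete and
                                    // re-create the project between iterations instead
              startMeasuring();
              project.loadSnapshot(IProject.SNAPSHOT_TREE, snapshotLocation, null);
              stopMeasuring();
              project.open(null);
          }
          commitMeasurements();
          assertPerformance();
      }
  }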
Comment 1 Martin Oberhuber CLA 2010-03-19 17:03:42 EDT
Created attachment 162569 [details]
performance tests v1

Attached are the proposed performance tests. Couple questions:

1.) I'd like some feedback from a person acquainted with performance tests
    whether the approach / number of iterations makes sense. In my local 
    testing on a Windows XP laptop, I found run-time deviations in 
    [1.05s, 1.44s] for open/close and [391ms, 577ms] for loadSnapshot.
    Do I need more samples, could it be a systematic error with the test,
    or is this expected on a PC with the typical background programs like 
    Skype, Mail, IM... running? - I'm only interested in run-time for 
    these tests.

2.) I noticed that the time for "loadSnapshot" is actually LONGER (484ms)
    than the time for close/open project together (400ms). This is very
    surprising, since one would expect that just loading a snapshot should
    be faster than saving and loading it! I can only imagine that this 
    may be due to copying + deleting the snapshot ZIP into the metadata 
    area rather than opening it from its original location. Other ideas
    are welcome.

3.) Frederic, is there any kind of coordination necessary before these 
    tests can be committed and thus added to the performance wheel? Do
    any baselines need updating (loadSnapshot uses new API from 3.6M6
    but openClose uses old API)? I was also unsure what the meaning of
    runner.setFingerprintName("Refresh Project") was.
Comment 2 Martin Oberhuber CLA 2010-03-19 17:37:53 EDT
Created attachment 162572 [details]
Quick hack to avoid snapshot file copy

Trying again with a quick hack to avoid copying the snapshot file shows that I'm only slightly faster: 440ms (95% in [317ms, 564ms]) - that's still slower than a full close-project/open-project cycle.

Could it be that reading from a 150KB .zip is slower than reading from a 550KB .tree? But that would contradict Francis' earlier measurements that originally made us use a .zip format for snapshots. Or could it be because IProject#close() doesn't really write a new tree when nothing changed in the project? - But even then I'd expect import-from-snapshot to be as fast as re-open. Or perhaps it's because empty sync info + markers need to be generated rather than read on project import?

Ideas are most welcome...
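
One thing I might try locally to rule raw I/O in or out: timing just the reading of the two snapshot files, completely outside the workspace. A crude sketch (not part of the patch; the paths, iteration count and buffer size are arbitrary):

  import java.io.BufferedInputStream;
  import java.io.File;
  import java.io.FileInputStream;
  import java.io.IOException;
  import java.io.InputStream;
  import java.util.zip.ZipInputStream;

  public class SnapshotReadTiming {
      public static void main(String[] args) throws IOException {
          File zip = new File(args[0]);   // the ~150KB .zip snapshot
          File tree = new File(args[1]);  // the ~550KB plain .tree
          byte[] buf = new byte[8192];
          for (int run = 0; run < 5; run++) {
              long t0 = System.nanoTime();
              ZipInputStream zin = new ZipInputStream(new FileInputStream(zip));
              while (zin.getNextEntry() != null) {
                  while (zin.read(buf) != -1) { /* decompress and discard */ }
              }
              zin.close();
              long t1 = System.nanoTime();
              InputStream tin = new BufferedInputStream(new FileInputStream(tree));
              while (tin.read(buf) != -1) { /* read and discard */ }
              tin.close();
              long t2 = System.nanoTime();
              System.out.println("run " + run + ": zip=" + (t1 - t0) / 1000000 + "ms"
                      + " tree=" + (t2 - t1) / 1000000 + "ms");
          }
      }
  }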
Comment 3 Martin Oberhuber CLA 2010-03-19 17:40:22 EDT
.. Or perhaps it's because the .tree is still in the disk write-cache during a close-project/open-project sequence? But that should be ruled out by loadSnapshot running 5 times from the same snapshot file...
Comment 4 Martin Oberhuber CLA 2010-03-19 17:50:50 EDT
I made another quick hack to store the snapshot as the 550K .tree rather than the 150K .zip: 456ms (95% in [355ms, 556ms]), which is slightly slower than the 440ms above, just as expected. This supports Francis' original measurements: the file and folder names in this test case are randomly generated and thus don't compress as well as real-world names, and better compression is supposed to result in better performance since fewer disk blocks need to be read.

Still puzzled what's causing the slow-down...
Comment 5 Frederic Fusier CLA 2010-03-22 06:17:35 EDT
(In reply to comment #1)
> Created an attachment (id=162569) [details]
> performance tests v1
> 
> Attached are the proposed performance tests. Couple questions:
> 
> 1.) I'd like some feedback from a person acquainted with performance tests
>     whether the approach / number of iterations makes sense. In my local 
>     testing on a Windows XP laptop, I found run-time deviations in 
>     [1.05s, 1.44s] for open/close and [391ms, 577ms] for loadSnapshot.
>     Do I need more samples, could it be a systematic error with the test,
>     or is this expected on a PC with the typical background programs like 
>     Skype, Mail, IM... running? - I'm only interested in run-time for 
>     these tests.
> 
Running perf tests on a Windows machine with lots of other active applications certainly won't help keep the standard deviation small. But the releng perf test machines don't have this problem, so the results should be better there. If that's not the case, you could add some more samples and see whether that makes the numbers more stable.

> 2.) I noticed that the time for "loadSnapshot" is actually LONGER (484ms)
>     than the time for close/open project together (400ms). This is very
>     surprising, since one would expect that just loading a snapshot should
>     be faster than saving and loading it! I can only imagine that this 
>     may be due to copying + deleting the snapshot ZIP into the metadata 
>     area rather than opening it from its original location. Other ideas
>     are welcome.
> 
> 3.) Frederic, is there any kind of coordination necessary before these 
>     tests can be committed and thus added to the performance wheel? Do
>     any baselines need updating (loadSnapshot uses new API from 3.6M6
> but openClose uses old API)?

There's no real need to synchronize when adding a new performance test to your test suite. When there's no baseline for a test, the reference used to compute the delta is the result obtained with the first build that runs the new test... Note that this is usually the case for tests using new API, as it's not possible to add them on top of the previous version's code.

However, when a newly added test can be put in the baseline, it's of course better to have the baseline number before running the test in the HEAD stream. As the baseline runs on Friday at 6pm, the best thing to do is to release all your changes on Friday morning (Ottawa time) in both the perf_35x and HEAD streams. Then, as there's a nightly build with performance tests on Sunday at 8:00pm, if everything goes well (I mean no nightly build breakage), you should have the complete results for the added tests on Monday morning...


>     I was also unsure what the meaning of
>     runner.setFingerprintName("Refresh Project") was.

My guess was that it should put the test calling it into the fingerprints, and a quick look at the implementation of the PerformanceTestRunner in your test harness confirmed this assumption.

However, I didn't see any usage of this in your patch. Was that intentional? Or was your question only about other existing tests which call this method?

HTH
Comment 6 Martin Oberhuber CLA 2010-03-22 06:39:10 EDT
(In reply to comment #5)
> It's sure that running perf test on a windows machine with lot of other 
> active applications will not help to have a small standard deviation. But
> the releng perf test machines do not have this problem,

Interestingly, the deviation did seem to be somewhat systematic: all test results were in the same range on repeated runs, even when I tried hard not to have anything else running on the machine.

Anyway, from your comment I understand that it may be best to just commit the tests to the HEAD and perf_35x streams (as far as possible) before Friday and see what the results are. I assume the tests can still be improved after that, or would it be a problem if I modify test behavior once it has been checked in?

> As the baseline runs on Friday at 6pm, the best thing to do is to release
> all your changes on Friday morning (Ottawa time) both in perf_35x and HEAD

Do perf tests always run on I-builds, such that changes need to be released to a map file? Since I assume that Szymon would do that, I guess I'd have to have the changes committed earlier.

> >     runner.setFingerprintName("Refresh Project") was.
> My guess was that it should put the test calling it in the fingerprints and a
> However I didn't see any usage of this in your patch. Was it intentional? Was

I did not put it in the patch because I did not understand it, but I saw it in the existing code. I would like to understand its impact, and whether or not it makes sense to add my test to the fingerprint. I assume that this would only make sense once the test runs stably and produces understandable data, since a fingerprint test should not change its behavior easily. Correct?

Thanks
Martin
Comment 7 Martin Oberhuber CLA 2010-03-22 06:52:08 EDT
(In reply to comment #5)
> your changes on Friday morning (Ottawa time) both in perf_35x and

I committed the patch to HEAD. 

I did not find a "perf_35x" Stream on org.eclipse.core.tests.resources -- I can create it, but what should the branching point be? Looking at the mapfile, perf_35x seems to be on v20090520 for org.eclipse.core.tests.resources, which is surprisingly early, although it seems to match the R3_5 tag.
Comment 8 Frederic Fusier CLA 2010-03-22 07:18:41 EDT
(In reply to comment #6)
> Interestingly, the deviation did somehow seem to be systematic, all test
> results were in the same range on repeated runs and even when I tried hard to
> not have anything else running on the machine.
> 
> Anyways, from your comment I understand that it may be best to just commit the
> tests to the HEAD and perf_35x Streams (as far as possible) before Friday and
> see what the result is. I assume that tests can still be improved after that,
> or would it be a problem if I modify test behavior once it's been checked in?
> 
It's no problem to modify it after it has been released. I usually do not react too early to results of newly added tests...

> Do perftests always run on I-builds such that changes need to be released to a
> Mapfile? Since I assume that Szymon would do that, I guess I'd have to have
> changes committed earlier.
> 
Yes, perf tests always run on I-builds. As the performance test suite belongs to the Platform/Resources test plugin, I guess that when you change the tests, they will go into the next I-build when the build input is put together. Szymon will surely confirm this assumption...

> I did not put it in the patch because I did not understand it. But I saw it in
> the existing code. I would like to understand the impact of this, and whether
> or not it may make sense to add my test to the fingerprint. I assume that this
> would only make sense once the test runs stable and produces understandable
> data, since a fingerprint should not change its behavior easily. Correct?
> 
Correct. But fingerprint tests are also usually used to show the performance of interesting functionality that users can directly observe when using your component (typically API methods). So these are tests for which you want to be sure that no performance regression occurs.

Usually a new test is not put directly into the fingerprint, as it often needs some time to stabilize. But that can happen, typically when you have made big improvements in some area and the new test is intended to show them... :-)

Do not forget that fingerprints can have a strong impact on users' perception of your component's performance. As soon as they turn negative, a red bar is displayed there, which may give users the feeling that the current version has worse performance than the previous one. That's why I focus my weekly verification mainly on these tests.
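
For reference, when writing against the generic org.eclipse.test.performance API directly, a test opts into the fingerprint by tagging its scenario with tagAsSummary(); I assume the PerformanceTestRunner#setFingerprintName() call in your harness ends up doing the same thing. A minimal sketch (the short name and iteration count are placeholders):

  // inside a PerformanceTestCase subclass (imports: org.eclipse.test.performance.Dimension,
  // org.eclipse.core.runtime.CoreException); "Close/Open Project" is a placeholder name
  public void testCloseOpenProject() throws CoreException {
      tagAsSummary("Close/Open Project", Dimension.ELAPSED_PROCESS); // opt into the fingerprint
      for (int i = 0; i < 10; i++) {
          startMeasuring();
          project.close(null);
          project.open(null);
          stopMeasuring();
      }
      commitMeasurements();
      assertPerformance();
  }

As said above, I would only add such a tag once the test has stabilized.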
Comment 9 Frederic Fusier CLA 2010-03-22 07:21:54 EDT
(In reply to comment #7)
> 
> I committed the patch to HEAD. 
> 
> I did not find a "perf_35x" Stream on org.eclipse.core.tests.resources -- I can
> create it, but what should the branching point be? Looking at the mapfile,
> perf_35x seems to be on v20090520 for org.eclipse.core.tests.resources, which
> is surprisingly early, although it seems to match the R3_5 tag.

If the perf_35x branch does not exist, then you have to create it, and it absolutely must be branched from the R3_5 tag (whatever internal version that tag happens to match...).
Comment 10 Martin Oberhuber CLA 2010-03-22 07:50:40 EDT
Created attachment 162660 [details]
perf_35x version of patch

OK, I created a perf_35x branch and committed the attached perf_35x version of the test (only testCloseOpenProject is included). The initial measurement on my machine is the same as on 3.6:

 Elapsed Process:         1.2s         (95% in [1.09s, 1.31s]) 

Szymon, are you going to release this or should I?
Comment 11 Martin Oberhuber CLA 2010-03-22 18:34:59 EDT
I noticed that John released to HEAD; the release to perf_35x is still pending.
I'm going to keep this bug open until we've got the first performance results.
Comment 12 Szymon Brandys CLA 2010-03-31 10:24:59 EDT
I think that WorkspacePerformanceTest#testLoadSnapshot can be moved to ProjectSnapshotPerformanceTest as I already mentioned in Bug 305716, comment 3.
Comment 13 Martin Oberhuber CLA 2010-03-31 10:57:51 EDT
(In reply to comment #12)
What would be the value of making this move? - The infrastructure in WorkspacePerformanceTest is completely different, and optimized for performance tests.

ProjectSnapshotPerformanceTest, as its class Javadoc explains, is meant for standalone local tests and needs actual project data (whereas WorkspacePerformanceTest can auto-generate fake data).
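
To illustrate the difference: the fake data is essentially a tree of randomly named folders and files created through the ordinary resources API, roughly along the lines of the sketch below (names, counts and empty file contents are placeholders, not the actual harness code):

  import java.io.ByteArrayInputStream;
  import java.util.Random;
  import org.eclipse.core.resources.IFile;
  import org.eclipse.core.resources.IFolder;
  import org.eclipse.core.resources.IProject;
  import org.eclipse.core.runtime.CoreException;

  public class FakeProjectData {
      private static final Random RANDOM = new Random();

      // random names are also why this data compresses worse than real-world projects
      static String randomName() {
          return Long.toHexString(RANDOM.nextLong());
      }

      static void populate(IProject project, int folderCount, int filesPerFolder)
              throws CoreException {
          for (int i = 0; i < folderCount; i++) {
              IFolder folder = project.getFolder(randomName());
              folder.create(true, true, null);
              for (int j = 0; j < filesPerFolder; j++) {
                  IFile file = folder.getFile(randomName() + ".txt");
                  file.create(new ByteArrayInputStream(new byte[0]), true, null);
              }
          }
      }
  }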
Comment 14 Martin Oberhuber CLA 2010-03-31 11:01:21 EDT
I'm marking this FIXED since the tests are in and data has been generated for N-builds, I-builds, and the perf_35x baseline.

Szymon, if you feel strongly about moving the tests, please file a new bug.
Comment 15 Szymon Brandys CLA 2010-04-01 06:46:54 EDT
(In reply to comment #14)
> Szymon, if you feel strongly about moving the tests, please file a new bug.
It's fine. I was just confused by this manual perf test for snapshots.