Bug 431610 - "Compare with each other in tree" very slow compared to normal compare
Summary: "Compare with each other in tree" very slow compared to normal compare
Status: RESOLVED FIXED
Alias: None
Product: EGit
Classification: Technology
Component: Core (show other bugs)
Version: 3.3   Edit
Hardware: PC Linux
: P3 normal (vote)
Target Milestone: ---   Edit
Assignee: Project Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-03-31 05:38 EDT by Marc Khouzam CLA
Modified: 2014-04-12 11:10 EDT (History)
1 user (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Marc Khouzam CLA 2014-03-31 05:38:56 EDT
From the history view, I select two commits to compare.  If I use "compare with each other" the result is almost instantaneous.  If I use "compare with each other in tree", EGit seems to compare each file in my workspace one by one to build the comparison; this takes quite a long time for large workspaces.

I don't understand why the "compare in tree" is so much slower than the normal "compare".  My un-educated guess is that the result of the first could be re-used to fill the tree view very quickly.

I'm using EGit 3.3 but I believe this situation has been there from the start.
Comment 1 Robin Rosenberg CLA 2014-04-01 14:05:44 EDT
Your guess is pretty close. A more precise explanation is that it was written by a contributor who wasn't familiar with the APIs and thus used a naïve approach. 

I think this is relatively low-hanging fruit for potential contributors.
Comment 2 Marc Khouzam CLA 2014-04-07 22:27:50 EDT
I've pushed a patch that dramatically improves the performance of the CompareTreeView:
  https://git.eclipse.org/r/24600

The idea is that for equal elements between the two commits, we don't call
  file =  ResourceUtil.getFileForLocation(repository,repoRelativePath);
which was the expensive computation (as well as setting the TaskName).

However, I don't know the impact of that change in the case where
  baseCommit == null
since having (file == null) will now lead to a different code path (that I don't quite understand).

The nice thing about this fix is that it does not change the behavior of the view but speeds it up a lot.

If this fix is not safe, I have another fix which will simply ignore all equal elements, unless the user asks to see them.  I'm trying to avoid such a solution because it will still cause a long delay when the user does want to see all equal content.
Comment 3 Robin Rosenberg CLA 2014-04-08 03:51:06 EDT
Merged bf777eecfd5fe111633277cbfa561cd179609274.
thanks
Comment 4 Robin Rosenberg CLA 2014-04-08 04:21:02 EDT
What makes it real slow is the progress monitor.
Comment 5 Robin Rosenberg CLA 2014-04-08 04:23:31 EDT
That was an improvement, but we're not done yet
Comment 6 Marc Khouzam CLA 2014-04-08 05:11:59 EDT
(In reply to Robin Rosenberg from comment #4)
> What makes it real slow is the progress monitor.

Yes printing each filename to the dialog slowed things down a lot.  In fact, that is the reason pressing 'Cancel' on that dialog takes a while to be honoured.  But even removing the use of the progress monitor wasn't enough as fetching the IFile was also quite slow.

(In reply to Robin Rosenberg from comment #5)
> That was an improvement, but we're not done yet

What do you have in mind?
Should we use a generic single string for the progress monitor to at least improve the 'Cancel' operation when showing all equal content?
Comment 7 Robin Rosenberg CLA 2014-04-08 05:19:53 EDT
The last one speeds up. The first one is more of a cleanup.
The cut-off-size is quite arbitrary. I haven't measured anything
to figure out an appropriate values. It's way above any normal source
file.

https://git.eclipse.org/r/24618
https://git.eclipse.org/r/24619
Comment 8 Marc Khouzam CLA 2014-04-08 05:41:47 EDT
(In reply to Robin Rosenberg from comment #7)
> The last one speeds up. The first one is more of a cleanup.
> The cut-off-size is quite arbitrary. I haven't measured anything
> to figure out an appropriate values. It's way above any normal source
> file.
> 
> https://git.eclipse.org/r/24618

That was actually my other solution.  However, the problem with that is that it breaks the "Show file with equal content support".  If you don't mind, I'll post my solution which adds a couple of checks to fix that.

> https://git.eclipse.org/r/24619

I'll try it. Thanks!
Comment 9 Robin Rosenberg CLA 2014-04-08 05:44:08 EDT
(In reply to Marc Khouzam from comment #8)
> (In reply to Robin Rosenberg from comment #7)
> > The last one speeds up. The first one is more of a cleanup.
> > The cut-off-size is quite arbitrary. I haven't measured anything
> > to figure out an appropriate values. It's way above any normal source
> > file.
> > 
> > https://git.eclipse.org/r/24618
> 
> That was actually my other solution.  However, the problem with that is that
> it breaks the "Show file with equal content support".  If you don't mind,
> I'll post my solution which adds a couple of checks to fix that.

Not at all.
Comment 10 Marc Khouzam CLA 2014-04-08 06:03:09 EDT
(In reply to Robin Rosenberg from comment #9)
> (In reply to Marc Khouzam from comment #8)
> > (In reply to Robin Rosenberg from comment #7)
> > > The last one speeds up. The first one is more of a cleanup.
> > > The cut-off-size is quite arbitrary. I haven't measured anything
> > > to figure out an appropriate values. It's way above any normal source
> > > file.
> > > 
> > > https://git.eclipse.org/r/24618

I pushed on top of your gerrit review before noticing you had abandoned it.  I then pushed a new one:
  https://git.eclipse.org/r/24623
Comment 11 Marc Khouzam CLA 2014-04-08 06:04:15 EDT
(In reply to Marc Khouzam from comment #10)

> I then pushed a new one:
>   https://git.eclipse.org/r/24623

What I don't like about this solution is that it remains very slow when showing the equal content.  Maybe with your improved progress monitor it will be better... I will try it.
Comment 12 Robin Rosenberg CLA 2014-04-08 06:08:29 EDT
(In reply to Marc Khouzam from comment #11)
> (In reply to Marc Khouzam from comment #10)
> 
> > I then pushed a new one:
> >   https://git.eclipse.org/r/24623
> 
> What I don't like about this solution is that it remains very slow when
> showing the equal content.  Maybe with your improved progress monitor it
> will be better... I will try it.

It makes all the difference. Mske sure you test the latest patch. I accidentally pushed a broken rebase.
Comment 13 Marc Khouzam CLA 2014-04-08 06:26:41 EDT
(In reply to Robin Rosenberg from comment #12)

> > What I don't like about this solution is that it remains very slow when
> > showing the equal content.  Maybe with your improved progress monitor it
> > will be better... I will try it.
> 
> It makes all the difference. Mske sure you test the latest patch. I
> accidentally pushed a broken rebase.

If we keep the first commit (bf777eecfd5fe111633277cbfa561cd179609274) which prevents setting the monitor taskName for equal content, the proposed improvement of being smarter about setting that taskName will only apply for non-equal content, right?  That would only make a difference for very large commits.  But maybe that is what you are aiming for?
Comment 14 Robin Rosenberg CLA 2014-04-08 07:15:18 EDT
If there is a large change, updating the taskmo(In reply to Marc Khouzam from comment #13)
> (In reply to Robin Rosenberg from comment #12)
> 
> > > What I don't like about this solution is that it remains very slow when
> > > showing the equal content.  Maybe with your improved progress monitor it
> > > will be better... I will try it.
> > 
> > It makes all the difference. Mske sure you test the latest patch. I
> > accidentally pushed a broken rebase.
> 
> If we keep the first commit (bf777eecfd5fe111633277cbfa561cd179609274) which
> prevents setting the monitor taskName for equal content, the proposed
> improvement of being smarter about setting that taskName will only apply for
> non-equal content, right?  That would only make a difference for very large
> commits.  But maybe that is what you are aiming for?

It's the large number changes I'm out to fix with that one. 
Actully, I see checking the size doesn't matter either since it never compares with the workspace so getting the SHA-1s is constant time.
Comment 15 Robin Rosenberg CLA 2014-04-12 11:10:25 EDT
Merged 9536c617f3a98a0d6c18499038984b804e5f7750 and bf777eecfd5fe111633277cbfa561cd179609274