Bug 393332 - Stuck in reindexing loop caused by indexing repository in home directory or root ("/.git")
Status: RESOLVED FIXED
Alias: None
Product: EGit
Classification: Technology
Component: Core
Version: 2.1
Hardware: PC Mac OS X
Importance: P3 major
Target Milestone: 3.0.2
Assignee: Project Inbox CLA
Duplicates: 405833 414629
Reported: 2012-11-01 09:46 EDT by Greg Watson CLA
Modified: 2013-10-22 03:45 EDT

Attachments
reindexing loop (43.88 KB, image/png)
2012-11-01 09:51 EDT, Greg Watson CLA
GC error after about 10 minutes of reindexing (24.44 KB, image/png)
2013-07-12 07:07 EDT, Stephen Evanchik CLA

Description Greg Watson CLA 2012-11-01 09:46:46 EDT
Eclipse: 4.2.1
Platform OS X 10.8.2

I just added a new file to a project. Now EGit is stuck in an endless loop reindexing the repository, which consumes all the CPU on my machine. The only way to stop it is to exit Eclipse.

Screenshot attached.
Comment 1 Greg Watson CLA 2012-11-01 09:51:09 EDT
Created attachment 223073
reindexing loop
Comment 2 Matthias Sohn CLA 2012-11-07 02:28:44 EST
Which EGit version did you use here?
Could you create a couple of thread dumps when this happens again and post them here for analysis?
Comment 3 John Eblen CLA 2013-07-09 11:08:25 EDT
Eclipse:
Eclipse for Parallel Application Developers

Version: Kepler Release
Build id: 20130614-0229

Platform: linux openSUSE 12.3 64-bit


I am experiencing this with a fresh install of the Kepler parallel package. I only created a "Hello World" C project, and the following job ran (<username> replaces my real user name):

  Re-indexing (fully) repository <username>

This process finally crashes after about 20 minutes with the following message:

  'Re-indexing (fully) repository <username>' has encountered a problem.

  An internal error occurred during: "Re-indexing (fully) repository <username>".

I'll try to post a stack trace later.
Comment 4 John Eblen CLA 2013-07-09 11:53:38 EDT
Below is the stack trace when the process crashes. My guess is that it is trying to index my entire home directory and eventually encounters an obscure file name that it can't handle.

This indexing process seems to block other operations on the newly created project. For synchronized projects, the initial sync is blocked, and when converting a project to a synchronized project, it prevents the UI window from closing.

Also, this process runs whenever Eclipse is restarted.

A few questions to consider:

1) If I am not using EGit, why does this process run at all?
2) What "repository" is it indexing, since my new project does not have one?
3) How come I did not have this problem previously? My machine was just updated from 32-bit to 64-bit, so I was trying the 64-bit Linux Eclipse Parallel package for the first time, but I'm not sure if that has anything to do with it.


Stack Trace:
java.lang.RuntimeException: Unencodeable file: \��+?��
	at org.eclipse.jgit.treewalk.WorkingTreeIterator$Entry.encodeName(WorkingTreeIterator.java:941)
	at org.eclipse.jgit.treewalk.WorkingTreeIterator.init(WorkingTreeIterator.java:659)
	at org.eclipse.jgit.treewalk.FileTreeIterator.<init>(FileTreeIterator.java:128)
	at org.eclipse.egit.core.AdaptableFileTreeIterator.<init>(AdaptableFileTreeIterator.java:74)
	at org.eclipse.egit.core.AdaptableFileTreeIterator.createSubtreeIterator(AdaptableFileTreeIterator.java:85)
	at org.eclipse.jgit.treewalk.AbstractTreeIterator.createSubtreeIterator(AbstractTreeIterator.java:535)
	at org.eclipse.jgit.treewalk.TreeWalk.enterSubtree(TreeWalk.java:908)
	at org.eclipse.jgit.treewalk.TreeWalk.next(TreeWalk.java:566)
	at org.eclipse.jgit.lib.IndexDiff.diff(IndexDiff.java:389)
	at org.eclipse.egit.core.internal.indexdiff.IndexDiffCacheEntry.calcIndexDiffDataFull(IndexDiffCacheEntry.java:485)
	at org.eclipse.egit.core.internal.indexdiff.IndexDiffCacheEntry.access$7(IndexDiffCacheEntry.java:474)
	at org.eclipse.egit.core.internal.indexdiff.IndexDiffCacheEntry$4.run(IndexDiffCacheEntry.java:285)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:53)
Comment 5 Matthias Sohn CLA 2013-07-09 12:21:49 EDT
Try switching on Git trace in Eclipse:
- press Ctrl+3 (or Cmd+3 on Mac)
- type "trace"
- select "Git Trace Configuration"
- switch on platform trace
- enable "Main switch for org.eclipse.egit.core"
- enable trace for /debug/core/indexdiffcache

Then check the recorded trace; hopefully it reveals what's going on.
Comment 6 John Eblen CLA 2013-07-09 12:39:26 EDT
Okay, I think I know what is happening. My current workspaces are in a subdirectory of my home directory, and before they were on a separate partition. There is a Git repository in my home directory (an accident from some earlier debugging), which seems to be what EGit is attempting to index.

So the issue seems to be that EGit attempts to index any repository on the project path, not just in the project directory. This is still a bug, I think, but much less serious since this shouldn't normally be a problem.
Comment 7 Matthias Sohn CLA 2013-07-10 04:47:41 EDT
(In reply to comment #6)
> Okay, I think I know what is happening. My current workspaces are in a
> subdirectory of my home directory, and before they were on a separate
> partition. There is a Git repository in my home directory (an accident from
> some earlier debugging), which seems to be what EGit is attempting to index.
> 
> So the issue seems to be that EGit attempts to index any repository on the
> project path, not just in the project directory. This is still a bug, I
> think, but much less serious since this shouldn't normally be a problem.

How should EGit know at which level to look for Git repositories? Most Git repositories that contain Eclipse projects contain many of them, and often they also contain some versioned files outside any Eclipse project.

Can you provide a minimal example with steps to reproduce the problem?
Comment 8 Robin Stocker CLA 2013-07-10 07:18:39 EDT
(In reply to comment #6)
> So the issue seems to be that EGit attempts to index any repository on the
> project path, not just in the project directory. This is still a bug, I
> think, but much less serious since this shouldn't normally be a problem.

Because "Auto share projects located in a git repository" is on by default in the Team > Git > Projects preferences, it automatically tries to find a repository for new projects, in the directory itself and any parent directories.

Maybe we could prompt for confirmation before beginning to index the found repository in case of automatic sharing?
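
For illustration only, the lookup described above (check the project directory, then each parent in turn) can be sketched in plain Java. This is not EGit's code: the class and method names are hypothetical, and the only test applied is whether a ".git" entry exists.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

public class RepoLookupSketch {

    /**
     * Returns the nearest directory at or above {@code start} that contains
     * a ".git" entry, or an empty Optional if none is found.
     */
    static Optional<Path> findEnclosingRepository(Path start) {
        // getParent() returns null once the filesystem root has been checked.
        for (Path dir = start.toAbsolutePath().normalize(); dir != null; dir = dir.getParent()) {
            if (Files.exists(dir.resolve(".git"))) {
                return Optional.of(dir);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Path project = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(findEnclosingRepository(project));
    }
}
```

With a lookup like this, a project under /home/user/workspaces/... ends up matched to a stray /home/user/.git or /.git if nothing closer exists, which is the situation reported in this bug.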
Comment 9 Stephen Evanchik CLA 2013-07-12 07:02:05 EDT
(In reply to comment #4)
> Below is the stack trace when the process crashes. My guess is that it is
> trying to index my entire home directory and eventually encounters an
> obscure file name that it can't handle.
> 
> This indexing process seems to block other operations on the newly created
> project. For synchronized projects, the initial sync is blocked, and when
> converting a project to a synchronized project, it prevents the UI window
> from closing.
> 
> Also, this process runs whenever Eclipse is restarted.
> 
> A few questions to consider:
> 
> 1) If I am not using EGit, why does this process run at all?
> 2) What "repository" is it indexing, since my new project does not have one?
> 3) How come I did not have this problem previously? My machine was just
> updated from 32-bit to 64-bit, so I was trying the 64-bit Linux Eclipse
> Parallel package for the first time, but I'm not sure if that has anything
> to do with it.
> 
> 
> Stack Trace:
> java.lang.RuntimeException: Unencodeable file: \��+?��
> 	at

I am having the same problem but in my case my machine is unusable while Eclipse is running. I have a fresh Kepler installation on Linux (eclipse-rcp-kepler-R-linux-gtk-x86_64) with a new workspace where I have created 3 projects which are nearly empty. None of these projects use git.

The directory structure is as follows:

 /home/evanchsa
 /home/evanchsa/workspaces/my_workspace

Here is the stack trace taken while this is running:

"Worker-3" prio=10 tid=0x00007fb48c007000 nid=0x4141 runnable [0x00007fb4f4cf1000]
   java.lang.Thread.State: RUNNABLE
	at org.eclipse.core.runtime.Path.computeSegmentCount(Path.java:450)
	at org.eclipse.core.runtime.Path.computeSegments(Path.java:467)
	at org.eclipse.core.runtime.Path.initialize(Path.java:602)
	at org.eclipse.core.runtime.Path.<init>(Path.java:163)
	at org.eclipse.core.internal.resources.AliasManager$2.compare(AliasManager.java:525)
	at org.eclipse.core.internal.resources.AliasManager$2.compare(AliasManager.java:1)
	at java.util.TreeMap.getEntryUsingComparator(TreeMap.java:369)
	at java.util.TreeMap.getEntry(TreeMap.java:340)
	at java.util.TreeMap.get(TreeMap.java:273)
	at org.eclipse.core.internal.resources.AliasManager$LocationMap.matchingResourcesDo(AliasManager.java:196)
	at org.eclipse.core.internal.resources.AliasManager.findResources(AliasManager.java:446)
	at org.eclipse.core.internal.localstore.FileSystemResourceManager.findLinkedResourcesPaths(FileSystemResourceManager.java:144)
	at org.eclipse.core.internal.localstore.FileSystemResourceManager.allPathsForLocationNonCanonical(FileSystemResourceManager.java:124)
	at org.eclipse.core.internal.localstore.FileSystemResourceManager.allPathsForLocation(FileSystemResourceManager.java:64)
	at org.eclipse.core.internal.localstore.FileSystemResourceManager.allResourcesFor(FileSystemResourceManager.java:216)
	at org.eclipse.core.internal.resources.WorkspaceRoot.findContainersForLocationURI(WorkspaceRoot.java:89)
	at org.eclipse.core.internal.resources.WorkspaceRoot.findContainersForLocationURI(WorkspaceRoot.java:80)
	at org.eclipse.egit.core.IteratorService.findContainer(IteratorService.java:68)
	at org.eclipse.egit.core.AdaptableFileTreeIterator.createSubtreeIterator(AdaptableFileTreeIterator.java:82)
	at org.eclipse.jgit.treewalk.AbstractTreeIterator.createSubtreeIterator(AbstractTreeIterator.java:535)
	at org.eclipse.jgit.treewalk.TreeWalk.enterSubtree(TreeWalk.java:908)
	at org.eclipse.jgit.treewalk.TreeWalk.next(TreeWalk.java:566)
	at org.eclipse.jgit.lib.IndexDiff.diff(IndexDiff.java:389)
	at org.eclipse.egit.core.internal.indexdiff.IndexDiffCacheEntry.calcIndexDiffDataFull(IndexDiffCacheEntry.java:485)
	at org.eclipse.egit.core.internal.indexdiff.IndexDiffCacheEntry.access$7(IndexDiffCacheEntry.java:474)
	at org.eclipse.egit.core.internal.indexdiff.IndexDiffCacheEntry$4.run(IndexDiffCacheEntry.java:285)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:53)


My machine is at 100% CPU for all cores and will remain there until I see the exception complaining about an unreadable file which is in:

 /home/evanchsa/.cxoffice

I think the indexer is scanning my entire home directory.



Version details:

Eclipse for RCP and RAP Developers

Version: Kepler Release
Build id: 20130614-0229

EGit: 3.0.0.20130610
Comment 10 Stephen Evanchik CLA 2013-07-12 07:07:50 EDT
Created attachment 233416
GC error after about 10 minutes of reindexing

An internal error occurred during: "Re-indexing (fully) repository ".
GC overhead limit exceeded
Comment 11 Robin Stocker CLA 2013-07-12 10:39:23 EDT
Stephen, try the following:

1. Disable "Auto share projects located in a git repository" in the Git preferences (under Projects)
2. Open the Git Repositories view and remove the repository that represents your home directory
Comment 12 Stephen Evanchik CLA 2013-07-12 10:58:52 EDT
(In reply to comment #11)
> Stephen, try the following:
> 
> 1. Disable "Auto share projects located in a git repository" in the Git
> preferences (under Projects)
> 2. Open the Git Repositories view and remove the repository that represents
> your home directory

Hi Robin,

I disabled "Auto share projects located in a git repository" and there was no change in behavior.

I will remove the home directory repository once I get home as this environment is on a different laptop.

Is the user's home directory added by default? I didn't add the home directory as a git repo (unless there was a hidden-in-plain-sight checkbox in a wizard somewhere).

I think there are two bugs here:

 1. Files that are not accessible should not abort the indexer with an exception. They should just be left out of the index.
 2. The indexer process may have a memory problem. I suppose I can set heap dump on OOM to see if this is true.
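
A minimal sketch of point 1, assuming a plain java.nio directory walk rather than JGit's WorkingTreeIterator (whose failure in this report is a RuntimeException, not an IOException). The class and field names are hypothetical; the idea is only that an unreadable entry is recorded and skipped instead of aborting the walk.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

public class TolerantWalkSketch {

    /** Files that were visited, and paths that had to be skipped. */
    static final class Result {
        final List<Path> files = new ArrayList<>();
        final List<Path> skipped = new ArrayList<>();
    }

    static Result walk(Path root) throws IOException {
        Result result = new Result();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                result.files.add(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // The default implementation rethrows; here the entry is
                // recorded and the walk carries on with the next one.
                result.skipped.add(file);
                return FileVisitResult.CONTINUE;
            }
        });
        return result;
    }

    public static void main(String[] args) throws IOException {
        Result r = walk(Paths.get(args.length > 0 ? args[0] : "."));
        System.out.println(r.files.size() + " files, " + r.skipped.size() + " skipped");
    }
}
```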
Comment 13 Stephen Evanchik CLA 2013-07-12 21:46:44 EDT
(In reply to comment #12)
> (In reply to comment #11)
> > Stephen, try the following:
> > 
> > 1. Disable "Auto share projects located in a git repository" in the Git
> > preferences (under Projects)
> > 2. Open the Git Repositories view and remove the repository that represents
> > your home directory
> 
> Hi Robin,
> 
> I disabled "Auto share projects located in a git repository" and there was
> no change in behavior.
> 
> I will remove the home directory repository once I get home as this
> environment is on a different laptop.
> 
> Is the user's home directory added by default? I didn't add the home
> directory as a git repo (unless there was a hidden in plain sight checkbox
> in a wizard somewhere).
> 
> I think there are two bugs here:
> 
>  1. Files that are not accessible should not abort the indexer with an
> exception. They should just be left out of the index.
>  2. The indexer process may have a memory problem. I suppose I can set heap
> dump on OOM to see if this is true.


I now know what is going on. First, I have a git repo located at '/' (the root of my volume). Somehow, this repository was added to my workspace and is visible in the repositories view. This is why the re-index lasts for 10-20 minutes.

I added -XX:+HeapDumpOnOutOfMemoryError to capture the heap when Eclipse OOM's. I guess what is going on is that in:

  at org.eclipse.jgit.lib.IndexDiff.diff(Lorg/eclipse/jgit/lib/ProgressMonitor;IILjava/lang/String;)Z (IndexDiff.java:389)


a variety of HashSets are being populated. I think the problem is that the indexer is basically trying to place my entire filesystem in the untracked HashSet. This is totally a guess. I would have expected this process to behave similarly to git status, which expands a directory iff there are tracked files in it or in a child directory (or something like that).
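
To make the expected behavior concrete, here is a rough sketch of reporting a wholly untracked directory once instead of listing every file below it. It is not how JGit's IndexDiff works; the class name, the method name, and the representation of tracked files as a sorted set of "/"-separated relative paths are all assumptions made for the example.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

public class UntrackedSummarySketch {

    /** Lists untracked entries; a wholly untracked directory appears once as "dir/". */
    static List<String> untracked(Path workTree, NavigableSet<String> tracked) throws IOException {
        List<String> out = new ArrayList<>();
        collect(workTree, workTree, tracked, out);
        Collections.sort(out);
        return out;
    }

    private static void collect(Path workTree, Path dir, NavigableSet<String> tracked,
            List<String> out) throws IOException {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                String rel = workTree.relativize(entry).toString().replace(File.separatorChar, '/');
                if (rel.equals(".git")) {
                    continue; // repository metadata is never reported
                }
                if (Files.isDirectory(entry, LinkOption.NOFOLLOW_LINKS)) {
                    String prefix = rel + "/";
                    // The smallest tracked path >= prefix starts with prefix
                    // exactly when something below this directory is tracked.
                    String next = tracked.ceiling(prefix);
                    if (next != null && next.startsWith(prefix)) {
                        collect(workTree, entry, tracked, out);
                    } else {
                        out.add(prefix); // report the directory, do not descend
                    }
                } else if (!tracked.contains(rel)) {
                    out.add(rel);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path workTree = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(untracked(workTree, new TreeSet<>()));
    }
}
```

For a repository at / with only a handful of tracked files, the result stays proportional to the number of top-level entries rather than to the size of the whole filesystem.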

In any case, removing the repo at / solves my problem.

One thing that seemed scary: when I delete the repository from the "Git Repositories" view I am warned that something is going to be deleted. Sure enough, when I proceed through the popup I receive an error saying that /.git/config could not be deleted/accessed (because this is owned by root).
Comment 14 Robin Stocker CLA 2013-07-13 07:24:22 EDT
(In reply to comment #13)
> I now know what is going on. First, I have a git repo located at '/' (the
> root of my volume). Somehow, this repository was added to my workspace and
> is visible in the repositories view. This is why the re-index lasts for
> 10-20 minutes.

Yes, see comment 8 for why this happened and a possible solution.

> I added -XX:+HeapDumpOnOutOfMemoryError to capture the heap when Eclipse
> OOM's. I guess what is going on is that in:
> 
>   at
> org.eclipse.jgit.lib.IndexDiff.diff(Lorg/eclipse/jgit/lib/ProgressMonitor;
> IILjava/lang/String;)Z (IndexDiff.java:389)
> 
> 
> a variety of HashSets are being populated. I think the problem is that the
> indexer is basically trying to place my entire filesystem in the untracked
> HashSet. This is totally a guess. I would have expected that this process
> would behave similar to git status which expands a directory iff there are
> tracked files in there or a child (or something like that).

I think this is the same problem as in bug 388582, please subscribe there.

> In any case, removing the repo at / solves my problem.
> 
> One thing that seemed scary: when I delete the repository from the "Git
> Repositories" view I am warned that something is going to be deleted. Sure
> enough, when I proceed through the popup I receive an error saying that
> /.git/config could not be deleted/accessed (because this is owned by root).

Did you select "Remove Repository from View" or "Delete Repository..."? The former should not delete anything.
Comment 15 Stephen Evanchik CLA 2013-07-15 08:31:15 EDT
(In reply to comment #14)
> (In reply to comment #13)
> > I now know what is going on. First, I have a git repo located at '/' (the
> > root of my volume). Somehow, this repository was added to my workspace and
> > is visible in the repositories view. This is why the re-index lasts for
> > 10-20 minutes.
> 
> Yes, see comment 8 for why this happened and a possible solution.

Ah thanks. That explains why it found my repo at /.

I'm not sure under what circumstances this would ever find the "right" git repo. I usually have everything in ~/src, and my workspace can be in ~/src/workspaces or ~/workspaces. On Windows it is similar: C:\dev\src and C:\dev\workspaces.

> 
> > I added -XX:+HeapDumpOnOutOfMemoryError to capture the heap when Eclipse
> > OOM's. I guess what is going on is that in:
> > 
> >   at
> > org.eclipse.jgit.lib.IndexDiff.diff(Lorg/eclipse/jgit/lib/ProgressMonitor;
> > IILjava/lang/String;)Z (IndexDiff.java:389)
> > 
> > 
> > a variety of HashSets are being populated. I think the problem is that the
> > indexer is basically trying to place my entire filesystem in the untracked
> > HashSet. This is totally a guess. I would have expected that this process
> > would behave similar to git status which expands a directory iff there are
> > tracked files in there or a child (or something like that).
> 
> I think this is the same problem as in bug 388582, please subscribe there.

Yes it looks like it is.

> 
> > In any case, removing the repo at / solves my problem.
> > 
> > One thing that seemed scary: when I delete the repository from the "Git
> > Repositories" view I am warned that something is going to be deleted. Sure
> > enough, when I proceed through the popup I receive an error saying that
> > /.git/config could not be deleted/accessed (because this is owned by root).
> 
> Did you select "Remove Repository from View" or "Delete Repository..."? The
> former should not delete anything.

I pressed the Delete button on the keyboard. I would expect this to work like deleting a project in Eclipse: A dialog prompts the user to delete the contents on disk (which cannot be undone).
Comment 16 Robin Stocker CLA 2013-08-08 09:28:36 EDT
(In reply to comment #15)
> Ah thanks. That explains why it found my repo at /.
> 
> I'm not sure under what circumstances this would ever find the "right" git
> repo. I usually have everything in ~/src and my workspace can be in
> ~/src/workspaces or ~/workspaces . On Windows it is similar: C:\dev\src and
> C:\dev\workspaces

When you have e.g. the following structure:

/foo/bar/.git
/foo/bar/baz/.project

In that case, when you import the baz project, the bar repo is automatically found and connected. But maybe we should be more careful. We could ask for confirmation before connecting, or e.g. add special cases for / and /home (in case the user DOES want these, they can always connect them manually).

> > Did you select "Remove Repository from View" or "Delete Repository..."? The
> > former should not delete anything.
> 
> I pressed the Delete button on the keyboard. I would expect this to work
> like deleting a project in Eclipse: A dialog prompts the user to delete the
> contents on disk (which cannot be undone).

Ok, see bug 395351 for this.
Comment 17 Robin Stocker CLA 2013-08-08 10:02:17 EDT
Proposed fix for not automatically sharing /, /home or /home/username (or the platform equivalent): https://git.eclipse.org/r/15248
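
As a rough illustration of the kind of check involved (this is not the code from the Gerrit change above; the class and method names are hypothetical), a working tree can be treated as too broad for automatic sharing when it is the filesystem root, the user's home directory, or the parent of the home directory:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class AutoShareGuardSketch {

    /**
     * Returns true if {@code workTree} is the filesystem root, the given home
     * directory, or the parent of the home directory (for example /home).
     */
    static boolean isTooBroadForAutoShare(Path workTree, Path home) {
        Path dir = workTree.toAbsolutePath().normalize();
        Path userHome = home.toAbsolutePath().normalize();
        if (dir.getParent() == null) {
            return true; // filesystem root
        }
        if (dir.equals(userHome)) {
            return true; // the home directory itself
        }
        // userHome.getParent() is null if the home directory is the root;
        // equals(null) is then simply false.
        return dir.equals(userHome.getParent());
    }

    public static void main(String[] args) {
        Path home = Paths.get(System.getProperty("user.home"));
        Path candidate = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(isTooBroadForAutoShare(candidate, home));
    }
}
```

A repository rejected by such a check could still be connected manually by a user who really wants it.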
Comment 18 Matthias Sohn CLA 2013-08-09 18:06:22 EDT
Merged as 5a60d9ee49f8ae4a63a6b9f5243f596d0d96a4ba.
Comment 19 Matthias Sohn CLA 2013-08-26 02:21:32 EDT
Cherry-picked for 3.0.2.
Comment 20 Robin Stocker CLA 2013-10-21 11:46:39 EDT
*** Bug 414629 has been marked as a duplicate of this bug. ***
Comment 21 Philip Aston CLA 2013-10-22 03:45:07 EDT
*** Bug 405833 has been marked as a duplicate of this bug. ***