Re: [cdt-dev] Indexing performance

Hello, Martin

I performed the experiments you suggested on my system (Core i5 3.4 GHz, 6 GB RAM, SSD). I use Linux (Ubuntu 12.04) and Windows (Windows 7), with the project sources on a local drive (on Linux, on the Linux partition; on Windows, on the Windows partition). I have Eclipse Juno running on a 64-bit JRE, with eclipse.ini modified to add the values you suggested. The project does not define many C++ template classes, although the STL is used intensively. Most of the code is C++, although some C files are also present. Some source files are generated during the build process; maybe that is why the number of indexed files differs between Windows and Linux (see below).

On Linux, indexing took about 47 min; here is the last line from the error log after indexing:

Indexed 'project' (30,428 sources, 53,681 headers) in 2,751.48 sec: 6,571,612 declarations; 30,354,126 references; 10,737 unresolved inclusions; 102,640 syntax errors; 570,779 unresolved names (1.52%)
'time' command output is
real	47m18.385s
user	50m18.485s
sys	3m29.209s
Since user time here is even greater than real time (apparently because of multithreading), I suppose that the CPU, not the hard drive, is the limiting factor.
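
As a rough check of that reasoning, here is a minimal Python sketch that converts the three 'time' values above to seconds and divides CPU time by wall-clock time. The helper name parse_time_field is only illustrative, and the m/s format it expects is an assumption based on the output shown.

```python
import re

def parse_time_field(text):
    """Convert a 'time' value such as '47m18.385s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized time value: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))

# Values copied from the 'time' output above.
real = parse_time_field("47m18.385s")
user = parse_time_field("50m18.485s")
sys_time = parse_time_field("3m29.209s")

# A ratio above 1.0 means that, on average, more than one core was busy.
cpu_per_wall = (user + sys_time) / real
print(f"CPU seconds per wall-clock second: {cpu_per_wall:.2f}")
```

A ratio only slightly above 1 would fit a mostly single-threaded, CPU-bound run, but it does not prove it: the same ratio could come from several threads that each spend part of their time waiting.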

On Windows the last line is
Indexed 'project' (31 752 sources, 32 657 headers) in 7 341,33 sec: 7 254 957 declarations; 33 186 193 references; 8 786 unresolved inclusions; 449 262 syntax errors; 746 128 unresolved names (1,81 %)
Indexing took about 2 hours, and Process Explorer shows that the javaw.exe process used the CPU for the following times:
Kernel: 0:50:20.850
User:   1:26:51.385
Total:  2:17:12.235
Here, too, the total CPU time is close to the indexing time.
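
To put the two runs side by side, here is a small Python sketch that computes indexed files per second from the two log lines above. The figures are copied by hand from those lines; files per second is only a crude measure, since the header counts differ a lot between the two platforms.

```python
# Figures copied from the two "Indexed 'project'" log lines above.
runs = {
    "Linux": {"sources": 30428, "headers": 53681, "seconds": 2751.48},
    "Windows": {"sources": 31752, "headers": 32657, "seconds": 7341.33},
}

def files_per_second(run):
    """Indexed files (sources plus headers) per second of indexing time."""
    return (run["sources"] + run["headers"]) / run["seconds"]

rates = {name: files_per_second(run) for name, run in runs.items()}
ratio = rates["Linux"] / rates["Windows"]

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} files/sec")
print(f"Linux/Windows ratio: {ratio:.1f}x")
```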

Do you think this behavior is caused by an improperly set up workspace (too many unresolved includes, etc.)? Or maybe it is because of the SSD, with its faster response times?

Thank you,
Vyacheslav

On Fri, Feb 8, 2013 at 1:12 PM, Oberhuber, Martin <Martin.Oberhuber@xxxxxxxxxxxxx> wrote:
Hello Vyacheslav,

Before going anywhere deeper, I would suggest collecting some basic data:

   - What is your host OS ?
   - Where are the files located (local disk / NFS) ?
   - Is your indexer setup reasonably correct (include paths, preprocessor macros; #unresolved symbols statistics) ?
   - How many files are you looking at ?
   - Do you use many C++ templates ?
   - Running Eclipse under "time" when reparsing your project, what is the amount of user / sys / real time ?

To give you some reference point, I've been indexing a project with 150,000 files (mostly C but some C++)
on a Linux box, all files local, with 0.33% unresolved symbols in 90 minutes real-time (user: 60 min, sys: 4 min).
Note that you need -vmargs -Xmx2048m -XX:MaxPermSize=512m for such a large project.
Details here: https://bugs.eclipse.org/bugs/show_bug.cgi?id=394151#c3

As you see, 30 % of the time goes into plain file access, waiting for the disk (90 min real - 60 min user).
Personally I do not think that there is much optimization potential left in this scenario -- the index has to
be a shared database, so I don't see much potential for improvements by multi-threading here.
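
The 30 % estimate can be checked with a few lines of Python using the reference numbers above. Whether sys time should count as part of "file access" is an assumption here, so the sketch shows both readings.

```python
# Reference run quoted above: 90 min real, 60 min user, 4 min sys.
real_min, user_min, sys_min = 90, 60, 4

# Reading 1: everything that is not user time counts as file access or waiting.
not_user_fraction = (real_min - user_min) / real_min

# Reading 2: only time spent in neither user nor sys counts as waiting.
idle_fraction = (real_min - user_min - sys_min) / real_min

print(f"not in user code: {not_user_fraction:.0%}")
print(f"neither user nor sys: {idle_fraction:.0%}")
```

Both readings land near the 30 % mentioned above.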

I think the easiest way for you to get to these numbers is this, using the "time" command on Linux
(on Windows you could probably use Task Manager and read the numbers):

   1. Set up your project in Eclipse. Make sure that Preferences "Refresh Workspace on startup" is OFF.
   2. Window > Show View > Other : Error Log and look at the indexing statistics.
        - unresolved includes should not be too many, unresolved symbols should be < 5%
        - if these criteria are not met, you likely have incorrect config of macros/includes, and a massive index quality problem.
   3. Quit Eclipse
   4. time eclipse
   5. Right-click project > Index > Rebuild
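
Where the "time" command is not at hand, step 4 can be approximated with a short Python wrapper. This is only a sketch: the function name time_command is made up for illustration, the children_user / children_system fields of os.times() are populated on Unix only (they stay 0 on Windows), and they only cover child processes that were waited for.

```python
import os
import subprocess
import sys
import time

def time_command(argv):
    """Run argv to completion and return (real, user, sys) in seconds."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(argv, check=False)
    real = time.perf_counter() - start
    after = os.times()
    # CPU time accumulated by waited-for child processes during the run.
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return real, user, system

# Demonstration with a trivial command; for the steps above, the command
# would be the Eclipse launcher instead (its path is installation-specific).
real, user, system = time_command([sys.executable, "-c", "pass"])
print(f"real {real:.3f}s  user {user:.3f}s  sys {system:.3f}s")
```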

I'm curious to see what numbers you have.

Thanks,
Martin
--
Martin Oberhuber, SMTS / Product Architect - Development Tools, Wind River
direct +43.662.457915.85  fax +43.662.457915.6


-----Original Message-----
From: cdt-dev-bounces@xxxxxxxxxxx [mailto:cdt-dev-bounces@xxxxxxxxxxx] On Behalf Of Vyacheslav Chigrin
Sent: Thursday, February 07, 2013 10:01 PM
To: cdt-dev@xxxxxxxxxxx
Subject: [cdt-dev] Indexing performance

Hello,

I am using Eclipse CDT on a very large C++ project and I am very interested in improving indexing performance. Searching the web shows that a lot of work has already been done in this direction. I am very new to Eclipse development, so I am asking: is there a known good starting point for this task? Are there any known bottlenecks? Is parallel indexing being considered to make it faster on multi-core systems? I would be happy if I could do anything useful for the project.

Thank you,
Vyacheslav
_______________________________________________
cdt-dev mailing list
cdt-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/cdt-dev