Re: [cdt-dev] cdt-dev Digest, Vol 81, Issue 39

Hi Doug.
Consider my response "totally inappropriate" and disrespectful or not... The technical problems will not change, no matter how long topics are NOT discussed here, questions are NOT answered here, and clear statements against a specific approach are misused as statements FOR that approach.
And I consider it no less disrespectful if questions go unanswered for months.
My patch has been in place since 2011-07-24. My unanswered questions have been in b#351659 since 2011-07-31 (my last posting there).
And with regard to project references and the "incomplete functionality" of my approach, Markus already answered my question:
>>>
Wrong or cyclic dependencies will at the worst cause information to be duplicated in the indexes of multiple projects.
<<<
Because of this statement, and because indexing with project references takes days instead of minutes, I implemented my approach, which is now blamed as "incomplete". I don't consider such behaviour respectful either.
Nor do I consider it "respectful behaviour" if someone suspects that I would provide a patch and then not fix the bugs that might come with it.
For months I have shown that I want to drive this topic: I have asked questions, provided profiler data for the performance issue, and asked people for help... Nothing happened, and that's not very "respectful" either. I don't think you can innovate as long as people have such a mindset.
I'm shattered at cdt-dev's bureaucracy.
I'm going to close b#351659 and continue providing a local patch for the colleagues in my team.
Regards
Volker


-----Original Message-----
From: cdt-dev-request@xxxxxxxxxxx
Sent: Nov 30, 2011 8:44:08 AM
To: cdt-dev@xxxxxxxxxxx
Subject: cdt-dev Digest, Vol 81, Issue 39

Send cdt-dev mailing list submissions to
cdt-dev@xxxxxxxxxxx

To subscribe or unsubscribe via the World Wide Web, visit
https://dev.eclipse.org/mailman/listinfo/cdt-dev
or, via email, send a message with subject or body 'help' to
cdt-dev-request@xxxxxxxxxxx

You can reach the person managing the list at
cdt-dev-owner@xxxxxxxxxxx

When replying, please edit your Subject line so it is more specific
than "Re: Contents of cdt-dev digest..."


Today's Topics:

1. Canceled: Platform Debug Clients Meeting for December
(Pawel Piech)
2. Eclipse/CDT policy on native libraries? (Nathan Ridge)
3. Re: Parallelization of indexer (Schorn, Markus)


----------------------------------------------------------------------

Message: 1
Date: Tue, 29 Nov 2011 16:29:52 -0800
From: Pawel Piech <pawel.piech@xxxxxxxxxxxxx>
To: "Eclipse Platform Debug component developers list."
<platform-debug-dev@xxxxxxxxxxx>
Cc: "CDT General developers list." <cdt-dev@xxxxxxxxxxx>
Subject: [cdt-dev] Canceled: Platform Debug Clients Meeting for
December
Message-ID: <4ED57900.9070601@xxxxxxxxxxxxx>
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed

We're due for our monthly meeting, however I have a conflict this
Thursday. Please let me know if you'd like me to reschedule
(http://wiki.eclipse.org/Debug/Meeting_Notes).

Cheers,
Pawel


------------------------------

Message: 2
Date: Wed, 30 Nov 2011 01:42:31 +0000
From: Nathan Ridge <zeratul976@xxxxxxxxxxx>
To: CDT Mailing List <cdt-dev@xxxxxxxxxxx>
Subject: [cdt-dev] Eclipse/CDT policy on native libraries?
Message-ID: <BLU162-W9249ABDB5BD2197D6AB2896B00@xxxxxxx>
Content-Type: text/plain; charset="iso-8859-1"


Hello,

What is the Eclipse/CDT policy for using native libraries?

Suppose I want to contribute code to CDT that makes use of a native
library (interfaced using JNI/JNA/SWIG/Bridj/ffi etc.).

Can such contribution be accepted?

Let's assume the library in question
  1) is cross-platform and works on all platforms that Eclipse works on
  2) is licensed under a permissive open-source license

Thanks,
Nate


------------------------------

Message: 3
Date: Wed, 30 Nov 2011 07:44:05 +0000
From: "Schorn, Markus" <Markus.Schorn@xxxxxxxxxxxxx>
To: "CDT General developers list." <cdt-dev@xxxxxxxxxxx>
Subject: Re: [cdt-dev] Parallelization of indexer
Message-ID:
<30D36C1BA62C5F4892C482E607D5E77E1FA57DA3@xxxxxxxxxxxxxxxxxxxxxxx>
Content-Type: text/plain; charset="utf-8"

Hi Volker,
It can take time to understand each other's point of view. In case you decide to continue the discussion with me, I will be there.
Markus.

-----Original Message-----
From: cdt-dev-bounces@xxxxxxxxxxx [mailto:cdt-dev-bounces@xxxxxxxxxxx] On Behalf Of Volker Diesel
Sent: Wednesday, November 30, 2011 00:26
To: cdt-dev@xxxxxxxxxxx
Subject: [cdt-dev] Parallelization of indexer

Sorry, but could anyone @cdt-dev PLEASE stop this kind of "WRONG INFORMATION POSTING" ASAP!!!

Markus today claims...
>>>
In the given case the new feature (parallelizing the indexer across projects) is simply incomplete in that it does not respect that there needs to be some order in indexing projects.
<<<

THERE IS NO INCOMPLETENESS IN MY APPROACH!!!

In fact, my indexer approach DOES NOT honour project dependencies. This is because Markus explicitly told me that this is not an issue!!!
I EXPLICITLY ASKED MARKUS (before I implemented my patch) whether that would be an issue, and MARKUS EXPLICITLY ANSWERED "NO"!!! See the history of this mail thread and b#351659.

Now suddenly an incompleteness seems to have appeared in Markus' mind, and I would be glad to know what kind of incompleteness it is that (according to his own statements) didn't exist six months ago!!!

AND ONCE AGAIN... WE HAVE BEEN USING MY PATCH IN OUR TEAM FOR MONTHS... AND THERE IS NO "INCOMPLETENESS" OR "FUNCTIONAL DIFFERENCE" BETWEEN THE INDEX GENERATED BY MY PATCH AND THE INDEX GENERATED BY OFFICIAL CDT8 (FROM AN END USER'S POINT OF VIEW)!!!




-----Original Message-----
From: cdt-dev-request@xxxxxxxxxxx
Sent: Nov 29, 2011 11:26:12 PM
To: cdt-dev@xxxxxxxxxxx
Subject: cdt-dev Digest, Vol 81, Issue 35


Today's Topics:

1. Re: CDT DSF-GDB (Marc Khouzam)
2. Re: Parallelization of indexer (Greg Watson)
3. Parallelization of indexer (Volker Diesel)


----------------------------------------------------------------------

Message: 1
Date: Tue, 29 Nov 2011 13:10:54 -0500
From: Marc Khouzam <marc.khouzam@xxxxxxxxxxxx>
To: "'subhashchandranv@xxxxxxxxxxxxxxx'"
<subhashchandranv@xxxxxxxxxxxxxxx>, "'CDT General developers list.'"
<cdt-dev@xxxxxxxxxxx>
Subject: Re: [cdt-dev] CDT DSF-GDB
Message-ID:
<F7CE05678329534C957159168FA70DEC578CBC2B95@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>

Content-Type: text/plain; charset="us-ascii"

> -----Original Message-----
> From: cdt-dev-bounces@xxxxxxxxxxx
> [mailto:cdt-dev-bounces@xxxxxxxxxxx] On Behalf Of subhashchandranv
> Sent: Friday, November 18, 2011 3:38 AM
> To: cdt-dev@xxxxxxxxxxx
> Subject: [cdt-dev] CDT DSF-GDB
>
> Hello,

Hi,

sorry for the delay, I was on a business trip.

> I've been working on understanding DSF for the past month by going
> through the example plugins, which include the Timers example and PDA:
> http://help.eclipse.org/indigo/index.jsp?topic=/org.eclipse.cdt.doc.isv/guide/dsf/intro/dsf_programming_intro.html
>
> I believe DSF-GDB is where DSF has been implemented and I saw the
> working of it in a flash video : http://live.eclipse.org/node/568.
>
> I tried to simulate the same into my Eclipse Indigo and it was not
> happening.
>
> I got to know that "org.eclipse.dd.mi", "org.eclipse.dd.gdb" and
> "org.eclipse.dd.gdb.ui" are now renamed as "org.eclipse.cdt.dsf" and
> "org.eclipse.cdt.dsf.ui" respectively,
> but I couldn't find the renamed versions of the following two plugins:
>
>
> * org.eclipse.dd.gdb.launch
> * org.eclipse.dd.gdb.launch.ui
>
> Please let me know the necessary plugins of DSF-GDB to check out from
> CVS, so that I can build them in my Eclipse Indigo to learn better
> about DSF.

We no longer use CVS, instead we use Git.
http://wiki.eclipse.org/Getting_started_with_CDT_development

The plugins from the DD project have been combined into four main plugins (not including tests or examples):

DSF:
org.eclipse.cdt.dsf
org.eclipse.cdt.dsf.ui
DSF-GDB:
org.eclipse.cdt.dsf.gdb
org.eclipse.cdt.dsf.gdb.ui

> I'm also curious to know if there's any other project in eclipse where
> DSF is implemented. If yes, please help me by providing the plugin
> links.

EDC also uses DSF. It is part of its own Git repository:
org.eclipse.cdt.edc.git

Marc


------------------------------

Message: 2
Date: Tue, 29 Nov 2011 14:02:59 -0500
From: Greg Watson <g.watson@xxxxxxxxxxxx>
To: "CDT General developers list." <cdt-dev@xxxxxxxxxxx>
Subject: Re: [cdt-dev] Parallelization of indexer
Message-ID: <541F1601-DAA5-4871-9669-16518B541363@xxxxxxxxxxxx>
Content-Type: text/plain; charset=iso-8859-1

Hi Markus,

Sounds reasonable to me. I hope we see the new indexer fully implemented at some point.

Cheers,
Greg

On Nov 29, 2011, at 9:57 AM, Schorn, Markus wrote:

> Hi Greg,
> There is still support for multiple indexers, however the UI for that does not show up as long as you do not supply an alternative indexer. Different to earlier days of CDT, it is no longer simple to provide an alternative indexer that can come close to the one that is built into CDT. Therefore the usual path for a new indexer feature is to make it part of the existing indexer. Clearly such a feature can be made dependent on a preference setting.
>
> Whatever feature goes into CDT causes bug reports and when it comes to dealing with issues the enthusiasm of contributors and also committers is limited. I have quite a list of annoying 'experimental' features in CDT that simply don't work correctly (and probably never will). I do think it is a good idea to discuss and analyze the impact of new features before putting them into CDT.
>
> In the given case the new feature (parallelizing the indexer across projects) is simply incomplete in that it does not respect that there needs to be some order in indexing projects. It is not really difficult to implement the missing piece.
>
> Another path is to get rid of reusing indexes from referenced projects. The effect of this would be the duplication of index-information about files used from multiple dependent projects. While this makes indexing easier, it puts the burden on the clients working with the index-data because they have to deal with the redundant information. As written before, I am not convinced that it is important and we may want to change the indexer not to reuse those indexes.
>
> We may also end up with another preference setting, that allows for turning off reusing indexes from other projects. The parallelization would always work, but would work better when reusing indexes is turned off.
>
> Markus.
>
>
> -----Original Message-----
> From: cdt-dev-bounces@xxxxxxxxxxx [mailto:cdt-dev-bounces@xxxxxxxxxxx]
> On Behalf Of Greg Watson
> Sent: Tuesday, November 29, 2011 14:51
> To: CDT General developers list.
> Subject: Re: [cdt-dev] Parallelization of indexer
>
> Hi,
>
> Would it be possible to add this as an experimental indexer that could be enabled through the preferences? There used to be support for multiple indexers, but this seems to have been removed in CDT 8, presumably to avoid confusion. From the user's perspective, what's important is the speed and accuracy of the indexer. From the discussion, it sounds like the new indexer improves speed but reduces accuracy for some types of projects. I think users would be willing to give this a try if it was easy to enable/disable.
>
> Regards,
> Greg
>
> On Nov 29, 2011, at 5:17 AM, Schorn, Markus wrote:
>
>> Hi Volker!
>> Ad 1)
>> I am not interested in the discussion whether one should use a monolithic project or should split up the source into multiple projects. Both ways are a valid way of using CDT.
>>
>> Ad 2)
>> We have a specific handling of project references in place (index of referenced project is reused). One can challenge this approach (I am not very convinced of this approach). The performance issue would be a way to start this challenge. However, we cannot simply change the behavior of CDT without a discussion and some analysis on the matter, and as long as we use this approach your patch has to honor it.
>>
>> Ad 3) and 4)
>> As you realized yourself in 4), parallelization on file level would solve 2), because then you can index the referenced project before the dependent one, and at every point you would do that in parallel on file level. As always, there are pros and cons to each approach.
>>
>> Ad 1)
>> Right, the discussion on parallelization on file-level vs. project level can be done in parallel.
>>
>> Ad 2)
>> You have two options: Either you make your patch honor project references, or you work towards changing CDT such that it ignores project references. For the latter a bugzilla on the performance issue may be your starting point.
>>
>> Markus.
>>
>> -----Original Message-----
>> From: cdt-dev-bounces@xxxxxxxxxxx
>> [mailto:cdt-dev-bounces@xxxxxxxxxxx] On Behalf Of Volker Diesel
>> Sent: Monday, November 28, 2011 21:26
>> To: cdt-dev@xxxxxxxxxxx
>> Subject: [cdt-dev] Parallelization of indexer
>>
>> Hi Markus.
>>
>> Yes, this discussion has been somewhat frustrating and I still do not agree with your comments (and the comments of others about that topic) for several reasons.
>> 1) Everyone involved in this discussion seems to believe that it is possible to set up one monolithic Eclipse project for a large-scale, real-life C++ project, that this is the "normal" use case for CDT, and that parallelization of indexer jobs across Eclipse projects therefore doesn't help much. I already questioned that opinion in b#351659, because I cannot see how you would set up one monolithic Eclipse project if the sources require, e.g., different sets of #defines or different include paths (and source files in real-life projects normally do require this). You won't get correct indexer results in that case. But maybe I missed some point...
>> 2) In fact, my patch does not honour project references, but I explicitly asked whether that would be a real problem, and your reply was that the only impact is potential replication of some of the symbols in multiple project indices. Therefore, my understanding was that there is no "real" issue if the parallelized indexer does not take project references into account.
>> 3) I cannot see how parallelization on file level instead of project level can resolve issue 2). If indexer job#1 indexes a file from project A and indexer job#2 indexes a file from project B, then you are back at the same point... Should these jobs honour references between project A and B? I cannot see any difference.
>> 4) The only solution to problem 3) would be to limit parallel indexer jobs (on file level) to the set of files of one and the same project. In that case, there will be at least three other issues... First, when there are many projects with only a few files, parallelization will be poor. Second, at the end of indexing each project, parallelization will be poor, because there are more free CPU cores than there are files left to index in that single project. Third (and most important), all these parallel jobs on the files of one single project will run into lock contention on the project's index write lock. I already faced a similar issue with my patch and had to change some of the locking code to achieve enough throughput/CPU utilization with my approach.
>>
>> Therefore, from my point of view...
>> 1) The discussion about parallelization across projects vs. parallelization across source files is independent of the question of honouring project dependencies. Both approaches need either a fix for the performance issues mentioned in b#351659 or a decision not to honour project dependencies while indexing.
>> 2) And yes, of course I could open another bugzilla about that performance issue, but how would that help? I already mentioned that issue, I captured jprof profiling information and attached the profiler data to b#351659, and I asked for someone to look into that data. Nothing happened. Why should that change if I opened a second bugzilla and attached the same profiler data again?
>>
>> Kind regards.
>> Volker
>>
>>
>>
>> -----Original Message-----
>> Date: Mon, 28 Nov 2011 06:40:34 +0000
>> From: "Schorn, Markus" <Markus.Schorn@xxxxxxxxxxxxx>
>> To: "CDT General developers list." <cdt-dev@xxxxxxxxxxx>
>> Subject: Re: [cdt-dev] Parallelization of indexer
>> Message-ID:
>> <30D36C1BA62C5F4892C482E607D5E77E1FA57921@xxxxxxxxxxxxxxxxxxxxxxx>
>> Content-Type: text/plain; charset="us-ascii"
>>
>> Hi Volker!
>> I can understand your frustration, however there is an issue with the patch as provided in bug https://bugs.eclipse.org/bugs/show_bug.cgi?id=351659. My view on the matter is the following:
>>
>> (1) Your patch does not deal with indexing dependent projects. Currently the index of a project is reused by a dependent project. This requires the dependent project to be indexed after its dependencies. Your patch ignores this requirement.
>> You have identified that indexing with dependencies does introduce a performance issue. This needs further investigation and may lead us to changing the indexer, such that it ignores the project references. However, before we have made such a decision, your patch cannot be applied.
>>
>> (2) The approach of parallelizing indexing on project level does not help for large projects. I do agree with Sergey, that it would be more rewarding to make parallelization work on file-level.
>>
>>
>> To move forward on the issue, I encourage you to open a new bug on the performance issue of dependent projects. We need a discussion on that and only if we drop the requirement of reusing the index of a dependent project we can go back to consider your patch.
>>
>> In parallel it makes sense to think about parallelization on file level. Thinking long-term, this is the more promising approach, and it would find more traction.
>>
>> Markus.
>>
>>
>> -----Original Message-----
>> From: cdt-dev-bounces@xxxxxxxxxxx
>> [mailto:cdt-dev-bounces@xxxxxxxxxxx] On Behalf Of Volker Diesel
>> Sent: Friday, November 25, 2011 23:34
>> To: CDT Dev
>> Subject: [cdt-dev] (no subject)
>>
>> Hello, everybody.
>> There was some discussion about C/C++ indexer parallelization some months ago, and (initially) most people agreed that this would be a great feature.
>> There is a patch in place that brings full C/C++ indexing time down from 4 hrs to 20 mins in our project (see bug#351659).
>> This patch has now been used in our team (200+ people, 10+ million lines of C/C++ code) without any issue for several months.
>> I provided a git patch for CDT master.
>> I provided a git patch for CDT 8.
>> I have not received any answer to my latest questions in the above-mentioned bugzilla for months.
>> I wonder if anyone out there in CDT DEV is still interested in that topic?
>> I wonder how such an enhancement will finally find its way into any CDT codeline, and what else I can do to bring this feature into an official CDT release?
>> If no one at CDT DEV is interested in that topic any longer, please let me know. In that case I would simply close that useless bugzilla.
>> Thanks and kind regards.
>> Volker
>> _______________________________________________
>> cdt-dev mailing list
>> cdt-dev@xxxxxxxxxxx
>> https://dev.eclipse.org/mailman/listinfo/cdt-dev



------------------------------

Message: 3
Date: Tue, 29 Nov 2011 23:26:09 +0100 (CET)
From: "Volker Diesel" <volker.diesel@xxxxxx>
To: cdt-dev@xxxxxxxxxxx
Subject: [cdt-dev] Parallelization of indexer
Message-ID:
<1080985477.5434381.1322605569215.JavaMail.fmail@mwmweb012>
Content-Type: text/plain; charset="UTF-8"

1) About functionality of parallel indexer (Greg)
No! My parallel indexer patch does NOT introduce any functional limitations from an end-user point of view. If you search the index generated by my patch, it will give you exactly the same results as the index generated by the original CDT8 (at least that's what we have experienced in our team during the last half year or so). The only thing an end user notices is that index generation is one or two orders of magnitude faster than with the original CDT8 indexer (depending on how many CDT projects you have in your workspace and, of course, on your hardware). Indexes might also consume more disk space (depending on your project configuration).

2) About plugging in alternative indexers (Greg)
As far as I can tell from my (poor) knowledge of CDT, the options for easily plugging in alternative indexers have (unfortunately) been removed from CDT. According to the documentation, there is an extension to do so, but I could not find any code that implements that extension. If there were such an easy extension, I wouldn't need to go through all these discussions. I could simply publish my own indexer extension, and anyone who likes it could use it... free market, so to say :-) Unfortunately, CDT doesn't offer this (at least as far as I can tell).

3) About monolithic project setups
I do NOT consider this a minor topic or one not worth discussing, because if there is NO WAY to set up such a monolithic project, then the discussion about parallelization on file level vs. parallelization on project level can be stopped immediately! If there are situations where multiple CDT projects MUST be configured, we need a parallelization approach that handles multiple (and maybe many) projects appropriately.
So I am asking anyone @cdt-dev once again to explain if and how it is possible to set up one monolithic CDT project if you have different source files that require, e.g., different sets of #defines or different include paths (and if you expect correct indexer search results).

4) About parallelization on file level
I don't like to be cited wrongly, therefore once again, for clarification, what I really meant with point 4) of my last posting...
a) Parallel indexing on file level DOES NOT SOLVE ANY SINGLE PROBLEM that has not already been solved with my patch!
b) The problem of honouring project dependencies needs to be solved, no matter whether parallelization is done on file level or on project level. And once this is solved, the solution is as valid for file-level parallelization as it is for my already implemented project-level approach!
c) If (b) is "hacked" by only parallelizing the indexing of files in ONE SINGLE project, THIS WILL NOT BE A SOLUTION, but will instead INTRODUCE EVEN MORE COMPLICATED performance and parallelization issues (THIS IS WHAT I CLEARLY SAID IN MY LAST POSTING). I mentioned, e.g., lock contention on the index write lock as one issue, which can only be solved by quite complex refactoring of the index locking code... anyone out there to do that job within this decade???
So, point 4) of my last posting is A CLEAR STATEMENT AGAINST parallelization on file level, because this approach does not solve any problem that hasn't already been solved with my patch and only introduces a bulk of new parallelization issues and bottlenecks. It should therefore please no longer be misused as an argument to GO FOR parallelization on file level (at least as long as no solutions for the problems mentioned above are explicitly given). Thanks.
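
For illustration only: one way to picture what "honouring project dependencies" in (b) could look like for a project-level scheduler is to group projects into waves, so that a project is only indexed after every project it references. The sketch below is an assumption for discussion, not code from the patch in b#351659 and not a CDT API; the class name IndexWaves, the method waves and the project names are all hypothetical. Cyclic references are simply put into one wave, in line with the statement quoted earlier in this thread that cyclic dependencies at worst duplicate index information.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of CDT: orders projects for indexing.
public class IndexWaves {

    /**
     * Groups projects into waves. Every project lands in a later wave than all
     * projects it references, so the projects of one wave may run in parallel.
     * refs maps a project name to the names of the projects it references.
     */
    public static List<List<String>> waves(Map<String, Set<String>> refs) {
        // Collect every project, including ones that only appear as a reference.
        Set<String> remaining = new TreeSet<>(refs.keySet());
        for (Set<String> targets : refs.values()) {
            remaining.addAll(targets);
        }

        Set<String> done = new TreeSet<>();
        List<List<String>> result = new ArrayList<>();
        while (!remaining.isEmpty()) {
            List<String> wave = new ArrayList<>();
            for (String project : remaining) {
                // Ready when all referenced projects were indexed in earlier waves.
                if (done.containsAll(refs.getOrDefault(project, Set.of()))) {
                    wave.add(project);
                }
            }
            if (wave.isEmpty()) {
                // Only cyclic references are left: index them together (assumption).
                wave.addAll(remaining);
            }
            remaining.removeAll(wave);
            done.addAll(wave);
            result.add(wave);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> refs = new HashMap<>();
        refs.put("app", Set.of("core", "util"));
        refs.put("core", Set.of("util"));
        refs.put("util", Set.of());
        refs.put("tools", Set.of());
        System.out.println(waves(refs)); // prints [[tools, util], [core], [app]]
    }
}
```

Each wave could then be handed to a thread pool, waiting for the wave to finish before starting the next one. Whether such an ordering would be acceptable, or whether ignoring references altogether is the better route, is exactly the open question of this thread.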

5) If it helps to kick off cdt-dev administration, I will open a new bugzilla about the indexer, project references and the related performance issues, copy-paste the problem description (already available for months) from here to there and re-attach the jperf files (already available for months) from here to there... and will then once again ask someone @cdt-dev to PLEASE, PLEASE, PLEASE have a look at these performance issues, because I do not know enough about CDT to tell what's wrong there, and because the REAL technical issue does not appear or disappear simply because a new bugzilla is opened or not opened!




-----Urspr?ngliche Nachricht-----
Von: cdt-dev-request@xxxxxxxxxxx
Gesendet: Nov 29, 2011 6:00:06 PM
An: cdt-dev@xxxxxxxxxxx
Betreff: cdt-dev Digest, Vol 81, Issue 34

Send cdt-dev mailing list submissions to cdt-dev@xxxxxxxxxxx

To subscribe or unsubscribe via the World Wide Web, visit https://dev.eclipse.org/mailman/listinfo/cdt-dev
or, via email, send a message with subject or body 'help' to cdt-dev-request@xxxxxxxxxxx

You can reach the person managing the list at cdt-dev-owner@xxxxxxxxxxx

When replying, please edit your Subject line so it is more specific than "Re: Contents of cdt-dev digest..."


Today's Topics:

1. Re: Parallelization of indexer (Greg Watson) 2. Re: Parallelization of indexer (Schorn, Markus)


----------------------------------------------------------------------

Message: 1
Date: Tue, 29 Nov 2011 08:51:09 -0500
From: Greg Watson <g.watson@xxxxxxxxxxxx>
To: "CDT General developers list." <cdt-dev@xxxxxxxxxxx>
Subject: Re: [cdt-dev] Parallelization of indexer
Message-ID: <5F0F2E77-6A61-4D6C-9E81-AA9249B5520E@xxxxxxxxxxxx>
Content-Type: text/plain; charset=iso-8859-1

Hi,

Would it be possible to add this as an experimental indexer that could be enabled though the preferences? There used to be support for multiple indexers, but this seems to have been removed in CDT 8, presumably to avoid confusion. From the user's perspective, what's important is the speed and accuracy of the indexer. From the discussion, it sounds like the new indexer improves speed but reduces accuracy for some types of projects. I think users would be willing to give this a try if it was easy to enable/disable.

Regards,
Greg

On Nov 29, 2011, at 5:17 AM, Schorn, Markus wrote:

> Hi Volker!
> Ad 1)
> I am not interested in the discussion whether one should use a monolithic project or should split up the source into multiple projects. Both ways are a valid way of using CDT.
>
> Ad 2)
> We have a specific handling of project references in place (index of referenced project is reused). One can challenge this approach (I am not very convinced of this approach). The performance issue would be a way to start this challenge. However, we cannot simply change the behavior of CDT without a discussion and some analysis on the matter, and as long as we use this approach your patch has to honor it.
>
> Ad 3) and 4)
> As you realized yourself in 4), parallelization on file level would solve 2), because than you can index the reference project before the dependent one and at every time you would do that in parallel on file-level. As always, there are pros and cons to each approach.
>
> Ad 1)
> Right, the discussion on parallelization on file-level vs. project level can be done in parallel.
>
> Ad 2)
> You have two options: Either you make your patch honor project references, or you work towards changing CDT such that it ignores project references. For the latter a bugzilla on the performance issue may be your starting point.
>
> Markus.
>
> -----Original Message-----
> From: cdt-dev-bounces@xxxxxxxxxxx [mailto:cdt-dev-bounces@xxxxxxxxxxx]
> On Behalf Of Volker Diesel
> Sent: Monday, November 28, 2011 21:26
> To: cdt-dev@xxxxxxxxxxx
> Subject: [cdt-dev] Parallelization of indexer
>
> Hi Markus.
>
> Yes, this discussion has been somewhat frustrating and I still do not agree with your comments (and the comments of others about that topic) for several reasons.
> 1) Everyone involved in this discussion seems to believe, that it is possible to setup one monolithic Eclipse project for a large-scale and real-life C++ project, and that this is the "normal" use-case for CDT, and that parallelization of indexer jobs across Eclipse projects therefore doesn't help much. I already questioned that opinion in b#351659, because I cannot see, how you would setup one monolithic Eclipse project, if the sources require e.g. different sets of #define's or different include pathes (and source files in real-life projects normally do require this). You won't get correct indexer results in that case. But maybe, I missed some point...
> 2) In fact, my patch does not honour project references, but I explicitly asked, if that would be a real problem, and your reply was, that the only impact is potential replication of some of the symbols in multiple project indices. Therefore, my understanding was that there is no "real" issue, if parallelized indexer does not take project references into account.
> 3) I cannot see, how parallelization on file level instead of project level can resolve issue 2). If indexer job#1 indexes a file from project A and indexer job#2 indexes a file from project B, then you are back at the same point... Should these jobs honour references between project A and B? I cannot see any difference.
> 4) Only solution to problem 3) would be to limit parallel indexer jobs (on file level) to the set of files of one and the same project. In that case, there will be at least three other issues... First, when there are many projects with only a few files, parallelization will be poor. Second, at the end of the process of indexing each project, parallization will be poor, because there are more potential free CPU cores than there are files left to index in that single project. Third (and most important) all these parallel jobs on the files of one single project will run into lock contention on the project's index write lock. I already faced a similar issue with my patch and had to change some of the locking code to achieve enough throuput/CPU utilization with my approach.
>
> Therefore, from my point of view...
> 1) Discussion about paralellization across projects vs. parallelization across source files is independent of the question of honouring project dependencies. Both approaches need either a fix for the performance issues mentioned in b#351659 or a decission not to honour project dependencies while indexing.
> 2) And yes, of course I could open another bugzilla about that performance issue, but what would that help? I already mentioned that issue, I captured jprof profiling information and attached the profiler data to b#351659, and I asked for someone to look into that data. Nothing happend. Why should that change, if I opened a second bugzilla and attached the same profiler data again?
>
> Kind regards.
> Volker
>
>
>
> -----Urspr?ngliche Nachricht-----
> Date: Mon, 28 Nov 2011 06:40:34 +0000
> From: "Schorn, Markus" <Markus.Schorn@xxxxxxxxxxxxx>
> To: "CDT General developers list." <cdt-dev@xxxxxxxxxxx>
> Subject: Re: [cdt-dev] Parallelization of indexer
> Message-ID:
> <30D36C1BA62C5F4892C482E607D5E77E1FA57921@xxxxxxxxxxxxxxxxxxxxxxx>
> Content-Type: text/plain; charset="us-ascii"
>
> Hi Volker!
> I can understand your frustration, however there is an issue with the patch as provided in bug https://bugs.eclipse.org/bugs/show_bug.cgi?id=351659. My view on the matter is the following:
>
> (1) Your patch does not deal with indexing dependent projects. Currently the index of a project is reused by a dependent project. This requires the dependent project to be indexed after its dependencies. Your patch ignores this requirement.
> You have identified that indexing with dependencies does introduce a performance issue. This needs further investigation and may lead us to changing the indexer, such that it ignores the project references. However, before we have made such a decision, your patch cannot be applied.
>
> (2) The approach of parallelizing indexing on project level does not help for large projects. I do agree with Sergey, that it would be more rewarding to make parallelization work on file-level.
>
>
> To move forward on the issue, I encourage you to open a new bug on the performance issue of dependent projects. We need a discussion on that and only if we drop the requirement of reusing the index of a dependent project we can go back to consider your patch.
>
> In parallel, it makes sense to think about parallelization on file level. Thinking long-term, this is the more promising approach, and it would find more traction.
>
> Markus.
>
>
> -----Original Message-----
> From: cdt-dev-bounces@xxxxxxxxxxx [mailto:cdt-dev-bounces@xxxxxxxxxxx]
> On Behalf Of Volker Diesel
> Sent: Friday, November 25, 2011 23:34
> To: CDT Dev
> Subject: [cdt-dev] (no subject)
>
> Hello, everybody.
> There used to be some discussion about C/C++ indexer parallelization some months ago and (initially) most people agreed, that this would be a great feature.
> There is a patch in place, that brings down full C/C++ indexing time from 4hrs to 20mins in our project (see bug#351659).
> This patch has now been used in our team (200+ people, 10+Mio lines of C/C++ code) without any issue for several months.
> I provided a git patch for CDT master.
> I provided a git patch for CDT 8.
> I have not received any answer to my latest questions in the above-mentioned bugzilla for months.
> I wonder if anyone out there in CDT DEV is still interested in that topic?
> I wonder how such an enhancement will finally find its way into any CDT codeline, and what else I can do to bring this feature into an official CDT release?
> If no one at CDT DEV is interested in that topic any longer, please let me know. In that case I would simply close that useless bugzilla.
> Thanks and kind regards.
> Volker
> _______________________________________________
> cdt-dev mailing list
> cdt-dev@xxxxxxxxxxx
> https://dev.eclipse.org/mailman/listinfo/cdt-dev



------------------------------

Message: 2
Date: Tue, 29 Nov 2011 14:57:26 +0000
From: "Schorn, Markus" <Markus.Schorn@xxxxxxxxxxxxx>
To: "CDT General developers list." <cdt-dev@xxxxxxxxxxx>
Subject: Re: [cdt-dev] Parallelization of indexer
Message-ID:
<30D36C1BA62C5F4892C482E607D5E77E1FA57C81@xxxxxxxxxxxxxxxxxxxxxxx>
Content-Type: text/plain; charset="iso-8859-1"

Hi Greg,
There is still support for multiple indexers; however, the UI for that does not show up as long as you do not supply an alternative indexer. Unlike in the earlier days of CDT, it is no longer simple to provide an alternative indexer that can come close to the one that is built into CDT. Therefore the usual path for a new indexer feature is to make it part of the existing indexer. Clearly such a feature can be made dependent on a preference setting.

Whatever feature goes into CDT causes bug reports and when it comes to dealing with issues the enthusiasm of contributors and also committers is limited. I have quite a list of annoying 'experimental' features in CDT that simply don't work correctly (and probably never will). I do think it is a good idea to discuss and analyze the impact of new features before putting them into CDT.

In the given case the new feature (parallelizing the indexer across projects) is simply incomplete in that it does not respect that there needs to be some order in indexing projects. It is not really difficult to implement the missing piece.
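
The ordering requirement can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not code from the patch or from CDT: `IndexWaves`, `waves`, and `indexAll` are hypothetical names, and projects are represented by plain strings. Projects are grouped into "waves" so that each project lands in a later wave than every project it references. Projects within one wave are independent of each other and can be indexed in parallel. If a reference cycle remains, the sketch falls back to putting the remaining projects into a single wave.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.Consumer;

final class IndexWaves {
    // Groups projects into waves: a project appears only after all projects
    // it references. 'references' maps a project to the projects it references.
    static List<List<String>> waves(Map<String, Set<String>> references) {
        // Sorted copy, so that the result is deterministic; referenced
        // projects that are not keys themselves are added with no references.
        Map<String, Set<String>> pending = new TreeMap<>();
        references.forEach((project, refs) -> {
            pending.computeIfAbsent(project, k -> new TreeSet<>()).addAll(refs);
            for (String ref : refs) {
                pending.computeIfAbsent(ref, k -> new TreeSet<>());
            }
        });

        List<List<String>> result = new ArrayList<>();
        Set<String> done = new HashSet<>();
        while (!pending.isEmpty()) {
            List<String> wave = new ArrayList<>();
            for (Map.Entry<String, Set<String>> entry : pending.entrySet()) {
                if (done.containsAll(entry.getValue())) {
                    wave.add(entry.getKey());
                }
            }
            if (wave.isEmpty()) {
                // Reference cycle: no remaining project is ready, so index
                // the rest together instead of looping forever.
                wave.addAll(pending.keySet());
            }
            pending.keySet().removeAll(wave);
            done.addAll(wave);
            result.add(wave);
        }
        return result;
    }

    // Runs the (hypothetical) per-project indexer wave by wave. A wave is
    // processed in parallel; the next wave starts only after it has finished.
    static void indexAll(List<List<String>> waves, Consumer<String> indexProject) {
        for (List<String> wave : waves) {
            wave.parallelStream().forEach(indexProject);
        }
    }

    public static void main(String[] args) {
        Map<String, Set<String>> refs = Map.of(
                "B", Set.of("A"),
                "C", Set.of("A"),
                "D", Set.of("B", "C"));
        List<List<String>> order = waves(refs);
        System.out.println(order); // prints [[A], [B, C], [D]]
        indexAll(order, project -> System.out.println("indexing " + project));
    }
}
```

A wave-based schedule is simple but can leave cores idle while the last project of a wave finishes. A scheduler that starts each project as soon as its own references are done would use the cores better, at the cost of more bookkeeping.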

Another path is to get rid of reusing indexes from referenced projects. The effect of this would be the duplication of index information about files used from multiple dependent projects. While this makes indexing easier, it puts the burden on the clients working with the index data, because they have to deal with the redundant information. As written before, I am not convinced that reusing those indexes is important, and we may want to change the indexer not to reuse them.

We may also end up with another preference setting, that allows for turning off reusing indexes from other projects. The parallelization would always work, but would work better when reusing indexes is turned off.

Markus.


-----Original Message-----
From: cdt-dev-bounces@xxxxxxxxxxx [mailto:cdt-dev-bounces@xxxxxxxxxxx] On Behalf Of Greg Watson
Sent: Tuesday, November 29, 2011 14:51
To: CDT General developers list.
Subject: Re: [cdt-dev] Parallelization of indexer

Hi,

Would it be possible to add this as an experimental indexer that could be enabled through the preferences? There used to be support for multiple indexers, but this seems to have been removed in CDT 8, presumably to avoid confusion. From the user's perspective, what's important is the speed and accuracy of the indexer. From the discussion, it sounds like the new indexer improves speed but reduces accuracy for some types of projects. I think users would be willing to give this a try if it was easy to enable/disable.

Regards,
Greg

On Nov 29, 2011, at 5:17 AM, Schorn, Markus wrote:

> Hi Volker!
> Ad 1)
> I am not interested in the discussion whether one should use a monolithic project or should split up the source into multiple projects. Both ways are a valid way of using CDT.
>
> Ad 2)
> We have a specific handling of project references in place (index of referenced project is reused). One can challenge this approach (I am not very convinced of this approach). The performance issue would be a way to start this challenge. However, we cannot simply change the behavior of CDT without a discussion and some analysis on the matter, and as long as we use this approach your patch has to honor it.
>
> Ad 3) and 4)
> As you realized yourself in 4), parallelization on file level would solve 2), because then you can index the referenced project before the dependent one, and at any time you would do that in parallel on file level. As always, there are pros and cons to each approach.
>
> Ad 1)
> Right, the discussion on parallelization on file-level vs. project level can be done in parallel.
>
> Ad 2)
> You have two options: Either you make your patch honor project references, or you work towards changing CDT such that it ignores project references. For the latter a bugzilla on the performance issue may be your starting point.
>
> Markus.
>
> -----Original Message-----
> From: cdt-dev-bounces@xxxxxxxxxxx [mailto:cdt-dev-bounces@xxxxxxxxxxx]
> On Behalf Of Volker Diesel
> Sent: Monday, November 28, 2011 21:26
> To: cdt-dev@xxxxxxxxxxx
> Subject: [cdt-dev] Parallelization of indexer
>
> Hi Markus.
>
> Yes, this discussion has been somewhat frustrating and I still do not agree with your comments (and the comments of others about that topic) for several reasons.
> 1) Everyone involved in this discussion seems to believe that it is possible to set up one monolithic Eclipse project for a large-scale and real-life C++ project, and that this is the "normal" use-case for CDT, and that parallelization of indexer jobs across Eclipse projects therefore doesn't help much. I already questioned that opinion in b#351659, because I cannot see how you would set up one monolithic Eclipse project if the sources require e.g. different sets of #define's or different include paths (and source files in real-life projects normally do require this). You won't get correct indexer results in that case. But maybe I missed some point...
> 2) In fact, my patch does not honour project references, but I explicitly asked, if that would be a real problem, and your reply was, that the only impact is potential replication of some of the symbols in multiple project indices. Therefore, my understanding was that there is no "real" issue, if parallelized indexer does not take project references into account.
> 3) I cannot see, how parallelization on file level instead of project level can resolve issue 2). If indexer job#1 indexes a file from project A and indexer job#2 indexes a file from project B, then you are back at the same point... Should these jobs honour references between project A and B? I cannot see any difference.
> 4) The only solution to problem 3) would be to limit parallel indexer jobs (on file level) to the set of files of one and the same project. In that case, there will be at least three other issues... First, when there are many projects with only a few files, parallelization will be poor. Second, at the end of the process of indexing each project, parallelization will be poor, because there are more potentially free CPU cores than there are files left to index in that single project. Third (and most important), all these parallel jobs on the files of one single project will run into lock contention on the project's index write lock. I already faced a similar issue with my patch and had to change some of the locking code to achieve enough throughput/CPU utilization with my approach.
>
> Therefore, from my point of view...
> 1) The discussion about parallelization across projects vs. parallelization across source files is independent of the question of honouring project dependencies. Both approaches need either a fix for the performance issues mentioned in b#351659 or a decision not to honour project dependencies while indexing.
> 2) And yes, of course I could open another bugzilla about that performance issue, but how would that help? I already mentioned that issue, I captured jprof profiling information and attached the profiler data to b#351659, and I asked for someone to look into that data. Nothing happened. Why should that change if I opened a second bugzilla and attached the same profiler data again?
>
> Kind regards.
> Volker

End of cdt-dev Digest, Vol 81, Issue 39
***************************************


