RE: [cdt-dev] Indexer Requirements for 3.0

I agree with Chris ... quite the challenge.

Just to add to what Chris has asked, would the combination of being
able to index offline and "merge" index files allow a set of
system headers (which would be great) or project headers for highly
re-used components to be pre-indexed and provided as "packages"?

Thanks, 
 Thomas 

> -----Original Message-----
> From: cdt-dev-admin@xxxxxxxxxxx 
> [mailto:cdt-dev-admin@xxxxxxxxxxx] On Behalf Of Chris Wiebe
> Sent: January 13, 2005 4:47 PM
> To: cdt-dev@xxxxxxxxxxx
> Subject: Re: [cdt-dev] Indexer Requirements for 3.0
> 
> Looks good, Bogdan.  Will this also include indexing of 
> external headers (from the project include paths)?
> 
> Chris
> 
> 
> Bogdan Gheorghe wrote:
> > 
> > Here is a list of indexer requirements for the 3.0 release. The
> > overall focus for the indexer for 3.0 is on improving the scalability
> > and overall usability of the indexer service for all clients. Feedback
> > is always appreciated - especially if you think there are missing
> > requirements. A more detailed design doc is to follow.
> > 
> > Thanks,
> > Bogdan
> > 
> > 
> > 
> > 
> > ------------------------------------------------------------------------
> > 
> > Indexer Requirements for 3.0
> > This document describes the proposed work items for the indexer for the
> > CDT 3.0 release.
> > 
> > Author         : Bogdan Gheorghe
> > Revision Date  : 11/29/2004 - Version: 0.1.0
> >                : 01/10/2005 - Version: 0.1.1
> > Change History : 0.1.0 - Document Creation
> >                : 0.1.1 - Revision
> > 
> > 
> > Table of Contents
> > 
> > 1. Introduction
> > 2. Requirements
> > 3. UI Requirements
> > 4. References
> > 
> > 
> > 1. Introduction
> > 
> > The Indexer has been around since CDT 1.2 and currently provides
> > support for Search, Navigation, and Refactoring. Its main purpose is
> > to provide rapid access to a complete database of code elements and to
> > manage this database in an efficient and non-intrusive manner. As the
> > CDT has evolved, so has the indexer: adding more elements to the
> > index, refining job scheduling, and providing feedback mechanisms for
> > indexes. Although the indexer is sufficiently developed to provide most
> > requested information to clients, it has become clear that the next
> > step in the indexer's evolution will have to address its ability to
> > handle very large projects efficiently.
> > 
> > Having the indexer work well on large-scale projects requires some new
> > architecture to reduce the amount of time spent indexing as much as
> > possible, reuse existing indexes as much as possible, and provide users
> > with mechanisms to extend the index framework.
> > 
> > This document addresses the main requirements on the indexer for CDT 3.0.
> > 
> > 
> >     1.0 Definitions
> > 
> > Resource
> > 	A project, folder or file within the Eclipse workspace
> > Index profiles
> > 	Separate indexes that are created for different configurations of
> > 	include paths/symbols
> > 
> > 
> > 
> >     1.1 Current Architecture
> > 
> > Quick overview of the indexing architecture:
> > 
> >    1. The indexer responds to resource events from the workbench. These
> >       events occur whenever a resource gets created, modified or deleted.
> >    2. The indexer will create index jobs based on the resource events.
> >       These jobs might schedule other jobs (such as in the case of
> >       indexing an entire project) but most index jobs eventually boil
> >       down to an AddCompilationUnitToIndex job.
> >    3. The indexer creates a new parser, passes in the current include
> >       paths and symbol definitions and parses the file and any other
> >       files included by the file in full parse mode (which generates
> >       cross reference information).
> >    4. The index gets created as the parser returns information about the
> >       elements in the file - the index is stored in memory and at
> >       certain intervals gets merged with the persisted index on disk.
> > 
> > 
> > Currently the indexes have the following structure:
> > 
> >     * Index Version Number
> >     * Summary Block Location
> >     * File Blocks [1 ... N]:
> >           o Each full file block is 8k
> >           o File block entries associate the path of a file with a
> >             unique ID
> >     * Word Blocks [1 ... N]:
> >           o Each full word block is 8k
> >           o Word block entries encode the element in a special format
> >             and add the referring file number
> >     * Include Blocks [1 ... N]:
> >           o Each full include block is 8k
> >           o Include block entries associate a file with the initial file
> >             that was parsed to get to the current file
> >     * Summary Block:
> >           o Keeps track of the total number of words, files and include
> >             entries in the index
> >           o Keeps track of the first file block number, the first word
> >             block number and the first include block number
> >           o Keeps track of the first file for every File Block, the
> >             first word entry for every Word Block, and the first include
> >             for every Include Block
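
The structure above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: the class, its method names and the encoded element string are hypothetical and are not CDT code, and the 8k block packing and block offsets are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Index:
    """In-memory stand-in for the on-disk index (hypothetical names)."""
    version: int
    files: dict = field(default_factory=dict)     # file path -> unique file ID
    words: dict = field(default_factory=dict)     # encoded element -> set of file IDs
    includes: dict = field(default_factory=dict)  # included file ID -> initial file IDs

    def add_file(self, path):
        """File block entry: associate a path with a unique ID."""
        if path not in self.files:
            self.files[path] = len(self.files) + 1
        return self.files[path]

    def add_word(self, encoded, path):
        """Word block entry: the encoded element plus the referring file number."""
        self.words.setdefault(encoded, set()).add(self.add_file(path))

    def add_include(self, included, initial):
        """Include block entry: tie an included file to the file first parsed."""
        included_id = self.add_file(included)
        self.includes.setdefault(included_id, set()).add(self.add_file(initial))

    def summary(self):
        """Summary block totals: number of words, files and include entries."""
        return {
            "words": len(self.words),
            "files": len(self.files),
            "includes": len(self.includes),
        }
```

For example, adding the same element from two source files yields one word entry that refers to two file IDs, and the summary counts follow from the three tables.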
> > 
> > 
> >     1.2 Constraints
> > 
> > Number
> > 	Description
> > C1
> > 	Multiple indexer profiles are not possible until a similar build
> > 	configuration notion appears
> > 
> > In order to get an accurate index, the indexer depends on being able to
> > pass the relevant includes/symbols to the parser. As long as there
> > is no standard representation for build configurations in the core
> > model for all build types (both standard and managed), it isn't
> > possible to enable index profiles.
> > 
> > Bug 25682: Indexer Profiles
> > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=25682>
> > 
> > 
> > 2. Requirements
> > 
> > The requirements for the indexer are classified into the following
> > categories:
> > 
> > 
> >     2.1 Index Management Requirements
> > 
> > Number
> > 	Priority
> > 	Description
> > R1
> > 	P1
> > 	Indexer must provide different types of indexing services
> > 
> > It has become apparent that the one-size-fits-all indexing approach
> > does not meet all of our users' needs. Most clients with existing
> > legacy projects want some form of search/navigation, but some don't
> > want the hassle of having to wait for an entire project to index fully
> > before being able to use search/navigation. Thus, in order to
> > accommodate both user groups (those who are willing to wait
> > for a full index to complete and those who just want the absolute bare
> > minimum index), we need to offer the following indexer options:
> > 
> >    1. Full Index: this is the "regular" index mode which uses the CDT
> >       parser (include paths and symbol definitions need to be set up
> >       properly); everything gets indexed
> >    2. Quick Index with no setup: this will result in a "best effort"
> >       bare-bones index that should enable some navigation/search
> > 
> > 
> > These indexer options are per project, so it is possible to have
> > different indexers for different projects.
> > 
> > Bug 69078: C/C++ indexer too slow
> > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=69078>
> > R2
> > 	P1
> > 	Indexes should be shareable between team members
> > 
> > As part of streamlining the indexing process for large projects,
> > it should be possible to share indexes between users working on the
> > same project. All index entries should make use of path variables in
> > order to allow indexes to be translated into different workspace
> > locations. (See the Scanner Config Correctness Enhancement FDS for more
> > details about path variables.)
> > 
> > Bug 79661: All Index Entries should make use of Path Variables 
> > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=79661>
> > Bug 79518: Path/Variable Manager support service in the core(string
> > substitution) <https://bugs.eclipse.org/bugs/show_bug.cgi?id=79518>
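
To illustrate how path variables would make index entries relocatable between team members, here is a minimal sketch. The `${NAME}` syntax, the function names and the variable names are assumptions made for illustration; the actual substitution service is the subject of Bug 79518 and may look quite different.

```python
def to_portable(path, variables):
    """Replace the longest matching root directory with its ${NAME} variable."""
    best = None
    for name, root in variables.items():
        root = root.rstrip("/")
        if path == root or path.startswith(root + "/"):
            if best is None or len(root) > len(best[1]):
                best = (name, root)
    if best is None:
        return path  # no variable covers this path; store it unchanged
    name, root = best
    return "${" + name + "}" + path[len(root):]


def to_local(portable, variables):
    """Expand a leading ${NAME} using this workspace's variable values."""
    if portable.startswith("${"):
        end = portable.find("}")
        if end != -1:
            name = portable[2:end]
            if name in variables:
                return variables[name].rstrip("/") + portable[end + 1:]
    return portable
```

An index written on one machine would store `${PROJECT_ROOT}/src/main.c`, and a second user with a different value for `PROJECT_ROOT` would expand that entry to a path in their own workspace.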
> > R3
> > 	P1
> > 	Indexer should be able to index a project offline
> > 
> > For features that require a complete index it would be ideal to have
> > the index be created somewhere separately and imported at a later date.
> > This is especially true for medium to large projects that need a long
> > time to index.
> > 
> > Bug 74433: Offline Indexing/Index Hierarchy 
> > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=74433>
> > R4
> > 	P1
> > 	Indexer should be able to merge indexes
> > 
> > With all of the new options for indexing, it is conceivable that any
> > project might have several sources to look at for a single index. The
> > indexer should be able to merge new index information into an existing
> > index (through a user action), provided that both index formats are
> > alike.
> > 
> > Bug 52126: Indexer should maintain per project indices 
> > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=52126>
> > Bug 74433: Offline Indexing/Index Hierarchy 
> > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=74433>
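
A merge along the lines of R4 could be sketched as follows. This assumes a simplified, hypothetical index shape (a version number plus a map from encoded element to the set of file paths that mention it); entries are keyed by path rather than by per-index file ID so that no ID remapping is needed in the sketch.

```python
def merge_indexes(target, incoming):
    """Return a new index holding the entries of both inputs.

    Refuses to merge when the formats differ, mirroring the
    "provided that both index formats are alike" condition.
    """
    if target["version"] != incoming["version"]:
        raise ValueError("index formats differ; cannot merge")
    merged = {
        "version": target["version"],
        "words": {word: set(paths) for word, paths in target["words"].items()},
    }
    for word, paths in incoming["words"].items():
        merged["words"].setdefault(word, set()).update(paths)
    return merged
```

Neither input is modified, so a failed or unwanted merge leaves the existing index intact.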
> > R5
> > 	P2
> > 	Indexer should allow user to specify indexer settings
> > 
> > Currently the indexer uses the same settings for all projects. This
> > might not suit all users. Indexer settings that should be customizable
> > include:
> > 
> >     * Indexing policy: (default normal)
> >           o normal (always up to date)
> >           o manual (index only when manually requested)
> >           o static (don't update current index)
> >           o after build (don't index until build)
> > 
> >     * Index Progress Bar displayed: (Checkbox, default displayed)
> > 
> >     * Indexer default setting for new projects:
> >           o Currently the indexer is always on when creating a new
> >             project. This should be changed to allow users to set the
> >             default index behaviour when creating a new project
> > 
> > 
> > Bug 75884: Allow C/C++ Indexing to be set on or off as a default 
> > setting <https://bugs.eclipse.org/bugs/show_bug.cgi?id=75884>
> > R6
> > 	P1
> > 	Index Manager must improve job scheduling smarts
> > 
> > The indexer currently has limited smarts when it comes to job
> > scheduling; it will prevent the same job from being queued up. But
> > there are other circumstances when being more aware of what's in the
> > job queue would solve numerous problems, including the ever-recurring
> > double index and source folder changes. Essentially, more information
> > needs to be added to individual job requests about the event that
> > created the job in order to enable the index manager to make smart
> > choices about how to best schedule the jobs.
> > 
> > The index manager should also run at the minimum priority.
> > 
> > Bug 60084: Indexer should reduce its priority when running in 
> > background <https://bugs.eclipse.org/bugs/show_bug.cgi?id=60084>
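
One way to picture R6 is a queue in which every request records the event that created it and redundant work is dropped. This is a sketch only: the class names, the cause strings and the rule that a whole-project job subsumes per-file jobs are assumptions, not the current or planned index manager behaviour.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexJob:
    project: str
    path: str   # "" means the whole project
    cause: str  # the event that created the job, e.g. "resource-changed"


class JobQueue:
    def __init__(self):
        self._jobs = deque()

    def submit(self, job):
        """Queue a job unless pending work already covers it.

        Returns True if the job was queued, False if it was dropped.
        """
        for pending in self._jobs:
            if pending.project != job.project:
                continue
            if pending.path == "" or pending.path == job.path:
                return False
        if job.path == "":
            # A whole-project index makes pending per-file jobs redundant.
            self._jobs = deque(j for j in self._jobs if j.project != job.project)
        self._jobs.append(job)
        return True

    def pending(self):
        return list(self._jobs)
```

With this rule, a file change followed by an include-path change on the same project leaves a single whole-project job in the queue instead of a double index.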
> > R7
> > 	P1
> > 	Indexer must try to keep as much as possible from all failed index
> > 	attempts
> > 
> > The indexer needs to persist a list of all files that are to be
> > indexed as part of an index job. As each merge happens, the list is
> > updated. In the event of a crash, all work that has been merged to
> > disk shall be considered as sane and the indexer will restart on the
> > remaining files on the next startup.
> > 
> > Bug 62366: [Index] Need ability to read a partial index and resume 
> > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=62366>
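
The persisted work list described in R7 could look roughly like this. The JSON file, the class name and the atomic-rename strategy are all assumptions chosen for the sketch; the requirement itself does not prescribe a storage format.

```python
import json
import os
import tempfile


class PendingList:
    """Persisted list of files still to be indexed (hypothetical helper)."""

    def __init__(self, state_path):
        self.state_path = state_path

    def load(self):
        """Return the files left over from the last run, or [] if none."""
        if not os.path.exists(self.state_path):
            return []
        with open(self.state_path, encoding="utf-8") as f:
            return json.load(f)

    def save(self, files):
        """Write atomically so a crash never leaves a half-written list."""
        directory = os.path.dirname(self.state_path) or "."
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(sorted(files), f)
        os.replace(tmp, self.state_path)

    def mark_merged(self, merged):
        """Drop files whose entries were just merged to the on-disk index."""
        done = set(merged)
        remaining = [p for p in self.load() if p not in done]
        self.save(remaining)
        return remaining
```

On startup the indexer would call `load()` and resume with whatever is left; everything already removed from the list is treated as safely merged.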
> > R8
> > 	P1
> > 	Indexer needs to deal with new path/symbol addition/deletion
> > 	gracefully
> > 
> > The indexer currently reindexes the entire project on the
> > addition/deletion of each new path/symbol. With the new per-file
> > scanner settings, we should only index the range of files that are
> > affected. (See the Scanner Config Correctness Enhancement FDS for more
> > details about per-file scanner settings.)
> > 
> > R9
> > 	P1
> > 	Indexer needs to handle Source Folder changes gracefully
> > 
> > The indexer currently does a brute-force reindex whenever anything
> > changes in the source folder view. This needs to change to a more
> > scalable solution.
> > 
> > R10
> > 	P4
> > 	Resources can be added manually to the index
> > 
> > Users can request a new index on any resource in the workspace
> > (including files that are included from external directories) by
> > selecting one or more files and choosing the appropriate option from
> > the context menu.
> > 
> > Bug 71821: Action on the indexer to parse file/folder 
> > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=71821>
> > R11
> > 	P2
> > 	FileType changes should trigger indexes
> > 
> > Changing the file type settings on a project might introduce new file
> > types as extensions that have not been indexed as of yet. The indexer
> > should react in accordance with the user's settings (as defined in R5).
> > 
> > Bug 72396: The indexer is not aware of the File Type (ResolverModel)
> > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=72396>
> > R12
> > 	P1
> > 	Indexer Extension point
> > 
> > If none of the current versions of the indexer meets the users'
> > requirements, the CDT should provide an extension point that will
> > allow users to write their own indexer that will populate the index.
> > Provided the information is complete, all index-based CDT features
> > should work as normal.
> > 
> > R13
> > 	P4
> > 	Index Manager should allow for ongoing search while indexing
> > 
> > The indexer should be able to compare incoming index entries with any
> > pending search queries and return the matches.
> > 
> > Bug 72803: [Performance][Usability][Indexer/Search] Waiting policy for
> > index results <https://bugs.eclipse.org/bugs/show_bug.cgi?id=72803>
> > Bug 53792: [Scalability] Prioritizing the indexing when searching a 
> > working set <https://bugs.eclipse.org/bugs/show_bug.cgi?id=53792>
> > R14
> > 	P1
> > 	Index should provide an interface to allow clients to determine what
> > 	features are available with the current indexes
> > 
> > With the prospect of having various levels of detail available in a
> > project's indexes, the Index Manager needs to be able to provide an
> > interface that will tell any client whether there is sufficient
> > information in the index to run the client's service. If the necessary
> > index detail is missing, it would be up to the client to ask the user
> > if they wish to schedule a new index.
> > 
> > 
> >     2.2 Index Content Requirements
> > 
> > Number
> > 	Priority
> > 	Description
> > R15
> > 	P1
> > 	Indexer will provide enough information to run searches without a
> > 	second parse
> > 
> > Currently the index stores just the location information for the
> > entries. This causes search to require two separate parses: one for
> > the initial index and another to determine the offset information. The
> > indexer should store this offset information in the initial index and
> > thus avoid the second parse. Enough information must be available in
> > the index to answer all of the possible MatchLocator queries.
> > Additional info that needs to be added to the indexer includes:
> > 
> >     * function/method parameters
> >     * if a variable has an initializer clause, extern specifier or
> >       linkage specification (needed to determine if it is a definition)
> >     * if a field is static (needed to see if we need to check for a
> >       definition)
> > 
> > 
> > Bug 74427: Indexer needs to store more info 
> > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=74427>
> > R16
> > 	P3
> > 	References in the index should be tied into their declarations
> > 
> > We need to be able to match up references to their declarations 
> > (possible requirement for refactoring support).
> > 
> > Bug 69606: [Search] Match locator has to make sure that the reference
> > belongs to the specified declaration
> > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=69606>
> > R17
> > 	P1
> > 	New Indexer needs to be written for new AST
> > 
> > As the CDT switches over to the new AST, the indexer will have to be
> > rewritten to extract information from the AST.
> > 
> > 
> >     2.3 Problem Markers
> > 
> > 
> > Number
> > 	Priority
> > 	Description
> > R18
> > 	P2
> > 	Problem Markers should be able to be removed manually
> > 
> > There are a number of scenarios in which Index problem markers get placed
> > on resources and cannot be removed. There needs to be some sort of
> > menu option that can manually delete selected problem markers.
> > 
> > Bug 74284: [IProblem] All Problem markers removed for project when one
> > file is excluded
> > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=74284>
> > 
> > 3. UI Requirements
> > 
> > 
> >     3.1 UI enhancements
> > 
> > The following UI enhancements are planned to support the feature:
> > 
> > Number
> > 	Description
> > UE1
> > 	Indexer Options Preference Page
> > 
> > This page will allow users to make changes to the Indexer that affect
> > the entire workspace:
> > 
> >     * Index Progress Bar displayed
> >     * Indexer New Project Default Setting
> > 
> > UE2
> > 	Indexer Project Properties Page
> > 
> > This page will allow users to set indexer settings per project:
> > 
> >     * Indexer to use: if a number of indexers are available (e.g.
> >       SourceIndexer, SourceIndexer2, CTagsIndexer), users can specify
> >       which indexer should be used for the project.
> >     * Indexing policy to use
> > 
> > 
> > 
> > 4. References
> > 
> >    1. Scanner Config Correctness Enhancement FDS
> > 
> > 
> > /Last Modified on Monday, January 10, 2005 /
> > 
> _______________________________________________
> cdt-dev mailing list
> cdt-dev@xxxxxxxxxxx
> http://dev.eclipse.org/mailman/listinfo/cdt-dev
> 

