RE: [cdt-dev] Indexer Requirements for 3.0



cdt-dev-admin@xxxxxxxxxxx wrote on 01/13/2005 08:07:04 PM:

> I agree with Chris ... quite the challenge.
>
> Just to add to what Chris has asked, would the combination of being
> able to index offline and "merge" index files allow for a set of
> system headers (which would be great) or project headers for highly
> re-used components to be pre-indexed and provided as "packages"?
>


Yes - this is one of the driving factors behind offline indexing. Once the index is created, the user will be able to import it, resolve any path variables that differ in their particular environment, and have it merged into an already existing index.

I'm still figuring out the exact mechanics behind it - whether it should be a real merge, or a virtual merge where we store the link information for the contributed index and consult it as part of any index lookup. The advantage of the virtual merge is that the contributed index can be swapped with a new one at any time.
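
To make the two options concrete, here is a minimal sketch in Python. It is not CDT code: the `Index`, `VirtualMergedIndex`, `resolve` and `real_merge` names, the `${VAR}` placeholder syntax and the in-memory dictionaries are all assumptions made for illustration. A real merge copies the resolved entries into the target index, while a virtual merge only records a link and consults the contributed index at lookup time.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Index:
    """Toy index: maps an element name to the file paths that mention it."""
    entries: Dict[str, Set[str]] = field(default_factory=dict)

    def add(self, name: str, path: str) -> None:
        self.entries.setdefault(name, set()).add(path)

    def lookup(self, name: str) -> Set[str]:
        return set(self.entries.get(name, ()))


def resolve(path: str, variables: Dict[str, str]) -> str:
    """Expand ${VAR} placeholders (hypothetical syntax) with local values."""
    for var, value in variables.items():
        path = path.replace("${" + var + "}", value)
    return path


class VirtualMergedIndex:
    """Virtual merge: the project index plus links to contributed indexes."""

    def __init__(self, project: Index, variables: Dict[str, str]):
        self.project = project
        self.variables = variables
        self.linked: Dict[str, Index] = {}

    def link(self, label: str, contributed: Index) -> None:
        # Linking under an existing label swaps out the contributed index.
        self.linked[label] = contributed

    def lookup(self, name: str) -> Set[str]:
        # Consult the project index first, then every linked index.
        hits = self.project.lookup(name)
        for contributed in self.linked.values():
            hits |= {resolve(p, self.variables) for p in contributed.lookup(name)}
        return hits


def real_merge(target: Index, contributed: Index, variables: Dict[str, str]) -> None:
    """Real merge: copy the resolved entries into the target index."""
    for name, paths in contributed.entries.items():
        for p in paths:
            target.add(name, resolve(p, variables))
```

In this sketch, replacing a contributed index under the virtual merge is a single `link` call with the same label, whereas undoing a real merge would require tracking which entries came from which source.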

> Thanks,
>  Thomas
>
> > -----Original Message-----
> > From: cdt-dev-admin@xxxxxxxxxxx
> > [mailto:cdt-dev-admin@xxxxxxxxxxx] On Behalf Of Chris Wiebe
> > Sent: January 13, 2005 4:47 PM
> > To: cdt-dev@xxxxxxxxxxx
> > Subject: Re: [cdt-dev] Indexer Requirements for 3.0
> >
> > Looks good, Bogdan.  Will this also include indexing of
> > external headers (from the project include paths)?
> >


I'm not sure if indexing non-referenced headers from a project include path as part of an overall project index is a good idea - but I understand what you're trying to do. There is an item in the requirements (R10) for being able to manually add resources to the index; alternatively, we could allow users to pass the offline indexer a list of folders they wish to have indexed.
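
As a rough illustration of the folder-list idea, the sketch below gathers the header files found under a set of user-supplied folders. It is a Python sketch only: the `collect_headers` name and the list of header suffixes are assumptions, and no offline indexer interface is implied.

```python
import os
from typing import Iterable, List

# Assumed header suffixes, chosen for illustration only.
HEADER_SUFFIXES = (".h", ".hh", ".hpp", ".hxx")


def collect_headers(folders: Iterable[str]) -> List[str]:
    """Return sorted, de-duplicated header paths found under the folders."""
    found = set()
    for folder in folders:
        for root, _dirs, files in os.walk(folder):
            for name in files:
                if name.lower().endswith(HEADER_SUFFIXES):
                    found.add(os.path.normpath(os.path.join(root, name)))
    return sorted(found)
```

Overlapping folders are harmless here because paths are collected in a set before being sorted.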

> > Chris
> >
> >
> > Bogdan Gheorghe wrote:
> > >
> > > Here is a list of indexer requirements for the 3.0 release. The
> > > overall focus for the indexer for 3.0 is on improving the
> > > scalability and overall usability of the indexer service for all
> > > clients. Feedback is always appreciated - especially if you think
> > > there are missing requirements. A more detailed design doc is to
> > > follow.
> > >
> > > Thanks,
> > > Bogdan
> > >
> > >
> > >
> > >
> > > ------------------------------------------------------------------------
> > >
> > > Indexer Requirements for 3.0
> > > This document describes the proposed work items for the indexer
> > > for the CDT 3.0 release.
> > >
> > > Author         : Bogdan Gheorghe
> > > Revision Date  : 11/29/2004 - Version: 0.1.0
> > >                : 01/10/2005 - Version: 0.1.1
> > > Change History : 0.1.0 - Document Creation
> > >                : 0.1.1 - Revision
> > >
> > >
> > > Table of Contents
> > >
> > > 1. Introduction <#intro>
> > > 2. Requirements <#reqs>
> > > 3. UI Requirements <#proposal>
> > > 4. References <#references>
> > >
> > >
> > > 1. Introduction
> > >
> > > The Indexer has been around since CDT 1.2 and currently provides
> > > support for Search, Navigation, and Refactoring. Its main purpose
> > > is to provide rapid access to a complete database of code elements
> > > and to manage this database in an efficient and non-intrusive
> > > manner. As the CDT has evolved, so has the indexer - adding more
> > > elements to the index, refining job scheduling, and providing
> > > feedback mechanisms for indexes. Although the indexer is
> > > sufficiently developed to provide most requested information to
> > > clients, it has become clear that the next step in the indexer's
> > > evolution will have to address its ability to handle very large
> > > projects efficiently.
> > >
> > > Having the indexer work well on large scale projects requires some
> > > new architecture to reduce the amount of time spent indexing as
> > > much as possible, reuse existing indexes as much as possible, and
> > > provide users with mechanisms to extend the index framework.
> > >
> > > This document addresses the main requirements on the indexer for
> > > CDT 3.0.
> > >
> > >
> > >     1.0 Definitions
> > >
> > > Resource
> > >    A project, folder or file within the Eclipse workspace
> > > Index profiles
> > >    Separate indexes that are created for different configurations
> > >    of include paths/symbols
> > >
> > >
> > >
> > >     1.1 Current Architecture
> > >
> > > Quick overview of the indexing architecture:
> > >
> > >    1. The indexer responds to resource events from the workbench.
> > >       These events occur whenever a resource gets created, modified
> > >       or deleted.
> > >    2. The indexer will create index jobs based on the resource
> > >       events. These jobs might schedule other jobs (such as in the
> > >       case of indexing an entire project) but most index jobs
> > >       eventually boil down to an AddCompilationUnitToIndex job.
> > >    3. The indexer creates a new parser, passes in the current
> > >       include paths and symbol definitions and parses the file and
> > >       any other files included by the file in full parse mode (which
> > >       generates cross reference information).
> > >    4. The index gets created as the parser returns information about
> > >       the elements in the file - the index is stored in memory and
> > >       at certain intervals gets merged with the persisted index on
> > >       disk.
> > >
> > >
> > > Currently the indexes have the following structure:
> > >
> > >     * Index Version Number
> > >     * Summary Block Location
> > >     * File Blocks [1 ... N]:
> > >           o Each full file block is 8k
> > >           o File block entries associate the path of a file with a
> > >             unique ID
> > >     * Word Blocks [1 ... N]:
> > >           o Each full word block is 8k
> > >           o Word block entries encode the element in a special
> > >             format and add the referring file number
> > >     * Include Blocks [1 ... N]:
> > >           o Each full include block is 8k
> > >           o Include block entries associate a file with the initial
> > >             file that was parsed to get to the current file
> > >     * Summary Block:
> > >           o Keeps track of the total number of words, files and
> > >             include entries in the index
> > >           o Keeps track of the first file block number, the first
> > >             word block number and the first include block number
> > >           o Keeps track of the first file for every File Block, the
> > >             first word entry for every Word Block, and the first
> > >             include for every Include Block
> > >
> > >
> > >     1.2 Constraints
> > >
> > > Number
> > >    Description
> > > C1
> > >    Multiple indexer profiles not possible until similar build
> > >    configuration notion appears
> > >
> > > In order to get an accurate index the indexer depends on being able
> > > to pass on the relevant includes/symbols to the parser. As long as
> > > there is no standard representation for build configurations in the
> > > core model for all build types (both standard and managed), it
> > > isn't possible to enable index profiles.
> > >
> > > Bug 25682: Indexer Profiles
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=25682>
> > >
> > >
> > > 2. Requirements
> > >
> > > The requirements for the indexer are classified into the following
> > > categories:
> > >
> > >
> > >     2.1 Index Management Requirements
> > >
> > > Number
> > >    Priority
> > >    Description
> > > R1
> > >    P1
> > >    Indexer must provide different types of indexing services
> > >
> > > It has become apparent that the one-size-fits-all indexing approach
> > > does not meet all of our users' needs. Most clients with existing
> > > legacy projects want some form of search/navigation but some don't
> > > want the hassle of having to wait for an entire project to index
> > > fully before being able to use search/navigation. Thus, in order to
> > > accommodate both groups of users (those who are willing to wait
> > > for a full index to complete and those who just want the absolute
> > > bare minimum index) we need to offer the following indexer options:
> > >
> > >    1. Full Index: this is the "regular" index mode which uses the
> > >       CDT parser (include paths and symbol definitions need to be
> > >       set up properly); everything gets indexed
> > >    2. Quick Index with no setup: this will result in a "best effort"
> > >       bare bones index that should enable some navigation/search
> > >
> > >
> > > These indexer options are per project - so it is possible to have
> > > different indexers for different projects.
> > >
> > > Bug 69078: C/C++ indexer too slow
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=69078>
> > > R2
> > >    P1
> > >    Indexes should be shareable between team members
> > >
> > > As part of streamlining the indexing process for large projects,
> > > indexes should be shareable between users working on the same
> > > project. All index entries should make use of path variables in
> > > order to allow indexes to be translated into different workspace
> > > locations. (See Scanner Config Correctness Enhancement FDS for more
> > > details about path variables.)
> > >
> > > Bug 79661: All Index Entries should make use of Path Variables
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=79661>
> > > Bug 79518: Path/Variable Manager support service in the core(string
> > > substitution) <https://bugs.eclipse.org/bugs/show_bug.cgi?id=79518>
> > > R3
> > >    P1
> > >    Indexer should be able to index a project offline
> > >
> > > For features that require a complete index it would be ideal to
> > > have the index created somewhere separately and imported at a
> > > later date. This is especially true for medium to large projects
> > > that need a long time to index.
> > >
> > > Bug 74433: Offline Indexing/Index Hierarchy
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=74433>
> > > R4
> > >    P1
> > >    Indexer should be able to merge indexes
> > >
> > > With all of the new options for indexing, it is conceivable that
> > > any project might have several sources to look at for a single
> > > index. The indexer should be able to merge new index information
> > > into an existing index (through a user action), provided that both
> > > index formats are alike.
> > >
> > > Bug 52126: Indexer should maintain per project indices
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=52126>
> > > Bug 74433: Offline Indexing/Index Hierarchy
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=74433>
> > > R5
> > >    P2
> > >    Indexer should allow user to specify indexer settings
> > >
> > > Currently the indexer uses the same settings for all projects. This
> > > might not suit all users. Indexer settings that should be
> > > customizable include:
> > >
> > >     * Indexing policy: (default normal)
> > >           o normal (always up to date)
> > >           o manual (index only when manually requested)
> > >           o static (don't update current index)
> > >           o after build (don't index until build)
> > >
> > >     * Index Progress Bar displayed: (Checkbox, default displayed)
> > >
> > >     * Indexer default setting for new projects:
> > >           o Currently the indexer is always on when creating a new
> > >             project. This should be changed to allow users to set
> > >             the default index behaviour when creating a new project
> > >
> > >
> > > Bug 75884: Allow C/C++ Indexing to be set on or off as a default
> > > setting <https://bugs.eclipse.org/bugs/show_bug.cgi?id=75884>
> > > R6
> > >    P1
> > >    Index Manager must improve job scheduling smarts
> > >
> > > The indexer currently has limited smarts when it comes to job
> > > scheduling; it will prevent the same job from being queued up
> > > twice. But there are other circumstances when being more aware of
> > > what's in the job queue would solve numerous problems, including
> > > the ever-recurring double index and source folder changes.
> > > Essentially, more information needs to be added to individual job
> > > requests about the event that created the job in order to enable
> > > the index manager to make smart choices about how to best schedule
> > > the jobs.
> > >
> > > The index manager should also run at the minimum priority.
> > >
> > > Bug 60084: Indexer should reduce its priority when running in
> > > background <https://bugs.eclipse.org/bugs/show_bug.cgi?id=60084>
> > > R7
> > >    P1
> > >    Indexer must try to keep as much as possible from all failed
> > >    index attempts
> > >
> > > The indexer needs to persist a list of all files that are to be
> > > indexed as part of an index job. As each merge happens, the list is
> > > updated. In the event of a crash, all work that has been merged to
> > > disk shall be considered sane and the indexer will restart on the
> > > remaining files on the next startup.
> > >
> > > Bug 62366: [Index] Need ability to read a partial index and resume
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=62366>
> > > R8
> > >    P1
> > >    Indexer needs to deal with new path/symbol addition/deletion
> > >    gracefully
> > >
> > > The indexer currently reindexes the entire project on the
> > > addition/deletion of each new path/symbol. With the new per-file
> > > scanner settings, we should only index the range of files that are
> > > affected. (See Scanner Config Correctness Enhancement FDS for more
> > > details about per-file scanner settings.)
> > >
> > > R9
> > >    P1
> > >    Indexer needs to handle Source Folder changes gracefully
> > >
> > > The indexer currently does a brute force reindex whenever anything
> > > changes in the source folder view. This needs to change to a more
> > > scalable solution.
> > >
> > > R10
> > >    P4
> > >    Resources can be added manually to the index
> > >
> > > Users can request a new index on any resource in the workspace
> > > (including files that are included from external directories) by
> > > selecting one or more files and choosing the appropriate option
> > > from the context menu.
> > >
> > > Bug 71821: Action on the indexer to parse file/folder
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=71821>
> > > R11
> > >    P2
> > >    FileType changes should trigger indexes
> > >
> > > Changing the file type settings on a project might introduce new
> > > file types as extensions that have not been indexed as of yet. The
> > > indexer should react in accordance with the user's settings (as
> > > defined in R5).
> > >
> > > Bug 72396: The indexer is not aware of the File Type (ResolverModel)
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=72396>
> > > R12
> > >    P1
> > >    Indexer Extension point
> > >
> > > If none of the current versions of the indexer meets the users'
> > > requirements, the CDT should provide an extension point that will
> > > allow users to write their own indexer that will populate the
> > > index. Provided the information is complete, all index-based CDT
> > > features should work as normal.
> > >
> > > R13
> > >    P4
> > >    Index Manager should allow for ongoing search while indexing
> > >
> > > The indexer should be able to compare incoming index entries with
> > > any pending search queries and return the matches.
> > >
> > > Bug 72803: [Performance][Usability][Indexer/Search] Waiting policy
> > > for index results <https://bugs.eclipse.org/bugs/show_bug.cgi?id=72803>
> > > Bug 53792: [Scalability] Prioritizing the indexing when searching a
> > > working set <https://bugs.eclipse.org/bugs/show_bug.cgi?id=53792>
> > > R14
> > >    P1
> > >    Index should provide an interface to allow clients to determine
> > >    what features are available with the current indexes
> > >
> > > With the prospect of having various levels of detail available in a
> > > project's indexes, the Index Manager needs to be able to provide an
> > > interface that will tell any client whether there is sufficient
> > > information in the index to run the client's service. If the
> > > necessary index detail is missing, it would be up to the client to
> > > ask the user if they wish to schedule a new index.
> > >
> > >
> > >     2.2 Index Content Requirements
> > >
> > > Number
> > >    Priority
> > >    Description
> > > R15
> > >    P1
> > >    Indexer will provide enough information to run searches without a
> > >    second parse
> > >
> > > Currently the index stores just the location information for the
> > > entries - this causes search to require two separate parses: one
> > > for the initial index and another to determine the offset
> > > information. The indexer should store this offset information in
> > > the initial index and thus avoid the second parse. Enough
> > > information must be available in the index to answer all of the
> > > possible MatchLocator queries. Additional info that needs to be
> > > added to the index includes:
> > >
> > >     * function/method parameters
> > >     * if a variable has an initializer clause, extern specifier or
> > >       linkage specification (needed to determine if it is a
> > >       definition)
> > >     * if a field is static (needed to see if we need to check for
> > >       definition)
> > >
> > >
> > > Bug 74427: Indexer needs to store more info
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=74427>
> > > R16
> > >    P3
> > >    References in the index should be tied into their declarations
> > >
> > > We need to be able to match up references to their declarations
> > > (possible requirement for refactoring support).
> > >
> > > Bug 69606: [Search] Match locator has to make sure that the
> > > reference belongs to the specified declaration
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=69606>
> > > R17
> > >    P1
> > >    New Indexer needs to be written for new AST
> > >
> > > As the CDT switches over to the new AST, the indexer will have to
> > > be rewritten to extract information from the AST.
> > >
> > >
> > >     2.3 Problem Markers
> > >
> > >
> > > Number
> > >    Priority
> > >    Description
> > > R18
> > >    P2
> > >    Problem Markers should be able to be removed manually
> > >
> > > There are a number of scenarios in which Index problem markers get
> > > placed on resources and cannot be removed. There needs to be some
> > > sort of menu option that can manually delete selected problem
> > > markers.
> > >
> > > Bug 74284: [IProblem] All Problem markers removed for project when
> > > one file is excluded
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=74284>
> > >
> > > 3. UI Requirements
> > >
> > >
> > >     3.1 UI enhancements
> > >
> > > The following UI enhancements are planned to support the feature:
> > >
> > > Number
> > >    Description
> > > UE1
> > >    Indexer Options Preference Page
> > >
> > > This page will allow users to make changes to the Indexer that
> > > affect the entire workspace:
> > >
> > >     * Index Progress Bar displayed
> > >     * Indexer New Project Default Setting
> > >
> > > UE2
> > >    Indexer Project Properties Page
> > >
> > > This page will allow users to set indexer settings per project:
> > >
> > >     * Indexer to use: if a number of indexers are available (e.g.
> > >       SourceIndexer, SourceIndexer2, CTagsIndexer), users can
> > >       specify which indexer should be used for the project.
> > >     * Indexing policy to use
> > >
> > >
> > >
> > > 4. References
> > >
> > >    1. Scanner Config Correctness Enhancement FDS
> > >
> > >
> > > /Last Modified on Monday, January 10, 2005 /
> > >
> > _______________________________________________
> > cdt-dev mailing list
> > cdt-dev@xxxxxxxxxxx
> > http://dev.eclipse.org/mailman/listinfo/cdt-dev
> >