
RE: [cdt-dev] Indexer Requirements for 3.0


I'd prefer this type of contribution to indexing an include path regardless of whether or not it is used (or ever will be used).
I think this would be a good thing to keep in mind regarding the indexer extension point.

Cheers,
JohnC
www.eclipse.org/cdt


cdt-dev-admin@xxxxxxxxxxx wrote on 01/13/2005 08:07:04 PM:

> I agree with Chris ... quite the challenge.
>
> Just to add to what Chris has asked, would the combination of being
> able to index offline and "merge" index files allow for a set of
> system headers (which would be great) or project headers for highly
> re-used components to be pre-indexed and provided as "packages"?
>
> Thanks,
>  Thomas
>
> > -----Original Message-----
> > From: cdt-dev-admin@xxxxxxxxxxx
> > [mailto:cdt-dev-admin@xxxxxxxxxxx] On Behalf Of Chris Wiebe
> > Sent: January 13, 2005 4:47 PM
> > To: cdt-dev@xxxxxxxxxxx
> > Subject: Re: [cdt-dev] Indexer Requirements for 3.0
> >
> > Looks good, Bogdan.  Will this also include indexing of
> > external headers (from the project include paths)?
> >
> > Chris
> >
> >
> > Bogdan Gheorghe wrote:
> > >
> > > Here is a list of indexer requirements for the 3.0 release. The
> > > overall focus for the indexer for 3.0 is on improving the
> > > scalability and overall usability of the indexer service for all
> > > clients. Feedback is always appreciated - especially if you think
> > > there are missing requirements. A more detailed design doc is to
> > > follow.
> > >
> > > Thanks,
> > > Bogdan
> > >
> > >
> > >
> > >
> > > ------------------------------------------------------------------------
> > >
> > > Indexer Requirements for 3.0
> > > This document describes the proposed work items for the indexer
> > > for the CDT 3.0 release.
> > >
> > > Author         : Bogdan Gheorghe
> > > Revision Date  : 11/29/2004 - Version: 0.1.0
> > >                : 01/10/2005 - Version: 0.1.1
> > > Change History : 0.1.0 - Document Creation
> > >                : 0.1.1 - Revision
> > >
> > >
> > > Table of Contents
> > >
> > > 1. Introduction
> > > 2. Requirements
> > > 3. UI Requirements
> > > 4. References
> > >
> > >
> > > 1. Introduction
> > >
> > > The Indexer has been around since CDT 1.2 and currently provides
> > > support for Search, Navigation, and Refactoring. Its main purpose
> > > is to provide rapid access to a complete database of code elements
> > > and to manage this database in an efficient and non-intrusive
> > > manner. As the CDT has evolved, so has the indexer: adding more
> > > elements to the index, refining job scheduling, and providing
> > > feedback mechanisms for indexes. Although the indexer is
> > > sufficiently developed to provide most requested information to
> > > clients, it has become clear that the next step in the indexer's
> > > evolution will have to address its ability to handle very large
> > > projects efficiently.
> > >
> > > Having the indexer work well on large scale projects requires some
> > > new architecture to reduce the amount of time spent indexing as
> > > much as possible, reuse existing indexes as much as possible and
> > > provide users with mechanisms to extend the index framework.
> > >
> > > This document addresses the main requirements for the indexer in
> > > CDT 3.0.
> > >
> > >
> > >     1.0 Definitions
> > >
> > > Resource
> > >    A project, folder or file within the Eclipse workspace
> > > Index profiles
> > >    Separate indexes that are created for different configurations
> > >    of include paths/symbols
> > >
> > >
> > >
> > >     1.1 Current Architecture
> > >
> > > Quick overview of the indexing architecture:
> > >
> > >    1. The indexer responds to resource events from the workbench.
> > >       These events occur whenever a resource gets created, modified
> > >       or deleted.
> > >    2. The indexer will create index jobs based on the resource
> > >       events. These jobs might schedule other jobs (such as in the
> > >       case of indexing an entire project) but most index jobs
> > >       eventually boil down to an AddCompilationUnitToIndex job.
> > >    3. The indexer creates a new parser, passes in the current
> > >       include paths and symbol definitions and parses the file and
> > >       any other files included by the file in full parse mode (which
> > >       generates cross reference information).
> > >    4. The index gets created as the parser returns information about
> > >       the elements in the file - the index is stored in memory and
> > >       at certain intervals gets merged with the persisted index on
> > >       disk.
> > >
> > >
> > > Currently the indexes have the following structure:
> > >
> > >     * Index Version Number
> > >     * Summary Block Location
> > >     * File Blocks [1 ... N]:
> > >           o Each full file block is 8k
> > >           o File block entries associate the path of a file with a
> > >             unique ID
> > >     * Word Blocks [1 ... N]:
> > >           o Each full word block is 8k
> > >           o Word block entries encode the element in a special format
> > >             and add the referring file number
> > >     * Include Blocks [1 ... N]:
> > >           o Each full include block is 8k
> > >           o Include block entries associate a file with the initial
> > >             file that was parsed to get to the current file
> > >     * Summary Block:
> > >           o Keeps track of the total number of words, files and
> > >             include entries in the index
> > >           o Keeps track of the first file block number, the first
> > >             word block number and the first include block number
> > >           o Keeps track of the first file for every File Block, the
> > >             first word entry for every Word Block, and the first
> > >             include for every Include Block
> > >
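The layout described above can be pictured with a small in-memory
model. This is only a sketch under stated assumptions, not the CDT
on-disk format: the class name IndexLayoutSketch, its methods, and the
"type:Foo" entry encoding are invented for illustration, and "8k" is
assumed to mean 8192 bytes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of the index layout; not a CDT class. */
final class IndexLayoutSketch {
    // Assumption: "8k" in the message means 8192 bytes per full block.
    static final int BLOCK_SIZE = 8 * 1024;

    // File blocks: path of a file -> unique ID.
    private final Map<String, Integer> fileIds = new LinkedHashMap<>();
    // Word blocks: encoded element -> IDs of the files that refer to it.
    private final Map<String, List<Integer>> words = new LinkedHashMap<>();
    // Include blocks: file ID -> ID of the initial file whose parse reached it.
    private final Map<Integer, Integer> includedFrom = new LinkedHashMap<>();

    /** Returns the existing ID for a path, or assigns the next free one. */
    int fileId(String path) {
        Integer id = fileIds.get(path);
        if (id == null) {
            id = fileIds.size() + 1;
            fileIds.put(path, id);
        }
        return id;
    }

    /** Records that the given file refers to the encoded element. */
    void addWord(String encodedElement, String referringPath) {
        int id = fileId(referringPath);
        List<Integer> refs = words.computeIfAbsent(encodedElement, k -> new ArrayList<>());
        if (!refs.contains(id)) {
            refs.add(id);
        }
    }

    /** Records that includedPath was reached while parsing initialPath. */
    void addInclude(String includedPath, String initialPath) {
        int included = fileId(includedPath);
        int initial = fileId(initialPath);
        includedFrom.put(included, initial);
    }

    List<Integer> filesReferring(String encodedElement) {
        return words.getOrDefault(encodedElement, Collections.emptyList());
    }

    // Totals of the kind the summary block keeps track of.
    int fileCount() { return fileIds.size(); }
    int wordCount() { return words.size(); }
    int includeCount() { return includedFrom.size(); }
}
```

The three maps stand in for the file, word and include blocks; the
count methods mirror the totals kept in the summary block.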
> > >
> > >     1.2 Constraints
> > >
> > > Number  Description
> > > C1      Multiple indexer profiles not possible until similar build
> > >         configuration notion appears
> > >
> > > In order to get an accurate index the indexer depends on being
> > > able to pass on the relevant includes/symbols to the parser. As
> > > long as there is no standard representation for build
> > > configurations in the core model for all build types (both standard
> > > and managed), it isn't possible to enable index profiles.
> > >
> > > Bug 25682: Indexer Profiles
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=25682>
> > >
> > >
> > > 2. Requirements
> > >
> > > The requirements for the indexer are classified into the following
> > > categories:
> > >
> > >
> > >     2.1 Index Management Requirements
> > >
> > > Number  Priority  Description
> > > R1      P1        Indexer must provide different types of indexing
> > >                   services
> > >
> > > It has become apparent that the one-size-fits-all indexing approach
> > > does not meet all of our users' needs. Most clients with existing
> > > legacy projects want some form of search/navigation but some don't
> > > want the hassle of having to wait for an entire project to index
> > > fully before being able to use search/navigation. Thus, in order to
> > > accommodate both user groups (those who are willing to wait for a
> > > full index to complete and those who just want the absolute bare
> > > minimum index) we need to offer the following indexer options:
> > >
> > >    1. Full Index: this is the "regular" index mode which uses the
> > >       CDT parser (include paths and symbol definitions need to be
> > >       set up properly); everything gets indexed
> > >    2. Quick Index with no setup: this will result in a "best effort"
> > >       bare bones index that should enable some navigation/search
> > >
> > >
> > > These indexer options are per project - so it is possible to have
> > > different indexers for different projects.
> > >
> > > Bug 69078: C/C++ indexer too slow
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=69078>
> > >
> > > R2      P1        Indexes should be shareable between team members
> > >
> > > As part of streamlining the indexing process for large projects,
> > > indexes should be able to be shared between users working on the
> > > same project. All index entries should make use of path variables
> > > in order to allow indexes to be translated into different workspace
> > > locations. (See Scanner Config Correctness Enhancement FDS for more
> > > details about path variables)
> > >
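One way to picture path variables in index entries is a two-way
substitution: absolute paths are stored in a portable form and resolved
again in each user's workspace. The sketch below is an assumption, not
the mechanism from the Scanner Config FDS: the PathVariables class, the
${NAME} syntax and the variable name PROJECT are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical path-variable table; the ${NAME} syntax is an assumption. */
final class PathVariables {
    private final Map<String, String> locations = new LinkedHashMap<>();

    void define(String name, String location) {
        locations.put(name, location);
    }

    /** Replaces the longest matching location prefix with ${NAME}. */
    String toPortable(String absolutePath) {
        String bestName = null;
        String bestLocation = "";
        for (Map.Entry<String, String> e : locations.entrySet()) {
            String location = e.getValue();
            if (absolutePath.startsWith(location + "/")
                    && location.length() > bestLocation.length()) {
                bestName = e.getKey();
                bestLocation = location;
            }
        }
        if (bestName == null) {
            return absolutePath; // no variable applies; keep the path as-is
        }
        return "${" + bestName + "}" + absolutePath.substring(bestLocation.length());
    }

    /** Expands a leading ${NAME} using this workspace's definitions. */
    String resolve(String portablePath) {
        if (!portablePath.startsWith("${")) {
            return portablePath;
        }
        int end = portablePath.indexOf('}');
        if (end < 0) {
            return portablePath;
        }
        String location = locations.get(portablePath.substring(2, end));
        return location == null ? portablePath : location + portablePath.substring(end + 1);
    }
}
```

An index written with toPortable in one workspace could then be read
with resolve in another, as long as both define the same variable
names.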
> > > Bug 79661: All Index Entries should make use of Path Variables
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=79661>
> > > Bug 79518: Path/Variable Manager support service in the core (string
> > > substitution) <https://bugs.eclipse.org/bugs/show_bug.cgi?id=79518>
> > >
> > > R3      P1        Indexer should be able to index a project offline
> > >
> > > For features that require a complete index it would be ideal to
> > > have the index be created somewhere separately and imported in at a
> > > later date. This is especially true for medium to large projects
> > > that need a long time to index.
> > >
> > > Bug 74433: Offline Indexing/Index Hierarchy
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=74433>
> > >
> > > R4      P1        Indexer should be able to merge indexes
> > >
> > > With all of the new options for indexing, it is conceivable that
> > > any project might have several sources to look at for a single
> > > index. The indexer should be able to merge new index information
> > > into an existing index (through a user action), provided that both
> > > index formats are alike.
> > >
> > > Bug 52126: Indexer should maintain per project indices
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=52126>
> > > Bug 74433: Offline Indexing/Index Hierarchy
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=74433>
> > >
> > > R5      P2        Indexer should allow user to specify indexer
> > >                   settings
> > >
> > > Currently the indexer uses the same settings for all projects. This
> > > might not suit all users. Indexer settings that should be
> > > customizable include:
> > >
> > >     * Indexing policy: (default normal)
> > >           o normal (always up to date)
> > >           o manual (index only when manually requested)
> > >           o static (don't update current index)
> > >           o after build (don't index until build)
> > >
> > >     * Index Progress Bar displayed: (Checkbox, default displayed)
> > >
> > >     * Indexer default setting for new projects:
> > >           o Currently the indexer is always on when creating a new
> > >             project. This should be changed to allow users to set
> > >             the default index behaviour when creating a new project
> > >
> > >
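The four indexing policies listed above could be modelled as an enum
that decides whether a given trigger should start indexing. This is a
sketch only: IndexEvent and IndexingPolicy are hypothetical names, and
the exact reaction of each policy to each event is my reading of the
one-line descriptions, not something the requirements spell out.

```java
/** Hypothetical triggers that may cause an index request. */
enum IndexEvent { RESOURCE_CHANGED, MANUAL_REQUEST, BUILD_FINISHED }

/** Hypothetical model of the R5 indexing policies. */
enum IndexingPolicy {
    NORMAL,       // always up to date
    MANUAL,       // index only when manually requested
    STATIC,       // don't update the current index
    AFTER_BUILD;  // don't index until a build has happened

    /** Assumed mapping from policy and event to "start indexing now?". */
    boolean shouldIndex(IndexEvent event) {
        switch (this) {
            case NORMAL:
                return true;
            case MANUAL:
                return event == IndexEvent.MANUAL_REQUEST;
            case STATIC:
                return false;
            case AFTER_BUILD:
                return event == IndexEvent.BUILD_FINISHED;
            default:
                throw new AssertionError(this);
        }
    }
}
```

A per-project setting would then hold one IndexingPolicy value, with
NORMAL as the default named in the list.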
> > > Bug 75884: Allow C/C++ Indexing to be set on or off as a default
> > > setting <https://bugs.eclipse.org/bugs/show_bug.cgi?id=75884>
> > >
> > > R6      P1        Index Manager must improve job scheduling smarts
> > >
> > > The indexer currently has limited smarts when it comes to job
> > > scheduling; it only prevents the same job from being queued up
> > > twice. But there are other circumstances when being more aware of
> > > what's in the job queue would solve numerous problems, including
> > > the ever-recurring double index and source folder changes.
> > > Essentially, more information needs to be added to individual job
> > > requests about the event that created the job in order to enable
> > > the index manager to make smart choices about how to best schedule
> > > the jobs.
> > >
> > > The index manager should also run at the minimum priority.
> > >
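As an illustration of a queue that looks at what is already pending, the
sketch below drops a request when a queued job already covers it and
removes queued file jobs that a new whole-project job makes redundant.
IndexJob, IndexJobQueue and the "covers" rule are hypothetical; they
are not the CDT job classes, and the real scheduling rules may need
more information than project, file and cause.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical job request that remembers the event that created it. */
final class IndexJob {
    final String project;
    final String file;   // null means "index the whole project"
    final String cause;  // description of the triggering event

    IndexJob(String project, String file, String cause) {
        this.project = project;
        this.file = file;
        this.cause = cause;
    }

    /** True if running this job would also do the other job's work. */
    boolean covers(IndexJob other) {
        return project.equals(other.project)
                && (file == null || file.equals(other.file));
    }
}

/** Hypothetical queue that avoids redundant index work. */
final class IndexJobQueue {
    private final Deque<IndexJob> pending = new ArrayDeque<>();

    /** Returns false if an already queued job makes this one redundant. */
    boolean enqueue(IndexJob job) {
        for (IndexJob queued : pending) {
            if (queued.covers(job)) {
                return false;
            }
        }
        // A new whole-project job supersedes queued file jobs for that project.
        pending.removeIf(job::covers);
        pending.addLast(job);
        return true;
    }

    IndexJob next() { return pending.pollFirst(); }

    int size() { return pending.size(); }
}
```

With this rule, two project-wide requests raised by different events for
the same project collapse into one job, which is the double-index case
mentioned above.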
> > > Bug 60084: Indexer should reduce its priority when running in
> > > background <https://bugs.eclipse.org/bugs/show_bug.cgi?id=60084>
> > >
> > > R7      P1        Indexer must try to keep as much as possible from
> > >                   all failed index attempts
> > >
> > > The indexer needs to persist a list of all files that are to be
> > > indexed as part of an index job. As each merge happens, the list is
> > > updated. In the event of a crash, all work that has been merged to
> > > disk shall be considered as sane and the indexer will restart on
> > > the remaining files on the next startup.
> > >
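The persisted list in R7 might look roughly like the sketch below: the
pending files are written to a small text file, trimmed after every
merge, and reloaded on startup. PendingFileList and its method names
are hypothetical, and a plain one-path-per-line file is an assumption
about the storage format.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical persisted list of files still waiting to be indexed. */
final class PendingFileList {
    private final Path store;
    private final Set<String> pending = new LinkedHashSet<>();

    /** Loads whatever an earlier (possibly crashed) session left behind. */
    PendingFileList(Path store) {
        this.store = store;
        if (Files.exists(store)) {
            try {
                pending.addAll(Files.readAllLines(store, StandardCharsets.UTF_8));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Called when an index job is created for the given files. */
    void jobStarted(Collection<String> files) {
        pending.addAll(files);
        save();
    }

    /** Called after the in-memory index for these files was merged to disk. */
    void merged(Collection<String> files) {
        pending.removeAll(files);
        save();
    }

    /** Files the indexer should restart on. */
    List<String> remaining() {
        return new ArrayList<>(pending);
    }

    private void save() {
        try {
            Files.write(store, pending, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

After a crash, constructing a new PendingFileList on the same file
yields exactly the files whose results had not yet been merged.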
> > > Bug 62366: [Index] Need ability to read a partial index and resume
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=62366>
> > >
> > > R8      P1        Indexer needs to deal with new path/symbol
> > >                   addition/deletion gracefully
> > >
> > > The indexer currently reindexes the entire project on the
> > > addition/deletion of each new path/symbol. With the new per-file
> > > scanner settings, we should only index the range of files that are
> > > affected. (See Scanner Config Correctness Enhancement FDS for more
> > > details about per file scanner settings).
> > >
> > > R9      P1        Indexer needs to handle Source Folder changes
> > >                   gracefully
> > >
> > > The indexer currently does a brute-force reindex whenever anything
> > > changes in the source folder view. This needs to change to a more
> > > scalable solution.
> > >
> > > R10     P4        Resources can be added manually to the index
> > >
> > > Users can request a new index on any resource in the workspace
> > > (including files that are included from external directories) by
> > > selecting one or more files and choosing the appropriate option
> > > from the context menu.
> > >
> > > Bug 71821: Action on the indexer to parse file/folder
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=71821>
> > >
> > > R11     P2        FileType changes should trigger indexes
> > >
> > > Changing the file type settings on a project might introduce new
> > > file types as extensions that have not been indexed as of yet. The
> > > indexer should react in accordance with the user's settings (as
> > > defined in R5).
> > >
> > > Bug 72396: The indexer is not aware of the File Type (ResolverModel)
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=72396>
> > >
> > > R12     P1        Indexer Extension point
> > >
> > > If none of the current versions of the indexer meets the users'
> > > requirements, the CDT should provide an extension point that will
> > > allow users to write their own indexer that will populate the
> > > index. Provided the information is complete, all index-based CDT
> > > features should work as normal.
> > >
> > > R13     P4        Index Manager should allow for ongoing search
> > >                   while indexing
> > >
> > > The indexer should be able to compare incoming index entries with
> > > any pending search queries and return the matches.
> > >
> > > Bug 72803: [Performance][Usability][Indexer/Search] Waiting policy
> > > for index results
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=72803>
> > > Bug 53792: [Scalability] Prioritizing the indexing when searching a
> > > working set <https://bugs.eclipse.org/bugs/show_bug.cgi?id=53792>
> > >
> > > R14     P1        Index should provide an interface to allow clients
> > >                   to determine what features are available with the
> > >                   current indexes
> > >
> > > With the prospect of having various levels of detail available in a
> > > project's indexes, the Index Manager needs to be able to provide an
> > > interface that will answer any client whether there is sufficient
> > > information in the index to run the client's service. If the
> > > necessary index detail is missing, it would be up to the client to
> > > ask the user if they wish to schedule a new index.
> > >
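A capability query of the kind R14 asks for could be as small as the
sketch below. IndexFeature, IndexCapabilities and FixedCapabilities are
hypothetical names, and the assumption that a quick index supports
search and navigation but not refactoring is mine, loosely based on the
"some navigation/search" wording in R1.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical list of index-based features a client may ask about. */
enum IndexFeature { SEARCH, NAVIGATION, REFACTORING }

/** Hypothetical query interface offered to index clients. */
interface IndexCapabilities {
    boolean supports(IndexFeature feature);
}

/** Hypothetical implementation with a fixed feature set per index type. */
final class FixedCapabilities implements IndexCapabilities {
    private final Set<IndexFeature> available;

    private FixedCapabilities(Set<IndexFeature> available) {
        this.available = available;
    }

    /** Assumption: a full index carries enough detail for every feature. */
    static IndexCapabilities fullIndex() {
        return new FixedCapabilities(EnumSet.allOf(IndexFeature.class));
    }

    /** Assumption: a quick index only supports search and navigation. */
    static IndexCapabilities quickIndex() {
        return new FixedCapabilities(
                EnumSet.of(IndexFeature.SEARCH, IndexFeature.NAVIGATION));
    }

    @Override
    public boolean supports(IndexFeature feature) {
        return available.contains(feature);
    }
}
```

A client would call supports(...) before running its service and, on a
false result, ask the user whether a new index should be scheduled.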
> > >
> > >     2.2 Index Content Requirements
> > >
> > > Number  Priority  Description
> > > R15     P1        Indexer will provide enough information to run
> > >                   searches without a second parse
> > >
> > > Currently the index stores just the location information for the
> > > entries - this currently causes search to require two separate
> > > parses: one for the initial index and another to determine the
> > > offset information. The indexer should store this offset
> > > information in the initial index and thus avoid the second parse.
> > > Enough information must be available in the index to answer all of
> > > the possible MatchLocator queries. Additional info that needs to be
> > > added to the indexer includes:
> > >
> > >     * function/method parameters
> > >     * if a variable has an initializer clause, extern specifier or
> > >       linkage specification (needed to determine if it is a
> > >       definition)
> > >     * if a field is static (needed to see if we need to check for
> > >       definition)
> > >
> > >
> > > Bug 74427: Indexer needs to store more info
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=74427>
> > >
> > > R16     P3        References in the index should be tied into their
> > >                   declarations
> > >
> > > We need to be able to match up references to their declarations
> > > (possible requirement for refactoring support).
> > >
> > > Bug 69606: [Search] Match locator has to make sure that the
> > > reference belongs to the specified declaration
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=69606>
> > >
> > > R17     P1        New Indexer needs to be written for new AST
> > >
> > > As the CDT switches over to the new AST, the indexer will have to
> > > be rewritten to extract information from the AST.
> > >
> > >
> > >     2.3 Problem Markers
> > >
> > >
> > > Number  Priority  Description
> > > R18     P2        Problem Markers should be able to be removed
> > >                   manually
> > >
> > > There are a number of scenarios in which Index problem markers get
> > > placed on resources and cannot be removed. There needs to be some
> > > sort of menu option that can manually delete selected problem
> > > markers.
> > >
> > > Bug 74284: [IProblem] All Problem markers removed for project when
> > > one file is excluded
> > > <https://bugs.eclipse.org/bugs/show_bug.cgi?id=74284>
> > >
> > > 3. UI Requirements
> > >
> > >
> > >     3.1 UI enhancements
> > >
> > > The following UI enhancements are planned to support the feature:
> > >
> > > Number  Description
> > > UE1     Indexer Options Preference Page
> > >
> > > This page will allow users to make changes to the Indexer that
> > > affect the entire workspace:
> > >
> > >     * Index Progress Bar displayed
> > >     * Indexer New Project Default Setting
> > >
> > > UE2     Indexer Project Properties Page
> > >
> > > This page will allow users to set indexer settings per project:
> > >
> > >     * Indexer to use: if a number of indexers are available (e.g.
> > >       SourceIndexer, SourceIndexer2, CTagsIndexer), users can specify
> > >       which indexer should be used for the project.
> > >     * Indexing policy to use
> > >
> > >
> > >
> > > 4. References
> > >
> > >    1. Scanner Config Correctness Enhancement FDS
> > >
> > >
> > > /Last Modified on Monday, January 10, 2005 /
> > >
> > _______________________________________________
> > cdt-dev mailing list
> > cdt-dev@xxxxxxxxxxx
> > http://dev.eclipse.org/mailman/listinfo/cdt-dev
> >
