Re: [cdt-dev] Indexer Requirements for 3.0

From: Ed Warnicke <eaw@xxxxxxxxx>
Date: Fri, 14 Jan 2005 09:41:54 -0600
Delivered-to: cdt-dev@xxxxxxxxxxx
List-archive: <http://dev.eclipse.org/pipermail/cdt-dev/>
List-help: <mailto:cdt-dev-request@eclipse.org?subject=help>
List-subscribe: <http://dev.eclipse.org/mailman/listinfo/cdt-dev>, <mailto:cdt-dev-request@eclipse.org?subject=subscribe>
List-unsubscribe: <http://dev.eclipse.org/mailman/listinfo/cdt-dev>, <mailto:cdt-dev-request@eclipse.org?subject=unsubscribe>
User-agent: Mozilla Thunderbird 0.9 (X11/20041124)

One other thing to keep in mind as you do this is that
there should be an easy way to import
default settings...

If I'm distributing CDT internally to a user population,
I have a pretty good idea of what the best default
settings should be so they don't have to think about
it.  This doesn't mean they shouldn't be able to
tweak it themselves (I may be wrong for their particular
case).  But the average case that I optimize default settings
for, for one of my user populations, may differ from the
average general case that CDT should be optimizing its
default settings for.

Just a thought,
Ed
Douglas Schaefer wrote:

Up until recently, I've hoped that we could keep the index and the indexer somewhat hidden, or at least out of the user's face. I guess that way, things like search just worked and the user didn't need to think about how the CDT got that information.
It is painfully clear now that this type of magic is just going to be hard to accomplish well for everybody. As such, I think it is time to start raising the profile of the indexer and allow the user to customize its behavior. For example, for large projects, allow the user to select a ctags-based index with the knowledge that this index will not contain cross references from code bodies and may be a little inaccurate. For users that want accurate information, we need to explicitly have the UI and automation components that can help them get all of the information that the CDT parser needs. Bogdan is working on an extension point that will allow us to plug in various indexer contributions that should start us down that path.
Doug Schaefer
Ottawa Lab, IBM Rational Software Division
John Camelon/Ottawa/IBM@IBMCA
Sent by: cdt-dev-admin@xxxxxxxxxxx
01/14/2005 09:01 AM
Please respond to: cdt-dev
To: cdt-dev@xxxxxxxxxxx
cc:
Subject: RE: [cdt-dev] Indexer Requirements for 3.0

I'd prefer this type of contribution to indexing an include path regardless of whether or not it is used (or ever will be used). I think this would be a good thing to keep in mind regarding the indexer extension point.
Cheers,
JohnC
www.eclipse.org/cdt
cdt-dev-admin@xxxxxxxxxxx wrote on 01/13/2005 08:07:04 PM:
I agree with Chris ... quite the challenge.

Just to add to what Chris has asked, would the combination of being
able to index offline and "merge" index files allow for a set of
system headers (which would be great) or project headers for highly
re-used components to be pre-indexed and provided as "packages"?
Thanks,
Thomas
-----Original Message-----
From: cdt-dev-admin@xxxxxxxxxxx [mailto:cdt-dev-admin@xxxxxxxxxxx] On Behalf Of Chris Wiebe
Sent: January 13, 2005 4:47 PM
To: cdt-dev@xxxxxxxxxxx
Subject: Re: [cdt-dev] Indexer Requirements for 3.0
Looks good, Bogdan. Will this also include indexing of external headers (from the project include paths)?
Chris


Bogdan Gheorghe wrote:
Here is a list of indexer requirements for the 3.0 release. The
overall focus for the indexer for 3.0 is on improving the scalability
and overall usability of the indexer service for all clients. Feedback
is always appreciated - especially if you think there are missing
requirements. A more detailed design doc is to follow.
Thanks,
Bogdan
------------------------------------------------------------------------

Indexer Requirements for 3.0
This document describes the proposed work items for the indexer for the
CDT 3.0 release.

Author         : Bogdan Gheorghe
Revision Date  : 11/29/2004 - Version: 0.1.0
               : 01/10/2005 - Version: 0.1.1
Change History : 0.1.0 - Document Creation
               : 0.1.1 - Revision

Table of Contents

1. Introduction
2. Requirements
3. UI Requirements
4. References


1. Introduction
The Indexer has been around since CDT 1.2 and currently provides
support for Search, Navigation, and Refactoring. Its main purpose is
to provide rapid access to a complete database of code elements and to
manage this database in an efficient and non-intrusive manner. As the
CDT has evolved, so has the indexer: adding more elements to the
index, refining job scheduling, providing feedback mechanisms for
indexes.

Although the indexer is sufficiently developed to provide most
requested information to clients, it has become clear that the next
step in the indexer's evolution will have to address its ability to
handle very large projects efficiently.

Having the indexer work well on large scale projects requires some new
architecture to reduce the amount of time spent indexing as much as
possible, reuse existing indexes as much as possible and provide users
with mechanisms to extend the index framework.

This document will address the main requirements on the indexer for
CDT 3.0.

   1.0 Definitions

Resource
    A project, folder or file within the Eclipse workspace
Index profiles
    Separate indexes that are created for different configurations of
    include paths/symbols


   1.1 Current Architecture

Quick overview of the indexing architecture:
  1. The indexer responds to resource events from the workbench. These
     events occur whenever a resource gets created, modified or deleted.
  2. The indexer will create index jobs based on the resource events.
     These jobs might schedule other jobs (such as in the case of
     indexing an entire project) but most index jobs eventually boil
     down to an AddCompilationUnitToIndex job.
  3. The indexer creates a new parser, passes in the current include
     paths and symbol definitions and parses the file and any other
     files included by the file in full parse mode (which generates
     cross reference information).
  4. The index gets created as the parser returns information about the
     elements in the file - the index is stored in memory and at
     certain intervals gets merged with the persisted index on disk.

Currently the indexes have the following structure:

   * Index Version Number
   * Summary Block Location
   * File Blocks [1 ... N]:
         o Each full file block is 8k
         o File block entries associate the path of a file with a
           unique ID
   * Word Blocks [1 ... N]:
         o Each full word block is 8k
         o Word block entries encode the element in a special format
           and add the referring file number
   * Include Blocks [1 ... N]:
         o Each full include block is 8k
         o Include block entries associate a file with the initial file
           that was parsed to get to the current file
   * Summary Block:
         o Keeps track of the total number of words, files and include
           entries in the index
         o Keeps track of the first file block number, the first word
           block number and the first include block number
         o Keeps track of the first file for every File Block, the
           first word entry for every Word Block, and the first include
           for every Include Block
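
As a reading aid, the file and word entries listed above can be modelled in memory roughly as follows. This is only a sketch: the class name IndexLayoutSketch, its methods, the ID numbering starting at 1 and the "fn/main" style element strings are all hypothetical, and nothing here reads or writes the actual CDT index file format.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory model of the layout listed above; it does not
// read or write the real CDT index file format.
final class IndexLayoutSketch {
    // "Each full ... block is 8k" according to the list above.
    static final int BLOCK_SIZE_BYTES = 8 * 1024;

    // File block entries: path of a file <-> unique ID.
    // IDs starting at 1 is an assumption made for this sketch.
    private final Map<String, Integer> idByPath = new HashMap<>();
    private final List<String> pathById = new ArrayList<>();

    // Word block entries: encoded element -> numbers of the referring files.
    private final Map<String, SortedSet<Integer>> filesByWord = new TreeMap<>();

    // Returns the existing ID for a path, or assigns the next free one.
    int addFile(String path) {
        Integer id = idByPath.get(path);
        if (id == null) {
            pathById.add(path);
            id = pathById.size();
            idByPath.put(path, id);
        }
        return id;
    }

    // Records that the given file refers to the given encoded element.
    void addWord(String encodedElement, String referringPath) {
        filesByWord.computeIfAbsent(encodedElement, k -> new TreeSet<Integer>())
                   .add(addFile(referringPath));
    }

    // Paths of all files that refer to the element, in file-ID order.
    List<String> filesReferencing(String encodedElement) {
        List<String> result = new ArrayList<>();
        for (int id : filesByWord.getOrDefault(encodedElement, new TreeSet<Integer>())) {
            result.add(pathById.get(id - 1));
        }
        return result;
    }

    // Totals of the kind the summary block keeps track of.
    int fileCount() { return pathById.size(); }
    int wordCount() { return filesByWord.size(); }
}
```

The point of the model is the indirection: word entries carry only file numbers, so a file path is stored once no matter how many elements refer to it.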


   1.2 Constraints

Number  Description
C1      Multiple indexer profiles not possible until similar build
        configuration notion appears

In order to get an accurate index the indexer depends on being able to
pass on the relevant includes/symbols to the parser. As long as there
is no standard representation for build configurations in the core
model for all build types (both standard and managed), it isn't
possible to enable index profiles.

Bug 25682: Indexer Profiles
<https://bugs.eclipse.org/bugs/show_bug.cgi?id=25682>


2. Requirements

The requirements for the indexer are classified into the following
categories:


   2.1 Index Management Requirements

Number  Priority  Description
R1      P1        Indexer must provide different types of indexing services

It has become apparent that the one-size-fits-all indexing approach
does not meet all of our users' needs. Most clients with existing
legacy projects want some form of search/navigation, but some don't
want the hassle of having to wait for an entire project to index fully
before being able to use search/navigation. Thus, in order to
accommodate both groups of users (those who are willing to wait
for a full index to complete and those who just want the absolute bare
minimum index) we need to offer the following indexer options:

  1. Full Index: this is the "regular" index mode which uses the CDT
     parser (include paths and symbol definitions need to be set up
     properly); everything gets indexed
  2. Quick Index with no setup: this will result in a "best effort"
     bare bones index that should enable some navigation/search

These indexer options are per project, so it is possible to have
different indexers for different projects.

Bug 69078: C/C++ indexer too slow
<https://bugs.eclipse.org/bugs/show_bug.cgi?id=69078>

R2      P1        Indexes should be shareable between team members

As part of streamlining the indexing process for large projects,
indexes should be able to be shared between users working on the same
project.

All index entries should make use of path variables in order to allow
indexes to be translated into different workspace locations. (See
Scanner Config Correctness Enhancement FDS for more details about path
variables.)

Bug 79661: All Index Entries should make use of Path Variables
<https://bugs.eclipse.org/bugs/show_bug.cgi?id=79661>
Bug 79518: Path/Variable Manager support service in the core (string
substitution) <https://bugs.eclipse.org/bugs/show_bug.cgi?id=79518>
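
To illustrate why path variables make an index relocatable, here is a minimal sketch of the string substitution R2 relies on. The class PathVariableSketch, the "${NAME}" token syntax and the variable name PROJECT_ROOT are assumptions made for the example; the actual mechanism is the subject of bug 79518 and the Scanner Config FDS, not of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: stores index paths relative to named variables so
// that another user can resolve them against a different location.
final class PathVariableSketch {
    // Variable name -> absolute root on this machine.
    private final Map<String, String> roots = new LinkedHashMap<>();

    void define(String name, String absoluteRoot) {
        roots.put(name, absoluteRoot);
    }

    // Replaces a known absolute root with a "${NAME}" token (assumed syntax).
    String toPortable(String absolutePath) {
        for (Map.Entry<String, String> e : roots.entrySet()) {
            String root = e.getValue();
            if (absolutePath.equals(root) || absolutePath.startsWith(root + "/")) {
                return "${" + e.getKey() + "}" + absolutePath.substring(root.length());
            }
        }
        return absolutePath; // no variable applies; keep the path as-is
    }

    // Expands a leading "${NAME}" token using this machine's definitions.
    String resolve(String portablePath) {
        for (Map.Entry<String, String> e : roots.entrySet()) {
            String token = "${" + e.getKey() + "}";
            if (portablePath.startsWith(token)) {
                return e.getValue() + portablePath.substring(token.length());
            }
        }
        return portablePath;
    }
}
```

One user would store "${PROJECT_ROOT}/src/a.c" in the shared index, and a teammate with a different PROJECT_ROOT definition resolves the same entry to their own checkout.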

R3      P1        Indexer should be able to index a project offline

For features that require a complete index it would be ideal to have
the index be created somewhere separately and imported at a later
date. This is especially true for medium to large projects that need a
long time to index.

Bug 74433: Offline Indexing/Index Hierarchy
<https://bugs.eclipse.org/bugs/show_bug.cgi?id=74433>

R4      P1        Indexer should be able to merge indexes

With all of the new options for indexing, it is conceivable that any
project might have several sources to look at for a single index. The
indexer should be able to merge new index information into an existing
index (through a user action), provided that both index formats are
alike.

Bug 52126: Indexer should maintain per project indices
<https://bugs.eclipse.org/bugs/show_bug.cgi?id=52126>
Bug 74433: Offline Indexing/Index Hierarchy
<https://bugs.eclipse.org/bugs/show_bug.cgi?id=74433>

R5      P2        Indexer should allow user to specify indexer settings

Currently the indexer uses the same settings for all projects. This
might not suit all users. Indexer settings that should be customizable
include:

   * Indexing policy: (default normal)
         o normal (always up to date)
         o manual (index only when manually requested)
         o static (don't update current index)
         o after build (don't index until build)

   * Index Progress Bar displayed: (Checkbox, default displayed)

   * Indexer default setting for new projects:
         o Currently the indexer is always on when creating a new
           project. This should be changed to allow users to set the
           default index behaviour when creating a new project

Bug 75884: Allow C/C++ Indexing to be set on or off as a default
setting <https://bugs.eclipse.org/bugs/show_bug.cgi?id=75884>
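
One possible reading of the four indexing policies in R5, written as a decision function. The names IndexingPolicy, IndexTrigger and shouldIndex are hypothetical, and the mapping of triggers to policies is an interpretation of the one-line descriptions above rather than specified behaviour (for instance, whether "after build" also honours manual requests is not stated).

```java
// Hypothetical names; the four constants mirror the policies listed in R5.
enum IndexingPolicy { NORMAL, MANUAL, STATIC, AFTER_BUILD }

// Hypothetical set of events that could ask for an index update.
enum IndexTrigger { RESOURCE_CHANGED, USER_REQUEST, BUILD_FINISHED }

final class IndexingPolicySketch {
    // Decides whether a trigger should start indexing under a policy.
    // This mapping is an assumption, not documented CDT behaviour.
    static boolean shouldIndex(IndexingPolicy policy, IndexTrigger trigger) {
        switch (policy) {
            case NORMAL:
                return true; // always up to date: react to every trigger
            case MANUAL:
                return trigger == IndexTrigger.USER_REQUEST;
            case STATIC:
                return false; // never update the current index
            case AFTER_BUILD:
                return trigger == IndexTrigger.BUILD_FINISHED;
            default:
                throw new IllegalArgumentException("unknown policy: " + policy);
        }
    }
}
```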

R6      P1        Index Manager must improve job scheduling smarts

The indexer currently has limited smarts when it comes to job
scheduling; it will prevent the same job from being queued up. But
there are other circumstances when being more aware of what's in the
job queue would solve numerous problems, including the ever recurring
double index, and source folder changes. Essentially, more information
needs to be added to individual job requests about the event that
created the job in order to enable the index manager to make smart
choices about how to best schedule the jobs.

The index manager should also run at the minimum priority.

Bug 60084: Indexer should reduce its priority when running in
background <https://bugs.eclipse.org/bugs/show_bug.cgi?id=60084>
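
As a sketch of what "more aware of what's in the job queue" could mean, the queue below rejects a request that an already queued request covers, and drops queued requests that a broader new request makes redundant. IndexRequest, IndexQueueSketch and the rule that a request on a folder covers everything beneath it are assumptions for illustration only; the real index manager's job classes are not shown in this document.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical job request that records which resource to index and the
// event that caused it, as R6 suggests.
final class IndexRequest {
    final String resourcePath;
    final String cause;

    IndexRequest(String resourcePath, String cause) {
        this.resourcePath = resourcePath;
        this.cause = cause;
    }

    // Assumed rule: a request covers the same resource and anything below it.
    boolean covers(IndexRequest other) {
        return other.resourcePath.equals(resourcePath)
                || other.resourcePath.startsWith(resourcePath + "/");
    }
}

final class IndexQueueSketch {
    private final Deque<IndexRequest> queue = new ArrayDeque<>();

    // Returns false when the request is redundant and was not queued.
    boolean submit(IndexRequest request) {
        for (IndexRequest queued : queue) {
            if (queued.covers(request)) {
                return false; // e.g. the second half of a "double index"
            }
        }
        queue.removeIf(request::covers); // narrower queued work is now redundant
        queue.addLast(request);
        return true;
    }

    IndexRequest poll() { return queue.pollFirst(); }

    int size() { return queue.size(); }
}
```

With this rule, queuing a whole-project index after several single-file jobs leaves exactly one job in the queue instead of indexing those files twice.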

R7      P1        Indexer must try to keep as much as possible from all
                  failed index attempts

The indexer needs to persist a list of all files that are to be
indexed as part of an index job. As each merge happens, the list is
updated. In the event of a crash, all work that has been merged to
disk shall be considered as sane and the indexer will restart on the
remaining files on the next startup.

Bug 62366: [Index] Need ability to read a partial index and resume
<https://bugs.eclipse.org/bugs/show_bug.cgi?id=62366>
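
The persisted work list described in R7 could look roughly like this. IndexCheckpointSketch and the one-path-per-line file are assumptions made for the sketch; only standard java.nio.file calls are used, and the actual merge to the on-disk index is out of scope here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical checkpoint: a plain text file with one pending path per line.
final class IndexCheckpointSketch {
    private final Path pendingFile;
    private final Set<String> pending = new LinkedHashSet<>();

    IndexCheckpointSketch(Path pendingFile) {
        this.pendingFile = pendingFile;
    }

    // Persist the full list of files before any indexing starts.
    void begin(Collection<String> filesToIndex) {
        pending.clear();
        pending.addAll(filesToIndex);
        save();
    }

    // Call after a merge to disk succeeded for the given files.
    void merged(Collection<String> mergedFiles) {
        pending.removeAll(mergedFiles);
        save();
    }

    // On the next startup: whatever is still listed was not merged yet.
    List<String> remainingAfterRestart() {
        if (!Files.exists(pendingFile)) {
            return new ArrayList<>();
        }
        try {
            return Files.readAllLines(pendingFile, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private void save() {
        try {
            Files.write(pendingFile, pending, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The list is rewritten only after a merge has reached disk, so a crash between merges can at worst cause a few already merged files to be indexed again, never skipped.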

R8      P1        Indexer needs to deal with new path/symbol
                  addition/deletion gracefully

The indexer currently reindexes the entire project on the
addition/deletion of each new path/symbol. With the new per-file
scanner settings, we should only index the range of files that are
affected. (See Scanner Config Correctness Enhancement FDS for more
details about per-file scanner settings.)

R9      P1        Indexer needs to handle Source Folder changes gracefully

The indexer currently does a brute force reindex whenever anything
changes in the source folder view. This needs to change to a more
scalable solution.

R10     P4        Resources can be added manually to the index

Users can request a new index on any resource in the workspace
(including files that are included from external directories) by
selecting one or more files and choosing the appropriate option from
the context menu.

Bug 71821: Action on the indexer to parse file/folder
<https://bugs.eclipse.org/bugs/show_bug.cgi?id=71821>

R11     P2        FileType changes should trigger indexes

Changing the file type settings on a project might introduce new file
types as extensions that have not been indexed as of yet. The indexer
should react in accordance with the user's settings (as defined in R5).

Bug 72396: The indexer is not aware of the File Type (ResolverModel)
<https://bugs.eclipse.org/bugs/show_bug.cgi?id=72396>

R12     P1        Indexer Extension point

If none of the current versions of the indexer meets the user's
requirements, the CDT should provide an extension point that will
allow users to write their own indexer that will populate the index.
Provided the information is complete, all index-based CDT features
should work as normal.

R13     P4        Index Manager should allow for ongoing search while
                  indexing

The indexer should be able to compare incoming index entries with any
pending search queries and return the matches.

Bug 72803: [Performance][Usability][Indexer/Search] Waiting policy for
index results <https://bugs.eclipse.org/bugs/show_bug.cgi?id=72803>
Bug 53792: [Scalability] Prioritizing the indexing when searching a
working set <https://bugs.eclipse.org/bugs/show_bug.cgi?id=53792>

R14     P1        Index should provide an interface to allow clients to
                  determine what features are available with the
                  current indexes

With the prospect of having various levels of detail available in a
project's indexes, the Index Manager needs to be able to provide an
interface that will tell any client whether there is sufficient
information in the index to run the client's service. If the necessary
index detail is missing, it would be up to the client to ask the user
if they wish to schedule a new index.
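
A minimal sketch of the kind of query interface R14 asks for. Everything here is hypothetical: the feature names, IndexCapabilities, and in particular the assumption that a quick index offers declarations but no references. The real interface and the level of detail in each index type are left to the design document.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical levels of detail an index might contain.
enum IndexFeature { DECLARATIONS, REFERENCES, OFFSETS }

// Hypothetical interface a client would ask before running its service.
interface IndexCapabilities {
    boolean provides(IndexFeature feature);
}

// Simple implementation backed by a fixed set of features.
final class FixedCapabilities implements IndexCapabilities {
    private final Set<IndexFeature> features;

    FixedCapabilities(EnumSet<IndexFeature> features) {
        this.features = EnumSet.copyOf(features);
    }

    @Override
    public boolean provides(IndexFeature feature) {
        return features.contains(feature);
    }
}

final class CapabilitySketch {
    // True when the index has every feature the client's service needs.
    // When this is false, the client decides whether to ask the user to
    // schedule a new, more detailed index.
    static boolean canRun(IndexCapabilities index, Set<IndexFeature> required) {
        for (IndexFeature feature : required) {
            if (!index.provides(feature)) {
                return false;
            }
        }
        return true;
    }
}
```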


   2.2 Index Content Requirements

Number  Priority  Description
R15     P1        Indexer will provide enough information to run
                  searches without a second parse

Currently the index stores just the location information for the
entries, which causes search to require two separate parses: one for
the initial index and another to determine the offset information. The
indexer should store this offset information in the initial index and
thus avoid the second parse. Enough information must be available in
the index to answer all of the possible MatchLocator queries.
Additional info that needs to be added to the indexer includes:

   * function/method parameters
   * if a variable has an initializer clause, extern specifier or
     linkage specification (needed to determine if it is a definition)
   * if a field is static (needed to see if we need to check for
     definition)

Bug 74427: Indexer needs to store more info
<https://bugs.eclipse.org/bugs/show_bug.cgi?id=74427>

R16     P3        References in the index should be tied into their
                  declarations

We need to be able to match up references to their declarations
(possible requirement for refactoring support).

Bug 69606: [Search] Match locator has to make sure that the reference
belongs to the specified declaration
<https://bugs.eclipse.org/bugs/show_bug.cgi?id=69606>

R17     P1        New Indexer needs to be written for new AST

As the CDT switches over to the new AST, the indexer will have to be
rewritten to extract information from the AST.


   2.3 Problem Markers


Number  Priority  Description
R18     P2        Problem Markers should be able to be removed manually

There are a number of scenarios in which Index problem markers get
placed on resources and cannot be removed. There needs to be some sort
of menu option that can manually delete selected problem markers.

Bug 74284: [IProblem] All Problem markers removed for project when one
file is excluded
<https://bugs.eclipse.org/bugs/show_bug.cgi?id=74284>


3. UI Requirements


   3.1 UI enhancements

The following UI enhancements are planned to support the feature:

Number  Description
UE1     Indexer Options Preference Page

This page will allow users to make changes to the Indexer that affect
the entire workspace:

   * Index Progress Bar displayed
   * Indexer New Project Default Setting

UE2     Indexer Project Properties Page

This page will allow users to set indexer settings per project:

   * Indexer to use: if a number of indexers are available (e.g.
     SourceIndexer, SourceIndexer2, CTagsIndexer), users can specify
     which indexer should be used for the project.
   * Indexing policy to use



4. References

  1. Scanner Config Correctness Enhancement FDS


Last Modified on Monday, January 10, 2005
_______________________________________________
cdt-dev mailing list
cdt-dev@xxxxxxxxxxx
http://dev.eclipse.org/mailman/listinfo/cdt-dev