
Re: [platform-help-dev] What info will be returned on search hit lists?

I agree with Leigh that using <title> in the search results is good for
the user.  Since we already use an HTML parser during indexing, there
should be no problem extracting the title from the file at indexing time.
I propose we use that for HTML files, and fall back to the label from
navigation when we cannot extract a title from a document (for example,
because it is missing).
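
To make the proposal concrete, here is a minimal sketch of the title-or-label
fallback.  It is written in Python with the standard html.parser module purely
for illustration; it is not the parser we use during indexing, and the names
TitleParser, hit_label and nav_label are hypothetical.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text of the first <title> element (illustrative only)."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lower-cased; only the first <title> is used.
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        # Collapse runs of whitespace and trim the result.
        return " ".join("".join(self._parts).split())


def hit_label(html_text, nav_label):
    """Return the document title, or the navigation label if none is found."""
    parser = TitleParser()
    parser.feed(html_text)
    parser.close()
    return parser.title or nav_label
```

A document with a missing or empty <title> simply yields the navigation
label, so every hit still gets some text.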

When we move to better handling of XML files, we will probably need a custom
parser, transform, or stylesheet to extract text from XML instead of
indexing the raw XML.  Once we have a way to extract text from XML, we can
use the same approach to obtain a title string as for obtaining the text.
If for some reason this fails for a document, we can again use the label
from the navigation as a fallback.
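
As a rough sketch of the XML case, the same pass that extracts the text can
also look for a title.  The example below uses Python's standard
xml.etree.ElementTree for illustration only; the function name extract_xml and
the assumption that the title lives in an element named "title" are
hypothetical, since the actual XML formats are not decided here.

```python
import xml.etree.ElementTree as ET


def extract_xml(xml_text, nav_label, title_tag="title"):
    """Return (title, text) for an XML document.

    Falls back to nav_label when the document cannot be parsed or has
    no usable title element.  title_tag is an assumed element name.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Extraction failed: use the navigation label, index no text.
        return nav_label, ""

    # Text to index: all character data, with whitespace collapsed.
    text = " ".join(" ".join(root.itertext()).split())

    # Title: first descendant element with the assumed tag name.
    node = root.find(f".//{title_tag}")
    title = ""
    if node is not None:
        title = " ".join("".join(node.itertext()).split())

    return (title or nav_label), text
```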

The idea of displaying lines that contain pieces of the query is certainly
to the benefit of the user.  The implementation, however, would be involved,
and I do not see much chance of having it implemented in Eclipse 2.0.  A
somewhat simpler solution for HTML files is to display the first few lines
of the hit document, whether or not these lines contain the query words.
Would this be of value to the user?
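
For discussion, the simpler option could look roughly like the sketch below.
It is Python for illustration only, and it assumes that "first few lines"
means the first N words of visible text, since line breaks in HTML source do
not correspond to what the user sees.  The names BodyText, summary and
max_words are hypothetical.

```python
from html.parser import HTMLParser


class BodyText(HTMLParser):
    """Collects visible text, skipping title, script and style content."""

    SKIP = {"script", "style", "title"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def summary(html_text, max_words=30):
    """Return the first max_words words of visible text in the document."""
    parser = BodyText()
    parser.feed(html_text)
    parser.close()
    words = " ".join(parser.chunks).split()
    head = " ".join(words[:max_words])
    # Mark the summary as truncated when the document has more text.
    return head + " ..." if len(words) > max_words else head
```

The summary is independent of the query, so it could be computed once at
indexing time and stored alongside the title.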

Konrad



                                                                                                                              
From:     Leigh Davidson/Toronto/IBM@IBMCA
Sent by:  platform-help-dev-admin@eclipse.org
Date:     02/12/2002 04:34 PM
To:       platform-help-dev@xxxxxxxxxxx
cc:
Subject:  [platform-help-dev] What info will be returned on search hit lists?
Please respond to platform-help-dev
                                                                                                                              
                                                                                                                              



I'm not sure how closely this is tied to the search engine that you select
for V2. This posting is to suggest that search hits should display the
<title> tag of the target file, plus possibly some portion of the target
file's body (first few lines or, as Google seems to return, snippets of
lines that contain the search arguments).

As Kari Halsted specified in her list of req'ts on Nov 21/2001, the IBM ID
community was previously asking to have <h1> tags returned. In offline
discussion, it was pointed out that different adopters of Eclipse might use
<h_> tags in different ways or might not use them at all, whereas <title>
tags are probably more predictable/appropriate.

Our least favorite option is to return text strings from the table of
contents. We would like to uncouple t.o.c. strings from search hits, so
that we have the option of making t.o.c. entries concise (let users derive
context from parent/grandparent text) but search hits more complete (let
user derive context from the wording of the hit).

Leigh Davidson
Editor, IBM Canada Lab
Phone: 416-313-1034
email: ldavidso@xxxxxxxxxx

_______________________________________________
platform-help-dev mailing list
platform-help-dev@xxxxxxxxxxx
http://dev.eclipse.org/mailman/listinfo/platform-help-dev




