Re: [platform-help-dev] indexing keywords in M5

1.  In the current implementation, the name is fixed at "keywords".  The
content of the meta tag is indexed with the document.  Since there is no
way to search only the keywords from the meta tag and ignore the body
of the document, it does not make much sense to support different meta
names.  Whatever appears as the value of the content attribute can be
searched by the user.  There is no restriction on what can be in the
content attribute.  The content is analyzed using the same text analyzer as
the rest of the document, hence all the rules (ignoring case, punctuation,
and stop words, and applying stemming) that apply to the document body
apply here as well.  Authors should be aware that documents returned by
search might not have the search words highlighted in the body if those
words exist only in the meta tag.
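
To illustrate the highlighting caveat, here is a minimal sketch in Python. It
is not the Help System's code: the analyzer, the stop-word list, and the names
analyze and search_hit are assumptions made for this example, and stemming is
left out for brevity.

```python
import re

# Illustrative stop-word list; the real analyzer's list is not given here.
STOP_WORDS = {"a", "an", "the", "of", "and", "to"}


def analyze(text):
    """Toy analyzer: lowercase, drop punctuation, drop stop words (no stemming)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def search_hit(query, body, meta_keywords):
    """Return (matched, highlightable_terms) for a single document.

    The body and the meta keywords go through the same analyzer and end up
    in one combined set of indexed terms, so a keyword-only match is still
    a hit, but there is nothing in the body to highlight.
    """
    terms = analyze(query)
    body_terms = set(analyze(body))
    indexed = body_terms | set(analyze(meta_keywords))
    matched = all(t in indexed for t in terms)
    highlightable = [t for t in terms if t in body_terms]
    return matched, highlightable


if __name__ == "__main__":
    body = "How to launch a program."
    keywords = "debugger, breakpoint"
    print(search_hit("Debugger", body, keywords))  # hit, nothing to highlight
    print(search_hit("launch", body, keywords))    # hit, highlightable in body
```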

2. In the syntax <meta name="keywords" content="term 1, term 2, ..."> only
term 1, term 2, ... are translatable; the "keywords" value must remain
untranslated.  The parser ignores case, so if a document contains <META
name="KEYWORDS" ...>, the content will still be indexed.
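
As a rough illustration of the case-insensitive match, the following Python
sketch pulls the content attribute out of a keywords meta tag using the
standard html.parser module. The class and function names are hypothetical,
and this is not the parser the Help System uses.

```python
from html.parser import HTMLParser


class MetaKeywordsParser(HTMLParser):
    """Collects the content of <meta name="keywords"> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names,
        # but attribute values keep their original case.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.lower() == "keywords":
            self.keywords.append(attributes.get("content") or "")


def extract_keywords(html):
    """Return the content strings of all keywords meta tags in the document."""
    parser = MetaKeywordsParser()
    parser.feed(html)
    parser.close()
    return parser.keywords


if __name__ == "__main__":
    page = '<html><head><META name="KEYWORDS" content="term 1, term 2"></head></html>'
    print(extract_keywords(page))
```

Only the content value would be handed to translators; the literal "keywords"
stays as it is in every language.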

3. Yes.  Documents with keywords in the meta tag are ranked the same as
documents whose body contains the keywords.  Lucene uses a complex ranking
algorithm, but the more often the search word occurs in the document, the
less often it occurs elsewhere, or the shorter the document, the higher
the ranking will be.
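
The three tendencies above can be sketched with a toy TF-IDF style score in
Python. This is an assumption-laden illustration, not Lucene's actual formula:
the function name score and the exact weighting are made up for the example,
and "elsewhere" is read here as "in other documents of the index".

```python
import math


def score(term, doc_tokens, all_docs):
    """Toy relevance score for one term in one document.

    Rises with the term's frequency in the document, falls as more
    documents in the collection contain the term, and falls as the
    document gets longer.
    """
    tf = doc_tokens.count(term)
    if tf == 0:
        return 0.0
    docs_with_term = sum(1 for d in all_docs if term in d)
    idf = 1.0 + math.log(len(all_docs) / (1 + docs_with_term))
    length_norm = 1.0 / math.sqrt(len(doc_tokens))
    return math.sqrt(tf) * idf * length_norm


if __name__ == "__main__":
    short_doc = ["debug", "launch"]
    long_doc = ["debug", "launch", "run", "stop", "step", "into"]
    docs = [short_doc, long_doc]
    # Same term frequency, but the shorter document scores higher.
    print(score("debug", short_doc, docs) > score("debug", long_doc, docs))
```

Because meta keywords are indexed together with the body text, a keyword
token would simply be one more entry in doc_tokens in a sketch like this.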

Konrad Kolosowski
Eclipse Help System



                                                                                                                                              
Jamie Roberts/Toronto/IBM@IBMCA
Sent by: platform-help-dev-admin@eclipse.org
02/05/2003 03:04 PM
Please respond to platform-help-dev

To:       platform-help-dev@xxxxxxxxxxx
cc:
Subject:  Re: [platform-help-dev] indexing keywords in M5
                                                                                                                                              
                                                                                                                                              



Positively great news--I look forward to working with this!

A couple of questions.

1) Any restrictions on the form of the keyword or meta name content?  I
suspect the name element value "keywords" is fixed. True? Or can we add
more name semantics? Are there any gotchas in having items in the content
list?

2) Any translation restrictions? If "keywords" above is fixed, I assume you
don't translate that :)

3) Are terms here handled by the ranking algorithm exactly as if they were
in text? Or do they have special rank weight status?

**************************
James H. (Jamie) Roberts
IBM WebSphere ID





Konrad Kolosowski/Toronto/IBM@IBMCA
Sent by: platform-help-dev-admin@eclipse.org
02/05/2003 12:07 AM
Please respond to platform-help-dev

To:       platform-help-dev@xxxxxxxxxxx
cc:
Subject:  [platform-help-dev] indexing keywords in M5





For 2.1 M5, I have added support for indexing meta keywords to Search in
the Help System.  The corresponding meta tag that can be placed in the head
of HTML documents looks like:
<meta name="keywords" content="term 1, term 2, ...">
The separator used in the content attribute does not matter, since search
treats commas, semicolons, and spaces as word separators and does not index
them.  It is wise to use a comma, in case the text analyzers plugged into
the search engine become more picky in the future.
The keywords are indexed together with the text extracted from the
document, hence the ranking of a search hit will not depend on whether the
searched word appears in the meta tag or in the body of the document.
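
As a small illustration of why the separator does not matter, here is a
Python sketch of a tokenizer that treats any run of non-word characters as a
separator. The function name tokenize is hypothetical and the real text
analyzer may behave differently in detail.

```python
import re


def tokenize(content):
    """Split on any run of non-word characters and lowercase the result."""
    return re.findall(r"\w+", content.lower())


if __name__ == "__main__":
    # Commas, semicolons, and plain spaces all yield the same tokens.
    print(tokenize("term 1, term 2"))
    print(tokenize("term 1; term 2"))
    print(tokenize("term 1 term 2"))
```

Under this sketch a multi-word keyword is not kept as a unit: "term 1" is
indexed as the two words "term" and "1", whichever separator surrounds it.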

Konrad Kolosowski
Eclipse Help System

_______________________________________________
platform-help-dev mailing list
platform-help-dev@xxxxxxxxxxx
http://dev.eclipse.org/mailman/listinfo/platform-help-dev


