Re: [platform-help-dev] indexing keywords in M5
1. In the current implementation, the name is fixed at "keywords". The
content of the meta tag is indexed with the document. Since there is no
way to search only the keywords from the meta tag and ignore the body of
the document, it does not make much sense to support different meta
names. Whatever appears as the value of the content attribute can be
searched by the user. There is no restriction on what can be in the
content attribute. The content is analyzed using the same text analyzer as
the rest of the document, hence all the rules (ignoring case and
punctuation, dropping stop words, and applying stemming) that apply to the
document body apply here as well. Authors must be aware that documents
returned by search might not have the search words highlighted in the body
if the words exist only in the meta tag.
2. In the syntax <meta name="keywords" content="term 1, term 2, ..."> only
term 1, term 2, ... are translatable; the "keywords" value must remain
untranslated. The parser ignores case, so if a document contains <META
name="KEYWORDS", the content will still be indexed.
3. Yes. Documents with keywords in the meta tag are ranked the same as
documents whose body contains those keywords. Lucene uses a complex
ranking algorithm, but in general the more often the search word occurs in
the document, the less often it occurs in other documents, and the
shorter the document, the higher the ranking will be.
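To make the three tendencies in point 3 concrete, here is a toy score in
the TF-IDF style. It is only a sketch: toy_score and the sample corpus are
hypothetical, and this is not Lucene's actual formula. A document is just
a list of terms, so a keyword counts the same whether it came from the
meta tag or from the body.

```python
import math

def toy_score(term, doc, corpus):
    """Illustrative TF-IDF-style score with length normalisation.

    doc is a list of terms; corpus is a list of such documents.
    Hypothetical helper, not Lucene's real scoring function.
    """
    tf = doc.count(term)
    if tf == 0:
        return 0.0
    docs_with_term = sum(1 for d in corpus if term in d)
    # Rarer terms (found in fewer documents) get a larger weight.
    idf = 1.0 + math.log(len(corpus) / docs_with_term)
    # More occurrences raise the score; longer documents lower it.
    return math.sqrt(tf) * idf / math.sqrt(len(doc))

short_doc = ["debugger", "breakpoint", "step"]
repeat_doc = ["debugger", "debugger", "breakpoint", "step"]
long_doc = ["editor", "font", "color", "debugger",
            "view", "perspective", "menu", "help"]
other_doc = ["editor", "font"]
corpus = [short_doc, repeat_doc, long_doc, other_doc]

for name, doc in [("short", short_doc), ("repeat", repeat_doc),
                  ("long", long_doc)]:
    print(name, round(toy_score("debugger", doc, corpus), 3))
```

With this corpus, the document that repeats "debugger" scores highest and
the long document scores lowest for that term.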
Konrad Kolosowski
Eclipse Help System
Jamie Roberts/Toronto/IBM@IBMCA
Sent by: platform-help-dev-admin@eclipse.org
02/05/2003 03:04 PM
Please respond to platform-help-dev
To: platform-help-dev@xxxxxxxxxxx
cc:
Subject: Re: [platform-help-dev] indexing keywords in M5
Positively great news--I look forward to working with this!
A couple of questions.
1) Any restrictions on the form of the keyword or meta name content? I
suspect the name element value "keywords" is fixed. True? Or can we add
more name semantics? Are there any gotchas in having items in the content
list?
2) Any translation restrictions? If "keywords" above is fixed, I assume you
don't translate that :)
3) Are terms here handled by the ranking algorithm exactly as if they were
in text? Or do they have special rank weight status?
**************************
James H. (Jamie) Roberts
IBM WebSphere ID
Konrad Kolosowski/Toronto/IBM@IBMCA
Sent by: platform-help-dev-admin@eclipse.org
02/05/2003 12:07 AM
Please respond to platform-help-dev
To: platform-help-dev@xxxxxxxxxxx
cc:
Subject: [platform-help-dev] indexing keywords in M5
For 2.1 M5, I have added support for indexing meta keywords to Search in
the Help System. The corresponding meta tag, which can be placed in the
head of HTML documents, looks like:
<meta name="keywords" content="term 1, term 2, ...">
The separator used in the content attribute does not matter, since search
treats commas, semicolons, and spaces as word separators and does not index
them. It is wise to use a comma, in case the text analyzers plugged into
the search engine become pickier in the future.
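A minimal sketch of the separator behaviour described above, assuming a
simple regular-expression splitter stands in for the real text analyzer
(split_keywords is a hypothetical name, and stop words and stemming are
left out):

```python
import re

# Assumption: any run of commas, semicolons, or whitespace separates
# words and is itself never indexed.
SEPARATORS = re.compile(r"[,;\s]+")

def split_keywords(content):
    """Return the lower-cased words found in a keywords content value."""
    return [word.lower() for word in SEPARATORS.split(content) if word]

print(split_keywords("install, update; workbench  help"))
# -> ['install', 'update', 'workbench', 'help']

print(split_keywords("term 1, term 2"))
# -> ['term', '1', 'term', '2']
```

Because spaces separate words too, a multi-word keyword such as "term 1"
ends up as individual words under this sketch, not as a single phrase.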
The keywords are indexed together with the text extracted from the
document, hence the ranking of a search hit will not depend on whether the
searched word appears in the meta tag or in the body of the document.
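As a rough illustration of indexing the keywords together with the
document text, the sketch below uses Python's standard html.parser to
gather both into one string. The class and function names are
hypothetical, and the Help System's own parser is not shown in this
thread; this only mirrors the behaviour described.

```python
from html.parser import HTMLParser

class KeywordAwareExtractor(HTMLParser):
    """Collects document text plus the content of <meta name="keywords">."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lower-cases tag and attribute names; the
        # value of the name attribute is lower-cased here so that
        # name="KEYWORDS" is also accepted.
        if tag == "meta":
            attributes = dict(attrs)
            if (attributes.get("name") or "").lower() == "keywords":
                self.chunks.append(attributes.get("content") or "")

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)

def indexable_text(html):
    """Return keywords and document text as one string for the analyzer."""
    parser = KeywordAwareExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)

page = ('<html><head><META name="KEYWORDS" content="debugger, breakpoint">'
        '</head><body><p>Stepping through code.</p></body></html>')
print(indexable_text(page))
# -> debugger, breakpoint Stepping through code.
```

Once the two are merged like this, nothing records where a word came
from, which is why a hit in the meta tag and a hit in the body are
treated alike.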
Konrad Kolosowski
Eclipse Help System
_______________________________________________
platform-help-dev mailing list
platform-help-dev@xxxxxxxxxxx
http://dev.eclipse.org/mailman/listinfo/platform-help-dev