[smila-dev] search record: group by vs. faceting

Hi folks,

 

We are planning to integrate the Solr 3.5 release into SMILA (hopefully for 1.1, when the CQs come through) and are going to add/use the new group-by/field-collapsing queries (http://wiki.apache.org/solr/FieldCollapsing).

 

As a consequence, I propose the following change to our query parameters (http://wiki.eclipse.org/SMILA/Documentation/Search#Query_Parameters):

 

groupby: only used for real group by functionality such as the mentioned query or clustering (http://search.carrot2.org/stable/search)

faceting: used for faceting

 

The key differences between the two are:

- group by will contain for each value the results as nested result items

- faceting will just report counts but no IDs per value
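
To make the distinction concrete, here is a minimal Python sketch. It is not SMILA or Solr code; the sample records, the field names and the helper names group_by and facet_counts are made up purely for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical sample records; ids and field names are illustrative only.
RECORDS = [
    {"id": "tv-1", "type": "LED", "size": 32},
    {"id": "tv-2", "type": "LED", "size": 40},
    {"id": "tv-3", "type": "Plasma", "size": 40},
]


def group_by(records, field):
    """Group by: each value maps to the matching records themselves."""
    groups = defaultdict(list)
    for record in records:
        groups[record[field]].append(record)
    return dict(groups)


def facet_counts(records, field):
    """Faceting: each value maps to a count only, without record IDs."""
    return dict(Counter(record[field] for record in records))
```

Here group_by(RECORDS, "type") keeps the full records under each value, while facet_counts(RECORDS, "type") only says how many records each value has.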

 

The exact behavior (e.g. support for multiple group-by fields) and capabilities will depend on the implementation used.

 

Hence, we need a new result structure for the group by feature. I propose this:

·         The records Seq is empty; the results are instead nested under the group-by value(s)

·         The example below shows a case where one groups by multiple fields (not supported by Solr)

 

<Val key="query">tv</Val>
<Seq key="groupby">
  <Map>
    <Val key="attribute">type</Val>
    <Val key="maxcount" type="long">10</Val>
    ...
  </Map>
  <Map>
    <Val key="attribute">size</Val>
    <Val key="maxcount" type="long">10</Val>
    ...
  </Map>
</Seq>

 

 

<Val key="query">tv</Val>
<Map key="groups">
  <!-- there is one map for each (existing) value of type -->
  <Map key="LED">
    <!-- optionally, subsequent fields are nested -->
    <Map key="32">
      <!-- each leaf group field contains the results for the group -->
      <Seq key="results">
        ...
      </Seq>
    </Map>
    <Map key="40">
      <Seq key="results">
        ...
      </Seq>
    </Map>
  </Map>
  <Map key="Plasma">
    <Map key="32">
      <Seq key="results">
        ...
      </Seq>
    </Map>
    <Map key="40">
      <Seq key="results">
        ...
      </Seq>
    </Map>
  </Map>
</Map>
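
For illustration only, the nesting of the proposed "groups" map can be sketched in Python as follows. This is not how SMILA or Solr would build it; nest_groups and the sample records are hypothetical, and only the shape (one map per value, a "results" list at each leaf) is taken from the proposal.

```python
# Hypothetical sample records; ids and field names are illustrative only.
SAMPLE = [
    {"id": "tv-1", "type": "LED", "size": 32},
    {"id": "tv-2", "type": "LED", "size": 40},
    {"id": "tv-3", "type": "Plasma", "size": 40},
]


def nest_groups(records, fields):
    """Nest records by each field in turn; each leaf holds a "results" list."""
    if not fields:
        return {"results": list(records)}
    first, rest = fields[0], fields[1:]
    buckets = {}
    for record in records:
        # Map keys are strings, like the key attributes in the XML above.
        buckets.setdefault(str(record[first]), []).append(record)
    return {value: nest_groups(members, rest) for value, members in buckets.items()}


groups = nest_groups(SAMPLE, ["type", "size"])
```

Only values that actually occur get a map, so in this sample "Plasma" has a "40" entry but no "32" entry.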

 

 

Faceting will remain as it is defined in the wiki, except that

·         definition happens via facet <Seq key="facet">

·         result is returned in <Map key="facet_results">

OR (more analogous to groupby/groups)

·         definition happens via facet <Seq key="facetby">

·         result is returned in <Map key="facets">

 

I realize this is a breaking change with regard to how the search record needs to be filled, but I also think that this approach is more concise.

Anyhow, tell me your thoughts.

 

Thomas Menzel @ brox IT-Solutions GmbH

 

 
