Core Configuration

TellusR integrates with Solr using isolated components that may be enabled and configured independently.

TellusR Config Request Handler

This component is required. If it is not defined in a core, the TellusR Central will not be able to connect to that core. Installing it is a matter of introducing this single line in the core’s solrconfig.xml file:

<requestHandler name="/tellusr_config" class="com.sannsyn.solrplugins.TellusRConfig"/>

The installation script inserts this line if your cores are running. Otherwise (or if you installed the integrator manually), insert it yourself in the relevant files and remember to restart Solr afterwards.
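
As a rough illustration of where the line goes, the sketch below shows the handler as a direct child of the root config element of solrconfig.xml. The comments stand in for whatever your core already defines; only the requestHandler line comes from the instructions above.

```xml
<config>
  <!-- ... existing settings of your core ... -->

  <!-- Lets TellusR Central connect to this core -->
  <requestHandler name="/tellusr_config" class="com.sannsyn.solrplugins.TellusRConfig"/>

  <!-- ... existing request handlers and search components ... -->
</config>
```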

Semantic Search

This component implements semantic search by expanding each term in the search query before the search is executed. It’s configured in two phases: declaration of a search component and its listing within search handlers.

Declaration

In the declaration of the component, you can specify its name (we suggest “semanticSearch”) and configure it with the following parameters:

  • language - (Required) str field with a locale abbreviation. E.g., en, nb_no.

  • boostOriginal and boostExpanded - (Optional) float fields that specify the boost applied to the original terms and to the semantically added terms, respectively. A higher boost gives the term more weight in the relevance ranking.

  • strategy - (Optional) str field defining a term expansion strategy with one of the following values:

    • ExpandLeafs (Default) - will make the component parse the query tree and change each TermQuery found into a BooleanQuery “or”-list of the original term and its expansion (these are in turn wrapped in BoostQueries to differentiate boosts). Thus, this strategy preserves the original query structure, only expanding the TermQuery leaf nodes.

    • Simple - will first find all terms in the query, then expand them all into a single list and create a BooleanQuery that matches either the original query or the list of expansions.

      Please note that this does not respect the role each term plays in the original query. For example, a term which was part of a NOT-query will be expanded and put in the OR-query just like every other term.

  • ignoredFields - (Optional) arr field listing the names of fields which should not be semantically expanded.

Example:

<searchComponent name="semanticSearch" class="com.sannsyn.solrplugins.SemanticSearch">
  <str name="language">nb_no</str>
  <str name="strategy">ExpandLeafs</str>
  <float name="boostOriginal">100</float>
  <float name="boostExpanded">0.1</float>
  <arr name="ignoredFields">
    <str>genre</str>
    <str>name</str>
  </arr>
</searchComponent>
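
For comparison, a declaration that selects the Simple strategy might look like the sketch below. It sets only the required language parameter and the strategy, leaving the optional parameters at their defaults; the language value en is just an illustration. A core would normally declare only one of these variants under the name semanticSearch.

```xml
<searchComponent name="semanticSearch" class="com.sannsyn.solrplugins.SemanticSearch">
  <!-- Required: locale abbreviation -->
  <str name="language">en</str>
  <!-- Optional: collect all terms, expand them into one list, and OR
       that list with the original query -->
  <str name="strategy">Simple</str>
</searchComponent>
```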

Listing

The configured search component must be added to existing search handler declarations by referring to it by name in the first-components field. E.g. like this:

<requestHandler name="/select" class="solr.SearchHandler">
  <arr name="first-components">
    <str>semanticSearch</str>
  </arr>
  <!-- ... -->
</requestHandler>

Query Reporter

This component is responsible for the aggregation of search metrics. It gets configured in two phases: declaration of a search component and its listing within search handlers.

Declaration

The declaration refers to the QueryReporter class, gives the component a name (we suggest queryReporter) and configures it. In the configuration, you can optionally specify either an excludeFields or an includeFields list. With exclusion, terms from all fields except the listed ones are reported. With inclusion, only terms from the listed fields are reported. If both lists are configured for a component, the component will only use the excludeFields list and ignore includeFields. You can also filter out searches that have or lack specific parameters with filterIfParams and filterIfNotParams.

  • excludeFields - (Optional) arr of str fields naming search fields that should not have terms reported as search terms. This might be useful if you have implemented category search by adding a category field.

  • includeFields - (Optional) arr of str fields naming which search fields should have terms reported. If this is used, only searches on the named fields and the wildcard field will be sent.

  • filterIfParams - (Optional) arr of str of parameter names. Do not report terms for searches that include these parameters.

  • filterIfNotParams - (Optional) arr of str of parameter names. Only report terms for searches that include these parameters. For example, you may want to report terms only for searches that include the user_id parameter.

Here’s an example:

<searchComponent name="queryReporter" class="com.sannsyn.solrplugins.QueryReporter">
  <arr name="excludeFields">
    <str>genre</str>
    <str>name</str>
  </arr>
</searchComponent>
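
An alternative declaration using inclusion and parameter filtering could be sketched as follows. The field names title and description are hypothetical placeholders for fields in your own schema; the user_id parameter is the one mentioned above. This would replace the exclusion-based example rather than sit next to it, since both use the same component name.

```xml
<searchComponent name="queryReporter" class="com.sannsyn.solrplugins.QueryReporter">
  <!-- Report terms only from these fields (hypothetical field names) -->
  <arr name="includeFields">
    <str>title</str>
    <str>description</str>
  </arr>
  <!-- Report terms only for searches that carry a user_id parameter -->
  <arr name="filterIfNotParams">
    <str>user_id</str>
  </arr>
</searchComponent>
```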

Listing

The configured search component must be added to existing search handler declarations by referring to it by name in the last-components field. E.g. like this:

<requestHandler name="/select" class="solr.SearchHandler">
  <!-- ... -->
  <arr name="last-components">
    <str>queryReporter</str>
  </arr>
</requestHandler>
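
Since semantic search and query reporting attach to the same handler, a /select handler that uses both could be sketched like this, assuming the components were declared under the suggested names semanticSearch and queryReporter:

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <!-- Runs before the default components: expands the query terms -->
  <arr name="first-components">
    <str>semanticSearch</str>
  </arr>
  <!-- Runs after the default components: reports the search terms -->
  <arr name="last-components">
    <str>queryReporter</str>
  </arr>
</requestHandler>
```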

Suggest Reporter

This component caches the latest query a user makes to a suggest handler. The cache is later used by the Query Reporter and TellusR Central to improve search term statistics, so that an educated guess at the term the user started to type is logged, rather than the autocompleted term. The Suggest Reporter logs both normal queries (with the q param) and queries using Solr’s suggester API (with the suggest.q param).

For the Suggest Reporter to work, both the suggester query (with the Suggest Reporter in the component stack) and the final search (with the Query Reporter in the component stack) must have a session id (or user id) supplied in a user_id parameter.

Declaration

<searchComponent name="suggestReporter" class="com.sannsyn.solrplugins.SuggestReporter">
</searchComponent>

Listing

<requestHandler name="/suggest" class="solr.SearchHandler">
  <!-- ... -->
  <arr name="last-components">
    <str>suggestReporter</str>
    <!-- ... -->
  </arr>
</requestHandler>

Queries

A query to the suggest handler might look something like this:

http://yourserver.example.com:8983/yourcore/suggest?user_id=sessionXYZ&suggest.q=incompl

A query to the select handler might look like this:

http://yourserver.example.com:8983/yourcore/select?user_id=sessionXYZ&q=incomplete+search+autocompleted+via+suggest+handler

(The user_id parameter is also needed by the Action registration functionality.)

HNSW Suggester

The HNSW suggester is a misspelling-tolerant suggester component. It suggests terms from a specific field based on an incomplete term.

Declaration

The HNSW Suggester Component is based on Solr’s SuggestComponent and supports the same configuration options, but is preconfigured with default values that make it easy to set up TellusR’s HNSWCompoundLookupFactory.

  • suggester/field - (Required) A field that contains the phrase that we want to suggest to the user. This phrase might consist of multiple terms. (If there is not a 1-to-1 relation between phrases and rows, you might want to create a separate suggester core.)

  • suggester/weightField - (Optional) A numeric field which will be used to weigh the results. The result set will be prioritized and sorted based on a combination of match and weight.

  • suggester/payloadField - (Optional) An extra field that will contain data that is returned in the result set.

  • returnQueryResponse - (Optional) str field with one of the following values:

    • no - (Default) return the search result as formatted by the suggester api.
    • yes - Translate the result set into a normal query result, returning rows. This mode does not support the optional payloadField. Instead, selected fields of the whole row are returned.

Example:

  <searchComponent name="suggest" class="com.sannsyn.solrplugins.HNSWSuggestComponent">
    <lst name="suggester">
      <str name="field">INSERT_SUGGESTER_FIELD</str>
      <str name="weightField">INSERT_SUGGESTER_WEIGHT_FIELD</str>
    </lst>
    <str name="returnQueryResponse">no</str>
  </searchComponent>
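
A filled-in variant that also uses the optional payloadField might look like the sketch below. The field names title, popularity and product_id are hypothetical and would be replaced by fields from your own schema. returnQueryResponse is kept at no, since the yes mode does not support payloadField.

```xml
<searchComponent name="suggest" class="com.sannsyn.solrplugins.HNSWSuggestComponent">
  <lst name="suggester">
    <!-- Phrase to suggest (hypothetical field name) -->
    <str name="field">title</str>
    <!-- Numeric field used to weigh results (hypothetical field name) -->
    <str name="weightField">popularity</str>
    <!-- Extra data returned with each suggestion (hypothetical field name) -->
    <str name="payloadField">product_id</str>
  </lst>
  <str name="returnQueryResponse">no</str>
</searchComponent>
```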

Listing

You can configure a suggest handler with the Suggest Reporter and the Suggest component like this:

<requestHandler name="/suggest" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">5</str>
  </lst>
  <arr name="components">
    <str>suggestReporter</str>
    <str>suggest</str>
  </arr>
</requestHandler>

Please remember that when using components, you may not also use first-components and/or last-components.

AB-testing

AB-testing (also referred to as split testing) enables the administrator of a Solr server to forward users to different search handlers randomly. When used in combination with a query reporter, this enables the administrator to analyse the performance of different handlers.

To enable this functionality, you need to declare an ABTestHandler, specifying a fallback handler that requests are forwarded to if there are no active AB-tests or if the integrator for some reason cannot get in touch with the TellusR Central.

AB-testing is only supported for handlers that include statistics reporting.

For AB-testing to make sense, there must be at least two search handlers configured. A complete configuration may look something like this:

<requestHandler name="/abtest_select" class="com.sannsyn.solrplugins.ABTestHandler">
  <str name="abtest_fallback">/abtest_a</str>
</requestHandler>

<requestHandler name="/abtest_a" class="solr.SearchHandler">
  <!-- ... -->
  <arr name="last-components">
    <!-- ... -->
    <str>queryReporter</str>
  </arr>
</requestHandler>

<requestHandler name="/abtest_b" class="solr.SearchHandler">
  <!-- ... -->
  <arr name="last-components">
    <!-- ... -->
    <str>queryReporter</str>
  </arr>
</requestHandler>
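
For a test to be informative, the two handlers would normally differ in some way. As one hypothetical setup, /abtest_b could enable semantic search while /abtest_a does not, so that the reported statistics show the effect of term expansion. The sketch assumes a semanticSearch component declared as described earlier:

```xml
<requestHandler name="/abtest_b" class="solr.SearchHandler">
  <!-- Variant B only: expand query terms semantically -->
  <arr name="first-components">
    <str>semanticSearch</str>
  </arr>
  <!-- Both variants report statistics so they can be compared -->
  <arr name="last-components">
    <str>queryReporter</str>
  </arr>
</requestHandler>
```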

Boosted Search

The BoostedSearch component adds the ability to boost some matches so that they are shown higher in the search results. Boosting will not add new matches to the search, but if a search result that matches the boosting criteria is found, its match score will be modified. This results in a reordering of the first few pages of the search based on the active boosting rules.

The boosting rules are configured in the TellusR administration panel.

  <searchComponent name="boostedSearch" class="com.sannsyn.solrplugins.BoostedSearch">
  </searchComponent>

  <requestHandler name="/boosted_select" class="solr.SearchHandler">
    <arr name="last-components">
      <str>boostedSearch</str>
    </arr>
  </requestHandler>

Complete Example

The following snippet enables all the aforementioned TellusR features. If you use it as a template, you should remove the original /select handler (or rename it to /select_fallback, for example, if you want to keep it as a reference). Important configuration from the original /select handler should be reimplemented in the SearchHandlers below. (Please note that the ABTestHandler in the snippet, /select_ab, is not a SearchHandler; it only forwards requests to SearchHandlers, so such configuration should not be reimplemented for it.)

	<!-- TellusR config -->

	<!-- Enable TellusR functionality in this core -->
	<requestHandler name="/tellusr_config" class="com.sannsyn.solrplugins.TellusRConfig">
	</requestHandler>
  
	<!-- TellusR components -->
	<searchComponent name="semanticSearch" class="com.sannsyn.solrplugins.SemanticSearch">
	  <str name="language">nb_no</str>
	  <str name="strategy">ExpandLeafs</str>
	  <float name="boostOriginal">100</float>
	  <float name="boostExpanded">0.1</float>
	  <arr name="ignoredFields">
	    <str>genre</str>
	  </arr>
	</searchComponent>

	<searchComponent name="boostedSearch" class="com.sannsyn.solrplugins.BoostedSearch">
	</searchComponent>

	<searchComponent name="queryReporter" class="com.sannsyn.solrplugins.QueryReporter">
	</searchComponent>

	<searchComponent name="elevator" class="solr.QueryElevationComponent">
	  <!-- pick a fieldType to analyze queries -->
	  <str name="queryFieldType">string</str>
	  <str name="config-file">elevate.xml</str>
	</searchComponent>

	<!-- Use this as the default select handler to enable ab testing -->
	<requestHandler name="/select_ab" class="com.sannsyn.solrplugins.ABTestHandler">
	  <str name="abtest_fallback">/select</str>
	</requestHandler>

	<!-- Application specific params -->
	<initParams path="/select,/select_a,/select_b,/select_fallback">
	  <!-- Put application specific params here instead of putting them
	       in the select handler to make it easy to keep handlers in synch
	  -->
	</initParams>

	<!-- TellusR specific params -->
	<initParams path="/select_a,/select_b,/select">
	  <arr name="first-components">
	    <str>semanticSearch</str>
	  </arr>
	  <arr name="last-components">
	    <str>spellcheck</str>
	    <str>queryReporter</str>
	    <str>boostedSearch</str>
	    <str>elevator</str>
	  </arr>
	</initParams>

	<requestHandler name="/select" class="solr.SearchHandler">
	</requestHandler>

	<requestHandler name="/select_a" class="solr.SearchHandler">
	</requestHandler>
  
	<requestHandler name="/select_b" class="solr.SearchHandler">
	</requestHandler>

	<requestHandler name="/select_fallback" class="solr.SearchHandler">
	</requestHandler>