Semantic query analysis is a special use case of the Semantic Extractor that lets you rewrite queries without writing any custom code. The analyzer runs a Semantic Extractor on the raw user query. Its output is then processed by the search logic as a new UQL query, in the usual way.
Configure Semantic Query Analysis
The configuration consists of:
• a semantic extractor's compiled resource
• an optional list of semantic processors, which run before the semantic extractor
• options to set the analyzer's behavior
Add a Semantic Query Analysis to Your Searcher
1. Go to Administration Console > Search Logics > Query Language tab.
2. In Semantic query analysis, select Enable.
3. In Resource directory, create a new semantic resource extractor or select an existing one.
4. In Semantic processors, select the semantic processors (if any) linked to the semantic resource extractor selected previously.
5. In Language, select the languages for which the analyzer is activated.
6. In Unused word policy, select one of the following options:
◦ mandatory: all query words that have not been used by the matching rule to build the output are added to the output query using an AND.
◦ optional: all query words that have not been used by the matching rule to build the output are added to the output query using an OPT.
◦ remove: all query words that have not been used by the matching rule to build the output are discarded.
You can also set the unused word policy at the rule level:
a. Select Single match mode.
b. In the Business Console, edit your rule and select the appropriate option for Unused word policy. This rule setting overrides the Unused word policy set at the Semantic Query Analysis level.
7. In Debug log file, enter the path to an HTML file for debug purposes.
8. Click Save and apply the configuration.
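The three Unused word policy options can be illustrated with a minimal Python sketch. It is not CloudView code: the function name `apply_unused_word_policy` and the way the output string is assembled are assumptions made for illustration, and operator precedence and parenthesization are not modeled.

```python
def apply_unused_word_policy(rule_output, unused_words, policy):
    """Combine a rule's output with the query words the rule did not use.

    Hypothetical helper: it only mirrors the behavior described for the
    mandatory, optional, and remove policies.
    """
    if policy == "remove" or not unused_words:
        # Unused words are discarded (or there are none to add).
        return rule_output
    if policy == "mandatory":
        # Every unused word is required, joined with AND.
        return " AND ".join(list(unused_words) + [rule_output])
    if policy == "optional":
        # Every unused word is attached with OPT.
        return rule_output + "".join(f" OPT {word}" for word in unused_words)
    raise ValueError(f"unknown policy: {policy!r}")
```

For example, with the rule output `price<10` and the unused words `usb` and `flash`, the mandatory policy yields `usb AND flash AND price<10`, while the remove policy yields `price<10` only.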
Configure Query Processing
You can add a comma-separated list of query names that defines which parts of the query are processed. The default value is _default_, which means that processing is applied only to the query entered by the user, and not to refinements or restrictions applied by query expansion.
1. Open the API Console.
2. Click Manage.
3. Select search in the list.
4. In Configuration, select setSearchLogicList.
5. Search for queryNames.
6. Replace _default_ with the list of query names.
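As a rough illustration of how a comma-separated queryNames value selects the parts of a query to process, consider the Python sketch below. The helper names and the `refine` query name are hypothetical. Only `_default_` comes from the documentation above.

```python
def parse_query_names(value="_default_"):
    """Split a comma-separated queryNames value into a set of names."""
    return {name.strip() for name in value.split(",") if name.strip()}


def should_process(query_name, configured="_default_"):
    """Return True if the named query part is listed in queryNames."""
    return query_name in parse_query_names(configured)
```

With the default value, `should_process("_default_")` is true and `should_process("refine")` is false. Setting the value to `_default_, refine` would make both true.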
Example 1: Define "Cheap" for an E-Commerce Site
Let us say that you have an online store. You have an index of all product names and characteristics, including a numeric price field, and you want to make sense of queries such as cheap USB flash drive or inexpensive USB flash drive.
With the following configuration, you can rewrite such queries to USB AND flash AND drive AND price<10.
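Independently of the CloudView configuration, the intended rewrite can be sketched in a few lines of Python. This is an illustration only: the function `rewrite_cheap_query` and the word list are assumptions, and the real work is done by the semantic extractor rules, not by code like this.

```python
# Words that should trigger the price restriction (assumed list).
CHEAP_WORDS = {"cheap", "inexpensive"}
# Clause appended when one of those words is found.
PRICE_CLAUSE = "price<10"


def rewrite_cheap_query(query):
    """Rewrite 'cheap ...' queries into an AND query with a price clause."""
    words = query.split()
    if not any(word.lower() in CHEAP_WORDS for word in words):
        # No rule matches: leave the query untouched.
        return query
    # Keep the remaining words as mandatory terms (mandatory policy).
    kept = [word for word in words if word.lower() not in CHEAP_WORDS]
    return " AND ".join(kept + [PRICE_CLAUSE])


print(rewrite_cheap_query("cheap USB flash drive"))
```

Both `cheap USB flash drive` and `inexpensive USB flash drive` produce the same rewritten query, since either word triggers the price clause.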
Create a Resource File from the Administration Console
1. In the Resource directory field, click Create new and enter the name of your resource file.
2. Then you can define its content with the Business Console. For more information, see the Exalead CloudView Business Console User's Guide.
Compile a Resource File from the Command Line
1. Create a semantic extractor configuration as shown below:
Now that you understand the basic principle of semantic query analysis, let us look at an example where you want to define different criteria for "cheap", depending on the product type. It does not make sense for a query for "cheap tv set" to search for TVs with a price of 10€ or lower.
The solution is to create text entities for products, and to associate each product with its own definition of "cheap".
Configure the Semantic Extractor
1. Create a semantic extractor configuration as shown below:
◦ name=threshold defines $(threshold) as the product annotation's display form (the price threshold)
◦ original=text defines $(text) as the input text annotated by product (the product name as entered by the user)
Note: We could have built an ontology instead of writing TextEntities, and added an ontology matcher in the SemanticQueryAnalysisConfig, thereby externalizing the information.
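To make the role of $(text) and $(threshold) more concrete, here is a minimal Python sketch of the substitution. It is not the extractor's implementation: the product table, the output template, and the function names are assumptions chosen for illustration. Only the variable names and the document_price field come from this example.

```python
import re

# Hypothetical product entities: matched text -> price threshold.
PRODUCT_THRESHOLDS = {"tv set": 100, "usb flash drive": 10}
# Assumed output template using the two variables described above.
OUTPUT_TEMPLATE = "$(text) AND document_price<$(threshold)"


def expand_template(template, variables):
    """Replace each $(name) placeholder with its value."""
    return re.sub(r"\$\((\w+)\)", lambda m: str(variables[m.group(1)]), template)


def rewrite(query):
    """Rewrite 'cheap <product>' using the product's own threshold."""
    words = query.split()
    if not any(word.lower() == "cheap" for word in words):
        return None
    # The product name as entered by the user, without "cheap".
    text = " ".join(word for word in words if word.lower() != "cheap")
    threshold = PRODUCT_THRESHOLDS.get(text.lower())
    if threshold is None:
        return None  # unknown product: no rule matches
    return expand_template(OUTPUT_TEMPLATE, {"text": text, "threshold": threshold})
```

Here `rewrite("cheap tv set")` uses the threshold 100 while `rewrite("cheap usb flash drive")` uses 10, which is the per-product behavior this example aims for.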
Configure Semantic Query Analysis
We now want to remove the words matched by the text entity "cheap", since they are not referenced in the output.
1. Set the Unused word policy parameter to remove to discard them:
The resulting syntax tree for "cheap tv set" is:
AND
  AND NATURAL
    ALPHA: text: tv k=2 (form: normalized)
    ALPHA: text: set k=2 (form: normalized)
  NUM: document_price OP_LT 100