Ontology Matcher (Resource-Based)

The Ontology Matcher is a semantic processor that detects expressions, or text forms, in documents.

Detected text forms are then tagged with an annotation name and a corresponding display form. If you map this annotation to a category field, its display forms appear as a new category in your list of facets. That category gathers all the documents containing the text forms you linked to the display form.

Dependencies

If the matching rules for this processor depend on phonetic, stem, or lemma matching, you must add the corresponding processor above this one in the pipeline.

For example, if your rules require phonetic forms, place the Phonetizer processor above this processor in the analysis pipeline.

Rules for Ontology Matching

The Ontology Matcher detects expressions. Each expression belongs to an annotation package, which can be seen as a namespace.

• Results are tagged by annotations spanning the range of tokens that has been matched.

• Each annotation package creates an annotation tag.

• Each tagged expression can have several text forms.

The rules for the Ontology Matcher are defined in an XML file saved in a resource directory. You specify this resource directory when you configure the Ontology Matcher semantic processor in the analysis configuration.

Sample Ontology Matcher XML File

In this example, the Ontology Matcher creates an annotation tagged my.annotation on a document that contains a reference to the brand Coca-Cola.

These Ontology rules would create the following annotations:

Input: "Always Coca-Cola..." Result: The annotation created is displayForm="Coca-Cola" tag="my.annotation" level="exact" distance="0"

Input: "always coca-cola..." Result: No annotation is created since the match level is set to exact and "Coca-Cola" != "coca-cola"

Input: "the famous albert e." (lang=en) Result: The annotation created is displayForm="Albert Einstein" tag="my.annotation.subannotation" level="normalized" distance="0"

Input: "le célèbre albert e." (lang=fr) Result: No annotation is created since the token is tagged as French.

Input: "recherche et développement" Result: The annotation created is displayForm="Recherche et Développement" tag="my.second.annotation" level="lowercase" distance="1"

Input: "R&D" Result: The annotation created is displayForm="Recherche et Développement" tag="my.second.annotation" level="normalized" distance="3"
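
The results listed above can be reproduced with a rough Python model. This is a sketch under stated assumptions, not CloudView code: the rule list is inferred from the listed results rather than taken from a rules file, a substring search stands in for token matching, the normalized level is assumed to mean lowercasing plus accent removal, and Rule, RULES, and annotate are hypothetical names.

```python
import unicodedata
from dataclasses import dataclass
from typing import Optional


def normalize(text: str) -> str:
    """Assumed normalization: lowercase, then strip accents."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(c for c in decomposed if not unicodedata.combining(c))


# How each modeled level transforms text before comparison.
LEVELS = {
    "exact": lambda s: s,
    "lowercase": str.lower,
    "normalized": normalize,
}


@dataclass
class Rule:
    tag: str
    display: str
    level: str
    distance: int = 0
    value: Optional[str] = None  # unspecified: fall back to the display form
    lang: Optional[str] = None   # unspecified: tokens of any language


# Rules inferred from the listed results (an assumption, not a real file).
RULES = [
    Rule("my.annotation", "Coca-Cola", "exact"),
    Rule("my.annotation.subannotation", "Albert Einstein", "normalized",
         value="albert e.", lang="en"),
    Rule("my.second.annotation", "Recherche et Développement", "lowercase",
         distance=1),
    Rule("my.second.annotation", "Recherche et Développement", "normalized",
         distance=3, value="R&D"),
]


def annotate(text: str, lang: Optional[str] = None) -> list:
    """Return (displayForm, tag, level, distance) for every matching rule."""
    found = []
    for rule in RULES:
        # A rule restricted to a language only sees text of that language.
        if rule.lang is not None and rule.lang != lang:
            continue
        form = LEVELS[rule.level]
        needle = form(rule.value if rule.value is not None else rule.display)
        if needle in form(text):
            found.append((rule.display, rule.tag, rule.level, rule.distance))
    return found


print(annotate("Always Coca-Cola..."))
print(annotate("le célèbre albert e.", lang="fr"))
```

The first call returns the Coca-Cola annotation and the second returns an empty list, because the only rule that could match is restricted to English.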

Ontology Rules Syntax

An annotation package is characterized by a path and can contain:

• Subannotations, whose paths are concatenated using the "." separator.

• A set of expressions to detect.

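For example, a package with the path my.annotation that contains a subannotation with the path subannotation produces the tag my.annotation.subannotation.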
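
The path concatenation can be sketched in Python as follows. The nested dictionary layout and the flatten helper are hypothetical and only illustrate how a subannotation path is appended to its parent path with the "." separator; they do not reflect the XML rule syntax.

```python
def flatten(package: dict, parent: str = "") -> dict:
    """Map each full annotation path to the expressions declared in that package."""
    path = f"{parent}.{package['path']}" if parent else package["path"]
    result = {path: list(package.get("expressions", []))}
    for child in package.get("subannotations", []):
        # Child paths are built on top of the parent's full path.
        result.update(flatten(child, path))
    return result


packages = {
    "path": "my.annotation",
    "expressions": ["Coca-Cola"],
    "subannotations": [
        {"path": "subannotation", "expressions": ["Albert Einstein"]},
    ],
}

for tag, expressions in flatten(packages).items():
    print(tag, expressions)
```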

Display Forms

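An expression, or display form, is characterized by a value and an optional language. When a language is specified, only tokens of this language are used for detection. For example, a display form Albert Einstein restricted to English (lang=en) is not detected in text tagged as French.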

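A display form can have one or more matches with text forms. A text form contains a value, a normalization level, and an optional distance. For example, the display form Recherche et Développement can be matched by the text form R&D at the normalized level with a distance of 3.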

The distance attribute can be used for scoring the annotation depending on which alternative text form has matched. When the text form’s value is not specified, the display form for the associated expression is used instead.
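
The relationship between a display form and its text forms can be modeled as below. This is only an illustrative sketch: the Expression and TextForm classes are hypothetical, the default distance of 0 is an assumption, and the score function is one possible way to turn a distance into a ranking value, not the product's scoring formula.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TextForm:
    level: str
    value: Optional[str] = None  # unspecified: use the display form
    distance: int = 0            # assumed default


@dataclass
class Expression:
    display: str
    lang: Optional[str] = None   # unspecified: tokens of any language
    forms: List[TextForm] = field(default_factory=list)

    def resolved_forms(self):
        """Return (value, level, distance) with the display-form fallback applied."""
        return [
            (f.value if f.value is not None else self.display, f.level, f.distance)
            for f in self.forms
        ]


def score(distance: int) -> float:
    """Hypothetical scoring: a larger distance gives a lower score."""
    return 1.0 / (1 + distance)


rd = Expression(
    display="Recherche et Développement",
    forms=[
        TextForm(level="lowercase", distance=1),
        TextForm(level="normalized", value="R&D", distance=3),
    ],
)

for value, level, distance in rd.resolved_forms():
    print(value, level, distance, score(distance))
```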

Available Matching Normalization Levels

The level attribute specifies which form must be matched. Certain levels rely on other semantic processors, which you must place above the Ontology Matcher in the analysis pipeline.

Ontology Matcher Level Attribute and Possible Values
Level	Description
exact	Matches using exact form (the token).
lowercase	Matches using the lowercase form. Requires a Normalizer.
normalized	Matches using the normalized form. Requires a Normalizer.
lemmaSingularMasculine	Matches using the singular/masculine/normalized form. Requires a Lemmatizer.
stem	Matches using the stemmed form. Requires a Snowball Stemmer.
phonetic	Matches using the phonetic form. Requires a Phonetizer.
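
The dependencies in this table can be checked against a pipeline order with a small helper. The sketch below is hypothetical: it represents the pipeline as a plain list of processor names taken from the table, and missing_dependencies is not a CloudView function.

```python
# Processor required by each level, as listed in the table (None: no dependency).
REQUIRED_PROCESSOR = {
    "exact": None,
    "lowercase": "Normalizer",
    "normalized": "Normalizer",
    "lemmaSingularMasculine": "Lemmatizer",
    "stem": "Snowball Stemmer",
    "phonetic": "Phonetizer",
}


def missing_dependencies(pipeline, levels, matcher="Ontology Matcher"):
    """List processors required by `levels` that are not placed above the matcher."""
    above = pipeline[:pipeline.index(matcher)]
    needed = {REQUIRED_PROCESSOR[level] for level in levels} - {None}
    return sorted(needed - set(above))


pipeline = ["Normalizer", "Ontology Matcher", "Phonetizer"]
print(missing_dependencies(pipeline, ["lowercase", "phonetic"]))
```

Here the Phonetizer is reported as missing because it sits below the Ontology Matcher rather than above it.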

Multilevel Ontology Example

Create the Ontology Matcher Resource File

Create a Resource File from the Administration Console

The most convenient method is to create an empty resource file in the Administration Console and define its content in the Business Console. See Create a Resource File from the Administration Console.

To Compile a Resource File from the Command Line

This procedure describes how to manually create and compile your resources from the command line.

1. Create a rules XML file and save it in a temporary directory. For an example, see Sample Ontology Matcher XML File.

2. Compile the ontology rules XML file:

a. Go to <DATADIR>/bin/

b. Open the cvadmin command tool and run the following command.

cvconsole cvadmin> linguistic compile-ontology input="<PATH TO ONTOLOGY XML FILE>"
output="<PATH TO OUTPUT DIR>"

3. In the Administration Console, select Index > Data processing > Pipeline name > Semantic Processors.

4. Drag the Ontology Matcher to the required position in the Current Processors list, expand it and:

a. For Resource directory, enter the path to the compiled ontology file.

b. Select the parameters Restrict language and Keep longest match.

For more information about available parameters, see the Exalead CloudView XML Configuration Reference Guide.

Map an Annotation to a Category Facet

Once your Ontology Matcher resource file is defined, you can map an annotation to a category field. Its display forms then appear as a new category in your list of facets. That category gathers all the documents containing the text forms you linked to the display form.

1. In the Administration Console, select Index > Data processing > Pipeline name > Semantic Processors.

2. On the Mappings tab, click Add mapping source.

a. Name: Enter the annotation name that you created in the rules file, for example, my.annotation for the sample file above.

b. Type: select Annotation.

3. (Optional) In the Input from field of the mapping, restrict the mapping so that it only applies to a subset of comma-separated metas (also known as contexts) associated with this annotation.

4. Click Add mapping target and add a category target.

5. Modify the category-mapping properties.

For example, the Create categories under this root property could be modified to Top/Product to contain a Coca-Cola category (corresponding to the Coca-Cola display form).

6. Go to Search > Search Logics > Your_Search_Logic > Facets and add a category group.

a. Click Add facet and enter the name to display in the Mashup UI Refinements panel.

b. For Root, enter the value you have entered for Create categories under this root in step 5, for example, Top/Product.

7. Click Apply.
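
The effect of this mapping can be pictured with a short Python sketch. It is a conceptual illustration only: build_categories is a hypothetical helper, and the "root/display form" path layout is an assumption based on the Top/Product example in step 5.

```python
from collections import defaultdict


def build_categories(annotations, root="Top/Product"):
    """Group document ids by '<root>/<display form>' (path layout assumed)."""
    categories = defaultdict(set)
    for doc_id, display_form in annotations:
        categories[f"{root}/{display_form}"].add(doc_id)
    return {path: sorted(ids) for path, ids in categories.items()}


# (document id, display form of the annotation detected in that document)
docs = [("doc1", "Coca-Cola"), ("doc2", "Coca-Cola")]
print(build_categories(docs))
```

Every document in which a text form of Coca-Cola was detected ends up under the same category, whichever text form matched.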