The Ontology Matcher is a semantic processor used for detecting expressions, or text forms, in documents.
Detected text forms are then tagged with an annotation name and a corresponding display form. If you map this annotation to a category field, its display forms appear as a new category in your list of facets. This category gathers all the documents containing the text forms you linked to the display form.
Dependencies
If the matching rules for this processor depend on phonetic, stem, or lemma matching, you must add the corresponding processor above this one in the pipeline.
For example, if your rules require phonetic forms, place the Phonetizer processor above this processor in the analysis pipeline.
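The ordering constraint above can be checked mechanically. The following is a minimal sketch, not a CloudView API: the function name, the representation of the pipeline as a list of processor names, and the "phonetic" level name are assumptions made for illustration. The Normalizer and Phonetizer dependencies are the ones stated in this document.

```python
# Hypothetical mapping from a matching level to the processor it relies on.
# "phonetic" is an assumed level name; the Normalizer entries follow the
# level table in this document.
REQUIRED_PROCESSOR = {
    "lowercase": "Normalizer",
    "normalized": "Normalizer",
    "phonetic": "Phonetizer",
}


def missing_dependencies(pipeline, levels_used, matcher="Ontology Matcher"):
    """Return the processors that should sit above the matcher but do not."""
    position = pipeline.index(matcher)
    above = set(pipeline[:position])
    needed = {
        REQUIRED_PROCESSOR[level]
        for level in levels_used
        if level in REQUIRED_PROCESSOR
    }
    return sorted(needed - above)


# The Phonetizer is placed below the matcher here, so it is reported.
pipeline = ["Normalizer", "Ontology Matcher", "Phonetizer"]
print(missing_dependencies(pipeline, {"exact", "normalized", "phonetic"}))
# prints ['Phonetizer']
```

An empty list means every processor the rules depend on already runs before the Ontology Matcher.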
Rules for Ontology Matching
The Ontology Matcher detects expressions. Each expression belongs to an annotation package, which can be seen as a namespace.
• Results are tagged by annotations spanning the range of tokens that has been matched.
• Each annotation package creates an annotation tag.
• Each tagged expression can have several text forms.
The rules for the Ontology Matcher are defined in an XML file saved in a resource directory. They are specified during the configuration of the Ontology Matcher semantic processor in the analysis configuration.
Sample Ontology Matcher XML File
In this example, we use the Ontology Matcher to create an annotation with the value my.annotation on any document that contains a reference to the brand Coca-Cola.
These Ontology rules would create the following annotations:
Input: "Always Coca-Cola..." Result: The annotation created is displayForm="Coca-Cola" tag="my.annotation" level="exact" distance="0"
Input: "always coca-cola..." Result: No annotation is created since the match level is set to exact and "Coca-Cola" != "coca-cola"
Input: "the famous albert e." (lang=en) Result: The annotation created is displayForm="Albert Einstein" tag="my.annotation.subannotation" level="normalized" distance="0"
Input: "le célèbre albert e." (lang=fr) Result: No annotation is created since the tokens are tagged as French and the expression is restricted to English.
Input: "recherche et développement" Result: The annotation created is displayForm="Recherche et Développement" tag="my.second.annotation" level="lowercase" distance="1"
Input: "R&D" Result: The annotation created is displayForm="Recherche et Développement" tag="my.second.annotation" level="normalized" distance="3"
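The results listed above can be reproduced with a small model of the matching levels. This is a sketch under several assumptions, not the matcher's implementation: the rule table is inferred from the listed results, the normalized form is approximated as lowercase plus accent stripping, matching is done on substrings rather than on tokens, and the names `RULES`, `LEVELS`, and `annotate` are hypothetical.

```python
import unicodedata


def normalized_form(text):
    # Assumption: normalization lowercases and strips accents.
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")


# Each level maps a string to the form that is compared.
LEVELS = {
    "exact": lambda text: text,
    "lowercase": lambda text: text.lower(),
    "normalized": normalized_form,
}

# (tag, display form, language or None, [(form value or None, level, distance)])
# Inferred from the sample results; a form value of None falls back to the
# display form, as described for text forms without a value.
RULES = [
    ("my.annotation", "Coca-Cola", None, [(None, "exact", 0)]),
    ("my.annotation.subannotation", "Albert Einstein", "en",
     [("Albert E.", "normalized", 0)]),
    ("my.second.annotation", "Recherche et Développement", None,
     [(None, "lowercase", 1), ("R&D", "normalized", 3)]),
]


def annotate(text, lang=None):
    """Return (display form, tag, level, distance) for every matching form."""
    results = []
    for tag, display, rule_lang, forms in RULES:
        # A language-restricted expression only sees text in that language.
        if rule_lang is not None and rule_lang != lang:
            continue
        for value, level, distance in forms:
            to_form = LEVELS[level]
            needle = to_form(value if value is not None else display)
            if needle in to_form(text):
                results.append((display, tag, level, distance))
    return results


print(annotate("R&D"))
# prints [('Recherche et Développement', 'my.second.annotation', 'normalized', 3)]
```

Calling `annotate("always coca-cola...")` returns an empty list, because the only Coca-Cola form in this sketch uses the exact level.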
Ontology Rules Syntax
An annotation package is characterized by a path and can contain:
• Subannotations whose paths are concatenated using the "." separator.
An expression, or display form, is characterized by a value and an optional language. When a language is specified, only tokens of this language are used for detection. For example,
A display form can have one or more matches with text forms. A text form contains a value, a normalization level, and an optional distance. For example:
<Form value="Albert E." level="normalized" />
The distance attribute can be used for scoring the annotation depending on which alternative text form has matched. When the text form’s value is not specified, the display form for the associated expression is used instead.
Available Matching Normalization Levels
The level attribute specifies which form must be matched. Certain levels rely on other semantic processors, which you must place above the Ontology Matcher in the analysis pipeline.
Ontology Matcher Level Attribute and Possible Values
Level | Description
exact | Matches using the exact form (the token).
lowercase | Matches using the lowercase form. Requires a Normalizer.
normalized | Matches using the normalized form. Requires a Normalizer.
lemmaSingularMasculine | Matches using the singular/masculine/normalized form.
b. Open the cvadmin command tool and run the following command.
cvconsole cvadmin> linguistic compile-ontology input="<PATH TO ONTOLOGY XML FILE>" output="<PATH TO OUTPUT DIR>"
3. In the Administration Console, select Index > Data processing > Pipeline name > Semantic Processors.
4. Drag the Ontology Matcher to the required position in the Current Processors list, expand it and:
a. For Resource directory, enter the path to the compiled ontology file.
b. Select the parameters Restrict language and Keep longest match.
For more information about available parameters, see the Exalead CloudView XML Configuration Reference Guide.
Map an Annotation to a Category Facet
Once your Ontology Matcher resource file is defined, you can map an annotation to a category field. Its display forms then appear as a new category in your list of facets. This category gathers all the documents containing the text forms you linked to the display form.
1. In the Administration Console, select Index > Data processing > Pipeline name > Semantic Processors.
2. On the Mappings tab, click Add mapping source.
a. Name: Enter the annotation name that you created in the rules file, for example, my.annotation for the sample file above.
b. Type: select Annotation.
3. (Optional) In the Input from field of the mapping, restrict the mapping so that it only applies to a subset of comma-separated metas (also known as contexts) associated with this annotation.
4. Click Add mapping target and add a category target.
5. Modify the category-mapping properties.
For example, the Create categories under this root property could be modified to Top/Product to contain a Coca-Cola category (corresponding to the Coca-Cola display form).
6. Go to Search > Search Logics > Your_Search_Logic > Facets and add a category group.
a. Click Add facet and enter the name to display in the Mashup UI Refinements panel.
b. For Root, enter the value you entered for Create categories under this root in step 5, for example, Top/Product.
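To make the effect of this mapping concrete, the sketch below groups documents under a category path built from the root and the display form. It is an illustration only: the `build_facet` function, the in-memory document structure, and the Top/Product/Coca-Cola path layout are assumptions, not the product's internal representation.

```python
from collections import defaultdict


def build_facet(annotated_docs, annotation_name, root):
    """Group document ids under root/<display form> for one annotation name."""
    facet = defaultdict(list)
    for doc_id, annotations in annotated_docs.items():
        for tag, display_form in annotations:
            if tag != annotation_name:
                continue
            # Assumed layout: the display form becomes a child of the root.
            path = f"{root}/{display_form}"
            if doc_id not in facet[path]:
                facet[path].append(doc_id)
    return dict(facet)


# Hypothetical documents with the annotations produced by the matcher.
docs = {
    "doc1": [("my.annotation", "Coca-Cola")],
    "doc2": [("my.second.annotation", "Recherche et Développement")],
    "doc3": [("my.annotation", "Coca-Cola"), ("my.annotation", "Coca-Cola")],
}
print(build_facet(docs, "my.annotation", "Top/Product"))
# prints {'Top/Product/Coca-Cola': ['doc1', 'doc3']}
```

Each document is listed once per category, however many times the text form occurs in it.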