OntologyMatcher (EXALEAD CloudView Custom code SDK)

java.lang.Object
- com.exalead.gni.GNIObject
  - com.exalead.mot.core.Processor
    - com.exalead.mot.components.OntologyMatcher

```
public class OntologyMatcher
extends Processor
```
Ontology matcher: matches expressions from an ontology.

Field Summary

Fields
Modifier and Type	Field and Description
`boolean`	`addReferencesForIgnoredTags` Add references for ignored tags.
`java.lang.String`	`annotationPrefix` Prefix to prepend to the annotations' tags.
`boolean`	`enableApproxMatching` When set to true, enables approximate matching in the ontology. Approximate matching uses the Damerau-Levenshtein edit distance.
`boolean`	`ignoreSpaces` If your ontology was compiled with matchOnSeparators = false, this allows 'lemonde' to retrieve the 'le monde' entry, or 'le monde' to retrieve 'lemonde'. If your ontology was compiled with matchOnSeparators = true, this allows 'le monde' to retrieve 'le monde'.
`java.lang.String`	`matchAgainst` Annotation the ontology terms are matched against.
`int`	`minWordSizeForDist1` Minimum number of characters in a token to enable a Damerau-Levenshtein distance of 1.
`int`	`minWordSizeForDist2` Minimum number of characters in a token to enable a Damerau-Levenshtein distance of 2.
`boolean`	`packageLevelMatchDedup` When deduping results, try to keep only one match per package, i.e. consider only the annotation name and not the annotation value when comparing results for equivalence.
`java.lang.String`	`resource` The processor resource.
`boolean`	`restrictLanguage` When set to true, keeps only expressions added with language == Language.XX or with the document language. For example, if the ontology contains an expression added with language=En, it is extracted only from English documents when restrictLanguage is set to true.
`boolean`	`splitHandling` Matches a term even if there are no blanks between alphanumeric tokens in the text (the spellchecker may generate two tokens from one by splitting it up without inserting a blank). When enabled, the text [air][france] matches the entry [air][ ][france].
`boolean`	`suffixApproxMatching` When set to true, all approximate matches have a tag suffixed by '.approx'. If you disable this suffix, the trustLevel of exact entries will all be set to 100 (this overrides the trustLevel in the original XML file).
`java.util.List<java.lang.String>`	`tagsToIgnore` Sets the list of tags to be ignored. This feature lets you define a list of words/expressions to ignore in the recognition of this ontology. For example, if you add the expressions "of" and "the" with the tag "toIgnore" in ontology A, then add the expression "website embassy" in ontology B with tagsToIgnore=["toIgnore"], you will be able to match "website of the embassy", "website of embassy" and "website embassy". WARNING: for the moment this option is not compatible with ontologies compiled without matchOnSeparators=false.
`boolean`	`trustLevelBasedDedup` Keeps only the annotations with the highest trust level when several overlap.
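
To make the interaction between `enableApproxMatching`, `minWordSizeForDist1` and `minWordSizeForDist2` more concrete, here is a minimal, self-contained sketch. It is not the SDK's native implementation. The class `ApproxMatchSketch`, the threshold values 4 and 8, the choice of the optimal string alignment variant of Damerau-Levenshtein, and the rule that the largest satisfied threshold decides the allowed distance are all assumptions made for illustration.

```java
// Illustrative sketch only; none of these names belong to the CloudView SDK.
final class ApproxMatchSketch {

    // Hypothetical threshold values; this page does not state the real defaults.
    static final int MIN_WORD_SIZE_FOR_DIST_1 = 4;
    static final int MIN_WORD_SIZE_FOR_DIST_2 = 8;

    /** Damerau-Levenshtein distance (optimal string alignment variant). */
    static int distance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                // Deletion, insertion, substitution.
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
                // Transposition of two adjacent characters.
                if (i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
                }
            }
        }
        return d[a.length()][b.length()];
    }

    /** Assumed rule: longer tokens tolerate a larger edit distance. */
    static int maxDistanceFor(String token) {
        if (token.length() >= MIN_WORD_SIZE_FOR_DIST_2) return 2;
        if (token.length() >= MIN_WORD_SIZE_FOR_DIST_1) return 1;
        return 0;
    }

    static boolean approxMatches(String token, String entry) {
        return distance(token, entry) <= maxDistanceFor(token);
    }

    public static void main(String[] args) {
        System.out.println(distance("ambassy", "embassy"));      // prints 1
        System.out.println(approxMatches("ambassy", "embassy")); // prints true
        System.out.println(approxMatches("teh", "the"));         // prints false (token too short)
    }
}
```

Under these assumed thresholds, a three-letter token such as "teh" never matches approximately even though it is one transposition away from "the", which is the kind of false positive the minimum word sizes are there to limit.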

Fields inherited from class com.exalead.mot.core.Processor
fields, name

Constructor Summary

Constructors
Constructor and Description
`OntologyMatcher(java.lang.String name, java.lang.String resource, java.lang.String fields)`
`OntologyMatcher(java.lang.String name, java.lang.String resource, java.lang.String matchAgainstAnnotation, java.lang.String fields, boolean restrictToLang, boolean addReferencesForIgnoredTags, boolean enableApproxMatching, boolean suffixApproxMatching, int minWordSizeForDist1, int minWordSizeForDist2, boolean ignoreSpaces, java.lang.String annotationPrefix)` Initialize a features extractor

Method Summary

Modifier and Type	Method and Description
`protected void`	`addTagToIgnore(java.lang.String tag)` Add a tag to ignore
`protected void`	`finalize()`
`static java.lang.String`	`getApproxTagSuffix()`
`void`	`init(java.lang.String name, java.lang.String[] fields)` Initialize the processor
`void`	`init(java.lang.String name, java.lang.String resource, java.lang.String matchAgainstAnnotation, java.lang.String[] fields, boolean restrictToLang, boolean addReferencesForIgnoredTags, boolean enableApproxMatching, boolean suffixApproxMatching, int minWordSizeForDist1, int minWordSizeForDist2, boolean ignoreSpaces, java.lang.String annotationPrefix, boolean trustLevelDedup, boolean splitHandling, boolean packageLevelMatchDedup)`
`protected void`	`initNative(java.lang.String name, java.lang.String resource, java.lang.String matchAgainstAnnotation, java.lang.String[] fields, boolean restrictToLang, boolean addReferencesForIgnoredTags, boolean enableApproxMatching, boolean suffixApproxMatching, int minWordSizeForDist1, int minWordSizeForDist2, boolean ignoreSpaces, java.lang.String annotationPrefix, boolean trustLevelDedup, boolean splitHandling, boolean packageLevelMatchDedup)`

Methods inherited from class com.exalead.mot.core.Processor
checkResource, destroy, getName, init

Methods inherited from class com.exalead.gni.GNIObject
printInternalCppPointer

Methods inherited from class java.lang.Object
clone, equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - resource
```
public java.lang.String resource
```
    The processor resource
  - enableApproxMatching
```
public boolean enableApproxMatching
```
    When set to true, enables approximate matching in the ontology. Approximate matching uses the Damerau-Levenshtein edit distance.
  - suffixApproxMatching
```
public boolean suffixApproxMatching
```
    When set to true, all approximate matches have a tag suffixed by '.approx'. If you disable this suffix, the trustLevel of exact entries will all be set to 100 (this overrides the trustLevel in the original XML file).
  - minWordSizeForDist1
```
public int minWordSizeForDist1
```
    Minimum number of characters in a token to enable a Damerau-Levenshtein distance of 1.
  - minWordSizeForDist2
```
public int minWordSizeForDist2
```
    Minimum number of characters in a token to enable a Damerau-Levenshtein distance of 2.
  - ignoreSpaces
```
public boolean ignoreSpaces
```
    If your ontology was compiled with matchOnSeparators = false, this allows 'lemonde' to retrieve the 'le monde' entry, or 'le monde' to retrieve 'lemonde'. If your ontology was compiled with matchOnSeparators = true, this allows 'le monde' to retrieve 'le monde'.
  - splitHandling
```
public boolean splitHandling
```
    Matches a term even if there are no blanks between alphanumeric tokens in the text (the spellchecker may generate two tokens from one by splitting it up without inserting a blank). When enabled, the text [air][france] matches the entry [air][ ][france].
  - packageLevelMatchDedup
```
public boolean packageLevelMatchDedup
```
    When deduping results, try to keep only one match per package, i.e. consider only the annotation name and not the annotation value when comparing results for equivalence.
  - restrictLanguage
```
public boolean restrictLanguage
```
    When set to true, keeps only expressions added with language == Language.XX or with the document language. For example, if the ontology contains an expression added with language=En, it is extracted only from English documents when restrictLanguage is set to true.
  - addReferencesForIgnoredTags
```
public boolean addReferencesForIgnoredTags
```
    Add references for ignored tags.
  - matchAgainst
```
public java.lang.String matchAgainst
```
    Annotation the ontology terms are matched against.
  - annotationPrefix
```
public java.lang.String annotationPrefix
```
    Prefix to prepend to the annotations' tags.
  - tagsToIgnore
```
public java.util.List<java.lang.String> tagsToIgnore
```
    Sets the list of tags to be ignored. This feature lets you define a list of words/expressions to ignore in the recognition of this ontology. For example, if you add the expressions "of" and "the" with the tag "toIgnore" in ontology A, then add the expression "website embassy" in ontology B with tagsToIgnore=["toIgnore"], you will be able to match "website of the embassy", "website of embassy" and "website embassy". WARNING: for the moment this option is not compatible with ontologies compiled without matchOnSeparators=false.
  - trustLevelBasedDedup
```
public boolean trustLevelBasedDedup
```
    Keeps only the annotations with the highest trust level when several overlap.
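
The `trustLevelBasedDedup` description can be pictured with a small stand-alone sketch. Everything here is hypothetical: `Match` is a stand-in type rather than an SDK class, character offsets are assumed to identify overlap, and ties on trust level are resolved by input order, which this page does not specify.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only; not the SDK's dedup logic.
final class TrustLevelDedupSketch {

    /** Hypothetical stand-in for a matched annotation. */
    static final class Match {
        final String tag;
        final int start;      // inclusive character offset
        final int end;        // exclusive character offset
        final int trustLevel;

        Match(String tag, int start, int end, int trustLevel) {
            this.tag = tag;
            this.start = start;
            this.end = end;
            this.trustLevel = trustLevel;
        }

        boolean overlaps(Match other) {
            return start < other.end && other.start < end;
        }
    }

    /** Keeps, among overlapping matches, only the one with the highest trust level. */
    static List<Match> dedup(List<Match> matches) {
        List<Match> sorted = new ArrayList<>(matches);
        // Highest trust first; List.sort is stable, so ties keep their input order.
        sorted.sort(Comparator.comparingInt((Match m) -> m.trustLevel).reversed());

        List<Match> kept = new ArrayList<>();
        for (Match candidate : sorted) {
            boolean clash = false;
            for (Match k : kept) {
                if (candidate.overlaps(k)) {
                    clash = true;
                    break;
                }
            }
            if (!clash) kept.add(candidate);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Match> in = new ArrayList<>();
        in.add(new Match("city", 0, 5, 80));
        in.add(new Match("person", 0, 5, 100));
        in.add(new Match("org", 10, 15, 50));
        for (Match m : dedup(in)) {
            System.out.println(m.tag + " " + m.trustLevel);
        }
    }
}
```

In this sketch the "city" match is dropped because it covers the same span as the higher-trust "person" match, while "org" survives because it overlaps nothing.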
- Constructor Detail
  - OntologyMatcher
```
public OntologyMatcher(java.lang.String name,
                       java.lang.String resource,
                       java.lang.String matchAgainstAnnotation,
                       java.lang.String fields,
                       boolean restrictToLang,
                       boolean addReferencesForIgnoredTags,
                       boolean enableApproxMatching,
                       boolean suffixApproxMatching,
                       int minWordSizeForDist1,
                       int minWordSizeForDist2,
                       boolean ignoreSpaces,
                       java.lang.String annotationPrefix)
```
    Initialize a features extractor
    
    Parameters:
    
    name - Its name
    
    resource - The associated resource name
    
    matchAgainstAnnotation - annotation the ontology terms are matched against
    
    fields - The list of fields on which it's active
    
    restrictToLang - Keep only expressions added with language == Language.XX or with the document language
    
    addReferencesForIgnoredTags - Add references for ignored tags
    
    enableApproxMatching - Approximate matching using the Damerau-Levenshtein edit distance
    
    suffixApproxMatching - Approximate matches have a tag suffixed by '.approx'
    
    minWordSizeForDist1 - Minimum number of characters in a token to enable a Damerau-Levenshtein distance of 1
    
    minWordSizeForDist2 - Minimum number of characters in a token to enable a Damerau-Levenshtein distance of 2
    
    ignoreSpaces - Ignore spaces when matching
    
    annotationPrefix - Prefix to add to annotations' tag
  - OntologyMatcher
```
public OntologyMatcher(java.lang.String name,
                       java.lang.String resource,
                       java.lang.String fields)
```
- Method Detail
  - init
```
public void init(java.lang.String name,
                 java.lang.String resource,
                 java.lang.String matchAgainstAnnotation,
                 java.lang.String[] fields,
                 boolean restrictToLang,
                 boolean addReferencesForIgnoredTags,
                 boolean enableApproxMatching,
                 boolean suffixApproxMatching,
                 int minWordSizeForDist1,
                 int minWordSizeForDist2,
                 boolean ignoreSpaces,
                 java.lang.String annotationPrefix,
                 boolean trustLevelDedup,
                 boolean splitHandling,
                 boolean packageLevelMatchDedup)
```
  - init
```
public void init(java.lang.String name,
                 java.lang.String[] fields)
```
    Initialize the processor
    
    Specified by:
    
    init in class Processor
    
    Parameters:
    
    name - Its name
    
    fields - The list of fields on which it's active
  - getApproxTagSuffix
```
public static java.lang.String getApproxTagSuffix()
```
  - initNative
```
protected void initNative(java.lang.String name,
                          java.lang.String resource,
                          java.lang.String matchAgainstAnnotation,
                          java.lang.String[] fields,
                          boolean restrictToLang,
                          boolean addReferencesForIgnoredTags,
                          boolean enableApproxMatching,
                          boolean suffixApproxMatching,
                          int minWordSizeForDist1,
                          int minWordSizeForDist2,
                          boolean ignoreSpaces,
                          java.lang.String annotationPrefix,
                          boolean trustLevelDedup,
                          boolean splitHandling,
                          boolean packageLevelMatchDedup)
```
  - addTagToIgnore
```
protected void addTagToIgnore(java.lang.String tag)
```
    Add a tag to ignore
  - finalize
```
protected void finalize()
```
    Overrides:
    
    finalize in class java.lang.Object
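
The `tagsToIgnore` behaviour described under Field Detail ("website of the embassy" matching the entry "website embassy") can be sketched as follows. This is only an illustration of the idea, not the matcher's algorithm: tokens are assumed to be plain lowercase words, a `Set<String>` stands in for the expressions carrying an ignored tag, and `TagsToIgnoreSketch` is a hypothetical name.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; none of these names belong to the CloudView SDK.
final class TagsToIgnoreSketch {

    /** True if the entry tokens occur in order in the text, separated only by ignorable tokens. */
    static boolean matches(List<String> text, List<String> entry, Set<String> ignorable) {
        if (entry.isEmpty()) return false;
        for (int start = 0; start < text.size(); start++) {
            if (matchesAt(text, start, entry, ignorable)) return true;
        }
        return false;
    }

    private static boolean matchesAt(List<String> text, int start,
                                     List<String> entry, Set<String> ignorable) {
        int t = start;
        for (int e = 0; e < entry.size(); e++) {
            // Between entry tokens (never before the first one), skip ignorable text tokens.
            while (e > 0 && t < text.size()
                    && ignorable.contains(text.get(t))
                    && !text.get(t).equals(entry.get(e))) {
                t++;
            }
            if (t >= text.size() || !text.get(t).equals(entry.get(e))) return false;
            t++;
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> ignorable = new HashSet<>(Arrays.asList("of", "the"));
        List<String> entry = Arrays.asList("website", "embassy");
        System.out.println(matches(Arrays.asList("website", "of", "the", "embassy"), entry, ignorable));
        System.out.println(matches(Arrays.asList("website", "for", "the", "embassy"), entry, ignorable));
    }
}
```

The second call fails to match because "for" is not in the ignorable set, so only words explicitly tagged as ignorable can sit between the tokens of an entry.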
