OntologyMatcher

Name	Type	Default value	Description
name	string		Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
contexts	string		Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
dataModelState	string		Indicates whether this semantic processor is managed by a data model. Possible values: null, auto, customized, error. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
dataModelClass	string		If dataModelState is either "auto" or "customized", this is the name of the DataModelClass that generated this DocumentProcessor.
dataModelProperty	string		If dataModelState is either "auto" or "customized", this is the name of the DataModelProperty that generated this DocumentProcessor.
disabled	boolean		Disables the DocumentProcessor.
enableApproxMatching	boolean		Enables approximate matching in the ontology. Approximate matching uses the Damerau-Levenshtein edit distance.
minWordSizeForDist1	int	3	Minimum number of characters in a token to allow a Damerau-Levenshtein distance of 1.
minWordSizeForDist2	int	8	Minimum number of characters in a token to allow a Damerau-Levenshtein distance of 2.
resourceDir	string		URL of the directory containing the ontology (data://, file:// or resource://).
restrictLanguage	boolean	True	Keeps only the expressions added with language == Language.XX or with the document language. For example, if the ontology contains an expression added with language=En and restrictLanguage is set to true, that expression is extracted only from English documents.
keepLongestMatch	boolean	True	Keeps only the longest match. For example, if you have 5 tokens ('a', 'b', 'c', 'd', 'e') and 4 annotations 'a', 'a-c', 'b-c-d' and 'd-e', this option will only keep 'b-c-d' and remove all other annotations.
keepLongestMatchInterTag	boolean		Keeps only the longest match (tag independent). For example, if you have 5 tokens ('a', 'b', 'c', 'd', 'e') and 4 annotations 'a', 'a-c', 'b-c-d' and 'd-e', this option will only keep 'b-c-d' and remove all other annotations.
tokenizeAnnotations	boolean		If you have multi-token annotations (such as a "super market" annotation on the token "supermarket"), this option automatically subtokenizes "supermarket" into "super" and "market" and keeps the original annotations. If you enable this option, keepLongestMatch and keepLongestMatchInterTag are set to true.
annotationsToIgnore	string		Comma-separated list of annotations to ignore. This feature lets you define words or expressions to skip when recognizing entries of this ontology. For example, if you add (1) the expressions "of" and "the" with the tag "toIgnore" in ontology A, and (2) the expression "website embassy" in ontology B with annotationsToIgnore="toIgnore", you will be able to match "website of the embassy", "website of embassy" and "website embassy".
ignoreSpaces	boolean		If your ontology was compiled with matchOnSeparators=false, this allows 'lemonde' to retrieve 'le monde' or 'le monde' to retrieve 'lemonde'. If your ontology was compiled with matchOnSeparators=true, this allows 'le monde' to retrieve 'le monde'.
annotationPrefix	string		A prefix to add to each annotation tag. For example, if the package of the entry matched in the ontology is "exalead.location.country" and the annotationPrefix is "myOntology_", an annotation will be added with the tag "myOntology_exalead.location.country".
trustLevelBasedDedup	boolean		Keeps only the annotation with the highest trust level when several entries from a package match the same text chunk.
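
The interaction between enableApproxMatching, minWordSizeForDist1 and minWordSizeForDist2 can be illustrated with a small sketch. This is not the OntologyMatcher implementation. It assumes that a token whose length reaches a threshold is allowed the corresponding edit distance (the inclusive comparison is an assumption), and it uses the optimal string alignment variant of the Damerau-Levenshtein distance, since the table does not say which variant is used. The function names are hypothetical.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein).

    Counts insertions, deletions, substitutions and transpositions of two
    adjacent characters. Assumption: the product may use another variant.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all characters of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all characters of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def max_allowed_distance(token: str,
                         min_word_size_for_dist1: int = 3,
                         min_word_size_for_dist2: int = 8) -> int:
    """Largest edit distance tolerated for a token of this length.

    Defaults mirror the table (3 and 8). Treating the thresholds as
    inclusive is an assumption made for this sketch.
    """
    if len(token) >= min_word_size_for_dist2:
        return 2
    if len(token) >= min_word_size_for_dist1:
        return 1
    return 0


def approx_match(token: str, entry: str) -> bool:
    """True if the token is close enough to the ontology entry."""
    return osa_distance(token, entry) <= max_allowed_distance(token)
```

Under these assumptions, a 7-character token such as "embassy" tolerates one edit, so approx_match("embasy", "embassy") holds, while a 2-character token such as "of" must match exactly.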