Name | Type | Default value | Description |
---|---|---|---|
name | string | | Name of the Semantic Processor. This name is used only for tracing and debugging purposes. |
contexts | string | | Comma-separated list of the ContextNames of the Document Chunks to which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState | string | | Indicates whether this semantic processor is managed by a data model. Possible values: null, auto, customized, error. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass | string | | If dataModelState is either "auto" or "customized", the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty | string | | If dataModelState is either "auto" or "customized", the name of the DataModelProperty that generated this DocumentProcessor. |
disabled | boolean | | Disables the DocumentProcessor. |
enableApproxMatching | boolean | | Enables approximate matching in the ontology. Approximate matching uses the Damerau-Levenshtein edit distance. |
minWordSizeForDist1 | int | 3 | Minimum number of characters a token must contain to allow matches at a Damerau-Levenshtein distance of 1. |
minWordSizeForDist2 | int | 8 | Minimum number of characters a token must contain to allow matches at a Damerau-Levenshtein distance of 2. |
resourceDir | string | | URL of the directory containing the ontology (data://, file:// or resource://). |
restrictLanguage | boolean | True | Keeps only the expression added with language == Language.XX or with the document language. For example, if the Ontology contains an expression added with language=En, it will be extracted only for an English document if restrictLanguage is set to true. |
keepLongestMatch | boolean | True | Keeps only the longest match. For example, if you have 5 tokens ('a', 'b', 'c', 'd', 'e') and 4 annotations 'a', 'a-c', 'b-c-d' and 'd-e', this option will only keep 'b-c-d' and remove all other annotations. |
keepLongestMatchInterTag | boolean | | Keeps only the longest match (tag-independent). For example, if you have 5 tokens ('a', 'b', 'c', 'd', 'e') and 4 annotations 'a', 'a-c', 'b-c-d' and 'd-e', this option will only keep 'b-c-d' and remove all other annotations. |
tokenizeAnnotations | boolean | | If you have multi-token annotations (such as a "super market" annotation on the token "supermarket"), this option automatically subtokenizes "supermarket" into "super" and "market" and keeps the original annotations. If you enable this option, keepLongestMatch and keepLongestMatchInterTag are set to true. |
annotationsToIgnore | string | | Sets the list of annotations to be ignored (comma-separated). This feature allows you to define a list of words or expressions to ignore when recognizing entries of this ontology. For example, if you add the expressions "of" and "the" with the tag "toIgnore" in ontology A, and the expression "website embassy" in ontology B with tagsToIgnore="toIgnore", you will be able to match "website of the embassy", "website of embassy" and "website embassy". |
ignoreSpaces | boolean | | If your ontology was compiled with matchOnSeparators=false, this allows 'lemonde' to retrieve 'le monde' or 'le monde' to retrieve 'lemonde'. If your ontology was compiled with matchOnSeparators=true, this allows 'le monde' to retrieve 'le monde'. |
annotationPrefix | string | | A prefix to add to each annotation tag. For example, if the package of the entry matched in the ontology is "exalead.location.country" and the annotationPrefix is "myOntology_", an annotation will be added with the tag "myOntology_exalead.location.country". |
trustLevelBasedDedup | boolean | | Keeps only the annotation with the highest trust level when several entries from a package match the same text chunk. |

Name | Type | Description |
---|---|---|
fromDataModel | com.exalead.indexing.analysis.v10.SemanticProcessor | If dataModelState is "customized", the original semantic processor generated by the data model. Use this to easily revert from the "customized" state to the "auto" state. |
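
The interplay between enableApproxMatching, minWordSizeForDist1 and minWordSizeForDist2 in the first table can be illustrated with a minimal Python sketch. It is not the product's implementation: the function names are hypothetical, the thresholds are assumed to be inclusive and to apply to the length of the input token, and the distance used is the optimal string alignment variant of Damerau-Levenshtein, since the documentation does not say which variant the matcher uses.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein).

    Counts insertions, deletions, substitutions and transpositions of
    two adjacent characters, each with a cost of 1.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def max_allowed_distance(token: str, min_size_dist1: int = 3, min_size_dist2: int = 8) -> int:
    """Largest edit distance tolerated for a token (assumed inclusive thresholds)."""
    if len(token) >= min_size_dist2:
        return 2
    if len(token) >= min_size_dist1:
        return 1
    return 0


def approx_match(token: str, entry: str, enable_approx: bool = True,
                 min_size_dist1: int = 3, min_size_dist2: int = 8) -> bool:
    """Hypothetical check: does `token` match the ontology `entry`?"""
    if token == entry:
        return True
    if not enable_approx:
        return False
    limit = max_allowed_distance(token, min_size_dist1, min_size_dist2)
    return osa_distance(token, entry) <= limit


if __name__ == "__main__":
    for token, entry in [("pairs", "paris"), ("ab", "ac"), ("suprmakret", "supermarket")]:
        print(token, entry, approx_match(token, entry))
```

With the default thresholds of 3 and 8, a two-character token must match exactly, a token of 3 to 7 characters tolerates one edit, and a token of 8 or more characters tolerates two edits.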