Name | Type | Default value | Description |
---|---|---|---|
name | string | | Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts | string | | Comma-separated list of the ContextNames of the Document Chunks to which this processor is applied. If this list is empty, all DocumentChunks are processed. |
dataModelState | string | | Indicates whether this semantic processor is managed by a data model. Allowed values: null, auto, customized, error. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass | string | | If dataModelState is either "auto" or "customized", the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty | string | | If dataModelState is either "auto" or "customized", the name of the DataModelProperty that generated this DocumentProcessor. |
disabled | boolean | | Disables the DocumentProcessor. |
relatedTermsMinSpan | int | 3 | Minimum number of words (excluding stop words) in an automatically extracted term (not applicable to whitelist). |
relatedTermsMaxSpan | int | 6 | Maximum number of words (excluding stop words) in an automatically extracted term (not applicable to whitelist). |
maxRelatedTermsPerDoc | int | 64 | The maximum number of related terms per document. |
keepLongestMatch | boolean | True | Keeps only the longest term when several overlap. For example, if you have 5 tokens ('a', 'b', 'c', 'd', 'e') and 4 related terms 'a', 'a-c', 'b-c-d' and 'd-e', this option will only keep 'b-c-d' and remove all other related terms. |
dictionaryName | string | | Name of the dictionary populated by terms extracted by this processor. If null, the default dictionary is used. |
preprocResourceDir | string | | URL of the related terms preprocessor resource (data://, file:// or resource://). If null, the product's standard preprocessor is used. |
whitelistResource | string | | Path to a related terms whitelist resource. |
blacklistResource | string | | Path to a related terms blacklist resource. |
withPartOfSpeech | boolean | True | Adds a PartOfSpeechTagger to the list of processors automatically. Improves quality of automatically extracted terms. |
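
To make the `contexts` and span parameters concrete, here is a minimal Python sketch. It is not the product's API: the `RelatedTermsSettings` class, its methods, and the stop-word set are hypothetical, and the `disabled` default of `False` is an assumption, since the table lists none. The other defaults are copied from the table above.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class RelatedTermsSettings:
    """Hypothetical container mirroring some parameters from the table."""

    contexts: str = ""              # comma-separated ContextNames; empty = all chunks
    disabled: bool = False          # assumption: the table gives no default
    relatedTermsMinSpan: int = 3    # default from the table
    relatedTermsMaxSpan: int = 6    # default from the table
    maxRelatedTermsPerDoc: int = 64
    keepLongestMatch: bool = True
    withPartOfSpeech: bool = True

    def __post_init__(self) -> None:
        # A minimum span larger than the maximum can never match anything.
        if self.relatedTermsMinSpan > self.relatedTermsMaxSpan:
            raise ValueError("relatedTermsMinSpan exceeds relatedTermsMaxSpan")

    def context_names(self) -> List[str]:
        """Split the comma-separated list, ignoring blanks and padding."""
        return [c.strip() for c in self.contexts.split(",") if c.strip()]

    def applies_to(self, context_name: str) -> bool:
        """An empty list means every document chunk is processed."""
        if self.disabled:
            return False
        names = self.context_names()
        return not names or context_name in names

    def span_allowed(self, words: Iterable[str], stop_words: Set[str]) -> bool:
        """Count words excluding stop words and compare against the bounds."""
        count = sum(1 for w in words if w.lower() not in stop_words)
        return self.relatedTermsMinSpan <= count <= self.relatedTermsMaxSpan
```

With the default bounds, a candidate such as "history of machine translation systems" has four words once the stop word "of" is excluded, so it falls inside the 3 to 6 range, while "machine translation" does not. How the real preprocessor decides which words are stop words is not described in the table.
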

Name | Type | Description |
---|---|---|
fromDataModel | com.exalead.indexing.analysis.v10.SemanticProcessor | If dataModelState is "customized", contains the original semantic processor generated by the data model. Use it to revert easily from the "customized" state to "auto". |
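
The revert described for `fromDataModel` can be pictured with a small Python sketch. This is an illustration only, assuming a processor configuration represented as a plain dictionary. The `revert_to_auto` helper is hypothetical and does not correspond to any product call.

```python
import copy


def revert_to_auto(processor: dict) -> dict:
    """Return the data-model-generated processor stored in fromDataModel.

    Processors that are not in the "customized" state are returned unchanged.
    """
    if processor.get("dataModelState") != "customized":
        return processor
    original = processor.get("fromDataModel")
    if original is None:
        raise ValueError("customized processor has no fromDataModel to restore")
    restored = copy.deepcopy(original)   # avoid aliasing the stored original
    restored["dataModelState"] = "auto"  # assumption: the restored copy is "auto"
    return restored
```

Copying the stored original keeps the customized configuration intact, so the caller can compare both versions before committing to the revert.
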