SemanticExtractor
com.exalead.indexing.analysis.v10.SemanticExtractor
The resource describes the features to extract, with their term, type, and (for numerical values) range, according to a set of rules.
Annotations generated: depends on the resource (see SemanticExtractorConfig).
Parent elements:
com.exalead.mercury.mami.search.v20.SemanticProcessorModule (as SemanticProcessorModule)
com.exalead.mercury.mami.search.v20.SemanticQueryAnalysisConfig (as SemanticQueryAnalysisConfig)
Attributes:
Name | Type | Default value | Description
--- | --- | --- | ---
name | string | | Name of the semantic processor. This name is used only for tracing and debugging.
contexts | string | | Comma-separated list of the ContextNames of the DocumentChunks to which this processor is applied. If the list is empty, all DocumentChunks are processed.
dataModelState | string | | Indicates whether this semantic processor is managed by a data model. Allowed values: null, auto, customized, error. If null, the processor is not related to the data model. If "auto", the processor is auto-generated by the data model.
dataModelClass | string | | If dataModelState is "auto" or "customized", the name of the DataModelClass that generated this DocumentProcessor.
dataModelProperty | string | | If dataModelState is "auto" or "customized", the name of the DataModelProperty that generated this DocumentProcessor.
disabled | boolean | | Disables the DocumentProcessor.
resourceDir | string | | URL of the compiled semantic extractor file. Use a data://, file://, or resource:// URL.
prefix | string | | Prefix added to output annotations.
breakOnSentence | boolean | | If true, there is at most one match per sentence, and no match crosses a sentence boundary. Enabling this option adds the SentenceFinder automatically.
breakOnParagraph | boolean | True | If true, there is at most one match per paragraph, and no match crosses a paragraph boundary.
breakOnLine | boolean | | If true, there is at most one match per line, and no match crosses a line boundary.
matchAllRules | boolean | True | If true, returns the full list of matched rules. If false, returns only the first matched rule.
language | iso code | | Language for which the extractor is activated. If null, the extractor is activated for all languages.
annotateUnusedTokensWith | string | | Used for query rewriting by the Semantic Query Analyzer.
overlappingMatches | boolean | True | If true, reports all matches even if their locations overlap. Only meaningful when matchAllRules is true.
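As an illustration of how the attributes above might be combined, here is a minimal sketch of a SemanticExtractor declaration. It assumes the attributes are written as plain XML attributes on an element named after the class. All values (the processor name, resource path, prefix, and context names) are hypothetical, and the namespace declarations and parent element are omitted.

```xml
<!-- Sketch only: every value below is a hypothetical placeholder. -->
<!-- Attributes left out (for example breakOnParagraph) keep their documented defaults. -->
<SemanticExtractor
    name="product-features"
    resourceDir="resource:///semantic-extractors/product-features"
    prefix="feature"
    contexts="title,text"
    language="en"
    breakOnSentence="true"
    matchAllRules="true"
    overlappingMatches="false" />
```

In this sketch, matchAllRules stays true so that every matching rule is reported, while overlappingMatches is set to false to drop matches whose locations overlap.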
Nested elements:
Name | Type | Description
--- | --- | ---
fromDataModel | com.exalead.indexing.analysis.v10.SemanticProcessor | If dataModelState is "customized", contains the original semantic processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state.
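The following sketch shows how a customized extractor could carry its data-model original in fromDataModel. The nesting layout (a fromDataModel wrapper element containing the original processor) is an assumption about the serialization, and the names, class, property, and resource paths are hypothetical.

```xml
<!-- Sketch only: the wrapper layout and all values are assumptions. -->
<SemanticExtractor
    name="price-extractor"
    dataModelState="customized"
    dataModelClass="product"
    dataModelProperty="price"
    resourceDir="resource:///semantic-extractors/price-custom"
    matchAllRules="false">
  <fromDataModel>
    <!-- Original processor as generated by the data model, kept for reverting to "auto". -->
    <SemanticExtractor
        name="price-extractor"
        dataModelState="auto"
        dataModelClass="product"
        dataModelProperty="price"
        resourceDir="resource:///semantic-extractors/price" />
  </fromDataModel>
</SemanticExtractor>
```

Under these assumptions, reverting to the "auto" state would amount to replacing the outer declaration with the one stored under fromDataModel.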