NamedEntitiesMatcher
com.exalead.indexing.analysis.v10.NamedEntitiesMatcher
The Named Entities Matcher detects named entities, such as people, organizations, or places, in the textual content of a document. It generates annotations such as NE.person or NE.organization, using ontology-based matching, rule-based matching, or both.
Parent elements:
com.exalead.mercury.mami.search.v20.SemanticProcessorModule (as SemanticProcessorModule)
com.exalead.mercury.mami.search.v20.SemanticQueryAnalysisConfig (as SemanticQueryAnalysisConfig)
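As a rough sketch of the nesting listed above, the matcher would be declared as a child of one of its parent elements. The element names below are assumed to mirror the simple class names given on this page, and the name value is hypothetical. Namespace declarations, and any attributes or children the parent itself requires, are omitted because this page does not specify them.

```xml
<!-- Sketch only: element names are assumed to follow the class names on this page. -->
<SemanticProcessorModule>
  <!-- The matcher is declared inside its parent module. -->
  <NamedEntitiesMatcher name="named-entities"/>
</SemanticProcessorModule>
```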
Attributes:
Name
Type
Default value
Description
name
string
Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
contexts
string
Comma-separated list of the context names of the document chunks to which this processor should be applied. If this list is empty, all document chunks are processed.
dataModelState
string
Indicates whether this semantic processor is managed by a data model. Allowed values: null, auto, customized, error. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
dataModelClass
string
If dataModelState is "auto" or "customized", this attribute holds the name of the DataModelClass that generated this processor.
dataModelProperty
string
If dataModelState is "auto" or "customized", this attribute holds the name of the DataModelProperty that generated this processor.
disabled
boolean
Disables this processor.
resourceDir
string
URL for the resource (data://, file://, or resource://).
rules
string
ne
Defines which entities will be extracted:
The default value, ne, triggers the extraction of people, organizations, locations, and events.
The value ne-all triggers the extraction of all types of entities.
prefix
string
NE
Prefix to add in front of each annotation generated by the named entity matcher.
language
string
Languages for which the processor is activated; if no language is specified, the processor is activated for all languages.
partOfSpeechFiltering
boolean
True
Discards annotations on spans of text made of a name followed by a verb or an adverb whose first letter is uppercase. This filter is useful if your documents contain many titles with several capitalized words (title case). It applies to NE.person, NE.place, and NE.organization.
useKnownWordsForDisambiguisation
boolean
True
Uses a resource of known words to disambiguate named entity candidates. It works only for English and French.
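The attributes above might be combined as in the following sketch. It assumes the element name matches the simple class name and that booleans are written in lowercase. The values given for name, contexts, and language are hypothetical, since this page does not list valid context names or the expected format of language codes. Namespace declarations are omitted.

```xml
<!-- Sketch only: element name, namespace handling, and value formats are assumptions. -->
<!-- rules="ne-all" extracts all entity types instead of the default "ne" set. -->
<!-- contexts restricts processing to the listed (hypothetical) context names. -->
<NamedEntitiesMatcher
    name="named-entities"
    contexts="title,text"
    rules="ne-all"
    prefix="NE"
    language="en"
    partOfSpeechFiltering="true"
    useKnownWordsForDisambiguisation="true"
    disabled="false"/>
```

With prefix set to NE, the generated annotations carry names such as NE.person or NE.organization, as described at the top of this page.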
Nested elements:
Name
Type
Description
fromDataModel
com.exalead.indexing.analysis.v10.SemanticProcessor
If dataModelState is "customized", this element holds the original semantic processor generated by the data model. Use it to revert easily from the "customized" state to "auto".
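A customized processor that keeps its data-model original might look like the sketch below. The exact serialization of fromDataModel is not given on this page, so wrapping the original processor in a fromDataModel child is an assumption, as is marking that original with dataModelState="auto". The name, dataModelClass, and dataModelProperty values are hypothetical.

```xml
<!-- Sketch only: the layout of fromDataModel is assumed, not confirmed by this page. -->
<NamedEntitiesMatcher
    name="author-entities"
    dataModelState="customized"
    dataModelClass="document"
    dataModelProperty="author"
    rules="ne-all">
  <!-- Original processor as generated by the data model, kept for reverting. -->
  <fromDataModel>
    <NamedEntitiesMatcher
        name="author-entities"
        dataModelState="auto"
        dataModelClass="document"
        dataModelProperty="author"
        rules="ne"/>
  </fromDataModel>
</NamedEntitiesMatcher>
```

In this sketch the customization is the change of rules from ne to ne-all. Reverting would mean restoring the nested definition as the active one.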