SemanticType
com.exalead.datamodel.v10.SemanticType
Parent elements:
com.exalead.datamodel.v10.DataModel (as DataModel)
Attributes:
| Name | Type | Default value | Description |
|---|---|---|---|
| name | string | | Name of this semantic type, to be used in the "type" field of the AlphanumProperty. |
| extraContexts | string | | Extra analysis contexts (not datamodel-controlled) to which this text type must be applied. |
| tokenize | boolean | True | Splits phrases into individual words or tokens. Required for index-time semantic processing. |
| urlProcessing | boolean | | Creates 3 prefix handlers for this property, for the 'site', 'url', and 'inurl' features. |
| indexNormalized | boolean | True | Indexes the normalized form, for example CaFé as cafe. |
| indexLowercase | boolean | | Indexes the lowercase form, for example CaFé as café. |
| indexExact | boolean | | Indexes the exact form, for example CaFé as CaFé. |
| indexSeparators | boolean | True | Indexes the position of separators to enable search within a string. Select this option when using the "split" type prefix handler. |
| detectLanguage | boolean | True | Determines the language of a document by analyzing its text. Required for extracting spell check ngrams, phonetic forms, named entities and related terms. For performance reasons, only select this option if the documents to be pushed do not already include a 'language' meta. Selecting this option creates a meta called 'language', as well as a Language facet for search results display in the Refinements panel. |
| extractWords | boolean | True | Extracts the words of each document to the dictionary targeted by this semantic type. |
| extractNamedEntities | boolean | | Flags famous people, places, organizations or events, and annotates the corresponding index field with the prefix NE:<entity type>. This option adds: a Named Entities Matcher processor to the semantic analysis pipeline; categories for each named entity annotation in a document; and named entity facets in the search logic, to be displayed in the Refinements panel. |
| extractRelatedTerms | boolean | | Finds important concepts within the corpus and stores them in the dictionary targeted by this semantic type. To display related terms in the Refinements panel of your search application, you must enable them through the search logics. This option adds a Related Terms Extractor processor to the semantic analysis pipeline. |
| extractSpellCheckNGrams | boolean | | Calculates the probability of occurrence of words or phrases within the corpus and stores the results in the dictionary targeted by this semantic type. This significantly improves the effectiveness of spell-checking. This option adds an NGrams Extractor processor to the semantic analysis pipeline. |
| extractPhoneticForms | boolean | | Creates a phonetic version of each word and stores it in the dictionary targeted by this semantic type. This significantly improves the effectiveness of spell-checking and enables phonetic search (for example, soundslike: exaleed). This option adds a Phonetizer processor to the semantic analysis pipeline. |
| tokenizationConfig | string | | Tokenization config to use for analysis and search, as defined in the linguistic configuration. |
| rankForDedicatedMapping | long | 4 | Ranking value for the mapping to a dedicated index field. |
| rankForTextMapping | long | 3 | Ranking value for the mapping to the "text" index field. |
| dictionaryName | string | | Name of the dictionary targeted by this semantic type. A dictionary is a structure, separate from the index, that stores all the words of the indexed documents and their number of occurrences in the corpus. It is used for linguistic expansion mechanisms such as spell-checking or regular expression matching. If the value is "_None_", words are not stored in a dictionary. In admin-ui, select "None" if you want this behavior. |
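
The difference between the indexNormalized, indexLowercase and indexExact forms can be illustrated with a short Python sketch. It only reproduces the CaFé examples given above. The function name `index_forms` and the normalization rule (lowercasing, then stripping combining accents) are assumptions made for illustration, not the product's actual implementation, which may normalize further.

```python
import unicodedata


def index_forms(token, index_normalized=True, index_lowercase=False, index_exact=False):
    """Return the forms of a token kept for each enabled option (illustrative only)."""
    forms = {}
    if index_normalized:
        # Assumption: "normalized" means lowercased with diacritics removed.
        decomposed = unicodedata.normalize("NFD", token.lower())
        forms["normalized"] = "".join(
            ch for ch in decomposed if not unicodedata.combining(ch)
        )
    if index_lowercase:
        # Lowercased, accents preserved.
        forms["lowercase"] = token.lower()
    if index_exact:
        # Token kept exactly as written.
        forms["exact"] = token
    return forms


print(index_forms("CaFé", index_normalized=True, index_lowercase=True, index_exact=True))
# {'normalized': 'cafe', 'lowercase': 'café', 'exact': 'CaFé'}
```

With the default values listed in the table, only the normalized form is kept, so a query for cafe, café or CaFé would presumably match the same indexed term.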
Nested elements:
| Name | Type | Description |
|---|---|---|
| KeyValue | exa.bee.KeyValue* | Custom parameters. |
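
As a rough sketch of how these attributes and the nested KeyValue element might fit together, the following Python snippet builds a SemanticType element under a DataModel parent with the standard library. Several details are assumptions rather than documented facts: the XML namespace is omitted, booleans are serialized as "true"/"false", the type name `product_text` is hypothetical, and the `key`/`value` attribute names on KeyValue are guesses. Check a configuration exported from your own installation before relying on any of them.

```python
import xml.etree.ElementTree as ET


def build_semantic_type(name, **attributes):
    """Build a SemanticType element from attribute names listed in the table above."""
    element = ET.Element("SemanticType", {"name": name})
    for key, value in attributes.items():
        if isinstance(value, bool):
            # Assumption: booleans are written in lowercase in the XML.
            value = "true" if value else "false"
        element.set(key, str(value))
    return element


# Parent element, per the "Parent elements" section (namespace omitted).
data_model = ET.Element("DataModel")

semantic_type = build_semantic_type(
    "product_text",            # hypothetical type name
    tokenize=True,
    indexNormalized=True,
    indexExact=True,
    detectLanguage=False,      # documents already carry a 'language' meta
    dictionaryName="_None_",   # do not store words in a dictionary
    rankForDedicatedMapping=4,
)
data_model.append(semantic_type)

# Custom parameter; the 'key' and 'value' attribute names are assumed.
ET.SubElement(semantic_type, "KeyValue", {"key": "customParam", "value": "42"})

xml_text = ET.tostring(data_model, encoding="unicode")
print(xml_text)
```

Attributes left out of the call are simply absent from the element, in which case the defaults from the table would be expected to apply.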