Phonetizer

A Phonetizer can improve spell-checking at search time by building a phonetic form for each word in the specified contexts.

This semantic processor is a prerequisite for the following scenarios:

• To perform search-time phonetic expansion using the Phonetic query expansion module, you must extract phonetic forms from the relevant fields at index time. This creates the dictionary that the phonetic expansion module uses at search time.

Typically, you would set up phonetic extraction through the data model. See Phonetize a Field Created from a Data Model Property.

• If you have configured the Ontology Matcher, Rules Matcher, Fast Rules Matcher, or Semantic Extractor for phonetic matches, you must add a Phonetizer processor above it in the analysis pipeline as described in Configure Phonetization Manually.

Phonetize a Field Created from a Data Model Property

• In Data Model > Properties for the class name, expand the property to find out which semantic type this property uses.

• Next to the Semantic type list, click the icon to edit the semantic type.

This takes you to the appropriate section on the Semantic Types tab.

• Select Extract phonetic forms. This adds a Phonetizer semantic processor to your analysis pipeline.

• Reindex the documents that need phonetic extraction.

• On the Home page, locate the appropriate connectors and click Scan.

Extracted phonetic forms are saved to the dictionary, so they can be accessed at search time for phonetic expansion. For details, see Configuring Query Expansion.
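
The relationship between index-time extraction and search-time expansion can be illustrated with a minimal Python sketch. It is not the product's implementation: the phonetic algorithm used by the Phonetizer is not described here, so classic Soundex is used purely as a stand-in, and the names phonetic_key, build_phonetic_dictionary, and expand_query are hypothetical, as are the sample contexts "title" and "text".

```python
from collections import defaultdict

# Soundex digit groups (stand-in algorithm, assumed for illustration only).
_CODES = {}
for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                       "4": "l", "5": "mn", "6": "r"}.items():
    for ch in letters:
        _CODES[ch] = digit


def phonetic_key(word: str) -> str:
    """Return a 4-character Soundex key, or "" if the word has no ASCII letters."""
    letters = [c for c in word.lower() if c.isalpha() and c.isascii()]
    if not letters:
        return ""
    key = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            key += code
        if ch not in "hw":  # h and w do not separate letters with the same code
            prev = code
    return (key + "000")[:4]


def build_phonetic_dictionary(documents, contexts=None):
    """Index time: map each phonetic key to the words that produce it.

    `documents` is a list of {context_name: text} dicts (hypothetical shape).
    An empty or missing `contexts` means all contexts are processed.
    """
    dictionary = defaultdict(set)
    for doc in documents:
        for context, text in doc.items():
            if contexts and context not in contexts:
                continue
            for raw in text.split():
                word = "".join(c for c in raw.lower() if c.isalpha())
                key = phonetic_key(word)
                if key:
                    dictionary[key].add(word)
    return dictionary


def expand_query(term, dictionary):
    """Search time: return indexed words that share the term's phonetic key."""
    return sorted(dictionary.get(phonetic_key(term), set()))


docs = [{"title": "Memo from Robert Smith", "text": "Rupert Smyth replied"}]
dictionary = build_phonetic_dictionary(docs)
print(expand_query("Smithe", dictionary))  # ['smith', 'smyth']
```

The sketch also shows why reindexing is required: a word appears in the expansion only if its phonetic form was stored when the document was indexed.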

Configure Phonetization Manually

Follow this procedure when other processors, such as the Ontology Matcher, need to use the extracted phonetic forms.

• In the Administration Console, select Index > Data Processing > Pipeline name > Semantic Processors.

• Drag the Phonetizer processor to the pipeline.

• In the pipeline, expand the Phonetizer:

◦ Language: specify a comma-separated list of language ISO codes. Leave blank to process all languages.

◦ Input from: specify a comma-separated list of document context names to be processed. Leave blank to process all input contexts.

• Drag the processor that requires the extracted phonetic forms to a position below the Phonetizer in the analysis pipeline.

• Expand this dependent processor:

◦ Language: specify the same comma-separated list of language ISO codes as specified for Phonetizer. Leave blank to process all languages.

◦ Input from: specify the same comma-separated list of document context names as specified for Phonetizer. Leave blank to process all input contexts.

• Finish configuring the dependent processor as described in: