Semantic processing often requires resources such as thesauruses, synonyms or block lists, at both index and search time.
You can distribute semantic resource files by either:
• Creating the resource in the Administration Console.
• Using the Resource Manager through the cvadmin command. The following resources must be compiled and published using cvadmin: XML-compliant, CSV-compliant, and custom.
Create a Resource File from the Administration Console
You can create resource files directly from the Administration Console. Once created, the resource is added to the Resource Manager in the background, and the Resource Manager then compiles and deploys it across all hosts.
This workflow uses an Ontology Matcher as an example; the process is similar for other semantic resources.
1. In the Administration Console, go to Index > Data processing > Pipeline name > Semantic Processors.
2. Add an Ontology Matcher processor to the pipeline.
3. In the Resource directory of the processor, click Create new.
Once the resource is added, the Resource directory path is set to resourcemanager://indexing/<RESOURCE_NAME>
4. Click Apply. You must apply the changes before you can edit the resource file in the Business Console.
5. Click Edit to edit the resource file in the Business Console.
See "Adding Ontology Resources" in the Business Console.
Manage Resources in cvadmin
The Resource Manager allows you to edit linguistic and semantic resources without having to manually compile or publish them.
The Resource Manager publishes semantic resource files to specified roles. Resource files are organized into groups so that interrelated files are updated in unison.
The Resource Manager:
• compiles resources
• assigns versions to resources
• publishes resources, including to multiple hosts
• converts resource formats, such as from XLS to XML
This section explains how to use the Resource Manager by taking the Ontology Matcher semantic processor as an example. The procedure is the same, however, for all resource types.
Edit ResourceManager.xml
The first step is to edit <DATADIR>/config/ResourceManager.xml. In this file, you define groups of resources and which roles to publish these resources to.
Grouping resource files enables you to keep dependent resources together (for example, if a Rules Matcher depends on the Ontology Matcher's results). Because the group is published as a unit, the resources in it are always updated consistently.
The default configuration includes the following resource groups:
• indexing, which targets the analyzers
• search, which targets the searcher
Assume you want to include an OntologyMatcher in your semantic pipeline. You would then include an ontology resource in the indexing ResourceGroup, as shown in the following ResourceManager.xml example:
Once you have edited your ResourceManager.xml file, apply the configuration using the API Console: select Manage, search for the applyConfiguration method, and click Send.
Upload, Compile, and Publish the Ontology
We now have a new configuration and we have created our own resource file called ontology.xml.
We now need to upload this file to the Resource Manager. To do so, we use the cvadmin command-line tool. First check that the resources have been properly added:
The name of our resource is the one we defined in the ResourceManager.xml file.
Note: By using the optional argument publish=true, the file is uploaded, compiled (if required) and published in one command. If we wanted to separate these actions, after uploading our file, we would use:
This publishes all the resources in the specified group. If the last upload date is later than the last compile date, the command also compiles the resources.
Errors may occur if the resource file (here, ontology.xml) is not valid. Since the Resource Manager is part of the gateway, we can check the gateway's log file for the cause of a compilation failure.
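The compile-on-publish rule described above can be illustrated with a minimal Python sketch. This is not the Resource Manager's own code: the function name needs_compile, the use of datetime timestamps, and the handling of a resource that has never been compiled are all assumptions made for illustration.

```python
from datetime import datetime
from typing import Optional


def needs_compile(last_upload: datetime, last_compile: Optional[datetime]) -> bool:
    """Hypothetical helper: True when the upload is newer than the last compilation."""
    if last_compile is None:
        # Assumption: a resource that was never compiled has to be compiled first.
        return True
    return last_upload > last_compile


# The file was uploaded one day after it was last compiled, so it is recompiled.
uploaded = datetime(2024, 1, 2, 9, 0)
compiled = datetime(2024, 1, 1, 9, 0)
print(needs_compile(uploaded, compiled))  # True
```

In this reading, publishing a group whose files have not changed since the last compilation skips the compile step.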
Published resources are available under resourcemanager://group_name/resource_name
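As a small illustration of this naming scheme, the following Python sketch assembles the location of a published resource from its group and resource names. The helper resource_uri is hypothetical and not part of any product API, and the resource name "ontology" is an assumed example; use the name declared in your ResourceManager.xml.

```python
def resource_uri(group_name: str, resource_name: str) -> str:
    """Hypothetical helper: build the resourcemanager:// location of a published resource."""
    if not group_name or not resource_name:
        raise ValueError("group_name and resource_name must be non-empty")
    return f"resourcemanager://{group_name}/{resource_name}"


# "indexing" is one of the default groups; "ontology" is an assumed resource name.
print(resource_uri("indexing", "ontology"))  # resourcemanager://indexing/ontology
```

The resulting string has the same form as the Resource directory path shown in the Administration Console workflow.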
For spellcheck block list and allow list resources, you need to perform an extra step. After declaring the resources, you must also edit the dictionary.xml file.
1. Declare the two resources that handle the allow list and the block list. In this example, we add them to the "content-filtering" group of the Resource Manager.
These resource formats are available using the resources get-sample command of cvadmin, where RESOURCE_NAME is the name of the sample. In cvadmin, enter: