When you have a slow connector, or want to increase indexing throughput, use a document cache. The document cache stores documents as pushed by a connector, before any processing is performed on them.
The document cache lifecycle is the same as that of the index: when a commit is made to the index, everything in the document cache (as well as everything being processed by the index server or that has gone through the PAPI server) is saved to disk.
Understand Typical Use Cases
The typical use cases are:
• During development, when source throughput is too low.
• In production, when fetch latency is too high (fetching is required for document fetch, thumbnail, and preview).
• For features that rely on incremental updates, that is, updating a document without repushing it entirely. For example, Mashup Builder’s social features, such as tags, require the document cache. When a tag is added, it is stored in the Mashup storage, which triggers a repush from cache for the impacted document. This repush from cache allows a document processor to retrieve the tags, which are then used to enrich the document before indexing.
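The tag scenario above can be pictured with a small toy model. This is only an illustrative sketch, not the CloudView API: ToyPipeline, enrich_with_tags, and the tags dictionary (standing in for the Mashup storage) are hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToyPipeline:
    """Illustrative model only; it does not reflect the CloudView API."""
    process: Callable[[str, str], dict]                    # (uri, raw content) -> indexed document
    cache: Dict[str, str] = field(default_factory=dict)    # raw documents, as pushed
    index: Dict[str, dict] = field(default_factory=dict)   # processed documents

    def push(self, uri: str, raw: str) -> None:
        # The raw document is cached before any processing happens.
        self.cache[uri] = raw
        self.index[uri] = self.process(uri, raw)

    def repush_from_cache(self, uri: str) -> None:
        # Reprocess the cached copy; the connector is not contacted again.
        self.index[uri] = self.process(uri, self.cache[uri])


tags: Dict[str, List[str]] = {}  # hypothetical stand-in for the Mashup storage


def enrich_with_tags(uri: str, raw: str) -> dict:
    # Hypothetical document processor: attaches the tags stored for this document.
    return {"text": raw, "tags": list(tags.get(uri, []))}


pipeline = ToyPipeline(process=enrich_with_tags)
pipeline.push("doc1", "quarterly report")
tags.setdefault("doc1", []).append("finance")  # a user adds a tag
pipeline.repush_from_cache("doc1")             # triggered for the impacted document
print(pipeline.index["doc1"])  # {'text': 'quarterly report', 'tags': ['finance']}
```

The point of the model is that the second processing pass reads from the cache, so the source is never fetched again even though the indexed document changes.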
Enable Document Cache
1. Enable document cache for the build group.
a. In the Administration Console, go to Deployment > Push to PAPI server.
b. Select a build group, for example bg0.
c. Select the Document cache option.
By default, the document cache is enabled on all connectors of the build group.
2. To control the caching per connector:
◦ Go to Connectors > CONNECTOR NAME > Deployment and enable or disable the Store in document cache property.
◦ You can also open the <DATADIR>\config\Connectors.xml file and edit the SourceCachingConfig parameters of source connectors to specify whether to enable the cache, and the maximum and minimum cache size.
3. Apply your changes.
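Before editing Connectors.xml by hand, it can help to see which SourceCachingConfig settings are currently in place. The following is a minimal read-only sketch, assuming the file is well-formed XML; the function names are hypothetical, and no attribute names are assumed, since the script simply reports whatever attributes it finds.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def local_name(tag: str) -> str:
    # Strip an XML namespace of the form "{uri}Name", if present.
    return tag.rsplit("}", 1)[-1]


def list_source_caching(connectors_xml: str) -> List[Dict[str, str]]:
    """Return the attributes of every SourceCachingConfig element found in the file."""
    root = ET.parse(connectors_xml).getroot()
    return [
        dict(element.attrib)
        for element in root.iter()
        if local_name(element.tag) == "SourceCachingConfig"
    ]
```

For example, calling list_source_caching with the path to your Connectors.xml file returns one dictionary per SourceCachingConfig element. The file is only read, never modified.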
Change the Location of the Document Cache on the File System
By default, the document cache is stored in the cache subdirectory of the build group. If the document cache grows too large for the build group’s file system (for example, when the build group is on an SSD), you can specify another storage location.
1. Stop Exalead CloudView.
2. Open the <DATADIR>\config\BuildGroups.xml file.
3. Edit the DocumentCacheConfig node to add the path attribute: path="path/to/new/Document/cache/location".
4. To generate the configuration, run <DATADIR>/bin/buildgct.
5. To keep the existing content of the document cache, move the original document cache directory to the new location specified in step 3.
6. Restart Exalead CloudView.
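Steps 3 and 5 above could be scripted along the following lines. This is a hedged sketch rather than a supported tool: the function names are hypothetical, the script sets the path attribute on every DocumentCacheConfig node it finds (narrow this down if the file declares several build groups), and ElementTree may drop comments or rewrite namespace prefixes when saving, so a backup copy is written first. Run it only while Exalead CloudView is stopped.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def set_document_cache_path(build_groups_xml: str, new_path: str) -> int:
    """Set path=new_path on every DocumentCacheConfig element; return how many were updated."""
    xml_file = Path(build_groups_xml)
    # Keep a backup next to the original before rewriting it.
    shutil.copy2(xml_file, xml_file.with_name(xml_file.name + ".bak"))
    tree = ET.parse(str(xml_file))
    count = 0
    for element in tree.getroot().iter():
        # Match on the local name so a namespaced tag is found as well.
        if element.tag.rsplit("}", 1)[-1] == "DocumentCacheConfig":
            element.set("path", new_path)
            count += 1
    tree.write(str(xml_file), encoding="utf-8", xml_declaration=True)
    return count


def move_cache(old_dir: str, new_dir: str) -> None:
    """Move the existing cache directory so that its content is kept (step 5)."""
    if Path(new_dir).exists():
        raise FileExistsError(f"{new_dir} already exists; refusing to overwrite it")
    shutil.move(old_dir, new_dir)
```

After running both functions, review the difference between BuildGroups.xml and its .bak copy, then continue with step 4 to generate the configuration.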
Repush from Document Cache
1. Go to the Home page.
2. Under Indexing, click More actions.
3. Click Repush.
Clear the Document Cache Entirely
1. Go to the Home page.
2. Under Indexing, click Clear.
3. Select the Document cache for bg0 check box.
4. Click Clear.
Clear the Document Cache for a Specific Connector Only
1. Go to the Home page, or select Connectors > CONNECTOR NAME > Operation.
2. Click Clear documents.
3. Select the Clear cache entries for this connector check box.
4. Click Accept.
This clears documents from both the index and the document cache.