CloudView indexing relies on 3 phases: push, analysis and import. In previous versions, the Push Server's role was to receive pushed documents, place them in a task queue, trigger document analysis using the Analyzer, and then send documents to Index Builder for indexing. At best, you could achieve a 10-second latency because the documents were serialized 3 times to disk.
In V6R2014, we've replaced the Push Server, Analyzer and Index Builder processes with the Indexing Server. Instead of using a disk-based task queue, the Indexing Server receives documents, triggers their analysis on the fly, and sends them to the index entirely in memory. This dramatically increases index throughput, and reduces indexing latency to the sub-second level. This new design does not impact the Push-API.
NOTE: If needed, you can still enable a task queue. It's a build group option. When migrating from previous versions, this option will be enabled automatically.
For more details, see "Indexing Process" and "Configuring Indexing" in the CloudView Technical Reference Guide.
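As a rough illustration of the design change described above, and not CloudView's actual implementation, the sketch below chains the three phases in memory so that a pushed document reaches the index without passing through a disk-based queue. All names (`analyze`, `InMemoryIndex`, `index_documents`) and the sample documents are hypothetical.

```python
from collections import defaultdict


def analyze(document):
    """Analysis phase: turn raw text into lowercase tokens (toy analyzer)."""
    return {"id": document["id"], "tokens": document["text"].lower().split()}


class InMemoryIndex:
    """Import phase: a toy inverted index held entirely in memory."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, analyzed):
        for token in analyzed["tokens"]:
            self.postings[token].add(analyzed["id"])

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))


def index_documents(pushed_documents, index):
    """Push phase: hand each document straight to analysis, then to the index.

    Nothing is serialized to disk between the phases.
    """
    for document in pushed_documents:
        index.add(analyze(document))


index = InMemoryIndex()
index_documents(
    [
        {"id": "doc1", "text": "Sub-second indexing latency"},
        {"id": "doc2", "text": "Indexing throughput"},
    ],
    index,
)
print(index.search("indexing"))  # ['doc1', 'doc2']
```

A task queue, as mentioned in the note above, would sit between the push and analysis steps in this picture.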
Semantic resource management made easy
Semantic processing often requires resources such as thesauruses, synonyms or blacklists. In V6R2013, we introduced the Resource Manager to manage & deploy these resources on single and multi-host installations, but it was only available on the command line.
In V6R2014, you can now manage resources through the UI:
• Create, manage, and deploy the resource files in Administration Console. For more details, see "Managing Semantic Resources" in the CloudView Technical Reference Guide.
• Edit, test, and publish resource file content in the Business Console. For more details, see "Semantic Resources" in the Business Console User Guide.
The Business Console provides a simple XML editor for all resources, as well as rich editors for ontologies, synonyms and rule-based matchers.
Plus, you can test your resources by seeing how sample text gets annotated.
Multiple dictionaries for improved spell-check and related terms
The dictionary keeps track of all the words and their frequency in every indexed document. The dictionary drives spell-checking, regexp matching, related terms as well as corpus-based relevance calculations (Tf/Idf).
In V6R2014, you can now create multiple dictionaries, and configure individual prefix handlers to target a specific dictionary. This reduces the noise that can occur in a single dictionary, which greatly improves quality for spell-check, related terms, and so on. You can further improve quality by defining rules & policies for keeping or discarding words.
For more details, see "Configuring Dictionaries" in the CloudView Technical Reference Guide.
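To make the idea concrete, here is a minimal sketch, assuming a simplified model in which each dictionary is a word-frequency table and each prefix handler targets exactly one dictionary. The `Dictionary` class, the dictionary names, and the `title`/`text` prefixes are hypothetical and do not reflect CloudView's configuration or API. The toy counts document frequency only, and the IDF formula is the common log(N / df) variant.

```python
import math
from collections import Counter


class Dictionary:
    """Toy dictionary: tracks in how many documents each word occurs."""

    def __init__(self):
        self.doc_count = 0
        self.doc_freq = Counter()

    def add_document(self, text):
        self.doc_count += 1
        # Count each word once per document (document frequency).
        self.doc_freq.update(set(text.lower().split()))

    def idf(self, word):
        """Inverse document frequency, log(N / df); 0.0 for unknown words."""
        df = self.doc_freq.get(word.lower(), 0)
        return math.log(self.doc_count / df) if df else 0.0

    def knows(self, word):
        return word.lower() in self.doc_freq


# Hypothetical setup: one dictionary per kind of content.
dictionaries = {"titles": Dictionary(), "bodies": Dictionary()}
# Hypothetical mapping from prefix handler to the dictionary it targets.
prefix_to_dictionary = {"title": "titles", "text": "bodies"}

dictionaries["titles"].add_document("Annual report")
dictionaries["titles"].add_document("Quarterly report")
dictionaries["bodies"].add_document("revenue grew in the last quarter")


def is_known(prefix, word):
    """Check a word only against the dictionary targeted by the prefix."""
    return dictionaries[prefix_to_dictionary[prefix]].knows(word)


print(is_known("title", "report"))   # True
print(is_known("title", "revenue"))  # False: only seen in document bodies
```

Because the `title` prefix never consults the body dictionary, words that only occur in body text cannot pollute title suggestions, which is the noise reduction described above.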
Package and deploy apps in minutes
A CloudView Search Based Application includes a frontend and backend configuration, plugins (connectors, widgets, ...) and resources (thesauruses, CSS, JS, images, ...). Now you can use the App packager to bundle all of these into a single package, then deploy it to other CloudView instances. Variables are available to simplify deployment to different environments (dev, pre-prod, test, production).
For more details, see "Deploying Applications on a CloudView instance" in the CloudView Installation & Deployment Guide.
More documentation
• Build a custom managed connector using the new CloudView Connector Programmer Guide.
• View "Release Notes" (including V6 version history) in the Online Help.
• See "What's New" in this version in the Online Help.
• Consult the CloudView Glossary
• See the "Getting more help" section in the Online Help and at the end of the PDF guides
Faster navigation using the search bar
Jump right to the relevant Administration Console screen thanks to the new search bar. Just start typing the feature name to see the suggested components. It also searches elements you created, like connectors and search logics.
Create richer user experiences
Mashup Builder now allows you to control the layout within container widgets, such as tabs.
It also introduces new analytic widgets, including scatter-plots, spider and radar charts.
Pick your map provider
We believe you should be able to select the technology that best suits your needs: in addition to the existing Google Map widget, we now support maps from OpenStreetMap and Bing. Just pick your favorite service!
Fine tune app performance
Identify and fix bottlenecks with the Firebug-like Mashup reporting feature. Once enabled, a timeline provides you with everything you need to profile feed and widget rendering performance. Shave off milliseconds to make your end-users happier.
For more details, see "Enabling Reporting" in the Mashup Builder User Guide.
Other Improvements
Platform
• Indexing compact policies can be defined per index field
• Semantic Factory module is automatically installed during initial setup
• The Semantic Factory Java SDK has been refactored to simplify the build of MOT pipelines
• A full semantic pipeline with all processors can now be used for query expansion
• New Annotation Manager semantic processor for manipulating annotations (select first, select most frequent, drop annotation, ...)
• New "Camel Case" semantic processor to split "CamelCased" words
• New "Normalizer" semantic processor to convert contexts to "normalized" form
• Part of speech tagger can now process long sentences and malformed text chains
• "Value selector" document processor now selects the first value instead of the first context
• The Advanced processing pipeline has a new tab for setting mapping limits
• Spellcheck and related terms white and black lists can be defined on each search logic
• The resource manager supports a new "xmlraw" resource type to handle any XML content you might need for your application
• Platform plugins (connectors, doc procs, ...) can be uploaded using the "Plugins" screen of the Administration Console
• XY geographical fields now support negative values
• New virtual expression functions on multi-valued fields: #sum, #avg, and #stddev
• All build tasks received by a frozen build group are now queued and processed once the group is unfrozen, instead of being rejected
• INNER JOIN operator can use a different field for each set as the join key. For example, subject:exalead INNERJOIN/field1=field2 fulltext:france
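The last item can be pictured with a small sketch. It assumes that the left-hand result set is keyed on `field1` and the right-hand set on `field2`, which is one illustrative reading of the example and not the engine's documented semantics. The `inner_join` function and the sample data are hypothetical.

```python
def inner_join(left_docs, right_docs, left_key, right_key):
    """Return (left, right) pairs where left[left_key] == right[right_key]."""
    # Index the right-hand set by its join key for fast lookups.
    by_key = {}
    for doc in right_docs:
        if right_key in doc:
            by_key.setdefault(doc[right_key], []).append(doc)

    pairs = []
    for doc in left_docs:
        if left_key not in doc:
            continue  # Documents without the join field never match.
        for match in by_key.get(doc[left_key], []):
            pairs.append((doc, match))
    return pairs


# Hypothetical hits for the two sub-queries of the example.
subject_hits = [{"id": "a", "field1": 10}, {"id": "b", "field1": 20}]
fulltext_hits = [{"id": "x", "field2": 10}, {"id": "y", "field2": 30}]

joined = inner_join(subject_hits, fulltext_hits, "field1", "field2")
print([(left["id"], right["id"]) for left, right in joined])  # [('a', 'x')]
```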
Mashup Builder
• The Mashup Expression Language (MEL) editor now provides syntax coloring, error highlighting and auto-completion of both keywords and variables. Just type Ctrl+Space!
• An application can now be localized (i18n) using the debug mode of the Mashup UI
• The Dashboard page displays the list of users connected to Mashup; users can be manually disconnected using this list
• Access to Mashup UI pages can be restricted to users based on their user groups (from security source)
• The Google map widget can now display geographical facets as a heatmap instead of shapes
• The Google map widget can display a marker at your position when geolocation is activated
• Create Composite Widgets from an existing page or from a group of widgets
• The options available to end users can be selected in the Advanced Facet and Advanced Result Table widgets
• Navigation widget can display a list to select the number of results per page
• New "Advanced 2D Facet Table" similar to "Advanced Facet Table" for 2D facets
• The timeline widget supports dynamic date facets
• Triggers can now be disabled like widgets
• The new "Conditional display trigger" allows defining a display condition using MEL
• Composite widgets can be packaged into Mashup plugins
• Mashup plugins can now be created from the command line (to package an App, migrate, ...)
• Storage service supports new functions to calculate aggregations (COUNT, MIN, MAX, AVG and SUM) and delete keys
• New "use_logic_facets" option on faceting query builder to filter facets retrieved from search logic
Business Console
• The alerting system can publish notifications to the web-service alert publisher in JSON format in addition to ATOM
Connectors
• The HTTP crawler has a new tool for testing rules matching a given URL and triggered actions (index, follow, ignore, ...)
• The HTTP crawler now supports Digest and NTLM proxy authentication
• A CSV connector is now included in the product
• Connectors can have multiple scan schedules
APIs
• New TLDs for accessing the Mashup export controller to export feed data to PDF and CSV
• var parameter of <search:getMetaValues /> is no longer mandatory if glue is specified
• Multi-dimensional facets can now be output as an XML tree representation in addition to binary format
Upgrade notes
• Mashup Builder: Page-level inline CSS & JS editing is now accessible through the "Code" tab between Design and Preview
Migration notes
• No migration from V6R2014 RC to GA will be supported.
• The index cache format has changed; it cannot be migrated and must be cleared.
• Migrating the DIH is a very slow process (~2h for 100 million entries on an average machine). If possible, consider re-indexing instead.
• During migration, the dictionary is rebuilt from scratch. This process starts after the Gateway process starts, and can take both time and memory. To verify the dictionary has been rebuilt correctly, go to Index > Linguistic > Dictionaries tab and verify the size is > 0 bytes. CAUTION: If the dictionary migration fails (you'll see Out of Memory errors in the logs), you must redo your migration using different settings.
• Because XY geographical fields now support negative values, these fields cannot be migrated. If you are using such fields, you must re-index.
Changes that can break things
Environment specific
• Thumbnail generation is no longer supported on Solaris (#13801)
Config system
• LearningManager does not exist anymore
Build chain
• Document processor "JavaScriptProcessor" is now deprecated
• Semantic processor "FeatureExtractor" is now deprecated and replaced with "SemanticExtractor"
• "pipeline" member of SemanticPipeDocumentProcessor class is now deprecated
• "linguistic" member of AnalysisConfig class is now deprecated
• "extractResources" member of ContextMapping class was deprecated and has been removed
• Index files have moved from "index6" to "build" directory (under specific buildgroup directory) to be consistent with other build group data (dih, cache, checkpoints) and other index data (dictionaries, resources).
Search
• VirtualConstants have been removed. Migration automatically replaces them with virtual fields. (#17014)
• <search:forEachMeta />: attribute showEmptyMetas is deprecated (#17333)
• Horizontal stack widget has been removed (#17043, #17044)
• Mutable chart widget has been removed (#17326)
Admin UI
• Tokenization config: resourceFile field for Japanese and Chinese tokenizers has been removed. (#17308)
• The custom tokenizer has been removed from the list of tokenizer types. It can still be configured directly in the XML config file. (#17308)
Known Issues
• Partial document update is no longer available. As such, connectors using this feature will not be able to update documents. This is the case for the Microsoft SharePoint connector when updating the security of an already indexed document.
• The Twitter feed no longer works. This is due to the shutdown of the Twitter v1 API.