Java Processors

Define Java Transformation Processors

Transformation Operations

Define Java Aggregation Processors

Aggregation Operations

Company Hierarchy Example

Every Java Processor defined in the Consolidation Server implicitly implements the Exalead CloudView CVComponent interface, which is required to define an Exalead CloudView component.

For more information, see the "Creating custom components for CloudView" in the Exalead CloudView Programmer's Guide.

Consequently, it is possible to:

• Create Java processors externally within your IDE,

• Package them appropriately in a JAR or ZIP file,

• And deploy it into the Exalead CloudView instance to enable your processors selectively.

This is one of the key advantages over Groovy, as Groovy processors are added and written within the Exalead CloudView Administration Console. With the Exalead CloudView component mechanism, you can also define runtime properties that customize the component behavior. It thus becomes possible to write a generic processor that can be customized later on using runtime properties defined within the Administration Console.

Define Java Transformation Processors

You can define transformation processors using a set of default processors made for simple generic operations, or through custom Java code if your needs are more specific.

Use Default Transformation Processors

1. Under Transformation processors, click Add processor.

2. In Add processor, select Java, give a name to the processor, and then choose one of the following default processors.

Transformation Processor	Class Id	Description
Basic Arc Creation Processor	com.exalead.cloudview.consolidation.processors.java.classic.CreateArcBasedOnMetaValueTransformationProcessor	Creates an arc from the processed document. The target is the value of the given meta name.
Basic Document Creation Processor	com.exalead.cloudview.consolidation.processors.java.classic.CreateDocumentBasedOnMetaValueTransformationProcessor	Creates a managed document from the processed document. The target is the value of the given meta name.
Set Directive Processor	com.exalead.cloudview.consolidation.processors.java.classic.SetDirectiveTransformationProcessor	Sets the given directive on the processed document.
Set Meta Processor	com.exalead.cloudview.consolidation.processors.java.classic.SetMetaTransformationProcessor	Sets the given meta on the processed document.
Set Type Processor	com.exalead.cloudview.consolidation.processors.java.classic.SetTypeTransformationProcessor	Sets the given type on the processed document.
Split Text Processor	com.exalead.cloudview.consolidation.processors.java.classic.SplitTextTransformationProcessor	Splits the given source meta using the specified delimiting regex pattern, and adds/sets the result to the target meta. Note: The target meta must be multivalued to contain all text chunks resulting from the split operation.
Storage Service Key Linker Processor	com.exalead.cloudview.consolidation.processors.java.classic.StorageServiceKeyLinkerTransformationProcessor	Creates arcs between the Storage Service data and the document it is linked to. For a use case example, see UC-8: Consolidating Data from Storage Service.

3. Click Apply.
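The splitting behavior described for the Split Text Processor can be pictured with plain Java. The following is a minimal sketch that uses only java.util.regex and does not call the CloudView API. The class name SplitTextSketch and the method splitMeta are hypothetical, and trimming chunks and dropping empty ones are assumptions of this sketch, not documented behavior of the processor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Stand-alone illustration; this is not the CloudView implementation.
final class SplitTextSketch {

    // Splits sourceValue on delimiterRegex and returns the trimmed, non-empty chunks.
    // Each chunk would become one value of a multivalued target meta.
    static List<String> splitMeta(String sourceValue, String delimiterRegex) {
        List<String> chunks = new ArrayList<>();
        for (String chunk : Pattern.compile(delimiterRegex).split(sourceValue)) {
            String trimmed = chunk.trim();
            if (!trimmed.isEmpty()) {
                chunks.add(trimmed);
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        // prints [red, green, blue]
        System.out.println(splitMeta("red; green;blue", ";"));
    }
}
```

Because the split produces several values, the sketch returns a list, which is why the target meta has to be multivalued.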

Create Custom Transformation Processors

To define a Java Transformation processor in the create/update action context, you need to implement the IJavaAllUpdatesTransformationProcessor interface.

/**
 * Defines the interface for all the Java transformation processors that
 * need to perform operations in a non-delete context.
 */
public interface IJavaAllUpdatesTransformationProcessor extends IJavaTransformationProcessor {
    /**
     * Performs the add or update operations of the client's processor for
     * the document transmitted, with the help of the handler provided.
     * @param handler The transformation handler providing the allowed operations for the processor.
     * @param document The reference document.
     * @throws Exception Occurs for whatever reason in the client's implementation.
     * The exception will most likely be wrapped with contextual information before further propagation.
     */
    public void process(final IJavaAllUpdatesTransformationHandler handler,
            final IMutableTransformationDocument document) throws Exception;
}

The parent interface is defined as follows:

/**
 * Defines the common operations for all Java transformation processors.
 */
public interface IJavaTransformationProcessor extends CVComponent {
    /**
     * Returns a document type on which the processor will perform a transformation.
     * If one returns null or an empty string, then it will be applied on all source documents.
     * @return A valid document type or null or empty string.
     */
    public String getTransformationDocumentType();
}

The following picture shows the complete class hierarchy associated with the Java transformation processors in the create/update action context.

The handlers hierarchy (in green) defines the list of operations allowed for the processor and for the particular action context. The documents hierarchy (in purple) defines the document received with the transformations allowed on them.

If you implement the IJavaAllUpdatesTransformationProcessor interface as requested, you then have to implement the following two methods, along with a constructor that receives the component config.

Java Example 1

@CVComponentDescription("My First Transformation Processor Component")
@CVPluginVersion("1.0")
@CVComponentConfigClass(configClass=MyComponentConfig.class)
public class MyFirstTransformationProcessor implements IJavaAllUpdatesTransformationProcessor {
    private final Logger logger = Logger.getLogger("app-name.conso-server.transformation");

    public MyFirstTransformationProcessor(final MyComponentConfig config) {
        // The config instance is not stored because this example reads no property.
    }

    @Override
    public String getTransformationDocumentType() {
        // Only documents of type "city" are handled by this processor.
        return "city";
    }

    @Override
    public void process(final IJavaAllUpdatesTransformationHandler handler,
            final IMutableTransformationDocument document) throws Exception {
        // Do nothing, that is transmit all "city" documents to the next processor
        // or to the Consolidation Store.
        logger.info("Processing " + document.getUri());
    }
}

Once the code above is packaged and deployed on the Exalead CloudView instance, you can define its associated source as shown below.

In this simple example:

• We defined a constructor with the component config instance that you can use to customize the processor using end-user properties. Here we do not store the instance because we have no use for it. In general, we would save the instance in the processor class and use it in the process method to read specific properties. A component configuration is always required for the definition of each of your Java processors.

• The processor treats the documents coming from the cities source connector.

• The processor also treats, from that source, documents with the city type only. The rules of selection are detailed in Processor Type Inheritance and Runtime Selection.

• The process method contains the processor implementation. Here it contains no action (except for the logging), so all city documents from the cities source connector are automatically transmitted to the next available transformation processor, or to the Consolidation Store.

You can reduce the config class to the following implementation:

public final class MyComponentConfig implements CVComponentConfig {
// No property defined
}
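When a processor does need end-user properties, the config instance can be stored and read later, as described above. The following is a minimal sketch of that idea. It is self-contained, so CVComponentConfig is redeclared here as an empty stand-in for the real CloudView interface, and CityFilterConfig, ConfigurableProcessorSketch, and the getter/setter property convention are assumptions made for illustration only.

```java
// Stand-in for the CloudView marker interface, declared only so that this sketch compiles alone.
interface CVComponentConfig { }

// Hypothetical config with one property; the getter/setter convention is an assumption.
final class CityFilterConfig implements CVComponentConfig {
    private String documentType = "city";

    public String getDocumentType() {
        return documentType;
    }

    public void setDocumentType(String documentType) {
        this.documentType = documentType;
    }
}

// Hypothetical processor skeleton that keeps the config and reads it at run time.
final class ConfigurableProcessorSketch {
    private final CityFilterConfig config;

    ConfigurableProcessorSketch(final CityFilterConfig config) {
        this.config = config; // stored so that other methods can read the properties
    }

    // Mirrors getTransformationDocumentType(): the type comes from the property
    // instead of being hard-coded.
    public String getTransformationDocumentType() {
        return config.getDocumentType();
    }
}
```

With this shape, the same packaged processor could serve several document types, each instance being configured with a different property value.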

Similarly, in the delete action context, you need to implement the IJavaDeleteTransformationProcessor interface. The methods to implement are mostly the same, except for the process method, whose signature differs to emphasize that the operations allowed in this context are not the same as in the create/update context. Here is the class hierarchy defined in a delete action context:

Transformation Operations

This section lists the available transformation operations.

ITransformationHandler

The base interface of the transformation handler provides the two following methods, which control how documents are transformed within the processing pipeline.

Method	Description
discard()	Discards the current processor document, that is to say, prevents it from going to the next processor or next stage.
yield(doc)	Yields the newly created document to the next processor or to the Consolidation Store. Use this call for documents created in a transformation processor with the IJavaCreateTransformationHandler methods.
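As an illustration of discard(), the following sketch shows a processor body that stops documents lacking a given meta. It is self-contained: SketchDocument and SketchHandler are hypothetical stand-ins that expose only the two calls used here (hasMeta and discard, both named in the tables of this section), and the meta name "name" is an arbitrary example.

```java
// Stand-ins limited to the calls used below; the real CloudView interfaces are richer.
interface SketchDocument {
    boolean hasMeta(String name);
}

interface SketchHandler {
    void discard();
}

final class DiscardUnnamedSketch {

    // Body of a hypothetical process(handler, document) implementation:
    // documents without a "name" meta go no further in the pipeline,
    // all other documents continue to the next processor or stage.
    static void process(final SketchHandler handler, final SketchDocument document) {
        if (!document.hasMeta("name")) {
            handler.discard();
        }
    }
}
```

Documents that are not discarded need no explicit call. As in Java Example 1, they are transmitted to the next processor automatically.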

ICreateTransformationHandler

The interface to add new documents to the Consolidation Store provides three different create methods.

Tip:

When a child URI is forged using the method:

IMutableTransformationDocument childDoc = ((IJavaAllUpdatesTransformationHandler) handler).createChildDocument(document, subURI, sdcType);
childDoc.getUri();

The created URI is "document.URI" + childSeparator + "sub-URI". Because the childSeparator is a private string that is not visible in generated URIs, it is impossible to rebuild this URI later without calling the same method.

Recommendation: To handle partial update use cases, create links from child to parent from the child only.

Instead of:

document.addArcTo("hasTextualElement", child.getUri());

Prefer:

child.addArcFrom("hasTextualElement", document.getUri());

Method	Description
createDocument(uri,type,parentTypes)	Creates a transformation document with the required given properties and with automatic memory management. In other words, if no edges point to it at the end of the transformation phase, the document is deleted by the Consolidation Server automatically.
createChildDocument(parentDoc,subURI,type,parentTypes)	Creates a transformation document from a parent one with the given properties.
createUnmanagedDocument(uri,type,parentTypes)	Creates a transformation document with the given properties without automatic memory management. This is the opposite behavior of the createDocument method in terms of memory management.
getDocumentChildrenPath(String parentURI, String childURI)	Builds a child URI when you do not have access to the child itself. Never forge a URI by hand.
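The difference between createDocument and createUnmanagedDocument can be pictured with a small in-memory model. This sketch does not call the Consolidation Server. ManagedDocumentSketch, addArc, and survivors are hypothetical names, and the model only encodes the rule stated in the table: a managed document with no incoming edge at the end of the transformation phase is removed.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// In-memory illustration only; it does not call the Consolidation Server.
final class ManagedDocumentSketch {
    // URI -> true when the document was created with automatic memory management.
    private final Map<String, Boolean> managedByUri = new LinkedHashMap<>();
    // URIs that are the target of at least one arc.
    private final Set<String> arcTargets = new HashSet<>();

    void createDocument(String uri) {
        managedByUri.put(uri, true);
    }

    void createUnmanagedDocument(String uri) {
        managedByUri.put(uri, false);
    }

    void addArc(String fromUri, String toUri) {
        arcTargets.add(toUri);
    }

    // Documents kept at the end of the phase: every unmanaged document,
    // plus the managed documents that have at least one incoming arc.
    Set<String> survivors() {
        Set<String> kept = new LinkedHashSet<>();
        for (Map.Entry<String, Boolean> entry : managedByUri.entrySet()) {
            boolean managed = entry.getValue();
            if (!managed || arcTargets.contains(entry.getKey())) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }
}
```

In this model, a managed document that nothing points to disappears, while an unmanaged one stays until it is deleted explicitly.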

IDeleteTransformationHandler

The interface to send delete orders to the Consolidation Store.

Method	Description
deleteDocument()	Sends a recursive delete order for the current document.
deleteDocument(uri)	Sends a recursive delete order for the document with the given URI prefix.
deleteDocument(uri,boolean)	Sends a delete order for the specified document URI, recursively or not. If the boolean flag is true, then all URIs with a prefix matching the given URI are also deleted.
deleteDocument(doc)	Sends a recursive delete order for the specified document and possibly all documents with a prefix matching the document URI.
deleteDocument(doc,boolean)	Sends a delete order for the specified document, recursively or not. If the boolean flag is true, then all URIs with a prefix matching the document URI are also deleted.
deleteDocumentChildren(doc,path)	Sends a delete order for all document children matching the given path. The document itself is not deleted.
deleteDocumentChildren(doc)	Sends a delete order for all document children. The document itself is not deleted.
deleteDocumentChildren(parentURI,path)	Sends a deletion order for all document children matching the path of the given parent URI. The document itself is not deleted.
deleteDocumentChildren(parentURI)	Sends a deletion order for all document children with the given parent URI prefix. The document itself is not deleted.
deleteDocumentRootPath(rootURI)	Sends a deletion order for all documents matching the root URI prefix.
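The recursive flag of deleteDocument(uri,boolean) can be illustrated on a plain set of URIs. The following is a minimal sketch under the assumption that "prefix matching" behaves like String.startsWith. The real matching rules (for example around the child separator) may be stricter. PrefixDeleteSketch and its delete method are hypothetical names.

```java
import java.util.Set;
import java.util.TreeSet;

// In-memory illustration of recursive versus non-recursive delete orders.
final class PrefixDeleteSketch {

    // Returns the URIs left in the store after a delete order for `uri`.
    // recursive = true  : every URI starting with `uri` is removed (naive prefix test).
    // recursive = false : only the exact URI is removed.
    static Set<String> delete(Set<String> storeUris, String uri, boolean recursive) {
        Set<String> remaining = new TreeSet<>();
        for (String candidate : storeUris) {
            boolean hit = recursive ? candidate.startsWith(uri) : candidate.equals(uri);
            if (!hit) {
                remaining.add(candidate);
            }
        }
        return remaining;
    }
}
```

For a store containing "doc", "doc/child", and "other", a recursive order on "doc" leaves only "other", whereas a non-recursive order leaves "doc/child" and "other".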

IMutableTransformationDocument

The following interface provides the operations specific to a Transformation document.

Method	Description
addArcFrom(arcType, fromDoc)	Registers an arc addition from the specified document to the current one.
addArcFrom(arcType, fromDocURI)	Registers an arc addition from the document specified by the URI to the current one.
addArcTo(arcType, toDoc)	Registers an arc addition from the current document to the specified document.
addArcTo(arcType, toDocURI)	Registers an arc addition from the current document to the document specified by the URI.
removeAllPredecessorArcs()	Registers for deletion all adjacent arcs heading to the current document.
removeAllSuccessorArcs()	Registers for deletion all adjacent arcs starting from the current document.
removeArcFrom(arcType, fromDoc)	Registers for deletion the arc starting from the specified document to the current one, with the given type.
removeArcFrom(arcType, fromDocURI)	Registers for deletion the arc starting from the document specified by the URI to the current one, with the given type.
removeArcTo(arcType, toDoc)	Registers for deletion the arc starting from the current document to the specified document, with the given type.
removeArcTo(arcType, toDocURI)	Registers for deletion the arc starting from the current document to the document specified by the URI, with the given type.
setType(documentType, parentTypes)	Defines the document type, as well as its possible parents as defined in getTypeInheritance().

IConsolidationDocument

The following interface gives access to the default data encapsulated within a consolidation document, either for transformation or aggregation.

Method	Description
isOfType(type)	Indicates if the type transmitted is among the list of the current document types.
getAllDirectives()	Returns all the directives defined in this document.
getAllMetas()	Returns all the metas defined in this document.
getAllParts()	Returns all the parts defined in this document.
getDirectiveNames()	Returns all the document directive names.
getDirective(name)	Returns the first directive value for the given name.
getDirectives(name)	Returns all the directives for the given name.
getMetaNames()	Returns all the meta names.
getMeta(name)	Returns the first meta value for the given name.
getMetas(name)	Returns all the meta values for the given name.
getOriginalSources()	Returns the list of original sources for the given document.
getPartNames()	Returns all the document part names.
getPart(name)	Returns the first document part for the given name.
getParts(name)	Returns the list of document parts for the given name.
getSource()	Returns the document original source that produced it.
getType()	Returns the document representative type.
getTypeInheritance()	Returns the type inheritance for the document. The first one in the list is a descendant of the second one, the second one of the third one, and so on. So types are ordered from the most specific to the most generic.
getUri()	Returns the document unique identifier.
hasDirective(name)	Indicates if the directive name has an associated value within the document.
hasMeta(name)	Indicates if the meta name has an associated value within the document.
hasPart(name)	Indicates if the part name has an associated value within the document.

IMutableConsolidationDocument

This interface enriches the operations available within IConsolidationDocument with a list of operations allowing the modifications of internal data.

Method	Description
deleteDirective(name)	Deletes all the directive values associated to the specified directive name.
deleteDirectives(name, values)	Deletes only the given values for the specified directive name.
deleteMeta(name)	Deletes all the meta values associated to the specified meta name.
deleteMetas(name, values)	Deletes only the given meta values from the specified meta name.
deleteParts(name)	Deletes the document parts related to the specified part name.
deleteParts(name, documentParts)	Deletes only the given document parts for the specified part name.
setDirective(name, value)	Assigns the given value to the specified directive name.
setAllDirectives(directives)	Assigns all the directive name/values associated to the current document.
setMeta(name, value)	Assigns the given meta value to the specified meta name.
setMeta(name, values)	Assigns the given meta values to the specified meta name.
setAllMetas(metas)	Assigns all the meta name/values associated to the current document.
setPart(name, docPart)	Assigns the given document part to the specified part name.
setParts(name, docParts)	Assigns the given document parts to the specified part name.
setAllParts(parts)	Assigns all the parts associated to the current document.
withDirective(name, value)	Adds the value of a specific directive to the possible list of predefined directive values. If none is defined, a new list is created.
withDirectives(name, values)	Adds the values of a specific directive to the possible list of predefined directive values. If none is defined, a new list is created.
withDirectives(directives)	Adds the list of directive key-values to the possible list of predefined directive values.
withMeta(name, value)	Adds the value of a specific meta to the possible list of predefined meta values. If none is defined, a new list is created.
withMeta(name, values)	Adds the values of a specific meta to the possible list of predefined meta values. If none is defined, a new list is created.
withMetas(metas)	Adds the list of meta key-values to the possible list of predefined meta values.
withPart(name, docPart)	Adds the document part to the list of existing predefined parts. If none is defined, a new list is created.
withPart(name, docParts)	Adds the sequence of document parts to the list of existing predefined parts. If none is defined, a new list is created.
withParts(allParts)	Adds the list of parts associated to the current document.
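The table distinguishes set* methods (assign) from with* methods (add to the existing list) and delete* methods (remove). The following in-memory sketch shows one plausible reading of these semantics for metas. MetaSemanticsSketch is a hypothetical class, it does not implement the CloudView interface, and the assumption that setMeta replaces any previous values is an interpretation of "assigns".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory illustration of set / with / delete semantics on multivalued metas.
final class MetaSemanticsSketch {
    private final Map<String, List<String>> metas = new LinkedHashMap<>();

    // setMeta-like: replaces whatever values were stored under the name (assumed).
    void setMeta(String name, String... values) {
        metas.put(name, new ArrayList<>(List.of(values)));
    }

    // withMeta-like: appends to the existing list, creating it if none is defined.
    void withMeta(String name, String... values) {
        metas.computeIfAbsent(name, key -> new ArrayList<>()).addAll(List.of(values));
    }

    // deleteMetas-like: removes only the given values and keeps the others.
    void deleteMetas(String name, String... values) {
        List<String> current = metas.get(name);
        if (current != null) {
            current.removeAll(List.of(values));
        }
    }

    List<String> getMetas(String name) {
        return metas.getOrDefault(name, List.of());
    }
}
```

In this model, setMeta("color", "red") followed by withMeta("color", "blue") yields two values, and a later setMeta("color", "green") leaves only "green".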

Define Java Aggregation Processors

In the Transformation phase, you have possibly filtered, modified, linked, and pushed documents into the Consolidation Store. In the Aggregation phase, you are then ready to aggregate or enrich them together for the Exalead CloudView Index. You can also decide to notify the Indexer to delete some documents generated during the Aggregation.

You can define aggregation processors using default processors made for generic operations, or through custom Java code if your needs are more specific.

Use Default Aggregation Processors

1. Under Aggregation processors, click Add processor.

2. In the Add processor dialog box, select Java, give a name to the processor, and then choose one of the following default processors.

Aggregation Processor	Class Id	Description
Basic Aggregation Processor	com.exalead.cloudview.consolidation.processors.java.classic.BasicAggregationProcessor	Adds/sets metas, directives, or parts from documents at the end of paths, returned by the given graph matching expression. See the example below this table.
Classification Processor	com.exalead.cloudview.consolidation.processors.java.classic.ClassficationAggregationProcessor	Generates classification metadata representing path nodes ('node1_id/node2_id/node3_id...').
Discard Processor	com.exalead.cloudview.consolidation.processors.java.classic.DiscardAggregationProcessor	Discards documents matching the given document types. For a use case example, see UC-8: Consolidating Data from Storage Service.
Set Directive Processor	com.exalead.cloudview.consolidation.processors.java.classic.SetDirectiveAggregationProcessor	Sets the given directive on the processed document.
Set Meta Processor	com.exalead.cloudview.consolidation.processors.java.classic.SetMetaAggregationProcessor	Sets the given meta on the processed document.
Storage Service Key Flattener Processor	com.exalead.cloudview.consolidation.processors.java.classic.StorageServiceKeyFlattenerAggregationProcessor	Sets metas on a document coming from the Storage Service. For a use case example, see UC-8: Consolidating Data from Storage Service.
Interconnector Aggregator Processor	com.exalead.cloudview.consolidation.processors.java.InterconnectorAggregatorProcessor	Aggregates a parent document with its child document, given a graph path from parent to child. For a use case example, see the Exalead CloudView Connectors Guide.

For example, with the Basic Aggregation Processor, you can replace the following Groovy code, which rewrites metas at the end of paths:

process("eno:bo:CATPart") {
for (node in match(it, "-eno:from[eno:co:Viewable].eno:to[eno:bo:CgrViewable].-eno:thumbnail")*.last()) {
it.metas."3dthb_46_phyid" += node.metas."physicalid";
it.metas."3dthb_46_name" += node.metas."sdc_46_3dthb_46_name";
it.metas."3dthb_46_format" += node.metas."sdc_46_3dthb_46_format";
}
}

with the following configuration:

3. Click Apply.
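For comparison, the Groovy snippet above could be written as a custom Java processor body along the following lines. This is a sketch, not the product's code: PathEndMatcher and SketchDoc are hypothetical stand-ins that expose only matchPathEnd, getMetas, and withMeta (names taken from the operation tables of this page), and treating the aggregated document as mutable is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in for the handler, limited to the one call used below.
interface PathEndMatcher {
    List<SketchDoc> matchPathEnd(SketchDoc start, String graphMatchingExpression);
}

// Stand-in for a document holding multivalued metas.
final class SketchDoc {
    private final Map<String, List<String>> metas = new HashMap<>();

    List<String> getMetas(String name) {
        return metas.getOrDefault(name, List.of());
    }

    // Appends values to the list stored under the name, creating it if needed.
    void withMeta(String name, List<String> values) {
        metas.computeIfAbsent(name, key -> new ArrayList<>()).addAll(values);
    }
}

final class ThumbnailAggregationSketch {
    static final String EXPR =
            "-eno:from[eno:co:Viewable].eno:to[eno:bo:CgrViewable].-eno:thumbnail";

    // Copies three metas from each document found at the end of a matching path
    // onto the document being aggregated, like the "+=" lines of the Groovy code.
    static void process(final PathEndMatcher handler, final SketchDoc document) {
        for (SketchDoc node : handler.matchPathEnd(document, EXPR)) {
            document.withMeta("3dthb_46_phyid", node.getMetas("physicalid"));
            document.withMeta("3dthb_46_name", node.getMetas("sdc_46_3dthb_46_name"));
            document.withMeta("3dthb_46_format", node.getMetas("sdc_46_3dthb_46_format"));
        }
    }
}
```

The sketch uses matchPathEnd because the Groovy code keeps only the last node of each path (`*.last()`).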

Create Custom Aggregation Processors

In Java, to define an Aggregation processor in the create/update action context, you need to implement the IJavaAllUpdatesAggregationProcessor interface. Here is the actual interface definition:

/**
 * Defines the interface for all Java aggregation processors that need to perform document operations
 * in a non-delete context.
 */
public interface IJavaAllUpdatesAggregationProcessor extends IJavaAggregationProcessor {
    /**
     * Performs the aggregation operations of the client's processor for the document transmitted,
     * with the help of the handler provided.
     *
     * @param handler The aggregation handler with the allowed operations for the processor.
     * @param document The reference document.
     * @throws Exception Occurs for whatever reason in the client's implementation.
     * The exception will most likely be wrapped with contextual information before further propagation.
     */
    public void process(final IJavaAllUpdatesAggregationHandler handler, final IAggregationDocument document)
            throws Exception;
}

The parent interface is IJavaAggregationProcessor. It is shared with the delete-context interface, IJavaDeleteAggregationProcessor.

Note: This time, there is no need to specify the source connector within the Administration Console, since all documents are loaded from the Consolidation Store.

The class hierarchy is the following:

For the delete action context, you have to implement the IJavaDeleteAggregationProcessor interface as follows:

/**
 * Defines the interface for all Java aggregation processors that need to perform delete
 * processing on documents.
 */
public interface IJavaDeleteAggregationProcessor extends IJavaAggregationProcessor {
    /**
     * Performs the aggregation operations of the client's processor for the document transmitted,
     * with the help of the handler provided.
     *
     * @param handler The aggregation handler with the allowed operations for the processor.
     * @param document The reference document.
     * @throws Exception Occurs for whatever reason in the client's implementation.
     * The exception will most likely be wrapped with contextual information before further propagation.
     */
    public void process(final IJavaDeleteAggregationHandler handler, final IAggregationDocument document)
            throws Exception;
}

And finally, the class hierarchy is:

Aggregation Operations

This section lists the available aggregation operations.

IAggregationHandler

The base interface of the aggregation handler provides the following fundamental methods.

Method	Description
discard()	Discards the current processor document, that is to say, prevents it from going to the next processor or next stage.
getReason()	Returns a string representing the reason why the document is pushed to aggregation. It can have one of the following values: ADDED, DELETED, IMPACTED.
match(doc,graphMatchingExpression)	Finds the list of paths in the graph that start from the specified IAggregationDocument and that satisfy the graphMatchingExpression. Returns them as a list of documents.
matchPathEnd(doc,graphMatchingExpression)	Finds the documents at the end of each path in the graph that starts from the specified IAggregationDocument and satisfies the graphMatchingExpression. Returns them as a list of documents. This is useful when you do not want to overload the Consolidation Server with many useless intermediary documents found on the path between the starting document and the document level you chose as path end. In other words, instead of considering all the vertices on a given path, it only considers the one at the end.
matchPathEnd(doc,graphMatchingExpression,metas)	Finds the documents at the end of each path in the graph that starts from the specified IAggregationDocument and satisfies the graphMatchingExpression. Returns them as a list of documents. The goal of this method is to avoid impacting elements if the meta that changed is not used. Instead of considering all the vertices on a given path, it only considers the one at the end, and only if the meta used has changed. This is triggered when the impact detection is launched during the incremental scan. Warning: This method does not work with Date metas.
matchPathEnd(doc,graphMatchingExpression, testDirectives,testParts,metas)	Finds the documents at the end of each path in the graph that starts from the specified IAggregationDocument and satisfies the graphMatchingExpression. Returns them as a list of documents. The goal of this method is to avoid impacting elements if the meta that changed is not used, and if directives and parts are the same. Instead of considering all the vertices on a given path, it only considers the one at the end, and only if the meta used has changed, or if directives are different, or if parts are different. This is triggered when the impact detection is launched during the incremental scan. Warning: This method does not work with Date metas.
yield(doc)	Yields the newly created document to the forward rules without passing through the whole pipeline of aggregation processors. Use this call for documents created in an aggregation processor with the IJavaCreateAggregationHandler methods.
yieldAndForward(doc)	Yields the documents newly created in an aggregation processor to the next aggregation processor in the pipeline of aggregation processors. Use this call for documents created in an aggregation processor with the createDocument or the createChildDocument methods. This is to make sure that the document is forwarded to the next processor and not sent to the specified forward rules directly, unlike the yield(doc) method.
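The difference between yield(doc) and yieldAndForward(doc) is a routing difference, which the following sketch models with plain lists. YieldRoutingSketch and route are hypothetical names. The model assumes that a forwarded document passes through every remaining aggregation processor and then reaches the forward rules, which goes slightly beyond what the table states.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory model of where a newly created aggregation document goes next.
final class YieldRoutingSketch {

    // processors : names of the aggregation processors, in pipeline order.
    // createdIn  : 0-based index of the processor that created the document.
    // forward    : true models yieldAndForward(doc), false models yield(doc).
    // Returns the stages the document passes through after being yielded.
    static List<String> route(List<String> processors, int createdIn, boolean forward) {
        List<String> visited = new ArrayList<>();
        if (forward) {
            // Continue with the processors that come after the creating one.
            visited.addAll(processors.subList(createdIn + 1, processors.size()));
        }
        // In both cases the document ends at the forward rules (assumed for forward = true).
        visited.add("forward rules");
        return visited;
    }
}
```

With processors p1, p2, and p3 and a document created in p1, the yield case visits only the forward rules, while the yieldAndForward case visits p2, p3, and then the forward rules.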

IJavaAggregationHandler

This interface extends the IAggregationHandler interface to provide a different approach for collecting graph matching results when using Java.

Method	Description
match(doc,graphMatchingExpression,matchResultVisitor)	Finds the list of paths in the graph that start from the specified IAggregationDocument and that satisfy the graphMatchingExpression. Unlike the other match method, it provides the results using the matchResultVisitor instance with all unique documents matching the graph matching expression (independently of the paths reached).

ICreateAggregationHandler

The interface to add new documents to the forward rules provides two different create methods and a specific service to fetch document parts from a connector instance.

Method	Description
createDocument(uri,type,parentTypes)	Creates an aggregation document with the given properties. Unlike ICreateTransformationHandler.createDocument, this document is not automatically deleted if no edges point to it at the end of the aggregation phase. It is pushed as is to the forward rules, and sent (or not) to an Indexing Server or another Consolidation Instance.
createChildDocument(parentDoc,subURI,type,parentTypes)	Creates an aggregation document from a parent one with the given properties.
isFetchOperation()	When a Fetch Server performs a fetch operation request to the Consolidation Server, this handler (and in this case only) returns true. When this is the case, all the aggregation operations performed in the processor are directed in return to the Fetch Server. None of the documents aggregated proceed to the forward rules handler, and thus to the Indexing Server. The operations allowed in such event are the ones of a create/update context, and the fetchParts operation. In most cases, you do not have to deal with this kind of situation.
fetchParts(document,connectorName, connectorDocumentURI)	Fetches the parts corresponding to the connectorDocumentURI document from the connector specified by connectorName and appends them to the given document. This call makes sense only when the isFetchOperation() method returns true.

IDeleteAggregationHandler

The interface to send delete orders to the forward rules. Its methods are similar to those of IDeleteTransformationHandler, apart from an extra parameter that receives an optional list of document types.

When you create new custom documents during the aggregation phase using the create* methods of the ICreateAggregationHandler interface in one processor, and later try to send a delete order for these documents in another processor, you no longer have access to any of the document metadata, especially the document types.

Such information is only known by the Indexing Server or by another Consolidation Server instance, depending on the routing strategy applied by the forward rules handler.

As a result, if you want to send a delete order to custom aggregated documents, you need to specify their types so that the forward rules handler can apply a dedicated routing strategy.

You do not need to specify the types for all documents present in the Consolidation Store that are processed during the aggregation phase (unlike the transformation phase). The Consolidation Server provides all required metadata to the forward rule handler so that it can operate accurately.

Method	Description
deleteDocument()	Sends a recursive deletion order for the document being aggregated, and all the other documents with a prefix matching the current document URI.
deleteDocument(docTypes...)	Sends a recursive deletion order for the document being aggregated, and all the other subdocuments with a prefix matching the current document URI. Moreover, to delete documents not recognized in the Consolidation Store and allow correct routing/filtering by the forward rules handler, a recursive deletion order is sent for the specified document types matching the current document URI.
deleteDocument(uri,boolean)	Sends a deletion order for the specified document URI, recursively or not. If the boolean flag is true, then all URIs with a prefix matching the given URI are also deleted.
deleteDocument(uri, docTypes...)	Sends a recursive deletion order for the document with the specified URI prefix.
deleteDocument(uri,boolean,docTypes...)	Sends a deletion order for the specified document URI, recursively or not. If the boolean flag is true, then all URIs with a prefix matching the given URI are also deleted. Moreover, to delete documents not recognized in the Consolidation Store and allow correct routing/filtering by the forward rules handler, a deletion order (recursive or not) is sent for the given document types matching the specified document URI.
deleteDocument(doc)	Sends a recursive deletion order for the specified aggregated document and possibly all documents with a prefix matching the document URI.
deleteDocument(doc,docTypes...)	Sends a recursive deletion order for the specified aggregated document and possibly all documents with a prefix matching the document URI. Moreover, to delete documents not recognized in the Consolidation Store and allow correct routing/filtering by the forward rules handler, a recursive deletion order is sent for the given document types matching the document URI.
deleteDocument(doc,boolean)	Sends a deletion order for the given document, recursively or not. If the boolean flag is true, then all URIs with a prefix matching the document URI are also deleted.
deleteDocument(doc,boolean,docTypes...)	Sends a deletion order for the given document, recursive or not. If the boolean flag is true, then all URIs with a prefix matching the document URI are also deleted. Moreover, to delete documents not recognized in the Consolidation Store and allow correct routing/filtering by the forward rules handler, a deletion order (recursive or not) is sent for the specified document types matching the document URI.
deleteDocumentChildren(doc,path)	Sends a deletion order for all document children matching the given path. The document itself is not deleted.
deleteDocumentChildren(doc,path,docTypes...)	Sends a deletion order for all document children matching the given path. The document itself is not deleted. Moreover, to delete documents not recognized in the Consolidation Store and allow correct routing/filtering by the forward rules handler, a recursive children deletion order is sent for the specified document types matching the current document URI.
deleteDocumentChildren(uri,path)	Sends a deletion order for all document children of the given URI matching the given path. The document itself is not deleted.
deleteDocumentChildren(uri,path,docTypes...)	Sends a deletion order for all document children of the given URI matching the given path. The document itself is not deleted. Moreover, to delete documents not recognized in the Consolidation Store and allow correct routing/filtering by the forward rules handler, a recursive children deletion order is sent for the specified document types matching the specified document URI.
deleteDocumentChildren(doc)	Sends a deletion order for all document children. The document itself is not deleted.
deleteDocumentChildren(doc,docTypes...)	Sends a deletion order for all document children. The document itself is not deleted. Moreover, to delete documents not recognized in the Consolidation Store and allow correct routing/filtering by the forward rules handler, a recursive children deletion order is sent for the specified document types matching the current document URI.
deleteDocumentChildren(uri)	Sends a deletion order for all document children of the given URI. The document itself is not deleted.
deleteDocumentChildren(uri,docTypes...)	Sends a deletion order for all document children of the given URI. The document itself is not deleted. Moreover, to delete documents not recognized in the Consolidation Store and allow correct routing/filtering by the forward rules handler, a recursive children deletion order is sent for the specified document types matching the specified document URI.
deleteDocumentRootPath(rootURI)	Deletes all the documents matching the root URI prefix.
deleteDocumentRootPath(rootURI,docTypes...)	Deletes all the documents matching the root URI prefix, and with some forward rule types to allow correct routing/filtering by the forward rules handler.
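The effect of the boolean flag on URI-prefix matching can be pictured with a small self-contained model. This is only a sketch under assumptions: DeletionModel and its in-memory set are hypothetical and not part of the CloudView API, and the real handler sends deletion orders instead of editing a collection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory model of URI-prefix deletion; not part of the CloudView API.
final class DeletionModel {
    private final Set<String> uris = new TreeSet<>();

    void add(final String uri) {
        uris.add(uri);
    }

    // Models the boolean flag: exact match when false, prefix match when true.
    List<String> delete(final String uri, final boolean recursive) {
        final List<String> deleted = new ArrayList<>();
        // Iterate over a copy so that removing from the set is safe.
        for (final String candidate : new ArrayList<>(uris)) {
            final boolean matches = recursive ? candidate.startsWith(uri) : candidate.equals(uri);
            if (matches) {
                uris.remove(candidate);
                deleted.add(candidate);
            }
        }
        return deleted;
    }

    List<String> remaining() {
        return new ArrayList<>(uris);
    }
}
```

In this model, a non-recursive call removes at most one URI, whereas a recursive call also removes every URI that starts with the given one.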

IConsolidationDocument

The following interface gives access to the default data encapsulated within a consolidation document, either for transformation or aggregation.

Method	Description
isOfType(type)	Indicates if the type transmitted is among the list of the current document types.
getAllDirectives()	Returns all the directives defined in this document.
getAllMetas()	Returns all the metas defined in this document.
getAllParts()	Returns all the parts defined in this document.
getDirectiveNames()	Returns all the document directive names.
getDirective(name)	Returns the first directive value for the given name.
getDirectives(name)	Returns all the directives for the given name.
getMetaNames()	Returns all the meta names.
getMeta(name)	Returns the first meta value for the given name.
getMetas(name)	Returns all the meta values for the given name.
getOriginalSources()	Returns the list of original sources for the given document.
getPartNames()	Returns all the document part names.
getPart(name)	Returns the first document part for the given name.
getParts(name)	Returns the list of document parts for the given name.
getSource()	Returns the document original source that produced it.
getType()	Returns the document representative type.
getTypeInheritance()	Returns the type inheritance for the document. The first one in the list is a descendant of the second one, the second one of the third one, and so on. So types are ordered from the most specific to the most generic.
getUri()	Returns the document unique identifier.
hasDirective(name)	Indicates if the directive name has an associated value within the document.
hasMeta(name)	Indicates if the meta name has an associated value within the document.
hasPart(name)	Indicates if the part name has an associated value within the document.
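The ordering returned by getTypeInheritance() can be illustrated with a minimal model. This is a sketch only: TypeInheritanceModel, its isDescendantOf helper, and the "person" and "document" type names are hypothetical and do not come from the CloudView API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model of a document's type inheritance; not the CloudView implementation.
final class TypeInheritanceModel {
    // Ordered from the most specific type to the most generic one.
    private final List<String> types;

    TypeInheritanceModel(final String... typesMostSpecificFirst) {
        this.types = Collections.unmodifiableList(Arrays.asList(typesMostSpecificFirst.clone()));
    }

    List<String> getTypeInheritance() {
        return types;
    }

    // Mirrors the isOfType description: true if the type is among the document types.
    boolean isOfType(final String type) {
        return types.contains(type);
    }

    // True when 'descendant' appears before 'ancestor' in the inheritance list.
    boolean isDescendantOf(final String descendant, final String ancestor) {
        final int d = types.indexOf(descendant);
        final int a = types.indexOf(ancestor);
        return d >= 0 && a >= 0 && d < a;
    }
}
```

With the list "employee", "person", "document", the model reports that the document is of type "person" and that "employee" is a descendant of "document", but not the reverse.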

IMutableConsolidationDocument

This interface enriches the operations available within IConsolidationDocument with a list of operations allowing the modifications of internal data.

Method	Description
deleteDirective(name)	Deletes all the directive values associated to the specified directive name.
deleteDirectives(name, values)	Deletes only the given values for the specified directive name.
deleteMeta(name)	Deletes all the meta values associated to the specified meta name.
deleteMetas(name, values)	Deletes only the given meta values from the specified meta name.
deleteParts(name)	Deletes the document parts related to the specified part name.
deleteParts(name, documentParts)	Deletes only the given document parts for the specified part name.
setDirective(name, value)	Assigns the given value to the specified directive name.
setAllDirectives(directives)	Assigns all the directive name/values associated to the current document.
setMeta(name, value)	Assigns the given meta value to the specified meta name.
setMeta(name, values)	Assigns the given meta values to the specified meta name.
setAllMetas(metas)	Assigns all the meta name/values associated to the current document.
setPart(name, docPart)	Assigns the given document part to the specified part name.
setParts(name, docParts)	Assigns the given document parts to the specified part name.
setAllParts(parts)	Assigns all the parts associated to the current document.
withDirective(name, value)	Adds the value of a specific directive to the possible list of predefined directive values. If none is defined, a new list is created.
withDirectives(name, values)	Adds the values of a specific directive to the possible list of predefined directive values. If none is defined, a new list is created.
withDirectives(directives)	Adds the list of directive key-values to the possible list of predefined directive values.
withMeta(name, value)	Adds the value of a specific meta to the possible list of predefined meta values. If none is defined, a new list is created.
withMeta(name, values)	Adds the values of a specific meta to the possible list of predefined meta values. If none is defined, a new list is created.
withMetas(metas)	Adds the list of meta key-values to the possible list of predefined meta values.
withPart(name, docPart)	Adds the document part to the list of existing predefined parts. If none is defined, a new list is created.
withPart(name, docParts)	Adds the sequence of document parts to the list of existing predefined parts. If none is defined, a new list is created.
withParts(allParts)	Adds the list of parts associated to the current document.
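The difference between the set, with, and delete families can be sketched with a small multi-valued map. This is a simplified model under assumptions: MetaModel is hypothetical, it only covers metas, and returning null for a missing meta is an assumption inferred from the null checks in the samples of this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of multi-valued metas; not the CloudView implementation.
final class MetaModel {
    private final Map<String, List<String>> metas = new LinkedHashMap<>();

    // set*: replaces whatever values were stored under the name.
    void setMeta(final String name, final String... values) {
        metas.put(name, new ArrayList<>(Arrays.asList(values)));
    }

    // with*: appends to the existing values, creating the list if none is defined.
    void withMeta(final String name, final String... values) {
        metas.computeIfAbsent(name, k -> new ArrayList<>()).addAll(Arrays.asList(values));
    }

    // delete with values: removes only the listed values.
    void deleteMetas(final String name, final String... values) {
        final List<String> current = metas.get(name);
        if (current != null) {
            current.removeAll(Arrays.asList(values));
            if (current.isEmpty()) {
                metas.remove(name);
            }
        }
    }

    // First value, or null when the meta is absent (assumption).
    String getMeta(final String name) {
        final List<String> values = metas.get(name);
        return (values == null || values.isEmpty()) ? null : values.get(0);
    }

    List<String> getMetas(final String name) {
        return new ArrayList<>(metas.getOrDefault(name, new ArrayList<>()));
    }
}
```

In this model, calling withMeta twice accumulates both values, while a later setMeta discards them and keeps only the new ones.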

Company Hierarchy Example

In the following use case, we have people and companies, and we want to enrich the company with a meta indicating the number of employees present at any time.

We have two types of documents:

• company: Contains the company name in its URI. It holds possibly many other metas that identify the company.

• employee: Contains the employee's name in its URI. It holds possibly many other metas that identify the employee, but contains at least two metas:

◦ company_name contains the name of the company for which the employee works.

◦ service_name contains the service in which the employee is working (sales, R&D, marketing, etc.).

Connect Employees to Services and Services to Companies

We want to connect each employee to the service, and the service to the appropriate company with the following data model.

The code for such a transformation may look like the following:

Example 1. Employee's Transformation Processor

@CVComponentConfigClass(configClass=CVComponentConfigNone.class)
public class EmployeeTransformationProcessor implements IJavaAllUpdatesTransformationProcessor {
    public EmployeeTransformationProcessor(final CVComponentConfigNone config) {
    }

    @Override
    public String getTransformationDocumentType() {
        return "employee";
    }

    @Override
    public void process(final IJavaAllUpdatesTransformationHandler handler,
            final IMutableTransformationDocument document) throws Exception {
        final String companyName = document.getMeta("company_name");
        final String serviceName = document.getMeta("service_name");
        if ((companyName != null) && (! companyName.isEmpty()) && (serviceName != null) &&
                (! serviceName.isEmpty())) {
            final ITransformationDocument serviceDoc = addService(handler, document, serviceName, companyName);
            document.addArcTo("employee", serviceDoc.getUri());
        }
    }

    private ITransformationDocument addService(final IJavaAllUpdatesTransformationHandler handler,
            final IMutableTransformationDocument document, final String serviceName, final String companyName) {
        final IMutableTransformationDocument newDoc = handler.createDocument("service=" + serviceName + "&",
                "service");
        newDoc.addArcTo("service", "company=" + companyName + "&");
        // Note that the yield here is required because it is a document created
        // during the Transformation phase
        handler.yield(newDoc);
        return newDoc;
    }
}

The drawback of this implementation is that it pushes the arcs that link services to the company several times. In the end, since these arcs have the same type, only the relevant ones persist (with no redundancies in the store).
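To picture why repeated pushes of the same arc leave no redundancy, consider the following minimal model. It is a sketch under assumptions: ArcStoreModel is hypothetical and simply treats an arc as a (source, type, target) triple stored in a set; the actual Consolidation Store behavior may differ in its details.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical model of arc storage; not the Consolidation Store implementation.
final class ArcStoreModel {
    private final Set<String> arcs = new LinkedHashSet<>();
    private int pushes = 0;

    // Records one push; identical (source, type, target) triples collapse into one entry.
    void pushArc(final String sourceUri, final String type, final String targetUri) {
        pushes++;
        arcs.add(sourceUri + " -" + type + "-> " + targetUri);
    }

    int pushCount() {
        return pushes;
    }

    int storedCount() {
        return arcs.size();
    }

    public static void main(final String[] args) {
        final ArcStoreModel store = new ArcStoreModel();
        final String[][] employees = {{"Alice", "RandD"}, {"Bob", "RandD"}, {"John", "Sales"}};
        for (final String[] e : employees) {
            final String serviceUri = "service=" + e[1] + "&";
            store.pushArc("employee=" + e[0] + "&", "employee", serviceUri);
            // Pushed once per employee, so the RandD arc is pushed twice.
            store.pushArc(serviceUri, "service", "company=3ds&");
        }
        System.out.println(store.pushCount() + " pushes, " + store.storedCount() + " arcs stored");
    }
}
```

In this model, each employee triggers a push of its service arc, so services with several employees generate redundant pushes even though the stored result is the same.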

However, it is always better to minimize the number of redundant operations. If we had the required information, we could create the different services that the company has, with unique URIs, and then at the employee level, we would link employees to services.

A possible implementation could be:

...
@Override
public void process(final IJavaAllUpdatesTransformationHandler handler,
        final IMutableTransformationDocument document) throws Exception {
    final String companyName = document.getMeta("company_name");
    final String serviceName = document.getMeta("service_name");
    if ((companyName != null) && (! companyName.isEmpty()) && (serviceName != null) &&
            (! serviceName.isEmpty())) {
        document.addArcTo("employee", "service=" + serviceName + "_" + companyName + "&");
    }
}
...

Despite its imperfection, let us stick to the first implementation from now on. For more information about the methods used in this sample, see IMutableTransformationDocument.

Keep the Business Logic Within the Connector

Sometimes, you might want to keep the business logic within your connector, even though this is not recommended. You can do so using the com.exalead.cloudview.consolidationapi.PushAPITransformationHelpers class. The sample below shows how to embed the graph modeling done by the EmployeeTransformationProcessor directly within your connector.

final PushAPI employeePushAPI = CloudviewAPIClientsFactory.newInstance(GATEWAY_URL)
        .newPushAPI(PUSH_API_SERVER, CONNECTOR_NAME);
final List<Document> employees = new ArrayList<Document>();

Document employee = new Document("employee=Alice&");
employee.addMeta("company_name", "3ds");
employee.addMeta("service_name", "RandD");

employees.add(employee);

employee = new Document("employee=Bob&");
employee.addMeta("company_name", "3ds");
employee.addMeta("service_name", "RandD");

employees.add(employee);

employee = new Document("employee=John&");
employee.addMeta("company_name", "3ds");
employee.addMeta("service_name", "Sales");

employees.add(employee);
// getServiceURI and getCompanyURI are connector-side helpers not shown in this sample
final Iterator<Document> employeesIt = employees.iterator();
while (employeesIt.hasNext()) {
    employee = employeesIt.next();
    final String serviceURI = getServiceURI(employee.getMetaContainer().getMeta("service_name").getValue());
    // Create service managed document
    PushAPITransformationHelpers.createDocument(employee, serviceURI, "service");

    // Add arc from employee to service
    PushAPITransformationHelpers.addArcTo(employee, "employee", serviceURI);

    // Add arc from service to company
    final String companyURI = getCompanyURI(employee.getMetaContainer().getMeta("company_name").getValue());
    PushAPITransformationHelpers.addArcTo(employee, "service", serviceURI, companyURI);
    employeePushAPI.addDocument(employee);
}

Count the Number of Employees and Push Updated Documents

Now, for each company document, we want to add a nb_employees meta that counts the total number of employees, and push the updated document to the Indexing Server. You can perform this kind of task during the aggregation phase.

A possible implementation could be:

@CVComponentConfigClass(configClass=CVComponentConfigNone.class)
public final class CompanyAggregationProcessor implements IJavaAllUpdatesAggregationProcessor {
    public CompanyAggregationProcessor(final CVComponentConfigNone config) {
    }

    @Override
    public String getAggregationDocumentType() {
        return "company";
    }

    @Override
    public void process(final IJavaAllUpdatesAggregationHandler handler, final IAggregationDocument document)
            throws Exception {
        int nbEmployees = 0;
        for (final IAggregationDocument serviceDoc : GraphMatchHelpers.getPathsEnd(handler.match(document,
                "-service"))) {
            nbEmployees += handler.match(serviceDoc, "-employee").size();
        }
        document.withMeta("nb_employees", String.valueOf(nbEmployees));
    }
}

We first retrieve all the services that belong to a given company with the following call:

handler.match(document, "-service")

This returns all the paths starting from the company document that follow the service edge in reverse order.

By design, and as the match query implies, each such path contains a single document: a neighbor of the company document. GraphMatchHelpers.getPathsEnd extracts that last document from each path. The Java code of this helper method is equal (or equivalent) to:

public static <T> List<T> getPathsEnd(final List<List<T>> paths) {
return Lists.transform(paths, new Function<List<T>, T>() {
@Override
public T apply(final List<T> path) {
return Iterables.getLast(path);
}
});
}

Then for each service document:

handler.match(serviceDoc, "-employee").size()

This call returns all the paths, each leading to a unique employee in the service. The number of paths therefore gives the number of employees in the service. The company document is then enriched with the nb_employees meta, set to the sum accumulated over all the services.

A better and simpler implementation is:

Example 2. Company's Aggregation Processor

Reach Employee Documents from the Company Document

The graph matching expression language allows us to specify an arbitrarily long path, with various quantifiers (see Appendix - Matching Expressions Grammar). We can therefore reach the employee documents from the company document directly, with the expression:

handler.match(document, "-service.-employee")
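The way such a two-hop reverse path yields one path per employee can be sketched with a small self-contained graph model. This is an illustration under assumptions: ReverseMatchModel and matchReverse are hypothetical, they do not parse the real matching expression grammar, and they only follow a fixed sequence of arc types in reverse, one hop per type. As in the description of the single-hop case above, the paths in this model do not include the starting document.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of reverse arc traversal; not the CloudView matching engine.
final class ReverseMatchModel {
    // Key: arcType + "|" + targetUri. Value: sources of arcs of that type pointing at the target.
    private final Map<String, List<String>> incoming = new HashMap<>();

    void addArc(final String sourceUri, final String type, final String targetUri) {
        incoming.computeIfAbsent(type + "|" + targetUri, k -> new ArrayList<>()).add(sourceUri);
    }

    // Follows the given arc types in reverse, one hop per type, and returns every path found.
    List<List<String>> matchReverse(final String startUri, final String... types) {
        List<List<String>> paths = new ArrayList<>();
        paths.add(new ArrayList<>()); // a single empty path rooted at startUri
        for (final String type : types) {
            final List<List<String>> extended = new ArrayList<>();
            for (final List<String> path : paths) {
                final String tail = path.isEmpty() ? startUri : path.get(path.size() - 1);
                final List<String> sources =
                        incoming.getOrDefault(type + "|" + tail, Collections.<String>emptyList());
                for (final String source : sources) {
                    final List<String> next = new ArrayList<>(path);
                    next.add(source);
                    extended.add(next);
                }
            }
            paths = extended;
        }
        return paths;
    }
}
```

With the three employees of the connector sample (two in RandD, one in Sales), following "service" then "employee" in reverse from the company produces three paths in this model, so the number of paths equals the number of employees.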

Push the Number of Employees Present in Each Service

As a refinement, we might also want to push the number of employees present in each service. The following code is enough:

@CVComponentConfigClass(configClass=CVComponentConfigNone.class)
public final class ServiceAggregationProcessor implements IJavaAllUpdatesAggregationProcessor {
    public ServiceAggregationProcessor(final CVComponentConfigNone config) {
    }

    @Override
    public String getAggregationDocumentType() {
        return "service";
    }

    @Override
    public void process(final IJavaAllUpdatesAggregationHandler handler, final IAggregationDocument document)
            throws Exception {
        document.withMeta("nb_employees", String.valueOf(handler.match(document, "-employee").size()));
    }
}

Important: If you have written the above processor first, avoid writing the following processor afterward to aggregate the number of employees for the company.

@CVComponentConfigClass(configClass=CVComponentConfigNone.class)
public final class CompanyAggregationProcessor implements IJavaAllUpdatesAggregationProcessor {
    public CompanyAggregationProcessor(final CVComponentConfigNone config) {
    }

    @Override
    public String getAggregationDocumentType() {
        return "company";
    }

    @Override
    public void process(final IJavaAllUpdatesAggregationHandler handler, final IAggregationDocument document)
            throws Exception {
        int nbEmployees = 0;
        for (final IAggregationDocument serviceDoc : GraphMatchHelpers.getPathsEnd(handler.match(document,
                "-service"))) {
            final String nbServiceEmployees = serviceDoc.getMeta("nb_employees");
            if ((nbServiceEmployees != null) && (! nbServiceEmployees.isEmpty())) {
                nbEmployees += Integer.parseInt(nbServiceEmployees);
            }
        }
        document.withMeta("nb_employees", String.valueOf(nbEmployees));
    }
}

The code above collects all service documents, and for each of them, sums up the values coming from its nb_employees meta.

Impact detection is not the problem here: even if we sum up the service meta values while more documents are still arriving, it ensures that the processor is re-evaluated for the affected company documents.

What prevents this code from working properly is that the data visible during the Aggregation phase comes only from the data pushed to the Consolidation Store, and nothing more. In other words, document modifications and newly created custom documents that occur during the Aggregation phase are not visible to one another. So the company processor has no visibility on the new nb_employees meta created during the aggregation phase by the service processor.
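This visibility rule can be pictured with a minimal model that keeps pushed data and aggregation output in two separate maps. It is a sketch under assumptions: AggregationVisibilityModel and all its method names are hypothetical, and the real Consolidation Server internals are certainly more elaborate.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of aggregation-phase visibility; not the CloudView implementation.
final class AggregationVisibilityModel {
    // Metas as pushed to the store: the only data aggregation processors can read here.
    private final Map<String, Map<String, String>> store = new HashMap<>();
    // Metas produced during aggregation: forwarded onward, never read back by processors.
    private final Map<String, Map<String, String>> aggregated = new HashMap<>();

    void push(final String uri, final String name, final String value) {
        store.computeIfAbsent(uri, k -> new HashMap<>()).put(name, value);
    }

    void enrichDuringAggregation(final String uri, final String name, final String value) {
        aggregated.computeIfAbsent(uri, k -> new HashMap<>()).put(name, value);
    }

    // What another aggregation processor sees: store content only.
    String readDuringAggregation(final String uri, final String name) {
        final Map<String, String> metas = store.get(uri);
        return metas == null ? null : metas.get(name);
    }

    // What is forwarded downstream: aggregation output on top of store content.
    String readForwarded(final String uri, final String name) {
        final Map<String, String> extra = aggregated.get(uri);
        if (extra != null && extra.containsKey(name)) {
            return extra.get(name);
        }
        return readDuringAggregation(uri, name);
    }
}
```

In this model, a nb_employees value written by a service processor shows up in the forwarded document but stays invisible to a company processor reading the service document, which is why the company processor should count employees through graph matching instead.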