Compound Operators
 
Unary Operators
Binary Operators
Nary Operators
Proximity Operators
All these operators hold other operators. Their options, referred to below as internal_options, can be used, for example, for scoring purposes:
positions.merge:
KEEP - the positions of the child operators are never merged.
MERGE - the positions of the child operators are merged when they are the same.
<nodePropertyName>.policy={ADD,MAX,MIN} - specifies how the node properties of children are merged.
*.policy= - specifies the same policy for all node properties.
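The policy options above can be illustrated with a minimal Python sketch. It is not the engine's implementation: the function name merge_node_properties, the dictionary representation of node properties, and the fallback to ADD when no policy is given are all assumptions made for illustration.

```python
# Hypothetical reducers keyed by the policy names ADD, MAX and MIN.
POLICIES = {"ADD": sum, "MAX": max, "MIN": min}


def merge_node_properties(children, policies, default="ADD"):
    """Merge the property dicts of child operators into one parent dict.

    children: list of {property_name: value} dicts, one per child.
    policies: {property_name: policy}, where "*" applies to every
              property that has no explicit entry.
    default:  assumed fallback policy (not stated in the documentation).
    """
    merged = {}
    names = sorted({name for child in children for name in child})
    for name in names:
        values = [child[name] for child in children if name in child]
        policy = policies.get(name, policies.get("*", default))
        merged[name] = POLICIES[policy](values)
    return merged


children = [{"score": 2, "boost": 5}, {"score": 3, "boost": 1}]
# score.policy=ADD, *.policy=MAX
result = merge_node_properties(children, {"score": "ADD", "*": "MAX"})
print(result)
```

Here score is summed across the children, while boost falls under the wildcard policy and keeps its maximum value.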
Unary Operators
Syntax
Use
Examples
#not{internal_options}(query)
Returns the documents that do not match the query.
Retrieve all the documents that do not contain toto:
#not(#alphanum(text, "toto"))
#opt{internal_options}(query)
Makes query parts optional (only useful for ranking).
-
#autocache{internal_options}(query)
Caches the documents returned by query, and uses the cache for subsequent queries. The scoring information associated with each document in the cache is discarded.
Load all the documents in the cache and return the documents:
#autocache{expectedSize=LARGE}(#category(categories, "Top/Source/default"))
Use the cache:
#autocache(#category(categories, "Top/Source/default"))
#at(position, query)
Only works on alphanum fields.
It applies query at an exact position in the field. The position is expressed in terms of indexed tokens, which means that it usually does not take into account spaces, punctuation, etc.
position can be a non-negative value starting at 0, or a negative value (backward position, with -1 meaning the last position).
When the position is negative, only #alphanum or #anumpattern can be used.
Important: Be careful when using #at with a text field since by default, several contexts are mapped into text (including title, htmlcontext, etc.)
Retrieve all the documents beginning with toto:
#at(0, #alphanum{k=2}(text, "toto"))
Retrieve all the documents ending with toto:
#at(-1, #alphanum{k=2}(text, "toto"))
Retrieve all the documents with toto as third word:
#at(2, #alphanum{k=2}(text, "toto"))
Retrieve all the documents with exactly one word, toto (the same toto at the end and the beginning of the document):
#at(0, #at(-1, #alphanum{k=2}(text, "toto")))
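The position rules for #at can be sketched in Python as follows. This is only an illustration of the indexing convention described above (0 is the first token, -1 the last). The helper names resolve_position and matches_at are hypothetical, and the token list stands in for whatever the index actually stores.

```python
def resolve_position(position, token_count):
    """Map an #at-style position to a token index, or None if out of range.

    Non-negative positions count from the first token (0);
    negative positions count backward from the last token (-1).
    """
    index = position if position >= 0 else token_count + position
    return index if 0 <= index < token_count else None


def matches_at(tokens, position, term):
    """True if the token at the resolved position equals term exactly."""
    index = resolve_position(position, len(tokens))
    return index is not None and tokens[index] == term


tokens = ["toto", "likes", "toto"]
print(matches_at(tokens, 0, "toto"))   # first token
print(matches_at(tokens, -1, "toto"))  # last token
print(matches_at(tokens, 1, "toto"))   # second token is "likes"

# Nesting position 0 and position -1 on the same match only succeeds
# when both resolve to the same token, i.e. a one-token field.
print(resolve_position(0, 1) == resolve_position(-1, 1))
print(resolve_position(0, 3) == resolve_position(-1, 3))
```

The last two lines show why the nested form selects documents made of exactly one word: positions 0 and -1 coincide only when the field holds a single token.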
#filter{internal_options}("virtual_expr", query)
Returns the results of query only if "virtual_expr" is true.
-
Binary Operators
Syntax
Use
Examples
#butnot{internal_options}(search_query, avoid_query)
Returns all the documents that match search_query at least once at positions different from those matched by avoid_query.
New BUTNOT "New York" is:
#butnot(#alphanum{k=2}(text, "New"), #seq(#alphanum{k=2}(text, "New") #alphanum{k=2}(text, "York")))
#split{internal_options}(search_query, separator_query)
Applies search_query, ensuring that all results are contained in the same separator_query unit (which may be a page, a sentence, or a paragraph).
Searching for A & B in the same page may be written as:
#split(#and(#alphanum{k=2}(text, "A") #alphanum{k=2}(text, "B")), #page(text))
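The idea behind #split can be sketched in Python under simplifying assumptions: separators are modeled as a sorted list of start positions (segment_starts), and each term as a list of token positions. Both representations and the function names are hypothetical and are not taken from the product.

```python
from bisect import bisect_right


def segment_of(position, segment_starts):
    """Index of the separator segment (page, sentence...) holding position."""
    return bisect_right(segment_starts, position) - 1


def shared_segments(positions_a, positions_b, segment_starts):
    """Segments in which both A and B occur at least once."""
    seg_a = {segment_of(p, segment_starts) for p in positions_a}
    seg_b = {segment_of(p, segment_starts) for p in positions_b}
    return sorted(seg_a & seg_b)


# Three pages starting at token positions 0, 100 and 200.
pages = [0, 100, 200]
# A occurs on pages 0 and 1, B on pages 1 and 2.
print(shared_segments([5, 150], [120, 250], pages))
```

In this example only page 1 contains both terms, so only that page would satisfy an "A and B in the same page" query.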
#innerjoin{internal_options}(join_id, lead_query, filter_query)
Inner join has a few specific node options (internal_options) in addition to common options:
joinPolicy
BITSET_JOIN - Join with a compact bitset. Be careful, its size is limited to 1GB. If the limit is reached, users do not get any results. The error is, however, reported in the Indexing Server log as: "Key for innerjoin is too large for a bitset policy, unable to execute the query.". To solve this issue, consider changing to SPARSESET_JOIN.
SPARSESET_JOIN - Default value. Join with a sparse bitset.
MERGE_JOIN - Compute all matching dids at initialization. Ranking keys are not propagated in this mode.
<rankingKeyName>.join={JOIN_LEFT,JOIN_RIGHT,JOIN_LEFT_RIGHT} - specifies how the ranking key value is merged.
JOIN_LEFT - Default value. The ranking key is propagated from (and only from) the left member of the join node.
JOIN_RIGHT - For a given join id, the ranking key is merged from all documents that match the right member, according to the innerjoin ranking key merge policy.
JOIN_LEFT_RIGHT - For a given join id, the ranking key is merged from the left member and all documents that match the right member, according to the innerjoin ranking key merge policy.
Return the emails with foo in the subject and a PDF attachment:
#innerjoin(mail, #alphanum{k=2}(subject, "foo"), #category(attachement_file_type, 3))
Nary Operators
Syntax
Use
Examples
#seq{internal_options}(query1 query2 ...)
Searches for a sequence. Each query must match at a position following that of the previous query.
"New York" is:
#seq(#alphanum{k=2}(text, "New") #alphanum{k=2}(text, "York"))
#and{internal_options}(query1 query2 ...)
Searches for documents matching all the queries
New York is:
#and(#alphanum{k=2}(text, "New") #alphanum{k=2}(text, "York"))
New and York can be at any position in the document.
#or{internal_options}(query1 query2 ...)
Searches for documents matching at least one query
banana OR apple is:
#or(#alphanum{k=2}(text, "banana") #alphanum{k=2}(text, "apple"))
#bor{internal_options}(query1 query2 ...)
Searches for documents matching at least one query.
To be used only for a fast OR on many documents (no expansion, no ranking)
banana BOR apple is:
#bor(#alphanum{k=2}(text, "banana") #alphanum{k=2}(text, "apple"))
#fuzzyand{fuzzyand_option, internal_options}(query1 query2 ...)
Searches for documents matching at least X queries, where X is determined according to the fuzzyand_option.
The score is adjusted according to the number of matching queries.
fuzzyand_option can be:
either maxFailure=X, which means that up to X queries can fail,
or minSuccess=X, which means that at least X queries are expected to succeed.
-
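The two forms of fuzzyand_option can be related with a short Python sketch. It only models the match threshold, not the score adjustment, and it assumes that exactly one of the two options is given. The names required_successes and fuzzy_and are hypothetical.

```python
def required_successes(query_count, max_failure=None, min_success=None):
    """Number of child queries that must match for the node to match."""
    if (max_failure is None) == (min_success is None):
        raise ValueError("specify exactly one of max_failure or min_success")
    if min_success is not None:
        return min_success
    # Up to max_failure queries may fail, so the rest must succeed.
    return max(query_count - max_failure, 0)


def fuzzy_and(results, max_failure=None, min_success=None):
    """results: one boolean per child query (True if it matched)."""
    needed = required_successes(len(results), max_failure, min_success)
    return sum(bool(r) for r in results) >= needed


# Three child queries, one of which fails.
print(fuzzy_and([True, False, True], max_failure=1))
# Two failures exceed maxFailure=1.
print(fuzzy_and([True, False, False], max_failure=1))
# minSuccess=2 is met by two matching queries.
print(fuzzy_and([True, True, False], min_success=2))
```

With N child queries, maxFailure=X and minSuccess=N-X describe the same threshold in this model.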
#consecutive{internal_options}(query1 query2 ...)
Executes the queries in order (query1, then query2, etc.) and acts as an OR.
However, when a timeout occurs, the first queries are more likely to have been fully completed than the last queries.
-
Proximity Operators
Syntax
Use
#prox{internal_options}(query1 minDistance12 maxDistance12 query2 minDistance23 maxDistance23 ... maxDistance(N-1)N queryN)
Each position of query i must be between minDistance(i-1)i and maxDistance(i-1)i positions from query (i-1).
minDistance and maxDistance are signed, which enables several matching strategies for #prox(A minDistance maxDistance B):
minDistance>0, maxDistance>0 - B is between minDistance and maxDistance positions after A.
minDistance=maxDistance=distance>0 - B is exactly at distance positions after A. This can also be written in a more concise way as #prox(A distance B).
minDistance<0, maxDistance<0 - B is between minDistance and maxDistance positions before A.
minDistance=maxDistance=distance<0 - B is exactly at distance positions before A. This can also be written in a more concise way as #prox(A distance B).
minDistance<0, maxDistance>0 - B is near A.
If all min (resp. max) distances are the same, the operator can be written in a more concise way, by specifying the distance before all query nodes:
#prox{internal_options}(minDistance maxDistance, query1 query2 ...)
These operators can use an optional sameposok option to indicate that a distance of 0 between children counts as a match.
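The signed-distance rules for a two-query #prox can be sketched in Python. This is an illustrative model, not the engine's algorithm: positions are plain integers, the offset is taken as (position of B) minus (position of A), and the two bounds are sorted so that the lower one always acts as the minimum. That ordering and the function name prox_matches are assumptions.

```python
def prox_matches(positions_a, positions_b, min_distance, max_distance,
                 sameposok=False):
    """Return (a, b) pairs whose signed offset b - a lies in the window.

    A positive offset means B is after A, a negative one means B is
    before A. An offset of 0 only matches when sameposok is True.
    """
    low, high = sorted((min_distance, max_distance))
    pairs = []
    for a in positions_a:
        for b in positions_b:
            offset = b - a
            if offset == 0 and not sameposok:
                continue
            if low <= offset <= high:
                pairs.append((a, b))
    return pairs


a_positions = [3]
b_positions = [1, 4, 5, 9]
# B between 1 and 2 positions after A.
print(prox_matches(a_positions, b_positions, 1, 2))
# B near A: up to 2 positions before or after.
print(prox_matches(a_positions, b_positions, -2, 2))
# Same position is rejected unless sameposok is set.
print(prox_matches([3], [3], -1, 1))
print(prox_matches([3], [3], -1, 1, sameposok=True))
```

The concise form with a single distance corresponds to calling the function with min_distance equal to max_distance.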