User Query Language (UQL)
 
The Different Types of Search in UQL
Reserved Characters in UQL
Operands
Operators by Priority
More About INNERJOIN
User Query Language (UQL) is the language designed for queries entered by end users.
It allows you to make simple or rich queries using various query operators, such as Boolean operators (AND, OR, NOT), word sequence operators (NEAR, NEXT, BEFORE, AFTER), score operators (MAX, MIN), etc., and also prefix handlers to focus on specific metas.
The Different Types of Search in UQL
This section describes the different types of search that you can make through UQL.
Important: A query that consists of a single word in capital letters fails if that word is also a UQL operator or operand. For example, searching for AND returns an error like [code=360142] Error while processing CloudView SearchAPI request... because AND is a UQL operator. Searching for and in lowercase works.
List of UQL operators that you cannot search alone in capital letters: AFTER, AND, BEFORE, BOR, BUTNOT, FUZZYAND, NEAR, NEXT, NOT, OPT, OR, SPLIT, TO, XOR.
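A front end could guard against this error before sending the query. The following is a minimal sketch, assuming a hypothetical helper named safe_single_word_query that is not part of CloudView; it only lowercases a query made of exactly one of the operator words listed above.

```python
# Operator words that, per the list above, cannot be searched alone in capitals.
UQL_OPERATOR_WORDS = frozenset({
    "AFTER", "AND", "BEFORE", "BOR", "BUTNOT", "FUZZYAND", "NEAR",
    "NEXT", "NOT", "OPT", "OR", "SPLIT", "TO", "XOR",
})


def safe_single_word_query(user_input):
    """Lowercase the query if it is exactly one uppercase operator word.

    Hypothetical helper: any other query is returned unchanged.
    """
    stripped = user_input.strip()
    if stripped in UQL_OPERATOR_WORDS:
        return stripped.lower()
    return user_input
```

Multi-word queries such as cats AND dogs pass through untouched, so operators keep their meaning there.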
Search by Exact Phrase
Operator
"" (quotation marks)
Purpose
You can get more results than expected if you enter a search phrase (that is, two or more search terms meant to appear together) without enclosing it in quotation marks.
To search for documents on 2018 sales, people typically enter: 2018 sales
In this case, the search results include any document that contains both 2018 and sales, but not necessarily next to each other.
Example
To search for documents containing the exact phrase 2018 sales, use quotation marks: "2018 sales"
Search by Exact Words
Operator
+
Purpose
You can override the matching behavior using the + (plus) operator to search for exact words only. It is typically useful to search for:
link words (the, a, of, or, and) that are ignored by default,
the plural of a word.
This operator is useful for building very specific queries.
You can also prepend words by + in your query to search for the exact forms of these words only. For example, with the query foo +bar, foo has the standard semantic expansion (like lemmatization if activated) but not bar, which returns the exact form only (that is, bar).
Search with Logical Expressions
Operators
OR, AND, NOT, XOR, BOR
Purpose
Searches for documents containing:
OR: either one search term OR another
AND: one search term AND another search term
NOT: one search term BUT NOT another search term
XOR: either one search term OR another BUT NOT both
BOR: either one search term OR another. Only use it for a fast OR on many documents (no expansion, no ranking).
Example
Use OR to specify a list of similar terms that may occur in the document you are looking for. (movie star) OR (celebrities) searches for documents containing either movie star or celebrities.
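The matching rules of the logical operators can be sketched on a set of document terms. This is only an illustration, assuming a hypothetical matches function; it ignores ranking and query expansion, which is why OR and BOR behave identically here.

```python
def matches(doc_terms, a, op, b):
    """Return True if a document (given as a set of terms) matches 'a op b'."""
    has_a, has_b = a in doc_terms, b in doc_terms
    if op in ("OR", "BOR"):
        return has_a or has_b
    if op == "AND":
        return has_a and has_b
    if op == "NOT":
        # 'a NOT b' is read as 'a AND NOT b'.
        return has_a and not has_b
    if op == "XOR":
        return has_a != has_b
    raise ValueError("unsupported operator: " + op)


doc = {"movie", "star", "hollywood"}
```

For example, matches(doc, "movie", "XOR", "star") is false because the document contains both terms.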
Search with Excluded Words
Operators
NOT, -XX, BUTNOT
Purpose
Excludes documents containing a specific word or phrase from the search with a - (minus sign) or a NOT operator before the word to exclude.
Example
new -york and new NOT york both search for documents containing new but not york.
Note: NOT and - are unary operators and depend on the implicit default operator AND. The expressions new -york and new NOT york are therefore both interpreted as new AND NOT york.
You can also use the BUTNOT operator:
"Martin Luther" BUTNOT "Martin Luther King" matches if there is at least one instance of Martin Luther that is not followed by King.
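The BUTNOT behavior can be approximated on a whitespace-tokenized text. The sketch below is an assumption about the semantics, not the engine's implementation: an instance of the first phrase is discarded when its word span lies inside an instance of the second phrase. The function names are hypothetical.

```python
def phrase_starts(tokens, phrase):
    """Start indexes at which the word list 'phrase' occurs in 'tokens'."""
    n = len(phrase)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase]


def butnot(text, e1, e2):
    """True if at least one instance of e1 is not covered by an instance of e2."""
    tokens = text.lower().split()
    w1, w2 = e1.lower().split(), e2.lower().split()
    starts2 = phrase_starts(tokens, w2)
    for i in phrase_starts(tokens, w1):
        covered = any(j <= i and i + len(w1) <= j + len(w2) for j in starts2)
        if not covered:
            return True
    return False
```

With this model, a text that mentions only Martin Luther King does not match, while a text that also mentions Martin Luther on his own does.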
Search with Prefix Handlers
Operators
Use prefix handlers: see the list of prefix handlers defined in Search logics > Query Language.
Purpose
Refine your queries by targeting specific index fields with default prefix handlers like text:, title:, etc.
You can also
specify aliases for these prefix handlers. For a list of aliases, see that prefix handler’s Alias field, in Search logics > Query Language.
search by category values
search numerical fields by a range of values
define custom prefix handlers to go further than the index field level, and trigger very specific search.
For more details, see Using Prefix Handlers.
Example
Search with a default prefix handler: title:foo searches for foo in document titles.
Search with an alias: for the prefix handler document_file_size, you have the following aliases by default: file_size, imap_mail_file_size, nntp_post_file_size, ldap_record_file_size
Search by category values: categories:fileattributes/extension/PDF
Search a numerical range of values: NumericalPrefixHandler:[100 TO 200]
Custom prefix handler: for a similarity search, we could enter a query like: similar: (ID1, ID2, ID3) where ID1, ID2, ID3 are the IDs of related terms, to search for all the documents having a part or all of these related terms.
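An application that generates such queries could assemble them with small string helpers. This is a minimal sketch; the function names are hypothetical, and the prefix handler names are the ones used in the examples above.

```python
def field_query(prefix, value):
    """Build a 'prefix:value' query, for example title:foo."""
    return "{}:{}".format(prefix, value)


def category_query(prefix, *path):
    """Build a category query from path segments."""
    return "{}:{}".format(prefix, "/".join(path))


def range_query(prefix, low, high):
    """Build an inclusive range query 'prefix:[low TO high]'."""
    return "{}:[{} TO {}]".format(prefix, low, high)
```

For instance, range_query("NumericalPrefixHandler", 100, 200) produces the range example shown above.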
Phonetic Search
Operators
Prefix handler soundslike:
You must create this prefix handler beforehand. For more details, see Using Prefix Handlers.
Purpose
Finds documents using the phonetic spelling of search terms.
Important: The language used for the query is important and must match the language specified in your Mashup UI configuration. If none is specified, Exalead CloudView uses the web browser’s preferred language.
Example
To find a coworker with a name that sounds like Brona, enter: soundslike:brona to return results such as Bronagh and Branagh.
Search with Approximate Spelling
Operators
Prefix handler spellslike:
Purpose
Finds documents that do not exactly match the search terms. This is useful if you are unsure of the correct spelling, or if a search term has several accepted spellings.
Example
Searching for spellslike:organise also returns documents containing organize.
Search by Date
Operators
Prefix handlers date:, document_lastmodifieddate:, document_before:, document_after:
Purpose
Retrieves documents based on a given date, or date range.
By default, the input format is detected automatically. If you need to define a custom format, update the Input format field for your prefix handler in Search Logics > Query Language.
What you must know:
We support the date formats: RFC 822, RFC 850, asctime, ISO 8601, and date format YYYY/MM/DD-HH:MM:SS (DD/MM/YYYY is NOT supported)
Operators are =, ==, <=, <, >=, >, != and :
The default timezone is GMT.
Quotes are required in search queries when the date contains at least one blank space. For example, myDatePrefixHandler="12/15/2018 15:23:22 GMT+02"
Supported formats
RFC 822:
Sun, 06 Nov 1994 08:49:37 GMT
Fri, 21 Nov 2014 16:59:27 MET DST
Fri, 21 Nov 2014 17:59:14 EET
Fri, 21 Nov 2014 15:59:16 +0000 (UTC)
Fri, 21 Nov 2014 16:59:42 MET
Fri, 21 Nov 2014 15:58:04 +0000 (UTC)
Fri, 21 Nov 2014 07:58:28 -0800
RFC 850:
Sunday, 06-Nov-94 08:49:37 GMT
asctime:
Sun Nov 6 08:49:37 1994
Fri Nov 21 11:18:47 CET 2014
American date format:
12/23/2014 15:23:22
12/23/2014 15:23:22 GMT+02
09/23/2014 08:52:59 [+00:00]
YYYY/MM/DD date format:
2014/12/23 15:23:22
2014/01/23-22:11:37
2014/12/23
2014/12
ISO 8601 samples (ISO works with / or - separators):
2014-03-12 15:23:22
2014-03-12
2014-03
2014
2014-12-06T15:31Z
2014-12-06T15:31:17+00:00
Week numbers like 2016-W18-1T09:49:38Z are NOT supported
Example
Let’s say that we give the modified alias to the document_lastmodifieddate prefix handler. We could have:
modified="11/23/2018 10:18:02 GMT+01" for a fully explicit date query
modified="2018/11/23 10:18:02+00:00" for a fully explicit date query
modified="2018/11/23 10:18:02" for a date query with the default GMT time zone interpreted implicitly.
modified="11/23/2018 10:18" for a query with an implicit range of 1 minute.
modified=2018/11/23 for a query with an implicit range of one day.
modified=2018/11 for a query with an implicit range of 1 month.
modified=2018 for a query with an implicit range of 1 year.
modified<"11/23/2018 10:18:02 GMT+01" for all documents before the explicit date.
modified<"2018/11/23T10:18:02+01:00" for all documents before the explicit date.
modified<=11/23/2018 for all documents until the end of the 11/23/2018 day.
modified<=2018 for all documents until the end of the 12/31/2018 day.
modified:[2014/12/23 TO "2018/01/21-22:11:37 GMT+01"] to search documents in a specific date range. This range notation is inclusive, and works with numerical values too.
We can also restrict a search query according to a document’s last modification or creation date:
"movie star" AND date >= 2018/05/21 finds documents containing movie star modified on or after May 21, 2018.
and "movie star" AND date <= 2018/05/21 finds documents containing movie star modified on or before May 21, 2018.
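The implicit ranges of partial dates (one year, one month, one day) can be worked out as follows. This is a sketch under stated assumptions: the hypothetical implicit_range function accepts only the YYYY, YYYY/MM, and YYYY/MM/DD forms, uses the default GMT time zone, and returns a half-open interval whose end is the first instant after the range.

```python
from datetime import datetime, timedelta, timezone


def implicit_range(value):
    """Return (start, end) for a partial date; 'end' is exclusive."""
    parts = [int(p) for p in value.split("/")]
    if len(parts) == 1:
        (year,) = parts
        start = datetime(year, 1, 1, tzinfo=timezone.utc)
        end = datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    elif len(parts) == 2:
        year, month = parts
        start = datetime(year, month, 1, tzinfo=timezone.utc)
        # December rolls over to January of the next year.
        end = datetime(year + (month == 12), month % 12 + 1, 1, tzinfo=timezone.utc)
    elif len(parts) == 3:
        start = datetime(*parts, tzinfo=timezone.utc)
        end = start + timedelta(days=1)
    else:
        raise ValueError("expected YYYY, YYYY/MM or YYYY/MM/DD")
    return start, end
```

Under this model, modified=2018/11 covers every instant from the start of November 1, 2018 up to, but not including, the start of December 1, 2018.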
Search by Size
Operator
Prefix handler file_size:
Purpose
Searches based on file size in bytes.
Example
file_size:1024 returns documents with a file size of 1 KB.
file_size>=1024 returns documents with a file size of 1 KB or larger.
Search by Language
Operator
Prefix handler language:XX
Purpose
Limits your search to the documents of a specific language using the language:XX prefix handler (where XX can be EN, FR, DE, etc.).
This is useful when you need to search using a term that you can find in many languages, but has different meanings from one language to another.
Example
"Tour de France" language:en searches for English-language documents about the Tour de France.
Search in URL
Operator
Prefix handler inurl:
Purpose
Includes all web pages with URLs containing the search keywords. Unlike site:, this is a full text search of the URL text.
Example
inurl:example returns:
http://www.example.com/
http://www.exalead.com/blog/another_cloudview_example/
Search for URL
Operator
Prefix handler url:
Purpose
Searches for pages with the same normalized URLs.
You do not need to include the leading http://, https://, www., and trailing slashes in the query.
Example
url:example returns:
http://www.example.com/
http://www.exalead.com/blog/another_cloudview_example/
Search Site Content
Operator
Prefix handler site:
Purpose
Returns all documents on a site. Only expect results for documents with a publicurl meta, such as those pushed by the Crawler and the Feed Fetcher connectors.
The leading "http://" or "https://" and "www.", and trailing slashes are optional in the query.
Example
site:example.com always returns the same documents as site:http://www.example.com/
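The equivalence between the two forms can be pictured as a normalization step that drops the optional parts. The sketch below is an assumption about that step, not the actual CloudView normalizer, and normalize_site is a hypothetical name.

```python
def normalize_site(value):
    """Strip an optional scheme, a leading 'www.', and trailing slashes."""
    v = value.strip()
    for scheme in ("http://", "https://"):
        if v.startswith(scheme):
            v = v[len(scheme):]
            break
    if v.startswith("www."):
        v = v[len("www."):]
    return v.rstrip("/")
```

Both example.com and http://www.example.com/ reduce to the same string, which is why the two queries return the same documents.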
Search with Optional Terms
Operator
OPT
Purpose
Specifies an optional word to include in the search. Use it to specify several terms without limiting the scope of the search.
Example
cow OPT mad searches for documents containing cow that preferably also include mad.
Search by Word Proximity
Operators
NEAR, NEXT, AFTER, BEFORE
Purpose
Find documents where search terms are in proximity of one another. By default the maximum distance between terms is 16 words.
Edit this value using the Search > Search Logics > Query Language > Default distance for proximity operators property.
Example
"movie star" AFTER hollywood searches for documents where movie star appears soon after hollywood.
Note: "movie star" is equivalent to movie NEXT star, the NEXT operator having a distance of 1 with the following word.
You can also specify the maximum distance of the words by using NEAR/x, AFTER/x, and BEFORE/x. For example:
"movie star" NEAR/5 hollywood searches for documents where movie star appears within 5 words of hollywood,
and "movie star" BEFORE/5 hollywood searches for documents where movie star appears within 5 words before hollywood.
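The proximity operators can be modeled on word positions. The following is a simplified sketch for single words only, assuming whitespace tokenization and a hypothetical proximity function; distance is counted as the difference between word positions, so NEXT corresponds to a difference of 1.

```python
def proximity(text, a, op, b, distance=16):
    """Evaluate 'a op/distance b' for single words on a tokenized text."""
    if op not in {"NEXT", "NEAR", "BEFORE", "AFTER"}:
        raise ValueError("unsupported operator: " + op)
    tokens = text.lower().split()
    pos_a = [i for i, t in enumerate(tokens) if t == a.lower()]
    pos_b = [i for i, t in enumerate(tokens) if t == b.lower()]
    for pa in pos_a:
        for pb in pos_b:
            gap = pb - pa  # positive when a comes before b
            if op == "NEXT" and gap == 1:
                return True
            if op == "BEFORE" and 0 < gap <= distance:
                return True
            if op == "AFTER" and 0 < -gap <= distance:
                return True
            if op == "NEAR" and 0 < abs(gap) <= distance:
                return True
    return False


text = "the user opens the interface"
```

Here user BEFORE/4 interface matches because the two words are 3 positions apart, while user BEFORE/2 interface does not.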
Important:  
You cannot use proximity operators with expressions whose "position" cannot be computed. For example, the query music NEAR (Madonna AND mp3) does not work, because the expression Madonna AND mp3 cannot be associated with a single word position value.
Some queries using proximity operators may fail with a No occurrence for query message when you want to open the preview of Office documents. This issue is linked to a format conversion limitation.
Prefix Search
Operator
*
Purpose
Searches using the beginning of a word to find a proper noun using its short form, or its linguistic root.
Example
Jenn* searches for documents containing words starting with Jenn, such as Jennifer, Jennie, Jenni, and Jenna.
Pattern Search
Operator
Regular expression patterns based on Perl 5.
You must open and close patterns with a / (slash) character.
Purpose
Searches for documents containing words that match a regular expression pattern.
Example
/s.ren..pi.y/ searches for documents with words that match the pattern s.ren..pi.y, where each . stands for any single character, and would find documents containing the word serendipity.
/mpg(1|2|3)?/ searches for documents containing any of the following: mpg, mpg1, mpg2, or mpg3.
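The two patterns above can be tried out with any Perl-compatible regular expression engine. The sketch below uses Python's re module as a stand-in and assumes that a pattern must match a whole word, case-insensitively; both are assumptions about the engine's behavior, and pattern_matches is a hypothetical helper.

```python
import re


def pattern_matches(pattern, text):
    """Return the words of 'text' that the pattern matches entirely."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [word for word in text.split() if regex.fullmatch(word)]
```

For example, the pattern mpg(1|2|3)? accepts mpg and mpg2 but rejects mpg4.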
Geographic Search
Operator
Prefix handler geo:
Purpose
Example
To search within a radius or polygon using UQL, see Search with a Radius or Polygon (UQL).
Search with INNERJOIN
Operator
INNERJOIN
Purpose
Combines records from two documents whenever there are matching values in a common field.
Search by Document Sections
Operator
SPLIT
Purpose
Searches for words in specific sections of a document.
Reserved Characters in UQL
This section gives a list of reserved characters in UQL and describes how to escape their interpretation.
List of Reserved Characters in UQL
If you need to use them as words in your query, you must enclose them in quotes.
Name
Character
Slash
/
Use it for:
passing options, for example, NEAR/12
pattern search as a regexp operator, for example, text:/bug.*/
Tab
\t
Line feed
\n
Carriage return
\r
Round brackets
( or )
Square brackets
[ or ]
Curly brackets
{ or }
Colon
:
Equal sign
=
Greater than or less than
< or >
Comma
,
Note: % is not a reserved character.
Escaping UQL Operators Interpretation
You can add a backslash (\) to disable the interpretation of UQL operators (parentheses, =, {, }, etc.).
Note: If any tokenization takes place afterward, the tokenizer still decides how to handle the character. For example, \= is interpreted as a simple = but the default tokenization considers this character as punctuation. It is therefore removed from the query like any other punctuation character.
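When user input must be passed to UQL as literal text, the escaping rule can be applied programmatically. This is a minimal sketch, assuming a hypothetical escape_uql helper that covers only the printable reserved characters listed above; it leaves tabs, line breaks, and existing backslashes untouched.

```python
# Printable reserved characters from the list above.
RESERVED_CHARACTERS = frozenset("/()[]{}:=<>,")


def escape_uql(text):
    """Prefix each reserved character with a backslash."""
    return "".join("\\" + ch if ch in RESERVED_CHARACTERS else ch for ch in text)
```

Because % is not reserved, a value such as 100% is returned unchanged.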
Operands
This section describes the operands that you can use in UQL queries.
Standard Operands
Operand
Description
Predicate value
(e1)
Parenthesized sub expression, used to modify priority.
For example:
((fast OR speed) AND NOT light)
e1
"e1"
Quoted expression, used to escape all special characters. Inside a double quoted group, words are handled in a tight (NEXT) sequence. All operators are ignored.
e1
"word1 word2"
Quoted expression.
word1 NEXT word2
"word1 OR word2"
Quoted expression
word1 NEXT "OR" NEXT word2
Regexp and Wildcard Operands
You can use wildcards and regular expressions in UQL queries. The following table illustrates some examples.
Kind
Syntax
Example
Regular expressions
[field:]/pattern/{options}
title:/desi.*/{w=10000,#=100}
Wildcard
[field:]word*{options} [field:]*word{options}
desi* title:*faces
Score Modifier Operands
Operand
Description
Predicate value
s=N
Replace the predicate's score by an explicit value.
price<500{s=1000}
s+=N
Increase the predicate's score by a given value.
GUI{s+=100000}
s-=N
Decrease the predicate's score by a given value.
Note: The score of a given predicate can be negative, but the final score of the document can never be lower than 0.
corporate/tree:"Top/Attributes/XXX" {s-=100000}
w=N
Replace the predicate's weight by an explicit value.
design{w=10000}
w*=N
Multiply the predicate's weight by a given value.
design{w*=2}
Note:  
The score modifiers are applied in the same order as they appear. In the case of two conflicting modifiers, for example, {s=1000,s=2000}, the last one is applied.
In the case when an explicit score (s=) is specified, the predicate's weight is ignored so the 'w' options have no effect.
The explicit score (s=) can only be used for nontextual predicates (numeric values and categories) since this modifier completely ignores the ranking score class set when the document was indexed.
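The ordering rule for score modifiers can be illustrated with a small interpreter. This sketch is an assumption for illustration only: apply_score_modifiers is a hypothetical function that handles the s=, s+=, and s-= options on an integer score and ignores the weight options.

```python
def apply_score_modifiers(base_score, modifiers):
    """Apply s=, s+= and s-= modifiers in the order in which they appear."""
    score = base_score
    for item in modifiers.strip("{}").split(","):
        item = item.strip()
        if item.startswith("s+="):
            score += int(item[3:])
        elif item.startswith("s-="):
            score -= int(item[3:])
        elif item.startswith("s="):
            score = int(item[2:])
        # Other options (for example w= or w*=) are ignored in this sketch.
    return score
```

With {s=1000,s=2000} the last modifier wins and the result is 2000; a predicate score can also become negative, as with {s+=5,s-=20} applied to 10.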
Word Matching Options
Word matching options are specified inside {} and directly appended to search words. They must be comma-separated.
Option
Description
Example
k=number
Set explicit matching level.
k=1 is the lowercase matching mode
k=2 is the normalized matching mode.
hl=0
Deactivates search result highlighting and summary for a specific node of the query only.
word1 word2{hl=0} word3
Operators by Priority
The query expansion modules rewrite the query based on the operators. To correctly expand the query, each operator has a priority.
Operators by Processing Priority, where e1 and e2 are Expressions
Priority
Operator
Explanation
Example
1
prefix handlers
Prefix handlers are always processed first.
Obama before:2009/01/01 searches for all documents relating to Obama before January 2009.
2
FUZZYAND/option (expression)
Search for documents that match at least N queries, where N is determined by the fuzzyand option.
This option can be either:
minimum success: at least N queries must match. N is a positive integer.
or maximum failures: up to N queries can fail. N is a negative integer.
For a document that contains:
The quick brown fox.
If min. success=2: FUZZYAND/2 (the quick brown foxes) matches, but FUZZYAND/2 (a brown foxx) does not.
If max. failure=-1: FUZZYAND/-1 (a quick brown fox) matches, but FUZZYAND/-1 (a quick foxx) does not.
3
OPT e1
Optional operator
OPT graphical
4
NOT e1
Negation operator
NOT myword
5
e1 NEXT e2
Explicit sequence operator for adjacency match.
user NEXT interface
6
e1 AFTER e2
Proximity (mono-directional) match
interface AFTER user
6
e1 AFTER/distance e2
AFTER with explicit word distance
interface AFTER/4 user
6
e1 BEFORE e2
Proximity with mono-directional match
user BEFORE interface
6
e1 BEFORE/distance e2
BEFORE with explicit word distance
interface BEFORE/4 user
7
e1 NEAR e2
Proximity operator with bidirectional match
user NEAR interface
7
e1 NEAR/distance e2
NEAR with explicit word distance
user NEAR/4 interface
8
e1 SPLIT e2
A document is returned if e1 appears in at least one of the document sections delimited by the e2 delimiter.
If the e2 delimiter is not present in a document, then the document is returned if e1 is valid at the document level.
user interface SPLIT Chapter
Matches if user interface appears between two "Chapters"
9
e1 e2
Implicit match operator on a sequence of words. It uses the implicit operator, which is AND by default.
search engine
10
(e1) INNERJOIN/key (e2)
Search for documents matching e1 where e2 appears in child documents. The relation between documents is contained in a key index field, which must be an unsigned integer.
For performance reasons, it is best to enable the Stored in Memory option.
subject:exalead INNERJOIN/msgId fulltext:france
We first select documents whose subject is exalead, then the join is made with documents containing the word france.
11
e1 BUTNOT e2
The search matches if there is at least one instance of e1 that is not also part of an instance of e2 at the same position.
York BUTNOT "New York"
12
e1 AND e2
Explicit conjunction match.
user AND interface
13
e1 XOR e2
Exclusive OR operation. It can be either e1 OR e2, but not e1 AND e2
design XOR conception
14
e1 OR e2
Disjunction match
design OR conception
15
e1 BOR e2
Disjunction match
To use only for a fast OR on many documents
design BOR conception
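The FUZZYAND option in the table above (minimum successes for a positive N, maximum failures for a negative N) can be checked with a short sketch. It assumes exact term matching on a set of document terms, with no lemmatization, and fuzzyand is a hypothetical function name.

```python
def fuzzyand(option, query_terms, doc_terms):
    """Evaluate FUZZYAND/option over query terms against document terms."""
    hits = sum(1 for term in query_terms if term in doc_terms)
    if option > 0:
        # Minimum success: at least 'option' terms must match.
        return hits >= option
    if option < 0:
        # Maximum failures: at most '-option' terms may fail.
        return len(query_terms) - hits <= -option
    raise ValueError("option must be a nonzero integer")


doc_terms = {"the", "quick", "brown", "fox"}
```

This reproduces the four cases given for the document The quick brown fox.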
More About INNERJOIN
You can use the INNERJOIN operator to join "left" and "right" documents based on a common numerical field. INNERJOIN only returns matching documents from the "left side" of the query.
For example, if you have order and customer documents that both contain a customer_id join key field, you can perform queries such as:
order_price>42 INNERJOIN/customer_id (customer_name:john AND customer_firstname:doe)
This returns all orders (the left-side documents) for John Doe that had an order price greater than 42.
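The left-side-only result of this join can be modeled in memory. The sketch below uses made-up sample documents and a hypothetical innerjoin function; it illustrates the semantics and says nothing about how the index evaluates the query.

```python
orders = [
    {"id": "o1", "customer_id": 7, "order_price": 50},
    {"id": "o2", "customer_id": 7, "order_price": 10},
    {"id": "o3", "customer_id": 8, "order_price": 99},
]
customers = [
    {"customer_id": 7, "customer_name": "john", "customer_firstname": "doe"},
    {"customer_id": 8, "customer_name": "jane", "customer_firstname": "roe"},
]


def innerjoin(left_docs, right_docs, key, left_pred, right_pred):
    """Return left documents whose key matches a selected right document."""
    right_keys = {doc[key] for doc in right_docs if right_pred(doc)}
    return [doc for doc in left_docs if left_pred(doc) and doc[key] in right_keys]


result = innerjoin(
    orders,
    customers,
    "customer_id",
    lambda o: o["order_price"] > 42,
    lambda c: c["customer_name"] == "john" and c["customer_firstname"] == "doe",
)
```

Only order o1 is returned: o2 fails the price condition and o3 belongs to another customer. No customer document appears in the result.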
Important: INNERJOIN queries have a significant impact on search performance.
When and How to Use INNERJOIN
Only use INNERJOIN for complex multicriteria search on the right-side document. In the above example, this is the customer.
Ideally, for better search performance, aim instead to inline the right-side document information in the left-side document. However, INNERJOIN is mandatory for certain multicriteria queries.
Requirements
The join field must be an integer, date, or value field, and it must be retrievable and RAM-based.
For a value field, the innerjoin does not support two different join key fields for left and right side. For details about value fields, see Create Value Facets for Nonhierarchical Metas.
Perform an INNERJOIN Query
When the join field name is the same for both left and right documents
UQL: (left-side query) INNERJOIN/field (right-side query)
ELLQL: #innerjoin(field, left-side query, right-side query)
When the join field name is different for the left and right documents (not applicable to value fields)
UQL: (left-side query) INNERJOIN/field1=field2 (right-side query)
ELLQL: #innerjoin(field1,field2, left-side query, right-side query)
Important: When using a join field that was generated by a Data Model property, always specify the full field name, prefixed by the declaring class name, even in UQL.
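The two syntaxes can be generated from the same inputs. This is a minimal sketch with hypothetical helper names; the query strings passed in are assumed to be already valid in the target language (UQL or ELLQL).

```python
def innerjoin_uql(left_query, right_query, left_field, right_field=None):
    """Build a UQL INNERJOIN expression."""
    if right_field is None or right_field == left_field:
        key = left_field
    else:
        key = "{}={}".format(left_field, right_field)
    return "({}) INNERJOIN/{} ({})".format(left_query, key, right_query)


def innerjoin_ellql(left_query, right_query, left_field, right_field=None):
    """Build an ELLQL #innerjoin expression."""
    if right_field is None or right_field == left_field:
        fields = left_field
    else:
        fields = "{},{}".format(left_field, right_field)
    return "#innerjoin({}, {}, {})".format(fields, left_query, right_query)
```

Passing a single field name produces the same-name form, and passing two different names produces the field1=field2 form.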
Hide INNERJOIN from Users
To avoid exposing the INNERJOIN operator to the user, you can use query templates to combine ELLQL and UQL to perform the join from the user query. For details, see Defining Query Templates.
Handling Documents in Multiple Slices
The INNERJOIN operator is intra-slice only: you cannot join a left-side document (LD) from one slice to a right-side document (RD) in another slice.
There are two main ways of handling this, depending on the use case.
Use Case 1: LD and RD are shardable in the Same Way
For example, when indexing customers and orders, we can shard the index by customer ID, and make sure that all orders go to the same slice as the customer.
To do that, we must override the automatic slice balancing algorithm, using a PAPI directive called forcedSlice.
Note: A PAPI directive is a small value associated to a document, which is not part of the textual content, that gives processing instructions or hints to the processing chain.
Set a PAPI Directive
document.setCustomDirective(name, value);
document.setCustomDirective("forcedSlice", "2");
You then must correctly balance all slices.
Use Case 2: LD and RD are not shardable in the Same Way
In this case, the only solution is to fully duplicate the right-side documents (RD) in each slice. As the INNERJOIN query never returns the RD, these copies cannot produce duplicate results.
This works best when the set of RDs is relatively small, so duplicating them does not dramatically impact the slices.
You need to push each RD N times (where N=number of slices), each time with a different URI, for example /myoriginaluri/slice3.
For each copy, you need to set the forcedSlice directive as described in Use Case 1: LD and RD are shardable in the Same Way.
You do not need to do anything special for your LD, since the automatic slice allocator keeps dispatching them.
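Both use cases can be planned with a few lines of code before pushing documents. The sketch below only computes URIs and directive values; it makes no PAPI calls. It assumes that slices are numbered from 0 and that a modulo on the customer ID is an acceptable sharding rule, and both function names are hypothetical.

```python
def slice_for_customer(customer_id, slice_count):
    """Use case 1: send a customer and all of its orders to the same slice."""
    return str(customer_id % slice_count)


def replicate_right_document(uri, slice_count):
    """Use case 2: one copy of the right-side document per slice.

    Returns (uri, directives) pairs; each copy has its own URI and
    its own forcedSlice value.
    """
    return [
        ("{}/slice{}".format(uri, n), {"forcedSlice": str(n)})
        for n in range(slice_count)
    ]
```

Each returned pair would then be pushed as a separate document, with the forcedSlice value set through setCustomDirective.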