User Query Language (UQL)


User Query Language (UQL) is the query language intended for end-user search queries.

It allows you to make simple or rich queries using various query operators, such as Boolean operators (AND, OR, NOT), word sequence operators (NEAR, NEXT, BEFORE, AFTER), score operators (MAX, MIN), etc., and also prefix handlers to focus on specific metas.

The Different Types of Search in UQL

This section describes the different types of search that you can make through UQL.

Important: Errors occur when your query is a single word in capital letters that is also a UQL operator or operand. For example, if you search for AND, you get an error like [code=360142] Error while processing CloudView SearchAPI request... because AND is a UQL operator. The search works if you enter and in lowercase.

List of UQL operators that you cannot search alone in capital letters: AFTER, AND, BEFORE, BOR, BUTNOT, FUZZYAND, NEAR, NEXT, NOT, OPT, OR, SPLIT, TO, XOR.
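
A client application could guard against this error before sending the query. The following is a minimal Python sketch under that assumption: the function name guard_single_word_query is hypothetical and is not part of any CloudView API. It only lowercases a query that consists of exactly one of the operators listed above.

```python
# Operators that the list above says cannot be searched alone in capital letters.
UQL_RESERVED_WORDS = {
    "AFTER", "AND", "BEFORE", "BOR", "BUTNOT", "FUZZYAND", "NEAR",
    "NEXT", "NOT", "OPT", "OR", "SPLIT", "TO", "XOR",
}


def guard_single_word_query(query: str) -> str:
    """Lowercase the query if it is a single reserved word; otherwise return it unchanged."""
    stripped = query.strip()
    if stripped in UQL_RESERVED_WORDS:
        return stripped.lower()
    return query


print(guard_single_word_query("AND"))            # single operator: lowercased
print(guard_single_word_query("cats AND dogs"))  # real Boolean query: untouched
```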

Search by Exact Phrase

Operator	"" (quotation marks)
Purpose	You can get more results than expected if you enter a search phrase (that is, two or more search terms meant to appear together) without enclosing it in quotation marks. To search for documents on 2018 sales, people typically enter: 2018 sales. In this case, the search results include any document that contains both 2018 and sales, but not necessarily next to each other.
Example	To search for documents containing the exact phrase 2018 sales, use quotation marks: "2018 sales"

Search by Exact Words

Operator	+ (plus sign)
Purpose	You can override the matching behavior using the + (plus) operator to search for exact words only. It is typically useful to search for:

• link words (the, a, of, or, and) that are ignored by default,

• the plural of a word.

This operator is useful for building very specific queries.

You can also prefix individual words with + to search for their exact forms only. For example, in the query foo +bar, foo gets the standard semantic expansion (such as lemmatization, if activated), whereas bar matches its exact form only.

Search with Logical Expressions

Operators	OR, AND, NOT, XOR, BOR
Purpose	Searches for documents containing: • OR: either one search term OR another • AND: one search term AND another search term • NOT: one search term BUT NOT another search term • XOR: either one search term OR another BUT NOT both • BOR: either one search term OR another. Only use it for a fast OR on many documents (no expansion, no ranking).
Example	Use OR to specify a list of similar terms that may occur in the document you are looking for. (movie star) OR (celebrities) searches for documents containing either movie star or celebrities.

Search with Excluded Words

Operators	NOT, -XX, BUTNOT
Purpose	Excludes documents containing a specific word or phrase from the search with a - (minus sign) or a NOT operator before the word to exclude.
Example	new -york or new NOT york searches for documents containing new but not york. Note: NOT and - are unary operators and depend on the implicit default operator AND. The expressions new -york and new NOT york are therefore both interpreted as new AND NOT york. You can also use the BUTNOT operator: "Martin Luther" BUTNOT "Martin Luther King" matches if there is at least one instance of Martin Luther not followed by King.

Search with Prefix Handlers

Operators	Use prefix handlers: see the list of prefix handlers defined in Search logics > Query Language.
Purpose	Refine your queries by targeting specific index fields with default prefix handlers like text:, title:, etc. You can also: • specify aliases for these prefix handlers. For a list of aliases, see the prefix handler's Alias field in Search logics > Query Language. • search by category values. • search numerical fields by a range of values. • define custom prefix handlers to go further than the index field level and trigger very specific searches. For more details, see Using Prefix Handlers.
Example	• Search with a default prefix handler: title:foo searches for foo in document titles. • Search with an alias: for the prefix handler document_file_size, you have the following aliases by default: file_size, imap_mail_file_size, nntp_post_file_size, ldap_record_file_size. • Search by category values: categories:fileattributes/extension/PDF • Search a numerical range of values: NumericalPrefixHandler:[100 TO 200] • Custom prefix handler: for a similarity search, we could enter a query like similar: (ID1, ID2, ID3), where ID1, ID2, ID3 are the IDs of related terms, to search for all the documents having some or all of these related terms.

Phonetic Search

Operators	Prefix handler soundslike: You must create this prefix handler beforehand. For more details, see Using Prefix Handlers.
Purpose	Finds documents using the phonetic spelling of search terms. Important: The language used for the query is important and must match the language specified in your Mashup UI configuration. If none is specified, Exalead CloudView uses the web browser's preferred language.
Example	To find a coworker with a name that sounds like Brona, enter: soundslike:brona to return results such as Bronagh and Branagh.

Search with Approximate Spelling

Operators	Prefix handler spellslike:
Purpose	Finds documents that do not exactly match the search terms. This is useful if you are unsure of the correct spelling, or if there are several accepted spellings for a search term.
Example	Searching for spellslike:organise also returns documents containing organize.

Search by Date

Operators	Prefix handlers date:, document_lastmodifieddate:, document_before:, document_after:
Purpose	Retrieves documents based on a given date, or date range. By default, the input format is detected automatically. If you need to define a custom format, update the Input format field for your prefix handler in Search Logics > Query Language. What you must know: • We support the date formats: RFC 822, RFC 850, asctime, ISO 8601, and date format YYYY/MM/DD-HH:MM:SS (DD/MM/YYYY is NOT supported) • Operators are =, ==, <=, <, >=, >, != and : • The default timezone is GMT. • Quotes are required in search queries when there is at least a blank space in the date. For example, myDatePrefixHandler="12/15/2018 15:23:22 GMT+02"
Supported formats	RFC 822:
• Sun, 06 Nov 1994 08:49:37 GMT
• Fri, 21 Nov 2014 16:59:27 MET DST
• Fri, 21 Nov 2014 17:59:14 EET
• Fri, 21 Nov 2014 15:59:16 +0000 (UTC)
• Fri, 21 Nov 2014 16:59:42 MET
• Fri, 21 Nov 2014 15:58:04 +0000 (UTC)
• Fri, 21 Nov 2014 07:58:28 -0800
RFC 850:
• Sunday, 06-Nov-94 08:49:37 GMT
• Fri Nov 21 11:18:47 CET 2014
asctime:
• Sun Nov 6 08:49:37 1994
American date format:
• 12/23/2014 15:23:22
• 12/23/2014 15:23:22 GMT+02
• 09/23/2014 08:52:59 [+00:00]
• 2014/12/23 15:23:22
• 2014/01/23-22:11:37
• 2014/12/23
• 2014/12
ISO 8601 samples (ISO works with / or - separators):
• 2014-03-12 15:23:22
• 2014-03-12
• 2014-03
• 2014
• 2014-12-06T15:31Z
• 2014-12-06T15:31:17+00:00
• Week numbers like 2016-W18-1T09:49:38Z are NOT supported
Example	Let's say that we give the modified alias to the document_lastmodifieddate prefix handler. We could have:
• modified="11/23/2018 10:18:02 GMT+01" for a fully explicit date query.
• modified="2018/11/23 10:18:02+00:00" for a fully explicit date query.
• modified="2018/11/23 10:18:02" for a date query with the default GMT time zone interpreted implicitly.
• modified="11/23/2018 10:18" for a query with an implicit range of 1 minute.
• modified=2018/11/23 for a query with an implicit range of 1 day.
• modified=2018/11 for a query with an implicit range of 1 month.
• modified=2018 for a query with an implicit range of 1 year.
• modified<"11/23/2018 10:18:02 GMT+01" for all documents before the explicit date.
• modified<"2018/11/23T10:18:02+01:00" for all documents before the explicit date.
• modified<=11/23/2018 for all documents until the end of November 23, 2018.
• modified<=2018 for all documents until the end of December 31, 2018.
• modified:[2014/12/23 TO "2018/01/21-22:11:37 GMT+01"] to search documents in a specific date range. This range notation is inclusive, and works with numerical values too.
We can also restrict a search query according to a document's last modification or creation date:
• "movie star" AND date >= 2018/05/21 finds documents containing movie star modified on or after May 21, 2018.
• "movie star" AND date <= 2018/05/21 finds documents containing movie star modified on or before May 21, 2018.
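
When queries are generated by an application, the inclusive range notation can be assembled from date objects. The sketch below is an illustration only: date_range_query is a hypothetical helper, and modified is the example alias used above, not a default prefix handler. It emits dates in the YYYY/MM/DD form, which appears in the list of supported formats.

```python
from datetime import date


def date_range_query(prefix: str, start: date, end: date) -> str:
    """Build an inclusive UQL range such as modified:[2014/12/23 TO 2018/01/21]."""
    if start > end:
        raise ValueError("start must not be later than end")
    fmt = "%Y/%m/%d"  # YYYY/MM/DD contains no blank space, so no quotes are needed
    return f"{prefix}:[{start.strftime(fmt)} TO {end.strftime(fmt)}]"


print(date_range_query("modified", date(2014, 12, 23), date(2018, 1, 21)))
```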

Search by Size

Operator	Prefix handler file_size:
Purpose	Searches based on file size in bytes.
Example	• file_size:1024 returns documents with a file size of 1 KB. • file_size>=1024 returns documents with a file size of 1 KB or larger.

Search by Language

Operator	Prefix handler language:XX
Purpose	Limits your search to the documents of a specific language using the language:XX prefix handler (where XX can be EN, FR, DE, etc.). This is useful when you need to search using a term that you can find in many languages, but has different meanings from one language to another.
Example	"Tour de France" language:en searches for English-language documents about the Tour de France.

Search in URL

Operator	Prefix handler inurl:
Purpose	Includes all web pages with URLs containing the search keywords. Unlike site:, this is a full text search of the URL text.
Example	inurl:example returns: • http://www.example.com/ • http://www.exalead.com/blog/another_cloudview_example/

Search for URL

Operator	Prefix handler url:
Purpose	Searches for pages with the same normalized URLs. You do not need to include the leading http://, https://, www., and trailing slashes in the query.
Example	url:example returns: • http://www.example.com/ • http://www.exalead.com/blog/another_cloudview_example/

Search Site Content

Operator	Prefix handler site:
Purpose	Returns all documents on a site. Only expect results for documents with a publicurl meta, such as those pushed by the Crawler and the Feed Fetcher connectors. The leading "http://" or "https://" and "www.", and trailing slashes are optional in the query.
Example	site:example.com always returns the same documents as site:http://www.example.com/
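
To see why both forms of the site: query are equivalent, it can help to picture the normalization as stripping the optional parts. The following Python sketch mimics that behavior under stated assumptions: normalize_site is a hypothetical helper, and the actual normalization performed by the product may cover more cases than shown here.

```python
def normalize_site(value: str) -> str:
    """Strip the optional scheme, leading www., and trailing slashes from a site value."""
    result = value.strip()
    for scheme in ("http://", "https://"):
        if result.startswith(scheme):
            result = result[len(scheme):]
            break
    if result.startswith("www."):
        result = result[len("www."):]
    return result.rstrip("/")


# Both spellings reduce to the same value.
print(normalize_site("example.com"))
print(normalize_site("http://www.example.com/"))
```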

Search with Optional Terms

Operator	OPT
Purpose	Specifies an optional word to include in the search. Use it to specify several terms without limiting the scope of the search.
Example	cow OPT mad searches for documents containing cow that preferably also include mad.

Search by Word Proximity

Operators	NEAR, NEXT, AFTER, BEFORE
Purpose	Find documents where search terms are in proximity of one another. By default the maximum distance between terms is 16 words. Edit this value using the Search > Search Logics > Query Language > Default distance for proximity operators property.
Example	"movie star" AFTER hollywood searches for documents where movie star appears soon after hollywood. Note: "movie star" is equivalent to movie NEXT star, the NEXT operator having a distance of 1 with the following word. You can also specify the maximum distance of the words by using NEAR/x, AFTER/x, and BEFORE/x. For example: • "movie star" NEAR/5 hollywood searches for documents where movie star appears within 5 words of hollywood, • and "movie star" BEFORE/5 hollywood searches for documents where movie star appears within 5 words before hollywood. Important: • You cannot use proximity operators with expressions whose "position" cannot be computed. For example, the query music NEAR (Madonna AND mp3) does not work, because the expression Madonna AND mp3 cannot be associated with a single word position value. • Some queries using proximity operators may fail with a No occurrence for query message when you want to open the preview of Office documents. This issue is linked to a format conversion limitation.
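
The proximity semantics can be illustrated on a list of single-word tokens. This Python sketch is not the engine's implementation: it assumes that the distance is the difference between word positions (consistent with NEXT having a distance of 1), and the function names are hypothetical. It handles single words only, not quoted phrases.

```python
def positions(tokens, word):
    """Return every index at which word occurs in tokens."""
    return [i for i, token in enumerate(tokens) if token == word]


def near(tokens, a, b, distance=16):
    """Bidirectional match: a and b occur within `distance` positions of each other."""
    return any(0 < abs(i - j) <= distance
               for i in positions(tokens, a) for j in positions(tokens, b))


def before(tokens, a, b, distance=16):
    """Mono-directional match: a occurs at most `distance` positions before b."""
    return any(0 < j - i <= distance
               for i in positions(tokens, a) for j in positions(tokens, b))


def after(tokens, a, b, distance=16):
    """Mono-directional match: a occurs at most `distance` positions after b."""
    return before(tokens, b, a, distance)


tokens = "hollywood has produced many a famous star".split()
print(near(tokens, "star", "hollywood", 5))   # positions 6 and 0 are 6 apart
print(after(tokens, "star", "hollywood", 6))
```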

Prefix Search

Operator	*
Purpose	Searches using the beginning of a word to find a proper noun using its short form, or its linguistic root.
Example	Jenn* searches for documents containing words starting with Jenn, such as Jennifer, Jennie, Jenni, and Jenna.

Pattern Search

Operator	Regular expression patterns based on Perl 5. You must open and close patterns with a / (slash) character.
Purpose	Searches for documents containing words that match a regular expression pattern.
Example	• /s.ren..pi.y/ searches for documents with words that match the pattern S . R EN .. PI . Y and would find documents with the word serendipity. • /mpg(1|2|3)?/ searches for documents containing any of the following: mpg, mpg1, mpg2, or mpg3.
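
The two example patterns can be tried out locally before using them in a query. The sketch below uses Python's re module, whose syntax is close to Perl 5 for simple patterns like these. It assumes that a pattern must match a whole word and that matching is case-insensitive; both are assumptions about the engine, and words_matching is a hypothetical helper.

```python
import re


def words_matching(pattern: str, text: str) -> list:
    """Return the words of text that the pattern matches entirely."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [word for word in re.findall(r"\w+", text) if regex.fullmatch(word)]


print(words_matching(r"s.ren..pi.y", "Finding it was pure serendipity"))
print(words_matching(r"mpg(1|2|3)?", "mpg mpg1 mpg4 mpeg mpg3"))
```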

Geographic Search

Operator	Prefix handler geo:
Purpose	See Configuring Geographic Search.
Example	To search within a radius or polygon using UQL, see Search with a Radius or Polygon (UQL).

Search with INNERJOIN

Operator	INNERJOIN
Purpose	Combine records from two documents whenever there are matching values in a common field. See More About INNERJOIN.

Search by Document Sections

Operator	SPLIT
Purpose	Searches for words in specific sections of a document.

Reserved Characters in UQL

This section gives a list of reserved characters in UQL and describes how to escape their interpretation.

List of Reserved Characters in UQL

If you need to use any of the following characters as words in your query, you must enclose them in quotes.

Name	Character
Slash	/ Use it for: • passing options, for example, NEAR/12 • pattern search as a regexp operator, for example, text:/bug.*/
Tab	\t
Line feed	\n
Carriage return	\r
Round brackets	( or )
Square brackets	[ or ]
Curly brackets	{ or }
Colon	:
Equal sign	=
Greater than or less than	< or >
Comma	,

Note: % is not a reserved character.

Escaping UQL Operators Interpretation

You can add a backslash (\) to disable the interpretation of UQL operators (parentheses, =, {, }, etc.).

Note: If any tokenization takes place afterward, the tokenizer still decides how to handle the character. For example, \= is interpreted as a simple = but the default tokenization considers this character as punctuation. It is therefore removed from the query like any other punctuation character.
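
An application that passes user input into a UQL query might escape reserved characters automatically. The following Python sketch assumes that the backslash escape applies to every printable reserved character in the table above, which the documentation only states explicitly for parentheses, =, {, and }. The name escape_uql is hypothetical, and whitespace characters (tab, line feed, carriage return) are deliberately left alone.

```python
# Printable reserved characters from the table above (whitespace ones are skipped).
RESERVED_CHARACTERS = set("/()[]{}:=<>,")


def escape_uql(term: str) -> str:
    """Prefix each reserved character with a backslash to disable its interpretation."""
    return "".join("\\" + ch if ch in RESERVED_CHARACTERS else ch for ch in term)


print(escape_uql("a=b"))   # the equal sign is escaped
print(escape_uql("50%"))   # % is not reserved, so nothing changes
```

As the note above explains, an escaped character can still be dropped later by the tokenizer if it is treated as punctuation.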

Operands

This section describes the operands that you can use in UQL queries.

Standard Operands

Operand	Description	Predicate value
(e1)	Parenthesized sub expression, used to modify priority. For example: ((fast OR speed) AND NOT light)	e1
"e1"	Quoted expression, used to escape all special characters. Inside a double quoted group, words are handled in a tight (NEXT) sequence. All operators are ignored.	e1
"word1 word2"	Quoted expression.	word1 NEXT word2
"word1 OR word2"	Quoted expression	word1 NEXT "OR" NEXT word2

Regexp and Wildcard Operands

You can use wildcards and regular expressions in UQL queries. The following table illustrates some examples.

Kind	Syntax	Example
Regular expressions	[field:]/pattern/{options}	title:/desi.*/{w=10000,#=100}
Wildcard	[field:]word{options} [field:]word{options}	desi* title:*faces

Score Modifier Operands

Operand	Description	Predicate value
s=N	Replace the predicate's score by an explicit value.	price<500{s=1000}
s+=N	Increase the predicate's score by a given value.	GUI{s+=100000}
s-=N	Decrease the predicate's score by a given value. Note: The score of a given predicate can be negative, but the final score of the document can never be lower than 0.	corporate/tree:"Top/Attributes/XXX" {s-=100000}
w=N	Replace the predicate's weight by an explicit value.	design{w=10000}
w*=N	Multiply the predicate's weight by a given value.	design{w*=2}

Note:

• The score modifiers are applied in the same order as they appear. In the case of two conflicting modifiers, for example, {s=1000,s=2000}, the last one is applied.

• In the case when an explicit score (s=) is specified, the predicate's weight is ignored so the 'w' options have no effect.

• The explicit score (s=) can only be used for nontextual predicates (numeric values and categories) since this modifier completely ignores the ranking score class set when the document was indexed.
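
The ordering rule for score modifiers can be pictured with a short simulation. This Python sketch is not the engine's ranking code: apply_score_modifiers and clamp_document_score are hypothetical names, weights (w options) are left out, and combining predicate scores by summing them is an assumption made only to show the floor at 0.

```python
def apply_score_modifiers(base_score: int, modifiers) -> int:
    """Apply (operator, value) pairs in the order they appear, as in {s=1000,s+=5}."""
    score = base_score
    for operator, value in modifiers:
        if operator == "s=":
            score = value          # explicit value replaces the current score
        elif operator == "s+=":
            score += value
        elif operator == "s-=":
            score -= value         # a predicate score may become negative
        else:
            raise ValueError(f"unsupported modifier: {operator}")
    return score


def clamp_document_score(predicate_scores) -> int:
    """Assumed combination (a plain sum); the final score is never lower than 0."""
    return max(0, sum(predicate_scores))


print(apply_score_modifiers(500, [("s=", 1000), ("s=", 2000)]))  # last one wins
print(clamp_document_score([apply_score_modifiers(500, [("s-=", 100000)]), 300]))
```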

Word Matching Options

Word matching options are specified inside {} and directly appended to search words. They must be comma-separated.

Option	Description	Example
k=number	Set explicit matching level.	k=1 is the lowercase matching mode; k=2 is the normalized matching mode.
hl=0	Deactivates search result highlighting and summary for a specific node of the query only.	word1 word2{hl=0} word3

Operators by Priority

The query expansion modules rewrite the query based on the operators. To correctly expand the query, each operator has a priority.

Operators by Processing Priority, where e1 and e2 are Expressions
Priority	Operator	Explanation	Example
1	prefix handlers	Prefix handlers are always processed first.	Obama before:2009/01/01, searches for all documents relating to Obama before January 2009.
2	FUZZYAND/option (expression)	Search for documents that match at least N queries, where N is determined by the fuzzyand option. This option can be either: • minimum success: at least N queries must match. N is a positive integer. • or maximum failures: up to N queries can fail. N is a negative integer.	For a document that contains: The quick brown fox. If min. success=2: FUZZYAND/2 (the quick brown foxes) matches, but FUZZYAND/2 (a brown foxx) does not. If max. failure=-1: FUZZYAND/-1 (a quick brown fox) matches, but FUZZYAND/-1 (a quick foxx) does not.
3	OPT e1	Optional operator	OPT graphical
4	NOT e1	Negation operator	NOT myword
5	e1 NEXT e2	Explicit sequence operator for adjacency match.	user NEXT interface
6	e1 AFTER e2	Proximity (mono-directional) match	interface AFTER user
6	e1 AFTER/distance e2	AFTER with explicit word distance	interface AFTER/4 user
6	e1 BEFORE e2	Proximity with mono-directional match	user BEFORE interface
6	e1 BEFORE/distance e2	BEFORE with explicit word distance	interface BEFORE/4 user
7	e1 NEAR e2	Proximity operator with bidirectional match	user NEAR interface
7	e1 NEAR/distance e2	NEAR with explicit word distance	user NEAR/4 interface
8	e1 SPLIT e2	A document is returned if e1 appears in at least one of the document sections delimited by the e2 delimiter. If the e2 delimiter is not present in a document, then the document is returned if e1 is valid at the document level.	user interface SPLIT Chapter matches if user interface appears between two "Chapters".
9	e1 e2	Implicit match operator on a sequence of words. It uses the implicit operator, which is AND by default.	search engine
10	(e1) INNERJOIN/key (e2)	Search for documents matching e1 where e2 appears in child documents. The relation between documents is contained in a key index field, which must be an unsigned integer. For performance reasons, it is best to enable the Stored in Memory option.	With subject:exalead INNERJOIN/msgId fulltext:france, we first select documents whose subject is exalead, then the join is made with documents containing the word france.
11	e1 BUTNOT e2	The search matches if there is at least one instance of e1 that is not also part of an instance of e2 at the same position.	York BUTNOT "New York"
12	e1 AND e2	Explicit conjunction match.	user AND interface
13	e1 XOR e2	Exclusive OR operation. It can be either e1 OR e2, but not e1 AND e2	design XOR conception
14	e1 OR e2	Disjunction match	design OR conception
15	e1 BOR e2	Disjunction match. Use it only for a fast OR on many documents.	design BOR conception
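
The FUZZYAND option in the table above can be checked with a small simulation. This Python sketch reproduces the four examples given for the document "The quick brown fox." under simplifying assumptions: terms are compared as exact lowercase words, with no linguistic expansion, and fuzzyand is a hypothetical function name.

```python
import re


def fuzzyand(option: int, query_terms: str, document: str) -> bool:
    """Positive option: minimum number of matches. Negative option: maximum number of failures."""
    document_words = set(re.findall(r"\w+", document.lower()))
    terms = re.findall(r"\w+", query_terms.lower())
    matched = sum(1 for term in terms if term in document_words)
    if option > 0:
        return matched >= option
    if option < 0:
        return len(terms) - matched <= -option
    raise ValueError("option must be a non-zero integer")


document = "The quick brown fox."
print(fuzzyand(2, "the quick brown foxes", document))   # 3 terms match
print(fuzzyand(2, "a brown foxx", document))            # only 1 term matches
print(fuzzyand(-1, "a quick brown fox", document))      # 1 failure
print(fuzzyand(-1, "a quick foxx", document))           # 2 failures
```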

More About INNERJOIN

You can use the INNERJOIN operator to join "left" and "right" documents based on a common numerical field. INNERJOIN only returns matching documents from the "left side" of the query.

For example, if you have order and customer documents that both contain a customer_id join key field, you can perform queries such as:

order_price>42 INNERJOIN/customer_id (customer_name:john AND customer_firstname:doe)

This returns all orders (the left-side documents) for John Doe that had an order price greater than 42.
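
The semantics of this query can be sketched with in-memory records. The Python example below is an illustration only, not how the index evaluates the join: the sample data and the inner_join helper are hypothetical. It shows that only left-side documents (orders) are returned, filtered by the keys of the matching right-side documents (customers).

```python
orders = [
    {"order_id": 1, "customer_id": 10, "order_price": 50},
    {"order_id": 2, "customer_id": 10, "order_price": 30},
    {"order_id": 3, "customer_id": 11, "order_price": 99},
]
customers = [
    {"customer_id": 10, "customer_name": "john", "customer_firstname": "doe"},
    {"customer_id": 11, "customer_name": "jane", "customer_firstname": "roe"},
]


def inner_join(left_docs, right_docs, key, left_match, right_match):
    """Return left documents that match and share a key value with a matching right document."""
    right_keys = {doc[key] for doc in right_docs if right_match(doc)}
    return [doc for doc in left_docs if left_match(doc) and doc[key] in right_keys]


result = inner_join(
    orders, customers, "customer_id",
    lambda order: order["order_price"] > 42,
    lambda customer: customer["customer_name"] == "john"
    and customer["customer_firstname"] == "doe",
)
print(result)  # only order 1: right customer, price above 42
```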

Important: INNERJOIN queries have a significant impact on search performance.

When and How to Use INNERJOIN

Only use INNERJOIN for complex multicriteria search on the right-side document. In the above example, this is the customer.

Ideally, for better search performance, aim instead to inline the right-side document information in the left-side document. However, INNERJOIN is mandatory for certain multicriteria queries.

Requirements

• The field must be an integer, a date or a value, retrievable and RAM-based.

• For a value field, the innerjoin does not support two different join key fields for left and right side. For details about value fields, see Create Value Facets for Nonhierarchical Metas.

Perform an INNERJOIN Query

When the join field name is the same for both left and right documents

• UQL: (left-side query) INNERJOIN/field (right-side query)

• ELLQL: #innerjoin(field, left-side query, right-side query)

When the join field name is different for the left and right documents (not applicable to value fields)

• UQL: (left-side query) INNERJOIN/field1=field2 (right-side query)

• ELLQL: #innerjoin(field1,field2, left-side query, right-side query)

Important: When using a join field that was generated by a Data Model property, always specify the full field name, prefixed by the declaring class name, even in UQL.

Hide INNERJOIN from Users

To avoid exposing the INNERJOIN operator to the user, you can use query templates to combine ELLQL and UQL to perform the join from the user query. For details, see Defining Query Templates.

Handling Documents in Multiple Slices

The INNERJOIN operator is intra-slice only: you cannot join a left-side document (LD) from one slice to a right-side document (RD) in another slice.

There are two main ways of handling this, depending on the use case.

Use Case 1: LD and RD are shardable in the Same Way

For example, when indexing customers and orders, we can shard the index by customer ID, and make sure that all orders go to the same slice as the customer.

To do that, we must override the automatic slice balancing algorithm, using a PAPI directive called forcedSlice.

Note: A PAPI directive is a small value associated to a document, which is not part of the textual content, that gives processing instructions or hints to the processing chain.

Set a PAPI Directive

// General form
document.setCustomDirective(name, value);
// Example: force the document into slice 2
document.setCustomDirective("forcedSlice", "2");

You then must correctly balance all slices.
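
One simple way to keep a customer and its orders together is to derive the slice from the customer ID. The Python sketch below is an assumption-laden illustration: the modulo scheme, the 0-based slice numbering, and the function names are all hypothetical choices, and the resulting value would still have to be passed to setCustomDirective by your connector code.

```python
def slice_for_customer(customer_id: int, slice_count: int) -> str:
    """Map a customer ID to a slice number (assumed 0-based) using a modulo."""
    return str(customer_id % slice_count)


def directives_for(document: dict, slice_count: int) -> dict:
    """Compute the forcedSlice directive for a customer or order document."""
    return {"forcedSlice": slice_for_customer(document["customer_id"], slice_count)}


customer = {"customer_id": 10, "customer_name": "john"}
order = {"order_id": 1, "customer_id": 10, "order_price": 50}
# Both documents share customer_id 10, so they get the same slice.
print(directives_for(customer, 4), directives_for(order, 4))
```

A modulo spreads documents evenly only if customer IDs and order volumes are themselves evenly distributed, so the balance of the slices still needs to be checked.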

Use Case 2: LD and RD are not shardable in the Same Way

In this case, the only solution is to fully duplicate the right-side documents (RD) in each slice. As the RD is never retrieved by the INNERJOIN query, there is no risk of duplicate results.

This works best when the set of RDs is relatively small, so duplicating them does not dramatically impact the slices.

You need to push each RD N times (where N=number of slices), each time with a different URI, for example /myoriginaluri/slice3.

For each copy, you need to set the forcedSlice directive as described in Use Case 1: LD and RD are shardable in the Same Way.

You do not need to do anything special for your LD, since the automatic slice allocator keeps dispatching them.
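
The duplication of right-side documents can be planned as a list of URI and directive pairs, one per slice. This Python sketch is illustrative only: replicate_right_document is a hypothetical helper, the 0-based slice numbering is an assumption, and the actual push of each copy through the PAPI is not shown.

```python
def replicate_right_document(uri: str, slice_count: int) -> list:
    """Return one (uri, directives) pair per slice for a right-side document."""
    return [
        (f"{uri}/slice{index}", {"forcedSlice": str(index)})
        for index in range(slice_count)
    ]


for copy_uri, directives in replicate_right_document("/myoriginaluri", 3):
    print(copy_uri, directives)
```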