Analyzing User Queries with Reporters

CloudView contains several reporters allowing you to collect reporting information related to the behavior of your front-end applications.

For example, reporters can retrieve the execution time or the CPU time of the search service, the suggest service, and the mashup components of your applications (page, feed, widget, and trigger). Reporters are therefore useful for analyzing and troubleshooting performance issues.

Once you have selected a reporter, you can choose between different output formats to export collected data. To do so, you can select one or several publishers: CSV, JDBC, Reporting Store, PAPI.

Once data is exported to CSV, JDBC or Reporting Store, you can build reports out of it using external reporting tools.

About Reporters

Output Reporting Data to CSV Files

Output Reporting Data to a JDBC Database

Output Reporting Data to the Internal SQLite Database

Index Reporting Data as a Data Source

Available Fields for the Reporting Publishers

About Reporters

This section describes the use of the reporters delivered by default when you install Exalead CloudView.

In the Administration Console, the Search > Reporting menu lists the default reporters:

• All of them have a Reporting Store Publisher.

• The search-reporting and suggest-reporting reporters also have CSV publishers.

What Can You Do?

You can:

• Choose to add more publishers to the default reporters to output data in different formats as explained in the following "Output data to..." sections.

• Add and configure your own custom reporter if the default ones do not cover your needs.

What Can Reporters Do?

Reporter	Description
search-reporting	Collects user query data submitted to the /search-api/ command of the Search API.
suggest-reporting	Collects query data submitted to the Suggest command of the Search API.
mashup-ui-reporting	Collects data relative to task execution and to CPU activity on the Mashup UI. For example, when a user queries a page, the reporter retrieves data such as the execution and CPU time of pages, widgets, and triggers. Once configured in the Administration Console, this reporter must be enabled in the Mashup Builder > Application > Application Properties menu. Note: When you enable the mashup-ui-reporting reporter and the Mashup UI debug mode, you can open a Timeline tab in your Mashup UI application. This tab shows a set of reporting fields with bars representing either real or CPU time values.
mashup-api-reporting	Collects data relative to the execution of feeds, subfeeds, and triggers. This reporter lets you follow the feed execution process, including subfeeds and triggers, and identify possible issues. Once configured in the Administration Console, this reporter must be enabled in the Mashup Builder > Application > API Properties menu. Note: When you enable the mashup-api-reporting reporter and the Mashup UI debug mode, you can open a Timeline tab in your Mashup UI application. This tab shows a set of reporting fields with bars representing either real or CPU time values.
opendocs-reporting	Collects data relative to the download and preview of documents on your application pages. In other words, it records what happens when users click the Download and Preview links of the Result List widget. Once collected internally, data is displayed in the Business Console’s Query Reporting > Top Opened Documents tab.

Output Reporting Data to CSV Files

Exalead CloudView can output reporting data to CSV files.

For all reporters, you can add or remove reporting fields, and adjust the rotation frequency for the output.

Note: The CSV publisher is configured by default for the search-reporting and suggest-reporting services. Reporting fields are already selected but you can edit this default configuration.

Configure the CSV Output (Optional)

1. In the Administration Console, go to Search > Reporting.

2. Under Reporter, select the CSV Publisher.

a. For Published fields, add or remove fields.

b. Modify the other options if required.

Option	Description
Output to File	Allows you to specify a name for the CSV file. The default names are suggest.csv and search.csv for the default suggest-reporting and search-reporting reporters. You can specify other file names if required.
Max file size (MB)	Allows you to specify the maximum size allowed for the CSV file. When the file exceeds this size, rotation starts automatically.
Rotate every N months/days/hours	Allows you to specify when to write the data to a new CSV file. The previous CSV files remain on the server.
Max files to keep	Maximum number of reporting files to keep. The oldest files are discarded at rotation time. 0 means that no limit is enforced, whereas 1 discards all rotated files.
Max days to keep	Maximum file age in days to keep. The oldest files are discarded at rotation time. 0 means that no limit is enforced, whereas 1 only keeps today’s files.
Max size to keep (MB)	Maximum size allowed for CSV files. The oldest files are discarded at rotation time. 0 means that no limit is enforced.

3. Click Apply.

Access CSV Query Data

1. Go to your <DATADIR>/run/searchserver-ss0/ directory.

2. Then open the file that corresponds to your reporter:

◦ search-reporting: /search-reporting/search.csv

◦ suggest-reporting: /suggest-reporting/suggest.csv

◦ mashup-ui-reporting: /mashup-ui-reporting/<FILENAME>.csv

◦ mashup-api-reporting: /mashup-api-reporting/<FILENAME>.csv

◦ opendocs-reporting: /opendocs/<FILENAME>.csv
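
Once a CSV file is available, it can be analyzed with any scripting tool. The following Python sketch lists the slowest queries found in a search-reporting file. It assumes that the file is comma-separated, that it starts with a header row naming the published fields, and that query_querystring and time_total are among them; check the actual layout of your file first. The function name slowest_queries and the sample rows are illustrative only.

```python
import csv
import io


def slowest_queries(csv_file, limit=5):
    """Return (query, milliseconds) pairs for the slowest queries.

    Assumes a header row and the query_querystring and time_total
    (microseconds) columns; adjust to your actual CSV layout.
    """
    rows = []
    for row in csv.DictReader(csv_file):
        try:
            micros = int(row["time_total"])
        except (KeyError, ValueError, TypeError):
            continue  # skip rows without a usable time_total value
        rows.append((row.get("query_querystring", ""), micros / 1000.0))
    rows.sort(key=lambda item: item[1], reverse=True)
    return rows[:limit]


if __name__ == "__main__":
    # Made-up sample standing in for
    # <DATADIR>/run/searchserver-ss0/search-reporting/search.csv
    sample = io.StringIO(
        "timestamp,query_querystring,time_total\n"
        "2024-01-01 10:00:00,laptop,12000\n"
        "2024-01-01 10:00:05,printer ink,98000\n"
        "2024-01-01 10:00:09,monitor,4500\n"
    )
    for query, millis in slowest_queries(sample, limit=2):
        print(f"{millis:8.1f} ms  {query}")
```

To run it against a real file, pass an open file object instead of the in-memory sample, for example open(path, newline="", encoding="utf-8").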

Output Reporting Data to a JDBC Database

You can create your own JDBC database to store search or suggest query data, then set up the required connection details and fields to output in Exalead CloudView. For all reporters, you can configure the export to only include a subset of these fields. You can also specify additional fields. For details, see the procedures below.

Create the JDBC Database

1. Create a JDBC database:

a. Create a dedicated table for each reporter: one table to store search query data, a separate table to store suggest query data, and so on.

b. Define the fields you want to report on.

2. Copy your JDBC driver to <DATADIR>/javabin.

3. On the Administration Console Home page, restart the searchserver and connector processes.

4. Add the JDBC reporting publisher. See Add a JDBC Reporting Publisher.
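
As a starting point for step 1, the table definition can be derived from the field tables at the end of this section. The following Python sketch builds a CREATE TABLE statement from (field name, field type) pairs. The mapping to SQL column types, the table name search_reporting, and the idea that column names match the published field names are all assumptions to validate against your database and your CloudView version.

```python
# Assumed mapping from the field types listed in this document to
# generic SQL column types; adapt it to your database vendor.
SQL_TYPES = {
    "datetime": "TIMESTAMP",
    "string": "TEXT",
    "unsigned integer": "BIGINT",
    "boolean": "BOOLEAN",
}


def build_create_table(table, fields):
    """Build a CREATE TABLE statement from (field name, field type) pairs."""
    columns = []
    for name, field_type in fields:
        if not name.isidentifier():
            raise ValueError(f"unexpected field name: {name!r}")
        columns.append(f"    {name} {SQL_TYPES[field_type.lower()]}")
    return f"CREATE TABLE {table} (\n" + ",\n".join(columns) + "\n);"


if __name__ == "__main__":
    # A small subset of the search-reporting fields.
    search_fields = [
        ("timestamp", "datetime"),
        ("query_querystring", "string"),
        ("answer_nhits", "unsigned integer"),
        ("time_total", "unsigned integer"),
        ("spellcheck_enabled", "Boolean"),
    ]
    print(build_create_table("search_reporting", search_fields))
```

Some databases treat names such as timestamp as keywords, so you may need to quote or rename columns.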

Add a JDBC Reporting Publisher

1. Follow the steps in Create the JDBC Database.

2. In the Administration Console, go to Search > Reporting.

3. Expand the configuration for the reporter to export. For example, for search queries, click search-reporting; for suggest queries, click suggest-reporting.

4. Click Add reporting publisher.

a. Select JDBC Publisher.

b. Click Accept.

5. Click the newly added reporting publisher to display its configuration settings, and:

a. For Published fields, select the fields to include.

b. Specify the connection details to the database you created in the previous procedure:

Option	Action
Driver	Enter the JDBC driver class name. For MySQL, it is com.mysql.jdbc.Driver (or com.mysql.cj.jdbc.Driver with MySQL Connector/J 8.0 and later).
Connection string	Enter the JDBC URL of the database.
Login	Enter your database login, if any.
Password	Enter your database password, if any.
Table	Enter the name of the database table you want to report on.

6. Click Apply.
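
The connection string follows the JDBC URL syntax of your database vendor. As an illustration only, the Python sketch below assembles a MySQL URL of the form jdbc:mysql://host:port/database and shows how the publisher options fit together. The host name, port, database name, login, and table name are placeholders, not values defined by CloudView.

```python
def mysql_jdbc_url(host, port, database):
    """Build a MySQL JDBC URL of the form jdbc:mysql://host:port/database."""
    return f"jdbc:mysql://{host}:{port}/{database}"


# Placeholder values for the JDBC Publisher options; none come from CloudView.
publisher_settings = {
    "Driver": "com.mysql.jdbc.Driver",
    "Connection string": mysql_jdbc_url("dbhost.example.com", 3306, "reporting"),
    "Login": "reporting_user",
    "Table": "search_reporting",
}

if __name__ == "__main__":
    for option, value in publisher_settings.items():
        print(f"{option}: {value}")
```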

Output Reporting Data to the Internal SQLite Database

You can use the Reporting Store Publisher to send data to the embedded CloudView SQLite database.

The Reporting Store Publisher is the default publisher used by all reporters. You can view the reporting data of:

• search-reporting, in the Business Console’s Query Reporting menu.

• mashup-ui-reporting, in the Mashup UI Debug Mode > Timeline tab.

• mashup-api-reporting, in the Mashup UI Debug Mode > Timeline tab.

• opendocs-reporting, in the Business Console’s Query Reporting > Top Opened Documents tab.

Output Data to the Reporting Store Publisher

1. In the Administration Console, go to Search > Reporting.

2. Under Reporter, expand one of the reporters, and select Reporting Store Publisher.

3. For Schema, select one of the predefined schemas of the embedded SQLite database. For example, for the:

◦ search-reporting reporter, select queries,

◦ suggest-reporting reporter, select suggests,

◦ etc.

4. For Rotation cron, enter a cron expression to run the rotation job periodically. The default, 0 0 * * *, runs once a day at midnight. A rotation is also triggered every time a collection is queried.

5. For Max records to keep, enter the maximum number of records that can be accumulated. When you reach the limit, the oldest records are discarded. 0 means that there is no limit to the database size.

6. Click Save.
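
To read a rotation expression such as the default 0 0 * * *, it helps to label each position. The Python sketch below does so, assuming the standard five-field cron layout (minute, hour, day of month, month, day of week). Whether CloudView accepts extensions beyond this layout is not covered here, and the function name describe_cron is illustrative.

```python
# Standard five-field cron layout, assumed for the Rotation cron option.
CRON_FIELDS = ("minute", "hour", "day of month", "month", "day of week")


def describe_cron(expression):
    """Map each part of a five-field cron expression to its field name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected {len(CRON_FIELDS)} fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


if __name__ == "__main__":
    for field, value in describe_cron("0 0 * * *").items():
        print(f"{field:>12}: {value}")
```

With the default expression, minute and hour are both 0 and the remaining fields are wildcards, which is why the job runs once a day at midnight.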

Access Reporting Store Data

1. Go to <DATADIR>/reporting_store/<SELECTED SCHEMA>/collection.db
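
Because collection.db is a SQLite file, it can be inspected with any SQLite client. The Python sketch below lists the tables it contains and their row counts without assuming anything about the schema. The function name list_tables is illustrative. Working on a copy of the file is a cautious choice, to avoid interfering with the running product.

```python
import sqlite3
import sys


def list_tables(db_path):
    """Return (table name, row count) pairs for a SQLite database file."""
    connection = sqlite3.connect(db_path)
    try:
        names = [
            row[0]
            for row in connection.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        result = []
        for name in names:
            # Quote the table name so unusual characters do not break the query.
            quoted = '"' + name.replace('"', '""') + '"'
            count = connection.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
            result.append((name, count))
        return result
    finally:
        connection.close()


if __name__ == "__main__":
    if len(sys.argv) == 2:
        # Pass the path to a copy of collection.db as the only argument.
        for name, count in list_tables(sys.argv[1]):
            print(f"{name}: {count} rows")
    else:
        print("usage: python list_tables.py <path to a copy of collection.db>")
```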

Index Reporting Data as a Data Source

Using a PAPI Publisher allows you to use reporting data as a CloudView data source. You are then able to index this reporting data and use the Mashup Builder widgets (charts, tables, etc.), to create your own graphical reports.

Output Data to the PAPI Publisher

1. In the Administration Console, go to Search > Reporting.

2. Under Reporter, expand one of the reporters, for example, search-reporting.

3. Click Add reporting publisher.

a. Select PAPI Publisher.

b. Click Accept.

4. Click the newly added PAPI Publisher to display its configuration settings, and:

a. For Connector name, select the Push API (unmanaged) connector.

b. For Host, enter the connector hostname.

c. For Port, enter the Push API port number, which is <BASEPORT> + 2.

5. Click Apply.

Index Data Collected by the PAPI Publisher

1. Go to Data Model > Classes, and select Trace all metas to retrieve the reporting fields.

2. Perform a query in your front-end application.

3. In the Administration Console, go to the Home page. The default connector is shown as working while it processes the search data.

4. To add the reporting fields as configurable properties in the Data model:

a. Go back to Data Model > Classes.

b. Click Add class to create a new class for your reporting properties.

c. Click Add properties from traced metas and define how reporting fields must be indexed.

5. Click Apply.

Once done, you can search for reporting data.

Available Fields for the Reporting Publishers

By default, all the fields listed in the following tables are exported. However, for the CSV publisher, you can configure the export to only include a subset of these published fields.

Search Reporting Fields

search-reporting Query Fields
Field Name	Type	Description
timestamp	datetime	The date and time of the export of the data.
apiclient_ip	string	IP address of the client for this API request.
query_logic	string	Search logic used for the query.
query_target	string	Search target used for the query.
query_querystring	string	The UQL query (q=) entered by the user. This is the same as the _default_ value for the query template defined in searchLogicList.xml.
query_language	string	ISO language code
query_start	unsigned integer	First requested full hit.
query_hf	unsigned integer	Number of requested full hits.
query_origin	string	Explains "what" created this request: page load on Mashup UI; AJAX load on Mashup UI; trusted queries; cache warm-up; isAlive; alerting; and so forth.
answer_nmatches	unsigned integer	Total number of matches.
answer_nhits	unsigned integer	Number of hits.
time_total	unsigned integer	Total query time in microseconds.
query_full	string	Full query parameters in URL form.
query_id	unsigned integer	Auto-assigned internal query ID.
spellcheck_enabled	Boolean	Was spellcheck enabled on this query?
spellcheck_suggestions	unsigned integer	Number of spellcheck suggestions.
spellcheck_autocorrect	Boolean	Was autocorrect enabled?
spellcheck_autocorrected	Boolean	Was autocorrect triggered?
applicationId	string	Mashup application ID passed by the API client.
user_id	string	User ID passed by the API client.
usersession_id	string	Session ID passed by the API client.
userquery_id	string	Query ID passed by the API client.
processing_indexquery	string	ELLQL query
answer_status	unsigned integer	Answer status. 0=ok, 1=error, 2=timeout, 3=limit reached
time_queue	unsigned integer	Time in query processing queue in microseconds.
time_queryprocessing	unsigned integer	Time for query parsing and processing in microseconds.
time_exec	unsigned integer	Time for partial hits execution in microseconds.
time_synfh	unsigned integer	Time for synthesis and full hits execution in microseconds.
cputime_queryprocessing	unsigned integer	CPU time for query parsing and processing in microseconds.
cputime_exec_searcher	unsigned integer	CPU time for partial hits execution, searcher side, in microseconds.
cputime_exec_index	unsigned integer	CPU time for partial hits execution, index side, in microseconds.
cputime_synthesis_searcher	unsigned integer	CPU time for synthesis execution, searcher side, in microseconds.
cputime_synthesis_index	unsigned integer	CPU time for synthesis execution, index side, in microseconds.
cputime_fullhits_searcher	unsigned integer	CPU time for full hits execution, searcher side, in microseconds.
cputime_fullhits_index	unsigned integer	CPU time for full hits execution, index side, in microseconds.
searchserver	string	The search server that processed this query.
expansion_languages	string	Language detected at search-time for the expansion.
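
When post-processing exported search-reporting records, two conversions come up often: turning microsecond timings into milliseconds and decoding answer_status. The Python sketch below shows both, using the status codes and field names from the table above. The record is a made-up example, the function name summarize_query is illustrative, and no relationship between the individual phase timings and time_total is assumed.

```python
# Status codes as listed for the answer_status field.
ANSWER_STATUS = {0: "ok", 1: "error", 2: "timeout", 3: "limit reached"}

# Wall-clock phase timings, all expressed in microseconds.
WALL_TIME_FIELDS = ("time_queue", "time_queryprocessing", "time_exec", "time_synfh")


def summarize_query(record):
    """Decode the status and convert timings of one record to milliseconds."""
    status = ANSWER_STATUS.get(int(record["answer_status"]), "unknown")
    phases_ms = {
        field: int(record[field]) / 1000.0
        for field in WALL_TIME_FIELDS
        if field in record
    }
    return {
        "status": status,
        "total_ms": int(record["time_total"]) / 1000.0,
        "phases_ms": phases_ms,
    }


if __name__ == "__main__":
    # Made-up record; values are strings, as they would be when read from CSV.
    record = {
        "answer_status": "2",
        "time_total": "150000",
        "time_queue": "500",
        "time_exec": "100000",
    }
    print(summarize_query(record))
```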

Suggest Reporting Fields

suggest-reporting Fields
Field name	Type	Description
timestamp	datetime	The date and time of the export of the data.
apiclient_ip	string	IP address of the client for this API request.
query_service	string	Name of the suggests or dispatchers called.
query_querystring	string	The UQL query (q=) entered by the user. This is the same as the _default_ value for the query template defined in searchLogicList.xml.
query_output	string	Output format: JSON or XML.
query_full	string	Full query parameters in URL form.
answer_status	Boolean	0 = OK, 1 = error
answer_nhits	unsigned integer	Number of suggestions returned.
answer_blacklisted	unsigned integer	Number of removed suggestions.
time_total	unsigned integer	Total query time in microseconds.
query_distance	unsigned integer	Approximate matching. The greater the distance, the more approximate the match. 0 for exact match.
query_cursor_pos	unsigned integer	The cursor position.
query_recursive	Boolean	Was the query processed recursively?
query_autocomplete	Boolean	Was the original query auto-completed?
query_min_d1	unsigned integer	If distance >= 1: minimum entry length to perform approximate suggestions with distance set to 1.
query_min_d2	unsigned integer	If distance >= 2: minimum entry length to perform approximate suggestions with distance set to 2.
query_logic	string	Search logic used for the query.
query_callback	string	The JavaScript callback that was called.
query_exhaustive	Boolean	Did the output contain exhaustive information?
searchserver	string	The search server that processed this query.

Mashup UI & Mashup API Reporting Fields

The following fields apply to both mashup-ui-reporting and mashup-api-reporting.

mashup-ui-reporting & mashup-api-reporting Fields
Field name	Type	Description
timestamp	datetime	The date and time of the export of the data.
user_id	string	User ID passed by the API client.
application_id	unsigned integer	Auto-assigned internal application ID.
report_id	unsigned integer	Auto-assigned internal report ID.
component_type	string	Indicates the reported component types. For example, Page, PreRequestTrigger, Widget, MashupWidgetTrigger, etc.
component_name	string	Indicates the names of all the components that can be found on the page, that is to say the name of the page itself, the widget names and the trigger names.
component_event	string	Indicates the event types. For example, render, before_query, after_query, before_rendering, after_rendering, etc.
start_cpu	unsigned integer	CPU start time value in microseconds for each page component event.
stop_cpu	unsigned integer	CPU stop time value in microseconds for each page component event.
start_nanotime	unsigned integer	Real start execution-time value in nanoseconds for each page component event.
stop_nanotime	unsigned integer	Real stop execution-time value in nanoseconds for each page component event.
service_instance	string	Mashup UI or Mashup API instance name (as specified in Deployment > Roles). For example, mu0 for Mashup UI and ac0 for Mashup API.
user_session	string	Auto-assigned internal user session ID.
query_querystring	string	The full query string received by the page or the Mashup API.
client_ip	string	The web client (browser) IP address.
client_user_agent (for mashup-ui-reporting only)	string	The web client (browser) user agent.
client_accept_language (for mashup-ui-reporting only)	string	The web client (browser) default language.
response_size	unsigned integer	The web client (browser) or the Mashup API response size in bytes.
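
The start and stop fields use different units: CPU values are in microseconds and real-time values are in nanoseconds. The Python sketch below computes both durations for one component event and converts them to milliseconds. The record is a made-up example and the function name event_durations_ms is illustrative.

```python
def event_durations_ms(record):
    """Return (real time, CPU time) in milliseconds for one component event."""
    # start_nanotime and stop_nanotime are in nanoseconds.
    real_ms = (int(record["stop_nanotime"]) - int(record["start_nanotime"])) / 1_000_000
    # start_cpu and stop_cpu are in microseconds.
    cpu_ms = (int(record["stop_cpu"]) - int(record["start_cpu"])) / 1_000
    return real_ms, cpu_ms


if __name__ == "__main__":
    # Made-up record for a widget rendering event.
    record = {
        "component_type": "Widget",
        "component_event": "render",
        "start_nanotime": "1000000000",
        "stop_nanotime": "1250000000",
        "start_cpu": "2000",
        "stop_cpu": "82000",
    }
    real_ms, cpu_ms = event_durations_ms(record)
    print(f"real: {real_ms} ms, cpu: {cpu_ms} ms")
```

A real time much larger than the CPU time suggests that the component spent most of the event waiting rather than computing.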

opendocs Reporting Fields

opendocs-reporting Fields
Field name	Type	Description
timestamp	datetime	The date and time of the export of the data.
document_source	string	The connector name.
document_uri	string	The document URI that was downloaded or previewed.
document_filename	string	The document file name that was downloaded or previewed.
user_id	string	User ID passed by the API client.
application_id	string	The name of the mashup UI application on which documents were downloaded or previewed.
query_querystring	string	The UQL query (q=) entered by the user. This is the same as the _default_ value for the query template defined in searchLogicList.xml.
query_queryfull	string	Full query parameters in URL form.
buildgroup	string	The name of the build group in which the document is indexed.
type	string	The type of action that was executed: Download or Preview.
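
If you export opendocs-reporting data to CSV or JDBC, a ranking similar in spirit to the Top Opened Documents tab can be rebuilt outside the Business Console. The Python sketch below counts records per document_uri, optionally restricted to one action type. It is not the Business Console's own computation; the records and the function name top_opened_documents are illustrative.

```python
from collections import Counter


def top_opened_documents(records, action=None, limit=10):
    """Count opened documents by URI, optionally for one action type only."""
    counts = Counter(
        record["document_uri"]
        for record in records
        if action is None or record.get("type") == action
    )
    return counts.most_common(limit)


if __name__ == "__main__":
    # Made-up records using the opendocs-reporting field names.
    records = [
        {"document_uri": "file:///docs/a.pdf", "type": "Download"},
        {"document_uri": "file:///docs/a.pdf", "type": "Preview"},
        {"document_uri": "file:///docs/b.pdf", "type": "Download"},
    ]
    for uri, count in top_opened_documents(records):
        print(f"{count:4d}  {uri}")
```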