CMIS-Based Query Language

→ "Try it out - API Component "YADB" ("POST search documents by search query" endpoint)"
→ check GitHub Python calls | Java Script calls | Java calls | Postman collection


The query language based on Content Management Interoperability Services (CMIS) serves to standardize requests using the yuuvis® API. It consists of a subset of the SQL-92 grammar, augmented by a syntax for requesting fulltext. Thus, the CMIS query language provides a relational view of the data. A query is structured as follows

Query structure SELECT [virtual column list] FROM [virtual table names] WHERE [conditions] ORDER BY [sorting specification]

The virtual columns requested in the SELECT correspond to the CMIS properties (metadata fields).

  • Use * as virtual column list to query all metadata fields
  • It is possible to use the SCORE() function in the virtual columns list

The virtual tables requested in the FROM clause correspond to object types of the document management system (DMS)
Note that currently only the query of a single object type is supported.

Simple SELECT Queries

Select all objects of type 'email'
SELECT * FROM email;
Select the customer ID (custno) and the date (date) of all invoices (invoice).
SELECT custno,date FROM invoice;

Queries with Aliases

Properties can be queried with an alias name. The values of these properties are returned in the result with the alias name. The alias can be specified by a space separated behind the property name or with the key word AS between property name and alias name.

Alias without keyword: Select the customer ID (custno) with alias c and the date (date) with alias d of all invoices (invoice).
SELECT custno c,date d FROM invoice;
Alias with keyword: Select the customer ID (custno) with alias c and the date (date) with alias d of all invoices (invoice).
SELECT custno AS k,date AS d FROM invoice;

Queries with Conditions

Operators and Predicates

In the WHERE clause, conditions can be formulated and combined with logical operators. The following operators are supported:

  • comparison operators: <, >, <=, >=, <>, =
  • join operators: AND, OR, NOT
  • additional predicates: LIKE, IN, NULL
Select all invoices (invoice) where the invoice amount (amount) is greater than 100.
SELECT * FROM invoice WHERE amount>100;
Select all invoices (invoice) where the invoice amount (amount) is greater than 100 and the supplier (lieferant) is 'ACME'.
SELECT * FROM invoice WHERE amount>100 AND lieferant='ACME';
Select all invoices (invoice) where the invoice amount (amount) is greater than 100 or less than 10 and the supplier (lieferant) is either 'ACME', 'OS' or 'ICH'.
SELECT * FROM invoice WHERE (amount>100 OR amount<10) AND lieferant IN ('ACME','OS','ICH');

Wildcards

Wildcard characters are used with the LIKE operator. There are two wildcards used in conjunction with the LIKE operator:

  • % - The percent sign represents zero, one, or multiple characters
  • _ - The underscore represents a single character
Select all invoices (invoice) where the supplier (lieferant) starts with 'ACM' and the index data field 'amount' is not NULL
SELECT * FROM invoice WHERE lieferant LIKE 'ACM%' AND amount NOT NULL;

Full-Text Search

At the moment, there is a shortlist of stop words that will NOT be indexed.

Some Examples

and is the
hello it ok
to too

Full list of Stop Words

To perform full-text search, use CONTAINS in the WHERE clause.

Select all invoices (invoice) where in fulltext either 'reminder', 'discount' or 'delay' was found.
SELECT * FROM invoice WHERE CONTAINS('reminder discount delay');
Select all invoices (invoice) where the invoice amount (amount) is greater than 1000 and in fulltext either 'reminder', 'discount' or 'delay' was found.
SELECT * FROM invoice WHERE amount>1000 AND CONTAINS('reminder discount delay');

The SCORE function returns the score values produced by the CONTAINS function in the SEARCH_SCORE field. The score value is a relative ranking, with values in the range of 0 to 1 and increases with the relevance. The SCORE function can also be queried with an alias name.

Select all invoices (invoice) and their score value where the invoice amount (amount) is greater than 1000 and in fulltext either 'reminder', 'discount' or 'delay' was found.
SELECT *,SCORE() FROM invoice WHERE amount>1000 AND CONTAINS('reminder discount delay');
With alias: Select all invoices (invoice) and their score value where the invoice amount (amount) is greater than 1000 and in fulltext either 'reminder', 'discount' or 'delay' was found.
SELECT *,SCORE() score FROM invoice WHERE amount>1000 AND CONTAINS('reminder discount delay');

Full-text search only in certain fields

Select all invoices (invoice) and their score value where the field 'status' either contains 'reminder', 'discount' or 'delay'.
SELECT *,SCORE() FROM invoice WHERE status CONTAINS('reminder discount delay');

Sorting with ORDER BY

To sort the result list, use the ORDER BY clause.

Select all invoices (invoice) where the invoice amount (amount) is greater than 1000, with the result list sorted in descending order of the invoice amount.
SELECT * FROM invoice WHERE amount>1000 ORDER BY amount DESC;
Select all invoices (invoice) where the invoice amount (amount) is greater than 1000, with the result list sorted in descending order of the invoice amount and in ascending order of the date (date).
SELECT * FROM invoice WHERE amount>1000 ORDER BY amount DESC, date ASC;

In ORDER BY clauses, you can use the alias names given in the column definitions.

Select the invoice amount (amount) with alias a of all invoices (invoice) where the invoice amount (amount) is greater than 1000, with the result list sorted in descending order of the alias a (invoice amount).
SELECT amount a FROM invoice WHERE amount>1000 ORDER BY a DESC;

To sort by the score value, you must assign an alias name to the SCORE () function.

Select all invoices (invoice) and their score value where the field 'status' either contains 'reminder', 'discount' or 'delay', with the result list sorted in ascending order of alias s (score value).
SELECT *,SCORE() s FROM invoice WHERE status CONTAINS('reminder discount delay') ORDER BY s ASC;

Aggregations (COUNT(*), GROUP BY)

To determine the number of objects with special properties, queries can be aggregated. All objects with the same entry in the field specified after the GROUP BY clause are summarized. The number of objects is returned to the field 'OBJECT_COUNT' without the assignment of aliases.

Select the number of objects (enaio:object) with type specification, aggregated by the type of the object.
SELECT COUNT(*),type FROM enaio:object GROUP BY type;

It is also possible to aggregate several fields. In this case, all combinations of the fields mentioned are returned.

Select the number of objects (enaio:object) with type and title specification, aggregated by the type and title of the object.
SELECT COUNT(*),type,title FROM enaio:object GROUP BY type,title;

The GROUP BY function can be used without the COUNT(*) function. Then all combinations of the mentioned fields are returned without their numbers.

Select 'type' of all objects (enaio:object), aggregated by the type of the object.
SELECT type FROM enaio:object GROUP BY type;

To sort the result list of an aggregation, use the ORDER BY clause, with or without alias.

Select the number of objects (enaio:object) with type specification, aggregated by the type of the object, with the result list sorted in ascending order of 'type'.
SELECT COUNT(*),type FROM enaio:object GROUP BY type ORDER BY type ASC;
With alias: Select the number and 'type' with alias 't' of all objects (enaio:object), aggregated by 'type', with the result list sorted in descending order of alias t (type).
SELECT COUNT(*),type t FROM enaio:object GROUP BY type ORDER BY t DESC;
Select the number with alias 'c' and 'type' of all objects (enaio:object), aggregated by 'type', with the result list sorted in descending order of alias c (number of objects).
SELECT COUNT(*) c,type FROM enaio:object GROUP BY type ORDER BY c DESC;

Provided date functions in WHERE clause

The query language provides some frequently used time span functions for WHERE clauses.

  • today() - a date field contains a today's date
  • yesterday() - a date field contains a yesterday's date
  • thisweek() - a date field contains a this week's date, a week starts on moday and ends on sunday
  • lastweek() - a date field contains a last week's date, a week starts on moday and ends on sunday
  • thismonth() - a date field contains a this month's date
  • lastmonth() - a date field contains a last month's date
  • thisyear() - a date field contains a this year's date
  • lastyear() - a date field contains alast year's date
Usage
SELECT ... FROM ... WHERE enaio:creationDate IN today();
Examples
SELECT contentStreamId, fileName, repositoryId FROM enaio:object WHERE enaio:creationDate IN lastmonth();
SELECT * FROM document WHERE enaio:lastModificationDate IN today() AND enaio:creationDate IN lastyear() ORDER BY enaio:creationDate DESC;
SELECT COUNT(*) c, enaio:creationDate FROM enaio:object WHERE enaio:creationDate IN thisyear() GROUP BY enaio:creationDate ORDER BY c DESC;

Queries on contentstream fields

Additional to property fields of a DMS object, all contentstream fields can be queried in SELECT statements as described above. The following queriable contentstream fields exist:

  • contentStreamId- the ID of the contentstream
  • repositoryId- the ID of the repository the content is stored in
  • mimeType - the mime type of the content
  • fileName - the file name of the content
  • length - the length in bytes of the file
  • range - for compound documents the byte range which represents the object content in the file
  • digest - the sha256 hash of the content
  • archivePath - the file path within an archive
Examples
SELECT contentStreamId, fileName, repositoryId FROM enaio:object;
SELECT * FROM document WHERE mimeType'message/rfc822' ORDER BY title DESC;
SELECT COUNT(*) c, fileName FROM enaio:object GROUP BY fileName ORDER BY c DESC;

Query - JSON format

A search can be called by HTTP POST Endpunkt /api/dms-core/objects/search by giving the following JSON-structure:

Input JSON with use of parameters
{
  "query" : {
    "statement" : "SELECT * from sysemail WHERE email = @emailAddress AND sysfrom IN @from",
    "skipCount" : 0,         										// optional for Paging
    "maxItems" : 50,         										// optional for Paging
	"useCache" : true,       										// optional
	"handleDeletedDocuments" : "DELETED_DOCUMENTS_EXCLUDE",			// optional DELETED_DOCUMENTS_INCLUDE | DELETED_DOCUMENTS_ONLY | DELETED_DOCUMENTS_EXCLUDE default: DELETED_DOCUMENTS_EXCLUDE 
 	"parameters": {													// optional, only if @ Parameter occurs in statement 
		"emailAddress": "info@optimal-systems.de",
		"from": "('hallo@huhu.de','info@huhu.de')"
	}
 }
}

The field statement contains the SQL query, skipCount (default: 0) defines the number of skipped objects, from with the maxItems (default: 50) number of found objects will be delivered. This defines the paging of the search call. The field useCache (default: true) defines if caching will be used. Within a default search, all objects marked as deleted will be excluded (DELETED_DOCUMENTS_EXCLUDE). By using the optional field handleDeletedDocuments this behaviour can be changed in including this objects (DELETED_DOCUMENTS_INCLUDE) or delivering only deleted objects (DELETED_DOCUMENTS_ONLY).

Using of parameters

Within a SQL statement two predefined parameters can be referenced. By @userId the ID of the actual user (taken from the user session) and by @userRoles the roles of the actual user (also taken from the user session) can be referenced for replacement at runtime.

Input JSON with use of @userId and @userRoles parameters
{
  "query" : {
    "statement" : "SELECT * from enaio:object WHERE enaio:createdBy = @userId AND enaio:roles IN @userRoles"
 }
}

The JSON field parameters can be used to define parameter (reserved names: userRoles, userId) values which can be referenced by @<parameter> within a preformed sql statement for replacement at runtime. The reserved parameter names userRoles, userId cannot be used for own definition of a parameter. A given definition under this names affects an exception.

Input JSON with use of parameters
{
  "query" : {
    "statement" : "SELECT * from enaio:object WHERE email = @emailAddress AND email:from IN @from",
 	"parameters": {													// optional, only if @ Parameter occurs in statement 
		"emailAddress": "info@optimal-systems.de",
		"from": "('hallo@huhu.de','info@huhu.de')"
	}
 }
}