Lucid Technologies & Solutions Pvt. Ltd.

Solution 1:

A custom Search Service configured using Relevancy and Display Rules

Pre-requisites: Collibra DGC 5.6+ with API v2, Elasticsearch 6.7

Packaging: Containerized application exposing the Search API; custom workflow to set up the relevancy and display rules; Collibra asset model extension

Licensing: Annual subscription

To help users find the appropriate data assets in the Enterprise catalog with the fewest clicks, we have created a custom search service that can be embedded in any application, including a dashboard in Collibra Data Governance Center (DGC). Relevancy rules that reflect the Enterprise Data Standards can be configured in DGC, so that the assets that best meet those standards rise to the top of the search results, rewarding adherence to the standards.

How is it different from the Collibra Search API?

• Keywords can be matched against a specific attribute of an asset or of a related asset (e.g. the label or name of a report attribute can be matched and the report name returned in the result)
• Search results can be ordered by a configurable relevancy rule, with exact matches before partial matches. For example, rather than ordering results alphabetically, a rule can rank results by how the search keyword matched, in this order:
(1) exact match with the name or acronym of an “approved” metric/KPI/business term;
(2) exact match with the label or name of a report attribute of a report whose status is “certified”;
(3) exact match with the name of a business term linked to a column of a certified view;
(4) exact match with the label or name of a report attribute of a report whose status is “pending certification”;
(5) exact match with the name of a column in a table.
• Exclusion criteria can be specified as a parameter of the Search API, e.g. do not match any asset whose status is “Deprecated”
• Content to be returned in the search results can be managed using the display rules
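The ordering and exclusion behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the product's implementation: the `Asset` class, the `RULES` table, and the `rank` and `search` functions are hypothetical names, and the sketch assumes that exact matches are ranked first and that rule priority breaks ties among them.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    asset_type: str
    status: str


# Hypothetical relevancy rules in priority order: (asset type, required status).
# A required status of None accepts any status.
RULES = [
    ("Business Term", "Approved"),
    ("Report Attribute", "Certified"),
    ("Column", None),
]


def rank(asset, keyword):
    """Return a sort key (exactness, rule index), or None if the asset does not match."""
    name, kw = asset.name.lower(), keyword.lower()
    if kw not in name:
        return None
    for index, (asset_type, status) in enumerate(RULES):
        if asset.asset_type == asset_type and status in (None, asset.status):
            # 0 = exact match, 1 = partial match, so exact matches sort first.
            return (0 if name == kw else 1, index)
    return None


def search(assets, keyword, exclude_statuses=()):
    """Drop excluded statuses, then order exact matches first, then by rule priority."""
    hits = []
    for asset in assets:
        if asset.status in exclude_statuses:
            continue
        key = rank(asset, keyword)
        if key is not None:
            hits.append((key, asset))
    hits.sort(key=lambda hit: hit[0])
    return [asset for _, asset in hits]


# Made-up sample data for illustration only.
catalog = [
    Asset("Revenue by Region", "Report Attribute", "Certified"),
    Asset("Revenue", "Column", "Candidate"),
    Asset("Revenue", "Business Term", "Approved"),
    Asset("Revenue", "Column", "Deprecated"),
]

for asset in search(catalog, "revenue", exclude_statuses={"Deprecated"}):
    print(asset.asset_type, "|", asset.name)
```

With this sample data the approved business term is listed first, the non-deprecated column second, and the partially matching report attribute last; the deprecated column is never returned.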





Scoring Rules: Rules that specify which assets are relevant in the context of the search, which attributes are matched against the search keyword(s), on either the asset or a related asset, and the relative relevancy weight used to order the search results.

Display Rules: Rules that specify which assets and attributes should be displayed in the search result. When a search keyword is matched against an attribute of a related asset, the matched asset is also made available in the search result.
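As a rough illustration of how a display rule might shape one search result, the sketch below projects an asset onto the attributes a rule allows and attaches the related asset on which the keyword actually matched. The `DISPLAY_RULES` table, the `render_hit` function, and all field names are assumptions made for this example; they are not taken from the product.

```python
# Hypothetical display rules: attributes to show for each asset type.
DISPLAY_RULES = {
    "Report": ["name", "status", "owner"],
    "Report Attribute": ["name", "label"],
}


def project(asset):
    """Keep only the attributes that the display rule for this asset type allows."""
    fields = DISPLAY_RULES.get(asset["type"], ["name"])
    return {field: asset[field] for field in fields if field in asset}


def render_hit(matched, parent=None):
    """Build one result entry.

    If the keyword matched a related asset (e.g. a report attribute), the
    parent asset is shown as the result and the matched asset is attached.
    """
    entry = {"result": project(parent if parent is not None else matched)}
    if parent is not None:
        entry["matched_on"] = project(matched)
    return entry


# Made-up sample data for illustration only.
report = {"type": "Report", "name": "Quarterly Sales", "status": "Certified",
          "owner": "Finance", "internal_note": "not for display"}
attribute = {"type": "Report Attribute", "name": "net_rev", "label": "Net Revenue"}

print(render_hit(attribute, parent=report))
```

Attributes that are not named in the rule, such as `internal_note` here, are left out of the result.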


Custom Workflows: Used to create and manage the Scoring and Display Rules in Collibra DGC.


Extract: Assets are periodically extracted from Collibra DGC using Collibra REST APIs. The extraction (refresh) frequency can be configured. Initial and incremental refresh options are supported.
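The difference between the initial and incremental refresh options can be pictured as a filter on the modification time of each asset. The sketch below is an assumption-laden illustration: `select_for_refresh` and the `last_modified` field are hypothetical names, and the calls to the Collibra REST APIs that would supply the assets are deliberately left out.

```python
from datetime import datetime, timezone


def select_for_refresh(assets, last_run=None):
    """Pick the assets to (re)load into the search repository.

    Initial refresh (last_run is None): every asset is selected.
    Incremental refresh: only assets modified after the previous run.
    """
    if last_run is None:
        return list(assets)
    return [asset for asset in assets if asset["last_modified"] > last_run]


# Made-up sample data for illustration only.
assets = [
    {"name": "Revenue", "last_modified": datetime(2020, 1, 10, tzinfo=timezone.utc)},
    {"name": "Net Margin", "last_modified": datetime(2020, 3, 5, tzinfo=timezone.utc)},
]
previous_run = datetime(2020, 2, 1, tzinfo=timezone.utc)

print([a["name"] for a in select_for_refresh(assets)])
print([a["name"] for a in select_for_refresh(assets, last_run=previous_run)])
```

A scheduler running at the configured refresh frequency would record the time of each run and pass it as `last_run` on the next one.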


Search Repository: Custom Elasticsearch repository used for persisting extracted content.


Scoring Engine: Calculates the relevancy score of each asset using the scoring rules. The relevancy score is used to order the search results.


Search Service: Delivered as a REST API that can be used in any application.
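To give a feel for how a client application might call such a service, the sketch below builds a search request URL with the keyword and the exclusion criteria passed as query parameters. The base URL, the `/search` path, and the parameter names `q`, `limit` and `excludeStatus` are placeholders invented for this example; the actual contract of the Search API is not described here.

```python
from urllib.parse import urlencode


def build_search_url(base_url, keyword, exclude_status=(), limit=10):
    """Assemble a GET URL for a hypothetical search endpoint."""
    params = [("q", keyword), ("limit", limit)]
    # One excludeStatus parameter per status value to be excluded.
    params += [("excludeStatus", status) for status in exclude_status]
    return f"{base_url}/search?{urlencode(params)}"


url = build_search_url(
    "https://search.example.com/api",  # placeholder host, not a real deployment
    "net revenue",
    exclude_status=["Deprecated"],
)
print(url)
```

Because the service is a plain REST API, the resulting URL can be requested from any HTTP client, whether in a web page, a dashboard widget, or a back-end job.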

Governance Analytics can also be powered using this custom Elasticsearch Repository.