Packaging: This utility comprises a workflow and a set of custom services distributed as a containerized application.
Licensing: Annual subscription
Since Collibra has a catalog of all the physical data stores and reporting platforms, it can be leveraged to reduce data debt and drive additional operational efficiencies through consolidation of similar assets. To identify potential duplicates, we use the business context associated with the asset.
The utility checks for “similar” assets using glossary elements, such as business terms or metrics, that are linked to a data attribute (column) or report attribute, and generates a match score across assets of the same category. A configurable threshold can be set for the match score. When a group of assets has a match score greater than the threshold, the utility notifies the technical stewards and flags the group as potential duplicates. The potential duplicates can also be downloaded as an Excel sheet for offline review.
Additional configuration is available to restrict matching to the same source platform or to check the same asset class (e.g., report) across platforms.
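
The matching, threshold, and scoping behavior described above can be illustrated with a minimal sketch. It is not the utility's actual implementation: the scoring formula is not documented here, so Jaccard overlap of linked glossary terms is assumed as a stand-in, and the names `Asset`, `match_score`, `potential_duplicates`, and `same_platform_only` are hypothetical. The sketch also reports pairs rather than groups, for brevity.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Asset:
    """Hypothetical stand-in for a catalogued column or report attribute."""
    name: str
    category: str     # asset class, e.g. "report" or "column"
    platform: str     # source platform the asset belongs to
    terms: frozenset  # linked glossary elements (business terms, metrics)


def match_score(a: Asset, b: Asset) -> float:
    """Assumed scoring: Jaccard overlap of the linked glossary elements."""
    union = a.terms | b.terms
    if not union:
        return 0.0  # no business context on either asset, so nothing to compare
    return len(a.terms & b.terms) / len(union)


def potential_duplicates(assets, threshold=0.8, same_platform_only=False):
    """Return (name, name, score) for pairs scoring above the threshold.

    Only assets of the same category are compared. When same_platform_only
    is True, pairs that span different source platforms are skipped.
    """
    flagged = []
    for a, b in combinations(assets, 2):
        if a.category != b.category:
            continue
        if same_platform_only and a.platform != b.platform:
            continue
        score = match_score(a, b)
        if score > threshold:
            flagged.append((a.name, b.name, score))
    return flagged


# Illustrative data only; platform and asset names are made up.
catalog = [
    Asset("sales_summary", "report", "platform_a", frozenset({"Revenue", "Region"})),
    Asset("regional_sales", "report", "platform_b", frozenset({"Revenue", "Region"})),
    Asset("quarterly_sales", "report", "platform_a",
          frozenset({"Revenue", "Region", "Quarter"})),
    Asset("revenue_col", "column", "platform_a", frozenset({"Revenue", "Region"})),
]

cross_platform = potential_duplicates(catalog, threshold=0.6)
within_platform = potential_duplicates(catalog, threshold=0.6, same_platform_only=True)
```

With this sample data and a threshold of 0.6, the cross-platform run flags all three report pairs, while the same-platform run flags only the two reports on `platform_a`. The column asset is never compared with the reports because it belongs to a different category.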
