Metadata Management: 1+1 = 3

Continuing from the earlier blog on why metadata management cannot be an afterthought, let us get a quick 101 intro to the metadata types before we look at how they are managed.

Business metadata captures the business definitions for each of the data elements in the analytical platform, along with optional elements such as the business policies, standards and rules that govern those data elements, for example the range of valid values and the data quality rules. It is primarily used by the business to ensure there is a common vocabulary of data elements across the platform. Take the typical example of a ‘customer’, which means different things to different groups: for sales, a customer is somebody who has bought a product; for marketing, it is a prospect; for customer service, it is somebody who is using the product. It could include active customers and inactive customers. It could include individuals, organizations and so on. So imagine a platform that does not define what a specific data element is, yet we use that element to analyze the values it contains!
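To make this concrete, here is a minimal sketch of what one business metadata entry could look like. The class name `BusinessTerm`, its fields and the `customer_status` example are all hypothetical, chosen only for illustration; a real glossary tool would have its own model.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class BusinessTerm:
    """One glossary entry: a business definition plus optional governing rules.

    Hypothetical structure for illustration, not a product's data model.
    """
    name: str
    definition: str
    owner: str  # business group that owns this definition
    valid_values: frozenset[str] = frozenset()  # empty means "not restricted"
    quality_rules: dict[str, Callable[[str], bool]] = field(default_factory=dict)

    def violations(self, value: str) -> list[str]:
        """Return the names of the rules that `value` breaks."""
        broken = []
        if self.valid_values and value not in self.valid_values:
            broken.append("valid_values")
        for rule_name, check in self.quality_rules.items():
            if not check(value):
                broken.append(rule_name)
        return broken


# Assumed example term: the status of a customer, restricted to two values.
customer_status = BusinessTerm(
    name="customer_status",
    definition="Whether the customer currently uses the product",
    owner="customer service",
    valid_values=frozenset({"active", "inactive"}),
    quality_rules={"not_blank": lambda v: bool(v.strip())},
)
```

Calling `customer_status.violations("dormant")` would report that the value falls outside the agreed range, which is exactly the kind of check a shared definition makes possible.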

Technical metadata ranges from the attribute/column descriptions and database constraints to the detailed derivations and calculations on the data element that are built into the various ETL and reporting/analytical tools. We need to chain the metadata across all the data processing and storage layers of the platform so that we can trace the lineage of a data element back to its system-of-record.
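As a rough sketch of what chaining buys us, suppose each layer records which upstream elements a data element is derived from. The element names and the `LINEAGE` mapping below are invented for illustration; in practice this information would come from the metadata repositories of the ETL and reporting tools.

```python
# Hypothetical technical metadata: each element maps to the upstream elements
# it is derived from, plus a description of the derivation applied at that hop.
LINEAGE = {
    "report.revenue_by_region": (
        ["mart.sales_fact.net_amount"], "SUM grouped by region"),
    "mart.sales_fact.net_amount": (
        ["staging.orders.gross_amount", "staging.orders.discount"],
        "gross_amount - discount"),
    "staging.orders.gross_amount": (["crm.orders.amount"], "copied as-is"),
    "staging.orders.discount": (["crm.orders.discount"], "copied as-is"),
}


def systems_of_record(element, lineage=LINEAGE):
    """Walk upstream until elements with no recorded source are reached."""
    seen = set()
    roots = set()
    stack = [element]
    while stack:
        current = stack.pop()
        if current in seen:  # guard against cycles and repeated paths
            continue
        seen.add(current)
        sources, _derivation = lineage.get(current, ([], None))
        if not sources:
            roots.add(current)  # nothing upstream: treat as system-of-record
        stack.extend(sources)
    return sorted(roots)


origins = systems_of_record("report.revenue_by_region")
```

Here `origins` holds the two assumed CRM columns. If any layer in between failed to record its sources, the walk would stop early at that layer, which is why the chain has to be complete.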

Operational metadata captures the metrics around the operation of the analytical platform, such as the data load job start/end times, execution logs, resource utilization and so on. This helps us understand whether the platform is performing well enough to deliver results as per the SLAs, or whether it needs additional support to meet them. We need to make the necessary provisions within the platform to log this metadata.
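One possible way to make such a provision is to wrap every load job so that its start/end times and outcome are always recorded. The sketch below assumes an in-memory list, `RUN_LOG`, as a stand-in for an operational metadata table; `tracked_job` and the SLA threshold are hypothetical names and values.

```python
import time
from contextlib import contextmanager
from datetime import datetime, timezone

RUN_LOG = []  # stand-in for an operational metadata table (assumption)


@contextmanager
def tracked_job(job_name, sla_seconds, log=RUN_LOG):
    """Record start/end time, status and SLA compliance for one job run."""
    record = {
        "job": job_name,
        "start": datetime.now(timezone.utc),
        "status": "running",
    }
    started = time.perf_counter()
    try:
        yield record  # the job may attach its own metrics to the record
        record["status"] = "succeeded"
    except Exception as exc:
        record["status"] = "failed"
        record["error"] = repr(exc)
        raise  # the failure is logged, not swallowed
    finally:
        record["end"] = datetime.now(timezone.utc)
        record["duration_s"] = time.perf_counter() - started
        record["within_sla"] = (
            record["status"] == "succeeded"
            and record["duration_s"] <= sla_seconds
        )
        log.append(record)


# Assumed example job with a 60-second SLA.
with tracked_job("load_customers", sla_seconds=60) as run:
    run["rows_loaded"] = 3
```

Because the record is written in the `finally` clause, failed runs are logged as well, and those are usually the runs we most want to see when an SLA is missed.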

Each of these metadata types has a purpose and a value. However, when they are brought together into a single platform, their combined value multiplies.

When we are looking at such an integrated platform for metadata, what does the platform need to support?
(A) Enabling the capture of the metadata. At a minimum, it needs a workflow to create and modify the metadata. Business metadata requires a lot more human intervention to capture than technical and operational metadata. The ETL and reporting/analytical products have the capability to capture the latter automatically. It is more for us to ensure that the capture is enabled and that the captured metadata is leveraged.
(B) Storage of the metadata along with its versions. Apart from the built-in metadata repositories of the ETL and reporting products that store technical metadata, we need a mechanism to store the rest of the metadata. This needs to support changes to the metadata along with their history, and a workflow for their creation and modification.
(C) Access to the stored metadata for reporting, impact analysis and so on. This is the interface through which both the business and technical users leverage the metadata for the value it brings. So this concerns aspects of access control and navigation.
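The storage requirement in (B) can be sketched as a store that never overwrites: every change appends a new version, so the history stays available. The class names and the in-memory dictionary below are assumptions made for illustration; approval workflow and access control are deliberately left out.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class MetadataVersion:
    """One immutable version of a metadata entry (hypothetical structure)."""
    version: int
    value: str
    changed_by: str
    changed_at: datetime


class VersionedMetadataStore:
    """Append-only store: saving never overwrites an earlier version."""

    def __init__(self):
        self._history = {}  # key -> list of MetadataVersion, oldest first

    def save(self, key, value, changed_by):
        versions = self._history.setdefault(key, [])
        entry = MetadataVersion(
            version=len(versions) + 1,
            value=value,
            changed_by=changed_by,
            changed_at=datetime.now(timezone.utc),
        )
        versions.append(entry)
        return entry

    def current(self, key):
        """Latest version; raises KeyError if the key was never saved."""
        return self._history[key][-1]

    def history(self, key):
        """Copy of all versions, so callers cannot alter the stored list."""
        return list(self._history.get(key, []))


# Assumed example: the definition of 'customer' is refined over time.
store = VersionedMetadataStore()
store.save("customer", "Somebody who has bought a product", "sales")
store.save("customer", "Somebody who has bought or is using a product",
           "data steward")
```

With this shape, `store.current("customer")` gives the agreed definition today, while `store.history("customer")` shows who changed it and when.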

In subsequent blogs, we will look at the ‘How’ aspects of a typical metadata management platform: the process, the people and the tools…

About the Author: Anand Govindarajan

Chief Data Architect
Email: anandg@lucidtechsol.com
Linkedin: http://in.linkedin.com/in/anandgovindarajan/