Data Observability and Metadata Observability – Same Problem, Different Solutions
What do you think the two shapes above have in common?
Would you have guessed that they have the same perimeter? Well – they do!
The perimeters of these shapes are the same, but they are not the same shape: they don’t have the same dimensions, their angles are not the same size, and their colors are different. In fact, one of them has a very small dot embedded inside it, barely noticeable at first glance.
This is exactly what Data Observability and Metadata Observability have in common, and also where they differ.
Data Observability is the degree of visibility you have into the data at any given point. It knows exactly what goes on inside the data: its state, shape, form, values, uniqueness, and the changes it has seen over time.
This full picture of the data does not and will not come from its metadata.
Metadata, being data about data, doesn’t exactly know what is inside. It only infers information about the data from a limited set of facts. Any SQL database readily provides this information about its tables and federated views: for example, the number of rows in a table, the last time it was updated, the range or min/max values in its various columns, its primary key, and whether the table saw a schema change, such as columns that were dropped or added.
In the analogy above, Metadata Observability gives us only the perimeter of the data.
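To make the "perimeter" idea concrete, here is a minimal sketch of collecting table-level metadata with Python's built-in `sqlite3` module. The `orders` table, its columns, and the `collect_table_metadata` helper are hypothetical and exist only for illustration; a real warehouse would expose similar facts through its own catalog or information schema.

```python
import sqlite3

def collect_table_metadata(conn, table, numeric_column):
    """Gather basic metadata about a table without judging individual values."""
    cur = conn.cursor()
    # PRAGMA table_info returns one row per column:
    # (cid, name, type, notnull, dflt_value, pk)
    columns = cur.execute(f"PRAGMA table_info({table})").fetchall()
    row_count = cur.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    min_value, max_value = cur.execute(
        f"SELECT MIN({numeric_column}), MAX({numeric_column}) FROM {table}"
    ).fetchone()
    return {
        "row_count": row_count,
        "columns": [c[1] for c in columns],
        "primary_key": [c[1] for c in columns if c[5] > 0],
        "min": min_value,
        "max": max_value,
    }

# Hypothetical table used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, country TEXT)"
)
conn.executemany(
    "INSERT INTO orders (amount, country) VALUES (?, ?)",
    [(19.99, "US"), (5.00, "DE"), (42.50, "??")],
)

metadata = collect_table_metadata(conn, "orders", "amount")
print(metadata)
```

Note that the third row carries a garbage country code (`"??"`), yet every metadata figure collected here looks perfectly healthy. That is the small dot inside the shape that the perimeter alone does not reveal.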
Let’s look at an example. Would a table that was updated in the last hour indicate that its data is fresh and reliable? What is the barometer to determine freshness? Is it only a timestamp? Maybe, maybe not. What if I tell you that just a few rows were updated and not the whole table? What if the table was updated but it collected some garbage? Is the data in the table fresh, and is it reliable?
While Metadata Observability looks at data about the data, Data Observability looks at the actual data itself and its values. It can validate the accuracy of the data. It can identify that the data has drifted during an update and that the number of anomalies has increased or decreased.
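The freshness question can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows, the one-hour freshness window, the valid range for `amount`, and both helper functions are hypothetical, and a real platform would apply far richer checks.

```python
from datetime import datetime, timedelta

NOW = datetime(2024, 1, 1, 12, 0)  # fixed "current time" so the example is reproducible

# Hypothetical table contents: two recent updates, one of which is garbage.
rows = [
    {"id": 1, "amount": 20.0, "updated_at": NOW - timedelta(minutes=10)},
    {"id": 2, "amount": 22.0, "updated_at": NOW - timedelta(days=3)},
    {"id": 3, "amount": 19.0, "updated_at": NOW - timedelta(days=3)},
    {"id": 4, "amount": None, "updated_at": NOW - timedelta(days=4)},
    {"id": 5, "amount": -9999.0, "updated_at": NOW - timedelta(minutes=5)},
]

def metadata_freshness(rows, now, window):
    """Metadata-style check: was the table touched at all within the window?"""
    last_update = max(r["updated_at"] for r in rows)
    return now - last_update <= window

def data_level_report(rows, now, window, valid_range):
    """Data-style check: how many rows are fresh, and which values are invalid?"""
    low, high = valid_range
    fresh = sum(1 for r in rows if now - r["updated_at"] <= window)
    invalid_ids = [
        r["id"]
        for r in rows
        if r["amount"] is None or not (low <= r["amount"] <= high)
    ]
    return {"fresh_fraction": fresh / len(rows), "invalid_ids": invalid_ids}

one_hour = timedelta(hours=1)
table_looks_fresh = metadata_freshness(rows, NOW, one_hour)
report = data_level_report(rows, NOW, one_hour, valid_range=(0.0, 1000.0))

print(table_looks_fresh)  # True: the last-updated timestamp is recent
print(report)             # only 2 of 5 rows are fresh; rows 4 and 5 hold invalid values
```

The timestamp alone says the table is fresh. Looking at the values shows that only a fraction of the rows were updated and that one of the recent updates brought in garbage.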
Additionally, metadata cannot be the observability gauge for complex datasets such as semi-structured sources, streaming data, data coming directly from an application, or data retrieved by APIs. These data sources do not conform to a data model and often lack proper metadata.
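As a rough illustration of what observing such sources might involve, the sketch below profiles a handful of JSON events that have no declared schema. The event payloads and the `profile_records` helper are hypothetical; the point is only that field presence and value types have to be derived from the records themselves.

```python
import json
from collections import Counter, defaultdict

def profile_records(lines):
    """Derive per-field presence rates and observed types from raw JSON lines."""
    field_counts = Counter()
    field_types = defaultdict(set)
    total = 0
    for line in lines:
        record = json.loads(line)
        total += 1
        for key, value in record.items():
            field_counts[key] += 1
            field_types[key].add(type(value).__name__)
    return {
        key: {
            "presence": field_counts[key] / total,
            "types": sorted(field_types[key]),
        }
        for key in field_counts
    }

# Hypothetical events, e.g. as they might arrive from an API or a stream.
events = [
    '{"user": "a1", "amount": 10.5}',
    '{"user": "b2", "amount": "12"}',
    '{"user": "c3"}',
    '{"user": "d4", "amount": 7.0}',
]

profile = profile_records(events)
print(profile)
```

Here `amount` is missing from one event and arrives as a string in another, which is the kind of issue no table-level metadata would surface because there is no table to describe.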
While both types of observability platforms have their own use cases, a clear understanding of the differences between Data Observability and Metadata Observability helps in choosing the right tool for the right use case and setting the right expectations. And some platforms like Telmai actually offer both.
|Data Observability|Metadata Observability|
|When you need to monitor all your data across every stage of the pipeline.|When you only want to monitor your data warehouse and other structured databases.|
|When data accuracy and validity are critical to your business.|When a peripheral view of data recency and volume levels gives you peace of mind.|
|When you need to monitor the actual data values to detect anomalies and drifts in your data.|When you can rely on metadata updates alone and only need to see schema changes in your operational data stores.|