Data Quality

Identify and resolve data quality issues at scale

The Problem

Detecting data quality issues requires accounting for both known and unknown problems. Unfortunately, rule-based tools don’t work for the unknown unknowns: they are hard to maintain and manage, and you have to constantly tweak rules and learn from your mistakes.

Metadata observability tools are limited to monitoring a few KPIs like schema changes, which does not capture unknowns, and they only work on relational sources. And while developer tools may provide more accuracy, they are not usable by the other teams who know the most about the data’s context.

The Solution

Quickstart with 40+ pre-defined metrics

Telmai uses machine learning to scan all your datasets and generate a set of health metrics for each, including uniqueness, volume, distribution, pattern drifts, schema changes, and more. These metrics are automatically classified into data quality key performance indicators (KPIs) such as accuracy, validity, freshness, and completeness.
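To make the metrics concrete, here is a minimal sketch of how two of these column-level health metrics (completeness and uniqueness) can be computed; the function name and metric definitions are illustrative assumptions, not Telmai's implementation.

```python
def profile_column(values):
    # Completeness: share of non-null values; uniqueness: share of
    # distinct values among the non-null ones. (Illustrative definitions.)
    non_null = [v for v in values if v is not None]
    total = len(values)
    return {
        "completeness": len(non_null) / total if total else 0.0,
        "uniqueness": len(set(non_null)) / len(non_null) if non_null else 0.0,
    }

emails = ["a@x.com", "b@x.com", "a@x.com", None]
print(profile_column(emails))  # completeness 0.75, uniqueness ~0.667
```

In practice such per-column metrics are tracked over time, so that a sudden drop in completeness or uniqueness surfaces as a drift alert.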

ML and metadata driven metrics

Telmai uses both the data values themselves and the metadata to identify and predict your data quality issues. Telmai uses metadata to provide insights on volume, schema changes, and recent table updates, while using machine learning to detect trends, outliers, and drifts in the actual data values. Together, these two techniques give you a full picture of your data quality.
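As a simplified stand-in for the value-level anomaly detection described above (Telmai's actual models are not public), a z-score test on a metadata signal such as daily row counts shows the basic idea:

```python
from statistics import mean, stdev

def is_anomalous(history, latest, threshold=3.0):
    # Flag the latest observation if it deviates from the historical
    # mean by more than `threshold` standard deviations.
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > threshold

daily_rows = [1000, 1020, 980, 1010, 995, 1005]
print(is_anomalous(daily_rows, 1002))  # within normal range -> False
print(is_anomalous(daily_rows, 200))   # sudden volume drop -> True
```

A production system would additionally model seasonality and trend rather than assume a stationary baseline.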

Monitor complex data types

Gain end-to-end visibility into your entire data pipeline regardless of the type, form, shape, frequency, and volume of the data in it. Telmai monitors data of all types - from data warehouses and analytical databases to data lakes, semi-structured sources such as JSON, streaming data such as Kafka queues and pub/sub messages, and even data extracted from APIs.
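Profiling semi-structured sources like JSON typically starts by flattening nested records into dotted attribute paths, so each path can be monitored like a relational column. A minimal sketch (the function and path convention are illustrative assumptions):

```python
def flatten(record, prefix=""):
    # Recursively flatten a nested dict into {"a.b.c": value} pairs.
    flat = {}
    for key, value in record.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat

event = {"user": {"id": 7, "geo": {"country": "US"}}, "amount": 19.99}
print(flatten(event))
# {'user.id': 7, 'user.geo.country': 'US', 'amount': 19.99}
```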

Auto-remediation via API integration

Using Telmai’s REST APIs, you can automatically invoke a remediation flow. Telmai can also be integrated with pipeline and workflow orchestration systems such as Airflow to feed data monitoring outcomes back into your data pipelines. Possible workflow steps include:

- Reject bad data and create a help desk ticket for remediation by the upstream source
- Reject bad data and relaunch the previous pipeline step
- Accept bad data, then label it and remediate it in the future
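Inside an orchestrated pipeline task, the branching between those workflow steps might look like the sketch below; the outcome payload shape and step names are assumptions for illustration, not Telmai's documented API.

```python
def remediation_step(outcome):
    """Map a data-quality monitoring outcome to a workflow step.

    `outcome` is a hypothetical result payload, e.g.
    {"passed": False, "severity": "critical"}.
    """
    if not outcome["passed"]:
        if outcome.get("severity") == "critical":
            return "reject_and_ticket"    # reject data, open a help desk ticket
        return "reject_and_relaunch"      # reject data, rerun the previous step
    if outcome.get("labeled_for_review"):
        return "accept_and_label"         # accept now, remediate later
    return "continue"

print(remediation_step({"passed": False, "severity": "critical"}))
# reject_and_ticket
```

In Airflow, a function like this would typically drive a `BranchPythonOperator` that routes the DAG to the corresponding downstream task.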


Start your data observability today

Connect your data and start generating a baseline in less than 10 minutes. 

Telmai is a platform for data teams to proactively
detect and investigate anomalies in real time.
© 2022 Telm.ai. All rights reserved.