Clearbit Uses Telmai to Deliver Accurate Data to its Customers

Data Observability helps Clearbit bring reliable data to over 4.5 billion IP addresses without a need to increase engineering resources

Overview

With data being the product at Clearbit, and with large volumes of dynamic data that are changing constantly, Clearbit needed to create a scalable data quality solution that not only keeps their data engineering work contained, but also is capable of proving the value of their data to customers.

Key Benefits

Centralized visibility into key data quality KPIs like completeness, accuracy, validity, and freshness

Identifying and resolving data quality issues before they reach data consumers

Using ML/AI to detect unknown issues in 3rd party data

Metrics that measure data quality at the record level

Track data quality improvements over time

The challenge: large volumes of new data raise the stakes for a scalable data quality solution

With a strong data platform and a large and growing number of customers, Clearbit was absolutely clear about the importance of trusted data and had already created a data quality engine for its rich and curated data products.

However, with dozens of new data sources and a drive to experiment with new intelligent algorithms across hundreds of data attributes, they wanted a solution that could give them the agility to innovate and bring products to market faster. Every new change in data needed to be tested and measured quickly to understand its impact on quality before shipping. 

Clearbit didn’t just want a platform tracking problems and notifying data owners of anomalies. They were looking for a solution to build trust with their customers using data quality metrics.

The approach: treating data quality as a product became the north star for scale

To build a data quality function, Clearbit first embarked on hiring for a new role: a data quality product manager. Shortly after, they hired Alejandra (Ale) Cabrera, a product manager with a pedigree in data science.

In her newly appointed role, Ale started to define requirements for data quality as a product, like any other. Her key strategic goal was to ensure that changes to the data would follow a product life cycle - design, engineering, test, and user adoption. 

Ale’s background in data science and product management gave her the right skills and mindset to establish the data quality product initiative. However, she had to build that product with limited engineering and data science resources. Extending a home-grown rules engine to test and check every individual field at the record level was time-consuming and unscalable. To accelerate the work and to prioritize her team’s efficiency, Ale decided to look into data observability platforms.

"At Clearbit our data is always growing and evolving and being able to look at all data without the right tool is time consuming. Telmai allows us to detect what data might be at risk, so we can prioritize and decide where we need to focus our attention. We use Telmai to understand the data quality and also to prioritize the focus areas for improvement."
- Alejandra Cabrera, Data Product Manager, Clearbit

Evaluation: enlisting data observability to streamline data quality

To find a solution that could sift through the highly diverse and constantly changing data that Clearbit deals with on a daily basis, Ale and the team evaluated three vendors. Of the three, they selected Telmai because they needed to detect data quality issues at the record level.

Other solutions provided metadata monitoring, and metrics such as schema changes, numbers of unique records, and numbers of null values. Telmai provided those and also the ability to monitor and investigate issues at the record level to measure accuracy and detect anomalies and drifts in the data at a much more granular level. 
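The difference between metadata monitoring and record-level monitoring can be illustrated with a minimal sketch. This is not Telmai's API or implementation; the function names, the sample records, and the deliberately simple email pattern are all assumptions made for illustration.

```python
import re

# Deliberately simple, illustrative pattern; not a full email validator.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def metadata_metrics(records, field):
    """Table-level summary: row count, null count, and unique value count."""
    values = [record.get(field) for record in records]
    non_null = [value for value in values if value is not None]
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "unique": len(set(non_null)),
    }


def invalid_records(records, field, pattern):
    """Record-level check: return each record whose value fails the pattern."""
    return [
        record
        for record in records
        if record.get(field) is not None and not pattern.match(record[field])
    ]


# Hypothetical sample data.
records = [
    {"id": 1, "email": "ana@example.com"},
    {"id": 2, "email": "not-an-email"},
    {"id": 3, "email": None},
    {"id": 4, "email": "bo@example.org"},
]

summary = metadata_metrics(records, "email")
bad = invalid_records(records, "email", EMAIL_PATTERN)
```

In this sketch the summary counts alone would not reveal that record 2 holds a malformed value; only the record-level check points to the specific row that needs attention.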

With 48M company records, 389M contact records, 4.5 billion mapped IPs, and 94% accuracy in email deliverability, Clearbit had to select a platform that could stand behind their industry leadership in data quality.

In evaluating Telmai, Clearbit was able to sift and sort through their large body of data and determine the most critical components to focus on. Minutes after plugging in their data, Clearbit was able to see data quality KPIs across their entire dataset. There was no need to sample, because Telmai’s platform scaled to analyze the entirety of Clearbit’s data and provided the team with completeness, correctness, and volume metrics out of the box.

"The ability for Telmai to quickly pull data quality metrics by profiling and assessing all our data, without sampling and without any prior knowledge of our data, was impressive. We saw how quickly Telmai was able to add value to our team."
- Alejandra Cabrera, Data Product Manager, Clearbit

In addition to the summary health metrics in the first view, Clearbit was able to dig deeper and investigate their data values further. While the prior home-grown, rule-based solution was capable of checking for the types of issues they’d seen before, with Telmai, Clearbit was able to see their blind spots. 

For example, data consistency metrics ensure that no cities are mistakenly included in the Country field, or that for countries such as Canada, city names that are spelled both in French and in English, such as Montreal, are normalized and counted correctly.
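A consistency check of this kind can be sketched in a few lines. The example below is only an illustration under stated assumptions, not how Telmai or Clearbit implement it: the reference sets `KNOWN_COUNTRIES` and `KNOWN_CITIES`, the function names, and the sample records are hypothetical, and a real check would rely on a complete gazetteer.

```python
import unicodedata
from collections import Counter

# Hypothetical reference sets; a real check would use a full gazetteer.
KNOWN_COUNTRIES = {"canada", "france", "united states"}
KNOWN_CITIES = {"montreal", "paris", "toronto"}


def normalize(name):
    """Trim, strip accents, and lowercase so 'Montréal' matches 'Montreal'."""
    decomposed = unicodedata.normalize("NFKD", name.strip())
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_accents.lower()


def cities_in_country_field(records):
    """Return records whose country value looks like a city, not a country."""
    flagged = []
    for record in records:
        country = normalize(record.get("country") or "")
        if country in KNOWN_CITIES and country not in KNOWN_COUNTRIES:
            flagged.append(record)
    return flagged


def city_counts(records):
    """Count cities after normalization so spelling variants are merged."""
    return Counter(normalize(r["city"]) for r in records if r.get("city"))


# Hypothetical sample data.
records = [
    {"city": "Montréal", "country": "Canada"},
    {"city": "Montreal", "country": "Canada"},
    {"city": "Toronto", "country": "Paris"},  # a city in the country field
]

flagged = cities_in_country_field(records)
counts = city_counts(records)
```

Here the French and English spellings are counted as one city, and the third record is flagged because its country value matches a known city.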

The solution: data quality became the guiding star for creating new data features

Since the evaluation, Telmai has enabled Clearbit to treat any data-related change or feature request as a product and apply the same rigor and standards around usability, quality control, trust, and performance as any other product. 

Clearbit’s product manager, Ale, is able to investigate new data quality change requests with signals in the data, create a roadmap, and build a case to pitch a new solution, together with her data engineering and data science teams. 

Telmai Data Observability @ Clearbit

Centralized data observability helps to prioritize data engineering and data science work

With Telmai, Clearbit is able to prioritize their engineering and data science resources to tackle the most important initiatives that add critical value to their customers, and also identify new areas for further innovation. For example, with Telmai, Clearbit can quickly pinpoint the quality of attributes most frequently required by users. They can also build data quality KPIs calculated based on data segments most meaningful to their customers.
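As a rough illustration of a segment-based KPI, the sketch below computes completeness of one attribute per segment. It is an assumption-laden example rather than Clearbit's or Telmai's actual calculation: the function name, the `industry` and `employee_count` fields, and the sample records are hypothetical.

```python
from collections import defaultdict


def completeness_by_segment(records, segment_field, attribute):
    """Share of records in each segment where `attribute` is populated."""
    totals = defaultdict(int)
    filled = defaultdict(int)
    for record in records:
        segment = record.get(segment_field, "unknown")
        totals[segment] += 1
        if record.get(attribute) not in (None, ""):
            filled[segment] += 1
    return {segment: filled[segment] / totals[segment] for segment in totals}


# Hypothetical sample data.
records = [
    {"industry": "software", "employee_count": 120},
    {"industry": "software", "employee_count": None},
    {"industry": "retail", "employee_count": 45},
    {"industry": "retail", "employee_count": 300},
]

kpis = completeness_by_segment(records, "industry", "employee_count")
```

Splitting the metric by segment shows where an attribute is weakest for a given group of customers, which a single overall percentage would hide.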

The ability to pinpoint data quality issues correctly and guide the team in the right direction helps Clearbit reduce its time to detect and time to resolve data quality problems. With the growing number of new customers and the introduction of new products, this has been instrumental in scaling the team.

"At Clearbit, our data is growing every day. We use Telmai to monitor and quality check the data that is coming in, the data that goes through our proprietary algorithms, and the data we package for our customers. Telmai is in every step of our data pipeline to ensure that the data we deliver to our customers is credible and reliable."
- Harlow Ward, CTO, Clearbit

Data observability further proves the value of good data

For Clearbit, data quality is partly about catching bad data before it reaches the customer and partly about demonstrating to customers the great quality of the data that they provide. Clearbit uses Telmai to build trust and prove the value of their solution to customers. 

To take a specific scenario, the account enrichment product is subject to companies changing information all the time. Businesses close down, expand, grow, merge, and restructure their teams. With Telmai, Clearbit can show the accuracy and freshness of data so customers can rely on that data to better invest their sales and marketing efforts, and ultimately grow faster.

Clearbit is building a data quality scorecard with eight data quality dimensions to further expand their vision. These include Accuracy, Freshness/Timeliness, Completeness, Consistency, Integrity, Reasonability, Uniqueness, and Validity.

"Data quality is a problem of trust. To build that trust, you need good data quality KPIs, and for that you need the power of a really good tool. Telmai gives us that."
- Alejandra Cabrera, Data Product Manager, Clearbit

Why Telmai

In many ways, Clearbit views Telmai as a platform that turns their data into a movie production set. In a theater, you see only the part of the stage that is in front of you. On a movie set, you as the director have many cameras and shots available, giving you a view of the entire scene and the ability to drill into any corner and see it from every side.

By staging data in Telmai, Clearbit can see the high-level metrics and drill in to investigate data in every attribute and every record. They can also set up alerts and notifications for just the parts of the scene that are most important to them, and use business dashboards to roll up the most important data quality metrics into a high-level summary view.

In short, Clearbit selected Telmai as their data observability solution because:

  • First view into data quality KPIs as soon as the data is plugged in
  • Data quality at scale across millions of data records without sampling
  • Ability to drill down to see the root cause of the data quality issues
  • ML to detect unknown data patterns, anomalies and drifts in data from 3rd party sources
  • Ability to define custom expectations and alerting 
  • Visual investigative capabilities to analyze data drifts
  • Reduce time to detect and time to resolve data quality issues

Start your data observability today

Connect your data and start generating a baseline in less than 5 minutes. 

Telmai is a platform for data teams to proactively detect and investigate anomalies in real time.
© 2022 Telm.ai All rights reserved.