Data Validation Archives

Telmai Brings Autonomous-Ready Data Observability for the Agentic AI Era

Posted on October 27, 2025 by Anoop Gopalam

Introducing Telmai’s Data Reliability Agents

Telmai, the AI-powered data observability platform, today announced its Agentic offerings to make enterprise data truly Autonomous-Ready. These new capabilities ensure agentic AI workflows can communicate, decide, and execute actions on real-time trusted data with minimal human oversight.

Agentic AI significantly changes the requirements for how organizations manage their data and thus their data quality (DQ). Because Agentic AI requires low-latency and real-time access to validated data, it’s imperative that data quality happens right at the source, not downstream, where most companies focus their DQ efforts today.

But validation alone isn’t enough. AI agents also need to understand whether data is truly fit for purpose in the context of their actions. This involves delivering contextual information about data health as metadata into catalogs and semantic layers that AI agents can access.

Only when trust and context are combined can AI agents operate responsibly and enterprises deploy them with real confidence.

Telmai has the unique ability to continuously validate, monitor, and enrich data with quality signals at the lake and can push that data quality metadata for consumption by agents. This creates the trusted foundation that autonomous AI products need to operate reliably and at scale.

With Telmai’s latest product launch, AI agents can continuously access reliable data and the critical data quality context needed to automate downstream workflows.

Real-Time, Continuous, Agentic AI-Ready Data

Telmai’s Data Reliability Agents ensure continuous validation, context, and governance across open lakehouses

At the core of this update is the introduction of Telmai’s MCP-compliant server, which enables LLM-powered agents like Claude, Bedrock, or Vertex to query Telmai directly. Telmai continuously validates data, whether structured, semi-structured, or unstructured. Additionally, it generates comprehensive data quality metadata alongside the validated data, providing essential context on data health to ensure the data is reliable and AI-ready. Through the MCP layer, AI agents can access and retrieve validated data and metadata into their agentic workflows, eliminating the need for third-party transformations or complex workarounds.
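To make the MCP interaction above more concrete, here is a minimal sketch of the message an agent would send. The Model Context Protocol uses JSON-RPC 2.0, and tools are invoked with the `tools/call` method. The tool name `get_table_health` and its `table` argument are hypothetical placeholders; the announcement does not list the tools Telmai's server actually exposes.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# Hypothetical tool name and arguments, for illustration only.
message = build_tool_call("get_table_health", {"table": "sales.orders"})
print(message)
```

In practice an MCP client library would handle transport and session setup; the sketch only shows the shape of the request.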

“In the era of model commoditization, true competitive advantage will emerge from trustworthy, dynamic, and contextually aware data,” said Sanjeev Mohan, industry analyst and principal at SanjMo. “Telmai’s latest release is a big step in this process. It offers continuous validation and contextual metadata that enable AI agents to act responsibly, while reducing the operational debt that has long hindered enterprise adoption.”

Natural Language AI Assistants & Decentralized Data Trust

Building on this foundation, Telmai is introducing a suite of AI assistants called Data Reliability Agents accessible through natural language interfaces, enabling both technical and non-technical users to interact directly with the platform. This decentralization means that ownership of data reliability no longer sits solely with engineering, accelerating time to value by making platform management and critical data quality insights accessible and actionable to all relevant stakeholders.

Autonomous Detection and Remediation

Telmai’s Data Reliability Agents enable autonomous detection and resolution of data anomalies. These intelligent agents continuously monitor data pipelines for irregularities and provide clear, plain-language explanations of root causes. Complex data quality issues that once required deep technical expertise to identify and resolve can now be understood and addressed by both technical and business teams. Beyond detection, the Data Reliability Agents provide actionable recommendations and assist in generating data quality rules tailored to newly identified anomalies.

Furthermore, these Data Reliability Agents augment existing automated workflows, such as ticket creation and alert triggers, to help data teams proactively adapt and drive continuous improvement in their data quality processes.

This comprehensive approach closes the loop from detection through triage and remediation, ensuring that data being fed into the downstream processes is not only trustworthy but consistently ready for autonomous consumption and decision-making.

“As AI agents take the reins of decision-making, we believe autonomy should never come at the cost of reliability,” said Mona Rakibe, Co-founder & CEO of Telmai. “With these updates, Telmai is laying the groundwork for true intelligent automation and allowing enterprise data teams to shift their focus to driving measurable business value via Agentic AI.”

To learn more about Telmai’s Data Reliability Agents, request early access today.

Want to stay ahead on best practices and product insights? Click here to subscribe to our newsletter for expert guidance on building reliable, AI-ready data pipelines.

Telmai + Atlan unify trust and context to scale autonomous enterprise AI systems

Posted on October 3, 2025 (updated April 21, 2026) by Anoop Gopalam

With a surge in AI investment, enterprise leaders are under mounting pressure to deliver reliable, scalable AI solutions that create measurable business impact. A recent MIT study found that 95 percent of generative AI projects failed to produce measurable outcomes, as many organizations struggle to move beyond experimentation and into reliable execution. AI pilots are failing to deliver not because the models don’t work, but because the data systems they are built on lack reliability and context.

Autonomous systems and AI agents act on data in microseconds, so there’s no time for late-stage downstream fixes where most companies focus their data quality efforts today. To power AI-native ecosystems at scale, organizations must build trust at the source as data is ingested and ensure that data quality metadata is pushed to data catalogs and metadata systems, allowing agents to evaluate fitness before consumption. This creates the trusted foundation that Autonomous AI products need to operate reliably and at scale.

Image source – The AI Value Chasm

By combining Telmai’s AI-first data quality platform with Atlan’s AI-native metadata and governance platform through Atlan’s App Framework, enterprises gain a seamless way to detect, resolve, and govern data issues directly within the tools their teams already use.

This article takes a closer look at how Telmai and Atlan together create a single fabric of trust and context that allows enterprises to move beyond pilots and scale AI responsibly.

Why is real-time validation at ingestion critical for reliable AI?

The failure point for most AI initiatives isn’t the model; it’s the underlying data pipeline feeding it. Business Intelligence (BI) is inherently deterministic and descriptive, working with structured historical data to explain what happened through predefined reports and dashboards. AI, in contrast, is non-deterministic and predictive. It consumes both structured and unstructured data to learn patterns, forecast outcomes, and make autonomous decisions.

The old adage “garbage in, garbage out” takes on far higher stakes here. AI and LLMs always identify patterns from the inputs they receive and derive insights without context or judgment. If those inputs are incomplete, drifting, or biased, the model confidently reproduces those flaws at scale.

Traditional data quality approaches were designed for a reporting world, where errors could be corrected after a dashboard broke or a KPI looked suspicious. AI-native workloads break this model entirely. AI and autonomous systems operate at machine speed, where thousands of micro-decisions are made every second. Waiting until the BI or monitoring layer to enforce quality is simply too late, as the damage has already propagated through to downstream business-critical applications.

That’s why ingestion-layer validation has become non-negotiable. Data quality must be ensured before agentic workflows can read the data: either in the data lake itself or before the data enters the lake through event streams, not after it has landed in the access layer. Reliability must be enforced as data is ingested into the lake, especially in open formats like Apache Iceberg and Delta Lake, where it can be profiled and validated before being published to production tables.
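The "validate before publishing to production tables" idea can be sketched as a small audit-then-publish gate. This is an in-memory illustration only: plain Python lists stand in for staging and production tables, and the check names and row fields are assumptions, not part of any Telmai, Iceberg, or Delta Lake API.

```python
from typing import Callable

Row = dict
Check = Callable[[Row], bool]


def audit_and_publish(staged: list, checks: dict, production: list) -> list:
    """Publish staged rows only if every row passes every check.

    Returns a list of (row_index, check_name) failures; an empty list
    means the whole batch was appended to `production`.
    """
    failures = [
        (index, name)
        for index, row in enumerate(staged)
        for name, check in checks.items()
        if not check(row)
    ]
    if not failures:
        production.extend(staged)
    return failures


# Hypothetical checks for an orders feed.
checks = {
    "order_id present": lambda r: r.get("order_id") is not None,
    "amount is non-negative": lambda r: isinstance(r.get("amount"), (int, float))
    and r["amount"] >= 0,
}

production = []
good_batch = [{"order_id": 1, "amount": 10.0}]
bad_batch = [{"order_id": None, "amount": 5.0}, {"order_id": 2, "amount": -1}]

good_failures = audit_and_publish(good_batch, checks, production)
bad_failures = audit_and_publish(bad_batch, checks, production)
print(good_failures, bad_failures, len(production))
```

Rejecting the whole batch keeps the sketch simple; a real pipeline might instead quarantine only the failing rows.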

This is exactly where Telmai comes in. Purpose-built for AI-first architectures, Telmai continuously monitors and validates data through your data pipeline, irrespective of volume or velocity. Telmai’s ML-driven and rule-based checks automatically detect anomalies, schema changes, and data drift before they impact production. Further, Telmai can publish data health KPIs into data catalogs and metadata systems like Atlan, enriching lineage and governance with real-time data quality context.
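As a rough illustration of what an automated volume check involves, the sketch below flags a daily row count that deviates strongly from recent history using a z-score. It is a simple statistical stand-in chosen for clarity, not a description of Telmai's models; the function name and the threshold of 3.0 are assumptions.

```python
from statistics import mean, stdev


def is_volume_anomaly(history: list, today: int, threshold: float = 3.0) -> bool:
    """Return True when today's row count is more than `threshold`
    sample standard deviations away from the historical mean."""
    mu = mean(history)
    sigma = stdev(history)  # requires at least two historical points
    if sigma == 0:
        # Perfectly flat history: any change at all is a deviation.
        return today != mu
    return abs(today - mu) / sigma > threshold


daily_row_counts = [1000, 1020, 980, 1010, 990]
print(is_volume_anomaly(daily_row_counts, 1005))  # within normal range
print(is_volume_anomaly(daily_row_counts, 400))   # sharp drop in volume
```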

“Agentic AI systems don’t have the luxury of waiting for late-stage fixes. They act in microseconds. That means trust must be built into data at ingestion, and that trust must travel with context across the enterprise,” said Mona Rakibe, Co-Founder and CEO of Telmai. “With the App Framework, Telmai and Atlan will give teams a trusted data layer ready to power applications and integrations that let AI move beyond pilots and deliver at scale.”

How Atlan extends this trust with context

Trust in data is only half the story. For enterprises to scale AI responsibly, trust must travel with context so every consumer, whether human, system, or AI agent, knows what the data means, where it came from, and how it can be used.

As part of Atlan’s new App Framework, Telmai is now integrated directly into Atlan’s Enterprise Context Layer. For the first time, enterprises can unify monitoring within the same foundation that powers column-level lineage, business-ready data products, AI governance, and policy & compliance monitoring.

With this integration, customers using Telmai and Atlan can:

Unify trust and context in the Metadata Lakehouse – Telmai’s data quality signals, such as freshness, anomaly detection, schema changes, and volume drift, are automatically surfaced inside Atlan, enhancing lineage and metadata with actionable insights that empower data consumers and AI agents alike.
Enable true interoperability for agentic AI – For agentic systems to truly scale, interoperability is critical. The tools and services that agents depend on, whether for validation, access, enrichment, or downstream action, must be accessible through a common layer. Atlan delivers this through its open Metadata Lakehouse by providing consistent, versioned context across raw ingestion data and curated data products, ensuring data fitness can be evaluated at every step.
Enforce policy and compliance at scale – With Telmai’s data quality metadata embedded in the context layer, data trust signals can flow downstream via column-level lineage and bidirectional tag management to other platforms like Databricks, Snowflake, or data access systems. When Data Quality issues are encountered, they can trigger automated governance workflows, ensuring policy compliance and reducing risk across autonomous AI pipelines.

“Enterprises can’t scale AI responsibly without a foundation of trust and context. Telmai brings real-time, ingestion-level validation, and Atlan serves as the context layer, ensuring that trust travels with context across every system, workflow, and AI agent,” said Marc Seifer, Head of Global Alliances at Atlan. “Together, Telmai and Atlan are enabling organizations to move beyond pilots and build AI systems that operate reliably, responsibly, and at scale.”

Get AI-Ready—Now

For enterprises to successfully transition AI pilots into production, they need real-time, low-latency access to validated data, along with metadata that carries context about its health, lineage, and governance. Without this foundation, AI agents operate blindly, lacking visibility into whether the data they consume is fit for use, where it originated, or whether they are authorized to access it.

Telmai and Atlan close this gap. Telmai continuously monitors and validates data in open table formats, such as Apache Iceberg and more, as it lands in the lake layer, detecting anomalies and data quality issues before they propagate downstream. It then generates rich observability metadata, which flows into Atlan’s Metadata Lakehouse. There, these signals are combined with lineage, policies, and business glossaries, providing a complete picture of data health and context for both humans and AI agents.
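On the consuming side, an agent could use such metadata as a gate before reading a table. The following is a minimal sketch of that idea; the `HealthMetadata` fields, the thresholds, and the function name are hypothetical and do not reflect Atlan's or Telmai's actual metadata schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class HealthMetadata:
    """Hypothetical data quality metadata attached to a table in a catalog."""
    table: str
    last_validated: datetime
    completeness: float   # fraction of required values that are non-null
    open_anomalies: int   # unresolved anomalies reported for the table


def is_fit_for_use(
    meta: HealthMetadata,
    now: datetime,
    max_age: timedelta = timedelta(hours=1),
    min_completeness: float = 0.99,
) -> bool:
    """Accept a table only if it was validated recently, is complete
    enough, and has no unresolved anomalies."""
    fresh = now - meta.last_validated <= max_age
    return fresh and meta.completeness >= min_completeness and meta.open_anomalies == 0


now = datetime(2025, 10, 3, 12, 0, tzinfo=timezone.utc)
healthy = HealthMetadata("sales.orders", now - timedelta(minutes=5), 0.998, 0)
stale = HealthMetadata("sales.returns", now - timedelta(hours=6), 0.998, 0)

print(is_fit_for_use(healthy, now))  # True
print(is_fit_for_use(stale, now))    # False
```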

By bringing Telmai’s data quality signals into Atlan’s Enterprise Context Layer, enterprises can now drive measurable impact from their AI implementations, with a reliability and context fabric that enables AI to scale responsibly.

Want to learn how Telmai and Atlan can work together to make your existing data infrastructure AI-ready? Click here to connect with our team for a personalized demo.


Driving Reliable Graph Analytics on your Open Lakehouse Data with Telmai and PuppyGraph

Posted on August 11, 2025 by Anoop Gopalam

As enterprise data ecosystems expand, data pipelines have become increasingly distributed and heterogeneous. Critical business information streams in from numerous sources, landing in diverse cloud systems such as data warehouses and lakehouses. With the rapid adoption of open table formats like Apache Iceberg and Delta Lake, extracting meaningful context from this complex data landscape has grown more challenging than ever.

Graph databases are emerging as a vital tool for enterprises aiming to surface nuanced insights hidden within vast, interconnected datasets. Yet, the reliability of any graph model hinges on the quality of its underlying data. Without clean, trusted data, even the most advanced graph engines can produce misleading insights.

This article examines why graph modeling on open lakehouses demands a heightened focus on data quality and how the combined strengths of Telmai and PuppyGraph deliver a robust, transparent, and scalable solution to ensure your knowledge graphs stand on a foundation of clean, reliable data.

What is a Graph Database?

A graph database models data as nodes and edges rather than rows and tables. This structure makes it well suited to querying complex relationships in areas such as customer behavior, fraud detection, supply chain optimization, and social networks.

Graph databases use nodes (representing entities) and edges (representing their relationships) to reflect real-world connections naturally. This model is especially powerful for answering questions like:

How are customers, products, and transactions interrelated?
Which suppliers, shipments, or touchpoints form a risk-prone chain?
What are the shortest paths or networks among organizational entities?
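To illustrate the last kind of question, here is a small self-contained sketch of a shortest-path query over nodes and edges using breadth-first search. The node labels are invented for the example, and a graph engine would answer this through its query language (for instance Gremlin or Cypher) rather than hand-written traversal code.

```python
from collections import deque


def shortest_path(edges, start, goal):
    """Breadth-first search over an undirected graph given as (a, b) pairs.

    Returns the list of nodes on a shortest path, or None if no path exists.
    """
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk back to the start
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


edges = [
    ("customer:alice", "order:1"),
    ("order:1", "product:lamp"),
    ("customer:bob", "order:2"),
    ("order:2", "product:lamp"),
]
path = shortest_path(edges, "customer:alice", "customer:bob")
print(path)
```

Here the two customers turn out to be connected through a product they both ordered, which is the kind of relationship that is awkward to express as a chain of table joins.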

There are two main types of graph databases:

RDF (Resource Description Framework): A W3C standard that represents data as subject-predicate-object triples, commonly used for semantic web applications.
Property graphs: More flexible, allowing arbitrary attributes on both nodes and edges, making them intuitive for a wide range of use cases.

PuppyGraph is a high-performance property graph engine that lets you query structured data as a graph, making it easy to uncover relationships and patterns without moving your data into a separate graph database. It supports Gremlin and Cypher query languages, integrates directly with tabular data sources like Iceberg, and avoids the heavyweight infrastructure typical of traditional graph databases.

Graph Power Without Data Migration

Historically, running advanced analytics meant extracting data from storage and loading it into tightly coupled, often proprietary platforms. This process was slow, risky, and led to fragmentation and vendor lock-in.

Modern compute engines like PuppyGraph break this pattern by enabling direct, in-place graph querying over data in object storage. This creates a centralized source of truth while maintaining architectural flexibility, reducing complexity, and preserving data integrity, future-proofing your analytics stack.

Why Data Quality Must Be Built Into Your Graph Pipeline

Modern lakehouses built on open table formats like Apache Iceberg or Delta Lake promise agility, scale, and interoperability for enterprise data. However, their very openness can mask a new breed of data quality issues that quietly erode the value of downstream analytics, especially in graph modeling.

Key data quality issues common in open table formats and distributed pipelines that could affect graph modeling include:

Schema drift and type inconsistencies: Data evolving over time may introduce mixed data types or missing columns, breaking parsing logic and causing graph construction failures or unexpected node/edge omissions.

Null or missing foreign keys: Missing references between tables can create orphaned nodes or broken edges, fragmenting the graph and skewing relationship metrics.

Inconsistent or mixed timestamp formats: Time-based event relationships rely on accurate event sequencing. Mixed formats disrupt these sequences, making time-based graph queries unreliable.

Out-of-range or anomalous values: Erroneous measurements or outliers can bias graph algorithms, for example by inflating edge weights or misrepresenting geospatial relationships.

Duplicate or partial records: These create redundancy and fragmentation, inflating graph size and complicating pattern detection.

Referential mismatches across distributed datasets: Inaccurate joins lead to false or missing relationships, diluting the reliability of graph analytics.

The distributed and heterogeneous nature of lakehouse pipelines amplifies these challenges, as data flows through multiple ingestion points and transformations before reaching the graph layer. Without systematic, automated data quality validation before graph modeling, these hidden errors remain undetected—leading to delayed insights, costly rework, and even production outages.
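A check for the foreign-key problems listed above can be sketched in a few lines. The table and column names (`orders`, `customers`, `customer_id`, `order_id`) are illustrative assumptions, and the data is held in plain Python lists rather than lakehouse tables.

```python
def find_broken_edges(orders: list, customers: list) -> dict:
    """Report orders whose customer reference cannot become a graph edge.

    `null_fk` lists orders with no customer_id at all; `dangling_fk` lists
    orders that point at a customer_id missing from the customers table.
    """
    known_customers = {c["customer_id"] for c in customers}
    null_fk = [o["order_id"] for o in orders if o.get("customer_id") is None]
    dangling_fk = [
        o["order_id"]
        for o in orders
        if o.get("customer_id") is not None
        and o["customer_id"] not in known_customers
    ]
    return {"null_fk": null_fk, "dangling_fk": dangling_fk}


customers = [{"customer_id": "c1"}]
orders = [
    {"order_id": "o1", "customer_id": "c1"},   # valid edge
    {"order_id": "o2", "customer_id": None},   # would become an orphaned node
    {"order_id": "o3", "customer_id": "c9"},   # would become a broken edge
]
report = find_broken_edges(orders, customers)
print(report)
```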

Embedding rigorous data quality checks early in the pipeline ensures that your graph analytics start from a clean, consistent, and trusted foundation. This is where the combined strengths of Telmai and PuppyGraph offer a breakthrough.

How Telmai and PuppyGraph Transform Raw Data into Trusted Graph Analytics

Enterprise analytics delivers true value only when the data relationships it relies on are accurate and transparent. Telmai and PuppyGraph offer an integrated solution that validates and models data in real time, ensuring every node, edge, and relationship is trustworthy. This unified approach enables teams to interpret complex datasets with clarity and agility.

Figure: Telmai & PuppyGraph Architecture

To bring the joint value of Telmai and PuppyGraph into sharp focus, let’s walk through a practical example using the Olist dataset — a publicly available e-commerce dataset rich with customer, order, product, and seller information.

In this dataset, we injected common data quality challenges such as:

Null foreign keys (e.g., missing customer_id or seller_id), which break the critical links between customers, orders, and products
Inconsistent timestamp formats: Mixing formats such as MM/DD/YYYY with ISO 8601 timestamps leads to unreliable temporal relationships. For graph analytics that rely on event sequencing, like tracking purchase funnels or supply chain timelines, this inconsistency results in erroneous ordering of events, skewed path analyses, and misleading temporal insights.
Out-of-range values, like unrealistic product weights that skew relationship weighting and analytics
Data type mismatches that lead to processing errors or dropped nodes during graph construction
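The effect of mixed timestamp formats on event ordering is easy to reproduce. The sketch below uses made-up events, not rows from the Olist dataset, and compares ordering by raw string with ordering after parsing; the `parse_mixed` helper is a hypothetical name and handles only the two formats mentioned above.

```python
from datetime import datetime


def parse_mixed(value: str) -> datetime:
    """Parse either an ISO 8601 timestamp or an MM/DD/YYYY date."""
    try:
        return datetime.fromisoformat(value)
    except ValueError:
        return datetime.strptime(value, "%m/%d/%Y")


events = [
    ("delivered", "03/05/2018"),            # MM/DD/YYYY
    ("purchased", "2018-02-27T10:15:00"),   # ISO 8601
    ("shipped", "2018-03-01T08:00:00"),     # ISO 8601
]

# Sorting the raw strings puts the delivery before the purchase.
naive_order = [name for name, ts in sorted(events, key=lambda e: e[1])]

# Sorting parsed timestamps restores the real sequence.
correct_order = [name for name, ts in sorted(events, key=lambda e: parse_mixed(e[1]))]

print(naive_order)    # ['delivered', 'purchased', 'shipped']
print(correct_order)  # ['purchased', 'shipped', 'delivered']
```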

If these issues remain undetected and uncorrected, the resulting graph will have broken edges, orphan nodes, and inaccurate relationship metrics, ultimately producing misleading insights and undermining trust in your analytics.

This is where Telmai plays a pivotal role. Before the data ever reaches the graph engine, Telmai performs comprehensive, full-fidelity data profiling and validation directly on the raw Iceberg tables in their native cloud storage location.

It automatically detects null keys, inconsistent formats, schema drift, and anomalous values, without resorting to sampling that might miss critical errors. Telmai surfaces these issues early, enabling data teams to correct or flag problematic data before graph modeling begins.

With this validated, clean data in place, PuppyGraph ingests the Iceberg datasets natively—eliminating the need for costly data migrations or fragile ETL processes. PuppyGraph then constructs accurate, high-performance property graphs that faithfully represent the true entity relationships and temporal sequences within your data.

Graph algorithms depend heavily on the correctness of edges and nodes to surface meaningful relationships, identify patterns, and detect anomalies. By integrating Telmai’s rigorous data quality validation with PuppyGraph’s flexible, in-place graph computation, organizations gain confidence that their knowledge graphs are built on solid ground. This ensures faster onboarding, fewer silent errors, and graph analytics that reliably power critical business applications—from customer journey analysis to fraud detection and supply chain optimization.

The old adage “garbage in, garbage out” holds especially true here: graphs built on noisy or inconsistent data risk misleading conclusions, operational disruptions, and lost business opportunities.

Conclusion

Together, Telmai and PuppyGraph offer a seamless, scalable solution that enables enterprises to build trustworthy knowledge graphs on top of open lakehouses. By integrating rigorous data validation with high-performance graph modeling, this joint solution offers the following key benefits:

Faster onboarding: Validated data minimizes back-and-forth between data engineers and graph modelers, speeding up time to value.
Fewer silent errors: Early detection prevents costly rework and avoids customer-facing problems caused by inaccurate graph outputs.
Smarter data products: Reliable, high-quality graphs enable more precise personalization, recommendations, and fraud detection—driving better business outcomes.

Ready to build trusted, scalable graphs? Click here to talk to our team and learn how to turn your lakehouse into a source of clean, reliable insights.
