
You built a data lake. You needed a river.


The core problem: Data lakes give AI semantic consistency but no ability to act. Without operational context, AI hallucinates. You need an ontology: semantic consistency that is executable, connected to your systems, and compounding with every action.


What do you want artificial intelligence (AI) to do for your telco? Apply an offer to a subscriber? Update a billing record? Trigger workflows in your operational platforms? All of the above?

Then don’t build a data lake. Because lakes can’t do that.

If the data is good enough to run your business today—to bill subscribers, provision services, report financials—it’s good enough for AI. The data lake isn’t a prerequisite. It’s an unnecessary side quest.

Lakes collect. Rivers FLOW.

When you build a data lake, you spend months extracting data from billing, customer relationship management (CRM), provisioning, and care systems. You clean it, transform it, deduplicate it, and load it into your new data repository. Two years and $20 million later, what do you have? Historical snapshots of data that was accurate the day it was copied over.

Is that why you built the lake? Partly. But to prepare for AI, some operators build data lakes to solve their semantic consistency problem. The thinking goes: since your operational systems lack consistent definitions, you need common ones to give AI the context it needs to stay on track and deliver trustworthy results. So the data lake becomes the solution. Normalize the data. Standardize the terms. Create consistency in a separate environment where you control the schemas.

It’s logical. It’s tempting. It’s also a trap.

Because once you achieve perfect semantic consistency in your data lake and you finally have a single source of truth with clean, standardized definitions…

Now what?

The dead end

You wanted to use that hard-won consistency to actually DO something, right? Grow revenue. Reduce cost. Run a better business.

But your data lake won’t help, because data lakes can’t take ACTION.

It’s true: when your AI analyzes the lake and identifies a high-churn-risk subscriber, what happens next? The AI can’t apply a retention offer in your billing system. It can’t update the CRM. It can’t trigger provisioning. It can’t do any of the things that would actually retain that subscriber.

Why? Because those actions happen in your operational systems—the same systems you extracted the data from. The lake has copies. It doesn’t have control.

That’s not AI transformation. That’s a very expensive recommendation engine—the same thing the analytics industry has been selling for decades: here’s what’s happening, good luck doing something about it. 

Dashboards don’t grow revenue. Actions do.

The irony is rich: you “prepared your data for AI” by disconnecting it from the systems where actions actually happen. You solved the semantic problem in a place where it doesn’t matter, while the operational systems—where things actually get done—remain as fragmented as they ever were.

A lake is a dead end. A river will take you places.

AI needs an ontology

Everyone asks: “How do I get my data ready for AI?” That’s the wrong question.

The right question: “How do I give AI the ability to act while making sure it doesn’t hallucinate?”

AI doesn’t hallucinate because models are bad. It hallucinates because it lacks operational context. That’s an architecture problem that data lakes can’t solve.

Ask the right question, and you’ll go somewhere completely different. Not to a lake, but to a river—a semantic layer that connects your systems, provides semantic consistency, AND gives AI the controls to operate through them.

AI needs to know what things CAN DO. It needs to know which operations are valid, what sequences are required, what happens when you apply a promotion to a subscriber with an expiring contract in a specific service tier.

Only an ontology can give you that—through executable semantics.

The Totogi ontology doesn’t just define “subscriber” consistently across systems. It encodes what “subscriber activation” means operationally—across billing, provisioning, and inventory. It knows how “rating” relates to “mediation,” “balances,” and “offers.” It understands which state transitions are valid and which create revenue leakage. And it can take action.
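
To make “executable semantics” concrete, here’s a minimal sketch in Python. Every name in it (the lifecycle states, the operations, the system labels) is invented for illustration and isn’t Totogi’s actual schema; the point is the shape: an ontology concept that records which state transitions are valid and which cross-system sequence of operations realizes each one.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class SubscriberState(Enum):
    """Illustrative lifecycle states; a real ontology carries many more."""
    PROSPECT = auto()
    ACTIVE = auto()
    SUSPENDED = auto()
    CHURNED = auto()


@dataclass(frozen=True)
class Operation:
    """An executable step, bound to the operational system that owns it."""
    name: str
    system: str  # e.g. "billing", "provisioning", "inventory"


@dataclass
class OntologyConcept:
    """A concept plus the operational logic attached to it."""
    name: str
    valid_transitions: dict[SubscriberState, set[SubscriberState]]
    # Which ordered operations realize a given transition.
    playbooks: dict[tuple[SubscriberState, SubscriberState], list[Operation]] = field(
        default_factory=dict
    )

    def can_transition(self, current: SubscriberState, target: SubscriberState) -> bool:
        return target in self.valid_transitions.get(current, set())

    def steps_for(self, current: SubscriberState, target: SubscriberState) -> list[Operation]:
        if not self.can_transition(current, target):
            raise ValueError(f"{current.name} -> {target.name} is not a valid transition")
        return self.playbooks[(current, target)]


subscriber = OntologyConcept(
    name="subscriber",
    valid_transitions={
        SubscriberState.PROSPECT: {SubscriberState.ACTIVE},
        SubscriberState.ACTIVE: {SubscriberState.SUSPENDED, SubscriberState.CHURNED},
        SubscriberState.SUSPENDED: {SubscriberState.ACTIVE, SubscriberState.CHURNED},
    },
    playbooks={
        (SubscriberState.PROSPECT, SubscriberState.ACTIVE): [
            Operation("create_billing_account", "billing"),
            Operation("allocate_sim", "inventory"),
            Operation("activate_service", "provisioning"),
        ],
    },
)

# "Subscriber activation" isn't just a definition: it's an ordered,
# cross-system sequence the AI can look up and execute.
for op in subscriber.steps_for(SubscriberState.PROSPECT, SubscriberState.ACTIVE):
    print(f"{op.system}: {op.name}")
```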

When AI queries the ontology, it doesn’t just retrieve definitions. It learns how your telco actually works. And because the ontology connects to your operational systems, AI can act on what it learns. “High-churn-risk subscriber with expiring contract” becomes a coordinated action: query eligible retention offers, validate margin constraints, check provisioning capacity, configure billing, update CRM. AI doesn’t write out the steps for a human to take. It actually executes the workflow.
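
Here’s what that coordinated action might look like in code. This is a hypothetical sketch: every function is a stand-in for an API call into an operational system, and all the names and offers are invented for illustration.

```python
# Hypothetical stand-ins for API calls into operational systems. In a real
# deployment each would hit a billing, finance, provisioning, or CRM endpoint.
def query_eligible_retention_offers(subscriber_id: str) -> list[str]:
    return ["10GB-bonus", "15pct-discount"]

def validate_margin(offer: str) -> bool:
    return offer != "15pct-discount"  # pretend finance rules reject the discount

def check_provisioning_capacity(subscriber_id: str, offer: str) -> bool:
    return True

def configure_billing(subscriber_id: str, offer: str) -> None:
    print(f"billing: applied {offer} to {subscriber_id}")

def update_crm(subscriber_id: str, note: str) -> None:
    print(f"crm: noted '{note}' on {subscriber_id}")


def retain_at_risk_subscriber(subscriber_id: str) -> str:
    """Run the retention workflow end to end instead of recommending it."""
    offers = [o for o in query_eligible_retention_offers(subscriber_id)
              if validate_margin(o)]
    if not offers:
        return "no offer clears margin constraints"
    offer = offers[0]
    if not check_provisioning_capacity(subscriber_id, offer):
        return "provisioning capacity blocked"
    configure_billing(subscriber_id, offer)
    update_crm(subscriber_id, f"retention offer '{offer}' applied")
    return f"applied '{offer}'"


print(retain_at_risk_subscriber("sub-1042"))
```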

And every action makes the ontology smarter. The outcome feeds back into the system: what worked, what didn’t, which offers converted, which sequences failed. Your competitors start from zero while your ontology compounds operational knowledge with every transaction.
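
A rough sketch of that feedback loop, again with invented names: every executed action reports its outcome back, and offer selection starts preferring what has actually converted.

```python
from collections import defaultdict

# Each executed action reports back; the ontology accumulates what worked.
conversion_stats: dict[str, dict[str, int]] = defaultdict(
    lambda: {"won": 0, "lost": 0}
)

def record_outcome(offer: str, retained: bool) -> None:
    conversion_stats[offer]["won" if retained else "lost"] += 1

def best_offer() -> str:
    """Prefer the offer with the highest observed conversion rate."""
    def rate(offer: str) -> float:
        stats = conversion_stats[offer]
        total = stats["won"] + stats["lost"]
        return stats["won"] / total if total else 0.0
    return max(conversion_stats, key=rate)

record_outcome("10GB-bonus", retained=True)
record_outcome("10GB-bonus", retained=False)
record_outcome("loyalty-tier-upgrade", retained=True)
print(best_offer())  # -> loyalty-tier-upgrade
```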

You don’t have to “clean” anything.

Let AI work directly with the actual, live, messy systems that run your business every day. Yes, they’re distributed across 50 vendors. Yes, the data is messy and the schemas don’t match.

Doesn’t matter.

Because this disorder is a semantic understanding problem, not a data quality problem. And semantic understanding doesn’t require moving your data anywhere. It requires giving AI the context to understand what your data means across systems—and the operational logic to act on it.
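
Here’s a toy illustration of that point: two systems keep their own field names and value formats, and a small semantic map reads them as one canonical view at query time, with nothing copied or cleaned. All the field names and codes are invented for this example.

```python
# Records stay in their source systems' native shapes.
billing_record = {"MSISDN": "27831234567", "acct_status": "A"}
crm_record = {"phone_number": "27831234567", "lifecycle": "active"}

# Only the mapping layer knows that billing's "MSISDN" and CRM's
# "phone_number" mean the same thing.
SEMANTIC_MAP = {
    "billing": {"msisdn": "MSISDN", "status": "acct_status"},
    "crm": {"msisdn": "phone_number", "status": "lifecycle"},
}

# Source-specific status codes resolved to one vocabulary.
STATUS_MEANING = {"A": "active", "S": "suspended", "active": "active"}

def canonical_view(system: str, record: dict) -> dict:
    """Read a source record through the semantic map; nothing moves."""
    fields = SEMANTIC_MAP[system]
    return {
        "msisdn": record[fields["msisdn"]],
        "status": STATUS_MEANING[record[fields["status"]]],
    }

# The same subscriber, seen consistently across both systems.
assert canonical_view("billing", billing_record) == canonical_view("crm", crm_record)
```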

Don’t waste two years building a dead-end lake. Instead, keep your data where it is, and spend that time adding understanding and building the flow that gives your AI the ability to act.

Build the river

At Totogi, we built the telco ontology on this principle: semantic understanding across your existing systems—without moving data anywhere. AI can query federated data through natural language, and more importantly, it can orchestrate actions across all your systems: business support systems (BSS), operations support systems (OSS), network, and everything in between.

No two-year lake project. No historical snapshots. No stagnant pools of data that can’t DO anything. Just AI that flows through your business the way your operations actually run.

Lakes collect. Rivers power. Build the river.

Frequently Asked Questions

1. How do I clean my data to prepare for an AI project in telecom?

You probably don’t need to. The assumption that AI requires cleaned, centralized data is what drives expensive data lake projects—but it’s solving the wrong problem. AI doesn’t need your data to be cleaner or moved to a new environment. It needs semantic understanding of what your data means across existing systems and the ability to act on it. If your data is running your business today—billing subscribers, provisioning services, handling support tickets—it’s already good enough for AI to work with directly through a semantic layer.

2. Do I need to clean my telecom data before implementing AI?

No. AI can work directly with your existing operational systems—even if they’re distributed across 50 vendors with inconsistent schemas and definitions. This is a semantic understanding problem, not a data quality problem. What AI needs is a semantic layer that provides context for what your data means across systems and the ability to call APIs that execute actions. Your live, messy data running your business today is exactly what AI should work with.

3. What should I build first for AI: a data lake or something else?

Build a semantic layer that connects your existing operational systems first. A data lake gives you historical consistency but can’t execute actions. It’s a dead end for operational AI. A semantic layer (like an ontology) gives AI the understanding to interpret your live data across billing, CRM, provisioning, and care systems, plus the ability to call APIs that make things happen. With an ontology, you can skip the two-year, $10-20 million lake project and get AI that can actually run operations, not just generate reports.

4. Why can’t a data lake enable AI to take action in telecom operations?

Data lakes provide semantic consistency for analysis, but they’re read-only historical snapshots. When AI identifies an opportunity—like a high-churn-risk subscriber—the lake can’t apply retention offers, update billing systems, or trigger provisioning workflows. AI would need to call APIs in your operational systems to make things happen, but the lake only holds copies of data, not the controls to execute changes. This creates a disconnect where insights require manual intervention across multiple systems to become actions.

5. What’s the difference between semantic consistency in a data lake versus an ontology?

A data lake achieves semantic consistency by standardizing how data is defined and stored—creating a unified view of what things ARE. An ontology goes further by encoding what things CAN DO. It captures operational logic: how subscriber activation works across billing and provisioning, which state transitions are valid, what sequences are required. This executable semantic understanding lets AI not just analyze patterns but orchestrate actual workflows across your live operational systems.

6. How much does building a data lake typically cost telecom operators?

Building a comprehensive data lake typically requires two years and $10-20 million in investment. This includes extracting data from billing, CRM, provisioning, and care systems, then cleaning, transforming, deduplicating, and loading it. But the good news is you probably don’t need one! Data lakes only have historical snapshots of your data; they can’t execute operational changes. If you want an AI system that can take action, you need an ontology, not a data lake.

7. Why do telecom companies think they need a data lake for AI?

Because operational systems have semantic inconsistencies—“subscriber” means different things in billing versus CRM, product hierarchies don’t match, data formats conflict. The data lake feels like the solution: normalize everything in one place where you control the schemas. But this solves semantic consistency in a read-only environment that can’t take action. The real need isn’t cleaner data in a separate system. It’s semantic understanding across your live operational systems where AI can actually execute workflows and drive business outcomes.