Data Is a Commodity (And to Be Clear ... That’s Not a Compliment)
The real advantage isn’t in the data you collect — it’s in the intelligence you engineer.
You probably don’t want to hear this. And many will argue the point. But it’s more true than most executives are willing to admit: data has become a commodity.
Everyone has it.
Especially in finance, where nearly every firm is working with the same datasets, just sliced differently, repackaged, or wrapped in slightly different ways. Supersets, subsets, derivatives … but fundamentally, the same raw material.
Yet the mantra persists: “But it’s gold!” So firms chase more data feeds, more exhaust, more licensing deals, convinced that more data means more edge. The reality? Data on its own has become undifferentiated, over-abundant, and over-hyped.
That’s the problem. Too many leaders are fixated on the data itself rather than what comes after. Being “AI ready” isn’t about hoarding cleaner data. It’s about building intelligence that goes well beyond the raw inputs. Because if your AI isn’t smarter than your data, you’re in for a world of hurt.
The Illusion of “Special Data”
Here’s the hard truth: collecting more data won’t make you unique.
Drowning in an ocean of data doesn’t make you differentiated. It makes you indistinguishable from everyone else who bought the same feeds and filed them away in their warehouse. Every firm insists their dataset is “special.” In practice? It looks a lot like their competitors’.
Everyone talks about signal-to-noise. But piling on more noise doesn’t magically increase the signal.
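A toy illustration of that point: if the “new” feeds are just repackaged versions of the same raw data, they carry the same noise, and averaging them together buys you nothing. Only genuinely independent observations improve the signal-to-noise ratio. A minimal sketch in Python (the signal, noise levels, and feed setup are all hypothetical, purely for illustration):

```python
import random
import statistics

random.seed(42)

TRUE_SIGNAL = 1.0   # the underlying quantity every feed is measuring
N_SAMPLES = 10_000

def snr(samples):
    """Signal-to-noise ratio: mean of the estimate over its std deviation."""
    return statistics.mean(samples) / statistics.stdev(samples)

single, correlated, independent = [], [], []
for _ in range(N_SAMPLES):
    shared_noise = random.gauss(0, 0.5)       # noise baked into the raw data
    feed_a = TRUE_SIGNAL + shared_noise       # one licensed feed

    # Four "new" feeds that repackage the same raw data: identical noise.
    repackaged = [TRUE_SIGNAL + shared_noise for _ in range(4)]
    # Four genuinely independent observations: fresh noise each time.
    fresh = [TRUE_SIGNAL + random.gauss(0, 0.5) for _ in range(4)]

    single.append(feed_a)
    correlated.append(statistics.mean([feed_a] + repackaged))
    independent.append(statistics.mean([feed_a] + fresh))

print(f"one feed:               SNR ~ {snr(single):.1f}")
print(f"five repackaged feeds:  SNR ~ {snr(correlated):.1f}")   # no better
print(f"five independent feeds: SNR ~ {snr(independent):.1f}")  # ~sqrt(5) better
```

Averaging five repackaged feeds yields exactly the same SNR as one feed; five independent feeds improve it by roughly the square root of five. More of the same data is not more signal.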
Special doesn’t come from the raw material. It comes from what you do with it: the methods, frameworks, and intelligence that transform baseline data into insight. The real advantage lies in the ability to:
See around corners.
Develop strategies others can’t.
Generate credible predictions.
Stress-test “what if” scenarios faster than the competition.
That’s where firms separate themselves — not in owning the ore, but in refining it.
What About the Bloombergs of the World?
Bloomberg is often treated like the exception. The company with “special” data. And to be fair, even I’m impressed with the engine they’ve built. But let’s not kid ourselves: what Bloomberg sells isn’t data.
The real value is in the insights — the context, analysis, and narratives layered on top of the same underlying numbers everyone else has. In this sense, Bloomberg’s data is not unique. What’s unique is the human intelligence machine they’ve assembled to transform that commodity into gold in the form of articles, commentary, predictions and services that feel indispensable to decision-makers.
That’s the lesson too many firms still miss. Bloomberg didn’t win because they had “better” data. They won because they knew how to manufacture value at scale from data everyone else also had.
Human machine vs AI machine for signal-grade insights.
If you’re still trying to compete on raw data, you’ve already lost. Bloomberg cracked that code decades ago. Your only shot is to build the intelligence layer that turns adequate inputs into an unfair advantage. Anything else is just being an also-ran.
Why Chasing the Foundation Model Playbook Is a Trap
Too many CIOs and CTOs are benchmarking themselves against the wrong competitors. They look at OpenAI, Anthropic, or xAI and assume the game is about stockpiling data to feed the models. But those companies are in an entirely different business. They build general-purpose models that need “planetary-scale” training data.
You don’t need to replicate that approach. In fact, you shouldn’t. For most enterprises, the smart play is leveraging those general-purpose models selectively, paired with disciplined retrieval-augmented generation (RAG) pipelines, domain-specific orchestration, and robust observability.
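To make the RAG point concrete, here is a minimal sketch of the retrieval step in such a pipeline. The corpus, the notes, and the term-overlap scoring are all toy placeholders (a real pipeline would use embeddings, a vector store, and an actual model call), but the shape is the same: pull in only the relevant slice of your own data, then ground the general-purpose model in it.

```python
# Hypothetical domain corpus: a few internal research notes (made up).
CORPUS = {
    "note-01": "Q3 credit spreads widened on regional bank exposure.",
    "note-02": "FX desk flags unusual yen volatility ahead of BOJ meeting.",
    "note-03": "Credit team sees spread compression in investment grade.",
}

def tokenize(text: str) -> set[str]:
    """Lowercase word set, stripped of trailing punctuation."""
    return {w.strip(".,?!").lower() for w in text.split()}

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank documents by simple term overlap with the query."""
    q = tokenize(query)
    scored = sorted(CORPUS.items(),
                    key=lambda kv: len(q & tokenize(kv[1])),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

def build_prompt(query: str) -> str:
    """Ground the model's answer in the retrieved context only."""
    context = "\n".join(CORPUS[d] for d in retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("What is happening with credit spreads?")
print(prompt)
```

The differentiation lives in `retrieve` and `build_prompt`, not in the model: the orchestration decides which slice of your data reaches the model and how the answer is constrained, which is exactly the intelligence layer the foundation model vendors don’t sell you.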
Even the foundation model vendors are hitting diminishing returns on “planetary-scale”. The industry is already learning that more data isn’t always better. Beyond a certain point, it becomes noise. Overfitting, bias amplification, and poor generalization creep in. The “more is better” mindset is not just expensive — it’s dangerous.
Stop Fixating on Data and Fixate on the Problem
The fixation on data has become absurd.
Here’s the pivot executives and investors should be looking for:
Start with the problem. What is the actual decision, workflow, or prediction you’re trying to enable?
Engineer intelligence for the solution. The architecture, not the dataset, is the differentiator.
Pull in the necessary data. Not all of it. Not endlessly more of it. The right signal-grade data.
Enterprises that keep chasing the “data advantage” narrative will end up overspending and under-delivering. The ones who align intelligence with outcomes and use data as fuel, not fetish, will actually win.
Diversity in data is far more valuable than quantity.
The “More Than Adequate” Lesson
At Charli, we get asked constantly about our data.
Where does it come from?
How extensive is it?
How deep does it go?
I usually give the same answer I once heard in a Rolls-Royce video, when someone asked about horsepower and torque on their vehicles: “More than adequate.”
Because those who obsess over the individual specs are missing the entire point. Rolls-Royce never got caught up in a horsepower or torque arms race because they understood something deeper: the real experience wasn’t in the numbers, it was in the craftsmanship. The seamless, magic-carpet ride. The engineering precision you feel, not quantify.
Likewise, your edge won’t come from bragging about whose dataset has more horsepower. It comes from how you engineer intelligence. The science, architecture, and design that converts “more than adequate” inputs into signal-grade outcomes.
At Charli, we didn’t chase data quantity. We focused on generating insights that actually help customers see around corners. That’s where the alpha lives — not in the data, but in what you do with it.
As we argued in The Data Abyss article, it’s not the dataset; it’s the questions you ask, how you ask them, and the signal-grade insights you generate.
The Real Equation
So why do companies still fight tooth and nail over licensing and hoarding more data? Because it’s easier than asking the harder question: what problem are we really trying to solve? For CIOs, CTOs, and boards, that question should be front and center because the obsession with data often masks deeper structural issues in strategy and execution.
Stop trying to mimic the foundation model giants. That game is theirs, not yours. They’re in the commodity business of vacuuming up oceans of data to brute-force train general-purpose models. Enterprises aren’t. And chasing their playbook is a fast track to wasted budgets and stalled initiatives.
Yes, you need raw materials. But don’t confuse the ore for the finished product. The real value comes from refining it: transforming noise into signal, building proprietary insights, and compounding them into a durable advantage.
That’s where competitive moats are built.
As an end note … and if you are interested … the topic of “baby models” is one area where your own data can offer a durable and compounding optimization advantage.