The best metaphor for real-time analytics

A precise definition is elusive. In cases like these, analogies are useful.
Alasdair Brown
DevRel Lead
Jul 29, 2022

Remember Frogger? That pixelated bit of arcade magic where you navigate your amphibious avatar across roads and rivers without getting smushed by passing cars or swept away by rushing water?

[Animation: the Frogger video game]
Go Frogger go!

As it turns out, Frogger is a perfect metaphor for real-time analytics. If you never played it, then grab some virtual quarters and spend 1 minute on this emulator. Otherwise, this might not make sense…

--

Now, close your eyes. Imagine that you’re Frogger. You’re looking across the treacherous highway. You see cars whiz by, you hear the rumble of motorcycles, you smell the exhaust from passing trucks. And you must get to the other side.

Here’s a question: What if there were a delay between the sights, sounds, and smells that you perceived and your ability to activate your hoppers to hop? Asked another way, what if the synapses of your central nervous system transmitted information from your senses to your muscles slower than the speed of a passing semi?

[Animation: semi trucks passing quickly on a highway]

Would you ever cross the road? 

Analytics is the synapse of business

For all the talk these days about business decisions being “data-driven,” you might be fooled into thinking that this is a modern phenomenon enabled by digital technology. It’s not. Data-driven decision-making has been around for at least as long as central nervous systems. We sense, we process, and we act. Just like Frogger. And for most humans, that process happens within a few milliseconds.

Broadly speaking, “Analytics” is the way that businesses sense, process, and act, ideally in a way that creates value. Businesses generate data, gain insight from that data, and connect those insights to the people and processes that operate the business so they can act upon them.

[Chart: the progression of analytics from data to insight to action]
The basic analytical process.

Most modern businesses generate new data constantly. And for most businesses, that data has a shelf life. The longer data sits unprocessed and not acted upon, the less valuable it becomes.

Of course, the rate at which its value declines depends on the speed of the business. A global online retailer selling thousands of articles of clothing a minute on Black Friday has very different data needs from a small, brick-and-mortar boutique in a rural town on a Tuesday morning. 

Frogger’s ability to safely cross the road depends on how many cars there are, how fast they’re going, and how quickly he can hop.

So what is real-time analytics?

Real-time analytics, then, is the synapse that connects high-speed data generation to processes and activities that keep the business alive and profitable. It keeps companies on top of the constant barrage of data that could be (and should be) informing their decisions, and it lets them act upon that data at peak value. It’s the system whereby Frogger nimbly springs from lane to lane, one hop ahead of his wheeled nemeses.

[Chart: the value of data decreases rapidly with time; real-time analytics attempts to capture that value as soon as the data is created]
For real-time applications, the value of data drops precipitously as time passes.

Said as simply as possible, real-time analytics is the ability to do something valuable with data as quickly as it is generated.

And for data-intensive businesses, this is a hard problem.

Why is real-time analytics hard?

In order to be valuable to businesses, real-time analytical architectures have to deal with three data attributes:

  1. Freshness,
  2. Latency, and
  3. Concurrency

Freshness is the delay between when data is created through a real-world event and when it is available to act upon. In SQL terms, you could think of freshness as the difference between the result of the `now()` function and the latest timestamp in your database.

Latency is how quickly you can request that data and get a result. It’s how long it takes your dashboard to load when you click refresh.

Concurrency is how many different clients want low-latency access to fresh data at the same time.
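To make the first two attributes concrete, here’s a minimal Python sketch. It’s illustrative only: the in-memory SQLite table stands in for whatever database you use, and the `events` table and its timestamps are hypothetical, not any particular product’s schema.

```python
import sqlite3
import time

# An in-memory SQLite table standing in for any analytical database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts REAL, payload TEXT)")

# Simulate an event that happened 2.5 seconds ago but was just ingested.
conn.execute("INSERT INTO events VALUES (?, ?)", (time.time() - 2.5, "click"))

# Freshness: now() minus the latest timestamp in the table.
(latest_ts,) = conn.execute("SELECT MAX(ts) FROM events").fetchone()
freshness_seconds = time.time() - latest_ts

# Latency: how long one query takes to come back.
start = time.perf_counter()
conn.execute("SELECT COUNT(*) FROM events").fetchone()
latency_seconds = time.perf_counter() - start

print(f"freshness: {freshness_seconds:.1f}s, latency: {latency_seconds * 1000:.2f}ms")
```

Concurrency doesn’t show up in a single-connection sketch like this: it’s the same low-latency, fresh-data measurement repeated across many simultaneous clients, which is exactly where most systems start to strain.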

Frankly, it’s really hard to build systems that do all three well. It’s like that sign you see at some hole-in-the-wall restaurants:

[Image: a restaurant sign]
Typical data architectures: Fast & fresh won't be concurrent. Concurrent & fast won't be fresh. Fresh & concurrent won't be fast.

To be able to ingest massive volumes of data being generated constantly, enrich and transform that data, and expose that data to applications and interfaces accessed by many concurrent users is still exceptionally difficult, and it’s not something that established data architectures were built for.

Data warehouses, in particular, were built to solve a particular problem: Complex, batch analysis over huge volumes of historical data. They do this very well. But when it comes to building low-latency, high-concurrency applications on top of these massive volumes of fresh data, they fail. They simply can’t keep up with real-time demands. There’s a technical reason for this, and I’ll cover it in a subsequent blog.

What’s next for real-time analytics?

Real-time analytics is certainly a growing space within the larger data ecosystem, but historically it has only been achievable for big players with deep pockets who can dedicate large teams to building and maintaining beefy infrastructure. Fortunately, that may be changing.

To begin with, serverless technologies, including DBaaS (Database as a Service) offerings, are becoming increasingly widespread. They’re developer-friendly and accelerate the ability to write differentiated code.

But the headache doesn’t stop at the database. Real-time analytics requires so much more than a scalable DBMS.

Here’s what’s missing:

First: Developer experience matters. There’s a very good chance that analytical databases will follow the path of general compute: The stuff under the hood will eventually become a commodity. When you’re engaged in any new data project, your ability to gain insight from your data relies on being able to understand and leverage the capabilities of the tools you’re using. You need to be able to get started easily, and you should never be held back as you progress. 

There are many good reasons people love Postgres more than MySQL. The most important one is UX.

Second, as I mentioned, the value of real-time analytics doesn’t end at “Insight” but rather at “Action.” Sure, you can run low-latency, high-concurrency queries on your database, but the only thing you’ll gain is more Insight. Action means putting that data to use within applications. It means automation. It means customer value. It means doing sh*t. Until Action is precipitated from Insight, “data-driven” businesses get locked into “explore and explain” infinity, where there is little value to be had.

There are plenty of companies going after the real-time analytics prize. Tinybird is one of them. But we’re doing it with a fundamentally different approach than most. We’re making it delightful for developers to build real-time applications, and we’re not stopping at Business Intelligence. Choose Tinybird and serve data out of a database with HTTP APIs, transforming Insight into Action. No need to build & scale your own API layer, no need for database-specific driver dependencies in your projects.

Just ingest your data, analyze it with SQL, and consume it anywhere. That’s Tinybird.

Ready to build fast APIs, faster?

Sign up for a Tinybird Build Plan. It's free forever, with no credit card required.