Multi-tenancy in Tinybird with a shared Workspace

If you're building a multi-tenant SaaS, you'll need an approach to ensure data separation. In Tinybird, you can establish secure multi-tenancy with a shared Workspace and per-tenant Data Sources created from materialized Pipes.
Cameron Archer
Content Lead
Nov 17, 2022

Multi-tenancy is a common concern for developers building SaaS applications on Tinybird. You have many customers (or at least you plan to), and you’re collecting all their data into Tinybird. How do you create APIs that achieve logical isolation of tenant data when that data is stored on a shared cluster?

In this blog, I’ll explain how to achieve a multi-tenant architecture in Tinybird using a single, shared Workspace with separate Data Sources for each tenant.

Why use multiple Data Sources for multi-tenancy?

The multiple-Data Source approach is one of a few ways to do multi-tenancy in Tinybird. A single shared Data Source with row-level security is another. But there are some good reasons to go with the per-tenant Data Source approach.

First, you can make table-level changes to any tenant’s data. Tinybird Data Sources are effectively just database tables. If a specific tenant needs a new table schema or custom columns, you can do that easily without impacting other tenants. If you want to drop a tenant completely, it’s as simple as dropping a table.

Second, token management is simpler. Instead of having to augment user tokens with column expressions for row-level security, each tenant just needs a single token with Data Source read access for their specific Data Source.

Put simply, you’ll have a little more control. It does come with the tradeoff of more overhead, but depending on your application, this might be worth it.

How to implement multi-tenancy on a shared Workspace

The best way to approach single Workspace multi-tenancy in Tinybird is by first ingesting all events data into a common “landing” Data Source. Then, you can create Pipes that filter out data from other tenants, and materialize the results into new, tenant-specific Data Sources. 

All tenant data is ingested into a single landing Data Source that is logically isolated from tenant access. Tinybird Pipes materialize tenant-specific rows into new, incrementally updating Data Sources.

Since Materialized Views update in Tinybird as soon as new data is ingested to the landing Data Source, you’ll always have fresh, filtered data that you can query over in your tenant-specific Pipes. From these Pipes, you can easily generate the API endpoints that you need to power your SaaS. And you can easily duplicate the assets and automate their deployment across multiple tenants, as I’ll show you in a bit.

For the sake of demonstration, here’s an example of how to achieve single Workspace multi-tenancy in Tinybird.

An example

Suppose you’re building a SaaS and you want to offer usage-based billing. So you instrument your product to generate usage events as they occur, and you send them to Tinybird. You want to build a simple feature so that your customers can explore their usage in a dashboard and receive notifications when their usage hits a certain threshold.

This example sends activity events data to Tinybird using the Events API, but you could just as easily do it with something like Kafka.

Let’s look at the steps to create some secure, multi-tenant API endpoints for this use case.

Step 1: Ingest data into a shared Data Source.

Sending events data into Tinybird is simple with the Events API. You just specify the Data Source name in the request, serialize the events as JSON, and send them in a POST request.

Here’s how that code might look if you’ve built your backend in Python or JavaScript:
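As a minimal Python sketch (not necessarily the exact code from the original example), the request can be built with nothing but the standard library. The host below is an assumption, since Tinybird API hosts are region-specific, and the token is a placeholder. The helper names ``build_events_request`` and ``send_events`` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumptions: replace the host with your region's API host and the token
# with one that has append access to the landing Data Source.
TINYBIRD_HOST = "https://api.tinybird.co"
TINYBIRD_TOKEN = "<your append token>"


def build_events_request(datasource, events, token=TINYBIRD_TOKEN):
    """Build (but do not send) a POST to the Events API for a list of event dicts."""
    query = urllib.parse.urlencode({"name": datasource})
    url = f"{TINYBIRD_HOST}/v0/events?{query}"
    # The Events API takes newline-delimited JSON: one event object per line.
    body = "\n".join(json.dumps(event) for event in events).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


def send_events(datasource, events, token=TINYBIRD_TOKEN):
    """Send the events and return the HTTP status code."""
    request = build_events_request(datasource, events, token)
    with urllib.request.urlopen(request) as response:
        return response.status
```

Keeping the request construction separate from the network call makes it easy to unit test the payload without hitting Tinybird.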

… and how that ``events`` JSON might look:
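Here is one hypothetical shape for those events, written as Python dicts and serialized to newline-delimited JSON. Only ``tenant_id`` comes from this post. The other field names (``timestamp``, ``event_type``, ``units``) and all of the values are assumptions for illustration.

```python
import json

# Hypothetical usage events. Every event carries a tenant_id so that
# it can be routed to the right tenant later on.
events = [
    {"timestamp": "2022-11-17 10:00:00", "tenant_id": 1, "event_type": "api_call", "units": 3},
    {"timestamp": "2022-11-17 10:00:05", "tenant_id": 2, "event_type": "api_call", "units": 1},
]


def to_ndjson(rows):
    """Serialize a list of dicts as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(row) for row in rows)


payload = to_ndjson(events)
```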

All of the data will be sent to the same ``landing_datasource`` Data Source in Tinybird with the same schema.

The landing data source contains rows for every tenant, but tenants won't have access to this table.

Step 2: Create and Materialize a Pipe to filter out tenant rows.

Next, you’ll create tenant-specific Data Sources using Materialized Views. You can do this in Tinybird by materializing the SQL queries in your Pipes. As new data hits the landing Data Source, these Materialized Views will update incrementally.

Here’s the CLI command to create a Pipe that filters out everything but data for Tenant 1:
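Whichever command you use, the result is a plain-text ``.pipe`` datafile. As a rough sketch of what that file might contain, here is a small Python helper that renders it for a given tenant. The helper name ``render_filter_pipe`` and the node name are hypothetical. The table and column names come from this example.

```python
def render_filter_pipe(tenant_id, source="landing_datasource"):
    """Return the text of a .pipe datafile that keeps only one tenant's rows."""
    tenant_id = int(tenant_id)  # reject anything that is not a plain integer id
    return (
        f"NODE filter_tenant_{tenant_id}\n"
        "SQL >\n"
        f"    SELECT * FROM {source} WHERE tenant_id = {tenant_id}\n"
    )


# Datafile text for Tenant 1, ready to be saved as a .pipe file.
pipe_text = render_filter_pipe(1)
```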

{%tip-box title="Note"%}I’m using ``SELECT *`` here for brevity, but best practice is to explicitly define columns. That way, if you add columns to ``landing_datasource`` later, it won’t break ingestion into the MVs due to a mismatched schema. It's also worth pointing out that you should drop the ``tenant_id`` column, since every row will have the same value, but I kept it here for the sake of demonstration.{%tip-box-end%}

Then to materialize it, use the ``tb materialize`` command to create a new Materialized Data Source called ``tenant_1_ds``.

And you now have a new Data Source containing only the rows from ``landing_datasource`` where ``tenant_id = 1``. As new events that match this filter hit ``landing_datasource``, they automatically populate into ``tenant_1_ds``.

The Data Source for tenant 1 contains only rows where tenant_id = 1.

Step 3: Create an API endpoint.

With the tenant-specific Data Source created, you can now create API endpoints. To keep this example simple and this blog short(ish), let’s just create an API that returns the total event count for the current month.

So, create a new Pipe in the CLI:
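As a sketch of what that endpoint Pipe could contain, the helper below renders a ``.pipe`` datafile that counts the current month's events for one tenant. The ``timestamp`` column name is an assumption about the landing schema, and the helper and node names are hypothetical.

```python
def render_month_count_pipe(tenant_id, timestamp_column="timestamp"):
    """Return a .pipe datafile that counts this month's events for one tenant."""
    tenant_id = int(tenant_id)  # reject anything that is not a plain integer id
    return (
        "NODE endpoint\n"
        "SQL >\n"
        "    SELECT count() AS total_events\n"
        f"    FROM tenant_{tenant_id}_ds\n"
        # Keep only rows whose month matches the current month.
        f"    WHERE toStartOfMonth({timestamp_column}) = toStartOfMonth(now())\n"
    )


endpoint_pipe_text = render_month_count_pipe(1)
```

Note that the query reads from ``tenant_1_ds`` rather than ``landing_datasource``, so the endpoint can never return another tenant's rows.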

When you push this Pipe to the server with ``tb push``, it’s automatically published as an API endpoint. You can then create a token with read access to that endpoint and ``tenant_1_ds``.
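To give a feel for how a tenant's backend would consume that endpoint, here is a minimal sketch that builds the endpoint URL and reads the result rows. The host is an assumption (it depends on your region), and the Pipe name in the usage note is hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_endpoint_url(pipe_name, token, host="https://api.tinybird.co"):
    """Build the JSON endpoint URL for a published Pipe, using a read token."""
    query = urllib.parse.urlencode({"token": token})
    return f"{host}/v0/pipes/{pipe_name}.json?{query}"


def fetch_rows(pipe_name, token):
    """Call the endpoint and return the list of result rows."""
    with urllib.request.urlopen(build_endpoint_url(pipe_name, token)) as response:
        payload = json.load(response)
    # Result rows are returned under the "data" key of the JSON response.
    return payload["data"]
```

For example, ``fetch_rows("tenant_1_endpoint", tenant_1_token)`` would return Tenant 1's monthly count, assuming that is the name you gave the Pipe. A token scoped to another tenant would not have read access to it.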

Step 4: Automate to duplicate.

Now, why did we do this whole workflow in the CLI? Because automation! Using these CLI commands, it’s relatively trivial to script the creation of new Materialized Views and Endpoints for each new tenant.

How you automate will depend on what you’re trying to accomplish. If you’re building a SaaS with an event-driven architecture, you could trigger this workflow any time a new user signs up.

You could also just create a simple bash script, like the one below, that gets triggered using CI/CD workflows on something like GitHub Actions.
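As a minimal sketch of that automation, written in Python rather than bash so it matches the other examples here, the script below generates the two Pipe datafiles for each new tenant. The file names, function names, and the ``timestamp`` column are assumptions. Deployment is left to the ``tb materialize`` and ``tb push`` commands described above.

```python
from pathlib import Path


def render_tenant_files(tenant_id):
    """Return {file name: datafile text} for one tenant's filter and endpoint Pipes."""
    tenant_id = int(tenant_id)  # reject anything that is not a plain integer id
    filter_pipe = (
        f"NODE filter_tenant_{tenant_id}\n"
        "SQL >\n"
        f"    SELECT * FROM landing_datasource WHERE tenant_id = {tenant_id}\n"
    )
    endpoint_pipe = (
        "NODE endpoint\n"
        "SQL >\n"
        "    SELECT count() AS total_events\n"
        f"    FROM tenant_{tenant_id}_ds\n"
        "    WHERE toStartOfMonth(timestamp) = toStartOfMonth(now())\n"
    )
    return {
        f"tenant_{tenant_id}_pipe.pipe": filter_pipe,
        f"tenant_{tenant_id}_endpoint.pipe": endpoint_pipe,
    }


def write_tenant_files(tenant_ids, out_dir="pipes"):
    """Write the datafiles for every tenant and return the paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for tenant_id in tenant_ids:
        for name, text in render_tenant_files(tenant_id).items():
            path = out_dir / name
            path.write_text(text)
            written.append(path)
    return written
```

A CI/CD job could call ``write_tenant_files([new_tenant_id])`` when a tenant is created, then run the CLI commands against the generated files.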

Have questions?

If you'd like to recreate this example, the files are located in this GitHub Repo. As always, if you have any questions about anything you’ve read here, feel free to check out the Tinybird docs or join the Tinybird Slack community.