Log Analytics Case Study

TL;DR

Sign up, deploy the template, instrument your application, and view your logs dashboard.

Overview

This case study demonstrates a production-ready log analytics system built with Tinybird. The system handles high-volume log ingestion, real-time aggregations, efficient pagination for log exploration, and powers a modern Next.js dashboard. The architecture prioritizes ingestion performance and query efficiency, with optimization handled through multiple materialized views for different time ranges and use cases.

Source Code: logs-explorer-template

Related Template: OpenTelemetry Template - For internal observability use cases, connecting existing tools like Grafana. The Logs template (this case study) is designed for user-facing logs in SaaS applications and developer tools.

Scale & Testing: This project has been tested at multi-billion-log scale, with throughput of hundreds of thousands of logs per second. The repository includes a data generator for testing purposes.


0. Quick Start

# 1. Sign up and install CLI
curl https://tinybird.co | sh
tb login

# 2. Deploy template
tb --cloud deploy --template https://github.com/tinybirdco/logs-explorer-template/tree/main/tinybird

# 3. Copy the dashboard token
tb --cloud token copy read_pipes

Instrument your application to send logs:

const data = {
  timestamp: new Date().toISOString(),
  level: 'info',
  service: 'my-app',
  message: 'This is a test message',
  request_id: '1234567890',
  environment: 'development',
  status_code: 200,
  response_time: 100,
  request_method: 'GET',
  request_path: '/',
  host: 'my-app.com',
  user_agent: req.headers.get('user-agent')
}

await fetch(
  `https://<YOUR_TINYBIRD_HOST>/v0/events?name=logs`,
  {
    method: 'POST',
    body: JSON.stringify(data),
    headers: { Authorization: `Bearer ${process.env.TINYBIRD_APPEND_TOKEN}` },
  }
)
Tip

Where to get your Tinybird host: After deploying to Tinybird Cloud, your host is displayed in the Tinybird dashboard. You can also find it in your workspace settings or by running tb info. The host format is typically api.tinybird.co for the default region, or api.{region}.tinybird.co for other regions (e.g., api.us-east-1.tinybird.co).
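As a quick sketch of where the host fits, the ingestion URL used above is simply the host plus the Events API path (TINYBIRD_HOST here is an assumed environment variable name, not part of the template):

// Resolve the Events API URL from your workspace region's host
const TINYBIRD_HOST = process.env.TINYBIRD_HOST ?? 'api.tinybird.co'; // or api.{region}.tinybird.co
const EVENTS_URL = `https://${TINYBIRD_HOST}/v0/events?name=logs`;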

Deploy the dashboard to Vercel and configure environment variables. View your logs at the deployed dashboard URL.

Tip

The template supports multi-tenant access using JWT tokens with fixed parameters for secure tenant-isolated access. Configure Clerk authentication and Tinybird JWT tokens for production use.


1. Data Sources & Schema Design

Landing Table Schema

-- datasources/logs.datasource
SCHEMA >
    `timestamp` DateTime64(3) `json:$.timestamp`,
    `level` LowCardinality(String) `json:$.level`,
    `service` LowCardinality(String) `json:$.service`,
    `message` String `json:$.message`,
    `request_id` String `json:$.request_id`,
    `environment` LowCardinality(String) `json:$.environment`,
    `status_code` UInt16 `json:$.status_code`,
    `response_time` UInt32 `json:$.response_time`,
    `request_method` LowCardinality(String) `json:$.request_method`,
    `request_path` String `json:$.request_path`,
    `host` LowCardinality(String) `json:$.host`,
    `user_agent` LowCardinality(String) `json:$.user_agent`

ENGINE MergeTree
ENGINE_PARTITION_KEY toYYYYMM(timestamp)
ENGINE_SORTING_KEY timestamp, environment, service, level
ENGINE_TTL toDateTime(timestamp) + INTERVAL 180 DAY

JSONPath Extraction: json:$.field_name extracts values from JSON at ingestion time. All fields are extracted directly into columns for efficient querying without JSON parsing overhead.

Design Decisions:

  • Structured columns: Unlike web analytics, which uses a flexible payload column, logs use structured columns for predictable filtering and querying. Log schemas are typically more stable than web analytics events.
  • level, service, environment are LowCardinality: Limited distinct values (INFO, ERROR, WARN, etc.) reduce storage and improve performance. These are the most common filter dimensions.
  • message is String: Full log messages require text search capabilities. Uses multiSearchAnyCaseInsensitive() for pattern matching.
  • timestamp is DateTime64(3): Millisecond precision for accurate log ordering and time-based queries.
  • Sorting key: timestamp, environment, service, level: Time-first enables efficient time-range queries. Environment and service enable data skipping for multi-tenant queries.
  • TTL configured: ENGINE_TTL toDateTime(timestamp) + INTERVAL 180 DAY automatically removes logs older than 180 days, reducing storage costs and improving query performance.

Trade-offs: Landing table prioritizes ingestion speed and structured querying. All columns are extracted at ingestion time, requiring schema changes for new fields. Materialized views handle query optimization for different time ranges and use cases.


2. Engines, TTLs, Sorting Keys, and Time-Based Partitioning

Engine and Partitioning

MergeTree: Standard engine for landing tables, handles high-volume ingestion with automatic background merges.

Monthly partitioning: toYYYYMM(timestamp) produces 1-300 GB partitions, ideal for ClickHouse. Daily partitions create overhead; yearly partitions become too large.

TTL: ENGINE_TTL toDateTime(timestamp) + INTERVAL 180 DAY automatically removes data older than 180 days. Tinybird configures correct settings so partition pruning works efficiently without impacting cluster performance.

Sorting Key

Order matters for query performance:

  • timestamp first: Time-series queries are the most common. Enables efficient time-range filtering and ordering.
  • environment second: Common filter for multi-tenant isolation (production, staging, development).
  • service third: Common filter for microservices architectures.
  • level last: Enables efficient filtering by log level (ERROR, WARN, INFO).

Queries with WHERE timestamp >= X AND timestamp <= Y AND environment = 'production' skip entire data ranges, dramatically reducing scan size.

Time Precision

DateTime64(3) provides millisecond precision, essential for:

  • Accurate log ordering when multiple logs have the same second
  • Precise time-range queries
  • Debugging timing issues in distributed systems

3. Materialized Views - Time Series, Dimensions, and Pagination

Materialized View 1: logs_daily_timeseries_mv

Aggregates daily metrics for time series analysis:

-- materializations/logs_daily_timeseries_mv.pipe
SELECT
    toStartOfDay(timestamp) as date,
    service,
    level,
    environment,
    request_method,
    status_code,
    request_path,
    user_agent,
    countStateIf(level = 'ERROR') as error_count,
    countState() as total_requests,
    maxState(toFloat64(response_time)) as response_time_max,
    avgState(toFloat64(response_time)) as response_time_avg
FROM logs
GROUP BY date, service, level, environment, request_method, status_code, request_path, user_agent

The materialized view writes to this datasource:

-- datasources/logs_daily_timeseries.datasource
SCHEMA >
    `date` DateTime,
    `error_count` AggregateFunction(count, UInt64),
    `total_requests` AggregateFunction(count, UInt64),
    `response_time_avg` AggregateFunction(avg, Float64),
    `response_time_max` AggregateFunction(max, Float64),
    `service` LowCardinality(String),
    `level` LowCardinality(String),
    `environment` LowCardinality(String),
    `request_method` LowCardinality(String),
    `status_code` UInt16,
    `request_path` String,
    `user_agent` LowCardinality(String)

ENGINE AggregatingMergeTree
ENGINE_PARTITION_KEY toYYYYMM(date)
ENGINE_SORTING_KEY date, environment, service, level, request_method, status_code, request_path, user_agent

countStateIf(), countState(), maxState(), and avgState() pre-aggregate metrics. When endpoints query this data, they use countMerge(), maxMerge(), and avgMerge() to finalize aggregations. This powers the log_timeseries endpoint for charts.

Materialized View 2: logs_daily_dimensions_mv

Aggregates daily counts for each combination of dimensions:

-- materializations/logs_daily_dimensions_mv.pipe
SELECT
    toStartOfDay(timestamp) as date,
    service,
    level,
    environment,
    request_method,
    status_code,
    request_path,
    user_agent,
    host,
    countState() as count
FROM logs
GROUP BY date, service, level, environment, request_method, status_code, request_path, user_agent, host

Powers the generic_counter endpoint for dimension-based metrics (e.g., "show me all services and their log counts").

Materialized View 3: logs_range_15m_mv

Aggregates 15-minute time ranges for efficient pagination:

-- materializations/logs_range_15m_mv.pipe
SELECT
    toStartOfFifteenMinutes(timestamp) AS start_ts,
    start_ts + toIntervalMinute(15) AS end_ts,
    environment,
    service,
    level,
    countState() AS n_rows
FROM logs
GROUP BY start_ts, end_ts, environment, service, level

This materialized view enables efficient pagination by pre-calculating row counts per 15-minute window. The log_explorer endpoint uses cumulative sums to determine which time ranges contain the requested page, dramatically reducing query time compared to scanning all logs.

Why 15-minute windows? 15-minute windows provide a good balance between granularity and aggregation efficiency. Smaller windows (1 minute) create too many rows; larger windows (1 hour) reduce pagination precision.
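To make the pagination use case concrete, here is a minimal TypeScript sketch of the window-lookup arithmetic that the log_explorer endpoint performs in SQL over these 15-minute counts (the WindowCount interface and findPageRange function are illustrative names, not part of the template):

// Each row comes from the 15-minute view: a window and its (filtered) row count
interface WindowCount {
  startTs: string; // window start
  endTs: string;   // window end
  nRows: number;   // rows in this window
}

// Find the time span that contains rows [page*pageSize, (page+1)*pageSize)
function findPageRange(windows: WindowCount[], page: number, pageSize: number) {
  const lower = page * pageSize;
  const upper = (page + 1) * pageSize;
  let cumulative = 0;
  let minTs: string | undefined;
  let maxTs: string | undefined;
  for (const w of windows) {
    cumulative += w.nRows;
    if (minTs === undefined && cumulative > lower) minTs = w.startTs;
    if (maxTs === undefined && cumulative >= upper) { maxTs = w.endTs; break; }
  }
  // If the requested page is only partially full, extend to the final window
  return { minTs, maxTs: maxTs ?? windows.at(-1)?.endTs };
}

The endpoint computes the same cumulative sums in SQL and then scans the raw logs table only between the two timestamps it finds.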

Aggregation and Sorting Key Design

All materialized views aggregate by day or 15-minute intervals, dramatically reducing data volume. Millions of logs become thousands of aggregated rows.

The sorting key order matches the GROUP BY columns, and the order matters. Since time-series queries always filter by date/time range, date or start_ts comes first. environment and service come next for multi-tenant filtering. The remaining columns follow the GROUP BY order.

For example, logs_daily_timeseries_mv uses date, environment, service, level, request_method, status_code, request_path, user_agent, enabling efficient filtering by any combination of these dimensions.

This combination of time-based aggregation and dimension-first sorting keys scales to billions of logs without requiring additional optimizations like data skipping indices. Queries filter by time range first, then by dimensions, allowing ClickHouse to skip entire data ranges that don't match.


4. Endpoints - Pagination, Filtering, Time Series, and Counters

Endpoint Architecture

Endpoints in Tinybird are built using nodes, reusable SQL components that compose into complex queries. Nodes can reference other nodes as subqueries, in JOINs, or as scalar values, enabling powerful composition patterns.

Endpoints typically follow this flow:

  1. Time range calculations: Calculate date ranges, granularity based on time span
  2. Conditional data source selection: Choose between materialized views (for long ranges) or raw table (for short ranges)
  3. Filtering: Apply service, level, environment, and other filters using templating
  4. Aggregating: Query materialized views using *Merge() functions or aggregate raw data
  5. Pagination (for explorer): Calculate time ranges containing requested page using cumulative sums
  6. Final processing: Format output, handle nulls, apply sorting

Endpoints use Jinja2 templating for conditional logic ({% if date_diff_in_hours(start_date, end_date) >= 24 * 10 %} ... {% end %}), parameter injection ({{ Array(service) }}), error handling, and dynamic SQL.

TOKEN Authentication

Endpoints use token-based authentication:

TOKEN "read_pipes" READ

Tokens can have READ or APPEND permissions. They can also include fixed parameters (like user_id, org_permission, service) for automatic filtering, enabling secure multi-tenant access without exposing tenant data in URLs.
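Putting templating and tokens together, calling a published endpoint from an application is a plain HTTP request in which template parameters travel as query-string parameters. A minimal sketch (TINYBIRD_READ_TOKEN is an assumed variable name; passing array parameters as comma-separated values is the common convention, but check the endpoint's parameter documentation):

// Call the published log_timeseries endpoint with templated parameters
const params = new URLSearchParams({
  start_date: '2024-01-01 00:00:00',
  end_date: '2024-01-10 00:00:00',
  service: 'api,web', // Array(service): comma-separated values (assumed convention)
  level: 'ERROR',
});

const res = await fetch(
  `https://<YOUR_TINYBIRD_HOST>/v0/pipes/log_timeseries.json?${params}`,
  { headers: { Authorization: `Bearer ${process.env.TINYBIRD_READ_TOKEN}` } }
);
const { data } = await res.json();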

Endpoint 1: log_explorer - Efficient Pagination

The log_explorer endpoint implements efficient pagination using pre-aggregated 15-minute windows:


How it works:

  1. cummulative_rows: Calculates cumulative sum of rows per 15-minute window, ordered by time (ascending or descending based on order parameter).
  2. min_ts_range: Finds the 15-minute window where cumulative sum first exceeds page * page_size.
  3. max_ts_range: Finds the 15-minute window where cumulative sum first exceeds (page + 1) * page_size.
  4. logs_result: Queries raw logs table within the time range determined by min_ts_range and max_ts_range, applies filters, orders, and applies LIMIT/OFFSET.

This approach dramatically reduces query time because:

  • Instead of scanning billions of logs, it scans pre-aggregated 15-minute windows (thousands of rows)
  • It only queries the raw table for the specific time range containing the requested page
  • Cumulative sums enable efficient "find the time range containing row N" calculation

Example: To get page 10 (rows 200-219) from 1 billion logs:

  • Without materialized view: Scan all logs, count to row 200, return 20 rows (slow)
  • With materialized view: Scan 15-minute windows (fast), find time range containing row 200, query only that range (fast)

Endpoint 2: log_analysis - Simple Filtering

The log_analysis endpoint provides simple filtering and pagination without time-range optimization:

SELECT
    timestamp,
    request_id,
    request_method,
    status_code,
    service,
    request_path,
    level,
    message,
    user_agent,
    response_time,
    environment
FROM logs
WHERE timestamp >= {{DateTime(start_date)}}
  AND timestamp <= {{DateTime(end_date)}}
  {% if defined(service) %} AND service in {{Array(service)}} {% end %}
  {% if defined(level) %} AND level in {{Array(level)}} {% end %}
  -- ... more filters
ORDER BY {{column(sort_by, 'timestamp')}} {{order}}
LIMIT {{Int32(page_size, 100)}}
OFFSET {{Int32(page, 0)}} * {{Int32(page_size, 100)}}

This endpoint is simpler but less efficient for large datasets. Use log_explorer for production pagination.

Endpoint 3: log_timeseries - Time Series Charts

The log_timeseries endpoint provides time series data for charts with automatic granularity selection:

-- Conditional logic: Use materialized view for long ranges, raw table for short ranges
{% if date_diff_in_hours(start_date, end_date) >= 24 * 10 %}
    -- Long range: Use daily materialized view
    SELECT
        date,
        countMerge(error_count) as error_count,
        countMerge(total_requests) as total_requests
    FROM logs_daily_timeseries
    WHERE date >= {{DateTime(start_date)}}
      AND date <= {{DateTime(end_date)}}
{% else %}
    -- Short range: Query raw table with appropriate granularity
    SELECT
        {% if date_diff_in_hours(start_date, end_date) > 3 %}
            toStartOfHour(timestamp) as date,
        {% else %}
            toStartOfMinute(timestamp) as date,
        {% end %}
        countIf(level = 'ERROR') as error_count,
        countIf(level != 'ERROR') as total_requests
    FROM logs
    WHERE timestamp >= {{DateTime(start_date)}}
      AND timestamp <= {{DateTime(end_date)}}
{% end %}
GROUP BY date
ORDER BY date

Granularity selection:

  • 10+ days: Daily granularity from logs_daily_timeseries_mv
  • 1-10 days: Hourly granularity from raw logs table
  • < 1 day: Minute granularity from raw logs table

The endpoint also generates complete time series (using day_intervals, hour_intervals, or minute_intervals nodes) and LEFT JOINs with actual data to fill gaps, ensuring continuous charts even when time periods have no logs.

Endpoint 4: generic_counter - Dimension Metrics

The generic_counter endpoint provides counts for any dimension (service, level, environment, etc.):

SELECT
    toString({{column(column_name, 'level')}}) as category,
    {% if date_diff_in_hours(start_date, end_date) >= 10 %}
        countMerge(count) as count
    {% else %}
        count() as count
    {% end %}
FROM
    {% if date_diff_in_hours(start_date, end_date) >= 10 %}
        logs_daily_dimensions
    {% else %}
        logs
    {% end %}
WHERE ...
GROUP BY {{column(column_name, 'level')}}
ORDER BY count DESC

This endpoint powers sidebar widgets showing "Top Services", "Error Levels", etc. It conditionally uses logs_daily_dimensions_mv for long ranges or queries the raw table for short ranges.

Common Parameters

Common parameters across endpoints:

  • start_date / end_date: time range filtering
  • service, level, environment: dimension filtering
  • request_method, status_code, request_path, user_agent: HTTP-specific filtering
  • message: text search (only for short ranges)
  • page / page_size: pagination
  • sort_by / order: sorting

Conditional Logic Examples

The log_timeseries endpoint switches data sources based on time range:

{% if date_diff_in_hours(start_date, end_date) >= 24 * 10 %}
    -- Use materialized view for 10+ days
    SELECT date, countMerge(error_count) as error_count
    FROM logs_daily_timeseries
    ...
{% else %}
    -- Use raw table for shorter ranges
    SELECT toStartOfHour(timestamp) as date, countIf(level = 'ERROR') as error_count
    FROM logs
    ...
{% end %}

Text search (message parameter) is only enabled for short ranges (≤ 7 days) because:

  • Full-text search on billions of rows is expensive
  • Short ranges limit the search space
  • Materialized views don't store full messages

{% if date_diff_in_hours(start_date, end_date) <= 24 * 7 %}
    {% if defined(message) and message != '' %}
        AND multiSearchAnyCaseInsensitive(message, {{split_to_array(String(message, ''), separator='|')}})
    {% end %}
{% end %}

5. Ingestion - Events API, Applications, Vector

Ingestion Options

Applications can send logs in multiple ways:


Direct API Ingestion

Applications send logs directly to Tinybird Events API:

const data = {
  timestamp: new Date().toISOString(),
  level: 'info',
  service: 'my-app',
  message: 'User logged in',
  request_id: '1234567890',
  environment: 'production',
  status_code: 200,
  response_time: 100,
  request_method: 'GET',
  request_path: '/api/users',
  host: 'my-app.com',
  user_agent: req.headers.get('user-agent')
}

await fetch(
  `https://<YOUR_TINYBIRD_HOST>/v0/events?name=logs`,
  {
    method: 'POST',
    body: JSON.stringify(data),
    headers: { Authorization: `Bearer ${process.env.TINYBIRD_APPEND_TOKEN}` },
  }
)

Benefits: Simple, no additional infrastructure, low latency.

Considerations: Each application needs to handle retries, batching, and error handling.
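A minimal sketch of client-side batching, assuming the Events API's support for newline-delimited JSON bodies (logEvent, flushLogs, and the flush thresholds are illustrative, not part of the template):

// Buffer log events in memory and flush them as one NDJSON request
const LOG_BUFFER: Record<string, unknown>[] = [];
const FLUSH_SIZE = 100;
const FLUSH_INTERVAL_MS = 2000;

async function flushLogs() {
  if (LOG_BUFFER.length === 0) return;
  const batch = LOG_BUFFER.splice(0, LOG_BUFFER.length);
  const body = batch.map((e) => JSON.stringify(e)).join('\n'); // NDJSON: one event per line
  try {
    await fetch(`https://<YOUR_TINYBIRD_HOST>/v0/events?name=logs`, {
      method: 'POST',
      body,
      headers: { Authorization: `Bearer ${process.env.TINYBIRD_APPEND_TOKEN}` },
    });
  } catch {
    // Non-blocking: requeue the batch and let the next flush retry it
    LOG_BUFFER.unshift(...batch);
  }
}

export function logEvent(event: Record<string, unknown>) {
  LOG_BUFFER.push(event);
  if (LOG_BUFFER.length >= FLUSH_SIZE) void flushLogs();
}

setInterval(() => void flushLogs(), FLUSH_INTERVAL_MS);

Failed flushes are re-queued rather than thrown, so logging stays non-blocking for the application.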

Vector Log Aggregator

Vector can collect logs from multiple sources and send them to Tinybird:

# examples/vector/vector_docker_logs.yaml
sources:
  docker_logs:
    type: docker_logs

transforms:
  parse_logs:
    type: remap
    inputs: [docker_logs]
    source: |
      # Parse Docker logs and transform to Tinybird schema
      .timestamp = now()
      .level = "info"
      .service = "docker"
      .message = .message

sinks:
  tinybird:
    type: http
    inputs: [parse_logs]
    uri: "https://<YOUR_TINYBIRD_HOST>/v0/events?name=logs"
    encoding:
      codec: json
    auth:
      strategy: bearer
      token: "${TINYBIRD_APPEND_TOKEN}"

Benefits: Centralized log collection, transformation, batching, retries, and routing.

Use cases: Multiple applications, Docker containers, system logs, centralized logging infrastructure.

Tinybird Events API

The Events API provides HTTP streaming ingestion with automatic batching. Events are buffered in memory, then batched into optimized ClickHouse inserts. The batching balances throughput and latency automatically.

Token-based authentication: Uses an append token with APPEND permission. Schema validation ensures events match the datasource schema.

Error handling: Invalid events are rejected with error messages. Applications should implement retry logic for transient failures.

Scalability

Tinybird handles batching automatically: events are buffered and inserted in batches. Monthly partitions prevent part explosion even with high ingestion rates. Multi-tenant isolation doesn't impact performance, since each tenant's data is stored together (thanks to the sorting key), enabling efficient queries.

Monitoring: Monitor ingestion rate, errors, and latency. Track system.parts for part counts and system.merges for the merge queue. If needed, distribute load across multiple ingestion endpoints or use Vector for load balancing.

Error Handling

Client-side errors should be non-blocking: a failure to ship logs shouldn't break the application. Implement retry logic with exponential backoff. ClickHouse applies backpressure: if too many parts are created, inserts can be delayed or rejected. Monitor the max_parts_in_total setting.
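A small sketch of the exponential-backoff advice above (sendWithRetry is an illustrative helper, not part of the template):

// Retry a request with exponential backoff and jitter; 4xx responses (except 429) are not retried
async function sendWithRetry(url: string, init: RequestInit, maxRetries = 5): Promise<Response> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    if (attempt > 0) {
      // Exponential backoff with a little jitter, capped at 30s
      const delayMs = Math.min(30_000, 500 * 2 ** attempt) + Math.random() * 250;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
    try {
      const res = await fetch(url, init);
      if (res.ok) return res;
      if (res.status < 500 && res.status !== 429) return res; // non-retryable: let the caller inspect it
      lastError = new Error(`HTTP ${res.status}`);
    } catch (err) {
      lastError = err; // network error: retry
    }
  }
  throw lastError;
}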


6. Schema Evolution - Changing Landing Table, Materialized Views

Changing the Landing Table

To add a new column (e.g., user_id for user-level log filtering), modify the datasource file:

-- datasources/logs.datasource
SCHEMA >
    `timestamp` DateTime64(3) `json:$.timestamp`,
    -- ... existing columns
    `user_id` String `json:$.user_id` DEFAULT ''

Optionally add a FORWARD_QUERY for backfilling historical data, then deploy with tb --cloud deploy. The Deployment API automatically:

  • Adds the column with a DEFAULT value (zero-downtime)
  • Backfills historical data if FORWARD_QUERY is provided
  • Updates materialized views to use the new column
  • Preserves all existing data

Changing Materialized Views

To add a new aggregation dimension (e.g., user_id), modify the materialized view:

-- materializations/logs_daily_timeseries_mv.pipe
SELECT
    toStartOfDay(timestamp) as date,
    service,
    level,
    environment,
    user_id, -- New dimension
    countStateIf(level = 'ERROR') as error_count,
    countState() as total_requests,
    -- ...
FROM logs
GROUP BY date, service, level, environment, user_id, ...

Deploy with tb --cloud deploy. Ingestion continues while backfilling happens (zero-downtime), historical data is automatically backfilled into the new structure, and endpoints can be updated to use the new dimension.

Adding New Log Types

Unlike web analytics, which uses an action column to differentiate event types, logs typically use structured columns. To add new log types:

  1. Add new columns to logs datasource (if needed)
  2. Applications send logs with new fields
  3. Materialized views pick up the new fields if they select * (otherwise, add the new columns to their explicit column lists)
  4. Endpoints can filter by new fields using templating

For example, to add application performance monitoring (APM) logs:

-- Add APM-specific columns
SCHEMA >
    -- ... existing columns
    `span_id` String `json:$.span_id` DEFAULT '',
    `trace_id` String `json:$.trace_id` DEFAULT '',
    `duration_ms` UInt32 `json:$.duration_ms` DEFAULT 0

No materialized view changes needed if they use * or include these columns. Endpoints can filter by trace_id or span_id using templating.

Migration Patterns

With Tinybird, you don't need to think about migration patterns. Apply changes to your schema and pipe files, and the Deployment API chooses the most efficient pattern:

  • Expand and Contract: Adding columns with DEFAULT values
  • Branch + Exchange: Creating new materialized views and switching endpoints atomically
  • Versioning: Using different log schemas for different application versions

All migrations happen with no data loss and zero downtime.

Developer Experience

Tinybird handles backfilling automatically; there is no migration SQL to write. Materialized views process new data immediately while historical data is backfilled. Keep old views until new ones are validated. Staging deployments allow testing before production. Optionally, use branches to test changes against a subset of production data.


7. Integration - Next.js Dashboard Integration

API Client with zod-bird

The dashboard uses zod-bird for type-safe Tinybird API calls:

// dashboard/log-analyzer/src/lib/tinybird.ts
import { Tinybird } from '@chronark/zod-bird';
import { z } from 'zod';

export const getTinybird = (token: string) => {
  return new Tinybird({ baseUrl: process.env.NEXT_PUBLIC_TINYBIRD_API_URL!, token });
};

export const createLogExplorerApi = (token: string) => {
  const tb = getTinybird(token);
  return tb.buildPipe({
    pipe: 'log_explorer',
    parameters: z.object({
      page: z.number(),
      page_size: z.number(),
      start_date: z.string().optional(),
      end_date: z.string().optional(),
      service: z.array(z.string()).optional(),
      level: z.array(z.string()).optional(),
      environment: z.array(z.string()).optional(),
      // ... more filters
    }),
    data: LogExplorerResponseSchema,
  });
};
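Calling the built pipe is then an ordinary typed async call. A sketch, assuming zod-bird's usual response shape where validated rows are returned under data (TINYBIRD_READ_TOKEN is an assumed variable name):

// Fetch the first page of logs through the typed pipe built above
const logExplorerApi = createLogExplorerApi(process.env.TINYBIRD_READ_TOKEN!);

const result = await logExplorerApi({
  page: 0,
  page_size: 100,
  start_date: '2024-01-01 00:00:00',
  end_date: '2024-01-02 00:00:00',
  level: ['ERROR'],
});

// Rows are parsed and validated against LogExplorerResponseSchema
console.log(result.data.length);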

zod-bird provides:

  • Type safety: TypeScript types generated from Zod schemas
  • Parameter validation: Automatic validation of query parameters
  • Response validation: Automatic validation of API responses
  • IntelliSense: Autocomplete for pipe names, parameters, and responses

React Hooks

The dashboard uses custom hooks for data fetching:

// dashboard/log-analyzer/src/lib/hooks/useTinybirdApi.ts
export function useTinybirdApi() {
  const { token } = useTinybirdToken();

  const apis = useMemo(() => ({
    logAnalysisApi: createLogAnalysisApi(token),
    logExplorerApi: createLogExplorerApi(token),
    genericCounterApi: createGenericCounterApi(token),
  }), [token]);

  return apis;
}

The TinybirdProvider manages token generation and refresh, using JWT tokens with fixed parameters for multi-tenant access.
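A simplified sketch of what such a provider can look like (the /api/tinybird-token route, the refresh interval, and the hook shape are illustrative; the template's actual implementation may differ):

// Dashboard-side sketch: fetch a tenant-scoped JWT and refresh it periodically
import { createContext, useContext, useEffect, useState, type ReactNode } from 'react';

const TinybirdTokenContext = createContext<{ token: string | null }>({ token: null });

export function TinybirdProvider({ children }: { children: ReactNode }) {
  const [token, setToken] = useState<string | null>(null);

  useEffect(() => {
    let cancelled = false;
    const fetchToken = async () => {
      // Hypothetical API route that signs the JWT shown in the middleware example below
      const res = await fetch('/api/tinybird-token');
      const { token } = await res.json();
      if (!cancelled) setToken(token);
    };
    fetchToken();
    const id = setInterval(fetchToken, 10 * 60 * 1000); // refresh every 10 minutes
    return () => { cancelled = true; clearInterval(id); };
  }, []);

  return (
    <TinybirdTokenContext.Provider value={{ token }}>
      {children}
    </TinybirdTokenContext.Provider>
  );
}

export const useTinybirdToken = () => useContext(TinybirdTokenContext);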

Dashboard Components

The dashboard consists of:

  1. Sidebar: Filter widgets using generic_counter endpoint (Top Services, Error Levels, etc.)
  2. Top Bar: Date range selector, search bar, refresh button
  3. Time Series Chart: Uses log_timeseries endpoint with automatic granularity
  4. Log Table: Uses log_explorer endpoint with infinite scroll pagination

Log Table with Pagination

The LogTableWithPagination component implements infinite scroll:

// dashboard/log-analyzer/src/components/logs/LogTableWithPagination.tsx
export function LogTableWithPagination({ pageSize }: { pageSize: number }) {
  const [page, setPage] = useState(0);
  const [logs, setLogs] = useState<LogEntry[]>([]);
  const { logExplorerApi } = useTinybirdApi();
  const { dateRange } = useDefaultDateRange();

  const { data, isLoading } = useQuery({
    queryKey: ['logs', page, dateRange, filters],
    queryFn: () => logExplorerApi({
      page,
      page_size: pageSize,
      start_date: dateRange.start,
      end_date: dateRange.end,
      ...filters
    }),
  });

  // Infinite scroll: append new logs when scrolling to bottom
  const observerRef = useInfiniteScroll(() => {
    if (!isLoading && hasMore) {
      setPage(p => p + 1);
    }
  });

  return (
    <LogTable
      logs={logs}
      observerRef={observerRef}
      isLoading={isLoading}
      hasMore={hasMore}
    />
  );
}

The component uses React Query for caching and automatic refetching. Infinite scroll triggers when the user scrolls to the bottom, loading the next page.

Time Series Chart

The TimeSeriesChartWrapper component displays error rates over time:

// dashboard/log-analyzer/src/components/charts/TimeSeriesChartWrapper.tsx
export function TimeSeriesChartWrapper() {
  const { dateRange } = useDefaultDateRange();
  const { logTimeseriesApi } = useTinybirdApi();

  const { data } = useQuery({
    queryKey: ['timeseries', dateRange, filters],
    queryFn: () => logTimeseriesApi({
      start_date: dateRange.start,
      end_date: dateRange.end,
      ...filters
    }),
  });

  return (
    <TimeSeriesChart
      data={data || []}
      xAxisKey="date"
      yAxisKey="error_count"
    />
  );
}

The chart automatically adjusts granularity based on the time range (minute, hour, or day) thanks to the log_timeseries endpoint's conditional logic.

Multi-Tenant Access

The dashboard uses JWT tokens with fixed parameters for secure multi-tenant access:

// dashboard/log-analyzer/src/middleware.ts
const token = await new jose.SignJWT({
  workspace_id: process.env.TINYBIRD_WORKSPACE_ID,
  name: `frontend_jwt_user_${userId}`,
  scopes: [
    {
      type: "PIPES:READ",
      resource: "log_explorer",
      fixed_params: { user_id: userId, org_permission: orgName, service: "web" }
    },
    // ... more scopes
  ],
})
  .setProtectedHeader({ alg: 'HS256' })
  .sign(secret);

Fixed parameters ensure that users can only access logs for their organization/service, even if they try to modify query parameters. The JWT token is refreshed periodically to maintain security.

Integration Patterns

Date range filtering: All components share the same date range via useDefaultDateRange() hook. When the user changes the date range in the TopBar, all components automatically re-fetch with the new range.

Filter synchronization: Sidebar filters (service, level, environment) update URL parameters, and all components read from URL parameters to maintain filter state across page refreshes.
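One way to implement that pattern with the Next.js app router (a sketch; the useFilterParam hook name is illustrative, not part of the template):

// Sketch: write sidebar filter selections to the URL so state survives refreshes
import { usePathname, useRouter, useSearchParams } from 'next/navigation';

export function useFilterParam(key: string) {
  const router = useRouter();
  const pathname = usePathname();
  const searchParams = useSearchParams();

  // Read: comma-separated values, e.g. ?service=api,web
  const values = searchParams.get(key)?.split(',').filter(Boolean) ?? [];

  // Write: update the query string without a full navigation
  const setValues = (next: string[]) => {
    const params = new URLSearchParams(searchParams.toString());
    if (next.length) params.set(key, next.join(','));
    else params.delete(key);
    router.replace(`${pathname}?${params.toString()}`);
  };

  return { values, setValues };
}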

Real-time updates: The dashboard can poll endpoints periodically for real-time log updates. Use React Query's refetchInterval option:

const { data } = useQuery({
  queryKey: ['logs', ...],
  queryFn: () => logExplorerApi(...),
  refetchInterval: 5000, // Refetch every 5 seconds
});

Key Takeaways
  • Landing table prioritizes ingestion and structured querying - The logs table uses structured columns for efficient filtering and querying. Multi-tenant support via environment and service columns enables data isolation. TTL configured for automatic data retention.
  • Materialized views enable fast queries for different use cases - Three AggregatingMergeTree materialized views (logs_daily_timeseries_mv, logs_daily_dimensions_mv, logs_range_15m_mv) pre-aggregate data at insert time using *State() functions. The 15-minute range view enables efficient pagination by pre-calculating row counts per time window.
  • Endpoints use conditional logic for optimal performance - Endpoints conditionally switch between materialized views (for long time ranges) and raw tables (for short ranges) based on date_diff_in_hours(). The log_explorer endpoint implements efficient pagination using cumulative sums over 15-minute windows, dramatically reducing query time for large datasets.

Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc.
