Architecture & Technology - Trackser Live API

Deployment Platform

Trackser Live is built on Cloudflare Workers, a serverless edge computing platform that runs our code on Cloudflare's global network. This provides:

Global Distribution: Code runs close to users for minimal latency
Automatic Scaling: Handles traffic spikes without manual intervention
High Availability: Redundancy across multiple data centers
DDoS Protection: Built-in security against attack traffic
Cost Efficiency: Pay only for actual usage, no idle server costs

Tech Stack

🔧 Core Runtime

Cloudflare Workers - Runs our TypeScript/JavaScript code globally

🗄️ Storage

Cloudflare R2 - Object storage for processed train snapshots and historical data

KV Store - Low-latency distributed cache for frequently accessed data

📡 Scheduled Tasks

Cron Triggers - Process TfL data updates on regular schedules

💬 Real-time Updates

Durable Objects - Per-line state management for train tracking

🔗 Data Ingestion

TfL TrackerNet API - XML-based real-time train data (primary source)

TfL Unified API - Modern JSON API for extended coverage

Data Processing Pipeline

The system follows this processing flow:

1. Data Acquisition
Fetch fresh XML feeds from TfL's TrackerNet API for each line. These updates happen approximately every 30 seconds per station.

2. Parsing & Validation
Parse XML responses and extract train, location, and platform data. Validate required fields and reject malformed records.
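The validation half of this step might look roughly like the sketch below. It assumes the XML has already been parsed into plain objects. The field names (trainId, lineId, stationName, trackCode) and the rule that a record needs at least one location hint are illustrative assumptions, not the actual TrackerNet schema.

```typescript
// Hypothetical shape of a parsed record; field names are assumptions,
// not the actual TrackerNet schema.
interface RawTrainRecord {
  trainId?: string;
  lineId?: string;
  stationName?: string;
  trackCode?: string;
}

interface ValidTrainRecord {
  trainId: string;
  lineId: string;
  stationName?: string;
  trackCode?: string;
}

// True when the value is a non-blank string.
const hasText = (v: unknown): boolean => typeof v === "string" && v.trim() !== "";

// Assumed rule: a record needs an ID, a line, and at least one location hint.
function isValidRecord(r: RawTrainRecord): r is ValidTrainRecord {
  return (
    hasText(r.trainId) &&
    hasText(r.lineId) &&
    (hasText(r.stationName) || hasText(r.trackCode))
  );
}

// Split a batch into accepted and rejected records so rejects can be counted.
function validateBatch(records: RawTrainRecord[]) {
  const valid: ValidTrainRecord[] = [];
  const rejected: RawTrainRecord[] = [];
  for (const r of records) {
    if (isValidRecord(r)) valid.push(r);
    else rejected.push(r);
  }
  return { valid, rejected };
}
```

Keeping the rejected records, rather than silently dropping them, makes it easy to report train counts before and after filtering.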

3. Location Normalization
Standardize station names using our comprehensive mapping database. Infer locations from track codes when direct station information is unavailable or inconsistent.
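A minimal sketch of this lookup order follows: try the station name first, then fall back to the track code. Both tables and their entries are placeholders (the track code DEMO01 is invented for illustration); the real mapping database is not shown in this document.

```typescript
// Hypothetical lookup tables; entries are placeholders for illustration only.
const STATION_ALIASES = new Map<string, string>([
  ["kings cross", "King's Cross St. Pancras"],
  ["king's cross st pancras", "King's Cross St. Pancras"],
]);
const TRACK_CODE_TO_STATION = new Map<string, string>([
  ["DEMO01", "Brixton"], // invented track code
]);

// Lowercase, trim, and collapse internal whitespace before lookup.
function normalizeKey(name: string): string {
  return name.trim().toLowerCase().replace(/\s+/g, " ");
}

// Prefer the reported station name; infer from the track code otherwise.
function resolveStation(stationName?: string, trackCode?: string): string | null {
  if (stationName) {
    const byName = STATION_ALIASES.get(normalizeKey(stationName));
    if (byName) return byName;
  }
  if (trackCode) {
    const byTrack = TRACK_CODE_TO_STATION.get(trackCode.trim().toUpperCase());
    if (byTrack) return byTrack;
  }
  return null; // unknown location; the caller decides how to handle it
}
```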

4. Deduplication
Identify and merge duplicate train records within the same update and across historical data. Remove stale entries that have not moved for an extended period.
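One simple way to approximate this step is sketched below. It is a simplification: it keeps the most recently observed record per train instead of merging fields, and the 30-minute staleness threshold and the field names are assumptions.

```typescript
// Hypothetical record shape; timestamps are epoch milliseconds.
interface TrainSighting {
  trainId: string;
  location: string;
  lastMovedAt: number; // when the train last changed location
  observedAt: number; // when this record was read
}

// Assumed threshold; the source only says "extended period".
const STALE_AFTER_MS = 30 * 60 * 1000;

function dedupe(records: TrainSighting[], now: number): TrainSighting[] {
  // Keep only the newest observation for each train ID.
  const newest = new Map<string, TrainSighting>();
  for (const r of records) {
    const seen = newest.get(r.trainId);
    if (!seen || r.observedAt > seen.observedAt) newest.set(r.trainId, r);
  }
  // Drop trains that have not moved within the staleness window.
  return Array.from(newest.values()).filter(
    (r) => now - r.lastMovedAt <= STALE_AFTER_MS,
  );
}
```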

5. Enrichment
Add computed fields: destination codes, stopping patterns, historical location averages, stall detection flags, and reformation indicators.

6. Source Merging
If the Unified API is enabled for this line, combine data from both sources. Prefer TrackerNet detail where it exists, and use Unified data as a fallback or to fill in missing fields.
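The preference rule can be sketched as below. It assumes trains from both sources can be matched on a shared trainId, which may not hold in practice, and the TrainView shape is hypothetical.

```typescript
// Hypothetical merged view of a train; field names are assumptions.
interface TrainView {
  trainId: string;
  location?: string;
  destination?: string;
  source: "trackernet" | "unified" | "merged";
}

function mergeSources(trackerNet: TrainView[], unified: TrainView[]): TrainView[] {
  const byId = new Map<string, TrainView>();
  for (const t of trackerNet) byId.set(t.trainId, t);

  for (const u of unified) {
    const t = byId.get(u.trainId);
    if (!t) {
      // Fallback: the train is only known to the Unified API.
      byId.set(u.trainId, u);
      continue;
    }
    // Enhancement: TrackerNet values win; Unified only fills gaps.
    const location = t.location ?? u.location;
    const destination = t.destination ?? u.destination;
    const enhanced = location !== t.location || destination !== t.destination;
    byId.set(u.trainId, {
      ...t,
      location,
      destination,
      source: enhanced ? "merged" : t.source,
    });
  }
  return Array.from(byId.values());
}
```

Recording the source on each train makes a per-source breakdown straightforward to compute afterwards.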

7. Stall Analysis
Compare how long a train has been at its current location against the historical average for that location. Flag trains that have remained there noticeably longer than normal.
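As an illustration, the comparison could be a simple threshold check like the one below. The factor of 2 and the minimum sample count are assumptions chosen for the example; the actual thresholds are not documented here.

```typescript
// Assumed tuning values, for illustration only.
const STALL_FACTOR = 2; // flag when dwell exceeds 2x the historical average
const MIN_SAMPLES = 5; // ignore averages built from too little history

interface DwellStats {
  averageSeconds: number;
  samples: number;
}

function isStalled(
  currentDwellSeconds: number,
  stats: DwellStats | undefined,
  factor: number = STALL_FACTOR,
): boolean {
  // Without a trustworthy baseline, never flag a stall.
  if (!stats || stats.samples < MIN_SAMPLES || stats.averageSeconds <= 0) {
    return false;
  }
  return currentDwellSeconds > stats.averageSeconds * factor;
}
```

For example, with an average dwell of 60 seconds, a train is flagged once it has been at that location for more than 120 seconds.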

8. Output Generation
Format final JSON response with status, stats, and normalized train array. Compress using gzip for efficient transmission.
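A sketch of how the response envelope might be assembled is shown below. The field names inside stats and the set of status values are assumptions. Compression is left out of the sketch, since it depends on how responses are served.

```typescript
// Hypothetical envelope; only "status, stats, and train array" come from the text.
interface LineResponse<T> {
  status: "ok" | "stale" | "error";
  stats: {
    received: number; // trains seen before filtering
    returned: number; // trains left after processing
    generatedAt: string; // ISO 8601 timestamp
  };
  trains: T[];
}

function buildLineResponse<T>(
  trains: T[],
  received: number,
  generatedAt: Date,
  status: LineResponse<T>["status"] = "ok",
): LineResponse<T> {
  return {
    status,
    stats: {
      received,
      returned: trains.length,
      generatedAt: generatedAt.toISOString(),
    },
    trains,
  };
}
```

The result can be passed to JSON.stringify before it is stored or sent.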

Caching Strategy

Trackser Live uses a multi-layer caching approach:

Edge Cache: Cloudflare's CDN caches responses at edge locations with 60-second TTL
KV Cache: Distributed key-value store for frequently accessed line data
Snapshot Storage: Compressed JSON snapshots in R2 for historical queries and replay
Browser Cache: Public pages cached for 1 hour; API responses use no-store headers
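The browser-facing part of this policy can be expressed as a small helper, sketched here under the assumption that routes fall into two kinds, "api" and "page". The helper name and route kinds are illustrative. The 60-second edge TTL is configured separately and is not shown.

```typescript
// One hour, matching the browser cache lifetime stated for public pages.
const PAGE_BROWSER_TTL_SECONDS = 60 * 60;

type RouteKind = "api" | "page";

// Returns the Cache-Control header value a browser should see.
function browserCacheControl(kind: RouteKind): string {
  return kind === "api"
    ? "no-store" // API responses must not be stored by browsers
    : `public, max-age=${PAGE_BROWSER_TTL_SECONDS}`;
}
```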

Data Storage

Live Data

Current train snapshots are stored in R2 with a key pattern: live/{lineId}/{variant}/latest.json

line variant - Train data for that line only
map variant - Map-optimized data with area groupings
Each variant is available in compressed (gzip) and plain versions
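A key builder for this pattern might look like the following sketch. The lineId character check is an added assumption, intended to keep unexpected characters out of storage keys. How the compressed version is named is not stated above, so it is not modeled.

```typescript
type Variant = "line" | "map";

// Assumed constraint: line IDs are lowercase letters, digits, and hyphens.
const LINE_ID_PATTERN = /^[a-z0-9-]+$/;

// Builds the R2 key for the latest snapshot of a line and variant.
function liveKey(lineId: string, variant: Variant): string {
  if (!LINE_ID_PATTERN.test(lineId)) {
    throw new Error(`Invalid lineId: ${lineId}`);
  }
  return `live/${lineId}/${variant}/latest.json`;
}
```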

Historical Data

Train movements are archived for analysis and replay:

Pattern: replay/{lineId}/{date}.json.gz
Daily snapshots with hourly granularity
Enables time-travel queries and pattern analysis

Reformation Tracking

Train set changes are logged separately:

Pattern: reformations/{lineId}/{date}.csv
Tracks when train sets are split, joined, or reconfigured
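The replay and reformation patterns both take a date, so their keys can share one date helper, as sketched below. The YYYY-MM-DD format and the use of UTC are assumptions; the patterns above specify only {date}, and a London-local service day may be the intended boundary.

```typescript
// Assumed date format: YYYY-MM-DD in UTC.
function utcDateStamp(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// Key for a day's archived train movements.
function replayKey(lineId: string, date: Date): string {
  return `replay/${lineId}/${utcDateStamp(date)}.json.gz`;
}

// Key for a day's reformation log.
function reformationsKey(lineId: string, date: Date): string {
  return `reformations/${lineId}/${utcDateStamp(date)}.csv`;
}
```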

Monitoring & Observability

The system includes built-in monitoring:

Per-line Health Status: Available via /admin/status endpoint
Error Tracking: Last error timestamp and message per line
Data Quality Metrics: Train counts before/after processing (pre/post filtering)
Source Statistics: Breakdown of data from TrackerNet vs. Unified
Build Performance: Duration of each data processing run
Custom Logging: Integration with Cloudflare Analytics Engine for detailed insights

Performance Optimizations

Compression

All API responses use gzip compression, reducing typical response sizes by 70-80%

Parallel Processing

Line data is processed concurrently using Promise.all() for faster overall updates
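A sketch of this pattern is shown below. Because Promise.all() rejects as soon as any input rejects, the sketch catches errors per line so that one failing line does not discard the results of the others. That error handling is an assumption about the desired behavior, and processLine stands in for the real per-line pipeline.

```typescript
type LineResult<T> =
  | { lineId: string; ok: true; value: T }
  | { lineId: string; ok: false; error: string };

// Runs processLine for every line concurrently and collects per-line outcomes.
async function processAllLines<T>(
  lineIds: string[],
  processLine: (lineId: string) => Promise<T>, // hypothetical per-line pipeline
): Promise<LineResult<T>[]> {
  return Promise.all(
    lineIds.map(async (lineId): Promise<LineResult<T>> => {
      try {
        return { lineId, ok: true, value: await processLine(lineId) };
      } catch (err) {
        const error = err instanceof Error ? err.message : String(err);
        return { lineId, ok: false, error };
      }
    }),
  );
}
```

Results come back in the same order as lineIds, which keeps per-line status reporting simple.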

Smart Fallbacks

If fresh data is unavailable, serve cached snapshots to maintain availability
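The fallback can be sketched as a small wrapper. Here fetchFresh and readCached are hypothetical callbacks standing in for the TfL fetch and the snapshot read, and the stale flag is an assumed way of telling callers which path was taken.

```typescript
interface Snapshot<T> {
  data: T;
  stale: boolean; // true when served from a cached snapshot
}

// Try the fresh source first; fall back to the cached snapshot on failure.
async function freshOrCached<T>(
  fetchFresh: () => Promise<T>,
  readCached: () => Promise<T | null>,
): Promise<Snapshot<T>> {
  try {
    return { data: await fetchFresh(), stale: false };
  } catch (err) {
    const cached = await readCached();
    if (cached === null) throw err; // nothing to fall back to
    return { data: cached, stale: true };
  }
}
```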

Request Batching

Concurrent client requests for the same line are deduplicated so that they share a single processing run instead of each triggering redundant work
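One common way to get this behavior is to share a single in-flight promise per key, sketched below. The coalesce helper is illustrative. Note that module-level state in a Worker is local to one isolate, so a map like this would only deduplicate requests that reach the same isolate.

```typescript
// Work currently in progress, keyed by line (or any other request key).
const inFlight = new Map<string, Promise<unknown>>();

// Returns the existing promise for a key if one is running; otherwise starts the work.
function coalesce<T>(key: string, work: () => Promise<T>): Promise<T> {
  const existing = inFlight.get(key) as Promise<T> | undefined;
  if (existing) return existing;

  // Remove the entry once settled so later requests trigger fresh work.
  const started = work().finally(() => {
    inFlight.delete(key);
  });
  inFlight.set(key, started);
  return started;
}
```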

Connection Pooling

Reuse HTTP connections to TfL APIs to reduce handshake overhead

Security

API Key Authentication: All data endpoints require valid API keys
HTTPS Only: Enforced encryption for all requests
CORS Headers: Proper cross-origin resource sharing configuration
Rate Limiting: Per-key request throttling to prevent abuse
Input Validation: All user inputs sanitized and validated
Worker Limits: Cloudflare's built-in quotas and request inspection
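Per-key throttling can take several forms. The sketch below uses a fixed-window counter, which is an assumption; the actual algorithm and limits are not documented here. It keeps its counters in memory, so a real deployment would need shared state (for example a Durable Object) for limits to hold across isolates.

```typescript
// Fixed-window rate limiter: at most `limit` requests per key per window.
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed and counts it against the key.
  allow(apiKey: string, now: number = Date.now()): boolean {
    const current = this.windows.get(apiKey);
    if (!current || now - current.start >= this.windowMs) {
      // First request, or the previous window has expired: start a new one.
      this.windows.set(apiKey, { start: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}
```

A fixed window is simple but allows short bursts around window boundaries; sliding-window or token-bucket schemes smooth that out at the cost of more state.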

Future Considerations

As Trackser Live grows, potential improvements include:

Extended TfL network coverage (Elizabeth line, DLR, Overground)
WebSocket support for real-time streaming updates
Predictive analytics using historical patterns
Cross-line journey planning integration
Enhanced reformation detection and train composition analysis