Building a Real-time Transit Monitoring Dashboard: A Nuremberg Transit Project

After months of dealing with unreliable transit schedules in Nuremberg, I decided to build a tool that shows how the system actually performs. Standing at bus stops wondering when my ride would arrive had become a daily frustration. The local VGN (Verkehrsverbund Großraum Nürnberg) app provides basic departure information, but I wanted to understand broader patterns in the transit system's performance.

The Motivation Behind the Project

As a regular user of Nuremberg's public transportation, I found myself asking questions that standard transit apps couldn't answer:

Which routes consistently experience delays?
How does the system perform during peak hours versus off-peak times?
Are some neighborhoods receiving better service than others?

Transit agencies collect this data but rarely present it to the public in an accessible, analytical format. My goal was to bridge this gap with a dashboard that turns real-time transit data into useful insights for riders and, potentially, transit planners.

Technical Implementation

I developed an end-to-end monitoring system with several key components:

Data Collection: I created a Python script that interfaces with the VGN API to collect real-time departure information from transit stops throughout Nuremberg.
Data Storage: I implemented a two-tier storage system using Redis as a caching layer for real-time data and PostgreSQL for storing static transit information (routes, stops, schedules) imported from GTFS feeds.
Analysis Layer: I wrote functions to calculate performance metrics like on-time percentages, average delays by route and time of day, and service frequency across different areas.
Visualization: Using Streamlit and Plotly, I built an interactive dashboard that makes these insights accessible and easy to understand.
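
To make the analysis layer more concrete, here is a minimal sketch of how on-time percentage and average delay could be computed per route and hour of day. It is not the project's actual code: the `Departure` class, its field names, and the 3-minute on-time threshold are assumptions chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Departure:
    """One observed departure (hypothetical shape, not the VGN API format)."""
    route: str          # line identifier such as "U1"
    planned: datetime   # scheduled departure time
    actual: datetime    # real-time departure time

    @property
    def delay_minutes(self) -> float:
        # Positive values are late, negative values are early.
        return (self.actual - self.planned).total_seconds() / 60


def route_metrics(departures, on_time_threshold_min=3.0):
    """Group departures by (route, planned hour) and summarize each group."""
    groups = defaultdict(list)
    for dep in departures:
        groups[(dep.route, dep.planned.hour)].append(dep.delay_minutes)

    metrics = {}
    for key, delays in groups.items():
        # Assumption: early departures count as on time.
        on_time = sum(1 for d in delays if d <= on_time_threshold_min)
        metrics[key] = {
            "count": len(delays),
            "on_time_pct": 100 * on_time / len(delays),
            "avg_delay_min": sum(delays) / len(delays),
        }
    return metrics
```

Keying the result by `(route, hour)` keeps the peak versus off-peak comparison simple: the dashboard can filter the same dictionary by hour range instead of recomputing anything.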

Technical Challenges

The project presented several interesting challenges:

The VGN API had limited documentation, requiring significant experimentation to understand the correct request formats and response structures. One particular difficulty was mapping between the GTFS stop IDs in my database and the numeric IDs required by the API.
With over 1,500 transit stops in the network, querying real-time data for all locations simultaneously wasn't feasible. I developed a sampling approach that monitors a representative subset of stops, providing meaningful insights without overwhelming the API or the dashboard.
Synchronizing the real-time data with the dashboard required careful design to maintain responsiveness. I implemented a background process that continuously updates the Redis cache, allowing the dashboard to access fresh data without waiting for API calls.
The map visualization component proved unexpectedly challenging, with several technical issues related to coordinate formatting and data structure that took considerable time to resolve.
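
The sampling idea can be sketched as a stratified random sample: group the stops by some attribute and take a fixed fraction from each group, so that small groups are not dropped entirely. This is only one way to pick a "representative" subset and may differ from what the project does. The `mode` field, the 10% fraction, and the fixed seed are assumptions.

```python
import random
from collections import defaultdict


def sample_stops(stops, fraction=0.1, key="mode", seed=42):
    """Pick a stratified random subset of stops.

    `stops` is a list of dicts and `key` names the field to stratify on
    (hypothetical; a district or route type would work the same way).
    Every group contributes at least one stop.
    """
    rng = random.Random(seed)  # fixed seed keeps the monitored set stable
    groups = defaultdict(list)
    for stop in stops:
        groups[stop[key]].append(stop)

    sample = []
    for group in sorted(groups):
        members = groups[group]
        # At least one stop per group, never more than the group holds.
        k = min(len(members), max(1, round(len(members) * fraction)))
        sample.extend(rng.sample(members, k))
    return sample
```

A fixed seed means the same stops are monitored on every run, which keeps the collected data comparable over time.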

Dashboard Features and Insights

The completed dashboard provides several valuable views:

A network overview showing all transit stops with their current status
Route analysis displaying performance metrics by route type and line number
Real-time status indicators showing on-time percentages and delay distributions
Comparative analysis between different neighborhoods and their transit service quality

What makes this project meaningful is how it turns raw transit data into actionable information. Rather than just knowing that a particular bus is five minutes late, users can see patterns, such as certain routes consistently underperforming at specific times of day.

Practical Applications

This dashboard serves both practical and analytical purposes:

For transit users, understanding system-wide performance helps set realistic expectations and make better travel decisions. Knowing which routes typically experience delays allows for better planning.
For transit planners, the insights could inform resource allocation and service improvements by highlighting consistent problem areas in the network.

Technical Architecture

The system architecture consists of five main components:

Data Fetcher: A Python script running as a background service that queries the VGN API at regular intervals.
Redis Cache: Serves as temporary storage for real-time departure data, with an organized key structure for efficient retrieval.
PostgreSQL Database: Stores the static transit information that provides context for the real-time data.
Analysis Module: Contains the logic for calculating performance metrics and generating insights.
Streamlit Dashboard: Provides the user interface for exploring the data through interactive visualizations.

This architecture keeps the dashboard responsive: it reads only from the Redis cache and the PostgreSQL database, while the data fetcher updates the real-time data in the background.
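
The fetcher-to-cache handoff might look roughly like the sketch below. It is written against any object with a redis-py style `set(name, value, ex=seconds)` method, so a real `redis.Redis` client could be passed in. The key layout `departures:<stop_id>`, the TTL, the polling interval, and the `fetch_departures` callable (a stand-in for the VGN API request) are all assumptions rather than the project's actual design.

```python
import json
import time


def departures_key(stop_id):
    # Hypothetical key layout: one key per stop.
    return f"departures:{stop_id}"


def refresh_cache(cache, fetch_departures, stop_ids, ttl_seconds=120):
    """Fetch departures for each stop and store them as JSON with a TTL.

    Returns the number of stops refreshed successfully.
    """
    refreshed = 0
    for stop_id in stop_ids:
        try:
            departures = fetch_departures(stop_id)
        except Exception:
            # On failure, leave the previous value in place until it expires.
            continue
        cache.set(departures_key(stop_id), json.dumps(departures), ex=ttl_seconds)
        refreshed += 1
    return refreshed


def run_forever(cache, fetch_departures, stop_ids, interval_seconds=60):
    """Background loop: refresh all monitored stops, then wait."""
    while True:
        refresh_cache(cache, fetch_departures, stop_ids)
        time.sleep(interval_seconds)
```

Setting an expiry on each key means stale departures disappear on their own if the fetcher stops, so the dashboard can treat a missing key as "no current data" rather than showing outdated times.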

Future Development

While the current version provides valuable insights, I'm considering several enhancements:

Implementing historical data storage to track performance trends over time
Developing predictive models to forecast delays based on factors like weather and time of day
Creating more detailed neighborhood analysis to better understand transit equity
Adding personalized monitoring options for specific routes or stops

Technologies Used: Python, Redis, PostgreSQL, Streamlit, Plotly
