Building a Real-Time Demand Forecasting System
Predicting customer demand is a classic challenge — but what if we could forecast it in real time, with streaming data, automated pipelines, and a dashboard ready to visualize insights on demand? That’s exactly what I set out to build.
In this post, I’ll walk through how I built a real-time demand forecasting system from the ground up using event streaming, time series modeling, and a full-stack deployment approach. The end result: a scalable system with real-time data ingestion, daily forecasts, and a cloud-hosted dashboard.
The Goal
Build a fully automated pipeline that:
- Simulates real-time order data
- Stores and aggregates that data
- Forecasts demand using Prophet
- Visualizes it all in an interactive dashboard
Stack Overview
| Layer | Tools |
|---|---|
| Ingestion | Kafka + Zookeeper |
| Storage | PostgreSQL (local with Docker, then Supabase in production) |
| Modeling | Prophet (time series forecasting) |
| Backend Automation | Python scripts + cron job |
| Frontend | Streamlit dashboard |
| Orchestration | Docker (local), Cron (scheduling) |
Local Development with Kafka & Docker
I began by building the system locally. Kafka simulated a continuous stream of customer orders, and a Kafka consumer wrote those orders to a PostgreSQL database running in Docker. This local setup helped validate the ingestion pipeline and allowed me to iterate quickly on the data model.
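Here's a minimal sketch of the consumer side, assuming a kafka-python consumer reading an `orders` topic and a local Postgres container; the topic name, table columns, and connection settings are illustrative rather than the exact ones I used.

```python
import json

import psycopg2
from kafka import KafkaConsumer

# Consume order events from the local Kafka broker (topic name is illustrative)
consumer = KafkaConsumer(
    "orders",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="earliest",
)

# Connect to the Postgres container started by Docker Compose
conn = psycopg2.connect(
    host="localhost", port=5432, dbname="demand", user="postgres", password="postgres"
)

INSERT_SQL = """
    INSERT INTO orders (order_id, region, product_category, user_segment, order_ts, total_price)
    VALUES (%(order_id)s, %(region)s, %(product_category)s, %(user_segment)s, %(order_ts)s, %(total_price)s)
"""

for message in consumer:
    order = message.value
    with conn.cursor() as cur:
        cur.execute(INSERT_SQL, order)
    conn.commit()  # committing per event keeps the sketch simple
```

Committing per event keeps the example readable; batching inserts is the obvious next step once event volume grows.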
Orders included metadata like:
- Region (e.g., Northeast, Midwest)
- Product category
- User segment
- Timestamp and total price
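For context, each simulated order is just a small JSON payload carrying those fields. A sketch of the producer side with kafka-python; the value lists and price range below are made up for illustration:

```python
import json
import random
from datetime import datetime, timezone
from uuid import uuid4

from kafka import KafkaProducer

REGIONS = ["Northeast", "Midwest", "South", "West"]         # illustrative
CATEGORIES = ["electronics", "apparel", "grocery", "home"]  # illustrative
SEGMENTS = ["new", "returning", "loyal"]                    # illustrative

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def simulate_order() -> dict:
    """Build one fake order event with the metadata the pipeline expects."""
    return {
        "order_id": str(uuid4()),
        "region": random.choice(REGIONS),
        "product_category": random.choice(CATEGORIES),
        "user_segment": random.choice(SEGMENTS),
        "order_ts": datetime.now(timezone.utc).isoformat(),
        "total_price": round(random.uniform(5, 250), 2),
    }

producer.send("orders", simulate_order())
producer.flush()
```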
I used Docker Compose to manage Kafka, Zookeeper, and Postgres containers. The `orders` table grew as new events came in, forming the foundation for daily aggregations.
Transition to the Cloud with Supabase
Once the local system was stable, I moved the Postgres database to Supabase, allowing for remote access and persistence. This enabled my Streamlit dashboard, deployed on Streamlit Cloud, to query real-time data.
The backend scripts — order simulation, aggregation, and forecasting — were adapted to write to and read from Supabase’s hosted Postgres.
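Because Supabase exposes a standard Postgres connection string, the adaptation was mostly a matter of swapping connection details. A sketch, assuming the connection string lives in an environment variable (the variable name and the use of SQLAlchemy plus pandas here are illustrative):

```python
import os

import pandas as pd
from sqlalchemy import create_engine

# e.g. postgresql://user:password@db.<project>.supabase.co:5432/postgres
# SUPABASE_DB_URL is a hypothetical variable name for this sketch
engine = create_engine(os.environ["SUPABASE_DB_URL"])

def load_orders() -> pd.DataFrame:
    """Read raw orders back out of the hosted Postgres for aggregation."""
    return pd.read_sql("SELECT * FROM orders", engine)

def write_daily_metrics(df: pd.DataFrame) -> None:
    """Append a day's aggregated metrics to the daily_metrics table."""
    df.to_sql("daily_metrics", engine, if_exists="append", index=False)
```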
Aggregating & Forecasting
Each day, the pipeline (sketched below):
- Aggregates total revenue, total orders, and average order value
- Writes those metrics to a `daily_metrics` table
- Forecasts each metric using Prophet, storing the results in a `forecast_metrics` table
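Here's a condensed sketch of that step, assuming the raw orders are already loaded into a pandas DataFrame with the columns from the earlier sketches; the 14-day horizon and seasonality settings are illustrative choices, not the exact configuration I used:

```python
import pandas as pd
from prophet import Prophet

def aggregate_daily(orders: pd.DataFrame) -> pd.DataFrame:
    """Roll raw orders up into one row of metrics per day."""
    orders["order_date"] = pd.to_datetime(orders["order_ts"]).dt.date
    daily = orders.groupby("order_date").agg(
        total_revenue=("total_price", "sum"),
        total_orders=("order_id", "count"),
    ).reset_index()
    daily["avg_order_value"] = daily["total_revenue"] / daily["total_orders"]
    return daily

def forecast_metric(daily: pd.DataFrame, metric: str, horizon_days: int = 14) -> pd.DataFrame:
    """Fit Prophet on one metric's history and forecast the next horizon_days."""
    history = daily.rename(columns={"order_date": "ds", metric: "y"})[["ds", "y"]]
    model = Prophet(weekly_seasonality=True)
    model.fit(history)
    future = model.make_future_dataframe(periods=horizon_days)
    forecast = model.predict(future)
    # yhat plus its uncertainty bounds are what get written to forecast_metrics
    return forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]]
```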
I introduced variation by simulating weekends, holidays, and promotions — giving the model realistic seasonality to learn from.
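The variation itself boils down to scaling the simulated order volume. A toy version of the idea, with made-up multipliers and a hypothetical promo calendar:

```python
from datetime import date

# Hypothetical promo calendar for this sketch; holidays work the same way
PROMO_DAYS = {date(2024, 11, 29), date(2024, 12, 26)}

def demand_multiplier(day: date) -> float:
    """Scale the number of simulated orders to create learnable seasonality."""
    multiplier = 1.0
    if day.weekday() >= 5:    # weekends run hotter
        multiplier *= 1.4
    if day in PROMO_DAYS:     # promotions spike demand
        multiplier *= 2.0
    return multiplier

base_orders_per_day = 200  # illustrative baseline
orders_today = int(base_orders_per_day * demand_multiplier(date.today()))
```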
Scheduling with Cron
To automate the system, I created a `run_pipeline.sh` script that:
- Simulates orders
- Aggregates daily metrics
- Generates forecasts
I scheduled it with `cron` to run at 10:30 AM daily, so the data and forecasts refresh without manual intervention.
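For reference, a crontab entry for a 10:30 AM daily run looks like `30 10 * * * /path/to/run_pipeline.sh` (the script path here is a placeholder).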
Interactive Dashboard
The frontend, built with Streamlit, allows users to:
- Filter by region, product category, user segment, and hour of day
- Choose a metric to display (orders, revenue, or AOV)
- See historical and forecasted data in a unified chart
- Download a CSV of the data
Promo days are highlighted with dashed lines for visual clarity.
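A trimmed sketch of the dashboard pattern, assuming a SQL connection named `postgresql` configured in Streamlit secrets and a `daily_metrics` table that keeps a region breakdown; the widget labels and column names are simplified stand-ins for the real app:

```python
import streamlit as st

# SQL connection configured in .streamlit/secrets.toml (connection name is illustrative)
conn = st.connection("postgresql", type="sql")

# Assumes the metrics table keeps per-region rows (schema simplified for the sketch)
daily = conn.query("SELECT * FROM daily_metrics", ttl=600)

regions = st.sidebar.multiselect("Region", sorted(daily["region"].unique()))
metric = st.sidebar.selectbox("Metric", ["total_orders", "total_revenue", "avg_order_value"])

filtered = daily[daily["region"].isin(regions)] if regions else daily

st.line_chart(filtered, x="order_date", y=metric)

# Let users take the filtered data with them
st.download_button(
    "Download CSV",
    data=filtered.to_csv(index=False).encode("utf-8"),
    file_name="daily_metrics.csv",
    mime="text/csv",
)
```

The real app layers the Prophet forecast and the dashed promo-day markers onto the same chart, but the query, filter, chart, and download skeleton is the same.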
Challenges Faced
- Streamlit Cloud DB Connections: Needed to switch from direct `psycopg2` to Streamlit's new `st.connection` for compatibility
- Supabase Query Limits: Paginating queries to fetch full datasets was necessary, since Supabase defaults to 1,000-row caps (see the sketch after this list)
- Time Synchronization: Aligning time zones and timestamps between simulation, aggregation, and visualization took some fine-tuning
- Dashboard Interactivity: Keeping the dashboard clean and responsive as data volume grew
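The pagination workaround is worth showing. A sketch of the chunked fetch, assuming the Supabase Python client (one place the 1,000-row default shows up); the client setup and table name are simplified:

```python
import os

from supabase import create_client

# URL and key come from environment/secrets in the real scripts
supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])

def fetch_all_rows(table: str, page_size: int = 1000) -> list[dict]:
    """Page through a table in chunks, since each request caps out at ~1,000 rows."""
    rows, start = [], 0
    while True:
        # range() is inclusive on both ends, so each page spans page_size rows
        page = supabase.table(table).select("*").range(start, start + page_size - 1).execute()
        rows.extend(page.data)
        if len(page.data) < page_size:
            break
        start += page_size
    return rows

orders = fetch_all_rows("orders")
```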
What I Learned
- How to build real-time streaming pipelines with Kafka and Docker
- Using Prophet for automated, scheduled time series forecasting
- Writing modular, automated pipelines in Python
- Deploying dashboards that connect directly to cloud databases
Try It Out
This project helped me bring together data engineering, machine learning, and full-stack development into a cohesive, automated system. It’s one of the most technically complete projects I’ve built — and a strong addition to my data science portfolio.