
Predicting customer demand is a classic challenge — but what if we could forecast it in real time, with streaming data, automated pipelines, and a dashboard ready to visualize insights on demand? That’s exactly what I set out to build.

In this post, I’ll walk through how I built a real-time demand forecasting system from the ground up using event streaming, time series modeling, and a full-stack deployment approach. The end result: a scalable system with real-time data ingestion, daily forecasts, and a cloud-hosted dashboard.

The Goal

Build a fully automated pipeline that:

  • Simulates real-time order data
  • Stores and aggregates that data
  • Forecasts demand using Prophet
  • Visualizes it all in an interactive dashboard

Stack Overview

  • Ingestion: Kafka + Zookeeper
  • Storage: PostgreSQL (local with Docker, then Supabase in production)
  • Modeling: Prophet (time series forecasting)
  • Backend automation: Python scripts + a cron job
  • Frontend: Streamlit dashboard
  • Orchestration: Docker (local), cron (scheduling)

Local Development with Kafka & Docker

I began by building the system locally. Kafka simulated a continuous stream of customer orders, and a Kafka consumer wrote those orders to a PostgreSQL database running in Docker. This local setup helped validate the ingestion pipeline and allowed me to iterate quickly on the data model.

Orders included metadata like:

  • Region (e.g., Northeast, Midwest)
  • Product category
  • User segment
  • Timestamp and total price
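
To make the ingestion concrete, here's a rough sketch of what a producer for these events could look like. It assumes the kafka-python client, and the topic name and field values are illustrative rather than the project's exact choices:

```python
# order_producer.py -- minimal sketch of the simulated order stream (assumes kafka-python)
import json
import random
from datetime import datetime, timezone

from kafka import KafkaProducer

REGIONS = ["Northeast", "Midwest", "South", "West"]          # illustrative values
CATEGORIES = ["electronics", "apparel", "home", "grocery"]   # illustrative values
SEGMENTS = ["new", "returning", "loyal"]                     # illustrative values

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def make_order() -> dict:
    """Build one fake order event with the metadata fields listed above."""
    return {
        "region": random.choice(REGIONS),
        "product_category": random.choice(CATEGORIES),
        "user_segment": random.choice(SEGMENTS),
        "total_price": round(random.uniform(5, 250), 2),
        "created_at": datetime.now(timezone.utc).isoformat(),
    }

if __name__ == "__main__":
    for _ in range(100):                        # one batch; the real simulator streams continuously
        producer.send("orders", make_order())   # "orders" topic name is an assumption
    producer.flush()
```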

I used Docker Compose to manage Kafka, Zookeeper, and Postgres containers. The orders table grew as new events came in, forming the foundation for daily aggregations.
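
On the other side of the topic, the consumer's only job is to deserialize each event and insert it into the orders table. A minimal sketch, assuming kafka-python and psycopg2, with table and column names guessed to match the fields above:

```python
# order_consumer.py -- minimal sketch: read order events from Kafka and insert them into Postgres
import json

import psycopg2
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "orders",                                   # topic name is an assumption
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
    auto_offset_reset="earliest",
)

# Local Docker Postgres; the credentials here are placeholders
conn = psycopg2.connect("postgresql://postgres:postgres@localhost:5432/demand")
conn.autocommit = True

INSERT_SQL = """
    INSERT INTO orders (region, product_category, user_segment, total_price, created_at)
    VALUES (%s, %s, %s, %s, %s)
"""

for message in consumer:
    order = message.value
    with conn.cursor() as cur:
        cur.execute(
            INSERT_SQL,
            (
                order["region"],
                order["product_category"],
                order["user_segment"],
                order["total_price"],
                order["created_at"],
            ),
        )
```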

Transition to the Cloud with Supabase

Once the local system was stable, I moved the Postgres database to Supabase, allowing for remote access and persistence. This enabled my Streamlit dashboard, deployed on Streamlit Cloud, to query real-time data.

The backend scripts — order simulation, aggregation, and forecasting — were adapted to write to and read from Supabase’s hosted Postgres.
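
In practice the switch was mostly a matter of swapping the connection string. A sketch of the idea, with the environment variable name as an assumption:

```python
# db.py -- sketch: point the same scripts at local Docker Postgres or Supabase with one env var
import os

import psycopg2

# SUPABASE_DB_URL is a hypothetical name; Supabase exposes a standard Postgres connection string
DB_URL = os.environ.get(
    "SUPABASE_DB_URL",
    "postgresql://postgres:postgres@localhost:5432/demand",  # local fallback for development
)

def get_connection():
    """Return a psycopg2 connection to whichever Postgres DB_URL points at."""
    return psycopg2.connect(DB_URL)
```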

Aggregating & Forecasting

Each day, the pipeline (sketched below):

  • Aggregates total revenue, total orders, and average order value
  • Writes metrics to a daily_metrics table
  • Forecasts each metric using Prophet, storing results in forecast_metrics
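
Roughly, that step looks like the sketch below. It assumes pandas, SQLAlchemy, and the prophet package, and the column names and 14-day horizon are illustrative rather than the project's exact choices:

```python
# forecast.py -- sketch: aggregate orders into daily_metrics, then forecast each metric with Prophet
import os

import pandas as pd
from prophet import Prophet
from sqlalchemy import create_engine

engine = create_engine(os.environ["SUPABASE_DB_URL"])  # same hypothetical connection string as above

# 1. Aggregate raw orders into one row per day.
daily = pd.read_sql(
    """
    SELECT created_at::date AS day,
           COUNT(*)          AS total_orders,
           SUM(total_price)  AS total_revenue,
           AVG(total_price)  AS avg_order_value
    FROM orders
    GROUP BY 1
    ORDER BY 1
    """,
    engine,
)
daily.to_sql("daily_metrics", engine, if_exists="replace", index=False)  # simplification: rebuild the table each run

# 2. Fit Prophet on each metric and store the forecasts.
forecasts = []
for metric in ["total_orders", "total_revenue", "avg_order_value"]:
    df = daily.rename(columns={"day": "ds", metric: "y"})[["ds", "y"]]
    df["ds"] = pd.to_datetime(df["ds"])
    model = Prophet(weekly_seasonality=True, daily_seasonality=False)
    model.fit(df)
    future = model.make_future_dataframe(periods=14)  # 14-day horizon is an assumption
    pred = model.predict(future)[["ds", "yhat", "yhat_lower", "yhat_upper"]]
    pred["metric"] = metric
    forecasts.append(pred)

pd.concat(forecasts).to_sql("forecast_metrics", engine, if_exists="replace", index=False)
```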

I introduced variation by simulating weekends, holidays, and promotions — giving the model realistic seasonality to learn from.
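
The variation itself can be as simple as a per-day multiplier on the baseline order volume. A sketch (the dates and factors are invented):

```python
# demand_profile.py -- sketch: scale simulated order volume by weekday, holidays, and promos
from datetime import date

HOLIDAYS = {date(2024, 11, 28), date(2024, 12, 25)}  # illustrative holiday dates
PROMO_DAYS = {date(2024, 12, 1)}                     # illustrative promotion dates

def demand_multiplier(day: date) -> float:
    """Return the factor applied to the baseline number of orders for a given day."""
    factor = 1.0
    if day.weekday() >= 5:     # weekends run hotter
        factor *= 1.4
    if day in HOLIDAYS:        # holidays dampen ordering
        factor *= 0.6
    if day in PROMO_DAYS:      # promotions spike demand
        factor *= 1.8
    return factor
```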

Scheduling with Cron

To automate the system, I created a run_pipeline.sh script that:

  1. Simulates orders
  2. Aggregates daily metrics
  3. Generates forecasts

I scheduled it with cron to run at 10:30 AM each day, so the metrics and forecasts refresh without any manual intervention.
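
The script itself is only a few lines. A sketch, with the crontab entry included as a comment (paths and script names are placeholders):

```bash
#!/usr/bin/env bash
# run_pipeline.sh -- sketch: run the three stages in order (script names are assumptions)
# Scheduled with: 30 10 * * * /path/to/run_pipeline.sh >> /path/to/pipeline.log 2>&1
set -euo pipefail

python simulate_orders.py
python aggregate_daily_metrics.py
python forecast.py
```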

Interactive Dashboard

The frontend, built with Streamlit, allows users to:

  • Filter by region, product category, user segment, and hour of day
  • Choose a metric to display (orders, revenue, or AOV)
  • See historical and forecasted data in a unified chart
  • Download a CSV of the data

Promo days are highlighted with dashed lines for visual clarity.
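
The core of the app fits in a few dozen lines. A minimal sketch, assuming a recent Streamlit with st.connection configured via secrets, and table and column names matching the earlier sketches (the region, segment, and hour filters are omitted here for brevity):

```python
# dashboard.py -- sketch of the Streamlit app: metric picker plus a combined history/forecast chart
import pandas as pd
import streamlit as st

conn = st.connection("supabase_db", type="sql")  # credentials live in .streamlit/secrets.toml

metric = st.selectbox("Metric", ["total_orders", "total_revenue", "avg_order_value"])

# `metric` comes from the fixed list above, so interpolating it into SQL is safe here
history = conn.query(f"SELECT day, {metric} AS value FROM daily_metrics ORDER BY day")
forecast = conn.query(
    "SELECT ds AS day, yhat AS value FROM forecast_metrics WHERE metric = :m ORDER BY ds",
    params={"m": metric},
)

history["kind"] = "actual"
forecast["kind"] = "forecast"
combined = pd.concat([history, forecast])

st.line_chart(combined, x="day", y="value", color="kind")
st.download_button("Download CSV", combined.to_csv(index=False), file_name="demand.csv")
```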

Challenges Faced

  • Streamlit Cloud DB Connections: Needed to switch from direct psycopg2 connections to Streamlit’s st.connection API for compatibility
  • Supabase Query Limits: Queries had to be paginated to fetch full datasets, since Supabase’s API caps responses at 1,000 rows by default (see the sketch after this list)
  • Time Synchronization: Aligning time zones and timestamps between simulation, aggregation, and visualization took some fine-tuning
  • Dashboard Interactivity: Ensuring that the dashboard looked clean and responsive even as data volume grew
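
For the query-limit issue, pagination with the Supabase Python client could look roughly like this (assuming supabase-py; the page size and table name are illustrative):

```python
# fetch_all.py -- sketch: page through a Supabase table 1,000 rows at a time
import os

from supabase import create_client

client = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])

def fetch_all(table: str, page_size: int = 1000) -> list[dict]:
    """Fetch every row from `table`, working around the default 1,000-row response cap."""
    rows, start = [], 0
    while True:
        page = (
            client.table(table)
            .select("*")
            .range(start, start + page_size - 1)  # inclusive row-offset range
            .execute()
            .data
        )
        rows.extend(page)
        if len(page) < page_size:
            break
        start += page_size
    return rows

orders = fetch_all("orders")
```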

What I Learned

  • How to build real-time streaming pipelines with Kafka and Docker
  • Using Prophet for production-grade forecasting
  • Writing modular, automated pipelines in Python
  • Deploying dashboards that connect directly to cloud databases

Try It Out

Live Dashboard
GitHub Repo


This project helped me bring together data engineering, machine learning, and full-stack development into a cohesive, automated system. It’s one of the most technically complete projects I’ve built — and a strong addition to my data science portfolio.