TGV Predictor
ML Train Delay Prediction
Year
2023
Type of Project
Side Project
My Role
Full-Stack Developer
Case Study
Objective
Build a machine learning model to predict TGV train punctuality rates using 7 years of SNCF open data. The goal was to demonstrate end-to-end ML skills: data exploration, feature engineering, model training, and deployment as a web application.
Process
I started with exploratory analysis of more than 13,000 records from SNCF Open Data covering 130 train routes. I engineered features from temporal and route data, then trained a Random Forest model that reached a mean absolute error (MAE) of 3.58% on punctuality rate. Finally, I built a bilingual Streamlit interface with interactive Plotly visualizations and deployed it to Streamlit Cloud.
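The feature-engineering and training steps can be illustrated with a minimal sketch. This is not the project's actual code: the data is synthetic placeholder data (not SNCF records), and the column names (route_id, date, punctuality), the cyclical month encoding, and the hyperparameters are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split


def make_synthetic_data(n_routes=10, n_months=84, seed=0):
    """Generate placeholder monthly punctuality records (NOT real SNCF data)."""
    rng = np.random.default_rng(seed)
    months = pd.date_range("2016-01-01", periods=n_months, freq="MS")
    rows = []
    for route_id in range(n_routes):
        base = rng.uniform(80, 95)  # hypothetical per-route baseline, in percent
        for date in months:
            seasonal = 3 * np.cos(2 * np.pi * (date.month - 1) / 12)
            rate = np.clip(base + seasonal + rng.normal(0, 2), 0, 100)
            rows.append({"route_id": route_id, "date": date, "punctuality": rate})
    return pd.DataFrame(rows)


def add_features(df):
    """Derive temporal features from the date column (assumed feature set)."""
    out = df.copy()
    out["year"] = out["date"].dt.year
    out["month"] = out["date"].dt.month
    # Cyclical encoding so that December and January end up close together.
    out["month_sin"] = np.sin(2 * np.pi * out["month"] / 12)
    out["month_cos"] = np.cos(2 * np.pi * out["month"] / 12)
    return out


def train_and_evaluate(df, seed=0):
    """Fit a Random Forest and return the model with its held-out MAE."""
    features = ["route_id", "year", "month_sin", "month_cos"]
    data = add_features(df)
    X_train, X_test, y_train, y_test = train_test_split(
        data[features], data["punctuality"], test_size=0.2, random_state=seed
    )
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)
    mae = mean_absolute_error(y_test, model.predict(X_test))
    return model, mae


if __name__ == "__main__":
    model, mae = train_and_evaluate(make_synthetic_data())
    print(f"Held-out MAE: {mae:.2f} percentage points")
```

Two simplifications are worth noting. The random train/test split ignores time order, so a chronological split would give a more honest estimate for forecasting. The route is passed as a plain integer, which tree models tolerate but which implies an ordering that real route names do not have.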
Outcome
Delivered a fully functional prediction tool that identifies punctuality patterns across routes and seasons. The project strengthened my skills in Python, Scikit-learn, and data visualization while building a concrete portfolio piece for ML engineering roles.