Rail break prediction

Machine Learning

Tech Stack

Databricks
Machine Learning
Agile
Project management

Description

This project focuses on building a data-driven machine learning pipeline to predict rail break risk from historical operational and condition data. I worked end-to-end across the analytics workflow: cleaning and validating raw datasets, designing informative features, training and tuning predictive models, and evaluating performance under a competitive leaderboard setting.

The key challenge was transforming noisy, real-world rail data into stable signals for prediction. Through iterative experimentation, systematic hyperparameter tuning, and careful evaluation, the final solution achieved strong leaderboard performance (Rank #2 in the recorded snapshot), demonstrating both model effectiveness and disciplined ML engineering practice.

  • Built an end-to-end ML pipeline: data cleaning, preprocessing, feature engineering, training, and evaluation
  • Experimented with and compared multiple models (baseline → advanced), selecting the best performer via metric-driven benchmarking
  • Evaluated performance with Accuracy, F1-score, and AUC-PR, emphasizing robustness to class imbalance
  • Applied explainability analysis with SHAP and DiCE to interpret feature importance and support model validation
  • Served as Scrum Master for one sprint
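
The choice of metrics in the list above can be illustrated with a small sketch. The labels, scores, and function names below are hypothetical and are not taken from the project data or code; AUC-PR is approximated here as average precision, assuming no tied scores. The point is only to show why accuracy alone can mislead when rail breaks are rare.

```python
from typing import List

def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def average_precision(y_true: List[int], scores: List[float]) -> float:
    """Approximate AUC-PR: mean precision at the rank of each positive.

    Assumes scores contain no ties.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        return 0.0
    # Rank samples from highest to lowest predicted risk.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / n_pos

# Hypothetical data: 10 track segments, only 2 with a break (label 1).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.10, 0.20, 0.05, 0.30, 0.15, 0.60, 0.25, 0.12, 0.90, 0.40]

baseline_pred = [0] * len(y_true)                  # always predicts "no break"
model_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5

print("baseline accuracy:", accuracy(y_true, baseline_pred))
print("baseline F1:      ", f1_score(y_true, baseline_pred))
print("model accuracy:   ", accuracy(y_true, model_pred))
print("model F1:         ", f1_score(y_true, model_pred))
print("model AUC-PR (AP):", average_precision(y_true, scores))
```

In this toy example the always-negative baseline and the thresholded model reach the same accuracy, while F1 and average precision separate them. That is the motivation for reporting F1 and AUC-PR alongside accuracy on imbalanced data.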

Page Info

Rank

/projects/rail/logo2.png

Submit History

/projects/rail/complete.png

    Zihan Luo - Software Engineer