Customer Churn Prediction
Project Type: Machine Learning | Classification
Tools Used: Python, Pandas, Matplotlib, Seaborn, Scikit-Learn, XGBoost
Dataset: Telco Customer Churn (Kaggle)
Objective
Customer churn directly impacts a company’s revenue and growth. This project builds a predictive model to identify high-risk customers, enabling proactive retention strategies.
Key Steps
- Exploratory Data Analysis (EDA): Identified key trends in customer behavior and churn rates.
- Data Preprocessing & Feature Engineering: Handled missing values, transformed categorical variables, and engineered features for better model performance.
- Model Training & Evaluation: Compared Logistic Regression, Random Forest, SVM, and XGBoost, optimizing for recall and AUC-ROC.
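The steps above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is synthetic (not the Telco dataset), the column names and the churn rule are assumptions made up for the example, and `class_weight="balanced"` is just one assumed way to favor recall. Only two of the four models are shown; SVM and XGBoost would slot into the same `models` dictionary.

```python
import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data (NOT the Telco dataset); column names are assumptions.
rng = np.random.default_rng(0)
n = 600
df = pd.DataFrame({
    "tenure": rng.integers(1, 72, size=n).astype(float),
    "MonthlyCharges": rng.uniform(20, 110, size=n),
    "InternetService": rng.choice(["DSL", "Fiber optic", "No"], size=n),
    "PaymentMethod": rng.choice(
        ["Electronic check", "Mailed check", "Credit card"], size=n
    ),
})

# Made-up churn rule so the models have some signal to learn.
logit = (
    -1.0
    - 0.04 * df["tenure"].to_numpy()
    + 0.02 * df["MonthlyCharges"].to_numpy()
    + 0.8 * (df["InternetService"] == "Fiber optic").to_numpy()
)
df["Churn"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Inject a few missing values so the imputation step has something to do.
missing_idx = df.sample(frac=0.05, random_state=0).index
df.loc[missing_idx, "MonthlyCharges"] = np.nan

numeric = ["tenure", "MonthlyCharges"]
categorical = ["InternetService", "PaymentMethod"]

# Impute and scale numeric columns; one-hot encode categorical columns.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

X_train, X_test, y_train, y_test = train_test_split(
    df[numeric + categorical], df["Churn"],
    test_size=0.25, stratify=df["Churn"], random_state=0,
)

models = {
    "Logistic Regression": LogisticRegression(
        max_iter=1000, class_weight="balanced"
    ),
    "Random Forest": RandomForestClassifier(
        n_estimators=200, class_weight="balanced", random_state=0
    ),
}

# Fit each model behind its own copy of the preprocessing and score it
# on the held-out split using recall and AUC-ROC.
results = {}
for name, model in models.items():
    clf = Pipeline([("prep", clone(preprocess)), ("model", model)])
    clf.fit(X_train, y_train)
    churn_proba = clf.predict_proba(X_test)[:, 1]
    results[name] = {
        "recall": recall_score(y_test, clf.predict(X_test)),
        "roc_auc": roc_auc_score(y_test, churn_proba),
    }

for name, scores in results.items():
    print(f"{name}: recall={scores['recall']:.3f}, AUC-ROC={scores['roc_auc']:.3f}")
```

Keeping the preprocessing inside the pipeline means the imputer, scaler, and encoder are fit on the training split only, which avoids leaking test-set statistics into the evaluation.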
Results
- Best Model: Logistic Regression (the most explainable of the four models, with strong performance)
- Key Insights: Customers with fiber optic internet and electronic check payments had higher churn rates.
- Business Impact: Predicting churn allows companies to proactively retain customers, reducing revenue loss.
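A segment-level insight like the one above can be checked with a simple group-by. The sketch below uses a tiny hand-made table purely to show the calculation; the rows are hypothetical, not results from the project, and the column names (`InternetService`, `Churn`) and the helper `churn_rate_by` are assumptions for illustration.

```python
import pandas as pd

def churn_rate_by(df, column, target="Churn"):
    """Return churn rate and customer count per category, highest rate first."""
    rates = (
        df.assign(_churned=df[target].eq("Yes"))
          .groupby(column)["_churned"]
          .agg(churn_rate="mean", customers="size")
    )
    return rates.sort_values("churn_rate", ascending=False)

# Hypothetical toy rows, made up only to demonstrate the computation.
toy = pd.DataFrame({
    "InternetService": ["Fiber optic"] * 4 + ["DSL"] * 4 + ["No"] * 2,
    "Churn": ["Yes", "Yes", "Yes", "No",
              "No", "No", "Yes", "No",
              "No", "No"],
})

rates = churn_rate_by(toy, "InternetService")
print(rates)
```

The same helper could be pointed at any categorical column, such as a payment-method column, to compare churn rates across its categories.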
🔗 View my blog post: Link
🔗 View Notebook on GitHub: GitHub Link
🔗 View Interactive Notebook (nbviewer): nbviewer Link