Data Science • Project

Telecom QoE Analytics

End-to-End Analytics Pipeline from EDA to Strategic Insights

XGBoost · LightGBM · SHAP · Isolation Forest · STL Decomposition · Optuna · Pandas · Seaborn
2024 - 2025
Python 3.11
Data Science Portfolio

Overview

Telecom QoE Analytics is a comprehensive data science practice project built on a synthetic telecom digital-twin dataset. It demonstrates end-to-end analytics capability, from raw data profiling and rigorous statistical testing to advanced machine learning modeling and strategic troubleshooting, focused on improving Quality of Experience (QoE) in telecommunications networks.

The project implements a six-phase analytics pipeline: (1) Data Profiling & EDA, (2) Statistical Analysis & Causal Inference, (3) ML Regression for QoE Prediction, (4) ML Classification for Degradation Prediction, (5) Unsupervised Learning & Anomaly Detection, (6) Executive Summary & Strategic Insights. All phases prioritize interpretability and actionability over theoretical complexity.

Key findings: cell congestion has a massive effect size on QoE (Cohen's d = -2.12), far outweighing other metrics; XGBoost achieved strong R² performance (0.7247) for QoE prediction; LightGBM achieved a high ROC-AUC (0.9645) for degradation classification with excellent recall (0.92); and anomalies cluster around the 5 PM busy hour, suggesting a peak-load correlation.

Key Achievement: Built a comprehensive six-phase analytics pipeline with statistical rigor, SHAP interpretability, and high-performance ML models, demonstrating end-to-end data science methodology.

Key Metrics & Results

  • 6-Phase Analytics Pipeline
  • Statistical Rigor
  • SHAP Interpretability
  • XGBoost & LightGBM Models

Problem Statement

Telecom operators need to understand the drivers of Quality of Experience (QoE) degradation in order to prioritize network investments. Traditional analytics surface correlations but lack causal inference; ML models need interpretability for field engineers; and anomaly detection must balance recall against precision for SLA compliance.

Business Context

Network operations require actionable insights, not just model predictions. Field engineers need to understand why cells are degraded (feature importance). False negatives (missing outages) are more costly than false positives (false alarms). Strategic recommendations must translate technical findings to business value.

Solution Architecture

A six-phase structured pipeline: (1) Data Profiling with schema validation and QoE distribution analysis, (2) Statistical Analysis with ANOVA and effect size (Cohen's d), (3) ML Regression using XGBoost with Optuna tuning, (4) ML Classification using LightGBM with class imbalance handling, (5) Unsupervised Learning with STL decomposition and Isolation Forest, (6) Executive Summary translating findings to strategic recommendations.

System Components

Statistical Rigor Phase

Hypothesis testing (ANOVA) confirms QoE differences between network segments, and effect size analysis (Cohen's d) quantifies impact magnitude, identifying congestion as the primary driver (d=-2.12).
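As a minimal sketch of this phase (on synthetic stand-in samples, since the real dataset isn't reproduced here), the ANOVA test and Cohen's d computation look like:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Hypothetical QoE scores for congested vs. non-congested cells;
# the means/spreads are illustrative, not the project's actual data.
qoe_normal = rng.normal(4.0, 0.5, 500)
qoe_congested = rng.normal(2.9, 0.5, 500)

# One-way ANOVA: does mean QoE differ between the two segments?
f_stat, p_value = stats.f_oneway(qoe_normal, qoe_congested)

def cohens_d(a, b):
    """Effect size: mean difference scaled by the pooled standard deviation."""
    pooled_sd = np.sqrt(
        ((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1))
        / (len(a) + len(b) - 2)
    )
    return (a.mean() - b.mean()) / pooled_sd

d = cohens_d(qoe_congested, qoe_normal)  # negative: congestion lowers QoE
print(f"ANOVA p={p_value:.2e}, Cohen's d={d:.2f}")
```

By Cohen's conventions, |d| ≥ 0.8 is already a "large" effect, so a d near -2 dwarfs typical drivers.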

ML Regression Pipeline

XGBoost Regressor tuned with Optuna. Achieved R²=0.7247, MAE=0.3672, RMSE=0.4560 on test set. Feature importance identifies latency and congestion as top predictors.

ML Classification Pipeline

LightGBM classifier with class-imbalance handling. Achieved ROC-AUC=0.9645 with precision=0.46 and recall=0.92 for the minority 'Low QoE' class; high recall enables proactive intervention. Serves as the engine for a Customer Experience Management (CEM) dashboard.

Anomaly Detection System

STL Decomposition for trend/seasonality removal, followed by Isolation Forest. Successfully isolated anomalies (~5% of data) clustering around 5 PM busy hour.

SHAP Interpretability

Game-theoretic feature attribution showing that congestion, not just signal strength, is the primary QoE driver. Provides explainability for business stakeholders.

Technology Stack Rationale

XGBoost and LightGBM were chosen over deep learning for their strength on tabular data and superior interpretability. SHAP provides consistent feature attribution, unlike biased gain-based importance metrics. Isolation Forest handles multivariate anomalies, STL decomposition removes temporal patterns, and Optuna automates hyperparameter tuning.


Challenges & Solutions

Challenge 1

Moving beyond correlation to causal inference

Solution

Implemented ANOVA hypothesis testing and effect size analysis (Cohen's d). Quantified impact magnitude: congestion has d=-2.12, far outweighing other factors, strong evidence that congestion is the primary driver rather than a mere correlate.

Challenge 2

Ensuring ML model interpretability for business stakeholders

Solution

Adopted SHAP for feature attribution, which provides game-theoretic consistency guarantees. Showed that congestion, not just signal strength, is the primary QoE driver, directly informing the backhaul expansion recommendation.

Challenge 3

Handling class imbalance in degradation prediction

Solution

Used LightGBM with class_weight parameter. Tuned threshold to maximize recall (sensitivity) for minority 'Low QoE' class. Achieved strong ROC-AUC performance with high recall.
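The threshold-tuning step can be sketched with scikit-learn's precision-recall curve, here on hypothetical predicted probabilities: pick the highest threshold that still meets a recall floor for the minority class:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Hypothetical true labels and predicted 'Low QoE' probabilities,
# standing in for the classifier's test-set output.
rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, 1000)
y_proba = np.clip(y_true * 0.6 + rng.normal(0.3, 0.2, 1000), 0, 1)

precision, recall, thresholds = precision_recall_curve(y_true, y_proba)

# recall is non-increasing as the threshold rises; take the last (highest)
# threshold whose recall still meets the floor, maximizing precision given
# that missed outages cost more than false alarms.
target_recall = 0.90
ok = np.where(recall[:-1] >= target_recall)[0]
best_threshold = thresholds[ok[-1]]
achieved_recall = recall[ok[-1]]
print(f"threshold={best_threshold:.2f}, recall={achieved_recall:.2f}")
```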

Challenge 4

Defining 'normal' in highly dynamic networks

Solution

Used STL decomposition to remove trend/seasonality from time-series. Applied Isolation Forest on residuals. Successfully isolated anomalies (~5%) clustering around 5 PM busy hour.

Results & Impact

Identified congestion as primary QoE driver (effect size d=-2.12). Achieved strong ML performance on synthetic dataset: XGBoost R²=0.7247, LightGBM ROC-AUC=0.9645. Generated strategic recommendations: prioritize backhaul expansion, optimize latency, deploy proactive alerts. Models demonstrate capability for CEM dashboard deployment.

Model Performance

  • XGBoost Regression: R²=0.7247, MAE=0.3672, RMSE=0.4560 on test set
  • LightGBM Classification: ROC-AUC=0.9645 with precision=0.46, recall=0.92 for Low QoE
  • Anomaly Detection: Successfully isolated ~5% anomalies clustering at 5 PM
  • Effect Size Analysis: Congestion has d=-2.12 (massive impact)
  • SHAP Interpretability: Proved congestion is primary driver, not signal strength

