theovidal/hickathon-2025

Hi!ckathon 2025 - Group 36

Submission for Hi!ckathon 2025, a competition organized by Hi! Paris, on the theme "AI & Education: From PISA data to an innovative AI solution".

It addresses the challenge "Mental health & well-being: how can we come up with innovative solutions to support students?":

Mental health issues are increasingly prevalent in schools: stress, anxiety, cyberbullying, social isolation, academic pressure, etc. A student experiencing distress may see a decline in their motivation, performance, and self-esteem. New approaches are emerging (prevention, digital support, self-assessment tools, discussion forums), but there is still much to be done to effectively support young people.

MindUp: Predicting Student Well-Being Impact on Academic Performance

Team Members:

  • Celia Chopelin (HEC) - celia.chopelin@hec.edu
  • Paola Dana Garcia (X) - paola.dana-garcia@polytechnique.edu
  • Emma Dufaure (HEC) - emma.dufaure@hec.edu
  • Théo Vidal (ENSTA) - theo.vidal@ensta.fr
  • Tom Hommola (HEC) - tom.hommola@hec.edu

Full report


📊 Project Overview

This project analyzes PISA 2022 data to predict students' mathematics scores using machine learning, with a particular focus on understanding the impact of well-being factors on academic performance. Our findings reveal that student mental health and social engagement are critical predictors of academic success, leading us to propose MindUp - a prevention-focused solution to address student well-being.

Key Achievement

  • R² Score: 51% on validation data
  • Successfully identified well-being factors as strong predictors of academic performance
  • Developed a business solution addressing a €56.4B market opportunity

🎯 The Problem

According to our analysis of PISA data:

  • 1 in 6 students experience physical stress-related symptoms (headache, stomach pain, back pain, feeling depressed, irritability, nervousness, sleep difficulties, dizziness, or anxiety)
  • 82% report not doing any extracurricular activities
  • 66% report high loneliness

Our predictive model shows that well-being factors have a strong impact on predicted test scores, which makes student well-being not only an ethical issue but also a matter of national competitiveness.


πŸ—οΈ Technical Architecture

Architecture

Dataset available on the Hi!ckathon drive: load the X_train.csv, y_train.csv, and X_test.csv files in the root folder of the project.

Data Processing 📊

  • Started with comprehensive data pre-processing and exploration
  • Removed all math-related questions from features (as they're not available during inference)
  • Transformed categorical and boolean features for proper model handling
  • Dropped questions answered by fewer than 0.1% of students to reduce noise
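
The filtering steps above can be sketched in plain Python. This is only an illustration, not the notebook's actual code: the `MATH` prefix used to spot math-related questions, the helper names, and the toy records are all assumptions; the real column names come from the PISA codebook.

```python
def response_rate(rows, column):
    """Fraction of students (rows) with a non-missing answer in `column`."""
    if not rows:
        return 0.0
    answered = sum(1 for row in rows if row.get(column) is not None)
    return answered / len(rows)


def select_features(rows, min_rate=0.001, excluded_prefixes=("MATH",)):
    """Keep columns that are not math-related and have enough answers.

    `excluded_prefixes` is a hypothetical naming rule for math questions.
    """
    columns = sorted({name for row in rows for name in row})
    kept = []
    for name in columns:
        if name.startswith(excluded_prefixes):
            continue  # math questions are not available at inference time
        if response_rate(rows, name) < min_rate:
            continue  # answered by fewer than 0.1% of students
        kept.append(name)
    return kept


# Toy records, invented for illustration only
students = [
    {"CNT": "FRA", "ST034": 3, "MATH_Q1": 1, "WB153": None},
    {"CNT": "DEU", "ST034": None, "MATH_Q1": 0, "WB153": None},
]
print(select_features(students))  # ['CNT', 'ST034']
```

Here `MATH_Q1` is dropped by the prefix rule and `WB153` is dropped because nobody answered it.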

Model Training 🔥

  • Used CatBoost - a gradient-boosted decision tree model offering:
    • Excellent handling of categorical variables (especially country)
    • Strong mix of explainability and predictive power
  • Training metric: RMSE
  • Evaluation metric: R² → 51% on validation data
  • Extensive hyperparameter grid search for optimization
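
To make the two metrics concrete, here is a minimal sketch of how RMSE and R² are computed. The scores are made up for illustration and are not taken from the dataset; the function names are ours.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, the loss minimized during training."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up math scores, for illustration only
y_true = [400.0, 450.0, 500.0, 550.0]
y_pred = [420.0, 440.0, 510.0, 530.0]

print(round(rmse(y_true, y_pred), 2))      # 15.81
print(round(r2_score(y_true, y_pred), 2))  # 0.92
```

An R² of 0 corresponds to always predicting the mean score, so an R² of 51% means the model explains roughly half of the variance in scores relative to that baseline.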

Explainability 📈

  • Feature importance analysis based on each feature's total contribution to the model's splits
  • Tree depth analysis showing decision hierarchy
  • SHAP values for robust model interpretation, fairly attributing prediction deviations to each input feature
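
The idea behind SHAP attribution can be illustrated with an exact Shapley computation on a toy model. This is a sketch of the underlying concept only: the `shap` library uses much faster tree-specific algorithms, and the `predict` function, inputs, and baseline below are invented for the example.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values by averaging marginal contributions over all
    feature orderings. Only feasible for a handful of features."""
    n = len(x)
    totals = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]            # switch feature i to its real value
            updated = predict(current)
            totals[i] += updated - previous
            previous = updated
    return [total / len(orders) for total in totals]


def predict(v):
    """Toy model: one additive term plus one interaction term."""
    return 2 * v[0] + v[1] * v[2]


x = [1, 2, 3]
baseline = [0, 0, 0]
phi = shapley_values(predict, x, baseline)
print(phi)  # [2.0, 3.0, 3.0]
```

The attributions sum to `predict(x) - predict(baseline)` (here 8), which is the sense in which the deviation from the baseline prediction is shared fairly: the interaction term is split evenly between the two features that produce it.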

πŸ” Key Findings

Top Predictive Features

Our model identified the following as the most important predictors (excluding test questions):

  1. Country (CNT) - Geographic and systemic factors
  2. Interest in scientific topics (ST095)
  3. Engagement with scientific inquiry (ST098)
  4. Home technological devices (IC001)
  5. Occupation code - Self (OCOD3)
  6. Number of books at home (ST255)
  7. Total class periods per week (ST059)
  8. Student International Grade (ST001D01T)
  9. Engagement in broad science activities (ST146)
  10. Home educational resources (ST011)

Well-Being Impact

SHAP Analysis

Well-being factors showing strong predictive impact:

  • Empathy and emotional understanding (ST311) - Highest impact
  • Emotions during last math class (WB166)
  • Communication with friends (WB160)
  • Sense of belonging at school (ST034)
  • Encouragement for creativity (ST336)
  • Experience of bullying/aggression (ST038)
  • Missing school > 3 months (MISSSC)
  • Overall appearance satisfaction (WB153)

💡 Our Solution: MindUp

MindUp App

Based on our findings, we propose MindUp - a mobile application designed to prevent mental health issues through social engagement and physical activities.

Key Features

🤝 Encourage Social Interactions

Students join group activities with friends or peers, naturally increasing social interaction, confidence, and sense of belonging.

🎁 Incentivize Through Rewards

  • Students get 3 free monthly activities
  • Completing activities earns extra class credits
  • 90% class participation requirement encourages collaboration

🎯 Personalized Recommendations

Effortless discovery of activities matching student interests, helping build routines, make friends naturally, and decrease anxiety without formal help.

🤝 Partnership Model

Partners with local clubs, student associations, gyms, and cultural venues to offer students 2-3 free sessions.


💰 Market Opportunity

Market Sizing

€56.4 Billion per year spent on mental-health-related insurance claims in Germany alone (public + private insurers)

Business Model

Prevention beats treatment!

MindUp reduces long-term insurance costs by preventing issues rather than treating them.

Revenue Streams

  1. High Schools Purchase Partnership Packages

    • Yearly or monthly fee for 3 free activity sessions per student per month
    • Students incentivized through extra credit system
    • 90% participation hurdle encourages peer collaboration
  2. Revenue Sharing with Activity Partners

    • App redirects majority of school payments to local sports clubs, cultural centers, and student associations
    • Small percentage retained by MindUp for operations and relationship management

🚀 Getting Started

Prerequisites

pip install pandas numpy scikit-learn catboost matplotlib seaborn plotly shap

Running the Analysis

  1. Data Processing and Model Training:

    # Open code.ipynb and run the cells sequentially
  2. Quick Pipeline:

    # Use condensed_pipeline.ipynb for streamlined workflow
  3. Generate Predictions:

    # The model automatically generates submission.csv

📈 Model Performance

  • Validation R² Score: 51%
  • Model Type: CatBoost Regressor
  • Key Hyperparameters:
    • Depth: 9
    • Iterations: 1000
    • Learning Rate: 0.03
    • L2 Leaf Regularization: Optimized via grid search
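
As a rough sketch, the configuration above could be expanded into grid-search candidates as follows. The fixed values are the ones listed above; the `l2_leaf_reg` candidates are hypothetical, since the values actually searched are not stated here, and `expand_grid` is our own helper.

```python
from itertools import product

# Values reported above, using CatBoost's parameter names
fixed_params = {
    "depth": 9,
    "iterations": 1000,
    "learning_rate": 0.03,
    "loss_function": "RMSE",  # training metric
    "eval_metric": "R2",      # evaluation metric
}

# Hypothetical candidates: the grid actually searched is not documented
search_space = {"l2_leaf_reg": [1, 3, 5, 10]}


def expand_grid(fixed, space):
    """Yield one full parameter dict per combination in `space`."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield {**fixed, **dict(zip(names, values))}


candidates = list(expand_grid(fixed_params, search_space))
print(len(candidates))  # 4
```

Each dictionary could then be passed to `CatBoostRegressor(**params)`, fitted on the training split, and compared on validation R².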

📚 Resources


🎓 Methodology Highlights

Feature Engineering

  • Removed biased columns (form-related, country identifiers)
  • Handled categorical features with proper encoding
  • Managed missing values strategically
  • Created interaction features between subjects

Model Selection Rationale

  • CatBoost chosen over XGBoost for:
    • Superior categorical feature handling
    • Symmetric tree structure stability
    • Built-in handling of missing values
    • Better interpretability

Explainability Techniques

  1. Feature Importance: Ranking by contribution to the model's splits
  2. Tree Visualization: Understanding decision paths
  3. SHAP Values: Individual prediction explanation
  4. Feature Interactions: Identifying synergistic effects

πŸ“ License

This project was developed for the Hi!ckathon 2025 competition.


πŸ™ Acknowledgments

  • PISA 2022 for providing comprehensive educational assessment data
  • Hi!ckathon organizers for the opportunity
  • Advisors at Hi! Paris and Capgemini for valuable feedback

For questions or collaboration opportunities, please contact any team member listed above.


"Prevention beats treatment. MindUp - For a better future, for our children."
