A real-time AI coaching application that analyzes your fighting form using computer vision and machine learning. This application acts as a virtual coach, providing instant feedback on your technique for various strikes.
- Real-time Pose Detection: Uses MediaPipe to track 33 body landmarks in real time.
- Move Classification: Automatically identifies moves including:
  - Jab
  - Cross
  - Uppercut
  - Kick
  - Knee
  - Neutral Stance
- Instant Feedback: Provides corrective feedback on your form (e.g., "Extend arm more", "Keep elbow bent", "Rotate hips") based on biomechanical analysis.
- Voice Control: Hands-free operation with voice commands:
  - "Start Coach" to begin a session.
  - "End Coach" to stop and review.
- Session Recording: Automatically records your session and provides a replay with synchronized feedback logs.
- Visual Overlay: Displays feedback and status directly on the video feed.
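As an illustration of the kind of biomechanical check behind feedback such as "Extend arm more", here is a minimal sketch. It is not the project's actual rule set: the function names, the 150-degree threshold, and the "Good extension" message are assumptions made for this example. The landmark indices follow MediaPipe Pose's 33-landmark layout (12 = right shoulder, 14 = right elbow, 16 = right wrist).

```python
import math

# MediaPipe Pose landmark indices for the right arm.
RIGHT_SHOULDER, RIGHT_ELBOW, RIGHT_WRIST = 12, 14, 16


def joint_angle(a, b, c):
    """Return the angle in degrees at point b between segments b->a and b->c (2D)."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        return 0.0  # degenerate case: two landmarks coincide
    cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def arm_extension_feedback(landmarks, min_extension_deg=150.0):
    """Hypothetical rule: flag a punch whose elbow angle is below the threshold.

    `landmarks` is a sequence of 33 (x, y) pairs; the threshold is an
    illustrative assumption, not a value taken from the project.
    """
    angle = joint_angle(
        landmarks[RIGHT_SHOULDER], landmarks[RIGHT_ELBOW], landmarks[RIGHT_WRIST]
    )
    return "Extend arm more" if angle < min_extension_deg else "Good extension"
```

A real rule set would likely depend on the predicted move and on which arm is striking; this sketch checks only the right arm.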
- Clone the repository:
  git clone <repository-url>
  cd <repository-directory>
- Install dependencies:
  pip install -r requirements.txt
  Note: Ensure you have flask, pandas, scikit-learn, joblib, opencv-python, and mediapipe installed.
- Start the application:
  python app.py
- Open the interface: Open your web browser and navigate to http://localhost:5001.
- Start Training:
  - Allow camera and microphone permissions when prompted.
  - Stand back so your full body is visible in the camera frame.
  - Click the Start Coach button or say "Start Coach".
- Receive Feedback:
  - Perform strikes (jabs, crosses, etc.).
  - Read the real-time feedback on the screen.
  - Listen for audio cues (if enabled).
- Review Session:
  - Click End Coach or say "End Coach".
  - A modal appears with a video replay of your session and a detailed log of all feedback received.
- Backend: Python, Flask
- Computer Vision: MediaPipe, OpenCV
- Machine Learning: Scikit-learn (Random Forest Classifier), Pandas, NumPy
- Frontend: HTML5, CSS3, JavaScript (Canvas API, MediaRecorder API, Web Speech API)
The main entry point for the Flask application (the app.py script run at startup). It handles:
- Routing: Defines endpoints for the web interface (/) and API routes for session management (/start_session, /end_session) and pose processing (/process_pose).
- Session Management: Manages active user sessions and stores temporary pose data using a sliding window approach.
- Data Flow: Receives landmarks from the frontend, forwards them to the backend logic, and returns real-time feedback.
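The sliding window approach to session data can be sketched as follows. This is a minimal illustration using only the standard library, not the project's actual code: the `SessionStore` class, its method names, and the window size of 30 frames are assumptions. In a Flask app, handlers for /start_session, /process_pose, and /end_session could plausibly call `start`, `add_frame`, and `end` respectively.

```python
from collections import deque

WINDOW_SIZE = 30  # assumed number of recent frames kept per session


class SessionStore:
    """Hypothetical in-memory store keeping a sliding window of pose frames per session."""

    def __init__(self, window_size=WINDOW_SIZE):
        self.window_size = window_size
        self._sessions = {}

    def start(self, session_id):
        # A deque with maxlen silently drops the oldest frame once it is full.
        self._sessions[session_id] = deque(maxlen=self.window_size)

    def add_frame(self, session_id, landmarks):
        """Append one frame of landmarks and return the current window as a list."""
        window = self._sessions[session_id]
        window.append(landmarks)
        return list(window)

    def end(self, session_id):
        """Remove the session and return whatever frames were still in its window."""
        return list(self._sessions.pop(session_id, []))
```

Using `deque(maxlen=...)` keeps memory bounded per session without any explicit trimming logic.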
Contains the core logic for the application, including:
- Feature Extraction: Converts raw MediaPipe landmarks into biomechanical features (angles, velocities, relative positions).
- Motion Analysis: Implements the build_features_from_window function to create a feature vector from a sequence of frames.
- Feedback Generation: The generate_feedback function analyzes the predicted move and specific feature values to produce actionable, biomechanically accurate advice (e.g., correcting arm extension or guard position).
- Model Loading: Loads the pre-trained Random Forest classifier and MediaPipe configuration.
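To make the idea of building a feature vector from a window of frames concrete, here is a small sketch. It does not reproduce the real build_features_from_window; the function names, the choice of features (elbow-angle statistics and wrist speed), and the 30 fps default are all assumptions for illustration.

```python
import math


def wrist_speeds(wrist_positions, fps=30.0):
    """Speed between consecutive wrist positions, in coordinate units per second."""
    dt = 1.0 / fps
    return [
        math.dist(p, q) / dt
        for p, q in zip(wrist_positions, wrist_positions[1:])
    ]


def window_features(elbow_angles, wrist_positions, fps=30.0):
    """Summarize a window of frames as a fixed-length feature vector.

    Hypothetical layout: [mean angle, max angle, min angle, peak speed, mean speed].
    """
    if not elbow_angles:
        raise ValueError("window must contain at least one frame")
    # A single-frame window has no motion, so fall back to zero speed.
    speeds = wrist_speeds(wrist_positions, fps) or [0.0]
    return [
        sum(elbow_angles) / len(elbow_angles),
        max(elbow_angles),
        min(elbow_angles),
        max(speeds),
        sum(speeds) / len(speeds),
    ]
```

A fixed-length vector like this is the kind of input a scikit-learn classifier expects, regardless of how many frames the window holds.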
The frontend user interface, built with HTML5, CSS3, and JavaScript. Key components include:
- Video Processing: Uses the MediaPipe Pose JS solution to detect landmarks directly in the browser.
- Canvas Rendering: Draws the video feed, landmarks, and overlay feedback on an HTML5 Canvas.
- Session Control: Manages the recording state, handles the "Start/End Coach" buttons, and captures video for replay using the MediaRecorder API.
- Voice Recognition: Integrates the Web Speech API for hands-free voice commands.
A directory containing the machine learning assets:
- Random Forest Model: The trained sklearn model used to classify martial arts moves.
- Pose Landmarker: The MediaPipe model file used for pose detection when it runs in backend mode (the frontend uses the JS version).