🚀 LLaMA 4 is here—and it’s bringing back memories.
Just a week ago, LLaMA 4 was released, and it immediately reminded me of the early days when the LLaMA series set out to bridge the gap between open-source and proprietary LLMs. Although the LLaMA-4-Maverick-17B-128E-Instruct model has dropped in ranking from 2nd to 32nd (likely due to evolving benchmarks and newer versions), I’m still especially excited to experiment with LLaMA 4 Maverick.
I first noticed support via the Together.ai API, but—as usual—OpenRouter was lightning fast in integrating the model too. With its clean interface and broad model support (from DeepSeek to Gemini 2.5 Pro), OpenRouter has become my go-to for testing.
🔍 Before diving into some toy benchmarking experiments, here are a few notable LLaMA 4 advances worth highlighting:
LLaMA 4 Scout supports a 10-million-token context window, enabled by interleaved attention layers (iRoPE), in which some layers drop positional embeddings entirely.
For those curious about the underlying theory, the paper "Scalable-Softmax is Superior for Attention" is a must-read. In short, scalable-softmax helps:
Speed up pretraining convergence
Generalize better to longer contexts
Improve performance on Needle-in-a-Haystack tasks
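For intuition, here is a minimal NumPy sketch of the idea from that paper (my own simplification, not Meta's implementation): plain softmax spreads probability mass thinner and thinner as the number of attention scores grows, while Scalable-Softmax (SSMax) rescales the logits by s·log(n) to counteract exactly that dilution.

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def scalable_softmax(z, s=1.0):
    # Scalable-Softmax (SSMax) multiplies the logits by s * log(n),
    # where n is the number of scores and s is a scaling parameter
    # (learned per head in the paper; fixed here for illustration).
    # As n grows, the log(n) factor keeps the top score from being
    # diluted, which is what helps attention stay sharp at long context.
    n = len(z)
    return softmax(s * np.log(n) * z)
```

With one logit at 5.0 among 999 zeros, plain softmax spreads most of the mass across the zeros, while SSMax keeps nearly all of it on the standout score.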
🧪 Toy Benchmarking Repo
I built a small repo to evaluate how various LLMs handle LeetCode-style problems. Each script includes:
Code generation
Detailed test case outputs
Explanation logging
Basic performance statistics
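The "basic performance statistics" are nothing exotic; here is a hypothetical sketch of how per-model pass rates could be tallied (illustrative names, not the repo's actual code):

```python
def pass_rate(results):
    """Compute per-model pass rates from (model, status) pairs.

    `status` is one of "Pass", "Fail", or "Unsolved", matching the
    categories the reports use. Hypothetical helper for illustration.
    """
    totals, passes = {}, {}
    for model, status in results:
        totals[model] = totals.get(model, 0) + 1
        passes[model] = passes.get(model, 0) + (status == "Pass")
    return {m: passes[m] / totals[m] for m in totals}
```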
Key insight: most modern LLMs breeze through these tasks, achieving over 82% success in a single pass. LeetCode problems may no longer be a meaningful test for top-tier models.
🎮 Fun use-case testing: Mini Games
Both DeepSeek-V3 0324 and LLaMA 4 Maverick performed well on simple tasks like:
🏃 Endless Runner Game: I preferred DeepSeek’s version—it even changes the background color when the game ends.
🔄 Bounce Ball Game: DeepSeek followed spatial constraints better (balls inside the rotating hexagon), while LLaMA 4 sometimes placed them outside.
🌀 Mandelbrot Set Visualization: Both models produced visually accurate and smooth outputs.
💡 Overall, LLaMA 4 continues to impress with its technical innovation and usability. If you're experimenting with it too, I'd be curious to hear your thoughts!
#LLaMA4 #OpenSourceLLM #DeepSeek #OpenRouter #AI #Maverick #Benchmarking #GenerativeAI #LLM #LeetCode #Mandelbrot #Research
- Online LLM LeetCode Comparison Report - View the LLM LeetCode comparison results
- Online LLM Game Comparison Report - View the LLM game comparison results
- LeetCode Solutions Documentation - Full list of solutions organized by category
- `leetcode_questions/` - Contains all LeetCode problem solutions organized by category
- `llm_analysis_result/` - Contains HTML reports comparing different AI model solutions
- `deepseek_solutions/`, `llama4_maverick_solutions/`, `gemini_solutions/` - Solutions generated by different AI models
- `solver_scripts/` - Contains scripts for generating solutions using different AI models
- `llm_leetcode_analyzer.py` - Main analyzer script
- `serve.py` - Local web server for viewing the analysis reports
The LeetCode Solution Analyzer is a Python-based tool that:
- Analyzes Python solutions for LeetCode problems
- Generates detailed HTML reports with statistics
- Tracks solution status (Pass/Fail/Unsolved)
- Provides links to solution files
- Shows error details for failed solutions
- Supports multiple AI model solutions
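The core of such an analyzer is running each solution file and classifying the outcome; here is a minimal sketch of that step (my own simplification, not the actual `llm_leetcode_analyzer.py` code):

```python
import os
import subprocess
import sys

def classify_solution(path, timeout=10):
    """Run one solution file and return (status, error_detail).

    Status follows the report's Pass/Fail/Unsolved categories: a
    missing file is Unsolved, a clean exit is Pass, and anything else
    (non-zero exit, timeout) is Fail, with stderr captured for the
    report's error-detail section. Sketch only, for illustration.
    """
    if not os.path.exists(path):
        return "Unsolved", ""
    try:
        result = subprocess.run(
            [sys.executable, path],
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "Fail", f"timed out after {timeout}s"
    if result.returncode == 0:
        return "Pass", ""
    return "Fail", result.stderr.strip()
```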
- Run `python llm_leetcode_analyzer.py`
- Online version: GitHub Pages Report
- Local version:
  - Run `python serve.py` to start the local server
  - The browser will automatically open to the comparison report
  - Or navigate to: http://localhost:8000/llm_analysis_result/models_comparison_report.html
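A local report server like the one described needs only the standard library; here is a plausible sketch (assumptions: port 8000 and the report path in the URL above; the repo's actual `serve.py` may differ):

```python
import http.server
import socketserver
import threading
import webbrowser

REPORT = "llm_analysis_result/models_comparison_report.html"

def serve(port=8000, open_browser=True):
    """Serve the current directory and open the comparison report.

    Returns the server object so the caller can run serve_forever().
    Illustrative sketch only.
    """
    httpd = socketserver.TCPServer(
        ("", port), http.server.SimpleHTTPRequestHandler)
    if open_browser:
        # Give the server a moment to start before opening the browser.
        url = f"http://localhost:{port}/{REPORT}"
        threading.Timer(0.5, webbrowser.open, args=[url]).start()
    return httpd
```

Calling `serve()` and then `serve_forever()` on the returned object reproduces the "start server, auto-open browser" flow described above.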
- Run the solver scripts:
  - `python solver_scripts/deepseek_leetcode_solver.py`
  - `python solver_scripts/llama4_leetcode_solver.py`
  - `python solver_scripts/gemini_leetcode_solver.py`
- Automatic solution verification
- Error detection and reporting
- Creation time tracking
- Problem status classification
- Detailed error messages
- Interactive HTML interface
- Model-specific tabs
- Real-time statistics
- Solution file links
- Error detail toggles
- Responsive design
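The error-detail toggles can be as simple as HTML `<details>` elements; a hypothetical sketch of rendering one report row (illustrative names, not the repo's actual templates):

```python
import html

def report_row(problem, status, error=""):
    """Render one table row for an HTML report (illustrative only).

    Failed solutions get a collapsible <details> block, so the error
    text stays hidden until toggled, as in the reports described above.
    """
    cell = html.escape(status)
    if error:
        cell += ("<details><summary>error</summary>"
                 f"<pre>{html.escape(error)}</pre></details>")
    return f"<tr><td>{html.escape(problem)}</td><td>{cell}</td></tr>"
```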
- Python 3.x
- Standard Python libraries:
  - os
  - subprocess
  - re
  - datetime
  - webbrowser
- Third-party libraries:
  - requests
  - dotenv (the `python-dotenv` package)
- llm_leetcode_analyzer.py: Main script that analyzes solutions and generates reports
- deepseek_leetcode_solver.py: Generates solutions using DeepSeek model
- llama4_leetcode_solver.py: Generates solutions using Llama-4 Maverick model
- gemini_leetcode_solver.py: Generates solutions using Gemini 2.5 Pro model
- models_comparison_report.html: Compares solutions from all models
- deepseek_solution_report.html: DeepSeek model solutions and statistics
- llama4_maverick_solution_report.html: Llama-4 model solutions and statistics
- gemini_solution_report.html: Gemini model solutions and statistics
Feel free to submit issues and enhancement requests!
This project is open source and available under the MIT License.