Learn Retrieval-Augmented Generation, vector search, embeddings, AI agents, function calling, evaluation, monitoring, hybrid search, reranking, and more - all in a free, open-source, hands-on course by DataTalks.Club.
⭐ Star this repo to stay updated with new modules and cohort announcements
| Resource | Link |
|---|---|
| 📁 Course materials | GitHub repository |
| 🎥 Video lectures | YouTube playlist |
| 📅 Cohort schedule & deadlines | courses.datatalks.club |
| 💬 Slack community | #course-llm-zoomcamp |
| 📣 Announcements | Telegram |
| 🏆 2025 cohort projects | courses.datatalks.club/llm-zoomcamp-2025/projects |
LLM Zoomcamp teaches you how to build practical, production-ready LLM applications step by step.
This course is for people who learn by doing. After completing it, you'll have a working codebase and the hands-on experience to build your own LLM-powered applications.
- Software Engineers: Add LLMs, RAG, and modern search capabilities to real products
- Data Engineers: Understand how vector search, hybrid search, and retrieval pipelines fit into production systems
- ML Practitioners: Get a structured way to evaluate and monitor LLM-based applications
- Python: You can write code confidently
- Command Line: Comfortable with the terminal
- Docker: Basic familiarity
- ML / LLMs: Not required
- Hardware: Any laptop or PC. No GPU needed
- Expenses: ~$1-5 in API credits
Note
If you can write a Python function and have heard of ChatGPT, you have enough to get started.
There are two ways to follow the course: live and self-paced.
| | Live Cohort | Self-Paced |
|---|---|---|
| Start | June 8, 2026, 17:00 CET | Anytime |
| Lectures | Pre-recorded | Pre-recorded |
| Homework | Graded | Available but not scored |
| Leaderboard | ✅ Yes | ❌ No |
| Peer Review | ✅ Yes | ❌ No |
| Certificate | ✅ Yes | ❌ No |
| Cost | Free | Free |
| Register | Sign up here | Just start learning! |
Important
"Live cohort" does not mean live classes. All lectures are pre-recorded. "Live" means working with others, having deadlines, getting your homework and project scored, reviewing your peers' projects, and getting a certificate at the end.
Self-paced steps:
- Follow the materials on GitHub
- Ask questions and share progress in Slack
- Do the homework (self-checked) and build a project for your portfolio
- 1. Introduction to LLMs & RAG. Build a basic RAG pipeline with text search
- 2. Vector Search. Index and retrieve documents using semantic embeddings
- 3. Agents. Add autonomous tool use and function calling to RAG
- Workshop - Data Ingestion. Ingest data with dlt from external sources into your RAG system
- 4. Evaluation. Measure retrieval and answer quality with offline and online evaluation
- 5. Monitoring. Monitor user feedback and system health with live dashboards
- 6. Best Practices. Use LangChain and hybrid search: combine vector and keyword search, then rerank results for higher precision
- 7. End-to-End Project. A complete project example: a fitness assistant built with LLMs
- Capstone Project. Ship a complete end-to-end project of your choice from scratch
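To give a feel for the hybrid search idea in module 6, here is a minimal sketch of one common way to merge a keyword ranking and a vector ranking: reciprocal rank fusion (RRF). This is an illustration only, not necessarily the method used in the course materials. The function name, the document IDs, and the constant `k=60` are assumptions made for this example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both searches rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID to keep output deterministic
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword search and a vector search
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Because RRF only looks at ranks, it needs no score normalization between the two search backends. A reranker, if you use one, would then reorder the top of this fused list.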
Recommended approach:
- Watch the video for each module
- Complete the homework to reinforce the concepts
- Build your capstone project applying everything end-to-end
The capstone is your chance to apply everything end-to-end. You'll build a complete, working RAG application that you own.
What you'll build:
- A searchable knowledge base. Choose a dataset, ingest, clean, and store it for retrieval
- A retrieval pipeline. Implement the full RAG flow: retrieve context, assemble prompts, call an LLM, return grounded answers
- An evaluation process. Measure how well your system retrieves and answers using search metrics or LLM-as-a-Judge
- A user-facing interface. A simple UI or API (Streamlit, FastAPI, or similar) so others can try your app
- Monitoring & feedback loops. Track queries, feedback, and performance over time
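The retrieval and evaluation items above can be sketched in a few lines. The example below is a toy illustration under stated assumptions: the documents are made up, retrieval is plain word overlap rather than a real search engine, and `fake_llm` is a hypothetical stand-in for an actual LLM API call. It shows the shape of the flow (retrieve context, assemble a prompt, call an LLM) and a simple hit-rate metric for retrieval.

```python
import re

# Hypothetical knowledge base; a real project would ingest its own dataset
DOCS = [
    {"id": "d1", "text": "You can join the course after the start date."},
    {"id": "d2", "text": "Certificates require completing the final project."},
    {"id": "d3", "text": "No GPU is needed to follow the course."},
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, docs, top_k=2):
    """Toy retrieval: rank documents by the number of words shared with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(d["text"])), d) for d in docs]
    scored = [pair for pair in scored if pair[0] > 0]  # drop non-matching docs
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query, context_docs):
    """Assemble a prompt that asks the model to answer from the retrieved context."""
    context = "\n".join(f"- {d['text']}" for d in context_docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


def fake_llm(prompt):
    """Stand-in for a real LLM call (assumption: replace with your provider's client)."""
    return f"[stub answer based on a prompt of {len(prompt)} characters]"


def rag(query):
    """Full flow: retrieve context, build the prompt, call the LLM."""
    hits = retrieve(query, DOCS)
    return fake_llm(build_prompt(query, hits)), hits


def hit_rate(examples):
    """Fraction of (query, expected_id) pairs whose expected document is retrieved."""
    found = sum(
        any(d["id"] == expected_id for d in retrieve(query, DOCS))
        for query, expected_id in examples
    )
    return found / len(examples)


answer, hits = rag("Do I need a GPU?")
print(answer)
print([d["id"] for d in hits])
print(hit_rate([("Do I need a GPU?", "d3"), ("Can I join late?", "d1")]))
```

In a real capstone you would swap the word-overlap retrieval for a text or vector search index and `fake_llm` for an API client, while keeping the same three-step structure so the evaluation code keeps working.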
- Fitness & nutrition assistant
- Study companion for textbooks or course notes
- Medical FAQ assistant
- Codebase Q&A bot
- News summarization and retrieval tool
Note
See the full capstone project guidelines and browse all 2025 and 2024 cohort submissions for inspiration.
To earn your certificate:
- Complete the final project. Build a real-world RAG application demonstrating all course concepts
- Peer review 3 projects. Evaluate and provide written feedback on three fellow students' submissions
- Meet the deadlines. Submit your project and reviews within the cohort schedule
Certificates are issued after all peer reviews are completed. Self-paced learners are not eligible for certification but can build portfolio projects freely.
- Alexey Grigorev (Founder, DataTalks.Club): Creator of multiple open-source ML courses reaching tens of thousands of learners worldwide. Former principal data scientist with deep expertise in ML systems and engineering.
- Timur Kamaliev (Senior Data Scientist): AI Engineer specializing in building production LLM systems, RAG pipelines, and agentic applications. Hands-on practitioner with real-world experience shipping GenAI products.
A huge thanks to our sponsors for making this course possible!
Tip
Interested in supporting the DataTalks.Club community? Reach out to alexey@datatalks.club.
"This course gave me hands-on experience in building LLM-powered applications, including prompt engineering, retrieval-augmented generation (RAG), pipeline orchestration, and vector search optimization."
— Alexander Daniel Rios, LLM Zoomcamp Graduate
"Not gonna lie - this course took longer than planned. By the end, I was running on fumes, forcing myself to push through the final modules. But I made it. What I loved: hands-on experience building real AI systems (not just theory!), deep dives into RAG, vector databases, evaluation, and monitoring, and the wealth of production-ready practices that matter in enterprise environments."
— Vasiliy Chernykh, LLM Zoomcamp Graduate
Read more testimonials from past graduates →
Join the #course-llm-zoomcamp channel on DataTalks.Club Slack for discussions, troubleshooting, and networking with fellow learners and the course team.
To keep discussions useful for everyone:
- Follow our posting guidelines when asking questions
- Review the community guidelines
We actively encourage sharing your progress online throughout the course. Post what you're building on LinkedIn, Twitter/X, or a blog. It helps you get noticed and connect with others in the field. It also earns you bonus points toward your homework and project scores.
Full FAQ: datatalks.club/faq/llm-zoomcamp.html
Q: Is this course really free?
A: Yes. All videos, materials, and homework are free. You may spend $1-5 in OpenAI API credits if you run the code yourself.
Q: Do I need a GPU?
A: No. All exercises are designed to run on a standard laptop using cloud APIs.
Q: What does "live cohort" mean? Are there live classes?
A: No mandatory live classes. "Live" means homework deadlines, automatic scoring, a leaderboard, peer review, and certificate eligibility are all enabled. All lectures are pre-recorded.
Q: Can I join after the cohort has started?
A: Yes. You can join after the start date, but deadlines remain fixed. Some homework forms may already be closed.
Q: Can I take the course self-paced after a cohort ends?
A: Yes. All materials stay available after each cohort ends. Self-paced learners are always welcome, though certificates require a live cohort.
Q: Will I get a certificate?
A: Yes. Complete the final project and peer review 3 students' projects during the live cohort to earn your certificate. Self-paced mode does not include certification.
Q: Do I need to complete every homework to get a certificate?
A: No. You only need to complete the final project and peer reviews to get it.
Q: What if I get stuck?
A: Discuss your problem in #course-llm-zoomcamp on Slack. The community and instructors are active there. Also check the FAQ page for detailed answers.
Q: How much time should I expect to spend?
A: Expect roughly 5-10 hours per week, depending on your background and how deep you go into the materials.
Found a bug in the course materials? Know how to improve an explanation or fix broken code? Contributions are welcome and appreciated.
- Fork the repository
- Make your fix or improvement
- Open a pull request with a clear description
Every contribution helps future learners. Thank you 🙏
DataTalks.Club is a global online community of data enthusiasts — a place to learn, share knowledge, ask questions, and support each other through free courses, events, and an active Slack community.
Website • Slack • Newsletter • Events • Google Calendar • YouTube • GitHub • LinkedIn • Twitter
Note
Most activity happens on Slack. Join us there for updates, discussions, and community events. Learn more at DataTalksClub Community Navigation.