✨ QuizGen — LLM-Powered MCQ Generator

Turn any document into ready-to-use multiple-choice quizzes with Llama 3.1, LangChain, and a two-stage generate-and-review pipeline.

Upload a PDF or TXT, pick a subject + difficulty, get a clean CSV of quiz questions back —
each one generated AND reviewed by Llama 3.1 before it reaches you.

🌟 What Makes This Project Different

LangChain MCQ generators are everywhere. This one earns its place through four engineering choices:

🔗 Two-stage SequentialChain — generation and review are separate LLM passes; the second pass critiques the first and fixes unsuitable questions before the user ever sees them
📐 Structured output enforcement — a strict JSON schema (RESPONSE_JSON) is injected into the prompt, so the LLM output can usually be parsed directly as JSON instead of being cleaned up with regexes
📦 Production packaging — proper src/mcqgenerator/ Python package with setup.py, logging module, and utility separation — not a notebook dump
🎨 Polished UI — custom Streamlit theme with gradient hero, card-based MCQ display, and color-coded answer reveals — not the default look

🗺️ The Two-Stage Pipeline

┌──────────────────────────────────────────────┐
│       User uploads PDF / TXT + settings      │
│   (subject, difficulty, number of MCQs)      │
└─────────────────────┬────────────────────────┘
                      │
                      ▼
        ┌──────────────────────────┐
        │   Stage 1 — Quiz Chain   │  Llama 3.1 generates N MCQs
        │   PromptTemplate +       │  with strict JSON schema
        │   LLMChain               │  Output → quiz (JSON string)
        └─────────────┬────────────┘
                      │
                      ▼
        ┌──────────────────────────┐
        │  Stage 2 — Review Chain  │  Llama 3.1 critiques the quiz
        │  Same LLM, new prompt    │  for difficulty + suitability
        │  Rewrites unsuitable Qs  │  Output → review + corrections
        └─────────────┬────────────┘
                      │
                      ▼
        ┌──────────────────────────┐
        │   JSON parsing → table   │  Structured to: MCQ | Choices | Correct
        └─────────────┬────────────┘
                      │
                      ▼
        ┌──────────────────────────┐
        │   Streamlit UI + CSV     │  Preview + one-click download
        └──────────────────────────┘

The SequentialChain orchestrates both stages — Stage 1's output feeds Stage 2 as input, all in one call from the Streamlit app.

💬 QuizGen in Action

🎨 Clean, intuitive interface

Upload a document, configure subject/difficulty, hit generate. That's it.

📋 Beautiful card-based quiz display

Each MCQ renders as its own card with a numbered badge, choices, and a color-coded correct-answer reveal.

🔄 Multiple questions, consistent quality

The two-stage pipeline ensures every question gets reviewed before display.

🔍 AI Quality Review + CSV Export

After generation, the AI's own quality review is shown alongside a one-click CSV export.

📊 Example Output

Input: sample_ml_fundamentals.txt (a 6 KB article on machine learning basics)
Settings: Subject = Machine Learning, Difficulty = Medium, Number = 6

A few of the generated questions:

#	Question	Correct
1	What type of machine learning uses labeled data to learn from examples?	c (Supervised learning)
3	What is the term for a model that learns the training data too well and performs poorly on new data?	b (Overfitting)
6	What is the term for the phenomenon where a model's performance degrades over time as the world changes?	a (Concept drift)

Plus an AI quality review: "Moderate difficulty. No changes needed."

🔬 How It Works

Stage 1 — Quiz Generation Prompt

The LLM is given the source text plus a strict instruction to return only valid JSON matching a schema:

TEMPLATE_QUIZ = """
{system_msg}
Context: {text}
Your task is to write exactly {number} multiple-choice questions based on the
above content. The questions should be appropriate for {subject} students and
written in a {difficulty} difficulty.
Return ONLY a JSON object matching the format shown in RESPONSE_JSON below.
Do not include any extra explanation.

### RESPONSE_JSON
{response_json}
"""

The RESPONSE_JSON (loaded from Response.json) gives the model a concrete schema to mimic:

{
  "1": {
    "mcq": "multiple choice question",
    "options": {"a": "choice", "b": "choice", "c": "choice", "d": "choice"},
    "correct": "correct answer"
  }
}
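As a rough illustration of how the placeholders and the schema fit together, the sketch below fills the template with plain `str.format`. The helper name `build_quiz_prompt` and the sample `system_msg` are assumptions made for this example; the project itself wires the template through LangChain's `PromptTemplate`, which performs a comparable substitution.

```python
import json

# Template text copied from this README; placeholder names match it.
TEMPLATE_QUIZ = """
{system_msg}
Context: {text}
Your task is to write exactly {number} multiple-choice questions based on the
above content. The questions should be appropriate for {subject} students and
written in a {difficulty} difficulty.
Return ONLY a JSON object matching the format shown in RESPONSE_JSON below.
Do not include any extra explanation.

### RESPONSE_JSON
{response_json}
"""

# Schema example mirroring the Response.json content shown above.
RESPONSE_JSON = {
    "1": {
        "mcq": "multiple choice question",
        "options": {"a": "choice", "b": "choice", "c": "choice", "d": "choice"},
        "correct": "correct answer",
    }
}


def build_quiz_prompt(text, number, subject, difficulty, system_msg):
    """Fill the Stage 1 template (hypothetical helper, not part of the repo)."""
    return TEMPLATE_QUIZ.format(
        system_msg=system_msg,
        text=text,
        number=number,
        subject=subject,
        difficulty=difficulty,
        # The schema is serialized so the model sees the literal JSON shape.
        response_json=json.dumps(RESPONSE_JSON, indent=2),
    )


prompt = build_quiz_prompt(
    text="Supervised learning uses labeled data.",
    number=2,
    subject="Machine Learning",
    difficulty="Medium",
    system_msg="You are an expert MCQ writer.",  # assumed wording
)
print(prompt)
```

Because the schema is passed in as a value rather than written into the template, its curly braces do not clash with the placeholder syntax.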

This pattern, showing the model the exact output shape, tends to be more reliable than describing the format in prose, because the model only has to copy a structure rather than interpret a description of one.
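Once the model returns JSON in that shape, turning it into the MCQ | Choices | Correct table is mostly a matter of walking the dictionary. The function below is a minimal sketch of that step, not the repo's `get_table_data`: the name `quiz_to_rows`, the numeric-key assumption, and the `" | "` separator are all choices made for this illustration.

```python
import json


def quiz_to_rows(quiz_json):
    """Convert the model's JSON string into flat rows (hypothetical helper)."""
    try:
        quiz = json.loads(quiz_json)
    except json.JSONDecodeError as exc:
        # Even with a schema in the prompt, the model can still emit bad JSON.
        raise ValueError(f"Model output is not valid JSON: {exc}") from exc

    rows = []
    # Assumes question keys are numeric strings ("1", "2", ...), as in the schema.
    for key in sorted(quiz, key=int):
        item = quiz[key]
        choices = " | ".join(
            f"{label}: {text}" for label, text in item["options"].items()
        )
        rows.append(
            {"MCQ": item["mcq"], "Choices": choices, "Correct": item["correct"]}
        )
    return rows


sample = (
    '{"1": {"mcq": "Which learning type uses labeled data?", '
    '"options": {"a": "Unsupervised", "b": "Reinforcement", '
    '"c": "Supervised", "d": "None"}, "correct": "c"}}'
)
rows = quiz_to_rows(sample)
print(rows[0]["Choices"])
```

Raising a `ValueError` on malformed output gives the UI a single failure case to handle, for example by asking the user to regenerate.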

Stage 2 — Review Chain

The generated quiz is fed back into the LLM with a new prompt:

TEMPLATE_REVIEW = """
{system_msg}
Below is a quiz for {subject} students. Review its difficulty in no more
than 50 words. If any question is not suitable, rewrite only the problem
parts in a suitable difficulty.

Quiz:
{quiz}
"""

This second pass is intended to catch questions that are too easy, too obscure, or off-topic, which makes it a useful safety net for educational content.

SequentialChain Orchestration

Both chains are wired together so a single call produces both outputs:

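from langchain.chains import SequentialChain

# quiz_chain and review_chain are the Stage 1 and Stage 2 chains described above.
combined_chain = SequentialChain(
    chains=[quiz_chain, review_chain],
    # Values supplied by the caller; "quiz" is produced by Stage 1 and read by Stage 2.
    input_variables=["system_msg", "text", "number", "subject",
                     "difficulty", "response_json"],
    output_variables=["quiz", "review"],
    verbose=True,  # log each stage as it runs
)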
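To make the data flow concrete without requiring LangChain or an API key, here is a small plain-Python emulation of the same idea: Stage 1's output is stored under `quiz` and then substituted into the Stage 2 prompt. Everything in it (`run_two_stage`, `fake_llm`, the shortened templates) is a stand-in invented for this sketch, not code from the repo.

```python
# Shortened stand-ins for the two prompt templates (assumed wording).
QUIZ_TEMPLATE = (
    "QUIZ for {subject} ({difficulty}), {number} questions.\nContext: {text}"
)
REVIEW_TEMPLATE = "REVIEW this quiz for {subject} students.\nQuiz:\n{quiz}"


def run_two_stage(llm, inputs):
    """Run generate-then-review with any callable mapping prompt -> text."""
    state = dict(inputs)
    # Stage 1: generate the quiz from the user's inputs.
    state["quiz"] = llm(QUIZ_TEMPLATE.format(**state))
    # Stage 2: the quiz produced above becomes an input to the review prompt.
    state["review"] = llm(REVIEW_TEMPLATE.format(**state))
    return {"quiz": state["quiz"], "review": state["review"]}


seen_prompts = []


def fake_llm(prompt):
    """Canned responses so the sketch runs offline; a real LLM call goes here."""
    seen_prompts.append(prompt)
    if prompt.startswith("REVIEW"):
        return "Moderate difficulty. No changes needed."
    return (
        '{"1": {"mcq": "What is overfitting?", '
        '"options": {"a": "w", "b": "x", "c": "y", "d": "z"}, "correct": "a"}}'
    )


result = run_two_stage(
    fake_llm,
    {
        "text": "Overfitting means memorizing the training data.",
        "number": 1,
        "subject": "Machine Learning",
        "difficulty": "Medium",
    },
)
print(result["review"])
```

Keeping the LLM behind a plain callable also makes the pipeline easy to unit-test with canned responses like the ones above.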

📁 Repository Structure

quizgen/
├── src/
│   └── mcqgenerator/
│       ├── __init__.py
│       ├── MCQgenerator.py    # SequentialChain definition
│       ├── utils.py           # read_file, get_table_data
│       └── logger.py          # Logging setup
├── docs/
│   └── images/                # Screenshots for this README
├── app.py                     # Streamlit web interface
├── mcq.ipynb                  # Pipeline development notebook
├── test.py                    # Logging test
├── data.txt                   # Sample input
├── Response.json              # JSON schema template
├── quiz.csv                   # Sample output
├── requirements.txt
├── setup.py
└── .env                       # HUGGING_FACE_API_KEY (git-ignored)

⚙️ Tech Stack

Layer	Technology
LLM	Llama 3.1 8B Instruct (via HuggingFace Inference API, auto-provider routing)
Orchestration	LangChain `LLMChain` + `SequentialChain` + `PromptTemplate`
Web UI	Streamlit (custom CSS theme)
PDF Parsing	PyPDF2
Data Handling	Pandas
Env Management	python-dotenv
Packaging	setuptools
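For the PDF and TXT inputs, a reader along the following lines would be enough to turn an upload into prompt text. This is a hedged sketch rather than the repo's `read_file`: the function name `read_upload` is hypothetical, and the PDF branch assumes a PyPDF2 release that provides `PdfReader` (2.x or later).

```python
import io


def read_upload(name, data):
    """Return the text of an uploaded file given its name and raw bytes."""
    lower = name.lower()
    if lower.endswith(".txt"):
        return data.decode("utf-8")
    if lower.endswith(".pdf"):
        # Imported lazily so TXT-only use does not require PyPDF2.
        from PyPDF2 import PdfReader

        reader = PdfReader(io.BytesIO(data))
        # extract_text() may yield nothing for image-only pages; treat as empty.
        return "\n".join(page.extract_text() or "" for page in reader.pages)
    raise ValueError(f"Unsupported file type: {name}")


text = read_upload("notes.txt", "Supervised learning uses labels.".encode("utf-8"))
print(text)
```

Rejecting unknown extensions up front keeps unsupported uploads from reaching the LLM as garbled text.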

🚀 Getting Started

1. Clone the repo

git clone https://github.com/houdhoudGH/quizgen.git
cd quizgen

2. Create virtual environment

python -m venv .venv
source .venv/bin/activate      # Windows: .venv\Scripts\activate
pip install -r requirements.txt

3. Set up your API key

Create a .env file:

HUGGING_FACE_API_KEY=your-hf-token

Get a free token at huggingface.co/settings/tokens.
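A small fail-fast check like the one below can make a missing token obvious at startup instead of at the first API call. It is only a sketch: `get_hf_token` is a hypothetical helper, and it assumes the variable has already been loaded into the environment (for example by python-dotenv's `load_dotenv()`).

```python
import os


def get_hf_token(env=None):
    """Return the HuggingFace token or raise a clear error if it is missing."""
    # Accept any mapping so the function can be tested without touching os.environ.
    env = os.environ if env is None else env
    token = env.get("HUGGING_FACE_API_KEY", "").strip()
    if not token:
        raise RuntimeError(
            "HUGGING_FACE_API_KEY is not set; add it to .env or export it."
        )
    return token


# Example with an explicit mapping instead of the real environment.
token = get_hf_token({"HUGGING_FACE_API_KEY": " hf_example "})
print(token)
```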

4. Run the app

streamlit run app.py

Open http://localhost:8501 🚀

5. Use it

Upload a PDF or TXT file
Pick subject, difficulty (Simple/Medium/Hard), and number of MCQs
Click Generate MCQs
Review the generated quiz on screen
Download the result as CSV
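The CSV in the last step can be produced from the parsed rows in a few lines. The project lists Pandas for data handling; the sketch below swaps in the standard-library `csv` module so it runs without extra dependencies, and the function name `rows_to_csv` and the column names are assumptions based on the table layout described earlier.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize quiz rows to CSV text (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["MCQ", "Choices", "Correct"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    {
        "MCQ": "Which learning type uses labeled data?",
        "Choices": "a: Unsupervised | b: Reinforcement | c: Supervised | d: None",
        "Correct": "c",
    }
]
csv_text = rows_to_csv(rows)
print(csv_text)
```

The returned string could then be handed to a download widget such as Streamlit's `st.download_button`.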

🔮 Roadmap

The app ships as a working two-stage LLM pipeline. Six directions for production hardening:

Containerization — Dockerfile + docker-compose.yml for one-command deployment
CI/CD — GitHub Actions workflow for automated testing and linting
Question variety — extend beyond MCQs to true/false, fill-in-the-blank, and short-answer
Interactive mode — let users actually take the quiz in-app, with scoring
Multilingual — generate quizzes in Arabic, French, English from the same source
Export formats — Anki deck, Kahoot CSV, Google Forms import

📄 License

MIT — see LICENSE for details.

🎓 About This Project

QuizGen explores multi-stage LLM orchestration — using a generate-then-critique pipeline to produce educational content more reliable than a single-shot prompt could deliver.

Made with 💜 by Gheffari Nour El Houda

Master 2 Data Science & NLP · AI Engineer

LangChain · Llama 3.1 · HuggingFace · Streamlit

If you found this useful, consider giving the repo a ⭐
