This project provides a Java-based backend application leveraging Retrieval-Augmented Generation (RAG) to transform the way students engage with lecture materials. Using advanced AI models, the application assists students in understanding topics and preparing for exams by generating exam-ready content from lecture slides.
The system enables students to upload lecture slides in PDF format, query information directly from these slides using AI, and receive autogenerated exam preparation materials such as multiple-choice questions (MCQs) and short-answer questions. By harnessing RAG, it retrieves the most relevant slide sections for student queries, offering clear and contextually appropriate responses.
- Programming Language: Java 17
- Framework: Spring Boot, Spring Data JPA, Thymeleaf
- Database: MySQL and an in-memory vector DB
- Security: OAuth2 for secure login and logout functionality
- AI Model: Open-source models served through Ollama (Llama-3.2, and mixbreed for embeddings)
- Libraries:
  - Apache PDFBox for handling PDFs
  - Langchain4j (Java library for RAG operations)
- User Interface: Thymeleaf and Swagger UI for API documentation and exploration
- Java Development Kit (JDK) 17 or later
- MySQL
- Maven for dependency management
- Clone the Repository:

      git clone https://github.com/Siyamuddin/RAG_and_Langchain4j
      cd RAG_and_Langchain4j

- Set Up MySQL Database:
  - Create a new MySQL database.
  - Update database configurations in application.properties.
- Build the Application:

      mvn clean install

- Run the Application:

      mvn spring-boot:run

- Access UI: Navigate to http://localhost:8080 to explore the features.
Pull the images from Docker Hub:
docker pull uddin17/ai_buddy:latest
docker pull mysql:latest
--AI Model Setup--
AI_API_KEY=Use your AI model API key
AI_BASE_URL=Use your AI model's base URL
AI_MODEL_NAME=Use the name of your AI model
--Mail Service Setup--
MAIL_HOST=smtp.gmail.com
MAIL_PASSWORD=Use your generated password
MAIL_PORT=Use your port
MAIL_USERNAME=Use your email address
--Auth Service Setup--
OAUTH2.ID=Use your OAuth ID
OAUTH2.SECRETE=Use your OAuth secret
OAUTH_REDIRECT_URI=Use your redirect URL
--Database Setup--
SPRING_DATASOURCE_PASSWORD=Password
SPRING_DATASOURCE_URL=Use your DB URL
SPRING_DATASOURCE_USERNAME=Username
SPRING_JPA_HIBERNATE_DDL_AUTO=update
--API Call Setup for Frontend (Thymeleaf)--
UPDATE=http://localhost:8080/api/v1/user/update
UPLOAD_SLIDE=http://localhost:8080/api/v1/slide/upload/user
DELETE=http://localhost:8080/api/v1/user/delete
DELETE_REDIRECT=http://localhost:8080/login
DELETE_SLIDE=http://localhost:8080/api/v1/slide/delete
GET_ALL=http://localhost:8080/api/v1/slide/get/all/user
GMCQ=http://localhost:8080/api/v1/slide/generate/mcq
GS=http://localhost:8080/api/v1/slide/generate/summary
GSQ=http://localhost:8080/api/v1/slide/generate/sq
- Create Classes: Students can create and manage classes within the application.
- Upload Materials: Lecture slides can be uploaded in PDF format for AI-driven analysis.
- Text Extraction & Segmentation: Uploaded PDFs are converted to text and segmented into manageable sections.
- Vector Embeddings: Text segments are transformed into vector embeddings using Langchain4j, stored in an in-memory embedding store for efficient retrieval.
- Retrieval-Augmented Generation (RAG): Upon a student's query, the system retrieves the most relevant lecture sections and provides AI-generated responses, helping students understand complex topics.
- Automatic Summarization: Summarizes uploaded lecture content automatically.
- MCQ and Short Answer Generation: Generates potential MCQs and short-answer questions from slide content to aid in exam preparation.
- OAuth2-Based Security: Login and logout go through Google OAuth2, which manages user sessions and prevents unauthorized access.
- Interactive Documentation: Swagger UI enables developers and users to explore API endpoints, execute requests, and view real-time responses within a user-friendly interface.
- Spring Boot: Facilitates REST API development, handling core logic, routing, authentication, and backend services.
- Langchain4j for RAG: Splits text into segments, converts them to embeddings, retrieves the segments most relevant to a query, and generates responses using the Llama-3.2 model.
- MySQL: Stores user data, class info, lecture materials, and AI-generated content.
- OAuth2: Secures API access by managing user authentication and authorization.
- Swagger UI: Provides an interactive interface to test and understand the API endpoints.
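The segment, embed, and retrieve steps listed above can be illustrated with a small dependency-free sketch. This is not the project's code and does not use the Langchain4j API: the class `RagSketch`, its method names, and the window sizes are hypothetical, and the hashed bag-of-words `embed` method is only a toy stand-in for a real embedding model served by Ollama. It shows the general idea of splitting text into overlapping segments and ranking them against a query by cosine similarity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; names and parameters are assumptions.
class RagSketch {

    // Split text into windows of at most maxWords words; each window
    // shares `overlap` words with the previous one.
    static List<String> segment(String text, int maxWords, int overlap) {
        if (maxWords <= 0 || overlap < 0 || overlap >= maxWords) {
            throw new IllegalArgumentException("need 0 <= overlap < maxWords");
        }
        List<String> segments = new ArrayList<>();
        if (text.isBlank()) {
            return segments;
        }
        String[] words = text.trim().split("\\s+");
        int step = maxWords - overlap;
        for (int start = 0; start < words.length; start += step) {
            int end = Math.min(start + maxWords, words.length);
            segments.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            if (end == words.length) {
                break; // last window reached the end of the text
            }
        }
        return segments;
    }

    // Toy embedding (assumption): count words into `dim` hash buckets.
    // A real system would call an embedding model here instead.
    static double[] embed(String text, int dim) {
        double[] vector = new double[dim];
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                vector[Math.floorMod(word.hashCode(), dim)] += 1.0;
            }
        }
        return vector;
    }

    // Cosine similarity of two equal-length vectors; 0 if either is all zeros.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return (normA == 0 || normB == 0) ? 0 : dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Return the k segments most similar to the query, best first.
    static List<String> retrieve(String query, List<String> segments, int k, int dim) {
        double[] queryVector = embed(query, dim);
        List<String> ranked = new ArrayList<>(segments);
        ranked.sort(Comparator.comparingDouble(
                (String s) -> cosine(queryVector, embed(s, dim))).reversed());
        return new ArrayList<>(ranked.subList(0, Math.min(k, ranked.size())));
    }

    public static void main(String[] args) {
        String slideText = "Gradient descent minimizes a loss function by "
                + "stepping against the gradient. Plants use sunlight to "
                + "produce sugar during photosynthesis.";
        List<String> segments = segment(slideText, 10, 2);
        for (String hit : retrieve("How does gradient descent work?", segments, 1, 64)) {
            System.out.println(hit);
        }
    }
}
```

In a full RAG flow, the retrieved segments would be placed into the prompt sent to the language model together with the student's question; that step is omitted here because it depends on the model client in use.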
Contributions are welcome! Feel free to open issues, submit pull requests, or suggest features.