How I built an interactive MCQ generator for insurance exams using Python, LangChain, and Google’s Gemini API

In the ever-evolving world of education technology, combining large language models (LLMs) with smart retrieval systems is revolutionizing how we study.
As a developer passionate about both AI and insurance, I built an AI-powered web app that automatically generates multiple-choice questions (MCQs) from insurance textbooks, specifically for AAII Level 1 and Level 2 exams.
This app is now live at aii-mcq-generator.onrender.com, helping learners practice, revise, and reinforce core insurance principles using cutting-edge language models.
📘 What Does the App Do?
The AAII MCQ Generator takes in a selected textbook (stored securely), extracts the content for a chosen chapter, and uses Google Gemini 2.0 Flash to generate 10 multiple-choice questions — complete with four answer options, the correct answer, explanations, and page-source references.
All questions are displayed in an interactive quiz format where users can test their knowledge and receive instant feedback and scoring.
This tool is designed to help insurance students, trainers, and professionals revise efficiently by turning static PDFs into dynamic, personalized assessments.
Users only need their Gemini API key and can choose from multiple subjects and chapters.
🧠 How It Works (Technically)
The core of this app lies in combining data science concepts like semantic search, embeddings, and prompt engineering into an intuitive frontend powered by Streamlit. Here’s a high-level breakdown:
- PDF Parsing and Chapter Detection
Using PyMuPDF (fitz), the app scans each insurance textbook to extract its table of contents and chapter page ranges. This allows precise targeting of chapter-specific content for revision.
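The TOC-to-page-range step can be sketched in a few lines. This is an illustrative simplification, not the app's actual code; it assumes the PDF has an embedded table of contents that PyMuPDF's `get_toc()` can read, and that each chapter ends one page before the next begins:

```python
def ranges_from_toc(toc_entries, page_count):
    """toc_entries: [(title, first_page), ...] in reading order.
    Each chapter is assumed to end one page before the next begins."""
    ranges = {}
    for i, (title, start) in enumerate(toc_entries):
        end = toc_entries[i + 1][1] - 1 if i + 1 < len(toc_entries) else page_count
        ranges[title] = (start, end)
    return ranges

def chapter_page_ranges(pdf_path):
    """Read the embedded table of contents with PyMuPDF and map
    top-level entries to (start_page, end_page) ranges."""
    import fitz  # PyMuPDF

    doc = fitz.open(pdf_path)
    toc = [(title, page) for level, title, page in doc.get_toc() if level == 1]
    try:
        return ranges_from_toc(toc, doc.page_count)
    finally:
        doc.close()
```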
- Text Chunking
The content is split into overlapping chunks using LangChain’s RecursiveCharacterTextSplitter. This ensures the text remains within the context limits of the LLM while preserving semantic meaning.
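LangChain's `RecursiveCharacterTextSplitter` prefers splitting on paragraph and sentence boundaries before falling back to raw characters; the core sliding-window idea behind overlapping chunks can be sketched like this (a simplification, not the library's actual implementation):

```python
def chunk_text(text, chunk_size=1000, overlap=200):
    """Naive overlapping chunker: each chunk shares `overlap` characters
    with the previous one so no sentence is lost at a boundary."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap matters: without it, a definition split across two chunks might never appear whole in any single piece of context handed to the model.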
- Embedding Creation
Each chunk is converted into a high-dimensional vector using Google’s embedding-001 model. These vectors represent the semantic content of each section of the textbook.
- Vector Store Indexing with Chroma
The embeddings are stored in a persistent vector database (Chroma, which indexes them with HNSW for fast approximate nearest-neighbor search). When a user selects a chapter, the app performs a semantic search to retrieve only the most relevant chunks.
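The embedding and indexing steps together might look roughly like the sketch below. It assumes the `langchain-google-genai` and `langchain-chroma` packages, and chunks represented as dicts with `text` and `page` keys (an illustrative shape, not the app's actual data model). It needs a valid API key to actually run:

```python
def build_index(chunks, api_key, persist_dir="chroma_db"):
    """Embed text chunks with Google's embedding-001 model and store
    them in a persistent Chroma collection, keeping the source page
    as metadata for later chapter filtering."""
    from langchain_google_genai import GoogleGenerativeAIEmbeddings
    from langchain_chroma import Chroma

    embeddings = GoogleGenerativeAIEmbeddings(
        model="models/embedding-001", google_api_key=api_key
    )
    # persist_directory writes the index to disk so later sessions
    # can reload it instead of re-embedding the whole textbook.
    return Chroma.from_texts(
        texts=[c["text"] for c in chunks],
        embedding=embeddings,
        metadatas=[{"page": c["page"]} for c in chunks],
        persist_directory=persist_dir,
    )
```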
- Filtering by Chapter Metadata
Results from the vectorstore are filtered by page range to ensure that only the relevant chapter content is passed to the language model for MCQ generation.
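The filtering step is simple once each chunk carries the page number recorded at indexing time. Assuming retrieved results as `(text, metadata)` pairs (an illustrative shape), it reduces to:

```python
def filter_by_chapter(results, start_page, end_page):
    """Keep only retrieved chunks whose source page falls inside the
    selected chapter's page range."""
    return [
        (text, meta) for text, meta in results
        if start_page <= meta.get("page", -1) <= end_page
    ]
```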
- Prompt Engineering and LLM Generation
A structured prompt is created and sent to the Gemini 2.0 Flash API via a REST call. The prompt instructs the model to generate 10 high-quality questions, with formatted output including answer options, explanations, and textbook page numbers.
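The call goes to the public `generateContent` REST endpoint of the Gemini API. The prompt wording below is illustrative, not the exact prompt the app uses:

```python
import requests

GEMINI_URL = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/gemini-2.0-flash:generateContent"
)

def build_prompt(chapter_text, n_questions=10):
    return (
        f"From the textbook excerpt below, write {n_questions} multiple-choice "
        "questions. For each, give options A-D, the correct answer, a short "
        "explanation, and the source page number.\n\n" + chapter_text
    )

def generate_mcqs(chapter_text, api_key):
    """POST the prompt to Gemini 2.0 Flash and return the raw text reply."""
    payload = {"contents": [{"parts": [{"text": build_prompt(chapter_text)}]}]}
    resp = requests.post(
        GEMINI_URL, params={"key": api_key}, json=payload, timeout=60
    )
    resp.raise_for_status()
    return resp.json()["candidates"][0]["content"]["parts"][0]["text"]
```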
- Question Parsing and User Interface
The model’s response is parsed using regular expressions and displayed in a clean quiz format using Streamlit. Users can select answers, submit, and immediately see their score, correct answers, explanations, and sources.
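The exact regex depends on the output layout the prompt asks for. Assuming a `Q1. … A) … Answer: X` format (a hypothetical layout for illustration), a parser might look like:

```python
import re

QUESTION_RE = re.compile(
    r"Q\d+\.\s*(?P<question>.+?)\n"
    r"A\)\s*(?P<a>.+?)\nB\)\s*(?P<b>.+?)\n"
    r"C\)\s*(?P<c>.+?)\nD\)\s*(?P<d>.+?)\n"
    r"Answer:\s*(?P<answer>[A-D])",
    re.DOTALL,
)

def parse_mcqs(raw):
    """Extract each question block from the model's reply as a dict."""
    return [m.groupdict() for m in QUESTION_RE.finditer(raw)]
```

In Streamlit, each parsed dict then maps cleanly onto an `st.radio` widget per question, with the `answer` field checked on submit.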
🌐 Deployment on Render
To make this tool publicly accessible, I deployed the app on Render.
The Streamlit frontend runs as a web service, and the app pulls textbooks from a private GitHub repository to protect copyrighted content.
The vector databases for each subject are saved on the server and reused across sessions to optimize performance.
The final deployed app is available at:
🔗 aii-mcq-generator.onrender.com
No user data is stored, and all MCQ generation runs in real time based on the Gemini API key the user provides.
🔍 Why This Matters
Many insurance certification students struggle to prepare due to the dense, technical nature of their reading materials. This project transforms that experience by giving learners the power to:
- Generate personalized quizzes instantly from their textbooks
- Receive explanations and sources for each question
- Target specific chapters instead of studying blindly
More importantly, it brings together data science, natural language processing, and user-centered design into a practical EdTech solution.
🧰 Tech Stack Overview
| Component | Tool/Service |
| --- | --- |
| Frontend | Streamlit |
| LLM API | Gemini 2.0 Flash (Google Generative AI) |
| Embedding Model | models/embedding-001 (Gemini) |
| Vector Database | Chroma (persistent, HNSW-indexed) |
| PDF Processing | PyMuPDF (fitz) |
| Hosting | Render (Web Service) |
| Storage | Private GitHub repo (textbooks, vector DBs) |
✨ What Could Be Improved (and What’s Next)
While the AAII MCQ Generator is already a powerful revision tool, there’s still plenty of room to enhance both its performance and features for a better user experience.
One current limitation is that the app may take a little time to load when you first select a subject — especially if the vector database (Chroma) for that textbook is being created or reloaded. This happens because the app processes and indexes the entire textbook on the fly if it hasn’t already been cached. While this ensures accuracy, it can slow down the experience. A few ways to improve this include:
- Pre-building and storing vector databases during deployment
- Offloading vectorization to a background worker or separate microservice
- Switching to cloud-hosted vector stores (e.g., Pinecone or MongoDB Atlas Vector Search) for faster retrieval
Beyond performance, here are several feature upgrades that can be implemented:
- Export Options: Let users download quizzes as PDFs or CSVs for offline practice or printing
- Progress Tracking: Add the ability to save user results and show performance trends over time
- LLM Flexibility: Integrate additional LLMs like Claude, GPT-4o, or Mistral to allow model comparison
- Cross-Domain Support: Extend the platform to support other industries like law, medicine, and finance — any subject with dense reading material and the need for comprehension-based assessment
The core infrastructure is already flexible and modular, making it easy to scale the app to new subject domains and new groups of learners.
💬 Try It Out & Share Feedback
You can access the live demo at
👉 https://aii-mcq-generator.onrender.com
All you need is a Gemini API key and a few seconds of patience to start generating real MCQs from real textbooks.
For collaboration, feedback, or custom solutions:
🔗 paarishaemilie.com
☕ Buy Me a Coffee
🏁 Final Thoughts
This project was a blend of everything I love: AI, data science, insurance education, and meaningful user experience. If you’re building in this space, whether for education, AI tools, or domain-specific learning, I’d love to connect.
Together, we can build a future where knowledge is more accessible, personalized, and intelligent.



