Building an AI-Powered MCQ Generator for Insurance Exams Using Gemini & LangChain

How I built an interactive MCQ generator for insurance exams using Python, LangChain, and Google’s Gemini API

In the ever-evolving world of education technology, combining large language models (LLMs) with smart retrieval systems is revolutionizing how we study. 

As a developer passionate about both AI and insurance, I built an AI-powered web app that automatically generates multiple-choice questions (MCQs) from insurance textbooks, specifically for AAII Level 1 and Level 2 exams. 

This app is now live at aii-mcq-generator.onrender.com, helping learners practice, revise, and reinforce core insurance principles using cutting-edge language models.

📘 What Does the App Do?

The AAII MCQ Generator takes in a selected textbook (stored securely), extracts the content for a chosen chapter, and uses Google Gemini 2.0 Flash to generate 10 multiple-choice questions — complete with four answer options, the correct answer, explanations, and page-source references. 

All questions are displayed in an interactive quiz format where users can test their knowledge and receive instant feedback and scoring.

This tool is designed to help insurance students, trainers, and professionals revise efficiently by turning static PDFs into dynamic, personalized assessments. 

Users only need their Gemini API key and can choose from multiple subjects and chapters.

🧠 How It Works (Technically)

The core of this app lies in combining data science concepts like semantic search, embeddings, and prompt engineering into an intuitive frontend powered by Streamlit. Here’s a high-level breakdown:

  1. PDF Parsing and Chapter Detection

Using PyMuPDF (fitz), the app scans each insurance textbook to extract its table of contents and chapter page ranges. This allows precise targeting of chapter-specific content for revision.
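The mapping from table of contents to chapter page ranges can be sketched as follows. This assumes a TOC in the shape returned by PyMuPDF's `doc.get_toc()` (a list of `[level, title, start_page]` entries); the helper name and pairing logic are illustrative, not the app's actual code.

```python
# Sketch: derive chapter page ranges from a PyMuPDF-style table of contents.
# Each TOC entry follows fitz's doc.get_toc() shape: [level, title, start_page].

def chapter_page_ranges(toc, total_pages):
    """Map each top-level chapter title to its (start, end) page range."""
    chapters = [(title, page) for level, title, page in toc if level == 1]
    ranges = {}
    for i, (title, start) in enumerate(chapters):
        # A chapter ends one page before the next chapter starts;
        # the last chapter runs to the end of the book.
        end = chapters[i + 1][1] - 1 if i + 1 < len(chapters) else total_pages
        ranges[title] = (start, end)
    return ranges

toc = [
    [1, "Chapter 1: Principles of Insurance", 5],
    [2, "1.1 Risk and Uncertainty", 6],          # sub-sections are ignored
    [1, "Chapter 2: Types of Policies", 21],
    [1, "Chapter 3: Claims Handling", 40],
]
print(chapter_page_ranges(toc, total_pages=60))
# {'Chapter 1: Principles of Insurance': (5, 20),
#  'Chapter 2: Types of Policies': (21, 39),
#  'Chapter 3: Claims Handling': (40, 60)}
```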

  2. Text Chunking

The content is split into overlapping chunks using LangChain’s RecursiveCharacterTextSplitter. This ensures the text remains within the context limits of the LLM while preserving semantic meaning.
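The effect of overlapping chunking can be illustrated with a plain character splitter. The real app uses LangChain's `RecursiveCharacterTextSplitter`, which additionally tries to break on paragraph and sentence boundaries; the sizes below are illustrative, not the app's actual settings.

```python
# Sketch: fixed-size chunking with overlap, a simplified stand-in for
# LangChain's RecursiveCharacterTextSplitter.

def split_text(text, chunk_size=1000, chunk_overlap=200):
    step = chunk_size - chunk_overlap  # how far the window advances each time
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - chunk_overlap, 1), step)]

chunks = split_text("A" * 2500, chunk_size=1000, chunk_overlap=200)
print(len(chunks), [len(c) for c in chunks])  # 3 [1000, 1000, 900]
```

The 200-character overlap means a sentence cut off at the end of one chunk reappears at the start of the next, so no passage loses its surrounding context entirely.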

  3. Embedding Creation

Each chunk is converted into a high-dimensional vector using Google’s embedding-001 model. These vectors represent the semantic content of each section of the textbook.

  4. Vector Store Indexing with Chroma

The embeddings are stored in a persistent vector database (Chroma, powered by FAISS). When a user selects a chapter, the app performs a semantic search to retrieve only the most relevant chunks.
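Mechanically, this kind of semantic search reduces to nearest-neighbor ranking over vectors. The app does this with embedding-001 vectors stored in Chroma; the toy 3-dimensional vectors below stand in for real embeddings purely to show the ranking step.

```python
import math

# Sketch: what "semantic search" over a vector store boils down to --
# rank stored chunk vectors by cosine similarity to the query vector.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

store = {
    "Premiums are payments made to keep a policy in force.": [0.9, 0.1, 0.0],
    "A deductible is the amount the insured pays before cover applies.": [0.1, 0.9, 0.1],
    "The claims process begins when a loss is reported.": [0.0, 0.2, 0.9],
}

def similarity_search(query_vec, k=2):
    ranked = sorted(store.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# A query about deductibles would embed close to the second vector:
print(similarity_search([0.2, 0.8, 0.1], k=1))
```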

  5. Filtering by Chapter Metadata

Results from the vectorstore are filtered by page range to ensure that only the relevant chapter content is passed to the language model for MCQ generation.
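The page-range filter itself is a one-liner once each retrieved chunk carries a page number in its metadata. The chunk/metadata shape below is an assumption for illustration, not taken from the app's source.

```python
# Sketch: keep only retrieved chunks whose page metadata falls inside the
# selected chapter's page range.

def filter_by_chapter(chunks, start_page, end_page):
    return [c for c in chunks if start_page <= c["metadata"]["page"] <= end_page]

retrieved = [
    {"text": "Insurable interest...", "metadata": {"page": 12}},
    {"text": "Endorsements...",       "metadata": {"page": 27}},
    {"text": "Subrogation...",        "metadata": {"page": 18}},
]

# If the chosen chapter spans pages 5-20, only the in-range chunks survive:
print([c["metadata"]["page"] for c in filter_by_chapter(retrieved, 5, 20)])  # [12, 18]
```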

  6. Prompt Engineering and LLM Generation

A structured prompt is created and sent to the Gemini 2.0 Flash API via a REST call. The prompt instructs the model to generate 10 high-quality questions, with formatted output including answer options, explanations, and textbook page numbers.
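The request can be sketched like this. The body shape (`{"contents": [{"parts": [{"text": ...}]}]}`) and the `generateContent` endpoint follow Google's public Generative Language REST API; the prompt wording is illustrative, not the app's exact prompt.

```python
import json

# Sketch: build the structured prompt and the REST payload for Gemini's
# generateContent endpoint.

def build_request(chapter_text, api_key):
    prompt = (
        "You are an insurance exam tutor. From the textbook excerpt below, "
        "generate 10 multiple-choice questions. For each question give "
        "options A-D, the correct answer, a short explanation, and the "
        "source page number.\n\nEXCERPT:\n" + chapter_text
    )
    url = (
        "https://generativelanguage.googleapis.com/v1beta/models/"
        f"gemini-2.0-flash:generateContent?key={api_key}"
    )
    payload = {"contents": [{"parts": [{"text": prompt}]}]}
    return url, json.dumps(payload)

url, body = build_request("Premiums are...", api_key="YOUR_KEY")
# The app would then POST `body` to `url` with a
# Content-Type: application/json header, e.g. via requests.post().
```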

  7. Question Parsing and User Interface

The model’s response is parsed using regular expressions and displayed in a clean quiz format using Streamlit. Users can select answers, submit, and immediately see their score, correct answers, explanations, and sources.
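Parsing one question out of the model's text can be sketched with a single regular expression. The response layout assumed here (`Q1.` / `A)`-`D)` / `Answer:` / `Explanation:`) is a guess at the format the prompt requests, not the app's exact format, and the Streamlit quiz layer is omitted.

```python
import re

# Sketch: extract one MCQ (question, four options, answer, explanation)
# from the model's text response with named regex groups.

SAMPLE = """Q1. What does a premium pay for?
A) A claim payout
B) Keeping the policy in force
C) A broker's commission
D) Reinsurance only
Answer: B
Explanation: Premiums maintain the insurance cover. (Page 12)
"""

PATTERN = re.compile(
    r"Q\d+\.\s*(?P<question>.+?)\n"
    r"A\)\s*(?P<a>.+?)\nB\)\s*(?P<b>.+?)\nC\)\s*(?P<c>.+?)\nD\)\s*(?P<d>.+?)\n"
    r"Answer:\s*(?P<answer>[A-D])\n"
    r"Explanation:\s*(?P<explanation>.+)",
    re.DOTALL,
)

mcq = PATTERN.search(SAMPLE).groupdict()
print(mcq["question"])  # What does a premium pay for?
print(mcq["answer"])    # B
```

In the real app the same idea runs over all ten questions (e.g. with `finditer`), and each parsed dict feeds one Streamlit radio-button group.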

🌐 Deployment on Render

To make this tool publicly accessible, I deployed the app on Render.

The Streamlit frontend runs as a web service, and the app pulls textbooks from a private GitHub repository to protect copyrighted content. 

The vector databases for each subject are saved on the server and reused across sessions to optimize performance.

The final deployed app is available at:

🔗 aii-mcq-generator.onrender.com

No user data is stored, and all MCQ generation runs in real time based on the Gemini API key the user provides.

🔍 Why This Matters

Many insurance certification students struggle to prepare due to the dense, technical nature of their reading materials. This project transforms that experience by giving learners the power to:

  • Generate personalized quizzes instantly from their textbooks
  • Receive explanations and sources for each question
  • Target specific chapters instead of studying blindly

More importantly, it brings together data science, natural language processing, and user-centered design into a practical EdTech solution.

🧰 Tech Stack Overview

Component         Tool/Service
Frontend          Streamlit
LLM API           Gemini 2.0 Flash (Google Generative AI)
Embedding Model   models/embedding-001 (Gemini)
Vector Database   Chroma with FAISS backend
PDF Processing    PyMuPDF (fitz)
Hosting           Render (Web Service)
Storage           Private GitHub Repo (Textbooks, Vector DB)

✨ What Could Be Improved (and What’s Next)

While the AAII MCQ Generator is already a powerful revision tool, there’s still plenty of room to enhance both its performance and features for a better user experience.

One current limitation is that the app may take a little time to load when you first select a subject — especially if the vector database (Chroma) for that textbook is being created or reloaded. This happens because the app processes and indexes the entire textbook on the fly if it hasn’t already been cached. While this ensures accuracy, it can slow down the experience. A few ways to improve this include:

  • Pre-building and storing vector databases during deployment
  • Offloading vectorization to a background worker or separate microservice
  • Switching to cloud-hosted vector stores (e.g., Pinecone or MongoDB Atlas Vector Search) for faster retrieval

Beyond performance, here are several feature upgrades that can be implemented:

  • Export Options: Let users download quizzes as PDFs or CSVs for offline practice or printing
  • Progress Tracking: Add the ability to save user results and show performance trends over time
  • LLM Flexibility: Integrate additional LLMs like Claude, GPT-4o, or Mistral to allow model comparison
  • Cross-Domain Support: Extend the platform to support other industries like law, medicine, and finance — any subject with dense reading material and the need for comprehension-based assessment

The core infrastructure is already flexible and modular, making it easy to extend the app to new subject domains and new groups of learners. 

💬 Try It Out & Share Feedback

You can access the live demo at
👉 https://aii-mcq-generator.onrender.com

All you need is a Gemini API key and a few seconds of patience to start generating real MCQs from real textbooks.

For collaboration, feedback, or custom solutions:
🔗 paarishaemilie.com
Buy Me a Coffee

🏁 Final Thoughts

This project was a blend of everything I love: AI, data science, insurance education, and meaningful user experience. If you’re building in this space, whether for education, AI tools, or domain-specific learning, I’d love to connect.

Together, we can build a future where knowledge is more accessible, personalized, and intelligent.
