Building BoardGameGPT: An AI Assistant for Rule Queries
Sometimes, the most frustrating part of board game night isn’t losing – it’s pausing the game to hunt through the rulebook for a specific clarification. What if you could just ask the rulebook your question? BoardGameGPT (a working title!) was my exploration into solving that problem. It’s a system that takes board game rules text and lets users ask natural language questions, returning answers grounded directly in the provided rules. It’s built entirely in Python using the Google Gemini API within a Kaggle Notebook.
The Goal
Build a functional question-answering system for board game rules with:
- Ability to process provided rulebook text.
- Natural language query interface.
- Answers generated by AI but grounded in the specific rules using Retrieval-Augmented Generation (RAG).
- Demonstration of key Generative AI capabilities such as Embeddings, Vector Search, and RAG.
Stack Overview
| Layer | Tools |
| --- | --- |
| Language/Environment | Python 3 in Kaggle Notebooks |
| Core AI | Google Gemini API (`gemini-2.0-flash`) |
| GenAI SDK | `google-genai` (v1.7.0 pattern) |
| Embeddings | Google `text-embedding-004` via SDK |
| Data Handling | Pandas, NumPy |
| Vector Search (Simulation) | Scikit-learn (`cosine_similarity`) |
| API Interaction | `google-genai` Client, `google-api-core` (Retry) |
The RAG Approach: Grounding AI Answers
Instead of letting the Large Language Model (LLM) answer questions from its general knowledge (where it might hallucinate or invent rules), I implemented a Retrieval-Augmented Generation (RAG) pipeline:
- Ingest & Chunk: Load the rulebook text and split it into smaller, manageable chunks (initially by paragraph).
- Embed: Convert each text chunk into a numerical vector (embedding) using Google’s `text-embedding-004` model. These vectors capture the semantic meaning of the rule snippet.
- Retrieve: When a user asks a question (e.g., “How does scoring work?”), embed the question using the same model. Then compare the question’s embedding to all the chunk embeddings using cosine similarity to find the most relevant rule snippets from the text.
- Generate: Construct a new prompt containing the original question and the most relevant retrieved rule snippets. Send this combined prompt to the Gemini generative model (`gemini-2.0-flash`) with clear instructions to answer the question based only on the provided snippets.
This RAG process ensures the answers are directly tied to the source material, making the assistant much more reliable for specific rule clarifications.
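To make the pipeline concrete, here is a condensed sketch of how those four steps fit together with the `google-genai` SDK and scikit-learn. It is a minimal illustration rather than the notebook verbatim: it assumes a `rulebook_text` string is already loaded, and the helper names (`retrieve`, `answer`, `top_k`) and the exact prompt wording are mine.

```python
from google import genai
from google.genai import types
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

client = genai.Client(api_key="YOUR_API_KEY")  # in the notebook the key comes from Kaggle Secrets
EMBED_MODEL = "text-embedding-004"

# 1. Ingest & Chunk: naive paragraph split on blank lines.
chunks = [c.strip() for c in rulebook_text.split("\n\n") if c.strip()]

# 2. Embed: one vector per chunk, using the document-side task type.
#    (A large rulebook would need this call batched.)
doc_response = client.models.embed_content(
    model=EMBED_MODEL,
    contents=chunks,
    config=types.EmbedContentConfig(task_type="RETRIEVAL_DOCUMENT"),
)
doc_vectors = np.array([e.values for e in doc_response.embeddings])

def retrieve(question: str, top_k: int = 3) -> list[str]:
    """3. Retrieve: embed the question and rank chunks by cosine similarity."""
    q_response = client.models.embed_content(
        model=EMBED_MODEL,
        contents=question,
        config=types.EmbedContentConfig(task_type="RETRIEVAL_QUERY"),
    )
    q_vector = np.array(q_response.embeddings[0].values).reshape(1, -1)
    scores = cosine_similarity(q_vector, doc_vectors)[0]
    return [chunks[i] for i in np.argsort(scores)[::-1][:top_k]]

def answer(question: str) -> str:
    """4. Generate: ask Gemini to answer only from the retrieved snippets."""
    context = "\n\n".join(retrieve(question))
    prompt = (
        "Answer the question using ONLY the rule snippets below. "
        "If they do not contain the answer, say so.\n\n"
        f"RULE SNIPPETS:\n{context}\n\nQUESTION: {question}"
    )
    response = client.models.generate_content(model="gemini-2.0-flash", contents=prompt)
    return response.text

print(answer("How does scoring work?"))
```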
Key Features
- Rule Text Ingestion: Processes a provided block of rule text.
- Natural Language Questions: Allows users to ask questions freely.
- Contextual Retrieval: Finds the most relevant rule snippets using embeddings and similarity search.
- Grounded Generation: Generates answers based only on the retrieved rule context.
- Demonstrates Core GenAI Capabilities: Embeddings, Vector Search (simulated), and RAG are clearly implemented.
Development in Kaggle Notebooks
Developing this project within a Kaggle Notebook offered several advantages:
- Easy access to computing resources.
- Integrated environment for Python code, markdown documentation, and output display.
- Secure handling of API keys using Kaggle Secrets (see the snippet after this list).
- Simple sharing and reproducibility.
(No separate deployment needed as it lives within the notebook environment).
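For reference, the setup boils down to a couple of lines. This sketch assumes the key is stored in a Kaggle secret named `GOOGLE_API_KEY`; the secret name is illustrative, not necessarily what the notebook uses.

```python
from kaggle_secrets import UserSecretsClient
from google import genai

# Pull the API key from Kaggle Secrets so it never appears in the notebook itself.
api_key = UserSecretsClient().get_secret("GOOGLE_API_KEY")

# A single client instance is reused for both embeddings and generation.
client = genai.Client(api_key=api_key)
```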
Challenges Faced
- Chunking Strategy: Simple paragraph splitting (`\n\n`) proved inadequate for complex formatted documents (like official Monopoly rules). Finding the right balance between chunk size and semantic coherence is key for RAG performance. More advanced techniques (semantic chunking, sentence splitting) would be needed for broader applicability.
- Embedding Relevance: Ensuring the retrieved chunks were truly the most relevant required careful selection of the embedding model and task type (`RETRIEVAL_DOCUMENT` vs. `RETRIEVAL_QUERY`).
- Prompt Engineering for RAG: Crafting the final prompt to force the LLM to only use the provided context and not its general knowledge was crucial and required iteration (a sketch of such a prompt follows this list).
- SDK Nuances: Working through the specific usage patterns of the `google-genai` client library (v1.7.0), especially the structure of API responses for embeddings and the correct parameters for file uploads (in earlier iterations), required debugging and referencing examples.
- Evaluating Answer Quality: Subjectively judging the generated answers is useful, but implementing quantitative evaluation metrics (like faithfulness and relevance) would be a significant next step for assessing true performance.
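As an illustration of the kind of grounding instructions that iteration pushed toward, here is a sketch of a prompt builder. The function name `build_grounded_prompt` and the exact wording are illustrative, not the notebook’s final prompt.

```python
def build_grounded_prompt(question: str, snippets: list[str]) -> str:
    """Assemble a RAG prompt that restricts the model to the retrieved rule snippets."""
    context = "\n\n".join(f"[Snippet {i + 1}]\n{s}" for i, s in enumerate(snippets))
    return (
        "You are a board game rules assistant.\n"
        "Answer the QUESTION using ONLY the RULE SNIPPETS below.\n"
        "Do not rely on outside knowledge of the game.\n"
        "If the snippets do not answer the question, reply that the provided "
        "rules do not cover it.\n\n"
        f"RULE SNIPPETS:\n{context}\n\n"
        f"QUESTION: {question}\n"
        "ANSWER:"
    )
```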
What I Learned
- Implementing a complete RAG pipeline from scratch in Python.
- Practical use of the Google Gemini API via the `google-genai` SDK, including client initialization, embedding generation (`client.models.embed_content`), and content generation (`client.models.generate_content`).
- The importance of different embedding task types (`RETRIEVAL_DOCUMENT`, `RETRIEVAL_QUERY`).
- Fundamentals of semantic search using vector embeddings and cosine similarity.
- Prompt engineering techniques specifically for grounded generation in RAG.
- Handling API errors and implementing retry logic (`google.api_core.retry`); a sketch follows this list.
- The limitations of basic text chunking for complex documents.
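On the retry point, wrapping a Gemini call with `google.api_core.retry` can look like the following minimal sketch. The predicate, function name, and backoff parameters are my own choices; 429 (rate limit) and 503 (temporarily unavailable) are the transient errors worth retrying.

```python
from google import genai
from google.genai import errors
from google.api_core import retry

client = genai.Client(api_key="YOUR_API_KEY")  # placeholder; the real key comes from Kaggle Secrets

def is_retriable(exc: Exception) -> bool:
    """Retry only on rate limiting (429) and temporary unavailability (503)."""
    return isinstance(exc, errors.APIError) and exc.code in (429, 503)

@retry.Retry(predicate=is_retriable, initial=1.0, maximum=30.0, timeout=120.0)
def ask_gemini(prompt: str) -> str:
    # The decorator re-issues the call with exponential backoff whenever is_retriable matches.
    response = client.models.generate_content(model="gemini-2.0-flash", contents=prompt)
    return response.text
```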
This project was a fantastic way to dive deep into the practical application of RAG, a powerful technique for making LLMs more factual and context-aware. While the simulated vector search and basic chunking have limitations, the core pipeline demonstrates the potential of using Gemini to build specialized Q&A systems. It’s not perfect, but it works, and it was a valuable learning experience in harnessing modern Generative AI capabilities.