Building BoardGameGPT: An AI Assistant for Rule Queries
Sometimes, the most frustrating part of board game night isn’t losing – it’s pausing the game to hunt through the rulebook for a specific clarification. What if you could just ask the rulebook your question? BoardGameGPT (a working title!) was my exploration into solving that problem. It’s a system that takes board game rules text and lets users ask natural language questions, returning answers grounded directly in the provided rules. It’s built entirely in Python using the Google Gemini API within a Kaggle Notebook.
The Goal
Build a functional question-answering system for board game rules with:
- Ability to process provided rulebook text.
- Natural language query interface.
- Answers generated by AI but grounded in the specific rules using Retrieval-Augmented Generation (RAG).
- Demonstration of key Generative AI capabilities such as Embeddings, Vector Search, and RAG.
Stack Overview
| Layer | Tools |
| --- | --- |
| Language/Environment | Python 3 in Kaggle Notebooks |
| Core AI | Google Gemini API (`gemini-2.0-flash`) |
| GenAI SDK | `google-genai` (v1.7.0 pattern) |
| Embeddings | Google `text-embedding-004` via SDK |
| Data Handling | Pandas, NumPy |
| Vector Search (Simulation) | Scikit-learn (`cosine_similarity`) |
| API Interaction | `google-genai` Client, `google-api-core` (Retry) |
The RAG Approach: Grounding AI Answers
Instead of letting the Large Language Model (LLM) answer questions from its general knowledge (where it might hallucinate or invent rules), I implemented a Retrieval-Augmented Generation (RAG) pipeline:
- Ingest & Chunk: Load the rulebook text and split it into smaller, manageable chunks (initially by paragraph).
- Embed: Convert each text chunk into a numerical vector (embedding) using Google’s `text-embedding-004` model. These vectors capture the semantic meaning of the rule snippet.
- Retrieve: When a user asks a question (e.g., “How does scoring work?”), embed the question using the same model. Then compare the question’s embedding to all the chunk embeddings using cosine similarity to find the most relevant rule snippets from the text.
- Generate: Construct a new prompt containing the original question and the most relevant retrieved rule snippets. Send this combined prompt to the Gemini generative model (`gemini-2.0-flash`) with clear instructions to answer the question based only on the provided snippets.
This RAG process ensures the answers are directly tied to the source material, making the assistant much more reliable for specific rule clarifications.
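To make the pipeline concrete, here is a condensed sketch of how those four steps fit together with the `google-genai` SDK and scikit-learn. It is a minimal illustration rather than the notebook verbatim: it assumes a `rulebook_text` string is already loaded, and the helper names (`retrieve`, `answer`, `top_k`) and the exact prompt wording are mine.

```python
from google import genai
from google.genai import types
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

client = genai.Client(api_key="YOUR_API_KEY")  # in the notebook the key comes from Kaggle Secrets
EMBED_MODEL = "text-embedding-004"

# 1. Ingest & Chunk: naive paragraph split on blank lines.
chunks = [c.strip() for c in rulebook_text.split("\n\n") if c.strip()]

# 2. Embed: one vector per chunk, using the document-side task type.
#    (A large rulebook would need this call batched.)
doc_response = client.models.embed_content(
    model=EMBED_MODEL,
    contents=chunks,
    config=types.EmbedContentConfig(task_type="RETRIEVAL_DOCUMENT"),
)
doc_vectors = np.array([e.values for e in doc_response.embeddings])

def retrieve(question: str, top_k: int = 3) -> list[str]:
    """3. Retrieve: embed the question and rank chunks by cosine similarity."""
    q_response = client.models.embed_content(
        model=EMBED_MODEL,
        contents=question,
        config=types.EmbedContentConfig(task_type="RETRIEVAL_QUERY"),
    )
    q_vector = np.array(q_response.embeddings[0].values).reshape(1, -1)
    scores = cosine_similarity(q_vector, doc_vectors)[0]
    return [chunks[i] for i in np.argsort(scores)[::-1][:top_k]]

def answer(question: str) -> str:
    """4. Generate: ask Gemini to answer only from the retrieved snippets."""
    context = "\n\n".join(retrieve(question))
    prompt = (
        "Answer the question using ONLY the rule snippets below. "
        "If they do not contain the answer, say so.\n\n"
        f"RULE SNIPPETS:\n{context}\n\nQUESTION: {question}"
    )
    response = client.models.generate_content(model="gemini-2.0-flash", contents=prompt)
    return response.text

print(answer("How does scoring work?"))
```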
Key Features
- Rule Text Ingestion: Processes a provided block of rule text.
- Natural Language Questions: Allows users to ask questions freely.
- Contextual Retrieval: Finds the most relevant rule snippets using embeddings and similarity search.
- Grounded Generation: Generates answers based only on the retrieved rule context.
- Demonstrates Core GenAI Capabilities: Embeddings, Vector Search (simulated), and RAG are clearly implemented.
Development in Kaggle Notebooks
Developing this project within a Kaggle Notebook offered several advantages:
- Easy access to computing resources.
- Integrated environment for Python code, markdown documentation, and output display.
- Secure handling of API keys using Kaggle Secrets (see the snippet after this list).
- Simple sharing and reproducibility.
(No separate deployment needed as it lives within the notebook environment).
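For reference, the setup boils down to a couple of lines. This sketch assumes the key is stored in a Kaggle secret named `GOOGLE_API_KEY`; the secret name is illustrative, not necessarily what the notebook uses.

```python
from kaggle_secrets import UserSecretsClient
from google import genai

# Pull the API key from Kaggle Secrets so it never appears in the notebook itself.
api_key = UserSecretsClient().get_secret("GOOGLE_API_KEY")

# A single client instance is reused for both embeddings and generation.
client = genai.Client(api_key=api_key)
```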
Challenges Faced
- Chunking Strategy: Simple paragraph splitting (`\n\n`) proved inadequate for complex formatted documents (like official Monopoly rules). Finding the right balance between chunk size and semantic coherence is key for RAG performance. More advanced techniques (semantic chunking, sentence splitting) would be needed for broader applicability.
- Embedding Relevance: Ensuring the retrieved chunks were truly the most relevant required careful selection of the embedding model and task type (`RETRIEVAL_DOCUMENT` vs. `RETRIEVAL_QUERY`).
- Prompt Engineering for RAG: Crafting the final prompt to force the LLM to only use the provided context and not its general knowledge was crucial and required iteration (a sketch of such a prompt follows this list).
- SDK Nuances: Working through the specific usage patterns of the `google-genai` client library (v1.7.0), especially the structure of API responses for embeddings and the correct parameters for file uploads (in earlier iterations), required debugging and referencing examples.
- Evaluating Answer Quality: Subjectively judging the generated answers is useful, but implementing quantitative evaluation metrics (like faithfulness and relevance) would be a significant next step for assessing true performance.
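As an illustration of the kind of grounding instructions that iteration pushed toward, here is a sketch of a prompt builder. The function name `build_grounded_prompt` and the exact wording are illustrative, not the notebook’s final prompt.

```python
def build_grounded_prompt(question: str, snippets: list[str]) -> str:
    """Assemble a RAG prompt that restricts the model to the retrieved rule snippets."""
    context = "\n\n".join(f"[Snippet {i + 1}]\n{s}" for i, s in enumerate(snippets))
    return (
        "You are a board game rules assistant.\n"
        "Answer the QUESTION using ONLY the RULE SNIPPETS below.\n"
        "Do not rely on outside knowledge of the game.\n"
        "If the snippets do not answer the question, reply that the provided "
        "rules do not cover it.\n\n"
        f"RULE SNIPPETS:\n{context}\n\n"
        f"QUESTION: {question}\n"
        "ANSWER:"
    )
```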
What I Learned
- Implementing a complete RAG pipeline from scratch in Python.
- Practical use of the Google Gemini API via the `google-genai` SDK, including client initialization, embedding generation (`client.models.embed_content`), and content generation (`client.models.generate_content`).
- The importance of different embedding task types (`RETRIEVAL_DOCUMENT`, `RETRIEVAL_QUERY`).
- Fundamentals of semantic search using vector embeddings and cosine similarity.
- Prompt engineering techniques specifically for grounded generation in RAG.
- Handling API errors and implementing retry logic (`google.api_core.retry`); a sketch follows this list.
- The limitations of basic text chunking for complex documents.
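On the retry point, wrapping a Gemini call with `google.api_core.retry` can look like the following minimal sketch. The predicate, function name, and backoff parameters are my own choices; 429 (rate limit) and 503 (temporarily unavailable) are the transient errors worth retrying.

```python
from google import genai
from google.genai import errors
from google.api_core import retry

client = genai.Client(api_key="YOUR_API_KEY")  # placeholder; the real key comes from Kaggle Secrets

def is_retriable(exc: Exception) -> bool:
    """Retry only on rate limiting (429) and temporary unavailability (503)."""
    return isinstance(exc, errors.APIError) and exc.code in (429, 503)

@retry.Retry(predicate=is_retriable, initial=1.0, maximum=30.0, timeout=120.0)
def ask_gemini(prompt: str) -> str:
    # The decorator re-issues the call with exponential backoff whenever is_retriable matches.
    response = client.models.generate_content(model="gemini-2.0-flash", contents=prompt)
    return response.text
```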
This project was a fantastic way to dive deep into the practical application of RAG, a powerful technique for making LLMs more factual and context-aware. While the simulated vector search and basic chunking have limitations, the core pipeline demonstrates the potential of using Gemini to build specialized Q&A systems. It’s not perfect, but it works, and it was a valuable learning experience in harnessing modern Generative AI capabilities.