DocuMind Documentation
The full map of what we built for the Actian hackathon: ingestion, retrieval, grounded answers, memory, observability, DCLI, and MCP.
The 2AM Origin Story
I was at work, asked our AI a basic internal question about deployment, and it answered with full confidence and zero facts. You know that moment when the model sounds like a senior engineer but it is absolutely improvising? Yeah, that moment.
I just sat there thinking: why are we pretending this is fine when the docs exist and the AI just cannot see them? So we built DocuMind in hackathon mode: slightly sleep-deprived, aggressively caffeinated, and very motivated by spite.
The goal was simple: stop vibe-based answers. Make the model read our actual documentation before it says anything smart-looking.
What We Actually Built
DocuMind is an internal documentation intelligence layer. Drop in your docs, we parse and chunk them, generate embeddings, store vectors in Actian, and retrieve only the relevant slices when someone asks a question.
Instead of dumping your entire wiki into one prompt and praying, we send only the top relevant chunks. Think of it like giving the LLM a curated reading list instead of making it speed-read your whole company.
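The pipeline above in miniature: chunk the docs, embed each chunk, store the vectors, and pull back only the closest matches at question time. This is a toy sketch, not DocuMind's actual code: the hashing "embedding" stands in for a real embedding model, and the in-memory list stands in for the Actian vector store. All names and sample docs here are made up.

```python
import hashlib
import math

def chunk(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split a document into overlapping character windows."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(text: str, dim: int = 256) -> list[float]:
    """Toy hashing 'embedding' standing in for a real embedding model."""
    vec = [0.0] * dim
    for token in text.lower().split():
        h = int(hashlib.md5(token.strip(".,?!").encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def top_k(query: str, store: list[tuple[str, list[float]]], k: int = 3) -> list[str]:
    """Cosine-similarity retrieval over an in-memory stand-in for the vector DB."""
    q = embed(query)
    scored = sorted(
        ((sum(a * b for a, b in zip(q, v)), c) for c, v in store), reverse=True
    )
    return [c for _, c in scored[:k]]

# Sample "documentation" -- illustrative only.
docs = [
    "Deploys run through the staging pipeline before prod.",
    "The cafeteria menu rotates weekly.",
    "Rollbacks use the previous tagged image.",
]
store = [(c, embed(c)) for d in docs for c in chunk(d)]
print(top_k("how do we deploy to prod?", store, k=2))
```

Only those top-k chunks go into the prompt; everything else stays out of the context window.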
Ingestion for markdown, text, PDF, URL, transcripts, conversation JSON
Semantic and hybrid retrieval
Grounded Q&A with source snippets
Memory namespace for conversation context
Observability scoring + alerts
DCLI and MCP integration
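The grounded Q&A piece from the list above boils down to prompt assembly: retrieved snippets go in with source IDs attached, and the model is told to cite them or say it doesn't know. A minimal sketch follows; the template wording and file names are illustrative, not DocuMind's real prompt.

```python
def build_grounded_prompt(question: str, chunks: list[tuple[str, str]]) -> str:
    """Assemble a prompt that forces the model to answer from retrieved
    snippets only. Template wording is an assumption for illustration."""
    sources = "\n".join(
        f"[{i}] ({doc_id}) {text}" for i, (doc_id, text) in enumerate(chunks, 1)
    )
    return (
        "Answer the question using ONLY the sources below. "
        "Cite source numbers like [1]. If the sources do not contain "
        "the answer, say so instead of guessing.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}\nAnswer:"
    )

# Hypothetical retrieved chunks -- doc IDs are made up.
prompt = build_grounded_prompt(
    "How do we roll back a deploy?",
    [("runbook.md", "Rollbacks use the previous tagged image."),
     ("ci.md", "Deploys run through the staging pipeline before prod.")],
)
print(prompt)
```

Because the snippets carry IDs, the answer can point back at the exact doc it came from, which is what makes the "with source snippets" part work.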
Why This Works (And Why It Matters)
Retrieval grounding turns "confidently improvised" into "cited from your actual docs": the model only ever sees relevant snippets, and every answer comes back with the sources it leaned on.
Stack We Ran With
We kept the stack pragmatic: move fast, keep quality measurable, do not over-engineer at 3AM.
Vector DB: Actian Vector Database (Beta)
Backend: Python + FastAPI
Agent Framework: LangChain (primary) or LlamaIndex
LLM Layer: OpenAI GPT
Control Plane: SQLite (for now)
Observability: RAGAS-style metrics + custom scoring + alerts
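The "custom scoring + alerts" idea can be sketched in a few lines. Real RAGAS metrics use an LLM judge; the lexical-overlap groundedness proxy and the alert threshold below are both assumptions for illustration, not DocuMind's actual scorer.

```python
def groundedness(answer: str, context: str) -> float:
    """Crude RAGAS-style proxy: fraction of answer tokens that also appear
    in the retrieved context. A stand-in for a proper LLM-judged metric."""
    ans = {t.strip(".,?!").lower() for t in answer.split()}
    ctx = {t.strip(".,?!").lower() for t in context.split()}
    return len(ans & ctx) / len(ans) if ans else 0.0

ALERT_THRESHOLD = 0.5  # assumed value; tune per deployment

def check(answer: str, context: str) -> dict:
    """Score an answer against its retrieved context and flag low scores."""
    score = groundedness(answer, context)
    return {"score": round(score, 2), "alert": score < ALERT_THRESHOLD}

print(check(
    "Rollbacks use the previous tagged image",
    "Rollbacks use the previous tagged image. Deploys run through staging.",
))
```

The point is the shape of the loop: every answer gets a score, low scores trip an alert, and you find out about ungrounded answers before your users do.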
Source Repository
Official GitHub repo: https://github.com/mdkaifansari04/documind