RAG-Powered Autonomous Runbook with Hallucination Prevention
NOC Oracle is a Retrieval-Augmented Generation (RAG) system specialized for telecommunications troubleshooting. It maps specific error codes (e.g., E-4045, S-304) to exact repair procedures from technical manuals, eliminating the hallucination risk of generic LLMs.
The system uses context-aware chunking with MarkdownHeaderTextSplitter to preserve the relationship between error codes and their solutions. It implements hybrid search combining semantic vector search with keyword boosting for exact alphanumeric code matching.
Built for enterprise trust, it includes a 'Hallucination Risk' comparison mode that shows side-by-side what a generic LLM would guess versus the RAG-verified answer from the manual, demonstrating why RAG is essential for operations.
Field engineers cannot rely on generic LLMs (ChatGPT/Gemini) for troubleshooting because they hallucinate commands. A generic model might invent a `reset-network` command that destroys the config. Engineers need exact, verified procedures from official manuals, not plausible but dangerous guesses.
In network operations, incorrect commands can cause service outages and revenue loss. Traditional LLMs provide confident but ungrounded answers, making them unsuitable for production troubleshooting workflows.
A three-stage RAG pipeline: (1) Context-aware ingestion using MarkdownHeaderTextSplitter to preserve error-solution relationships, (2) Hybrid retrieval combining vector search with keyword boosting for exact code matching, (3) Strict context enforcement in LLM prompts to prevent hallucination.
Uses MarkdownHeaderTextSplitter to chunk technical manuals by header hierarchy. Injects parent headers (Category, Error_Code) into chunk content so embeddings treat error codes and solutions as atomic units. Stores in persistent ChromaDB.
Implements two-stage retrieval: (1) Vector similarity search for semantic queries, (2) Keyword boosting layer that detects error codes in queries (case-insensitive, hyphen-tolerant) and forces exact matches to Rank #1. Returns top 3 chunks for LLM context.
Uses Gemini 2.0 Flash Lite with strict context enforcement. Prompt instructs model to answer ONLY from provided context and state 'Procedure not found' if error code is missing. Includes baseline comparison mode showing ungrounded LLM responses.
MarkdownHeaderTextSplitter preserves technical manual hierarchy preventing error-solution separation. Hybrid search combines semantic understanding (vague queries) with exact matching (error codes). ChromaDB provides local persistence for privacy. Google Text Embedding 004 optimized for technical content.
Deep dive into the technical implementation with annotated code examples
View Technical DetailsEnsuring exact match retrieval for alphanumeric error codes with variable formatting (S-304 vs s304)
Implemented regex-based keyword booster that normalizes codes (removes hyphens, case-insensitive) and searches in both content and metadata. Forces matching chunks to Rank #1 before vector similarity ranking.
Preserving error code-solution relationships during chunking
Used MarkdownHeaderTextSplitter to chunk at header boundaries. Injected parent headers (Category, Error_Code) into chunk page_content so embeddings see codes and solutions together.
Preventing LLM from hallucinating procedures when error codes are not in manual
Strict prompt engineering: 'Answer ONLY using provided context. If error code not found, state Procedure not found in standard operating manual.' No fallback to general knowledge.
Deployed as part of TRINITY Project NOC suite. System demonstrates exact match retrieval for specific error codes while maintaining semantic understanding for vague queries. Hallucination prevention mechanisms ensure verified answers from technical manuals.