Back to Projects
AI/ML Engineering • Project

NOC Oracle

RAG-Powered Autonomous Runbook with Hallucination Prevention

Gemini 2.0 Flash Lite LangChain ChromaDB Google Text Embedding 004 MarkdownHeaderTextSplitter Streamlit
Nov 2024 - Jan 2025
Python 3.12
System Prototyping (RAG)

Overview

NOC Oracle is a Retrieval-Augmented Generation (RAG) system specialized for telecommunications troubleshooting. It maps specific error codes (e.g., E-4045, S-304) to exact repair procedures from technical manuals, eliminating the hallucination risk of generic LLMs.

The system uses context-aware chunking with MarkdownHeaderTextSplitter to preserve the relationship between error codes and their solutions. It implements hybrid search combining semantic vector search with keyword boosting for exact alphanumeric code matching.

Built for enterprise trust, it includes a 'Hallucination Risk' comparison mode that shows side-by-side what a generic LLM would guess versus the RAG-verified answer from the manual, demonstrating why RAG is essential for operations.

Key Achievement: Built RAG system with hybrid search combining semantic vector search and keyword boosting, context-aware chunking preserving error-solution relationships, and strict context enforcement preventing hallucination

Key Metrics & Results

Hybrid
Search Strategy
Top 3
Retrieval Precision
Context Aware
Chunking
Strict
Context Enforcement

Problem Statement

Field engineers cannot rely on generic LLMs (ChatGPT/Gemini) for troubleshooting because they hallucinate commands. A generic model might invent a `reset-network` command that destroys the config. Engineers need exact, verified procedures from official manuals, not plausible but dangerous guesses.

Business Context

In network operations, incorrect commands can cause service outages and revenue loss. Traditional LLMs provide confident but ungrounded answers, making them unsuitable for production troubleshooting workflows.

Technical Challenges

Solution Architecture

A three-stage RAG pipeline: (1) Context-aware ingestion using MarkdownHeaderTextSplitter to preserve error-solution relationships, (2) Hybrid retrieval combining vector search with keyword boosting for exact code matching, (3) Strict context enforcement in LLM prompts to prevent hallucination.

System Components

Context-Aware Ingestor

Uses MarkdownHeaderTextSplitter to chunk technical manuals by header hierarchy. Injects parent headers (Category, Error_Code) into chunk content so embeddings treat error codes and solutions as atomic units. Stores in persistent ChromaDB.

Hybrid Retriever

Implements two-stage retrieval: (1) Vector similarity search for semantic queries, (2) Keyword boosting layer that detects error codes in queries (case-insensitive, hyphen-tolerant) and forces exact matches to Rank #1. Returns top 3 chunks for LLM context.

RAG Engine with Hallucination Prevention

Uses Gemini 2.0 Flash Lite with strict context enforcement. Prompt instructs model to answer ONLY from provided context and state 'Procedure not found' if error code is missing. Includes baseline comparison mode showing ungrounded LLM responses.

Technology Stack Rationale

MarkdownHeaderTextSplitter preserves technical manual hierarchy preventing error-solution separation. Hybrid search combines semantic understanding (vague queries) with exact matching (error codes). ChromaDB provides local persistence for privacy. Google Text Embedding 004 optimized for technical content.

Implementation Highlights

Key Features

Detailed Code Documentation

Deep dive into the technical implementation with annotated code examples

View Technical Details

Challenges & Solutions

Challenge 1

Ensuring exact match retrieval for alphanumeric error codes with variable formatting (S-304 vs s304)

Solution

Implemented regex-based keyword booster that normalizes codes (removes hyphens, case-insensitive) and searches in both content and metadata. Forces matching chunks to Rank #1 before vector similarity ranking.

Challenge 2

Preserving error code-solution relationships during chunking

Solution

Used MarkdownHeaderTextSplitter to chunk at header boundaries. Injected parent headers (Category, Error_Code) into chunk page_content so embeddings see codes and solutions together.

Challenge 3

Preventing LLM from hallucinating procedures when error codes are not in manual

Solution

Strict prompt engineering: 'Answer ONLY using provided context. If error code not found, state Procedure not found in standard operating manual.' No fallback to general knowledge.

Results & Impact

Deployed as part of TRINITY Project NOC suite. System demonstrates exact match retrieval for specific error codes while maintaining semantic understanding for vague queries. Hallucination prevention mechanisms ensure verified answers from technical manuals.

Production Performance

  • Hybrid search returns top 3 relevant chunks with exact matches prioritized
  • Context-aware chunking preserves error-solution relationships
  • Strict context enforcement prevents ungrounded responses
  • Local ChromaDB persistence ensures data privacy and offline capability

Lessons Learned

What Worked Well

What I'd Do Differently

Future Enhancements

Related Projects