NOC Oracle

Overview

NOC Oracle is a Retrieval-Augmented Generation (RAG) system specialized for telecommunications troubleshooting. It maps specific error codes (e.g., E-4045, S-304) to the exact repair procedures in technical manuals, so answers are grounded in the manual rather than in a generic LLM's guesses.

The system uses context-aware chunking with MarkdownHeaderTextSplitter to preserve the relationship between error codes and their solutions. It implements hybrid search combining semantic vector search with keyword boosting for exact alphanumeric code matching.

Built for enterprise trust, it includes a 'Hallucination Risk' comparison mode that shows side-by-side what a generic LLM would guess versus the RAG-verified answer from the manual, demonstrating why RAG is essential for operations.

Key Achievement: Built RAG system with hybrid search combining semantic vector search and keyword boosting, context-aware chunking preserving error-solution relationships, and strict context enforcement preventing hallucination

Key Metrics & Results

Search Strategy: Hybrid
Retrieval Precision: Top 3
Chunking: Context Aware
Context Enforcement: Strict

Problem Statement

Field engineers cannot rely on generic LLMs (ChatGPT/Gemini) for troubleshooting because they hallucinate commands. A generic model might invent a `reset-network` command that destroys the config. Engineers need exact, verified procedures from official manuals, not plausible but dangerous guesses.

Business Context

In network operations, incorrect commands can cause service outages and revenue loss. Traditional LLMs provide confident but ungrounded answers, making them unsuitable for production troubleshooting workflows.

Technical Challenges

Ensuring exact match retrieval for specific alphanumeric error codes with variable formatting (e.g., S-304 vs s304)
Preserving context between error codes and their solutions during chunking
Preventing LLM hallucination when error codes are not found in the manual
Maintaining semantic understanding for vague queries like 'power issue'

Solution Architecture

A three-stage RAG pipeline: (1) Context-aware ingestion using MarkdownHeaderTextSplitter to preserve error-solution relationships, (2) Hybrid retrieval combining vector search with keyword boosting for exact code matching, (3) Strict context enforcement in LLM prompts to prevent hallucination.

System Components

Context-Aware Ingestor

Uses MarkdownHeaderTextSplitter to chunk technical manuals by header hierarchy. Injects parent headers (Category, Error_Code) into each chunk's content so that a chunk's embedding covers the error code and its solution as one atomic unit. Stores the chunks in a persistent ChromaDB collection.
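To make the header-based chunking concrete, here is a minimal, dependency-free sketch of the idea. It is a stand-in written for illustration and does not call MarkdownHeaderTextSplitter itself. The manual layout (`#` for the category, `##` for the error code), the function name `split_by_headers`, and the sample manual text are all assumptions, not taken from the project.

```python
import re

# Assumed manual layout: "# <Category>" sections containing "## <Error code>" entries.
HEADER_RE = re.compile(r"^(#{1,2})\s+(.*\S)\s*$")
LEVEL_KEYS = {1: "Category", 2: "Error_Code"}


def split_by_headers(markdown_text):
    """Split a manual into chunks at header boundaries, tracking parent headers."""
    chunks = []
    current_meta = {}
    body_lines = []

    def flush():
        # Close the chunk collected so far, if it has any body text.
        body = "\n".join(body_lines).strip()
        if body:
            chunks.append({"metadata": dict(current_meta), "page_content": body})
        body_lines.clear()

    for line in markdown_text.splitlines():
        match = HEADER_RE.match(line)
        if match is None:
            body_lines.append(line)
            continue
        flush()  # a new header ends the previous chunk
        level = len(match.group(1))
        current_meta[LEVEL_KEYS[level]] = match.group(2)
        if level == 1:
            # A new category invalidates the previous error code.
            current_meta.pop("Error_Code", None)
    flush()
    return chunks


# Made-up sample manual, for illustration only.
SAMPLE_MANUAL = """\
# Power
## E-4045
Check the rectifier breaker, then reseat the PSU.
## E-4046
Replace the fan tray.
# Signal
## S-304
Clean the fiber connector and re-run the link test.
"""

chunks = split_by_headers(SAMPLE_MANUAL)
for chunk in chunks:
    print(chunk["metadata"], "->", chunk["page_content"])
```

Each chunk carries its category and error code as metadata, so a solution paragraph is never stored without the code it belongs to.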

Hybrid Retriever

Implements two-stage retrieval: (1) Vector similarity search for semantic queries, (2) Keyword boosting layer that detects error codes in queries (case-insensitive, hyphen-tolerant) and forces exact matches to Rank #1. Returns top 3 chunks for LLM context.
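The boosting stage can be sketched as a re-ranking pass over results that the vector search has already ordered. This is an illustrative sketch, not the project's retriever: the code pattern (one letter, optional hyphen, 3 to 4 digits), the function names, and the sample chunks are assumptions, and the vector search itself is represented only by a pre-ordered list.

```python
import re

# Assumed code shape: one letter, optional hyphen, 3-4 digits (e.g. E-4045, s304).
CODE_RE = re.compile(r"\b([A-Za-z])-?(\d{3,4})\b")


def codes_in(text):
    """Return the error codes in text, normalized to upper-case without hyphens."""
    return {(m.group(1) + m.group(2)).upper() for m in CODE_RE.finditer(text)}


def boost_exact_matches(query, ranked_chunks, top_k=3):
    """Re-rank vector results so chunks naming the queried code come first.

    ranked_chunks: chunk texts already ordered by vector similarity (best first).
    """
    wanted = codes_in(query)
    if not wanted:
        return ranked_chunks[:top_k]  # vague query: keep the semantic order
    hits = [c for c in ranked_chunks if codes_in(c) & wanted]
    rest = [c for c in ranked_chunks if not (codes_in(c) & wanted)]
    return (hits + rest)[:top_k]


# Made-up candidates, in the order a vector search might return them.
vector_ranked = [
    "Power > E-4045: Check the rectifier breaker.",
    "Signal > S-305: Inspect the patch panel.",
    "Power > E-4046: Replace the fan tray.",
    "Signal > S-304: Clean the fiber connector.",
]

top = boost_exact_matches("how do I clear s304?", vector_ranked)
vague = boost_exact_matches("power issue", vector_ranked)
print(top[0])
```

In this sketch the S-304 chunk moves from fourth place to first, while a query without a code ("power issue") keeps the semantic order. Note that re-ranking can only promote chunks that are already in the candidate list, so the vector stage would need to fetch more than three candidates for the boost to be reliable.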

RAG Engine with Hallucination Prevention

Uses Gemini 2.0 Flash Lite with strict context enforcement. Prompt instructs model to answer ONLY from provided context and state 'Procedure not found' if error code is missing. Includes baseline comparison mode showing ungrounded LLM responses.
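A strict-context prompt of this kind might be assembled as follows. The template wording, the separator, and the name `build_strict_prompt` are assumptions made for illustration. Only the rule (answer from context only, otherwise report that the procedure was not found) comes from the description above. No model call is made here, since the client code is outside the scope of this sketch.

```python
NOT_FOUND = "Procedure not found in standard operating manual."

# Hypothetical template; the project's actual prompt text is not shown in this write-up.
STRICT_TEMPLATE = """You are a NOC troubleshooting assistant.
Answer ONLY using the context below. Do not use general knowledge.
If the context does not cover the error code in the question, reply exactly:
{not_found}

Context:
{context}

Question: {question}
"""


def build_strict_prompt(question, chunks):
    """Assemble a grounded prompt from the retrieved chunks."""
    context = "\n\n---\n\n".join(chunks) if chunks else "(no context retrieved)"
    return STRICT_TEMPLATE.format(
        not_found=NOT_FOUND, context=context, question=question
    )


prompt = build_strict_prompt(
    "How do I fix S-304?",
    ["Signal > S-304: Clean the fiber connector and re-run the link test."],
)
print(prompt)
```

Spelling out the exact refusal sentence in the prompt makes a missing code easy to detect downstream, because the application can compare the model's reply against `NOT_FOUND`.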

Technology Stack Rationale

MarkdownHeaderTextSplitter preserves technical manual hierarchy preventing error-solution separation. Hybrid search combines semantic understanding (vague queries) with exact matching (error codes). ChromaDB provides local persistence for privacy. Google Text Embedding 004 optimized for technical content.

Implementation Highlights

Key Features

Context-Aware Chunking: Uses MarkdownHeaderTextSplitter to preserve error code-solution relationships by chunking at header boundaries
Hybrid Search with Keyword Boosting: Detects alphanumeric error codes in queries and forces exact matches to Rank #1, regardless of formatting (S-304, s304, S304)
Hallucination Risk Comparison: Side-by-side view showing generic LLM guesses versus RAG-verified answers from manual
Strict Context Enforcement: LLM prompt forces answers only from provided context, returning 'Procedure not found' for missing codes
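The comparison mode listed above amounts to asking the same question twice, once without context and once restricted to the manual. The sketch below shows that shape under stated assumptions: `compare_answers`, the result keys, and the prompt wording are hypothetical, and `fake_generate` is a stand-in for a real model call so the example runs offline.

```python
from typing import Callable, Dict, List


def compare_answers(
    question: str, chunks: List[str], generate: Callable[[str], str]
) -> Dict[str, str]:
    """Run the same question twice: once ungrounded, once restricted to the manual."""
    baseline_prompt = question  # no context: the model is free to guess
    grounded_prompt = (
        "Answer ONLY using the context below. If the answer is not there, say "
        "'Procedure not found'.\n\n"
        "Context:\n" + "\n\n".join(chunks) + "\n\nQuestion: " + question
    )
    return {
        "generic_llm": generate(baseline_prompt),
        "rag_verified": generate(grounded_prompt),
    }


def fake_generate(prompt: str) -> str:
    """Stand-in for a real model call; it only reports which prompt it received."""
    return "grounded answer" if "Context:" in prompt else "ungrounded guess"


result = compare_answers(
    "How do I fix S-304?",
    ["Signal > S-304: Clean the fiber connector."],
    fake_generate,
)
for label, answer in result.items():
    print(f"{label:>12}: {answer}")
```

Passing the model call in as a function keeps the comparison logic independent of any particular SDK.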

Detailed Code Documentation

Deep dive into the technical implementation with annotated code examples


Challenges & Solutions

Challenge 1

Ensuring exact match retrieval for alphanumeric error codes with variable formatting (S-304 vs s304)

Solution

Implemented a regex-based keyword booster that normalizes codes (strips hyphens, ignores case) and searches both chunk content and metadata. Matching chunks are forced to Rank #1, ahead of the vector-similarity ranking.
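The normalization and the content-plus-metadata check might look like the sketch below. The regex (one letter, optional hyphen, 3 to 4 digits), the chunk dictionary shape, and the function names are assumptions for illustration, not the project's actual booster.

```python
import re

# Assumed code shape: one letter, optional hyphen, 3-4 digits.
CODE_RE = re.compile(r"\b([A-Za-z])-?(\d{3,4})\b")


def extract_codes(text):
    """Return normalized codes found in text: hyphen removed, upper-cased."""
    return {(letter + digits).upper() for letter, digits in CODE_RE.findall(text)}


def chunk_matches(chunk, query):
    """True if the chunk's content or any metadata value names a code from the query."""
    wanted = extract_codes(query)
    haystack = [chunk["page_content"], *map(str, chunk["metadata"].values())]
    return any(extract_codes(part) & wanted for part in haystack)


# Made-up chunk: the code appears only in the metadata, not in the body text.
chunk = {
    "page_content": "Clean the fiber connector and re-run the link test.",
    "metadata": {"Category": "Signal", "Error_Code": "S-304"},
}

for query in ("S-304", "s304", "S304", "E-4045"):
    print(query, chunk_matches(chunk, query))
```

Because both the query and the chunk are reduced to the same canonical form, the three spellings of S-304 all match, and checking metadata catches chunks whose body text never repeats the code.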

Challenge 2

Preserving error code-solution relationships during chunking

Solution

Used MarkdownHeaderTextSplitter to chunk at header boundaries. Injected parent headers (Category, Error_Code) into chunk page_content so embeddings see codes and solutions together.
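The header injection step can be sketched as a small transformation applied to each chunk before embedding. The chunk shape (a dictionary with `metadata` and `page_content`), the `key: value` prefix format, and the name `inject_headers` are assumptions for illustration.

```python
def inject_headers(chunk, keys=("Category", "Error_Code")):
    """Return a copy of the chunk whose text starts with its parent headers."""
    meta = chunk["metadata"]
    prefix = "\n".join(f"{key}: {meta[key]}" for key in keys if key in meta)
    content = chunk["page_content"]
    return {
        "metadata": dict(meta),
        "page_content": f"{prefix}\n\n{content}" if prefix else content,
    }


# Made-up chunk as a header-based splitter might produce it.
raw_chunk = {
    "metadata": {"Category": "Signal", "Error_Code": "S-304"},
    "page_content": "Clean the fiber connector and re-run the link test.",
}

enriched = inject_headers(raw_chunk)
print(enriched["page_content"])
```

Without this step the body text alone ("Clean the fiber connector...") would be embedded, and nothing in the vector would tie it to S-304.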

Challenge 3

Preventing LLM from hallucinating procedures when error codes are not in manual

Solution

Strict prompt engineering. The prompt instructs the model: 'Answer ONLY using provided context. If error code not found, state: Procedure not found in standard operating manual.' There is no fallback to general knowledge.

Results & Impact

Deployed as part of TRINITY Project NOC suite. System demonstrates exact match retrieval for specific error codes while maintaining semantic understanding for vague queries. Hallucination prevention mechanisms ensure verified answers from technical manuals.

Production Performance

Hybrid search returns top 3 relevant chunks with exact matches prioritized
Context-aware chunking preserves error-solution relationships
Strict context enforcement prevents ungrounded responses
Local ChromaDB persistence ensures data privacy and offline capability

Lessons Learned

What Worked Well

MarkdownHeaderTextSplitter preserved the technical manual's header structure
Hybrid search with keyword boosting solved the exact code matching problem
Context injection into embeddings improved retrieval quality (not yet quantified with retrieval metrics)
Strict prompt enforcement kept answers grounded in the manual

What I'd Do Differently

Would migrate to google-genai SDK earlier (current google-generativeai deprecated as of Jan 2025)
Should add BM25 keyword search layer for even better exact matching
Could implement query expansion for common error code synonyms
Would add retrieval metrics (precision@k, recall@k) for production monitoring
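For reference, the two retrieval metrics named in the last item can be computed as below. This is a generic sketch of the standard definitions, not code from the project. The chunk identifiers and the labeled relevant set are made up for the example.

```python
def precision_at_k(retrieved, relevant, k):
    """Share of the top-k slots filled with relevant items: |relevant in top k| / k."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / k


def recall_at_k(retrieved, relevant, k):
    """Share of all relevant items that appear in the top k."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


# Made-up example: one query whose only relevant chunk is the S-304 entry.
retrieved_ids = ["S-304", "S-305", "E-4045"]
relevant_ids = {"S-304"}

p3 = precision_at_k(retrieved_ids, relevant_ids, 3)
r3 = recall_at_k(retrieved_ids, relevant_ids, 3)
p1 = precision_at_k(retrieved_ids, relevant_ids, 1)
print(p3, r3, p1)
```

For exact-code queries with a single correct chunk, precision@1 is the most telling number, since it directly checks whether the matching chunk reached Rank #1.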

Future Enhancements

Migrate to google-genai SDK before June 2026 deprecation deadline
Add BM25 keyword search layer for improved exact matching
Implement multi-manual support with source attribution
Add query expansion for error code synonyms and common variations
