Net-Ops Agent

Overview

Net-Ops Agent is an autonomous agentic AI system that translates natural language operational commands into safe, deterministic Python function calls. It uses a Reasoning-Action Separation pattern with Human-in-the-Loop (HITL) approval to prevent unauthorized executions.

The system replaces dangerous open-ended code generation with strict function calling. The LLM reasons about which tool to use (e.g., restart_service, scale_cluster) but execution is intercepted and requires explicit human approval before any Python function runs.

Built for enterprise safety, it ensures type-safe tool definitions through Python type hints and prevents unauthorized executions through mandatory approval gates. All actions require explicit authorization.

Key Achievement: Built an agentic AI system with a reasoning-action separation pattern, mandatory human-in-the-loop approval gates, and deterministic function calling that prevents unauthorized executions.

Key Metrics & Results

Approval Pattern: HITL
Tool Functions: 3
Approval Required: Mandatory
Tool Definitions: Type Safe

Problem Statement

Enterprises are hesitant to deploy GenAI for operations because they fear an LLM will hallucinate a dangerous command like `delete database` or `sudo rm -rf`. Chatbots are useful for advice, but dangerous for action. Operations require deterministic, auditable commands with zero risk of unauthorized execution.

Business Context

Network operations involve critical infrastructure where incorrect commands can cause service outages, data loss, and revenue impact. Traditional agentic systems lack safety guardrails, making them unsuitable for production use.

Technical Challenges

Preventing the LLM from inventing commands that are not in the approved toolbelt
Ensuring deterministic tool usage with correct argument types
Implementing reliable interception between the reasoning and execution phases
Keeping the approval gate's state intact across network delays

Solution Architecture

The system uses a three-phase safety architecture: (1) Tool Definition with strict type hints, (2) a Reasoning Phase in which the LLM selects a tool and its arguments, and (3) an Approval Gate that intercepts execution and requires human authorization. All tool calls are validated before approval.

System Components

Deterministic Toolbelt

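Pre-defined Python functions (get_service_health, restart_service, scale_cluster) with strict type hints and docstrings. These signatures and docstrings are what Gemini receives as tool definitions, so they tell the model what each tool does and which arguments it takes. No code generation is allowed; the model can only call functions from the approved list.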
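
As an illustration of what such a toolbelt could look like, here is a minimal sketch. The three function names come from the description above; the parameter names, return shapes, and the in-memory `_FAKE_CLUSTER` state are assumptions made for this example, not the project's actual implementation.

```python
from typing import Callable

# Hypothetical in-memory stand-in for real infrastructure (assumption for this sketch).
_FAKE_CLUSTER: dict[str, dict] = {
    "auth-api": {"status": "healthy", "replicas": 2},
}


def get_service_health(service_name: str) -> dict:
    """Return the current health status of a service.

    Args:
        service_name: Name of the service to inspect, e.g. "auth-api".
    """
    service = _FAKE_CLUSTER.get(service_name)
    if service is None:
        return {"service": service_name, "status": "unknown"}
    return {"service": service_name, "status": service["status"]}


def restart_service(service_name: str) -> dict:
    """Restart a service and report whether the restart happened."""
    if service_name not in _FAKE_CLUSTER:
        return {"service": service_name, "restarted": False}
    _FAKE_CLUSTER[service_name]["status"] = "healthy"
    return {"service": service_name, "restarted": True}


def scale_cluster(service_name: str, replicas: int) -> dict:
    """Set the number of replicas for a service."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    _FAKE_CLUSTER.setdefault(service_name, {"status": "healthy"})["replicas"] = replicas
    return {"service": service_name, "replicas": replicas}


# The approved list: a name that is not a key here cannot be dispatched.
TOOLBELT: dict[str, Callable[..., dict]] = {
    fn.__name__: fn for fn in (get_service_health, restart_service, scale_cluster)
}
```

Keeping the tools in one mapping gives later stages a single place to look up a name, which is what makes an "approved list" enforceable in code.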

Agent Core with Function Calling

Uses Gemini 2.0 Flash Lite with native function calling support. The LLM receives the tool definitions and the user query, and returns a ToolCall object containing the function name and arguments, which are validated before approval. Automatic execution is disabled (enable_automatic_function_calling=False), so every call waits for manual approval.
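
The request and parsing step might look roughly like the sketch below, written against the google-generativeai SDK. The model identifier string, the `propose_action` and `extract_tool_calls` helper names, and the stub tool are assumptions for illustration. The SDK is imported lazily so the parsing helper can be tried on a stand-in response without an API key.

```python
from types import SimpleNamespace

MODEL_NAME = "gemini-2.0-flash-lite"  # assumed identifier for Gemini 2.0 Flash Lite


def restart_service(service_name: str) -> dict:
    """Restart a service (stub for this sketch)."""
    return {"service": service_name, "restarted": True}


def extract_tool_calls(response) -> list[dict]:
    """Collect proposed calls from a response without executing anything."""
    calls = []
    for part in response.candidates[0].content.parts:
        fn = part.function_call
        if fn:  # text-only parts carry no function call
            calls.append({"name": fn.name, "args": dict(fn.args)})
    return calls


def propose_action(prompt: str, api_key: str) -> list[dict]:
    """Ask the model which tool to use and return the proposal for review."""
    import google.generativeai as genai  # needs the SDK installed and a valid key

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(model_name=MODEL_NAME, tools=[restart_service])
    # With automatic function calling off, the SDK never runs the tool itself.
    chat = model.start_chat(enable_automatic_function_calling=False)
    response = chat.send_message(prompt)
    return extract_tool_calls(response)


# Stand-in object with the same attribute shape as a response, for offline use.
fake_response = SimpleNamespace(
    candidates=[SimpleNamespace(content=SimpleNamespace(parts=[
        SimpleNamespace(function_call=None, text="I will restart it."),
        SimpleNamespace(function_call=SimpleNamespace(
            name="restart_service", args={"service_name": "auth-api"})),
    ]))]
)
pending = extract_tool_calls(fake_response)
```

The important property is that `propose_action` only returns data. Nothing in it invokes `restart_service`, so the decision to execute stays with whatever consumes the returned list.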

Safety Interface with Approval Gate

Streamlit UI with session state management. Displays pending action (tool name, arguments) and requires explicit Approve/Reject button click. State persists across network lags. Only approved actions execute Python functions. All actions logged for audit.
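
A minimal sketch of the gate logic follows. It keeps the state handling in plain functions that accept any dict-like object, so the same code works with `st.session_state` or with an ordinary dict. The key names `pending_action` and `audit_log`, the helper names, and the stub tool are assumptions for this example.

```python
from typing import Any, Callable, MutableMapping


def restart_service(service_name: str) -> dict:
    """Stub tool for this sketch."""
    return {"service": service_name, "restarted": True}


TOOLBELT: dict[str, Callable[..., Any]] = {"restart_service": restart_service}


def queue_action(state: MutableMapping, name: str, args: dict) -> None:
    """Store a proposed call. Nothing is executed here."""
    state["pending_action"] = {"name": name, "args": args}


def resolve_pending(state: MutableMapping, approved: bool) -> Any:
    """Run the pending call only if approved, clear it, and log the decision."""
    action = state.pop("pending_action", None)
    if action is None:
        return None
    result = TOOLBELT[action["name"]](**action["args"]) if approved else None
    state.setdefault("audit_log", []).append(
        {**action, "decision": "approved" if approved else "rejected"}
    )
    return result


def render_gate() -> None:
    """Streamlit wiring; call this from a script started with `streamlit run`."""
    import streamlit as st

    action = st.session_state.get("pending_action")
    if action is None:
        return
    st.warning(f"Pending action: {action['name']}")
    st.json(action["args"])
    approve_col, reject_col = st.columns(2)
    approved = approve_col.button("Approve")
    rejected = reject_col.button("Reject")
    if approved:
        st.write(resolve_pending(st.session_state, approved=True))
    elif rejected:
        resolve_pending(st.session_state, approved=False)
```

Because Streamlit reruns the script on every interaction, the pending action has to live in session state rather than in a local variable; otherwise it would be lost before the operator clicks a button.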

Technology Stack Rationale

Function calling replaces open-ended code generation, which removes the risk of the model executing a hallucinated command. Python type hints provide type safety for tool definitions. Streamlit session state provides reliable state persistence for approval gates. The Human-in-the-Loop pattern ensures mandatory authorization for all executions.

Implementation Highlights

Key Features

Reasoning-Action Separation: LLM reasons about which tool to use but execution is intercepted and paused for human approval
Deterministic Tool Use: Strict function calling from pre-defined toolbelt—no open-ended code generation allowed
Type-Safe Tool Definitions: All tool arguments use Python type hints ensuring argument correctness
Persistent Approval State: Session state management ensures approval gates persist across network lags and UI refreshes

Challenges & Solutions

Challenge 1

Preventing the LLM from inventing commands that are not in the approved toolbelt

Solution

Replaced open-ended code generation with strict function calling. The LLM can only select from the pre-defined tools passed via Gemini's tools parameter, and nothing outside the toolbelt can be executed.
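
One way to enforce this on the application side is to check every proposed call against the toolbelt before it reaches the approval UI. The sketch below is an assumption about how such a check could be written, not the project's code: `validate_call` is a hypothetical helper, and it only handles parameters annotated with plain classes such as `str` and `int`.

```python
import inspect
from typing import Callable, get_type_hints


def scale_cluster(service_name: str, replicas: int) -> dict:
    """Stub tool for this sketch."""
    return {"service": service_name, "replicas": replicas}


TOOLBELT: dict[str, Callable[..., dict]] = {"scale_cluster": scale_cluster}


def validate_call(name: str, args: dict) -> list[str]:
    """Return a list of problems; an empty list means the call may go to approval."""
    tool = TOOLBELT.get(name)
    if tool is None:
        return [f"unknown tool: {name}"]

    problems = []
    hints = get_type_hints(tool)
    params = inspect.signature(tool).parameters

    # Every parameter without a default must be supplied.
    for param_name, param in params.items():
        if param.default is inspect.Parameter.empty and param_name not in args:
            problems.append(f"missing argument: {param_name}")

    # Every supplied argument must exist and match its annotated type.
    for key, value in args.items():
        if key not in params:
            problems.append(f"unexpected argument: {key}")
        elif key in hints and not isinstance(value, hints[key]):
            problems.append(f"{key} should be {hints[key].__name__}")
    return problems
```

A call such as `validate_call("drop_database", {})` is rejected because the name is not in the mapping, regardless of what the model produced. Numeric values arriving from a JSON-like payload may need coercion first (for example a whole-number float for an `int` parameter); this sketch reports them as type errors.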

Challenge 2

Ensuring reliable interception between reasoning and execution phases

Solution

Disabled automatic function calling (enable_automatic_function_calling=False). The system manually parses ToolCall objects from the response, stores them in session state, and displays the approval UI before anything executes.

Challenge 3

Maintaining approval state persistence across network lags

Solution

Used Streamlit session state to store pending actions. The UI checks session state on load and displays the approval gate if an action is pending. The state persists until the user approves or rejects the action.

Results & Impact

Deployed as part of TRINITY Project NOC suite. System demonstrates type-safe tool definitions through Python type hints and mandatory approval gates preventing unauthorized executions. All actions require explicit authorization. Operators can safely use natural language commands with confidence.

Production Performance

Type-safe tool definitions with Python type hints
Mandatory approval gates for all tool executions
3 operational tools: service health check, restart, cluster scaling
Sub-second reasoning latency with Gemini 2.0 Flash Lite

Lessons Learned

What Worked Well

Function calling eliminated code generation risk completely
Session state provided reliable approval gate persistence
Type hints gave every tool an explicit argument schema and enabled static type checking (Python does not enforce them at runtime)
Human-in-the-Loop pattern built operator trust

What I'd Do Differently

Would migrate to google-genai SDK earlier (current google-generativeai deprecated as of Jan 2025)
Should add tool execution timeout and cancellation support
Could implement role-based approval (auto-approve for read-only tools)
Would add audit log persistence beyond session state
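
For the timeout item, one possible starting point is to run an approved tool in a worker thread and stop waiting after a deadline. This is a sketch under stated assumptions: `run_with_timeout` and `slow_restart` are hypothetical names, and a thread-based timeout only stops the caller from waiting. It does not kill work that is already running, so true cancellation would need a subprocess or cooperative checks inside the tool.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Any, Callable


def slow_restart(delay_s: float) -> str:
    """Hypothetical tool that takes `delay_s` seconds to finish."""
    time.sleep(delay_s)
    return "restarted"


def run_with_timeout(tool: Callable[..., Any], args: dict, timeout_s: float) -> dict:
    """Run an approved tool, giving up on the result after `timeout_s` seconds."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(tool, **args)
    try:
        return {"ok": True, "result": future.result(timeout=timeout_s)}
    except FutureTimeout:
        # The worker thread keeps running; we only stop waiting for it.
        return {"ok": False, "error": f"timed out after {timeout_s}s"}
    finally:
        pool.shutdown(wait=False)
```

Returning a small result dictionary instead of raising keeps the UI code simple: it can show either the tool's output or the timeout message in the same place.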

Future Enhancements

Migrate to google-genai SDK before June 2026 deprecation deadline
Add role-based auto-approval for read-only operations
Implement tool execution timeouts and cancellation
Add persistent audit logging to database for compliance
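
As a rough idea of what persistent audit logging could look like, the sketch below writes each approval decision to SQLite using only the standard library. The table layout, column names, and the `operator` field are assumptions for illustration; a production deployment would likely target whatever database the compliance requirements call for.

```python
import json
import sqlite3
from datetime import datetime, timezone


def open_audit_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the audit database. ':memory:' is for demos only."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS audit_log (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               ts TEXT NOT NULL,
               operator TEXT NOT NULL,
               tool TEXT NOT NULL,
               args TEXT NOT NULL,
               decision TEXT NOT NULL
           )"""
    )
    return conn


def record_decision(conn: sqlite3.Connection, operator: str, tool: str,
                    args: dict, decision: str) -> int:
    """Append one approve/reject decision and return its row id."""
    if decision not in ("approved", "rejected"):
        raise ValueError(f"unexpected decision: {decision}")
    cur = conn.execute(
        "INSERT INTO audit_log (ts, operator, tool, args, decision) "
        "VALUES (?, ?, ?, ?, ?)",
        (
            datetime.now(timezone.utc).isoformat(),
            operator,
            tool,
            json.dumps(args, sort_keys=True),  # stable text form of the arguments
            decision,
        ),
    )
    conn.commit()
    return cur.lastrowid
```

Recording rejections as well as approvals matters for an audit trail, since a rejected proposal is evidence that the gate worked.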
