AI/ML Engineering • Project

Net-Ops Agent

Agentic AI with Human-in-the-Loop Safety Guardrails

Gemini 2.0 Flash Lite • Native Function Calling • Python Type Hints • Streamlit Session State • Human-in-the-Loop (HITL)
Nov 2024 - Jan 2025
Python 3.12
System Prototyping (Agentic AI)

Overview

Net-Ops Agent is an agentic AI system that translates natural-language operational commands into safe, deterministic Python function calls. It uses a Reasoning-Action Separation pattern with Human-in-the-Loop (HITL) approval: the model proposes an action, and nothing runs until a person authorizes it.

The system replaces dangerous open-ended code generation with strict function calling. The LLM reasons about which tool to use (e.g., restart_service, scale_cluster), but execution is intercepted and requires explicit human approval before any Python function runs.

Built for enterprise safety, it keeps tool definitions type-safe through Python type hints and places a mandatory approval gate in front of every execution.

Key Achievement: Built an agentic AI system that combines a reasoning-action separation pattern, mandatory human-in-the-loop approval gates, and deterministic function calling to prevent unauthorized executions.

Key Metrics & Results

  • Approval Pattern: HITL
  • Tool Functions: 3
  • Approval Required: Mandatory
  • Tool Definitions: Type Safe

Problem Statement

Enterprises are hesitant to deploy GenAI for operations because they fear an LLM will hallucinate a dangerous command like `delete database` or `sudo rm -rf`. Chatbots are useful for advice, but dangerous for action. Operations require deterministic, auditable commands with zero risk of unauthorized execution.

Business Context

Network operations involve critical infrastructure where an incorrect command can cause service outages, data loss, and revenue impact. Agentic systems that execute model output directly, with no approval step, are unsuitable for production use in this setting.

Technical Challenges

Solution Architecture

The system uses a three-phase safety architecture: (1) Tool Definition, with strict type hints; (2) Reasoning Phase, where the LLM selects a tool and its arguments; and (3) Approval Gate, which intercepts execution and requires human authorization. Every tool call is validated before it is presented for approval.
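The validation step between the reasoning phase and the approval gate can be sketched as a check of the proposed arguments against the tool's own type hints. This is a minimal sketch, not the project's implementation: validate_call is a hypothetical helper, the parameter names of scale_cluster are assumptions, and only plain classes such as str and int are checked.

```python
import inspect
from typing import Any, Callable, get_type_hints


def scale_cluster(cluster_name: str, replicas: int) -> str:
    """Hypothetical tool stub: scale a cluster to a replica count."""
    return f"scaled {cluster_name} to {replicas}"


def validate_call(func: Callable[..., Any], args: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the call can go to approval."""
    hints = get_type_hints(func)
    hints.pop("return", None)
    params = inspect.signature(func).parameters
    errors = []
    # Reject arguments the tool does not declare.
    for name in args:
        if name not in params:
            errors.append(f"unexpected argument: {name}")
    for name, param in params.items():
        if name not in args:
            # A parameter without a default must be supplied.
            if param.default is inspect.Parameter.empty:
                errors.append(f"missing argument: {name}")
            continue
        expected = hints.get(name)
        # Only plain classes (str, int, ...) are checked in this sketch.
        if isinstance(expected, type) and not isinstance(args[name], expected):
            errors.append(
                f"{name}: expected {expected.__name__}, got {type(args[name]).__name__}"
            )
    return errors
```

A call that fails validation would be rejected before it ever reaches the operator, so the approval screen only shows well-formed actions.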

System Components

Deterministic Toolbelt

Pre-defined Python functions (get_service_health, restart_service, scale_cluster) with strict type hints and docstrings. These definitions are passed to Gemini as tool declarations, so the model sees each tool's name, parameters, and purpose. No code generation is allowed; the model can only call functions from the approved list.
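A toolbelt of this kind might look like the sketch below. The three function names come from the project; the parameter names, return values, and stub bodies are assumptions for illustration, and describe_tool is a hypothetical helper showing what information type hints and docstrings make available for a tool declaration.

```python
import inspect
from typing import get_type_hints


def get_service_health(service_name: str) -> str:
    """Report the health of a service (stubbed for illustration)."""
    return f"{service_name}: healthy"


def restart_service(service_name: str) -> str:
    """Restart a named service (stubbed for illustration)."""
    return f"{service_name}: restarted"


def scale_cluster(cluster_name: str, replicas: int) -> str:
    """Scale a cluster to the given replica count (stubbed for illustration)."""
    return f"{cluster_name}: scaled to {replicas}"


# The approved list: nothing outside it is ever offered to the model.
TOOLBELT = [get_service_health, restart_service, scale_cluster]


def describe_tool(func) -> dict:
    """Collect the name, docstring, and parameter types that a tool exposes."""
    hints = get_type_hints(func)
    hints.pop("return", None)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func),
        "parameters": {name: tp.__name__ for name, tp in hints.items()},
    }
```

Keeping the type hints on the functions themselves means the declaration shown to the model and the code that eventually runs cannot drift apart.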

Agent Core with Function Calling

Uses Gemini 2.0 Flash Lite with native function calling support. The LLM receives the tool definitions and the user query, and returns a ToolCall object containing the function name and arguments. Automatic execution is disabled (enable_automatic_function_calling=False) so that every call must pass through manual approval.
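With automatic execution off, the application has to pull the proposed call out of the model response itself. The sketch below shows one way to do that without contacting the API. It assumes the response exposes candidates, each with content.parts, and that a part may carry a function_call with name and args; that shape is an assumption here, and the ToolCall dataclass is a hypothetical stand-in for the object the project uses. A SimpleNamespace object plays the role of the model response.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, Optional


@dataclass
class ToolCall:
    """Hypothetical container for a proposed, not yet executed, tool call."""
    name: str
    args: dict[str, Any]


def extract_tool_call(response: Any) -> Optional[ToolCall]:
    """Return the first proposed function call, or None if there is none."""
    for candidate in getattr(response, "candidates", None) or []:
        for part in candidate.content.parts:
            call = getattr(part, "function_call", None)
            if call is not None and getattr(call, "name", ""):
                # Copy the arguments into a plain dict; nothing is executed here.
                return ToolCall(name=call.name, args=dict(call.args or {}))
    return None


# Stand-in for a model response (assumed shape, built by hand for the example).
fake_response = SimpleNamespace(candidates=[SimpleNamespace(content=SimpleNamespace(parts=[
    SimpleNamespace(text="I will restart the DNS service."),
    SimpleNamespace(function_call=SimpleNamespace(
        name="restart_service", args={"service_name": "dns"})),
]))])

pending = extract_tool_call(fake_response)
```

The function only reads the response and returns data, which is what keeps reasoning and execution separate: running the tool is a later step that the approval gate controls.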

Safety Interface with Approval Gate

Streamlit UI with session state management. It displays the pending action (tool name and arguments) and requires an explicit click on Approve or Reject. The pending action is held in session state, which persists across script reruns within a session, so it is not lost during network lag. Only approved actions execute Python functions, and all actions are logged for audit.
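The gate logic itself can be sketched independently of the UI as a small state machine over a mapping. In the app that mapping would be st.session_state, which supports item access, in, and del; here a plain dict stands in so the example runs on its own. The key names, the propose and resolve helpers, and the audit record layout are all assumptions for illustration.

```python
from datetime import datetime, timezone
from typing import Any, Callable, MutableMapping

PENDING_KEY = "pending_action"  # hypothetical session-state key
AUDIT_KEY = "audit_log"         # hypothetical session-state key


def propose(state: MutableMapping[str, Any], name: str, args: dict[str, Any]) -> None:
    """Record a proposed action; nothing is executed at this point."""
    state[PENDING_KEY] = {"name": name, "args": args}


def resolve(state: MutableMapping[str, Any], approved: bool,
            tools: dict[str, Callable[..., Any]]) -> Any:
    """Apply the operator's decision; the tool runs only if approved."""
    action = state[PENDING_KEY]
    del state[PENDING_KEY]
    result = tools[action["name"]](**action["args"]) if approved else None
    if AUDIT_KEY not in state:
        state[AUDIT_KEY] = []
    # Both approvals and rejections are written to the audit log.
    state[AUDIT_KEY].append({
        "time": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "decision": "approved" if approved else "rejected",
    })
    return result
```

In a Streamlit page, the Approve and Reject buttons would call resolve with True or False; that wiring is omitted here.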

Technology Stack Rationale

Function calling replaces free-form code generation, so a hallucinated command cannot run: the model can only propose a call to an approved tool. Python type hints provide type safety for tool definitions. Streamlit session state provides reliable state persistence for the approval gate. The Human-in-the-Loop pattern ensures mandatory authorization for every execution.

Implementation Highlights

Key Features

Challenges & Solutions

Challenge 1

Preventing the LLM from inventing commands that are not in the approved toolbelt

Solution

Replaced open-ended code generation with strict function calling. The LLM can only select from the pre-defined tools passed via Gemini's tools parameter, and nothing outside the toolbelt can be executed.
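On the application side, the same restriction can be enforced again when a proposed call is resolved to a function, so that an unexpected tool name fails loudly instead of running anything. The sketch below is an assumption about how such a lookup might be written, not the project's code; UnknownToolError, APPROVED_TOOLS, and resolve_tool are hypothetical names, and the tool body is a stub.

```python
from typing import Any, Callable


class UnknownToolError(LookupError):
    """Raised when a proposed tool name is not on the approved list."""


def get_service_health(service_name: str) -> str:
    """Stubbed tool for illustration."""
    return f"{service_name}: healthy"


# Only functions registered here can ever be dispatched.
APPROVED_TOOLS: dict[str, Callable[..., Any]] = {
    get_service_health.__name__: get_service_health,
}


def resolve_tool(name: str) -> Callable[..., Any]:
    """Map a proposed tool name to its function, refusing anything unlisted."""
    try:
        return APPROVED_TOOLS[name]
    except KeyError:
        raise UnknownToolError(f"{name!r} is not an approved tool") from None
```

Because dispatch goes through a dictionary lookup rather than eval or dynamic import, a name such as delete_database has nothing to resolve to.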

Challenge 2

Ensuring reliable interception between reasoning and execution phases

Solution

Disabled automatic function calling (enable_automatic_function_calling=False). The system manually parses ToolCall objects from the response, stores them in session state, and displays the approval UI before anything executes.

Challenge 3

Keeping approval state intact across network lag

Solution

Used Streamlit session state to store pending actions. On each load, the UI checks session state and displays the approval gate if an action is pending. The state persists until the user approves or rejects the action.

Results & Impact

Deployed as part of the TRINITY Project NOC suite. The system demonstrates type-safe tool definitions through Python type hints and mandatory approval gates that prevent unauthorized executions, so operators can issue natural-language commands knowing that nothing runs without their explicit authorization.

Production Performance

  • Type-safe tool definitions with Python type hints
  • Mandatory approval gates for all tool executions
  • 3 operational tools: service health check, restart, cluster scaling
  • Sub-second reasoning latency with Gemini 2.0 Flash Lite

Lessons Learned

What Worked Well

What I'd Do Differently

Future Enhancements
