GCP Billing Agent - System Architecture

Overview

This document describes the overall architecture of the GCP Billing Agent application, from Agent Engine deployments through Cloud Run services to end-user interactions.

Architecture Diagram

graph TB
    subgraph "User Layer"
        User[👤 End User<br/>asl.apps-eval.com]
        Browser[🌐 Web Browser]
    end

    subgraph "Cloud Run Services"
        UI[Frontend UI Service<br/>agent-engine-ui<br/>React + Vite]
        API[Backend API Service<br/>agent-engine-api<br/>FastAPI]
    end

    subgraph "Vertex AI Agent Engine"
        Agent1[Reasoning Engine 1<br/>bq_agent<br/>ID: 1660126218499915776]
        Agent2[Reasoning Engine 2<br/>bq_agent_mick<br/>ID: 291031931779284992]
    end

    subgraph "Google Cloud Services"
        Firestore[(Firestore Database<br/>Users & Query History)]
        BigQuery[(BigQuery<br/>GCP Billing Data)]
    end

    subgraph "Authentication & Security"
        JWT[JWT Tokens<br/>Custom Auth]
        SA_API[Service Account<br/>agent-engine-api-sa]
        SA_UI[Service Account<br/>agent-engine-ui-sa]
        IAM[IAM Roles & Permissions]
    end

    subgraph "Data Flow"
        AuthFlow[Authentication Flow]
        QueryFlow[Query Flow]
        AgentDiscovery[Agent Discovery]
    end

    %% User interactions
    User --> Browser
    Browser -->|HTTPS| UI
    Browser -->|API Calls| API

    %% Frontend to Backend
    UI -->|REST API<br/>HTTPS| API

    %% Authentication
    Browser -->|Signup/Login| API
    API -->|Store Credentials| Firestore
    API -->|Verify/Generate| JWT
    JWT -->|Bearer Token| Browser
    Browser -->|Authenticated Requests| API

    %% Agent Discovery (Auto-scan)
    API -->|List Reasoning Engines<br/>GET /reasoningEngines| Agent1
    API -->|List Reasoning Engines<br/>GET /reasoningEngines| Agent2
    Agent1 -->|Agent Metadata| API
    Agent2 -->|Agent Metadata| API
    API -->|Cache Configs<br/>5 min TTL| AgentDiscovery

    %% Query Flow
    Browser -->|Query Request<br/>POST /query/stream| API
    API -->|Stream Query<br/>POST :streamQuery| Agent1
    API -->|Stream Query<br/>POST :streamQuery| Agent2
    Agent1 -->|Stream Response<br/>SSE| API
    Agent2 -->|Stream Response<br/>SSE| API
    API -->|Forward Stream<br/>Server-Sent Events| Browser

    %% Agent to BigQuery
    Agent1 -->|Query BigQuery<br/>Read-Only| BigQuery
    Agent2 -->|Query BigQuery<br/>Read-Only| BigQuery

    %% Save Query History
    API -->|Save Query/Response| Firestore

    %% Service Account Permissions
    SA_API -->|Uses Credentials| IAM
    SA_UI -->|Uses Credentials| IAM
    IAM -->|aiplatform.reasoningEngines.*<br/>bigquery.*<br/>datastore.*| Agent1
    IAM -->|aiplatform.reasoningEngines.*<br/>bigquery.*<br/>datastore.*| Agent2
    IAM -->|datastore.*| Firestore
    IAM -->|bigquery.*| BigQuery

    %% Styling
    classDef userLayer fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    classDef cloudRun fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    classDef agentEngine fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    classDef dataStore fill:#fff3e0,stroke:#e65100,stroke-width:2px
    classDef security fill:#ffebee,stroke:#b71c1c,stroke-width:2px

    class User,Browser userLayer
    class UI,API cloudRun
    class Agent1,Agent2 agentEngine
    class Firestore,BigQuery dataStore
    class JWT,SA_API,SA_UI,IAM security

Component Details

1. Frontend UI Service (Cloud Run)

Service: agent-engine-ui
Technology: React + Vite
Function:
- User interface for chat-based interaction
- Authentication UI (signup/login)
- Agent selection dropdown
- Query history display
- Table formatting for agent responses

2. Backend API Service (Cloud Run)

Service: agent-engine-api
Technology: FastAPI (Python)
Function:
- REST API endpoints
- JWT authentication
- Agent discovery (auto-scan from Agent Engine)
- Query streaming to reasoning engines
- Query history management
- Firestore integration

3. Vertex AI Agent Engine

Deployments:
- bq_agent - BigQuery billing analysis agent
- bq_agent_mick - BigQuery billing analysis agent (variant)
Function:
- Execute natural language queries
- Query BigQuery with read-only access
- Return structured responses
- Maintain conversation context

4. Firestore Database

Collections:
- users - User accounts (email, hashed passwords)
- query_history - Query/response history per user
Function:
- User authentication storage
- Query history persistence
- Per-user data isolation

5. BigQuery

Dataset: gcp_billing_data
Table: billing_data_ndjson
Function:
- Store GCP billing export data
- Provide read-only access to agents
- Support complex billing analysis queries

Sequence Diagrams

Authentication Flow

sequenceDiagram
    participant User
    participant Browser
    participant UI as Frontend UI
    participant API as Backend API
    participant Firestore
    participant JWT as JWT Auth

    User->>Browser: Navigate to app
    Browser->>UI: Load login page
    User->>UI: Enter email/password
    UI->>API: POST /auth/signup
    API->>API: Validate domain (@asl.apps-eval.com)
    API->>API: Hash password (bcrypt)
    API->>Firestore: Store user credentials
    Firestore-->>API: User created
    API->>JWT: Generate access token
    JWT-->>API: JWT token
    API-->>UI: {access_token, user_id, email}
    UI->>Browser: Store token (localStorage)
    Browser->>UI: User logged in
    UI->>API: GET /agents (with Bearer token)
    API->>JWT: Verify token
    JWT-->>API: Valid user_id
    API-->>UI: Agent list

Agent Discovery Flow

sequenceDiagram
    participant API as Backend API
    participant Cache as Agent Cache
    participant AgentEngine as Vertex AI<br/>Agent Engine API
    participant UI as Frontend UI

    Note over API,Cache: On startup or cache expiry (5 min TTL)
    API->>Cache: Check cache
    Cache-->>API: Cache expired/missing
    API->>AgentEngine: GET /reasoningEngines
    AgentEngine-->>API: List of engines<br/>(bq_agent, bq_agent_mick)
    API->>API: Process & format configs
    API->>Cache: Store configs (5 min TTL)
    Cache-->>API: Configs cached
    UI->>API: GET /agents
    API->>Cache: Get cached configs
    Cache-->>API: Agent configs
    API-->>UI: Agent list JSON
    UI->>UI: Populate dropdown

Query Flow

sequenceDiagram
    participant User
    participant Browser
    participant UI as Frontend UI
    participant API as Backend API
    participant AgentEngine as Vertex AI<br/>Reasoning Engine
    participant BigQuery
    participant Firestore

    User->>UI: Select agent & type query
    UI->>API: POST /query/stream<br/>(Bearer token, agent_name, message)
    API->>API: Verify JWT token
    API->>API: Extract user_id from token
    API->>API: Get agent config (cached)
    API->>AgentEngine: POST /reasoningEngines/{id}:streamQuery<br/>(message, user_id)
    
    Note over AgentEngine,BigQuery: Agent processes query
    AgentEngine->>BigQuery: Execute SQL query<br/>(Read-only)
    BigQuery-->>AgentEngine: Query results
    AgentEngine->>AgentEngine: Process & format response
    
    AgentEngine-->>API: Stream response chunks<br/>(Server-Sent Events)
    API-->>UI: Forward SSE chunks
    UI->>Browser: Display streaming text
    Browser->>User: Real-time response
    
    Note over API,Firestore: After stream completes
    API->>API: Collect full response
    API->>Firestore: Save query history<br/>(user_id, agent, query, response)
    Firestore-->>API: History saved
    UI->>API: GET /history?user_id=...
    API->>Firestore: Fetch user history
    Firestore-->>API: History items
    API-->>UI: History JSON
    UI->>Browser: Update history sidebar

Data Flows

Authentication Flow

User signs up with @asl.apps-eval.com email
Backend validates domain and hashes password (bcrypt)
User credentials stored in Firestore
Backend generates JWT token
Frontend stores token and sends in subsequent requests

Agent Discovery Flow

Backend starts up or cache expires (5 min TTL)
Backend calls Vertex AI API: GET /reasoningEngines
API returns list of all deployed reasoning engines
Backend processes and caches agent configs
Frontend calls /agents endpoint to get available agents
Dropdown populated with discovered agents

Query Flow

User selects agent and types query
Frontend sends authenticated request to /query/stream
Backend validates JWT and extracts user_id
Backend streams query to selected reasoning engine: POST /reasoningEngines/{id}:streamQuery
Reasoning engine queries BigQuery (if needed)
Agent processes query and streams response back
Backend forwards stream to frontend via Server-Sent Events (SSE)
After completion, backend saves query/response to Firestore
Frontend displays response and updates history

Security & IAM

Service Accounts

API Service Account (agent-engine-api-sa):
- Custom role: gcpBillingAgentService
- Permissions: aiplatform.reasoningEngines.*, bigquery.*, datastore.*
- Also has: roles/aiplatform.admin (for comprehensive access)
UI Service Account (agent-engine-ui-sa):
- Permissions: roles/run.invoker (to invoke API service)

Authentication

Custom JWT-based authentication
Domain restriction: @asl.apps-eval.com only
Password hashing: bcrypt (72-byte limit enforced)
Token expiration: Configurable (default 7 days)

Network Security

Cloud Run services: Public HTTPS ingress (--ingress=all)
No VPC connector (simplified deployment)
CORS configured for UI → API communication
All traffic encrypted via HTTPS/TLS

Deployment Architecture

Deployment Flow

Infrastructure Setup (01-infrastructure.sh):
- Enable APIs
- Create service accounts
- Set up IAM permissions
IAM Configuration (02-iam-permissions.sh):
- Create custom IAM role
- Grant permissions to service accounts
Application Deployment (03-applications.sh):
- Build Docker images (Cloud Build)
- Deploy backend to Cloud Run
- Deploy frontend to Cloud Run
- Set environment variables
- Configure service accounts
Agent Engine Deployments (separate):
- Deploy agents using ADK CLI
- Agents automatically discovered by backend

Scaling & Performance

Cloud Run: Auto-scaling (0 to N instances)
Agent Discovery: 5-minute cache TTL to reduce API calls
Query Streaming: Server-Sent Events for real-time responses
Firestore: Automatic scaling for user and history data
BigQuery: Serverless, automatically scales to query size

Monitoring & Logging

Cloud Run Logs: Application logs, startup logs, errors
Cloud Logging: Centralized logging for all services
Error Handling: Graceful fallbacks for agent discovery
Health Checks: Built-in Cloud Run health checks

Future Enhancements

Add authentication middleware for additional security layers
Implement rate limiting per user
Add query result caching
Support for additional data sources
Multi-domain support (e.g., innovationbox.cloud)
Admin dashboard for user management
Query analytics and usage metrics