GCP Billing Agent - System Architecture
Overview
This document describes the overall architecture of the GCP Billing Agent application, from Agent Engine deployments through Cloud Run services to end-user interactions.
Architecture Diagram
graph TB
subgraph "User Layer"
User[👤 End User<br/>asl.apps-eval.com]
Browser[🌐 Web Browser]
end
subgraph "Cloud Run Services"
UI[Frontend UI Service<br/>agent-engine-ui<br/>React + Vite]
API[Backend API Service<br/>agent-engine-api<br/>FastAPI]
end
subgraph "Vertex AI Agent Engine"
Agent1[Reasoning Engine 1<br/>bq_agent<br/>ID: 1660126218499915776]
Agent2[Reasoning Engine 2<br/>bq_agent_mick<br/>ID: 291031931779284992]
end
subgraph "Google Cloud Services"
Firestore[(Firestore Database<br/>Users & Query History)]
BigQuery[(BigQuery<br/>GCP Billing Data)]
end
subgraph "Authentication & Security"
JWT[JWT Tokens<br/>Custom Auth]
SA_API[Service Account<br/>agent-engine-api-sa]
SA_UI[Service Account<br/>agent-engine-ui-sa]
IAM[IAM Roles & Permissions]
end
subgraph "Data Flow"
AuthFlow[Authentication Flow]
QueryFlow[Query Flow]
AgentDiscovery[Agent Discovery]
end
%% User interactions
User --> Browser
Browser -->|HTTPS| UI
Browser -->|API Calls| API
%% Frontend to Backend
UI -->|REST API<br/>HTTPS| API
%% Authentication
Browser -->|Signup/Login| API
API -->|Store Credentials| Firestore
API -->|Verify/Generate| JWT
JWT -->|Bearer Token| Browser
Browser -->|Authenticated Requests| API
%% Agent Discovery (Auto-scan)
API -->|List Reasoning Engines<br/>GET /reasoningEngines| Agent1
API -->|List Reasoning Engines<br/>GET /reasoningEngines| Agent2
Agent1 -->|Agent Metadata| API
Agent2 -->|Agent Metadata| API
API -->|Cache Configs<br/>5 min TTL| AgentDiscovery
%% Query Flow
Browser -->|Query Request<br/>POST /query/stream| API
API -->|Stream Query<br/>POST :streamQuery| Agent1
API -->|Stream Query<br/>POST :streamQuery| Agent2
Agent1 -->|Stream Response<br/>SSE| API
Agent2 -->|Stream Response<br/>SSE| API
API -->|Forward Stream<br/>Server-Sent Events| Browser
%% Agent to BigQuery
Agent1 -->|Query BigQuery<br/>Read-Only| BigQuery
Agent2 -->|Query BigQuery<br/>Read-Only| BigQuery
%% Save Query History
API -->|Save Query/Response| Firestore
%% Service Account Permissions
SA_API -->|Uses Credentials| IAM
SA_UI -->|Uses Credentials| IAM
IAM -->|aiplatform.reasoningEngines.*<br/>bigquery.*<br/>datastore.*| Agent1
IAM -->|aiplatform.reasoningEngines.*<br/>bigquery.*<br/>datastore.*| Agent2
IAM -->|datastore.*| Firestore
IAM -->|bigquery.*| BigQuery
%% Styling
classDef userLayer fill:#e1f5ff,stroke:#01579b,stroke-width:2px
classDef cloudRun fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
classDef agentEngine fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
classDef dataStore fill:#fff3e0,stroke:#e65100,stroke-width:2px
classDef security fill:#ffebee,stroke:#b71c1c,stroke-width:2px
class User,Browser userLayer
class UI,API cloudRun
class Agent1,Agent2 agentEngine
class Firestore,BigQuery dataStore
class JWT,SA_API,SA_UI,IAM security
Component Details
1. Frontend UI Service (Cloud Run)
- Service: agent-engine-ui
- Technology: React + Vite
- Function:
  - User interface for chat-based interaction
  - Authentication UI (signup/login)
  - Agent selection dropdown
  - Query history display
  - Table formatting for agent responses
2. Backend API Service (Cloud Run)
- Service: agent-engine-api
- Technology: FastAPI (Python)
- Function:
  - REST API endpoints
  - JWT authentication
  - Agent discovery (auto-scan from Agent Engine)
  - Query streaming to reasoning engines
  - Query history management
  - Firestore integration
3. Vertex AI Agent Engine
- Deployments:
  - bq_agent: BigQuery billing analysis agent
  - bq_agent_mick: BigQuery billing analysis agent (variant)
- Function:
  - Execute natural language queries
  - Query BigQuery with read-only access
  - Return structured responses
  - Maintain conversation context
4. Firestore Database
- Collections:
  - users: User accounts (email, hashed passwords)
  - query_history: Query/response history per user
- Function:
  - User authentication storage
  - Query history persistence
  - Per-user data isolation
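The per-user isolation described above can be sketched with an in-memory stand-in for the query_history collection. The field names (user_id, agent, query, response, created_at) and the class names are assumptions for illustration; the real Firestore schema is not shown in this document.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class HistoryItem:
    """One query/response pair. Field names are assumed, not the real schema."""
    user_id: str
    agent: str
    query: str
    response: str
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class InMemoryHistoryStore:
    """Hypothetical stand-in for the query_history collection (a plain list)."""

    def __init__(self) -> None:
        self._items: List[HistoryItem] = []

    def save(self, item: HistoryItem) -> None:
        self._items.append(item)

    def for_user(self, user_id: str) -> List[HistoryItem]:
        # Per-user isolation: only items owned by the requested user are returned.
        return [item for item in self._items if item.user_id == user_id]
```

In a real deployment the filter would be expressed as a Firestore query on the user identifier rather than a list comprehension; the point here is only that every read is scoped to one user.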
5. BigQuery
- Dataset: gcp_billing_data
- Table: billing_data_ndjson
- Function:
  - Store GCP billing export data
  - Provide read-only access to agents
  - Support complex billing analysis queries
Sequence Diagrams
Authentication Flow
sequenceDiagram
participant User
participant Browser
participant UI as Frontend UI
participant API as Backend API
participant Firestore
participant JWT as JWT Auth
User->>Browser: Navigate to app
Browser->>UI: Load login page
User->>UI: Enter email/password
UI->>API: POST /auth/signup
API->>API: Validate domain (@asl.apps-eval.com)
API->>API: Hash password (bcrypt)
API->>Firestore: Store user credentials
Firestore-->>API: User created
API->>JWT: Generate access token
JWT-->>API: JWT token
API-->>UI: {access_token, user_id, email}
UI->>Browser: Store token (localStorage)
Browser->>UI: User logged in
UI->>API: GET /agents (with Bearer token)
API->>JWT: Verify token
JWT-->>API: Valid user_id
API-->>UI: Agent list
Agent Discovery Flow
sequenceDiagram
participant API as Backend API
participant Cache as Agent Cache
participant AgentEngine as Vertex AI<br/>Agent Engine API
participant UI as Frontend UI
Note over API,Cache: On startup or cache expiry (5 min TTL)
API->>Cache: Check cache
Cache-->>API: Cache expired/missing
API->>AgentEngine: GET /reasoningEngines
AgentEngine-->>API: List of engines<br/>(bq_agent, bq_agent_mick)
API->>API: Process & format configs
API->>Cache: Store configs (5 min TTL)
Cache-->>API: Configs cached
UI->>API: GET /agents
API->>Cache: Get cached configs
Cache-->>API: Agent configs
API-->>UI: Agent list JSON
UI->>UI: Populate dropdown
Query Flow
sequenceDiagram
participant User
participant Browser
participant UI as Frontend UI
participant API as Backend API
participant AgentEngine as Vertex AI<br/>Reasoning Engine
participant BigQuery
participant Firestore
User->>UI: Select agent & type query
UI->>API: POST /query/stream<br/>(Bearer token, agent_name, message)
API->>API: Verify JWT token
API->>API: Extract user_id from token
API->>API: Get agent config (cached)
API->>AgentEngine: POST /reasoningEngines/{id}:streamQuery<br/>(message, user_id)
Note over AgentEngine,BigQuery: Agent processes query
AgentEngine->>BigQuery: Execute SQL query<br/>(Read-only)
BigQuery-->>AgentEngine: Query results
AgentEngine->>AgentEngine: Process & format response
AgentEngine-->>API: Stream response chunks<br/>(Server-Sent Events)
API-->>UI: Forward SSE chunks
UI->>Browser: Display streaming text
Browser->>User: Real-time response
Note over API,Firestore: After stream completes
API->>API: Collect full response
API->>Firestore: Save query history<br/>(user_id, agent, query, response)
Firestore-->>API: History saved
UI->>API: GET /history?user_id=...
API->>Firestore: Fetch user history
Firestore-->>API: History items
API-->>UI: History JSON
UI->>Browser: Update history sidebar
Data Flows
Authentication Flow
- User signs up with an @asl.apps-eval.com email
- Backend validates domain and hashes password (bcrypt)
- User credentials stored in Firestore
- Backend generates JWT token
- Frontend stores token and sends it in subsequent requests
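The token steps above can be sketched with the standard library alone. This is a minimal illustration, not the backend's actual code: the HS256 algorithm, the claim names (sub, iat, exp) and the function names issue_token and verify_token are assumptions, and a production service would more likely rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SEVEN_DAYS_SECONDS = 7 * 24 * 60 * 60  # matches the documented default expiry


def _b64url(data: bytes) -> str:
    # JWT uses URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(user_id: str, secret: bytes, now: Optional[float] = None,
                ttl_seconds: int = SEVEN_DAYS_SECONDS) -> str:
    """Build a signed token carrying the user id and an expiry time."""
    issued_at = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": user_id, "iat": issued_at, "exp": issued_at + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> str:
    """Return the user id, or raise ValueError if the token is invalid or expired."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    # The header's "alg" field is ignored on purpose: only HS256 is ever accepted.
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if current >= payload["exp"]:
        raise ValueError("token expired")
    return payload["sub"]
```

The frontend would send the returned string in an `Authorization: Bearer <token>` header, and the backend would call something like verify_token on each request to recover the user_id.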
Agent Discovery Flow
- Backend starts up or cache expires (5 min TTL)
- Backend calls Vertex AI API: GET /reasoningEngines
- API returns list of all deployed reasoning engines
- Backend processes and caches agent configs
- Frontend calls /agents endpoint to get available agents
- Dropdown populated with discovered agents
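The caching step above could look roughly like the following. It is a sketch under assumptions: AgentConfigCache and its fetch callback are hypothetical names, and the callback stands in for the real call that lists reasoning engines.

```python
import time
from typing import Callable, List, Optional

AGENT_CACHE_TTL_SECONDS = 5 * 60  # the 5-minute TTL described above


class AgentConfigCache:
    """Caches the result of a discovery call and refreshes it once the TTL has passed."""

    def __init__(self, fetch: Callable[[], List[dict]],
                 ttl_seconds: float = AGENT_CACHE_TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # hypothetical: wraps GET /reasoningEngines
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the expiry logic can be tested
        self._configs: Optional[List[dict]] = None
        self._expires_at = 0.0

    def get(self) -> List[dict]:
        now = self._clock()
        if self._configs is None or now >= self._expires_at:
            # Cache is missing or expired: rediscover and restart the TTL window.
            self._configs = self._fetch()
            self._expires_at = now + self._ttl
        return self._configs
```

A handler for /agents would then call `cache.get()` on every request, so the Agent Engine API is contacted at most once per TTL window per instance.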
Query Flow
- User selects agent and types query
- Frontend sends authenticated request to /query/stream
- Backend validates JWT and extracts user_id
- Backend streams query to selected reasoning engine: POST /reasoningEngines/{id}:streamQuery
- Reasoning engine queries BigQuery (if needed)
- Agent processes query and streams response back
- Backend forwards stream to frontend via Server-Sent Events (SSE)
- After completion, backend saves query/response to Firestore
- Frontend displays response and updates history
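The forward-then-save part of this flow can be illustrated with a plain generator. The names format_sse, forward_stream and on_complete are hypothetical, and the chunks are assumed to arrive as text; the real backend may structure its stream differently.

```python
from typing import Callable, Iterable, Iterator, List


def format_sse(data: str) -> str:
    """Encode one chunk as a Server-Sent Events message."""
    # Each payload line gets its own "data:" field; a blank line ends the event.
    return "".join(f"data: {line}\n" for line in data.split("\n")) + "\n"


def forward_stream(chunks: Iterable[str],
                   on_complete: Callable[[str], None]) -> Iterator[str]:
    """Yield SSE messages as chunks arrive, then hand over the full response."""
    collected: List[str] = []
    for chunk in chunks:
        collected.append(chunk)
        yield format_sse(chunk)
    # Reached only after the stream is fully consumed, mirroring the
    # "after completion, save query/response" step.
    on_complete("".join(collected))
```

In this sketch on_complete would be the function that writes the query and the collected response to Firestore. If the client disconnects before the stream ends, the callback is never invoked, so a real service would need to decide whether partial responses should be saved.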
Security & IAM
Service Accounts
- API Service Account (agent-engine-api-sa):
  - Custom role: gcpBillingAgentService
  - Permissions: aiplatform.reasoningEngines.*, bigquery.*, datastore.*
  - Also has: roles/aiplatform.admin (for comprehensive access)
- UI Service Account (agent-engine-ui-sa):
  - Permissions: roles/run.invoker (to invoke API service)
Authentication
- Custom JWT-based authentication
- Domain restriction: @asl.apps-eval.com only
- Password hashing: bcrypt (72-byte limit enforced)
- Token expiration: Configurable (default 7 days)
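The domain restriction and the bcrypt length limit could be checked before hashing, roughly as follows. This is a sketch: the function names are hypothetical, and rejecting over-long passwords is an assumption, since the document does not say whether the backend rejects or truncates them.

```python
ALLOWED_EMAIL_DOMAIN = "asl.apps-eval.com"
BCRYPT_MAX_PASSWORD_BYTES = 72  # bcrypt only uses the first 72 bytes of its input


def is_allowed_email(email: str) -> bool:
    """True only for addresses of the form <local>@asl.apps-eval.com."""
    local, sep, domain = email.strip().rpartition("@")
    if sep != "@" or not local or "@" in local:
        return False
    # Exact match, so subdomains and look-alike suffixes are rejected.
    return domain.lower() == ALLOWED_EMAIL_DOMAIN


def check_password_length(password: str) -> None:
    """Raise ValueError if the UTF-8 encoded password exceeds the bcrypt limit."""
    # The limit is in bytes, not characters: non-ASCII characters count more than once.
    if len(password.encode("utf-8")) > BCRYPT_MAX_PASSWORD_BYTES:
        raise ValueError("password exceeds 72 bytes when UTF-8 encoded")
```

Measuring the encoded length matters because a password of fewer than 72 characters can still exceed 72 bytes once accented or non-Latin characters are involved.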
Network Security
- Cloud Run services: Public HTTPS ingress (--ingress=all)
- No VPC connector (simplified deployment)
- CORS configured for UI → API communication
- All traffic encrypted via HTTPS/TLS
Deployment Architecture
Deployment Flow
- Infrastructure Setup (01-infrastructure.sh):
  - Enable APIs
  - Create service accounts
  - Set up IAM permissions
- IAM Configuration (02-iam-permissions.sh):
  - Create custom IAM role
  - Grant permissions to service accounts
- Application Deployment (03-applications.sh):
  - Build Docker images (Cloud Build)
  - Deploy backend to Cloud Run
  - Deploy frontend to Cloud Run
  - Set environment variables
  - Configure service accounts
- Agent Engine Deployments (separate):
  - Deploy agents using ADK CLI
  - Agents automatically discovered by backend
Scaling & Performance
- Cloud Run: Auto-scaling (0 to N instances)
- Agent Discovery: 5-minute cache TTL to reduce API calls
- Query Streaming: Server-Sent Events for real-time responses
- Firestore: Automatic scaling for user and history data
- BigQuery: Serverless, automatically scales to query size
Monitoring & Logging
- Cloud Run Logs: Application logs, startup logs, errors
- Cloud Logging: Centralized logging for all services
- Error Handling: Graceful fallbacks for agent discovery
- Health Checks: Built-in Cloud Run health checks
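One possible shape for a graceful discovery fallback is shown below. The document does not describe the actual fallback behavior, so the policy here (prefer the last known configs, otherwise a static default) and the name discover_with_fallback are assumptions.

```python
import logging
from typing import Callable, List, Optional

logger = logging.getLogger("agent_discovery")


def discover_with_fallback(fetch: Callable[[], List[dict]],
                           last_known: Optional[List[dict]],
                           default: List[dict]) -> List[dict]:
    """Return fresh configs, or fall back to older data if discovery fails."""
    try:
        return fetch()
    except Exception:
        # Log with traceback so the failure is visible in Cloud Run logs,
        # but keep serving agents instead of failing the request.
        logger.exception("Agent discovery failed; using fallback configs")
        return last_known if last_known is not None else default
```

Serving slightly stale agent configs keeps the dropdown usable during a transient Vertex AI API error, at the cost of possibly listing an agent that has since been removed.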
Future Enhancements
- Add authentication middleware for additional security layers
- Implement rate limiting per user
- Add query result caching
- Support for additional data sources
- Multi-domain support (e.g., innovationbox.cloud)
- Admin dashboard for user management
- Query analytics and usage metrics