You can use Cloud Run to host AI agents. An AI agent implemented as a Cloud Run service performs tasks and provides information to users in a conversational manner. Cloud Run scales automatically without requiring you to provision resources, and you pay only for actual usage. AI agents are suited to a variety of purposes, such as customer service, virtual assistants, and content generation.
You can use a Cloud Run service as a scalable API endpoint to process prompts from end users. Your service runs an AI orchestration framework, such as LangChain, LangGraph, or Firebase Genkit, which orchestrates calls to the following (a minimal service sketch follows the list):
- AI models such as Gemini API, Vertex AI endpoints, or another GPU-enabled Cloud Run service.
- Vector databases such as Cloud SQL for PostgreSQL or AlloyDB for PostgreSQL with the pgvector extension.
- Other services or APIs.
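The following is a minimal sketch of such a service, assuming a Flask app that forwards prompts to a Gemini model through the Vertex AI SDK. The `/prompt` route, the `PROJECT_ID` and `REGION` environment variables, and the model name are illustrative assumptions; substitute your own orchestration framework and configuration.

```python
# A minimal sketch of a Cloud Run service that accepts prompts and forwards
# them to a Gemini model on Vertex AI. Route names, environment variables,
# and the model name are illustrative assumptions.
import os

from flask import Flask, jsonify, request
import vertexai
from vertexai.generative_models import GenerativeModel

app = Flask(__name__)

# PROJECT_ID and REGION are assumed to be set on the Cloud Run service.
vertexai.init(
    project=os.environ.get("PROJECT_ID"),
    location=os.environ.get("REGION", "us-central1"),
)
model = GenerativeModel("gemini-1.5-flash")  # example model name

@app.route("/prompt", methods=["POST"])
def handle_prompt():
    """Accept a JSON body like {"prompt": "..."} and return the model's reply."""
    prompt = (request.get_json(silent=True) or {}).get("prompt", "")
    if not prompt:
        return jsonify({"error": "missing 'prompt' field"}), 400
    response = model.generate_content(prompt)
    return jsonify({"answer": response.text})

if __name__ == "__main__":
    # Cloud Run sets the PORT environment variable for the container.
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
```

In a production agent, the `generate_content` call would typically be replaced by a chain or graph from your orchestration framework that also performs retrieval and tool calls.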
For a more detailed architecture, see Infrastructure for a RAG-capable generative AI application using Vertex AI and AlloyDB for PostgreSQL.
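As one piece of that RAG flow, the retrieval step can be a pgvector similarity search against Cloud SQL for PostgreSQL or AlloyDB for PostgreSQL. The sketch below assumes a `documents` table with `content` and `embedding` columns, a `DATABASE_URL` environment variable, and a Vertex AI embedding model; all of these names are illustrative.

```python
# A minimal sketch of retrieval for a RAG flow: embed the user's question,
# then run a pgvector similarity search in PostgreSQL. Table, column, and
# model names are assumptions for illustration.
import os

import psycopg2
from vertexai.language_models import TextEmbeddingModel

def retrieve_context(question: str, k: int = 5) -> list[str]:
    # Embed the question with a Vertex AI embedding model (example model name).
    embedding_model = TextEmbeddingModel.from_pretrained("text-embedding-004")
    vector = embedding_model.get_embeddings([question])[0].values

    # DATABASE_URL is an assumed environment variable pointing at a
    # Cloud SQL for PostgreSQL or AlloyDB instance with pgvector enabled.
    with psycopg2.connect(os.environ["DATABASE_URL"]) as conn:
        with conn.cursor() as cur:
            # "<=>" is pgvector's cosine-distance operator; the documents
            # table and its embedding column are assumed to exist already.
            cur.execute(
                "SELECT content FROM documents "
                "ORDER BY embedding <=> %s::vector LIMIT %s",
                (str(vector), k),
            )
            return [row[0] for row in cur.fetchall()]
```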
Learn how to deploy Firebase Genkit to Cloud Run in the Firebase Genkit documentation.
Learn how to build and deploy a LangChain app to Cloud Run by working through a codelab.