🤖 AutoStream AI Agent
Conversational AI + Lead Generation System
An intelligent support agent that understands user intent, retrieves answers from a local knowledge base, detects high-intent leads, and captures user details — all through a clean, real-time chat interface.
Demo video: https://www.youtube.com/watch?v=z_nOnDO30Gs
📌 Overview
AutoStream AI Agent is a full-stack conversational AI system built for ServiceHive's Inflx platform. It simulates a real-world SaaS support assistant that goes beyond answering questions — it identifies when a user is ready to convert, collects their details, and triggers a lead capture workflow automatically.
This project demonstrates a production-grade approach to building agentic AI systems using LangChain, Gemini 1.5 Flash, and a RAG pipeline backed by a local JSON knowledge base.
✨ Features
- Intent Detection — Classifies every user message as a greeting, product inquiry, or high-intent lead
- RAG-Powered Responses — Retrieves accurate answers from a local knowledge base (no hallucinations)
- Lead Capture Workflow — Collects name, email, and creator platform when high intent is detected
- Tool Execution — Calls `mock_lead_capture()` only after all required fields are collected
- Session Memory — Retains full conversation context across 5–6 turns
- FastAPI Backend — Clean REST API with a `/api/chat` endpoint
- Responsive Chat UI — Professional light/dark-mode frontend built with HTML, CSS, and JS
🧠 Tech Stack
| Layer | Technology |
|---|---|
| Language | Python 3.10+ |
| Backend Framework | FastAPI |
| LLM | Gemini 1.5 Flash (via LangChain) |
| Agent Framework | LangChain / Custom Agent Logic |
| Knowledge Base | JSON file (local RAG) |
| Memory | LangChain ConversationBufferMemory |
| Frontend | HTML5, CSS3, Vanilla JS |
| Environment | python-dotenv |
📂 Project Structure
```
autostream-ai-agent/
│
├── backend/
│   ├── agent.py              # Main agent orchestration logic
│   ├── app.py                # FastAPI server + /api/chat endpoint
│   ├── intent.py             # Intent classification (greeting / inquiry / lead)
│   ├── rag.py                # RAG pipeline — knowledge base retrieval
│   ├── memory.py             # Conversation session memory management
│   ├── tools.py              # Tool definitions + mock_lead_capture()
│   └── knowledge_base.json   # Local knowledge base (pricing, features, policies)
│
├── frontend/
│   ├── index.html            # Chat UI (single-page)
│   └── script.js             # Frontend logic — send, receive, render messages
│
├── .env                      # API keys (not committed)
├── .env.example              # Example environment variables
├── requirements.txt          # Python dependencies
└── README.md
```
⚙️ How to Run Locally
1. Clone the Repository

```bash
git clone https://github.com/your-username/autostream-ai-agent.git
cd autostream-ai-agent
```

2. Create a Virtual Environment

```bash
python -m venv venv
source venv/bin/activate   # macOS / Linux
venv\Scripts\activate      # Windows
```

3. Install Dependencies

```bash
pip install -r requirements.txt
```

4. Configure Environment Variables

```bash
cp .env.example .env
```

Open `.env` and add your credentials:

```
GEMINI_API_KEY=your_gemini_api_key_here
```

5. Start the Backend Server

```bash
uvicorn backend.app:app --reload --port 8000
```

6. Open the Frontend

Open `frontend/index.html` directly in your browser, or serve it with:

```bash
python -m http.server 5500 --directory frontend
```

Then visit: http://localhost:5500
🔌 API Endpoints
`POST /api/chat`
Send a user message and receive an AI-generated response.
Request Body

```json
{
  "system": "You are AutoStream AI...",
  "messages": [
    { "role": "user", "content": "What is the Pro plan pricing?" }
  ],
  "mode": "chat"
}
```

Response

```json
{
  "content": "The Pro plan is priced at $79/month and includes unlimited videos, 4K resolution, and AI captions.",
  "intent": "product_inquiry"
}
```
Status Codes

| Code | Meaning |
|---|---|
| 200 | Success |
| 400 | Bad request — missing fields |
| 500 | Internal server error |
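For reference, a minimal client-side sketch of building the request payload shown above (the helper name `build_chat_request` and the default system prompt are illustrative, not part of the repo):

```python
import json

def build_chat_request(user_message: str,
                       system_prompt: str = "You are AutoStream AI...") -> dict:
    """Assemble a POST /api/chat body matching the schema above."""
    return {
        "system": system_prompt,
        "messages": [{"role": "user", "content": user_message}],
        "mode": "chat",
    }

payload = build_chat_request("What is the Pro plan pricing?")
print(json.dumps(payload, indent=2))
```

Send it with any HTTP client (e.g. `requests.post("http://localhost:8000/api/chat", json=payload)`).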
🏗️ Architecture Explanation
Why LangChain?
LangChain was chosen for its modular architecture — it separates concerns cleanly between the LLM, memory, tools, and retrieval layers. This makes it straightforward to swap components (e.g., replace Gemini with Claude or GPT) without rewriting the agent logic.
How State is Managed
Conversation state is managed using LangChain's ConversationBufferMemory, which stores the full message history in-memory per session. Each API call includes the complete history so the LLM retains context across turns. For multi-user scenarios, sessions are keyed by a unique chat_id generated at the start of each conversation.
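The keying scheme can be sketched with a plain dictionary store; in the repo this role is played by LangChain's `ConversationBufferMemory`, and the class below is only an illustration of per-`chat_id` history:

```python
import uuid
from collections import defaultdict

class SessionStore:
    """Toy stand-in for per-session conversation memory."""

    def __init__(self):
        self._histories = defaultdict(list)  # chat_id -> message list

    def new_session(self) -> str:
        # The chat_id handed to the frontend at conversation start.
        return uuid.uuid4().hex

    def append(self, chat_id: str, role: str, content: str) -> None:
        self._histories[chat_id].append({"role": role, "content": content})

    def history(self, chat_id: str) -> list:
        # Full history is replayed to the LLM on every turn.
        return list(self._histories[chat_id])

store = SessionStore()
cid = store.new_session()
store.append(cid, "user", "Hi there")
store.append(cid, "assistant", "Hello! How can I assist you today?")
```

A second `new_session()` call yields a different `chat_id`, so concurrent users never share history.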
RAG Pipeline
The knowledge base (knowledge_base.json) stores AutoStream's pricing, features, and policies as structured documents. On each user query, rag.py performs a similarity search against this knowledge base and injects the most relevant context into the LLM prompt — ensuring responses are grounded in factual product data rather than model hallucinations.
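The retrieval step can be illustrated with keyword-overlap scoring over an inline knowledge base; the actual `rag.py` may score differently (e.g. with embeddings), but the shape — rank documents against the query, inject the top hit into the prompt — is the same:

```python
def retrieve(query: str, docs: list, k: int = 1) -> list:
    """Rank docs by word overlap with the query; return the top k."""
    q = set(query.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(q & set(d["text"].lower().split())),
        reverse=True,
    )
    return scored[:k]

# Inline stand-in for knowledge_base.json (illustrative entries).
KB = [
    {"id": "pricing", "text": "Basic plan $29/month 10 videos 720p; Pro plan $79/month unlimited 4K AI captions"},
    {"id": "refunds", "text": "Refunds are available within 14 days of purchase"},
]

top = retrieve("What is the Pro plan pricing", KB)
context = top[0]["text"]  # injected into the LLM prompt as grounding
```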
Tool Execution (Lead Capture)
tools.py defines mock_lead_capture() as a LangChain tool. The agent is instructed to call this function only after collecting the user's name, email, and creator platform — preventing premature triggering. The agent collects each field conversationally, one at a time, before executing the tool.
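The gating logic can be sketched as follows (the helper `maybe_capture` is illustrative; in the repo the LLM agent itself decides when to invoke the tool, but the invariant — no call until all three fields exist — is the same):

```python
REQUIRED = ("name", "email", "platform")

def mock_lead_capture(name: str, email: str, platform: str) -> dict:
    # In the repo this would log or forward the lead; here it just echoes.
    return {"status": "captured", "name": name, "email": email, "platform": platform}

def maybe_capture(lead: dict) -> dict:
    """Fire the tool only once every required field is present."""
    missing = [f for f in REQUIRED if not lead.get(f)]
    if missing:
        # Agent keeps asking for the next missing field instead.
        return {"status": "pending", "missing": missing}
    return mock_lead_capture(lead["name"], lead["email"], lead["platform"])

partial = maybe_capture({"name": "Alex"})
done = maybe_capture({"name": "Alex", "email": "alex@gmail.com", "platform": "YouTube"})
```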
🎬 Demo Conversation Flow
User → "Hi there"
Agent → "Hello! I'm AutoStream AI. How can I assist you today?"
User → "What are your pricing plans?"
Agent → [RAG retrieval] "We offer two plans:
Basic — $29/month (10 videos, 720p)
Pro — $79/month (Unlimited, 4K, AI captions)"
User → "The Pro plan sounds great. I run a YouTube channel."
Agent → [High intent detected] "That's great to hear! May I get your name?"
User → "Alex"
Agent → "Thank you, Alex. Could you share your email address?"
User → "alex@gmail.com"
Agent → "Got it. And which platform are you primarily creating for?"
User → "YouTube"
Agent → [Calls mock_lead_capture("Alex", "alex@gmail.com", "YouTube")]
"Thank you, Alex. Your details have been captured.
Welcome to AutoStream — our team will be in touch shortly."
📱 WhatsApp Integration (Deployment Theory)
To deploy this agent on WhatsApp, the recommended approach is the WhatsApp Business API via Meta's Cloud API combined with a webhook-based architecture:

- Register a webhook on the Meta Developer Portal pointing to a public endpoint (e.g., `https://yourdomain.com/webhook`)
- When a user sends a WhatsApp message, Meta sends a `POST` request to your webhook with the message payload
- Your FastAPI server receives the payload, extracts the message text and sender ID, and routes it through the existing agent pipeline
- The agent's response is sent back to the user via a `POST` request to the WhatsApp Send Message API, using the user's phone number as the session identifier
- Session memory is maintained per phone number using a dictionary or Redis store

This approach requires no change to the core agent logic — only a new webhook handler and a WhatsApp API client (e.g., `pywa`, `whatsapp-python`, or direct HTTP calls).
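The payload-extraction step could look roughly like this; the nesting follows Meta's documented Cloud API webhook shape, but treat the exact field names as an assumption to verify against the current API reference:

```python
def parse_whatsapp_message(payload: dict):
    """Pull (sender phone, message text) from a Cloud API webhook event."""
    try:
        msg = payload["entry"][0]["changes"][0]["value"]["messages"][0]
        return msg["from"], msg["text"]["body"]
    except (KeyError, IndexError):
        return None  # delivery/status event or non-text message

# Hand-built sample event mimicking the documented structure.
sample = {
    "entry": [{"changes": [{"value": {"messages": [
        {"from": "15551234567", "text": {"body": "What are your pricing plans?"}}
    ]}}]}]
}
sender, text = parse_whatsapp_message(sample)
```

The returned phone number then serves as the session key, and `text` feeds straight into the existing agent pipeline.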
🚀 Future Improvements
- Vector database (Pinecone / ChromaDB) for scalable RAG beyond JSON
- Streaming responses for real-time token-by-token output
- User authentication and persistent session storage (PostgreSQL / Redis)
- WhatsApp & Telegram deployment via webhook integrations
- Analytics dashboard — track intent distribution and lead conversion rate
- Multi-language support for global content creator audiences
- Human handoff — escalate to a live agent when confidence is low
👤 Author
Built for ServiceHive — Inflx Platform Machine Learning Intern Assignment — Social-to-Lead Agentic Workflow
| | |
|---|---|
| Project | AutoStream AI Agent |
| Stack | Python · FastAPI · LangChain · Gemini · HTML/JS |
| Assignment | ServiceHive / Inflx — ML Intern Take-Home |
AutoStream AI Agent — Built with precision. Designed for conversion.