
🤖 AutoStream AI Agent

Conversational AI + Lead Generation System

An intelligent support agent that understands user intent, retrieves answers from a local knowledge base, detects high-intent leads, and captures user details — all through a clean, real-time chat interface.

Demo video: https://www.youtube.com/watch?v=z_nOnDO30Gs

📌 Overview

AutoStream AI Agent is a full-stack conversational AI system built for ServiceHive's Inflx platform. It simulates a real-world SaaS support assistant that goes beyond answering questions — it identifies when a user is ready to convert, collects their details, and triggers a lead capture workflow automatically.

This project demonstrates a production-grade approach to building agentic AI systems using LangChain, Gemini 1.5 Flash, and a RAG pipeline backed by a local JSON knowledge base.


✨ Features

  • Intent Detection — Classifies every user message as a greeting, product inquiry, or high-intent lead
  • RAG-Powered Responses — Retrieves accurate answers from a local knowledge base (no hallucinations)
  • Lead Capture Workflow — Collects name, email, and creator platform when high intent is detected
  • Tool Execution — Calls mock_lead_capture() only after all required fields are collected
  • Session Memory — Retains full conversation context across turns within a session
  • FastAPI Backend — Clean REST API with a /api/chat endpoint
  • Responsive Chat UI — Professional light/dark-mode frontend built with HTML, CSS, and JS

🧠 Tech Stack

| Layer | Technology |
| --- | --- |
| Language | Python 3.10+ |
| Backend Framework | FastAPI |
| LLM | Gemini 1.5 Flash (via LangChain) |
| Agent Framework | LangChain / Custom Agent Logic |
| Knowledge Base | JSON file (local RAG) |
| Memory | LangChain ConversationBufferMemory |
| Frontend | HTML5, CSS3, Vanilla JS |
| Environment | python-dotenv |

📂 Project Structure

autostream-ai-agent/
│
├── backend/
│   ├── agent.py              # Main agent orchestration logic
│   ├── app.py                # FastAPI server + /api/chat endpoint
│   ├── intent.py             # Intent classification (greeting / inquiry / lead)
│   ├── rag.py                # RAG pipeline — knowledge base retrieval
│   ├── memory.py             # Conversation session memory management
│   ├── tools.py              # Tool definitions + mock_lead_capture()
│   └── knowledge_base.json   # Local knowledge base (pricing, features, policies)
│
├── frontend/
│   ├── index.html            # Chat UI (single-page)
│   └── script.js             # Frontend logic — send, receive, render messages
│
├── .env                      # API keys (not committed)
├── .env.example              # Example environment variables
├── requirements.txt          # Python dependencies
└── README.md

⚙️ How to Run Locally

1. Clone the Repository

git clone https://github.com/your-username/autostream-ai-agent.git
cd autostream-ai-agent

2. Create a Virtual Environment

python -m venv venv
source venv/bin/activate        # macOS / Linux
venv\Scripts\activate           # Windows

3. Install Dependencies

pip install -r requirements.txt

4. Configure Environment Variables

cp .env.example .env

Open .env and add your credentials:

GEMINI_API_KEY=your_gemini_api_key_here

5. Start the Backend Server

uvicorn backend.app:app --reload --port 8000

6. Open the Frontend

Open frontend/index.html directly in your browser, or serve it with:

python -m http.server 5500 --directory frontend

Then visit: http://localhost:5500


🔌 API Endpoints

POST /api/chat

Send a user message and receive an AI-generated response.

Request Body

{
  "system": "You are AutoStream AI...",
  "messages": [
    { "role": "user", "content": "What is the Pro plan pricing?" }
  ],
  "mode": "chat"
}

Response

{
  "content": "The Pro plan is priced at $79/month and includes unlimited videos, 4K resolution, and AI captions.",
  "intent": "product_inquiry"
}

Status Codes

| Code | Meaning |
| --- | --- |
| 200 | Success |
| 400 | Bad request — missing fields |
| 500 | Internal server error |
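For quick manual testing without the frontend, the endpoint can be exercised from Python's standard library. This sketch builds the request body shown above and posts it to the local server; the URL/port follow the `uvicorn` command from the setup steps, and the helper names (`build_chat_request`, `send_chat`) are illustrative, not part of the backend code.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/api/chat"  # assumes the uvicorn command above

def build_chat_request(user_message: str) -> dict:
    """Assemble a request body matching the /api/chat schema above."""
    return {
        "system": "You are AutoStream AI...",
        "messages": [{"role": "user", "content": user_message}],
        "mode": "chat",
    }

def send_chat(user_message: str) -> dict:
    """POST the message to the backend and return the parsed JSON reply."""
    body = json.dumps(build_chat_request(user_message)).encode("utf-8")
    req = urllib.request.Request(
        API_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# With the server running:
# print(send_chat("What is the Pro plan pricing?")["content"])
```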

🏗️ Architecture Explanation

Why LangChain?

LangChain was chosen for its modular architecture — it separates concerns cleanly between the LLM, memory, tools, and retrieval layers. This makes it straightforward to swap components (e.g., replace Gemini with Claude or GPT) without rewriting the agent logic.

How State is Managed

Conversation state is managed using LangChain's ConversationBufferMemory, which stores the full message history in-memory per session. Each API call includes the complete history so the LLM retains context across turns. For multi-user scenarios, sessions are keyed by a unique chat_id generated at the start of each conversation.
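The per-session keying described above can be sketched in plain Python (no LangChain dependency). `SessionStore` and its method names are illustrative, not the actual `memory.py` API:

```python
import uuid

class SessionStore:
    """Illustrative in-memory session store keyed by chat_id."""

    def __init__(self):
        self._sessions: dict[str, list[dict]] = {}

    def new_session(self) -> str:
        # A unique chat_id is minted at the start of each conversation.
        chat_id = str(uuid.uuid4())
        self._sessions[chat_id] = []
        return chat_id

    def append(self, chat_id: str, role: str, content: str) -> None:
        self._sessions[chat_id].append({"role": role, "content": content})

    def history(self, chat_id: str) -> list[dict]:
        # The full history is returned so every LLM call sees prior turns.
        return list(self._sessions[chat_id])

store = SessionStore()
cid = store.new_session()
store.append(cid, "user", "Hi there")
store.append(cid, "assistant", "Hello! How can I help?")
```

ConversationBufferMemory plays the same role inside LangChain; the point here is only that each session's buffer is isolated under its own `chat_id`.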

RAG Pipeline

The knowledge base (knowledge_base.json) stores AutoStream's pricing, features, and policies as structured documents. On each user query, rag.py performs a similarity search against this knowledge base and injects the most relevant context into the LLM prompt — ensuring responses are grounded in factual product data rather than model hallucinations.
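A minimal version of this retrieve-then-prompt step can be sketched with a simple word-overlap score (the real `rag.py` may rank differently, e.g. with embeddings). The documents below are made-up placeholders apart from the pricing facts quoted elsewhere in this README:

```python
import json

# Illustrative documents; the real knowledge_base.json schema may differ.
KNOWLEDGE_BASE = json.loads("""[
  {"topic": "pricing",
   "text": "Basic is $29/month for 10 videos at 720p. Pro is $79/month with unlimited videos, 4K, and AI captions."},
  {"topic": "features",
   "text": "AutoStream generates AI captions and supports 4K exports on the Pro plan."}
]""")

def retrieve(query: str, docs=KNOWLEDGE_BASE, top_k: int = 1) -> list[dict]:
    """Rank documents by word overlap with the query; return the best matches."""
    q_words = set(query.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(q_words & set(d["text"].lower().split())),
        reverse=True,
    )
    return scored[:top_k]

# Retrieved context is injected into the prompt so the LLM answers from it.
context = retrieve("What is the Pro plan pricing?")[0]["text"]
prompt = f"Answer using only this context:\n{context}\n\nQuestion: What is the Pro plan pricing?"
```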

Tool Execution (Lead Capture)

tools.py defines mock_lead_capture() as a LangChain tool. The agent is instructed to call this function only after collecting the user's name, email, and creator platform — preventing premature triggering. The agent collects each field conversationally, one at a time, before executing the tool.
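The gating logic can be illustrated in a few lines of plain Python. `LeadState` and this `mock_lead_capture` body are a sketch of the idea, not the actual `tools.py` implementation:

```python
from dataclasses import dataclass, field

REQUIRED_FIELDS = ("name", "email", "platform")

def mock_lead_capture(name: str, email: str, platform: str) -> dict:
    """Stand-in tool: simply echo the captured lead."""
    return {"status": "captured", "name": name, "email": email, "platform": platform}

@dataclass
class LeadState:
    fields: dict = field(default_factory=dict)

    def missing(self) -> list[str]:
        return [f for f in REQUIRED_FIELDS if f not in self.fields]

    def record(self, key: str, value: str):
        """Store one conversationally collected field; fire the tool only when complete."""
        self.fields[key] = value
        if not self.missing():
            return mock_lead_capture(**self.fields)
        return None  # still collecting — no premature tool call

state = LeadState()
state.record("name", "Alex")            # tool not called yet
state.record("email", "alex@gmail.com")  # tool not called yet
result = state.record("platform", "YouTube")  # all fields present → tool fires
```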


🎬 Demo Conversation Flow

User    → "Hi there"
Agent   → "Hello! I'm AutoStream AI. How can I assist you today?"

User    → "What are your pricing plans?"
Agent   → [RAG retrieval] "We offer two plans:
            Basic — $29/month (10 videos, 720p)
            Pro   — $79/month (Unlimited, 4K, AI captions)"

User    → "The Pro plan sounds great. I run a YouTube channel."
Agent   → [High intent detected] "That's great to hear! May I get your name?"

User    → "Alex"
Agent   → "Thank you, Alex. Could you share your email address?"

User    → "alex@gmail.com"
Agent   → "Got it. And which platform are you primarily creating for?"

User    → "YouTube"
Agent   → [Calls mock_lead_capture("Alex", "alex@gmail.com", "YouTube")]
          "Thank you, Alex. Your details have been captured.
           Welcome to AutoStream — our team will be in touch shortly."

📱 WhatsApp Integration (Deployment Theory)

To deploy this agent on WhatsApp, the recommended approach is using the WhatsApp Business API via Meta's Cloud API combined with a webhook-based architecture:

  1. Register a webhook on Meta Developer Portal pointing to a public endpoint (e.g., https://yourdomain.com/webhook)
  2. When a user sends a WhatsApp message, Meta sends a POST request to your webhook with the message payload
  3. Your FastAPI server receives the payload, extracts the message text and sender ID, and routes it through the existing agent pipeline
  4. The agent's response is sent back to the user via a POST request to the WhatsApp Send Message API using the user's phone number as the session identifier
  5. Session memory is maintained per phone number using a dictionary or Redis store

This approach requires no change to the core agent logic — only a new webhook handler and a WhatsApp API client (e.g., pywa, whatsapp-python, or direct HTTP calls).
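Step 3 above hinges on extracting the sender ID and text from Meta's webhook payload. The nesting in this sketch follows the Cloud API's documented format for incoming text messages, but real payloads can also carry statuses or media events, which it ignores:

```python
def extract_whatsapp_message(payload: dict):
    """Pull (sender_phone, text_body) from a Cloud API-style webhook payload.

    The sender's phone number doubles as the session identifier for memory.
    Returns None for events that are not plain text messages.
    """
    try:
        value = payload["entry"][0]["changes"][0]["value"]
        msg = value["messages"][0]
        return msg["from"], msg["text"]["body"]
    except (KeyError, IndexError):
        return None

# Shape based on Meta's documented webhook format (abridged).
sample = {
    "entry": [{"changes": [{"value": {"messages": [
        {"from": "15551234567", "text": {"body": "What are your pricing plans?"}}
    ]}}]}]
}
```

A webhook handler in `app.py` would call this, route the text through the existing agent pipeline keyed by the phone number, and POST the reply to the Send Message API.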


🚀 Future Improvements

  • Vector database (Pinecone / ChromaDB) for scalable RAG beyond JSON
  • Streaming responses for real-time token-by-token output
  • User authentication and persistent session storage (PostgreSQL / Redis)
  • WhatsApp & Telegram deployment via webhook integrations
  • Analytics dashboard — track intent distribution and lead conversion rate
  • Multi-language support for global content creator audiences
  • Human handoff — escalate to a live agent when confidence is low

👤 Author

Built for ServiceHive — Inflx Platform Machine Learning Intern Assignment — Social-to-Lead Agentic Workflow

| | |
| --- | --- |
| Project | AutoStream AI Agent |
| Stack | Python · FastAPI · LangChain · Gemini · HTML/JS |
| Assignment | ServiceHive / Inflx — ML Intern Take-Home |

AutoStream AI Agent — Built with precision. Designed for conversion.