# Tech Stack Advisor - Multi-Agent RAG System
> **Production-Ready AI System**: LangGraph orchestration with 5 specialized agents, FastAPI backend, Modern Web UI with Google OAuth, Railway deployment, and intelligent memory management
A production-grade multi-agent system that provides intelligent technology stack recommendations using retrieval-augmented generation (RAG) and specialized AI agents orchestrated with LangGraph.
## Live Demo
**Deployed on Railway**: https://ranjana-tech-stack-advisor-production.up.railway.app
**Local Testing**: Clone and run the backend with `python -m backend.src.api.main`
## What This Project Offers
### **5 Specialized AI Agents**
Each agent provides expert recommendations in its domain:
1. **Conversation Manager Agent**
- Intelligent multi-turn dialogues to gather requirements
- Structured follow-up questions with multiple-choice options
- Context extraction and completion tracking
- Determines when enough information is collected
2. **Database Agent**
- Database selection (SQL, NoSQL, Graph, Time-series)
- Scaling strategies (horizontal, vertical, sharding)
- Caching recommendations (Redis, Memcached)
- Connection pool sizing
3. **Infrastructure Agent**
- Cloud provider recommendations (AWS, GCP, Azure)
- Architecture patterns (monolith, microservices, serverless)
- Deployment strategies (containers, VMs, K8s)
- Load balancing and CDN
4. **Cost Agent**
- Multi-cloud cost comparison
- Monthly/annual budget estimates
- Compute, storage, database, bandwidth breakdown
- Cost optimization recommendations
5. **Security Agent**
- Threat modeling based on data sensitivity
- Compliance requirements (GDPR, HIPAA, SOC2, PCI-DSS)
- Risk assessment and prioritization
- Security architecture recommendations
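The Conversation Manager's completion tracking can be pictured as a simple slot-filling check. The sketch below is only an illustration under stated assumptions: the field names in `REQUIRED_FIELDS` and the `RequirementsTracker` class are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass, field

# Hypothetical requirement slots; the project's real schema may differ.
REQUIRED_FIELDS = ("daily_active_users", "data_sensitivity", "budget", "workload_type")


@dataclass
class RequirementsTracker:
    """Accumulates answers across turns and reports what is still missing."""

    answers: dict[str, str] = field(default_factory=dict)

    def record(self, name: str, value: str) -> None:
        # Blank answers are ignored so they do not count as collected.
        if value.strip():
            self.answers[name] = value.strip()

    def missing(self) -> list[str]:
        # Preserve the declared order so follow-up questions are predictable.
        return [name for name in REQUIRED_FIELDS if name not in self.answers]

    def is_complete(self) -> bool:
        return not self.missing()
```

A dialogue loop could ask about the first entry of `missing()` each turn and hand off to the other agents once `is_complete()` returns `True`.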
### **Technologies & Patterns**
- ✅ **LangGraph**: Multi-agent orchestration with state management
- ✅ **RAG System**: Qdrant vector database with semantic search
- ✅ **FastAPI**: Production API with rate limiting and monitoring
- ✅ **Anthropic Claude**: Claude 3 Haiku for fast, cost-effective inference
- ✅ **Modern Web UI**: HTML/CSS/JavaScript frontend with responsive design
- ✅ **Google OAuth 2.0**: Secure authentication without password storage
- ✅ **User Management**: Registration, login, admin dashboard, and feedback system
- ✅ **Multi-Turn Conversations**: Intelligent dialogues with context accumulation
- ✅ **Long-Term Memory**: Qdrant-based semantic search for query history (384-dim vectors)
- ✅ **Short-Term Memory**: Session management with 30-minute timeout
- ✅ **Railway**: One-click deployment with GitHub integration
- ✅ **Structured Logging**: JSON logs with correlation IDs
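As an illustration of the short-term memory item, here is a minimal in-memory session store with a 30-minute idle timeout. It is a sketch, not the project's implementation: the `SessionStore` class, its method names, and the injectable `clock` parameter are assumptions made for the example.

```python
import time
from typing import Callable

SESSION_TIMEOUT_SECONDS = 30 * 60  # 30-minute timeout from the feature list


class SessionStore:
    """Keeps per-session message history and drops sessions that sit idle too long."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so the timeout can be tested without sleeping
        self._sessions: dict[str, tuple[float, list[str]]] = {}

    def get(self, session_id: str) -> list[str]:
        entry = self._sessions.get(session_id)
        if entry is None:
            return []
        last_write, history = entry
        if self._clock() - last_write > SESSION_TIMEOUT_SECONDS:
            # Expired: forget the session entirely.
            del self._sessions[session_id]
            return []
        return list(history)

    def append(self, session_id: str, message: str) -> None:
        history = self.get(session_id)
        history.append(message)
        # Writing refreshes the idle timer; reads do not.
        self._sessions[session_id] = (self._clock(), history)
```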
### **Architecture**
```
┌─────────────────────────────────────────┐
│  Frontend: Modern Web UI (HTML/JS)      │
│  - User authentication (OAuth/Local)    │
│  - Google OAuth 2.0 integration         │
│  - Interactive recommendation interface │
│  - Admin dashboard                      │
│  - User feedback system                 │
│  └── Served as static files from API    │
└────────────────┬────────────────────────┘
                 │ HTTP/REST + JWT Auth
                 ▼
┌─────────────────────────────────────────┐
│  FastAPI Backend (Port 8000)            │
│  ├── /auth/*     - Auth endpoints       │
│  ├── /recommend  - Main endpoint        │
│  ├── /health     - Health check         │
│  ├── /admin/*    - Admin dashboard      │
│  ├── /feedback   - User feedback        │
│  └── JWT & Google OAuth integration     │
└────────────────┬────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────┐
│  LangGraph Workflow Orchestrator        │
│  ├── Context Parser (NLP extraction)    │
│  ├── Parallel Agent Execution           │
│  └── Result Aggregation                 │
└─────┬────────┬────────┬─────────┬───────┘
      │        │        │         │
      ▼        ▼        ▼         ▼
   ┌─────┐  ┌─────┐  ┌────┐    ┌─────┐
   │ DB  │  │Infra│  │Cost│    │ Sec │
   │Agent│  │Agent│  │Agnt│    │Agent│
   └──┬──┘  └──┬──┘  └──┬─┘    └──┬──┘
      │        │        │         │
      └────────┴────────┴─────────┘
                    │
        ┌───────────┼─────────────┐
        ▼           ▼             ▼
  ┌──────────┐ ┌──────────┐ ┌───────────┐
  │  Qdrant  │ │Anthropic │ │  Context  │
  │ Vector DB│ │  Claude  │ │   State   │
  │(RAG docs)│ │  Haiku   │ │(LangGraph)│
  └──────────┘ └──────────┘ └───────────┘
```
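The "Parallel Agent Execution" and "Result Aggregation" steps in the diagram can be sketched with plain `asyncio`. This is not the project's LangGraph workflow: the two agent functions, the `context` keys, and the result fields below are hypothetical stand-ins that only show the fan-out and fan-in shape.

```python
import asyncio
from typing import Awaitable, Callable

Agent = Callable[[dict], Awaitable[dict]]


async def database_agent(context: dict) -> dict:
    # Placeholder logic; a real agent would call the LLM and the vector store.
    return {"primary_db": "PostgreSQL" if context.get("payments") else "unspecified"}


async def security_agent(context: dict) -> dict:
    # Placeholder logic standing in for a real risk assessment.
    return {"risk": "HIGH" if context.get("payments") else "MEDIUM"}


async def run_agents(context: dict, agents: dict[str, Agent]) -> dict[str, dict]:
    """Run every agent concurrently and aggregate the results by agent name."""
    names = list(agents)
    results = await asyncio.gather(*(agents[name](context) for name in names))
    return dict(zip(names, results))


if __name__ == "__main__":
    combined = asyncio.run(
        run_agents({"payments": True}, {"database": database_agent, "security": security_agent})
    )
    print(combined)
```

Because the agents only read the shared context and return independent results, they can run concurrently and be merged afterwards without locking.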
## Project Structure
```
tech-stack-advisor/
├── backend/
│   ├── src/
│   │   ├── agents/                # Specialized agents
│   │   │   ├── database.py        # Database recommendations
│   │   │   ├── infrastructure.py  # Cloud architecture
│   │   │   ├── cost.py            # Multi-cloud pricing
│   │   │   └── security.py        # Threat assessment
│   │   ├── orchestration/         # LangGraph workflows
│   │   │   └── workflow.py        # Multi-agent orchestration
│   │   ├── api/                   # FastAPI endpoints
│   │   │   ├── main.py            # API server with auth & metrics
│   │   │   └── models.py          # Pydantic models
│   │   ├── rag/                   # Vector DB integration
│   │   │   └── vectorstore.py     # Qdrant client
│   │   └── core/                  # Configuration
│   │       ├── config.py          # Settings (Pydantic)
│   │       ├── auth.py            # JWT auth & user management
│   │       ├── google_oauth.py    # Google OAuth 2.0 helper
│   │       └── logging.py         # Structured logging & metrics
│   ├── static/                    # Static web files
│   │   ├── index.html             # Main UI
│   │   ├── login.html             # Login page
│   │   ├── register.html          # Registration page
│   │   ├── admin.html             # Admin dashboard
│   │   └── auth.js                # Auth helper
│   └── GOOGLE_OAUTH_SETUP.md      # OAuth setup guide
├── knowledge_base/                # RAG documents
│   └── tech_docs/                 # Technology documentation
├── Dockerfile                     # Docker container definition (Python 3.11)
├── .dockerignore                  # Docker build exclusions
├── railway.toml                   # Railway deployment config
├── RAILWAY_DEPLOYMENT.md          # Deployment guide
├── GRAFANA_CLOUD_SETUP.md         # Grafana monitoring guide
├── GOOGLE_OAUTH_SETUP.md          # Google OAuth guide
└── .env                           # Environment variables
```
## Quick Start
### **1. Prerequisites**
- Python 3.11+
- Anthropic API key (get one from https://console.anthropic.com)
- Qdrant (run locally via Docker, or use Qdrant Cloud)
### **2. Setup**
```bash
# Clone repository
git clone https://github.com/ranjanarajendran/tech-stack-advisor.git
cd tech-stack-advisor
# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -e ".[dev]"
# Configure environment variables
cp .env.example .env
# Edit .env and add your ANTHROPIC_API_KEY
```
### **3. Run the Application**
**Option A: Quick Start (Recommended)**
```bash
# One command to start everything
./run_app.sh
```
**Option B: Manual Start**
```bash
# Start the backend server (serves both API and UI)
python -m backend.src.api.main
```
Access the app at `http://localhost:8000`
The backend automatically serves:
- Frontend UI at `/` (index.html)
- API endpoints at `/recommend`, `/health`, etc.
- Admin dashboard at `/admin.html`
- API documentation at `/docs`
### **4. Optional: Google OAuth Setup**
To enable Google login:
1. Follow the [Google OAuth Setup Guide](./backend/GOOGLE_OAUTH_SETUP.md)
2. Add credentials to `.env`:
```bash
GOOGLE_CLIENT_ID=your_client_id.apps.googleusercontent.com
GOOGLE_CLIENT_SECRET=your_client_secret
GOOGLE_REDIRECT_URI=http://localhost:8000/auth/google/callback
```
3. Restart the server to apply changes
## Example Usage
### **Query:**
```
I'm building a real-time chat application expecting 100K daily active users
with payment processing and GDPR compliance requirements. What's the best
tech stack?
```
### **Agent Responses:**
**Database Agent**
- **Primary DB**: PostgreSQL (ACID compliance for payments)
- **Cache**: Redis (real-time message delivery)
- **Scale Tier**: Medium (100K DAU)
- **Estimated Connections**: 2,000-5,000
- **Recommendation**: Use connection pooling (PgBouncer), implement read replicas
**Infrastructure Agent**
- **Architecture**: Microservices with API Gateway
- **Provider**: AWS (mature payment processing tools)
- **Load Balancer**: ALB with auto-scaling
- **CDN**: CloudFront for static assets
- **Deployment**: ECS Fargate (managed containers)
**Cost Agent**
- **AWS**: $847/month
- Compute: $412 (ECS Fargate)
- Database: $285 (RDS PostgreSQL + ElastiCache)
- Storage: $85 (S3 + CloudFront)
- Bandwidth: $65
- **GCP**: $923/month
- **Azure**: $891/month
- **Recommendation**: AWS (lowest cost, best payment integrations)
**Security Agent**
- **Risk Priority**: HIGH (payment data + PII)
- **Compliance**: GDPR, PCI-DSS Level 1
- **Threats Identified**:
- Critical: SQL injection, payment data exposure
- High: CSRF, XSS, session hijacking
- Medium: DDoS, rate limit bypass
- **Recommendations**:
- Use Stripe/Braintree (PCI-compliant payment processor)
- Implement WAF (AWS Shield + WAF)
- Data encryption at rest (KMS) and in transit (TLS 1.3)
- GDPR: Data residency (EU region), right to deletion
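The Cost Agent figures in this example can be checked with a few lines of arithmetic. The snippet below only reuses the numbers shown above; the variable names are illustrative and the dictionary layout is an assumption, not the agent's actual output format.

```python
# AWS line items from the example response, in USD per month.
aws_breakdown = {"compute": 412, "database": 285, "storage": 85, "bandwidth": 65}

# Monthly totals per provider; the AWS total is derived from its line items.
monthly_totals = {"AWS": sum(aws_breakdown.values()), "GCP": 923, "Azure": 891}

# Pick the provider with the lowest monthly total.
cheapest = min(monthly_totals, key=monthly_totals.get)

# Annual estimate is simply twelve times the monthly figure.
annual_totals = {provider: cost * 12 for provider, cost in monthly_totals.items()}

print(cheapest, monthly_totals[cheapest], annual_totals[cheapest])
```

The four AWS line items sum to the quoted $847/month, which is the lowest of the three totals.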
## Configuration
### **Environment Variables**
Required variables in `.env`:
```bash
# Anthropic API (required)
ANTHROPIC_API_KEY=sk-ant-api03-xxxxx
# Qdrant Vector DB (optional - uses localhost:6333 by default)
QDRANT_URL=http://localhost:6333
QDRANT_API_KEY=your-key-here # Only if using Qdrant Cloud
# API Configuration
ENVIRONMENT=development
LOG_LEVEL=INFO
# Google OAuth (optional - for Google login)
GOOGLE_CLIENT_ID=your_client_id.apps.googleusercontent.com
GOOGLE_CLIENT_SECRET=your_client_secret
GOOGLE_REDIRECT_URI=http://localhost:8000/auth/google/callback
```
### **Cost Monitoring**
Built-in usage tracking:
```python
# Daily limits (configurable)
DAILY_BUDGET_USD = 2.00  # Alert if exceeded
DAILY_QUERY_CAP = 100    # Max queries per day

# Tracked automatically:
# - Token usage per request
# - Cost per agent
# - Daily/monthly totals
```
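One way such tracking could work is sketched below. The `DailyUsage` class and its methods are hypothetical, and the per-million-token rates are passed in as parameters rather than hard-coded, since actual pricing should be taken from Anthropic's current price list.

```python
from dataclasses import dataclass, field

DAILY_BUDGET_USD = 2.00
DAILY_QUERY_CAP = 100


@dataclass
class DailyUsage:
    """Tracks query count and per-agent cost for a single day."""

    queries: int = 0
    cost_by_agent: dict[str, float] = field(default_factory=dict)

    @property
    def total_cost_usd(self) -> float:
        return sum(self.cost_by_agent.values())

    def can_accept_query(self) -> bool:
        # Refuse new work once either the query cap or the budget is reached.
        return self.queries < DAILY_QUERY_CAP and self.total_cost_usd < DAILY_BUDGET_USD

    def start_query(self) -> None:
        self.queries += 1

    def record_call(
        self,
        agent: str,
        input_tokens: int,
        output_tokens: int,
        usd_per_mtok_in: float,
        usd_per_mtok_out: float,
    ) -> float:
        # Cost = tokens * (USD per million tokens) / 1,000,000.
        cost = (input_tokens * usd_per_mtok_in + output_tokens * usd_per_mtok_out) / 1_000_000
        self.cost_by_agent[agent] = self.cost_by_agent.get(agent, 0.0) + cost
        return cost
```

A request handler could call `can_accept_query()` before dispatching to the agents and `record_call()` after each LLM response.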
### **Rate Limiting**
Currently disabled for development. Enable in `backend/src/api/main.py`:
```python
# Demo mode (no auth required): 10 requests per hour per IP
RATE_LIMIT_DEMO = "10/hour"
# Authenticated mode: 100 requests per hour per API key
RATE_LIMIT_AUTHENTICATED = "100/hour"
```
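For readers unfamiliar with per-key limits, the following is a minimal sliding-window limiter. It is a generic sketch of the technique, not the mechanism used in `main.py`; the `SlidingWindowLimiter` class and the injectable `clock` are assumptions made for the example.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """Allows at most `limit` calls per key within any `window_seconds` span."""

    def __init__(
        self,
        limit: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self._clock()
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._limit:
            return False
        hits.append(now)
        return True


# Demo-mode limit from the configuration above: 10 requests per hour per IP.
demo_limiter = SlidingWindowLimiter(limit=10, window_seconds=3600)
```

The key would be the client IP in demo mode and the API key in authenticated mode.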
## Testing
```bash
# Run all tests
pytest backend/tests/
# Type checking
mypy backend/src/
# Linting
ruff check backend/src/
# Format code
ruff format backend/src/
```
## Deployment
### **Railway (Recommended - $5-10/month)**
**Full deployment guide**: See [RAILWAY_DEPLOYMENT.md](./RAILWAY_DEPLOYMENT.md)
**Docker-Based Deployment**:
This project uses Docker for deployment to ensure consistent Python environments between local development and production. The Dockerfile specifies Python 3.11, which:
- Prevents version mismatches (local may use 3.11.6 while Railway may try 3.13+)
- Ensures reproducible builds across different environments
- Includes health checks for container orchestration
- Optimizes layer caching for faster rebuilds
**Quick steps**:
1. Push to GitHub (already done ✅)
2. Sign up at https://railway.app
3. Deploy backend:
- New Project → Deploy from GitHub repo
- Select `ranjanarajendran/tech-stack-advisor`
- Railway will automatically detect Dockerfile
- Add environment variables (ANTHROPIC_API_KEY, etc.)
- Copy backend URL
4. Deploy frontend:
- Same project → Add service from GitHub
- Set `API_BASE_URL` to backend URL
- Deploy
**Features**:
- Auto-deploy on git push (Docker-based)
- Zero-downtime deployments
- Built-in metrics & logs
- Custom domains with SSL
- $5 free trial credit
### **Alternative Deployment Options**
**AWS Free Tier** (tight on resources):
- t2.micro (1GB RAM) - challenging for both services
- Lambda - 15-min timeout may be tight for LLM calls
**Vercel** (not compatible):
- Doesn't support Streamlit (needs persistent WebSocket server)
- Would require rebuilding UI in Next.js/React
**Docker Compose** (local/self-hosted):
```bash
docker-compose up --build
```
## Monitoring & Observability
### **API Endpoints**
**Public:**
- `GET /` - Main web UI (index.html)
- `GET /health` - Health check (agents loaded, uptime)
- `GET /docs` - Interactive API documentation (FastAPI Swagger)
**Authentication:**
- `POST /auth/register` - User registration
- `POST /auth/login` - User login (returns JWT)
- `POST /auth/logout` - User logout
- `GET /auth/google/login` - Initiate Google OAuth flow
- `GET /auth/google/callback` - Google OAuth callback
**Authenticated:**
- `POST /recommend` - Main recommendation endpoint (requires auth)
- `POST /feedback` - Submit user feedback
- `GET /metrics` - Usage metrics in JSON format (requests, cost, budget)
- `GET /metrics/prometheus` - Prometheus-format metrics for monitoring systems
**Admin Only:**
- `GET /admin.html` - Admin dashboard
- `GET /admin/users` - List all users
- `GET /admin/feedback` - View all feedback
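A call to the authenticated `/recommend` endpoint might be built as follows. Treat this as a sketch under assumptions: the `"query"` body field and the `Bearer` authorization scheme are guesses for illustration, so check `/docs` for the actual request model before relying on them.

```python
import json
import urllib.request


def build_recommend_request(base_url: str, token: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a POST /recommend request carrying a JWT."""
    # "query" is an assumed field name; the real Pydantic model may differ.
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/recommend",
        data=body,
        method="POST",
        headers={
            # Bearer scheme is an assumption about how the JWT is passed.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_recommend_request(
    "http://localhost:8000", "example-jwt", "Chat app with 100K DAU and GDPR requirements"
)
```

Passing the request to `urllib.request.urlopen(request)` would send it to a running server.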
### **Logging**
Structured JSON logs with correlation IDs:
```json
{
"timestamp": "2025-11-20T22:49:56.470Z",
"level": "info",
"event": "recommendation_complete",
"correlation_id": "aa447f46-f441-4d05-9650-a737475dbcd1",
"status": "success",
"dau": 100000,
"agents_called": 4,
"duration_ms": 12847
}
```
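Logs of this shape can be produced with the standard library alone. The formatter below is a sketch, not the project's `logging.py`: the `JsonFormatter` class and the convention of passing extra fields through a `fields` key are assumptions. Note that it emits timestamps with a `+00:00` offset rather than the `Z` suffix shown above.

```python
import json
import logging
import uuid
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Renders each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(
                timespec="milliseconds"
            ),
            "level": record.levelname.lower(),
            "event": record.getMessage(),
            # Set via `extra=`; None when the caller did not supply one.
            "correlation_id": getattr(record, "correlation_id", None),
        }
        # Merge any structured fields supplied through `extra={"fields": {...}}`.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("advisor_example")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info(
    "recommendation_complete",
    extra={
        "correlation_id": str(uuid.uuid4()),
        "fields": {"status": "success", "agents_called": 4},
    },
)
```

Generating the correlation ID once per request and passing it to every log call lets all lines for that request be grouped later.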
### **Prometheus Metrics**
The `/metrics/prometheus` endpoint exposes metrics in Prometheus format for integration with monitoring systems like Grafana Cloud:
**HTTP Metrics:**
- `http_requests_total{method, endpoint, status_code}` - Total HTTP requests with labels
- `http_request_duration_seconds{method, endpoint}` - Request duration histogram
**LLM Usage Metrics:**
- `llm_tokens_total{agent, token_type}` - Token usage by agent (input/output tokens)
- `llm_cost_usd_total{agent}` - Cumulative cost by agent
- `llm_requests_total{agent, status}` - LLM request count by agent and status
- `llm_daily_tokens` - Daily token usage gauge
- `llm_daily_cost_usd` - Daily cost in USD gauge
- `llm_daily_queries` - Daily query count gauge
**Application Metrics:**
- `active_conversation_sessions` - Active conversation sessions count
- `user_registrations_total{oauth_provider}` - User registrations by provider
- `user_logins_total{oauth_provider}` - User logins by provider
- `recommendations_total{status, authenticated}` - Recommendations generated
**Example Setup:**
See [GRAFANA_CLOUD_SETUP.md](./GRAFANA_CLOUD_SETUP.md) for complete guide on setting up free monitoring with Grafana Cloud.
### **Streamlit Dashboard**
Real-time metrics in UI sidebar:
- API health status
- Agents loaded count
- Total requests
- Daily queries
- Daily cost ($USD)
- Budget remaining
## Known Issues & Fixes
### **Issue 1: "Failed to get recommendations" (FIXED)**
**Problem**: Streamlit UI showed error even when API returned success.
**Root Cause**: Bug in error checking logic - checked for key presence instead of value truthiness.
**Fix**: Updated `frontend/streamlit_app.py:369`:
```python
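# Before (incorrect): true whenever the key exists, even if its value is None
if "error" in result:
    ...
# After (correct): true only when the error value is truthy
if result.get("error"):
    ...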
```
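The difference between the two checks is easy to reproduce. The response shape below is an assumption for illustration: it supposes that a successful response still carries an `"error"` key whose value is `None`.

```python
# Assumed shape of a successful response: the key is present but holds None.
result = {"recommendations": ["PostgreSQL"], "error": None}

key_present = "error" in result          # membership test looks only at keys
has_error = bool(result.get("error"))    # truthiness test looks at the value

print(key_present, has_error)
```

With this input the membership test reports an error that does not exist, while the truthiness test does not.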
### **Issue 2: 401 Authentication Errors (RESOLVED)**
**Problem**: API key invalid errors after updating `.env`.
**Root Cause**: Long-running API process cached old environment variables.
**Solution**: Restart API server after changing `.env`:
```bash
# Kill old processes
pkill -f "python.*backend.src.api.main"
# Restart
./run_app.sh
```
**Note**: The server reads `.env` once at startup, so changes take effect only after the process is restarted.
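The snapshot behavior can be demonstrated in isolation. The `Settings` class here is a hypothetical stand-in for the project's configuration object; it only shows that a value copied out of the environment at construction time does not follow later changes.

```python
import os


class Settings:
    """Copies configuration out of the environment once, at construction time."""

    def __init__(self) -> None:
        self.anthropic_api_key = os.environ.get("ANTHROPIC_API_KEY", "")


os.environ["ANTHROPIC_API_KEY"] = "old-key"
settings = Settings()  # snapshot taken here, as at server startup

# Changing the environment afterwards does not touch the existing object.
os.environ["ANTHROPIC_API_KEY"] = "new-key"
stale_value = settings.anthropic_api_key

# Only a fresh object (the equivalent of a restart) sees the new value.
fresh_value = Settings().anthropic_api_key
```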
## Roadmap
- [x] **Phase 1**: Multi-agent RAG system with LangGraph
- [x] **Phase 1.5**: Railway deployment configuration
- [x] **Phase 1.75**: Modern web UI with user authentication
- [x] **Phase 1.85**: Google OAuth 2.0 integration
- [x] **Phase 1.9**: User feedback system
- [ ] **Phase 2**: Kubernetes deployment (k3s/EKS)
- [ ] **Phase 3**: MLflow experiment tracking
- [ ] **Phase 4**: Fine-tuning on custom dataset
- [ ] **Phase 5**: Mobile app (React Native)
## Contributing
Contributions welcome! Please:
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Run tests (`pytest`)
5. Submit a pull request
## License
MIT License - see [LICENSE](./LICENSE) file for details
## Author
**Ranjana Rajendran**
- GitHub: [@ranjanarajendran](https://github.com/ranjanarajendran)
- LinkedIn: [ranjana-rajendran](https://www.linkedin.com/in/ranjana-rajendran-9b3bb73)
- Portfolio: [ranjanarajendran.github.io](https://ranjanarajendran.github.io)
- Email: [ranjana.rajendran@gmail.com](mailto:ranjana.rajendran@gmail.com)
## Feedback & Support
Have feedback or encountering issues? We'd love to hear from you!
- **Email**: [ranjana.rajendran@gmail.com](mailto:ranjana.rajendran@gmail.com)
- **GitHub Issues**: [Report a bug](https://github.com/ranjanarajendran/tech-stack-advisor/issues)
- **In-app Feedback**: Use the "Need Help or Have Feedback?" section at the bottom of the web UI
## Acknowledgments
- **Anthropic** - Claude AI API
- **LangChain/LangGraph** - Agent orchestration framework
- **Qdrant** - Vector database for RAG
- **Railway** - Easy deployment platform
- **Google** - OAuth 2.0 authentication
---
**Status**: ✅ Production-ready, actively maintained
**Repository**: https://github.com/ranjanarajendran/tech-stack-advisor
**Deployment**: Railway (backend + frontend)
**Estimated Monthly Cost**: $5-10 (Railway) + $0.50-2 (Anthropic API)
---
*Built to demonstrate production ML engineering: prototype → deployment → monitoring*