# Tech Stack Advisor - Multi-Agent RAG System

> **Production-Ready AI System**: LangGraph orchestration with 5 specialized agents, FastAPI backend, Modern Web UI with Google OAuth, Railway deployment, and intelligent memory management

A production-grade multi-agent system that provides intelligent technology stack recommendations using retrieval-augmented generation (RAG) and specialized AI agents orchestrated with LangGraph.

## 🚀 Live Demo

**Deployed on Railway**: https://ranjana-tech-stack-advisor-production.up.railway.app

**Local Testing**: Clone and run the backend with `python -m backend.src.api.main`

## 🎯 What This Project Offers

### **5 Specialized AI Agents**

Each agent provides expert recommendations in its domain:

1. **💬 Conversation Manager Agent**
   - Intelligent multi-turn dialogues to gather requirements
   - Structured follow-up questions with multiple-choice options
   - Context extraction and completion tracking
   - Determines when enough information is collected

2. **🗄️ Database Agent**
   - Database selection (SQL, NoSQL, Graph, Time-series)
   - Scaling strategies (horizontal, vertical, sharding)
   - Caching recommendations (Redis, Memcached)
   - Connection pool sizing

3. **☁️ Infrastructure Agent**
   - Cloud provider recommendations (AWS, GCP, Azure)
   - Architecture patterns (monolith, microservices, serverless)
   - Deployment strategies (containers, VMs, K8s)
   - Load balancing and CDN

4. **💰 Cost Agent**
   - Multi-cloud cost comparison
   - Monthly/annual budget estimates
   - Compute, storage, database, bandwidth breakdown
   - Cost optimization recommendations

5. **🔒 Security Agent**
   - Threat modeling based on data sensitivity
   - Compliance requirements (GDPR, HIPAA, SOC2, PCI-DSS)
   - Risk assessment and prioritization
   - Security architecture recommendations

### **Technologies & Patterns**

- ✅ **LangGraph**: Multi-agent orchestration with state management
- ✅ **RAG System**: Qdrant vector database with semantic search
- ✅ **FastAPI**: Production API with rate limiting and monitoring
- ✅ **Anthropic Claude**: Claude 3 Haiku for fast, cost-effective inference
- ✅ **Modern Web UI**: HTML/CSS/JavaScript frontend with responsive design
- ✅ **Google OAuth 2.0**: Secure authentication without password storage
- ✅ **User Management**: Registration, login, admin dashboard, and feedback system
- ✅ **Multi-Turn Conversations**: Intelligent dialogues with context accumulation
- ✅ **Long-Term Memory**: Qdrant-based semantic search for query history (384-dim vectors)
- ✅ **Short-Term Memory**: Session management with 30-minute timeout
- ✅ **Railway**: One-click deployment with GitHub integration
- ✅ **Structured Logging**: JSON logs with correlation IDs
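
The short-term memory entry above describes sessions that expire after 30 minutes. A minimal sketch of that idea, assuming an in-memory store and an inactivity-based timeout; `SessionStore`, `Session`, and the injectable `clock` are hypothetical names for illustration, not the project's actual API:

```python
import time
from dataclasses import dataclass, field

SESSION_TIMEOUT_SECONDS = 30 * 60  # 30-minute timeout, as listed above


@dataclass
class Session:
    context: dict = field(default_factory=dict)  # requirements gathered so far
    last_seen: float = 0.0


class SessionStore:
    """Hypothetical in-memory store; the real project may keep sessions elsewhere."""

    def __init__(self, timeout: float = SESSION_TIMEOUT_SECONDS, clock=time.monotonic):
        self._timeout = timeout
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._sessions: dict[str, Session] = {}

    def get(self, session_id: str) -> Session:
        """Return the live session, or a fresh one if it is missing or expired."""
        now = self._clock()
        session = self._sessions.get(session_id)
        if session is None or now - session.last_seen > self._timeout:
            session = Session()
            self._sessions[session_id] = session
        session.last_seen = now  # every access extends the session
        return session
```

Passing a fake `clock` makes the expiry behavior easy to exercise in unit tests.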

### **Architecture**

```
┌─────────────────────────────────────────┐
│ Frontend: Modern Web UI (HTML/JS)      │
│  - User authentication (OAuth/Local)    │
│  - Google OAuth 2.0 integration         │
│  - Interactive recommendation interface │
│  - Admin dashboard                      │
│  - User feedback system                 │
│  └─── Served as static files from API   │
└────────────────┬────────────────────────┘
                 │ HTTP/REST + JWT Auth
                 ▼
┌─────────────────────────────────────────┐
│  FastAPI Backend (Port 8000)            │
│  ├── /auth/*       - Auth endpoints     │
│  ├── /recommend    - Main endpoint      │
│  ├── /health       - Health check       │
│  ├── /admin/*      - Admin dashboard    │
│  ├── /feedback     - User feedback      │
│  └── JWT & Google OAuth integration     │
└────────────────┬────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────┐
│  LangGraph Workflow Orchestrator        │
│  ├── Context Parser (NLP extraction)    │
│  ├── Parallel Agent Execution           │
│  └── Result Aggregation                 │
└─────┬────────┬────────┬─────────┬───────┘
      │        │        │         │
      ▼        ▼        ▼         ▼
  ┌─────┐  ┌─────┐  ┌────┐  ┌─────┐
  │ DB  │  │Infra│  │Cost│  │ Sec │
  │Agent│  │Agent│  │Agnt│  │Agent│
  └──┬──┘  └──┬──┘  └──┬─┘  └──┬──┘
     │        │        │        │
     └────────┴────────┴────────┘
                 │
      ┌──────────┼──────────┐
      ▼          ▼          ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Qdrant   │ │Anthropic │ │ Context  │
│ Vector DB│ │ Claude   │ │  State   │
│(RAG docs)│ │  Haiku   │ │(LangGraph)│
└──────────┘ └──────────┘ └──────────┘
```
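
The fan-out and aggregation step in the diagram can be illustrated without LangGraph. The sketch below uses plain `asyncio.gather` as a stand-in for the orchestrator; `run_agent`, `recommend`, and the shape of the results are hypothetical placeholders, not the project's workflow code:

```python
import asyncio


async def run_agent(name: str, context: dict) -> dict:
    """Placeholder agent: a real agent would query the vector store and the LLM."""
    await asyncio.sleep(0)  # yields control, standing in for network I/O
    return {"agent": name, "dau": context.get("dau")}


async def recommend(context: dict) -> dict:
    """Run the four domain agents concurrently and merge their results by name."""
    names = ["database", "infrastructure", "cost", "security"]
    results = await asyncio.gather(*(run_agent(n, context) for n in names))
    return {r["agent"]: r for r in results}
```

Calling `asyncio.run(recommend({"dau": 100_000}))` returns one entry per agent. With concurrent execution, total latency is driven by the slowest agent rather than the sum of all four.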

## 🏗️ Project Structure

```
tech-stack-advisor/
├── backend/
│   ├── src/
│   │   ├── agents/              # Specialized agents
│   │   │   ├── database.py     # Database recommendations
│   │   │   ├── infrastructure.py # Cloud architecture
│   │   │   ├── cost.py         # Multi-cloud pricing
│   │   │   └── security.py     # Threat assessment
│   │   ├── orchestration/       # LangGraph workflows
│   │   │   └── workflow.py     # Multi-agent orchestration
│   │   ├── api/                 # FastAPI endpoints
│   │   │   ├── main.py         # API server with auth & metrics
│   │   │   └── models.py       # Pydantic models
│   │   ├── rag/                 # Vector DB integration
│   │   │   └── vectorstore.py  # Qdrant client
│   │   └── core/                # Configuration
│   │       ├── config.py       # Settings (Pydantic)
│   │       ├── auth.py         # JWT auth & user management
│   │       ├── google_oauth.py # Google OAuth 2.0 helper
│   │       └── logging.py      # Structured logging & metrics
│   └── static/                  # Static web files
│       ├── index.html          # Main UI
│       ├── login.html          # Login page
│       ├── register.html       # Registration page
│       ├── admin.html          # Admin dashboard
│       ├── auth.js             # Auth helper
│       └── GOOGLE_OAUTH_SETUP.md # OAuth setup guide
├── knowledge_base/              # RAG documents
│   └── tech_docs/              # Technology documentation
├── Dockerfile                   # Docker container definition (Python 3.11)
├── .dockerignore               # Docker build exclusions
├── railway.toml                 # Railway deployment config
├── RAILWAY_DEPLOYMENT.md       # Deployment guide
├── GRAFANA_CLOUD_SETUP.md      # Grafana monitoring guide
├── GOOGLE_OAUTH_SETUP.md       # Google OAuth guide
└── .env                        # Environment variables
```

## 🚀 Quick Start

### **1. Prerequisites**

- Python 3.11+
- Anthropic API key (get from https://console.anthropic.com)
- Qdrant (runs locally via Docker or use Qdrant Cloud)

### **2. Setup**

```bash
# Clone repository
git clone https://github.com/ranjanarajendran/tech-stack-advisor.git
cd tech-stack-advisor

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -e ".[dev]"

# Configure environment variables
cp .env.example .env
# Edit .env and add your ANTHROPIC_API_KEY
```

### **3. Run the Application**

**Option A: Quick Start (Recommended)**
```bash
# One command to start everything
./run_app.sh
```

**Option B: Manual Start**
```bash
# Start the backend server (serves both API and UI)
python -m backend.src.api.main
```

Access the app at `http://localhost:8000`

The backend automatically serves:
- Frontend UI at `/` (index.html)
- API endpoints at `/recommend`, `/health`, etc.
- Admin dashboard at `/admin.html`
- API documentation at `/docs`

### **4. Optional: Google OAuth Setup**

To enable Google login:

1. Follow the [Google OAuth Setup Guide](./GOOGLE_OAUTH_SETUP.md)
2. Add credentials to `.env`:
   ```bash
   GOOGLE_CLIENT_ID=your_client_id.apps.googleusercontent.com
   GOOGLE_CLIENT_SECRET=your_client_secret
   GOOGLE_REDIRECT_URI=http://localhost:8000/auth/google/callback
   ```
3. Restart the server to apply changes

## 📋 Example Usage

### **Query:**
```
I'm building a real-time chat application expecting 100K daily active users
with payment processing and GDPR compliance requirements. What's the best
tech stack?
```

### **Agent Responses:**

**🗄️ Database Agent**
- **Primary DB**: PostgreSQL (ACID compliance for payments)
- **Cache**: Redis (real-time message delivery)
- **Scale Tier**: Medium (100K DAU)
- **Estimated Connections**: 2,000-5,000
- **Recommendation**: Use connection pooling (PgBouncer), implement read replicas

**☁️ Infrastructure Agent**
- **Architecture**: Microservices with API Gateway
- **Provider**: AWS (mature payment processing tools)
- **Load Balancer**: ALB with auto-scaling
- **CDN**: CloudFront for static assets
- **Deployment**: ECS Fargate (managed containers)

**💰 Cost Agent**
- **AWS**: $847/month
  - Compute: $412 (ECS Fargate)
  - Database: $285 (RDS PostgreSQL + ElastiCache)
  - Storage: $85 (S3 + CloudFront)
  - Bandwidth: $65
- **GCP**: $923/month
- **Azure**: $891/month
- **Recommendation**: AWS (lowest cost, best payment integrations)

**🔒 Security Agent**
- **Risk Priority**: HIGH (payment data + PII)
- **Compliance**: GDPR, PCI-DSS Level 1
- **Threats Identified**:
  - Critical: SQL injection, payment data exposure
  - High: CSRF, XSS, session hijacking
  - Medium: DDoS, rate limit bypass
- **Recommendations**:
  - Use Stripe/Braintree (PCI-compliant payment processor)
  - Implement WAF (AWS Shield + WAF)
  - Data encryption at rest (KMS) and in transit (TLS 1.3)
  - GDPR: Data residency (EU region), right to deletion

## 🔧 Configuration

### **Environment Variables**

Variables in `.env` (only `ANTHROPIC_API_KEY` is required; the rest are optional or have defaults):

```bash
# Anthropic API (required)
ANTHROPIC_API_KEY=sk-ant-api03-xxxxx

# Qdrant Vector DB (optional - uses localhost:6333 by default)
QDRANT_URL=http://localhost:6333
QDRANT_API_KEY=your-key-here  # Only if using Qdrant Cloud

# API Configuration
ENVIRONMENT=development
LOG_LEVEL=INFO

# Google OAuth (optional - for Google login)
GOOGLE_CLIENT_ID=your_client_id.apps.googleusercontent.com
GOOGLE_CLIENT_SECRET=your_client_secret
GOOGLE_REDIRECT_URI=http://localhost:8000/auth/google/callback
```
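
As an illustration of how these variables and their defaults might be read, here is a minimal sketch using only the standard library. The project's `config.py` is described as using Pydantic settings, so the `Settings` dataclass and `load_settings` function below are hypothetical stand-ins, not the actual implementation:

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Settings:
    anthropic_api_key: str
    qdrant_url: str = "http://localhost:6333"
    qdrant_api_key: Optional[str] = None  # only needed for Qdrant Cloud
    environment: str = "development"
    log_level: str = "INFO"


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (os.environ by default); fail fast if the key is missing."""
    env = os.environ if env is None else env
    api_key = env.get("ANTHROPIC_API_KEY")
    if not api_key:
        raise RuntimeError("ANTHROPIC_API_KEY is required")
    return Settings(
        anthropic_api_key=api_key,
        qdrant_url=env.get("QDRANT_URL", "http://localhost:6333"),
        qdrant_api_key=env.get("QDRANT_API_KEY"),
        environment=env.get("ENVIRONMENT", "development"),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Failing at startup when the API key is absent surfaces a misconfigured `.env` immediately rather than on the first request.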

### **Cost Monitoring**

Built-in usage tracking:

```python
# Daily limits (configurable)
DAILY_BUDGET_USD = 2.00      # Alert if the day's spend exceeds this
DAILY_QUERY_CAP = 100        # Max queries per day

# Tracked automatically:
#   - Token usage per request
#   - Cost per agent
#   - Daily/monthly totals
```
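
A rough sketch of how a daily budget and query cap like the ones above could be enforced. `UsageTracker` and `DailyUsage` are hypothetical names, and resetting counters at the calendar-day boundary is an assumption about the intended behavior:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

DAILY_BUDGET_USD = 2.00
DAILY_QUERY_CAP = 100


@dataclass
class DailyUsage:
    day: date
    cost_usd: float = 0.0
    queries: int = 0


class UsageTracker:
    """Hypothetical tracker: counters reset when the calendar day changes."""

    def __init__(self, budget_usd: float = DAILY_BUDGET_USD, query_cap: int = DAILY_QUERY_CAP):
        self.budget_usd = budget_usd
        self.query_cap = query_cap
        self.usage = DailyUsage(day=date.today())

    def _roll_over(self, today: date) -> None:
        if today != self.usage.day:
            self.usage = DailyUsage(day=today)

    def allow_query(self, today: Optional[date] = None) -> bool:
        """True while the day's query count is under the cap."""
        self._roll_over(today or date.today())
        return self.usage.queries < self.query_cap

    def record(self, cost_usd: float, today: Optional[date] = None) -> bool:
        """Add one query's cost; return True if the daily budget is now exceeded."""
        self._roll_over(today or date.today())
        self.usage.queries += 1
        self.usage.cost_usd += cost_usd
        return self.usage.cost_usd > self.budget_usd
```

In this sketch the budget only signals an alert (the return value of `record`), while the query cap is what blocks further requests, mirroring the comments in the configuration above.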

### **Rate Limiting**

Currently disabled for development. Enable in `backend/src/api/main.py`:

```python
# Demo mode (no auth required): limit applies per IP address
RATE_LIMIT_DEMO = "10/hour"

# Authenticated mode: limit applies per API key
RATE_LIMIT_AUTHENTICATED = "100/hour"
```
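
For readers unfamiliar with per-key limits, the following is a minimal sliding-window sketch. It does not show the mechanism used in `main.py`, which this README does not specify; `SlidingWindowLimiter` is a hypothetical name, and the key would be the client IP in demo mode or the API key in authenticated mode:

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` hits per key within any `window_seconds` span."""

    def __init__(self, limit: int, window_seconds: float = 3600.0, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self._clock = clock  # injectable for testing
        self._hits = defaultdict(deque)  # key -> timestamps of recent hits

    def allow(self, key: str) -> bool:
        now = self._clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

With this sketch, `SlidingWindowLimiter(limit=10)` would correspond to the demo setting and `SlidingWindowLimiter(limit=100)` to the authenticated one.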

## 🧪 Testing

```bash
# Run all tests
pytest backend/tests/

# Type checking
mypy backend/src/

# Linting
ruff check backend/src/

# Format code
ruff format backend/src/
```

## 📦 Deployment

### **Railway (Recommended - $5-10/month)**

**Full deployment guide**: See [RAILWAY_DEPLOYMENT.md](./RAILWAY_DEPLOYMENT.md)

**Docker-Based Deployment**:

This project uses Docker for deployment to ensure consistent Python environments between local development and production. The Dockerfile specifies Python 3.11, which:
- Prevents version mismatches (local may use 3.11.6 while Railway may try 3.13+)
- Ensures reproducible builds across different environments
- Includes health checks for container orchestration
- Optimizes layer caching for faster rebuilds

**Quick steps**:

1. Push to GitHub (already done ✅)
2. Sign up at https://railway.app
3. Deploy backend:
   - New Project → Deploy from GitHub repo
   - Select `ranjanarajendran/tech-stack-advisor`
   - Railway will automatically detect Dockerfile
   - Add environment variables (ANTHROPIC_API_KEY, etc.)
   - Copy backend URL
4. Deploy frontend:
   - Same project → Add service from GitHub
   - Set `API_BASE_URL` to backend URL
   - Deploy

**Features**:
- Auto-deploy on git push (Docker-based)
- Zero-downtime deployments
- Built-in metrics & logs
- Custom domains with SSL
- $5 free trial credit

### **Alternative Deployment Options**

**AWS Free Tier** (tight on resources):
- t2.micro (1GB RAM) - challenging for both services
- Lambda - 15-min timeout may be tight for LLM calls

**Vercel** (not compatible):
- Doesn't support Streamlit (needs persistent WebSocket server)
- Would require rebuilding UI in Next.js/React

**Docker Compose** (local/self-hosted):
```bash
docker-compose up --build
```

## 📊 Monitoring & Observability

### **API Endpoints**

**Public:**
- `GET /` - Main web UI (index.html)
- `GET /health` - Health check (agents loaded, uptime)
- `GET /docs` - Interactive API documentation (FastAPI Swagger)

**Authentication:**
- `POST /auth/register` - User registration
- `POST /auth/login` - User login (returns JWT)
- `POST /auth/logout` - User logout
- `GET /auth/google/login` - Initiate Google OAuth flow
- `GET /auth/google/callback` - Google OAuth callback

**Authenticated:**
- `POST /recommend` - Main recommendation endpoint (requires auth)
- `POST /feedback` - Submit user feedback
- `GET /metrics` - Usage metrics in JSON format (requests, cost, budget)
- `GET /metrics/prometheus` - Prometheus-format metrics for monitoring systems

**Admin Only:**
- `GET /admin.html` - Admin dashboard
- `GET /admin/users` - List all users
- `GET /admin/feedback` - View all feedback

### **Logging**

Structured JSON logs with correlation IDs:

```json
{
  "timestamp": "2025-11-20T22:49:56.470Z",
  "level": "info",
  "event": "recommendation_complete",
  "correlation_id": "aa447f46-f441-4d05-9650-a737475dbcd1",
  "status": "success",
  "dau": 100000,
  "agents_called": 4,
  "duration_ms": 12847
}
```
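
A log line of this shape can be produced with the standard library alone. The sketch below is a stand-in rather than the project's `logging.py`; `JsonFormatter` and the `fields` attribute are hypothetical names chosen for illustration:

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object with a correlation ID."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname.lower(),
            "event": record.getMessage(),
            # Set via logger.info(..., extra={"correlation_id": ...})
            "correlation_id": getattr(record, "correlation_id", None),
        }
        # Extra structured fields, passed as extra={"fields": {...}}
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)
```

Attach it with `handler.setFormatter(JsonFormatter())`, then log with `logger.info("recommendation_complete", extra={"correlation_id": request_id, "fields": {"status": "success", "agents_called": 4}})`. Reusing one correlation ID across all log lines of a request makes it possible to trace that request through every agent.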

### **Prometheus Metrics**

The `/metrics/prometheus` endpoint exposes metrics in Prometheus format for integration with monitoring systems like Grafana Cloud:

**HTTP Metrics:**
- `http_requests_total{method, endpoint, status_code}` - Total HTTP requests with labels
- `http_request_duration_seconds{method, endpoint}` - Request duration histogram

**LLM Usage Metrics:**
- `llm_tokens_total{agent, token_type}` - Token usage by agent (input/output tokens)
- `llm_cost_usd_total{agent}` - Cumulative cost by agent
- `llm_requests_total{agent, status}` - LLM request count by agent and status
- `llm_daily_tokens` - Daily token usage gauge
- `llm_daily_cost_usd` - Daily cost in USD gauge
- `llm_daily_queries` - Daily query count gauge

**Application Metrics:**
- `active_conversation_sessions` - Active conversation sessions count
- `user_registrations_total{oauth_provider}` - User registrations by provider
- `user_logins_total{oauth_provider}` - User logins by provider
- `recommendations_total{status, authenticated}` - Recommendations generated

**Example Setup:**
See [GRAFANA_CLOUD_SETUP.md](./GRAFANA_CLOUD_SETUP.md) for a complete guide to setting up free monitoring with Grafana Cloud.
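
To show what the Prometheus text format looks like for a labeled counter such as `llm_tokens_total{agent, token_type}`, here is a small standard-library sketch. A real service would normally rely on a Prometheus client library; `LabeledCounter` is a hypothetical class, and label-value escaping is deliberately omitted:

```python
class LabeledCounter:
    """Minimal counter that renders itself in Prometheus text exposition format."""

    def __init__(self, name: str, help_text: str, label_names: tuple):
        self.name = name
        self.help_text = help_text
        self.label_names = label_names
        self._values: dict = {}  # label-value tuple -> running total

    def inc(self, amount: float = 1.0, **labels: str) -> None:
        key = tuple(labels[n] for n in self.label_names)
        self._values[key] = self._values.get(key, 0.0) + amount

    def render(self) -> str:
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self._values.items()):
            labels = ",".join(f'{n}="{v}"' for n, v in zip(self.label_names, key))
            lines.append(f"{self.name}{{{labels}}} {value}")
        return "\n".join(lines) + "\n"
```

Each distinct combination of label values becomes its own time series, which is why the metrics above use low-cardinality labels such as `agent` and `status`.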

### **Streamlit Dashboard**

Real-time metrics in UI sidebar:
- API health status
- Agents loaded count
- Total requests
- Daily queries
- Daily cost ($USD)
- Budget remaining

## 🐛 Known Issues & Fixes

### **Issue 1: "Failed to get recommendations" (FIXED)**

**Problem**: Streamlit UI showed error even when API returned success.

**Root Cause**: Bug in error checking logic - checked for key presence instead of value truthiness.

**Fix**: Updated `frontend/streamlit_app.py:369`:
```python
# Before (incorrect)
if "error" in result:

# After (correct)
if result.get("error"):
```
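
The difference between the two checks shows up when a successful response still carries an `error` key set to `None`. A small demonstration, assuming a response shape like the one below (the `recommendations` field is illustrative):

```python
def has_error_by_key(result: dict) -> bool:
    """Original check: true whenever the key exists, even if its value is None."""
    return "error" in result


def has_error_by_value(result: dict) -> bool:
    """Fixed check: true only when the error value is present and truthy."""
    return bool(result.get("error"))


# Assumed shape of a successful response that includes a null error field.
success = {"error": None, "recommendations": {"database": "PostgreSQL"}}
```

Here `has_error_by_key(success)` reports an error even though the request succeeded, while `has_error_by_value(success)` does not.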

### **Issue 2: 401 Authentication Errors (RESOLVED)**

**Problem**: API key invalid errors after updating `.env`.

**Root Cause**: Long-running API process cached old environment variables.

**Solution**: Restart API server after changing `.env`:
```bash
# Kill old processes
pkill -f "python.*backend.src.api.main"

# Restart
./run_app.sh
```

**Note**: The server process reads `.env` once at startup, so changes to the file take effect only after the process is restarted.

## 🗺️ Roadmap

- [x] **Phase 1**: Multi-agent RAG system with LangGraph
- [x] **Phase 1.5**: Railway deployment configuration
- [x] **Phase 1.75**: Modern web UI with user authentication
- [x] **Phase 1.85**: Google OAuth 2.0 integration
- [x] **Phase 1.9**: User feedback system
- [ ] **Phase 2**: Kubernetes deployment (k3s/EKS)
- [ ] **Phase 3**: MLflow experiment tracking
- [ ] **Phase 4**: Fine-tuning on custom dataset
- [ ] **Phase 5**: Mobile app (React Native)

## 🤝 Contributing

Contributions welcome! Please:
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Run tests (`pytest`)
5. Submit a pull request

## 📝 License

MIT License - see [LICENSE](./LICENSE) file for details

## 👤 Author

**Ranjana Rajendran**

- GitHub: [@ranjanarajendran](https://github.com/ranjanarajendran)
- LinkedIn: [ranjana-rajendran](https://www.linkedin.com/in/ranjana-rajendran-9b3bb73)
- Portfolio: [ranjanarajendran.github.io](https://ranjanarajendran.github.io)
- Email: [ranjana.rajendran@gmail.com](mailto:ranjana.rajendran@gmail.com)

## 💬 Feedback & Support

Have feedback or running into issues? We'd love to hear from you!

- **Email**: [ranjana.rajendran@gmail.com](mailto:ranjana.rajendran@gmail.com)
- **GitHub Issues**: [Report a bug](https://github.com/ranjanarajendran/tech-stack-advisor/issues)
- **In-app Feedback**: Use the "Need Help or Have Feedback?" section at the bottom of the web UI

## 🙏 Acknowledgments

- **Anthropic** - Claude AI API
- **LangChain/LangGraph** - Agent orchestration framework
- **Qdrant** - Vector database for RAG
- **Railway** - Easy deployment platform
- **Google** - OAuth 2.0 authentication

---

**Status**: ✅ Production-ready, actively maintained

**Repository**: https://github.com/ranjanarajendran/tech-stack-advisor

**Deployment**: Railway (backend + frontend)

**Estimated Monthly Cost**: $5-10 (Railway) + $0.50-2 (Anthropic API)

---

*Built to demonstrate production ML engineering: prototype → deployment → monitoring*