# Tech Stack Advisor - Multi-Agent RAG System

> **Production-Ready AI System**: LangGraph orchestration with 5 specialized agents, FastAPI backend, Modern Web UI with Google OAuth, Railway deployment, and intelligent memory management

A production-grade multi-agent system that provides intelligent technology stack recommendations using retrieval-augmented generation (RAG) and specialized AI agents orchestrated with LangGraph.

## 🚀 Live Demo

**Deployed on Railway**: https://ranjana-tech-stack-advisor-production.up.railway.app

**Local Testing**: Clone and run the backend with `python -m backend.src.api.main`

## 🎯 What This Project Offers

### **5 Specialized AI Agents**

Each agent covers its own domain:

1. **💬 Conversation Manager Agent**
   - Intelligent multi-turn dialogues to gather requirements
   - Structured follow-up questions with multiple-choice options
   - Context extraction and completion tracking
   - Determines when enough information is collected

2. **🗄️ Database Agent**
   - Database selection (SQL, NoSQL, Graph, Time-series)
   - Scaling strategies (horizontal, vertical, sharding)
   - Caching recommendations (Redis, Memcached)
   - Connection pool sizing

3. **☁️ Infrastructure Agent**
   - Cloud provider recommendations (AWS, GCP, Azure)
   - Architecture patterns (monolith, microservices, serverless)
   - Deployment strategies (containers, VMs, K8s)
   - Load balancing and CDN

4. **💰 Cost Agent**
   - Multi-cloud cost comparison
   - Monthly/annual budget estimates
   - Compute, storage, database, bandwidth breakdown
   - Cost optimization recommendations

5. **🔒 Security Agent**
   - Threat modeling based on data sensitivity
   - Compliance requirements (GDPR, HIPAA, SOC2, PCI-DSS)
   - Risk assessment and prioritization
   - Security architecture recommendations

### **Technologies & Patterns**

- ✅ **LangGraph**: Multi-agent orchestration with state management
- ✅ **RAG System**: Qdrant vector database with semantic search
- ✅ **FastAPI**: Production API with rate limiting and monitoring
- ✅ **Anthropic Claude**: Claude 3 Haiku for fast, cost-effective inference
- ✅ **Modern Web UI**: HTML/CSS/JavaScript frontend with responsive design
- ✅ **Google OAuth 2.0**: Secure authentication without password storage
- ✅ **User Management**: Registration, login, admin dashboard, and feedback system
- ✅ **Multi-Turn Conversations**: Intelligent dialogues with context accumulation
- ✅ **Long-Term Memory**: Qdrant-based semantic search for query history (384-dim vectors)
- ✅ **Short-Term Memory**: Session management with 30-minute timeout
- ✅ **Railway**: One-click deployment with GitHub integration
- ✅ **Structured Logging**: JSON logs with correlation IDs
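
The short-term memory item above can be pictured as a small in-memory store that drops a session after 30 idle minutes. The following is a minimal sketch under that assumption; `SessionStore`, `Session`, and the injectable `clock` are hypothetical names for illustration, not the project's actual classes.

```python
from __future__ import annotations

import time
from dataclasses import dataclass, field

SESSION_TIMEOUT_SECONDS = 30 * 60  # 30-minute idle timeout, as listed above


@dataclass
class Session:
    """Accumulated conversation context for one user (hypothetical shape)."""
    context: dict = field(default_factory=dict)
    last_seen: float = 0.0


class SessionStore:
    """In-memory short-term memory; expired sessions are dropped on access."""

    def __init__(self, timeout: float = SESSION_TIMEOUT_SECONDS, clock=time.monotonic):
        self._timeout = timeout
        self._clock = clock  # injectable so tests can control time
        self._sessions: dict[str, Session] = {}

    def update(self, session_id: str, **facts) -> Session:
        """Merge new facts into the session, starting fresh if missing or expired."""
        now = self._clock()
        session = self._sessions.get(session_id)
        if session is None or now - session.last_seen > self._timeout:
            session = Session()
            self._sessions[session_id] = session
        session.context.update(facts)
        session.last_seen = now
        return session

    def get(self, session_id: str) -> Session | None:
        """Return the live session, or None if it is unknown or has timed out."""
        session = self._sessions.get(session_id)
        if session is None:
            return None
        if self._clock() - session.last_seen > self._timeout:
            del self._sessions[session_id]
            return None
        return session
```

Each call to `update` both accumulates context across turns and refreshes the idle timer, so only a conversation that goes quiet for the full window is discarded.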

### **Architecture**

```
┌─────────────────────────────────────────┐
│ Frontend: Modern Web UI (HTML/JS)       │
│  - User authentication (OAuth/Local)    │
│  - Google OAuth 2.0 integration         │
│  - Interactive recommendation interface │
│  - Admin dashboard                      │
│  - User feedback system                 │
│  └─── Served as static files from API   │
└────────────────┬────────────────────────┘
                 │ HTTP/REST + JWT Auth
                 ▼
┌─────────────────────────────────────────┐
│  FastAPI Backend (Port 8000)            │
│  ├── /auth/*       - Auth endpoints     │
│  ├── /recommend    - Main endpoint      │
│  ├── /health       - Health check       │
│  ├── /admin/*      - Admin dashboard    │
│  ├── /feedback     - User feedback      │
│  └── JWT & Google OAuth integration     │
└────────────────┬────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────┐
│  LangGraph Workflow Orchestrator        │
│  ├── Context Parser (NLP extraction)    │
│  ├── Parallel Agent Execution           │
│  └── Result Aggregation                 │
└─────┬────────┬────────┬─────────┬───────┘
      │        │        │         │
      ▼        ▼        ▼         ▼
  ┌─────┐  ┌─────┐  ┌────┐  ┌─────┐
  │ DB  │  │Infra│  │Cost│  │ Sec │
  │Agent│  │Agent│  │Agnt│  │Agent│
  └──┬──┘  └──┬──┘  └──┬─┘  └──┬──┘
     │        │        │        │
     └────────┴────────┴────────┘
                 │
      ┌──────────┼──────────┐
      ▼          ▼          ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Qdrant   │ │Anthropic │ │ Context  │
│ Vector DB│ │ Claude   │ │  State   │
│(RAG docs)│ │  Haiku   │ │(LangGraph)│
└──────────┘ └──────────┘ └──────────┘
```
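
The orchestrator layer in the diagram fans one parsed context out to four agents and then aggregates their results. The sketch below illustrates that fan-out/fan-in shape with plain `asyncio`, not LangGraph; `run_stub_agent` and `run_agents` are hypothetical stand-ins, and the real workflow lives in `backend/src/orchestration/workflow.py`.

```python
import asyncio

AGENT_NAMES = ("database", "infrastructure", "cost", "security")


async def run_stub_agent(name: str, context: dict) -> dict:
    """Hypothetical stand-in for one agent's RAG lookup plus LLM call."""
    await asyncio.sleep(0)  # yield to the event loop, as a network call would
    return {"agent": name, "summary": f"{name} advice for {context['dau']} DAU"}


async def run_agents(context: dict) -> dict:
    """Fan the parsed context out to every agent, then aggregate by agent name."""
    results = await asyncio.gather(
        *(run_stub_agent(name, context) for name in AGENT_NAMES),
        return_exceptions=True,  # a failing agent is reported, not fatal
    )
    aggregated = {}
    for name, result in zip(AGENT_NAMES, results):
        if isinstance(result, Exception):
            aggregated[name] = {"error": repr(result)}
        else:
            aggregated[name] = result
    return aggregated


if __name__ == "__main__":
    print(asyncio.run(run_agents({"dau": 100_000})))
```

Collecting exceptions instead of raising keeps one slow or failing agent from discarding the other three answers, which is one reasonable way to handle partial results.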

## 🏗️ Project Structure

```
tech-stack-advisor/
├── backend/
│   ├── src/
│   │   ├── agents/              # Specialized agents
│   │   │   ├── database.py     # Database recommendations
│   │   │   ├── infrastructure.py # Cloud architecture
│   │   │   ├── cost.py         # Multi-cloud pricing
│   │   │   └── security.py     # Threat assessment
│   │   ├── orchestration/       # LangGraph workflows
│   │   │   └── workflow.py     # Multi-agent orchestration
│   │   ├── api/                 # FastAPI endpoints
│   │   │   ├── main.py         # API server with auth & metrics
│   │   │   └── models.py       # Pydantic models
│   │   ├── rag/                 # Vector DB integration
│   │   │   └── vectorstore.py  # Qdrant client
│   │   └── core/                # Configuration
│   │       ├── config.py       # Settings (Pydantic)
│   │       ├── auth.py         # JWT auth & user management
│   │       ├── google_oauth.py # Google OAuth 2.0 helper
│   │       └── logging.py      # Structured logging & metrics
│   └── static/                  # Static web files
│       ├── index.html          # Main UI
│       ├── login.html          # Login page
│       ├── register.html       # Registration page
│       ├── admin.html          # Admin dashboard
│       ├── auth.js             # Auth helper
│       └── GOOGLE_OAUTH_SETUP.md # OAuth setup guide
├── knowledge_base/              # RAG documents
│   └── tech_docs/              # Technology documentation
├── Dockerfile                   # Docker container definition (Python 3.11)
├── .dockerignore               # Docker build exclusions
├── railway.toml                 # Railway deployment config
├── RAILWAY_DEPLOYMENT.md       # Deployment guide
├── GRAFANA_CLOUD_SETUP.md      # Grafana monitoring guide
├── GOOGLE_OAUTH_SETUP.md       # Google OAuth guide
└── .env                        # Environment variables
```

## 🚀 Quick Start

### **1. Prerequisites**

- Python 3.11+
- Anthropic API key (get one from https://console.anthropic.com)
- Qdrant (run locally via Docker, or use Qdrant Cloud)

### **2. Setup**

```bash
# Clone repository
git clone https://github.com/ranjanarajendran/tech-stack-advisor.git
cd tech-stack-advisor

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -e ".[dev]"

# Configure environment variables
cp .env.example .env
# Edit .env and add your ANTHROPIC_API_KEY
```

### **3. Run the Application**

**Option A: Quick Start (Recommended)**
```bash
# One command to start everything
./run_app.sh
```

**Option B: Manual Start**
```bash
# Start the backend server (serves both API and UI)
python -m backend.src.api.main
```

Access the app at `http://localhost:8000`

The backend automatically serves:
- Frontend UI at `/` (index.html)
- API endpoints at `/recommend`, `/health`, etc.
- Admin dashboard at `/admin.html`
- API documentation at `/docs`

### **4. Optional: Google OAuth Setup**

To enable Google login:

1. Follow the [Google OAuth Setup Guide](./GOOGLE_OAUTH_SETUP.md)
2. Add credentials to `.env`:
   ```bash
   GOOGLE_CLIENT_ID=your_client_id.apps.googleusercontent.com
   GOOGLE_CLIENT_SECRET=your_client_secret
   GOOGLE_REDIRECT_URI=http://localhost:8000/auth/google/callback
   ```
3. Restart the server to apply changes

## 📋 Example Usage

### **Query:**
```
I'm building a real-time chat application expecting 100K daily active users
with payment processing and GDPR compliance requirements. What's the best
tech stack?
```

### **Agent Responses:**

**🗄️ Database Agent**
- **Primary DB**: PostgreSQL (ACID compliance for payments)
- **Cache**: Redis (real-time message delivery)
- **Scale Tier**: Medium (100K DAU)
- **Estimated Connections**: 2,000-5,000
- **Recommendation**: Use connection pooling (PgBouncer), implement read replicas

**☁️ Infrastructure Agent**
- **Architecture**: Microservices with API Gateway
- **Provider**: AWS (mature payment processing tools)
- **Load Balancer**: ALB with auto-scaling
- **CDN**: CloudFront for static assets
- **Deployment**: ECS Fargate (managed containers)

**💰 Cost Agent**
- **AWS**: $847/month
  - Compute: $412 (ECS Fargate)
  - Database: $285 (RDS PostgreSQL + ElastiCache)
  - Storage: $85 (S3 + CloudFront)
  - Bandwidth: $65
- **GCP**: $923/month
- **Azure**: $891/month
- **Recommendation**: AWS (lowest cost, best payment integrations)

**🔒 Security Agent**
- **Risk Priority**: HIGH (payment data + PII)
- **Compliance**: GDPR, PCI-DSS Level 1
- **Threats Identified**:
  - Critical: SQL injection, payment data exposure
  - High: CSRF, XSS, session hijacking
  - Medium: DDoS, rate limit bypass
- **Recommendations**:
  - Use Stripe/Braintree (PCI-compliant payment processor)
  - Implement WAF (AWS Shield + WAF)
  - Data encryption at rest (KMS) and in transit (TLS 1.3)
  - GDPR: Data residency (EU region), right to deletion

## 🔧 Configuration

### **Environment Variables**

Variables in `.env` (only `ANTHROPIC_API_KEY` is required):

```bash
# Anthropic API (required)
ANTHROPIC_API_KEY=sk-ant-api03-xxxxx

# Qdrant Vector DB (optional - uses localhost:6333 by default)
QDRANT_URL=http://localhost:6333
QDRANT_API_KEY=your-key-here  # Only if using Qdrant Cloud

# API Configuration
ENVIRONMENT=development
LOG_LEVEL=INFO

# Google OAuth (optional - for Google login)
GOOGLE_CLIENT_ID=your_client_id.apps.googleusercontent.com
GOOGLE_CLIENT_SECRET=your_client_secret
GOOGLE_REDIRECT_URI=http://localhost:8000/auth/google/callback
```
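
The variables above could be read at startup roughly as follows. This is a minimal sketch using a standard-library dataclass rather than the Pydantic settings in `backend/src/core/config.py`; `Settings` and `load_settings` are hypothetical names, and the defaults for `ENVIRONMENT` and `LOG_LEVEL` are assumptions taken from the sample values shown.

```python
from __future__ import annotations

import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    anthropic_api_key: str
    qdrant_url: str = "http://localhost:6333"  # documented default
    qdrant_api_key: str | None = None          # only needed for Qdrant Cloud
    environment: str = "development"           # assumed default
    log_level: str = "INFO"                    # assumed default


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (os.environ by default), failing fast on gaps."""
    env = os.environ if env is None else env
    api_key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("ANTHROPIC_API_KEY is required; add it to .env")
    return Settings(
        anthropic_api_key=api_key,
        qdrant_url=env.get("QDRANT_URL", "http://localhost:6333"),
        qdrant_api_key=env.get("QDRANT_API_KEY") or None,
        environment=env.get("ENVIRONMENT", "development"),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Failing at startup when the one required key is missing surfaces a misconfigured `.env` immediately instead of on the first LLM call.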

### **Cost Monitoring**

Built-in usage tracking:

```python
# Daily budget cap (configurable)
DAILY_BUDGET_USD = 2.00      # Alert if exceeded
DAILY_QUERY_CAP = 100        # Max queries per day

# Tracked automatically:
#   - Token usage per request
#   - Cost per agent
#   - Daily/monthly totals
```
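
To show how the two caps might be enforced, here is a minimal sketch of a per-day usage tracker. `DailyUsage` is a hypothetical class, and the per-token prices are assumed Claude 3 Haiku list prices that should be checked against current Anthropic pricing before use.

```python
from dataclasses import dataclass

DAILY_BUDGET_USD = 2.00
DAILY_QUERY_CAP = 100

# Assumed Claude 3 Haiku list prices, USD per million tokens; verify before relying on them.
INPUT_USD_PER_MTOK = 0.25
OUTPUT_USD_PER_MTOK = 1.25


@dataclass
class DailyUsage:
    """Running totals for one day (hypothetical tracker, reset externally each day)."""
    queries: int = 0
    cost_usd: float = 0.0

    def record(self, input_tokens: int, output_tokens: int) -> float:
        """Add one request's token usage and return its cost in USD."""
        cost = (
            input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
        ) / 1_000_000
        self.queries += 1
        self.cost_usd += cost
        return cost

    @property
    def over_budget(self) -> bool:
        return self.cost_usd > DAILY_BUDGET_USD

    @property
    def at_query_cap(self) -> bool:
        return self.queries >= DAILY_QUERY_CAP
```

Under these assumed prices, a request with 4,000 input tokens and 1,000 output tokens costs (4000 × 0.25 + 1000 × 1.25) / 1,000,000 = $0.00225, so the query cap would normally be reached well before the dollar budget.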

### **Rate Limiting**

Currently disabled for development. Enable in `backend/src/api/main.py`:

```python
# Demo mode (no auth required)
RATE_LIMIT_DEMO = "10/hour"            # per IP

# Authenticated mode
RATE_LIMIT_AUTHENTICATED = "100/hour"  # per API key
```
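
A per-key hourly limit like the ones above can be implemented in several ways. The sketch below uses a sliding window over request timestamps; `SlidingWindowLimiter` is a hypothetical class for illustration and is not the mechanism used in `backend/src/api/main.py`.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per key within the trailing window."""

    def __init__(self, limit: int, window_seconds: float = 3600.0, clock=time.monotonic):
        self._limit = limit
        self._window = window_seconds
        self._clock = clock  # injectable so tests can control time
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key: str) -> bool:
        """Record and accept the request if the key is under its limit."""
        now = self._clock()
        hits = self._hits[key]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._limit:
            return False
        hits.append(now)
        return True


demo_limiter = SlidingWindowLimiter(limit=10)            # keyed by client IP
authenticated_limiter = SlidingWindowLimiter(limit=100)  # keyed by API key
```

This version keeps state in process memory, so it only works for a single server instance; a multi-instance deployment would need a shared store.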

## 🧪 Testing

```bash
# Run all tests
pytest backend/tests/

# Type checking
mypy backend/src/

# Linting
ruff check backend/src/

# Format code
ruff format backend/src/
```

## 📦 Deployment

### **Railway (Recommended - $5-10/month)**

**Full deployment guide**: See [RAILWAY_DEPLOYMENT.md](./RAILWAY_DEPLOYMENT.md)

**Docker-Based Deployment**:

This project uses Docker for deployment to ensure consistent Python environments between local development and production. The Dockerfile specifies Python 3.11, which:
- Prevents version mismatches (local may use 3.11.6 while Railway may try 3.13+)
- Ensures reproducible builds across different environments
- Includes health checks for container orchestration
- Optimizes layer caching for faster rebuilds

**Quick steps**:

1. Push to GitHub (already done ✅)
2. Sign up at https://railway.app
3. Deploy backend:
   - New Project → Deploy from GitHub repo
   - Select `ranjanarajendran/tech-stack-advisor`
   - Railway will automatically detect Dockerfile
   - Add environment variables (ANTHROPIC_API_KEY, etc.)
   - Copy backend URL
4. Deploy frontend:
   - Same project → Add service from GitHub
   - Set `API_BASE_URL` to backend URL
   - Deploy

**Features**:
- Auto-deploy on git push (Docker-based)
- Zero-downtime deployments
- Built-in metrics & logs
- Custom domains with SSL
- $5 free trial credit

### **Alternative Deployment Options**

**AWS Free Tier** (tight on resources):
- t2.micro (1GB RAM) - challenging for both services
- Lambda - 15-min timeout may be tight for LLM calls

**Vercel** (not compatible):
- Doesn't support Streamlit (needs persistent WebSocket server)
- Would require rebuilding UI in Next.js/React

**Docker Compose** (local/self-hosted):
```bash
docker-compose up --build
```

## 📊 Monitoring & Observability

### **API Endpoints**

**Public:**
- `GET /` - Main web UI (index.html)
- `GET /health` - Health check (agents loaded, uptime)
- `GET /docs` - Interactive API documentation (FastAPI Swagger)

**Authentication:**
- `POST /auth/register` - User registration
- `POST /auth/login` - User login (returns JWT)
- `POST /auth/logout` - User logout
- `GET /auth/google/login` - Initiate Google OAuth flow
- `GET /auth/google/callback` - Google OAuth callback

**Authenticated:**
- `POST /recommend` - Main recommendation endpoint (requires auth)
- `POST /feedback` - Submit user feedback
- `GET /metrics` - Usage metrics in JSON format (requests, cost, budget)
- `GET /metrics/prometheus` - Prometheus-format metrics for monitoring systems

**Admin Only:**
- `GET /admin.html` - Admin dashboard
- `GET /admin/users` - List all users
- `GET /admin/feedback` - View all feedback
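
As an illustration of calling an authenticated endpoint, the sketch below builds a `POST /recommend` request with the standard library. The `query` field name and the `Bearer` authorization scheme are assumptions; check `/docs` for the actual request schema.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # local default from the Quick Start


def build_recommend_request(
    token: str, query: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST /recommend request."""
    body = json.dumps({"query": query}).encode("utf-8")  # "query" field name is assumed
    return urllib.request.Request(
        url=f"{base_url}/recommend",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # JWT returned by POST /auth/login
        },
    )
```

Passing the returned object to `urllib.request.urlopen` sends it; keeping construction separate from sending makes the request easy to inspect and test without a running server.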

### **Logging**

Structured JSON logs with correlation IDs:

```json
{
  "timestamp": "2025-11-20T22:49:56.470Z",
  "level": "info",
  "event": "recommendation_complete",
  "correlation_id": "aa447f46-f441-4d05-9650-a737475dbcd1",
  "status": "success",
  "dau": 100000,
  "agents_called": 4,
  "duration_ms": 12847
}
```
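
A log line of this shape can be produced with the standard `logging` module alone. The sketch below is one way to do it, assuming nothing about the project's own `core/logging.py`; `JsonFormatter`, `log_event`, and `new_correlation_id` are hypothetical names.

```python
import json
import logging
import uuid
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object, merging in any structured fields."""

    def format(self, record: logging.LogRecord) -> str:
        timestamp = datetime.fromtimestamp(record.created, tz=timezone.utc)
        payload = {
            "timestamp": timestamp.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
            "level": record.levelname.lower(),
            "event": record.getMessage(),
        }
        payload.update(getattr(record, "fields", {}))  # set by log_event below
        return json.dumps(payload)


def new_correlation_id() -> str:
    """Generate an ID to attach to every log line of one request."""
    return str(uuid.uuid4())


def log_event(logger: logging.Logger, event: str, correlation_id: str, **fields) -> None:
    """Emit an info-level event carrying the correlation ID and extra fields."""
    logger.info(event, extra={"fields": {"correlation_id": correlation_id, **fields}})
```

Attaching a handler whose formatter is `JsonFormatter()` and calling `log_event(logger, "recommendation_complete", new_correlation_id(), status="success", agents_called=4)` yields a line with the same keys as the example above.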

### **Prometheus Metrics**

The `/metrics/prometheus` endpoint exposes metrics in Prometheus format for integration with monitoring systems like Grafana Cloud:

**HTTP Metrics:**
- `http_requests_total{method, endpoint, status_code}` - Total HTTP requests with labels
- `http_request_duration_seconds{method, endpoint}` - Request duration histogram

**LLM Usage Metrics:**
- `llm_tokens_total{agent, token_type}` - Token usage by agent (input/output tokens)
- `llm_cost_usd_total{agent}` - Cumulative cost by agent
- `llm_requests_total{agent, status}` - LLM request count by agent and status
- `llm_daily_tokens` - Daily token usage gauge
- `llm_daily_cost_usd` - Daily cost in USD gauge
- `llm_daily_queries` - Daily query count gauge

**Application Metrics:**
- `active_conversation_sessions` - Active conversation sessions count
- `user_registrations_total{oauth_provider}` - User registrations by provider
- `user_logins_total{oauth_provider}` - User logins by provider
- `recommendations_total{status, authenticated}` - Recommendations generated

**Example Setup:**
See [GRAFANA_CLOUD_SETUP.md](./GRAFANA_CLOUD_SETUP.md) for complete guide on setting up free monitoring with Grafana Cloud.
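
For readers unfamiliar with the Prometheus text format, the sketch below renders a labeled counter such as `http_requests_total` by hand. `LabeledCounter` is a hypothetical class that skips label-value escaping; a real service would more likely use an official Prometheus client library, and this README does not say which approach the project takes.

```python
from collections import Counter


class LabeledCounter:
    """Minimal labeled counter that renders Prometheus text exposition lines."""

    def __init__(self, name: str, help_text: str, label_names: tuple):
        self.name = name
        self.help_text = help_text
        self.label_names = label_names
        self._values = Counter()  # label-value tuple -> running total

    def inc(self, amount: float = 1.0, **labels: str) -> None:
        """Increase the series identified by the given label values."""
        key = tuple(labels[name] for name in self.label_names)
        self._values[key] += amount

    def render(self) -> str:
        """Return HELP/TYPE headers followed by one line per label combination."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self._values.items()):
            pairs = ",".join(f'{n}="{v}"' for n, v in zip(self.label_names, key))
            lines.append(f"{self.name}{{{pairs}}} {value}")
        return "\n".join(lines) + "\n"


http_requests_total = LabeledCounter(
    "http_requests_total",
    "Total HTTP requests",
    ("method", "endpoint", "status_code"),
)
```

Each distinct combination of label values becomes its own time series, which is why labels such as `endpoint` should stay low-cardinality.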

### **Streamlit Dashboard**

Real-time metrics in UI sidebar:
- API health status
- Agents loaded count
- Total requests
- Daily queries
- Daily cost ($USD)
- Budget remaining

## 🐛 Known Issues & Fixes

### **Issue 1: "Failed to get recommendations" (FIXED)**

**Problem**: Streamlit UI showed error even when API returned success.

**Root Cause**: Bug in error checking logic - checked for key presence instead of value truthiness.

**Fix**: Updated `frontend/streamlit_app.py:369`:
```python
# Before (incorrect)
if "error" in result:

# After (correct)
if result.get("error"):
```
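
The difference between the two checks shows up when a successful response still carries an `error` key with an empty value. The response shapes below are hypothetical and chosen only to illustrate that case.

```python
def has_error_by_key(result: dict) -> bool:
    """Original check: true whenever the key exists, even if its value is None."""
    return "error" in result


def has_error_by_value(result: dict) -> bool:
    """Fixed check: true only when the key holds a truthy value."""
    return bool(result.get("error"))


success = {"status": "success", "error": None}     # hypothetical success payload
failure = {"status": "failed", "error": "timeout"}  # hypothetical failure payload

for label, result in (("success", success), ("failure", failure)):
    print(label, has_error_by_key(result), has_error_by_value(result))
```

Only the value-based check treats the success payload as a success; both checks agree on the failure payload.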

### **Issue 2: 401 Authentication Errors (RESOLVED)**

**Problem**: API key invalid errors after updating `.env`.

**Root Cause**: Long-running API process cached old environment variables.

**Solution**: Restart API server after changing `.env`:
```bash
# Kill old processes
pkill -f "python.*backend.src.api.main"

# Restart
./run_app.sh
```

**Note**: The `.env` file is read once at startup, so changes only take effect after the process restarts.

## 🗺️ Roadmap

- [x] **Phase 1**: Multi-agent RAG system with LangGraph
- [x] **Phase 1.5**: Railway deployment configuration
- [x] **Phase 1.75**: Modern web UI with user authentication
- [x] **Phase 1.85**: Google OAuth 2.0 integration
- [x] **Phase 1.9**: User feedback system
- [ ] **Phase 2**: Kubernetes deployment (k3s/EKS)
- [ ] **Phase 3**: MLflow experiment tracking
- [ ] **Phase 4**: Fine-tuning on custom dataset
- [ ] **Phase 5**: Mobile app (React Native)

## 🤝 Contributing

Contributions welcome! Please:
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Run tests (`pytest`)
5. Submit a pull request

## 📝 License

MIT License - see [LICENSE](./LICENSE) file for details

## 👤 Author

**Ranjana Rajendran**

- GitHub: [@ranjanarajendran](https://github.com/ranjanarajendran)
- LinkedIn: [ranjana-rajendran](https://www.linkedin.com/in/ranjana-rajendran-9b3bb73)
- Portfolio: [ranjanarajendran.github.io](https://ranjanarajendran.github.io)
- Email: [ranjana.rajendran@gmail.com](mailto:ranjana.rajendran@gmail.com)

## 💬 Feedback & Support

Have feedback or running into issues? We'd love to hear from you!

- **Email**: [ranjana.rajendran@gmail.com](mailto:ranjana.rajendran@gmail.com)
- **GitHub Issues**: [Report a bug](https://github.com/ranjanarajendran/tech-stack-advisor/issues)
- **In-app Feedback**: Use the "Need Help or Have Feedback?" section at the bottom of the web UI

## 🙏 Acknowledgments

- **Anthropic** - Claude AI API
- **LangChain/LangGraph** - Agent orchestration framework
- **Qdrant** - Vector database for RAG
- **Railway** - Easy deployment platform
- **Google** - OAuth 2.0 authentication

---

**Status**: ✅ Production-ready, actively maintained

**Repository**: https://github.com/ranjanarajendran/tech-stack-advisor

**Deployment**: Railway (backend + frontend)

**Estimated Monthly Cost**: $5-10 (Railway) + $0.50-2 (Anthropic API)

---

*Built to demonstrate production ML engineering: prototype → deployment → monitoring*