<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Tech Stack Advisor - AI-Powered Architecture Recommendations</title>
<style>
* {
margin: 0;
padding: 0;
box-sizing: border-box;
}
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, sans-serif;
line-height: 1.6;
color: #333;
background: #f5f5f5;
}
.hero {
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: white;
padding: 100px 20px;
text-align: center;
}
.hero h1 {
font-size: 3em;
margin-bottom: 20px;
font-weight: 700;
}
.hero p {
font-size: 1.3em;
max-width: 800px;
margin: 0 auto 30px;
opacity: 0.95;
}
.hero-badges {
display: flex;
justify-content: center;
gap: 15px;
flex-wrap: wrap;
margin-top: 30px;
}
.badge {
background: rgba(255, 255, 255, 0.2);
padding: 8px 16px;
border-radius: 20px;
font-size: 0.9em;
backdrop-filter: blur(10px);
}
.container {
max-width: 1200px;
margin: 0 auto;
padding: 60px 20px;
}
.section {
background: white;
margin-bottom: 40px;
padding: 40px;
border-radius: 12px;
box-shadow: 0 2px 10px rgba(0,0,0,0.1);
}
h2 {
color: #667eea;
font-size: 2.2em;
margin-bottom: 30px;
border-bottom: 3px solid #667eea;
padding-bottom: 10px;
}
h3 {
color: #764ba2;
font-size: 1.6em;
margin: 30px 0 15px;
}
.architecture-diagram {
background: #f9f9f9;
border: 2px solid #e0e0e0;
border-radius: 8px;
padding: 30px;
font-family: 'Courier New', monospace;
font-size: 0.85em;
line-height: 1.5;
overflow-x: auto;
margin: 30px 0;
}
.feature-grid {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(280px, 1fr));
gap: 25px;
margin: 30px 0;
}
.feature-card {
background: linear-gradient(135deg, #f5f7fa 0%, #c3cfe2 100%);
padding: 25px;
border-radius: 10px;
border-left: 4px solid #667eea;
}
.feature-card h4 {
color: #667eea;
margin-bottom: 10px;
font-size: 1.2em;
}
.tech-stack {
display: flex;
flex-wrap: wrap;
gap: 10px;
margin: 20px 0;
}
.tech-tag {
background: #667eea;
color: white;
padding: 6px 14px;
border-radius: 6px;
font-size: 0.9em;
font-weight: 500;
}
.timeline {
position: relative;
padding-left: 40px;
margin: 40px 0;
}
.timeline::before {
content: '';
position: absolute;
left: 0;
top: 0;
bottom: 0;
width: 3px;
background: #667eea;
}
.timeline-item {
position: relative;
margin-bottom: 30px;
padding-left: 30px;
}
.timeline-item::before {
content: '';
position: absolute;
left: -46px;
top: 5px;
width: 15px;
height: 15px;
border-radius: 50%;
background: #667eea;
border: 3px solid white;
box-shadow: 0 0 0 2px #667eea;
}
.timeline-item h4 {
color: #764ba2;
margin-bottom: 10px;
}
.decision-card {
background: #f0f4ff;
border-left: 4px solid #667eea;
padding: 20px;
margin: 20px 0;
border-radius: 6px;
}
.decision-card h4 {
color: #667eea;
margin-bottom: 10px;
}
.pros-cons {
display: grid;
grid-template-columns: 1fr 1fr;
gap: 20px;
margin: 20px 0;
}
.pros, .cons {
padding: 20px;
border-radius: 8px;
}
.pros {
background: #e8f5e9;
border-left: 4px solid #4caf50;
}
.cons {
background: #ffebee;
border-left: 4px solid #f44336;
}
.code-block {
background: #1e1e1e;
color: #d4d4d4;
padding: 20px;
border-radius: 8px;
overflow-x: auto;
margin: 20px 0;
font-family: 'Courier New', monospace;
font-size: 0.9em;
white-space: pre-wrap;
line-height: 1.6;
}
.metrics-grid {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(200px, 1fr));
gap: 20px;
margin: 30px 0;
}
.metric-card {
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: white;
padding: 25px;
border-radius: 10px;
text-align: center;
}
.metric-value {
font-size: 2.5em;
font-weight: bold;
margin-bottom: 5px;
}
.metric-label {
font-size: 0.9em;
opacity: 0.9;
}
.challenge-card {
background: #fff3e0;
border-left: 4px solid #ff9800;
padding: 20px;
margin: 20px 0;
border-radius: 6px;
}
.solution-card {
background: #e8f5e9;
border-left: 4px solid #4caf50;
padding: 20px;
margin: 20px 0;
border-radius: 6px;
}
.cta-section {
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: white;
padding: 60px 20px;
text-align: center;
margin-top: 60px;
}
.cta-button {
display: inline-block;
background: white;
color: #667eea;
padding: 15px 40px;
border-radius: 30px;
text-decoration: none;
font-weight: 600;
font-size: 1.1em;
margin: 10px;
transition: transform 0.2s;
}
.cta-button:hover {
transform: translateY(-2px);
box-shadow: 0 5px 15px rgba(0,0,0,0.3);
}
footer {
background: #2c3e50;
color: white;
text-align: center;
padding: 30px 20px;
}
@media (max-width: 768px) {
.hero h1 {
font-size: 2em;
}
.pros-cons {
grid-template-columns: 1fr;
}
.section {
padding: 25px;
}
}
ul, ol {
margin-left: 25px;
margin-top: 10px;
}
li {
margin: 8px 0;
}
table {
width: 100%;
border-collapse: collapse;
margin: 20px 0;
}
table th, table td {
border: 1px solid #ddd;
padding: 12px;
text-align: left;
}
table th {
background: #667eea;
color: white;
}
table tr:nth-child(even) {
background: #f9f9f9;
}
</style>
</head>
<body>
<!-- Hero Section -->
<div class="hero">
<h1>🚀 Tech Stack Advisor</h1>
<p>An AI-powered multi-agent system that provides intelligent, personalized technology stack recommendations for software projects</p>
<div class="hero-badges">
<span class="badge">🤖 5 Specialized AI Agents</span>
<span class="badge">⚡ LangGraph Orchestration</span>
<span class="badge">🔒 JWT Authentication</span>
<span class="badge">☁️ Production Ready</span>
</div>
</div>
<!-- Main Content -->
<div class="container">
<!-- Overview Section -->
<div class="section">
<h2>📋 Project Overview</h2>
<p style="font-size: 1.1em; line-height: 1.8;">
Tech Stack Advisor is a production-ready web application that uses AI to help developers and architects make informed technology-stack decisions. Five specialized agents work in concert to analyze project requirements and deliver comprehensive recommendations covering databases, infrastructure, cost optimization, and security. Intelligent multi-turn conversations, long-term memory powered by Qdrant vector search, and semantic query history keep the recommendations personalized and context-aware.
</p>
<div class="metrics-grid">
<div class="metric-card">
<div class="metric-value">~3,400</div>
<div class="metric-label">Lines of Code</div>
</div>
<div class="metric-card">
<div class="metric-value">5</div>
<div class="metric-label">AI Agents</div>
</div>
<div class="metric-card">
<div class="metric-value">2-4s</div>
<div class="metric-label">Response Time</div>
</div>
<div class="metric-card">
<div class="metric-value">$0.0015</div>
<div class="metric-label">Cost Per Query</div>
</div>
</div>
</div>
<!-- Live Demo & Resources -->
<div class="section">
<h2>🎬 Try It Live</h2>
<div class="feature-grid">
<div class="feature-card">
<h4>🌐 Live Application</h4>
<p>Try the production deployment</p>
<a href="https://ranjana-tech-stack-advisor-production.up.railway.app" target="_blank" style="color: #667eea; font-weight: bold;">Launch App →</a>
</div>
<div class="feature-card">
<h4>📖 API Documentation</h4>
<p>Interactive Swagger UI</p>
<a href="https://ranjana-tech-stack-advisor-production.up.railway.app/docs" target="_blank" style="color: #667eea; font-weight: bold;">View Docs →</a>
</div>
<div class="feature-card">
<h4>💻 Source Code</h4>
<p>Private repository (ranjanarajendran)</p>
<a href="https://github.com/ranjanarajendran/tech-stack-advisor" target="_blank" style="color: #667eea; font-weight: bold; display: block; margin-bottom: 8px;">View Repository →</a>
<p style="font-size: 0.85em; margin-top: 8px;">No access? <a href="mailto:ranjana.rajendran@gmail.com?subject=Tech%20Stack%20Advisor%20Repository%20Access%20Request" style="color: #667eea;">Request here</a></p>
</div>
<div class="feature-card">
<h4>📊 Technical Documentation</h4>
<p>Comprehensive docs in repository</p>
<a href="https://github.com/ranjanarajendran/tech-stack-advisor/blob/main/TECHNICAL_DOCUMENTATION.md" target="_blank" style="color: #667eea; font-weight: bold; display: block; margin-bottom: 8px;">View Docs →</a>
<p style="font-size: 0.85em; margin-top: 8px;">No access? <a href="mailto:ranjana.rajendran@gmail.com?subject=Tech%20Stack%20Advisor%20Docs%20Access%20Request" style="color: #667eea;">Request here</a></p>
</div>
</div>
</div>
<!-- What Problem Does It Solve -->
<div class="section">
<h2>🎯 What Problem Does It Solve?</h2>
<h3>The Challenge</h3>
<p>
Choosing the right technology stack for a project is one of the most critical decisions in software development. It requires:
</p>
<ul>
<li><strong>Deep expertise</strong> across multiple domains (databases, infrastructure, security, cost optimization)</li>
<li><strong>Understanding of scale requirements</strong> and how different technologies perform at various scales</li>
<li><strong>Knowledge of compliance requirements</strong> (GDPR, HIPAA, PCI-DSS, SOC 2)</li>
<li><strong>Cost-benefit analysis</strong> across different cloud providers and deployment options</li>
<li><strong>Security threat modeling</strong> and mitigation strategies</li>
</ul>
<h3>The Solution</h3>
<p>
Tech Stack Advisor automates this complex decision-making process by:
</p>
<div class="feature-grid">
<div class="feature-card">
<h4>💬 Intelligent Conversations</h4>
<p>Engages in multi-turn dialogues to gather project requirements intelligently, asking targeted follow-up questions with structured choices.</p>
</div>
<div class="feature-card">
<h4>🗄️ Database Recommendations</h4>
<p>Analyzes data type, scale, consistency requirements, and recommends optimal database solutions with scaling strategies.</p>
</div>
<div class="feature-card">
<h4>☁️ Infrastructure Planning</h4>
<p>Suggests cloud providers, architecture patterns, and deployment strategies based on workload characteristics.</p>
</div>
<div class="feature-card">
<h4>💰 Cost Optimization</h4>
<p>Provides multi-provider cost comparisons and optimization recommendations to maximize budget efficiency.</p>
</div>
<div class="feature-card">
<h4>🔒 Security Analysis</h4>
<p>Performs threat modeling, checks compliance requirements, and recommends security measures.</p>
</div>
</div>
</div>
<!-- System Architecture -->
<div class="section">
<h2>🏗️ System Architecture</h2>
<style>
.arch-container {
display: flex;
flex-direction: column;
align-items: center;
gap: 20px;
margin: 40px 0;
}
.arch-box {
background: white;
border: 3px solid #667eea;
border-radius: 12px;
padding: 25px;
box-shadow: 0 4px 12px rgba(102, 126, 234, 0.2);
width: 90%;
max-width: 700px;
}
.arch-box h4 {
color: #667eea;
font-size: 1.3em;
margin-bottom: 15px;
text-align: center;
font-weight: bold;
}
.arch-box ul {
list-style: none;
padding: 0;
margin: 10px 0;
}
.arch-box li {
padding: 5px 0;
padding-left: 20px;
position: relative;
}
.arch-box li::before {
content: "•";
position: absolute;
left: 0;
color: #667eea;
font-size: 1.5em;
}
.arch-arrow {
color: #667eea;
font-size: 2em;
font-weight: bold;
}
.arch-agents {
display: grid;
grid-template-columns: repeat(5, 1fr);
gap: 15px;
width: 90%;
max-width: 900px;
}
.agent-box {
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: white;
border-radius: 10px;
padding: 20px 10px;
text-align: center;
font-weight: bold;
box-shadow: 0 4px 8px rgba(0,0,0,0.2);
}
.arch-services {
display: grid;
grid-template-columns: repeat(3, 1fr);
gap: 15px;
width: 90%;
max-width: 700px;
}
.service-box {
background: #f0f4ff;
border: 2px solid #667eea;
border-radius: 8px;
padding: 15px;
text-align: center;
font-weight: 600;
color: #667eea;
}
@media (max-width: 768px) {
.arch-agents {
grid-template-columns: repeat(2, 1fr);
}
.arch-services {
grid-template-columns: 1fr;
}
}
@media (min-width: 769px) and (max-width: 1024px) {
.arch-agents {
grid-template-columns: repeat(3, 1fr);
}
}
</style>
<div class="arch-container">
<div class="arch-box" style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; border: none;">
<h4 style="color: white;">🎨 Modern Web UI</h4>
<ul>
<li>HTML/CSS/JavaScript (Vanilla)</li>
<li>User Authentication (Local + Google OAuth)</li>
<li>Responsive Design</li>
<li>Real-time API Integration with JWT</li>
<li>Admin Dashboard</li>
<li>Download JSON Results</li>
</ul>
</div>
<div class="arch-arrow">↓</div>
<div style="text-align: center; color: #667eea; font-weight: bold; margin: -10px 0;">HTTP REST + JWT Auth</div>
<div class="arch-arrow">↓</div>
<div class="arch-box">
<h4>⚡ FastAPI Backend (Port 8000)</h4>
<ul>
<li>Serves static files (HTML/CSS/JS)</li>
<li>POST /recommend - Main recommendation endpoint</li>
<li>Authentication endpoints (register/login/OAuth)</li>
<li>GET /health - Health monitoring</li>
<li>GET /metrics - Usage & cost tracking</li>
<li>Rate limiting & JWT authentication</li>
<li>Auto-generated Swagger docs</li>
</ul>
</div>
<div class="arch-arrow">↓</div>
<div class="arch-box" style="background: #f0f4ff;">
<h4>🔄 LangGraph Orchestrator</h4>
<ul>
<li>Query Parser (NLP-based context extraction)</li>
<li>Sequential agent coordination</li>
<li>State management with TypedDict</li>
<li>Correlation IDs for tracing</li>
</ul>
</div>
<div class="arch-arrow">↓</div>
<div class="arch-agents">
<div class="agent-box">
💬<br>Conversation<br>Manager
</div>
<div class="agent-box">
🗄️<br>Database<br>Agent
</div>
<div class="agent-box">
☁️<br>Infrastructure<br>Agent
</div>
<div class="agent-box">
💰<br>Cost<br>Agent
</div>
<div class="agent-box">
🔒<br>Security<br>Agent
</div>
</div>
<div class="arch-arrow">↓</div>
<div class="arch-services">
<div class="service-box">
📚 Qdrant<br>Vector Store<br>(34 docs)
</div>
<div class="service-box">
🤖 Claude AI<br>Haiku Model<br>(LLM)
</div>
<div class="service-box">
💵 Pricing<br>Data<br>(Real-time)
</div>
</div>
</div>
<h3>Architecture Highlights</h3>
<ul>
<li><strong>Single-Service Design:</strong> Unified FastAPI backend serving both API and web UI on port 8000</li>
<li><strong>Agent Orchestration:</strong> LangGraph manages sequential workflow through all agents</li>
<li><strong>RAG System:</strong> Qdrant vector database with 34 curated technical documents</li>
<li><strong>Authentication:</strong> JWT tokens + Google OAuth 2.0 for user management</li>
<li><strong>State Management:</strong> Correlation IDs track requests through entire pipeline</li>
</ul>
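<p>The sequential state-passing pattern can be sketched in plain Python (a minimal stand-in for the LangGraph workflow; the node functions and state keys below are illustrative, not the project's actual code):</p>
<div class="code-block">
from typing import Callable, TypedDict

class AdvisorState(TypedDict, total=False):
    query: str
    parsed_context: dict
    database: str

def parse_query(state: AdvisorState) -> AdvisorState:
    # Stands in for the NLP-based query parser
    state["parsed_context"] = {"dau": 100_000}
    return state

def database_agent(state: AdvisorState) -> AdvisorState:
    # Each node reads the shared state and writes its own keys
    state["database"] = "PostgreSQL"  # placeholder recommendation
    return state

PIPELINE: list[Callable[[AdvisorState], AdvisorState]] = [parse_query, database_agent]

def run(query: str) -> AdvisorState:
    state: AdvisorState = {"query": query}
    for node in PIPELINE:  # sequential coordination, as in the orchestrator
        state = node(state)
    return state
</div>
<p>The real orchestrator also threads a correlation ID through this state so every agent's log lines can be tied back to a single request.</p>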
</div>
<!-- Key Technical Decisions -->
<div class="section">
<h2>💡 Key Technical Decisions</h2>
<div class="decision-card">
<h4>Decision 1: Vanilla JavaScript vs. Streamlit/React</h4>
<p><strong>Chosen:</strong> Vanilla HTML/CSS/JavaScript</p>
<p><strong>Primary Reason:</strong> Streamlit has known deployment issues on Railway, requiring complex WebSocket configuration and separate service management.</p>
<div class="pros-cons">
<div class="pros">
<strong>✅ Why Vanilla JS Won:</strong>
<ul>
<li>Single service deployment (no WebSocket complexity)</li>
<li>Served directly from FastAPI (port 8000 only)</li>
<li>No Streamlit-specific Railway configuration</li>
<li>Zero build step, instant deployment</li>
<li>Smaller bundle size (~15KB vs ~2MB)</li>
</ul>
</div>
<div class="cons">
<strong>❌ Rejected Alternatives:</strong>
<ul>
<li><strong>Streamlit:</strong> Deployment complexity on Railway, WebSocket issues, required separate service</li>
<li><strong>React:</strong> Overkill for UI complexity, build step overhead</li>
<li><strong>Vue:</strong> Added dependency overhead</li>
</ul>
</div>
</div>
<p style="margin-top: 15px;"><strong>Key Learning:</strong> We initially built with Streamlit but encountered deployment issues on Railway. Rewriting in vanilla JavaScript reduced architecture complexity from 2 services to 1, eliminated WebSocket configuration headaches, and made deployment trivial.</p>
</div>
<div class="decision-card">
<h4>Decision 2: LLM Provider Selection</h4>
<p><strong>Chosen:</strong> Anthropic Claude (Haiku model)</p>
<p><strong>Why:</strong></p>
<ul>
<li>Best cost/performance ratio: $0.25 per 1M input tokens (vs GPT-4: $30)</li>
<li>Long context windows (200K tokens)</li>
<li>Strong instruction following</li>
<li>Built-in safety features</li>
<li>Lower latency than GPT-4</li>
</ul>
<p><strong>Cost Comparison (per 1,000 queries):</strong></p>
<table>
<tr>
<th>Model</th>
<th>Cost</th>
<th>Decision</th>
</tr>
<tr>
<td>Claude Haiku</td>
<td>$1.50</td>
<td>✅ Selected</td>
</tr>
<tr>
<td>Claude Sonnet</td>
<td>$15.00</td>
<td>❌ 10x more expensive; strained the 1GB Railway memory limit</td>
</tr>
<tr>
<td>GPT-3.5-Turbo</td>
<td>$2.00</td>
<td>❌ More expensive, lower quality</td>
</tr>
<tr>
<td>GPT-4-Turbo</td>
<td>$30.00</td>
<td>❌ 20x more expensive</td>
</tr>
<tr>
<td>Gemini Pro</td>
<td>$0.50</td>
<td>❌ Inconsistent API, less mature</td>
</tr>
</table>
<p style="margin-top: 15px;"><strong>Initial Consideration: Claude Sonnet</strong></p>
<p>We initially considered upgrading to Claude Sonnet for its ability to generate larger responses (8,192 output tokens vs Haiku's 4,096), which would be useful for comprehensive infrastructure diagrams and detailed recommendations.</p>
<p><strong>Why We Stayed with Haiku:</strong></p>
<ul>
<li><strong>Cost:</strong> Sonnet is 10x more expensive ($3 vs $0.30 per 1M tokens)</li>
<li><strong>Memory Footprint:</strong> Handling Sonnet's larger responses pushed the app past Railway's 1GB free-tier memory limit, which would have required a paid plan upgrade</li>
<li><strong>Architectural Workaround:</strong> Instead of upgrading, we split the infrastructure diagram generation into a separate, smaller task outside the main Infrastructure Agent, keeping responses within Haiku's token limits</li>
<li><strong>Performance:</strong> Haiku's faster response times (2-4s) better suited our real-time recommendation use case</li>
</ul>
<p style="margin-top: 10px;"><strong>Key Learning:</strong> Architectural refactoring (task decomposition) can be more cost-effective than upgrading to larger models. By splitting complex outputs into focused sub-tasks, we maintained quality while achieving 10x cost savings and staying within infrastructure constraints.</p>
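<p>The headline numbers can be sanity-checked from Haiku's list prices (the per-query token counts below are assumed averages, not measured values):</p>
<div class="code-block">
# Claude 3 Haiku list prices, USD per 1M tokens
INPUT_PRICE = 0.25
OUTPUT_PRICE = 1.25

def query_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of a single LLM call."""
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000

# Assumed average usage for one recommendation
cost = query_cost(input_tokens=2_000, output_tokens=800)
print(f"${cost:.4f} per query, ${cost * 1000:.2f} per 1,000 queries")
# $0.0015 per query, $1.50 per 1,000 queries
</div>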
</div>
<div class="decision-card">
<h4>Decision 3: Multi-Agent Architecture</h4>
<p><strong>Chosen:</strong> 5 specialized agents with LangGraph orchestration</p>
<p><strong>Agents:</strong></p>
<ol>
<li><strong>Conversation Manager:</strong> Intelligent multi-turn dialogues to gather requirements</li>
<li><strong>Database Agent:</strong> Database technology recommendations</li>
<li><strong>Infrastructure Agent:</strong> Cloud architecture and deployment strategies</li>
<li><strong>Cost Agent:</strong> Multi-provider cost comparisons</li>
<li><strong>Security Agent:</strong> Threat modeling and compliance checks</li>
</ol>
<p><strong>Why:</strong></p>
<ul>
<li><strong>Separation of Concerns:</strong> Each agent has focused expertise</li>
<li><strong>Better Prompt Engineering:</strong> Smaller, targeted prompts vs one giant prompt</li>
<li><strong>Conversational UX:</strong> Conversation Manager guides users through complex requirements</li>
<li><strong>Future Parallelization:</strong> Independent domain agents could run concurrently for an estimated 3.7× speedup</li>
<li><strong>Maintainability:</strong> Easy to update individual agents</li>
<li><strong>Testability:</strong> Each agent can be tested in isolation</li>
</ul>
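<p>That parallelization opportunity can be sketched with asyncio (hypothetical agent stubs; the real agents make LLM API calls, which is where concurrency would pay off):</p>
<div class="code-block">
import asyncio

async def database_agent(ctx: dict) -> tuple:
    await asyncio.sleep(0)  # stands in for an LLM API call
    return ("database", f"recommendation for {ctx['dau']} DAU")

async def cost_agent(ctx: dict) -> tuple:
    await asyncio.sleep(0)
    return ("cost", "multi-provider comparison")

async def run_agents(ctx: dict) -> dict:
    # Once the Conversation Manager has gathered requirements, the
    # domain agents only read shared context, so they can run concurrently
    results = await asyncio.gather(database_agent(ctx), cost_agent(ctx))
    return dict(results)

recommendations = asyncio.run(run_agents({"dau": 100_000}))
</div>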
</div>
<div class="decision-card">
<h4>Decision 4: Deployment Platform</h4>
<p><strong>Chosen:</strong> Railway (Hobby plan $5/month)</p>
<div class="pros-cons">
<div class="pros">
<strong>✅ Pros:</strong>
<ul>
<li>GitHub auto-deploy</li>
<li>Predictable pricing</li>
<li>Zero-downtime deploys</li>
<li>Built-in SSL</li>
<li>Simple environment management</li>
</ul>
</div>
<div class="cons">
<strong>❌ Rejected Alternatives:</strong>
<ul>
<li><strong>AWS:</strong> Complex, time-consuming setup</li>
<li><strong>Heroku:</strong> More expensive, being sunset</li>
<li><strong>Vercel:</strong> Serverless cold starts</li>
<li><strong>Railway Free:</strong> 500 hours/month limit</li>
</ul>
</div>
</div>
</div>
</div>
<!-- Implementation Journey -->
<div class="section">
<h2>🚀 Implementation Journey</h2>
<div class="timeline">
<div class="timeline-item">
<h4>Week 1: Agent Development</h4>
<p>Built 4 specialized agents (Database, Infrastructure, Cost, Security) with base class architecture and LLM integration. Implemented 8 tools for knowledge retrieval and computation.</p>
<p><strong>LOC:</strong> ~1,000 | <strong>Key Tech:</strong> Python, Anthropic Claude, Protocol-based tools</p>
</div>
<div class="timeline-item">
<h4>Week 2: LangGraph Orchestration</h4>
<p>Designed sequential workflow pipeline with state management. Implemented query parser for extracting DAU, compliance, and budget from natural language.</p>
<p><strong>LOC:</strong> ~500 | <strong>Key Tech:</strong> LangGraph, TypedDict state, Correlation IDs</p>
</div>
<div class="timeline-item">
<h4>Week 3: REST API Development</h4>
<p>Built production FastAPI with rate limiting, cost controls, and comprehensive error handling. Added Swagger/ReDoc documentation.</p>
<p><strong>LOC:</strong> ~400 | <strong>Key Tech:</strong> FastAPI, slowapi, Pydantic, CORS</p>
</div>
<div class="timeline-item">
<h4>Week 4: RAG System</h4>
<p>Implemented vector search with Qdrant. Curated 34 technical documents covering databases, infrastructure, and security. Used sentence-transformers for embeddings.</p>
<p><strong>LOC:</strong> ~500 | <strong>Key Tech:</strong> Qdrant, sentence-transformers, 384-d vectors</p>
</div>
<div class="timeline-item">
<h4>Week 5: Authentication & Frontend</h4>
<p>Built the modern web UI with vanilla JavaScript. Implemented JWT authentication, Google OAuth 2.0, and an admin dashboard. Replaced Streamlit for simpler deployment.</p>
<p><strong>LOC:</strong> ~400 | <strong>Key Tech:</strong> HTML/CSS/JS, JWT, bcrypt, Google OAuth</p>
</div>
<div class="timeline-item">
<h4>Week 6: Deployment & Polish</h4>
<p>Deployed to Railway. Fixed NumPy compatibility issues. Switched from the free tier to the Hobby plan ($5/month) due to the 500-hour monthly limit. Added comprehensive documentation.</p>
<p><strong>Status:</strong> ✅ Production-ready | <strong>Platform:</strong> Railway</p>
</div>
</div>
</div>
<!-- Challenges & Solutions -->
<div class="section">
<h2>⚡ Challenges & Solutions</h2>
<div class="challenge-card">
<h4>Challenge 1: sentence-transformers Compatibility</h4>
<p><strong>Problem:</strong> sentence-transformers imports broke when unpinned dependencies (huggingface_hub, NumPy 2.0) updated underneath it</p>
<div class="code-block">
ImportError: cannot import name 'cached_download' from 'huggingface_hub'
# Root cause: unpinned transitive dependencies; newer huggingface_hub
# releases removed cached_download, and sentence-transformers 2.x
# also breaks under NumPy 2.0
</div>
</div>
<div class="solution-card">
<h4>✅ Solution</h4>
<p>Pinned NumPy to <code>&lt;2.0.0</code> and constrained sentence-transformers in pyproject.toml:</p>
<div class="code-block">
[project]
dependencies = [
    "sentence-transformers>=2.2.2,<3.0.0",
    "numpy>=1.21.0,<2.0.0",  # Pin to NumPy 1.x
    "transformers>=4.30.0",
]
</div>
<p><strong>Lesson Learned:</strong> Always pin major versions in production dependencies</p>
</div>
<div class="challenge-card">
<h4>Challenge 2: Railway Free Tier Exceeded</h4>
<p><strong>Problem:</strong> App went down with "exceeded usage limit" error</p>
<p><strong>Investigation:</strong></p>
<ul>
<li>Free tier: 500 hours/month</li>
<li>Our usage: 24/7 × 30 days = 720 hours/month</li>
<li>Overage: 220 hours → app suspended</li>
</ul>
</div>
<div class="solution-card">
<h4>✅ Solution</h4>
<p>Upgraded to Railway Hobby plan ($5/month) for unlimited execution hours</p>
<p><strong>Why this was the right choice:</strong></p>
<ul>
<li>Predictable costs vs pay-per-use</li>
<li>Continuous availability</li>
<li>Still cheaper than AWS (when factoring in setup time)</li>
</ul>
</div>
<div class="challenge-card">
<h4>Challenge 3: Streamlit Deployment Complexity</h4>
<p><strong>Problem:</strong> Streamlit required separate service, WebSocket configuration, and complex CORS setup</p>
</div>
<div class="solution-card">
<h4>✅ Solution</h4>
<p>Rewrote frontend in vanilla HTML/CSS/JavaScript served directly from FastAPI</p>
<p><strong>Benefits:</strong></p>
<ul>
<li>Single service deployment (simplified architecture)</li>
<li>No WebSocket issues</li>
<li>Faster page loads (no framework overhead)</li>
<li>Single port (8000) instead of two</li>
</ul>
</div>
<div class="challenge-card">
<h4>Challenge 4: Cost Control at Scale</h4>
<p><strong>Problem:</strong> Needed to prevent runaway API costs from abuse or bugs</p>
</div>
<div class="solution-card">
<h4>✅ Solution</h4>
<p>Implemented multi-layer protection:</p>
<ul>
<li><strong>Rate limiting:</strong> 5 req/hour (demo), 50 req/hour (authenticated)</li>
<li><strong>Daily budget cap:</strong> $2.00 default, configurable</li>
<li><strong>Token tracking:</strong> Monitor per-request costs</li>
<li><strong>Query validation:</strong> Limit input length (10-1000 chars)</li>
</ul>
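<p>The budget-cap layer can be sketched as a guard consulted before every LLM call (illustrative names; the project's actual implementation may differ):</p>
<div class="code-block">
import datetime

class DailyBudget:
    """Rejects further LLM calls once the day's spend reaches the cap."""

    def __init__(self, cap_usd: float = 2.00):
        self.cap_usd = cap_usd
        self.spent_usd = 0.0
        self.day = datetime.date.today()

    def _roll_over(self):
        today = datetime.date.today()
        if today != self.day:  # new day: reset the running total
            self.day = today
            self.spent_usd = 0.0

    def allow(self, estimated_cost_usd: float) -> bool:
        self._roll_over()
        return self.spent_usd + estimated_cost_usd <= self.cap_usd

    def record(self, actual_cost_usd: float):
        self._roll_over()
        self.spent_usd += actual_cost_usd

budget = DailyBudget(cap_usd=2.00)
if budget.allow(0.0015):
    budget.record(0.0015)  # after the API call reports real token counts
</div>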
</div>
</div>
<!-- Memory Management -->
<div class="section">
<h2>🧠 Memory Management & Conversation Design</h2>
<h3>Short-Term Memory (Request Scope)</h3>
<p>Each request gets a unique correlation ID that tracks the request through all agents:</p>
<div class="code-block">
import uuid
from contextvars import ContextVar

from fastapi import FastAPI, Request

app = FastAPI()
correlation_id_var: ContextVar[str] = ContextVar('correlation_id')

@app.middleware("http")
async def add_correlation_id(request: Request, call_next):
    correlation_id = str(uuid.uuid4())
    correlation_id_var.set(correlation_id)
    logger.info("request_start", correlation_id=correlation_id)  # structlog-style logger
    response = await call_next(request)
    return response
</div>
<p><strong>Purpose:</strong> Debug issues, trace requests, performance analysis</p>
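<p>Downstream agents can then recover the ID from the context variable without it being threaded through every function signature. A minimal standalone sketch of the pattern (the agent function is illustrative):</p>
<div class="code-block">
import uuid
from contextvars import ContextVar

correlation_id_var: ContextVar = ContextVar("correlation_id", default="-")

def agent_log_line() -> str:
    # Any agent or tool can tag its log output with the request's ID
    return f"[{correlation_id_var.get()}] database agent running"

correlation_id_var.set(str(uuid.uuid4()))
print(agent_log_line())
</div>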
<h3>Long-Term Memory (Implemented with Qdrant)</h3>
<p>Persistent storage using Qdrant vector database with semantic search capabilities:</p>
<h4>Three Qdrant Collections:</h4>
<ol>
<li><strong>users:</strong> Authentication data, user profiles, usage statistics (total_queries, total_cost_usd)</li>
<li><strong>user_queries:</strong> Query history with 384-dimensional semantic embeddings for similarity search</li>
<li><strong>user_feedback:</strong> User feedback on recommendations for continuous improvement</li>
</ol>
<h4>Semantic Search Implementation:</h4>
<div class="code-block">
import uuid

from sentence_transformers import SentenceTransformer
from qdrant_client import QdrantClient
from qdrant_client.models import FieldCondition, Filter, MatchValue, PointStruct

class UserMemoryStore:
    def __init__(self):
        self.embedding_model = SentenceTransformer('all-MiniLM-L6-v2')  # 384-dim
        self.client = QdrantClient(url=QDRANT_URL, api_key=QDRANT_API_KEY)

    def store_query(self, user_id, query, recommendations, tokens_used, cost_usd):
        # Generate a semantic embedding of the query text
        query_embedding = self.embedding_model.encode(query).tolist()
        # Store with the vector so it is retrievable by similarity search
        self.client.upsert(
            collection_name="user_queries",
            points=[PointStruct(
                id=str(uuid.uuid4()),  # every point needs a unique ID
                vector=query_embedding,
                payload={
                    "user_id": user_id,
                    "query": query,
                    "recommendations": recommendations,
                    "tokens_used": tokens_used,
                    "cost_usd": cost_usd,
                },
            )],
        )

    def search_similar_queries(self, user_id, query, limit=5):
        # Find semantically similar past queries for this user
        query_embedding = self.embedding_model.encode(query).tolist()
        results = self.client.search(
            collection_name="user_queries",
            query_vector=query_embedding,
            query_filter=Filter(must=[
                FieldCondition(key="user_id", match=MatchValue(value=user_id)),
            ]),
            limit=limit,
        )
        return results  # scored points, most similar first
</div>
<h4>Enabled Features:</h4>
<ul>
<li><strong>Query History:</strong> "You asked something similar 2 days ago for a chat app"</li>
<li><strong>Semantic Search:</strong> Find related queries even with different wording</li>
<li><strong>User Statistics:</strong> Track total queries, cumulative cost per user</li>
<li><strong>Feedback Loop:</strong> Store and analyze user feedback on recommendations</li>
<li><strong>Cost Tracking:</strong> Monitor per-user API costs for budget controls</li>
</ul>
<h3>Multi-Turn Conversations (Implemented)</h3>
<p>Conversation Manager agent enables intelligent multi-turn dialogues with session-based memory:</p>
<h4>SessionStore Implementation:</h4>
<div class="code-block">
import time
import uuid

_sessions = {}  # module-level store; production would use Redis instead

class SessionStore:
    """In-memory short-term conversation memory (30-minute timeout)"""

    @staticmethod
    def create_session(user_id: str) -> str:
        session_id = str(uuid.uuid4())
        _sessions[session_id] = {
            "user_id": user_id,
            "conversation_history": [],   # all messages in the conversation
            "extracted_context": {},      # accumulated project requirements
            "completion_percentage": 0,   # how much info has been gathered
            "ready_for_recommendation": False,
        }
        return session_id

    @staticmethod
    def add_message(session_id: str, role: str, content: str):
        session = _sessions[session_id]  # look up the session first
        session["conversation_history"].append({
            "role": role,
            "content": content,
            "timestamp": time.time(),
        })
</div>
<h4>Conversation Flow:</h4>
<ol>
<li><strong>User starts conversation:</strong> "I need a tech stack for my project"</li>
<li><strong>Agent asks follow-up:</strong> "How many daily active users do you expect?"</li>
<li><strong>User responds:</strong> "Around 100K users"</li>
<li><strong>Agent continues:</strong> "What type of data will you be storing?"</li>
<li><strong>Context accumulates:</strong> extracted_context = {"dau": 100000, "data_type": "..."}</li>
<li><strong>Completion tracked:</strong> completion_percentage increases from 0% → 100%</li>
<li><strong>Ready signal:</strong> When ready_for_recommendation = True, system generates full recommendation</li>
</ol>
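<p>Step 6 can be sketched as a ratio of gathered to required fields (the field names here are assumptions for illustration):</p>
<div class="code-block">
REQUIRED_FIELDS = ("dau", "data_type", "budget_usd", "compliance")

def completion_percentage(extracted_context: dict) -> int:
    """Share of required requirement fields the conversation has filled in."""
    filled = sum(1 for f in REQUIRED_FIELDS if extracted_context.get(f) is not None)
    return int(100 * filled / len(REQUIRED_FIELDS))

ctx = {"dau": 100_000, "data_type": "chat messages"}
pct = completion_percentage(ctx)  # 50: two of four fields gathered
ready_for_recommendation = pct == 100
</div>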
<h4>Enabled Multi-Turn Queries:</h4>
<ul>
<li>"What if I increase the budget to $1000?" → Updates context, regenerates recommendations</li>
<li>"Can you recommend alternatives to PostgreSQL?" → Refines database recommendations</li>
<li>"How would this change for 1M users instead?" → Re-runs all agents with new scale</li>
</ul>
<p><strong>Note:</strong> Production systems should migrate from in-memory SessionStore to Redis for persistence across server restarts and multi-instance deployments.</p>
</div>
<!-- Authentication -->
<div class="section">
<h2>🔐 Authentication & Security</h2>
<h3>Why Authentication Was Necessary</h3>
<ol>
<li><strong>Cost Control:</strong> Prevent abuse of expensive LLM API calls (~$0.0015/query)</li>
<li><strong>Rate Limiting:</strong> Enforce per-user limits instead of per-IP</li>
<li><strong>Audit Trail:</strong> Track who makes what requests for debugging</li>
<li><strong>Feature Access:</strong> Enable user profiles, query history, saved recommendations</li>
<li><strong>Admin Features:</strong> Manage users, view feedback, monitor system health</li>
</ol>
<h3>Authentication Implementation</h3>
<div class="feature-grid">
<div class="feature-card">
<h4>JWT Tokens</h4>
<p>Stateless authentication with 1-hour expiration. Tokens include user email and role (user/admin).</p>
</div>
<div class="feature-card">
<h4>Password Security</h4>
<p>bcrypt hashing with salt rounds. Passwords never stored in plain text or logged.</p>
</div>
<div class="feature-card">
<h4>Google OAuth 2.0</h4>
<p>Social login with state parameter for CSRF protection. User passwords stay at Google.</p>
</div>
<div class="feature-card">
<h4>Rate Limiting</h4>
<p>Per-user limits: 50 req/hour authenticated vs 5 req/hour demo mode.</p>
</div>
</div>
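<p>The token layout described above can be illustrated with a stdlib-only HMAC signature. This is a sketch of the structure, not the service's code: production uses a maintained JWT library, and the claim names are assumptions.</p>
<div class="code-block">
import base64
import hashlib
import hmac
import json
import time

SECRET = b"change-me"  # loaded from an environment variable in production

def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_token(email: str, role: str, ttl_seconds: int = 3600) -> str:
    """HS256-style token: header.payload.signature, 1-hour expiry."""
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64(json.dumps({
        "sub": email,
        "role": role,  # "user" or "admin"
        "exp": int(time.time()) + ttl_seconds,
    }).encode())
    signing_input = f"{header}.{payload}".encode()
    signature = _b64(hmac.new(SECRET, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{signature}"

def verify_token(token: str) -> dict:
    header, payload, signature = token.split(".")
    signing_input = f"{header}.{payload}".encode()
    expected = _b64(hmac.new(SECRET, signing_input, hashlib.sha256).digest())
    if not hmac.compare_digest(signature, expected):
        raise ValueError("invalid signature")
    claims = json.loads(base64.urlsafe_b64decode(payload + "=" * (-len(payload) % 4)))
    if claims["exp"] < time.time():
        raise ValueError("token expired")
    return claims
</div>
<p>Because the token is stateless, the server only needs the secret to validate a request; no session lookup is required.</p>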
<h3>Security Measures</h3>
<ul>
<li><strong>Input Validation:</strong> Pydantic schemas validate all inputs</li>
<li><strong>XSS Prevention:</strong> Content-Security-Policy headers</li>
<li><strong>CSRF Protection:</strong> JWT tokens + OAuth state parameter</li>
<li><strong>CORS Configuration:</strong> Restrict origins in production</li>
<li><strong>SQL Injection:</strong> Parameterized queries with SQLAlchemy</li>
</ul>
<h3>Rate Limiting Implementation (SlowAPI)</h3>
<p>
The system implements comprehensive rate limiting using <strong>SlowAPI</strong>, a FastAPI extension built on the <code>limits</code> library's window-based counters with in-memory storage. This protects against abuse and controls API costs.
</p>
<h4 style="color: #667eea; font-size: 1.2em; margin: 20px 0 10px;">Architecture</h4>
<div class="code-block">
# backend/src/api/main.py
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.util import get_remote_address
from slowapi.errors import RateLimitExceeded
# Initialize limiter with IP-based tracking
limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter
# Register exception handler for HTTP 429 responses
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)
</div>
<h4 style="color: #667eea; font-size: 1.2em; margin: 20px 0 10px;">Configuration</h4>
<div class="code-block">
# backend/src/core/config.py
class Settings(BaseSettings):
    rate_limit_demo: str = "50/hour"            # Demo/unauthenticated users
    rate_limit_authenticated: str = "100/hour"  # Authenticated users
    daily_query_cap: int = 100                  # Daily limit per user
</div>
<h4 style="color: #667eea; font-size: 1.2em; margin: 20px 0 10px;">Applied to Endpoints</h4>
<div class="code-block">
@app.post("/recommend")
@limiter.limit(settings.rate_limit_demo) # 50 requests/hour by IP
async def get_recommendation(request: Request, req: RecommendationRequest):
# Endpoint logic
pass
@app.post("/generate-diagram")
@limiter.limit(settings.rate_limit_demo)
async def generate_architecture_diagram(request: Request, req: dict):
pass
@app.post("/conversation/start")
@limiter.limit(settings.rate_limit_demo)
async def start_conversation(request: Request):
pass
</div>
<h4 style="color: #667eea; font-size: 1.2em; margin: 20px 0 10px;">How It Works</h4>
<ul>
<li><strong>IP-Based Tracking:</strong> <code>get_remote_address</code> extracts client IP from request headers</li>
<li><strong>Windowed Counting:</strong> Requests per IP are counted within a time window (e.g., the last hour); the underlying <code>limits</code> library supports both fixed- and moving-window strategies</li>
<li><strong>Automatic Enforcement:</strong> Returns HTTP 429 (Too Many Requests) with <code>Retry-After</code> header when limit exceeded</li>
<li><strong>Per-Endpoint Limits:</strong> Each decorated endpoint maintains independent rate limits</li>
<li><strong>In-Memory Storage:</strong> Fast lookup with minimal latency (suitable for single-instance deployments)</li>
</ul>
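<p>The windowed check itself is simple. This toy limiter (a sketch of the idea, not SlowAPI's actual implementation) keeps a timestamp queue per key and evicts entries older than the window:</p>
<div class="code-block">
# Toy sliding-window limiter -- illustrates the idea, not SlowAPI internals
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    def __init__(self, limit: int, window: float):
        self.limit = limit      # e.g. 50 requests...
        self.window = window    # ...per 3600 seconds
        self.hits = defaultdict(deque)

    def allow(self, key: str, now=None) -> bool:
        """Record a hit for `key`; False means the caller should send HTTP 429."""
        now = time.monotonic() if now is None else now
        q = self.hits[key]
        while q and now - q[0] >= self.window:
            q.popleft()         # evict hits that aged out of the window
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True
</div>
<p>Because state lives in a plain in-process dict, limits vanish on restart and are not shared between instances, which is exactly the limitation noted further below.</p>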
<h4 style="color: #667eea; font-size: 1.2em; margin: 20px 0 10px;">Benefits</h4>
<div class="feature-grid">
<div class="feature-card">
<h4 style="font-size: 1.1em;">💰 Cost Control</h4>
<p>Prevents LLM API cost spiral from excessive requests</p>
</div>
<div class="feature-card">
<h4 style="font-size: 1.1em;">🛡️ Abuse Prevention</h4>
<p>Protects against denial-of-service attempts</p>
</div>
<div class="feature-card">
<h4 style="font-size: 1.1em;">⚖️ Fair Resource Allocation</h4>
<p>Ensures equitable access among all users</p>
</div>
<div class="feature-card">
<h4 style="font-size: 1.1em;">🚀 Production-Ready</h4>
<p>Battle-tested library with minimal overhead</p>
</div>
<div class="feature-card">
<h4 style="font-size: 1.1em;">⚙️ Configurable</h4>
<p>Different limits for demo vs authenticated users (50/hour vs 100/hour)</p>
</div>
</div>
<h4 style="color: #667eea; font-size: 1.2em; margin: 20px 0 10px;">Limitations & Future Enhancements</h4>
<ul>
<li><strong>In-Memory Storage:</strong> Limits reset on server restart; consider Redis backend for production clusters</li>
<li><strong>IP-Based Only:</strong> Sophisticated users can bypass with IP rotation; consider user-based limits</li>
<li><strong>No Distributed Sync:</strong> Multi-instance deployments need shared state (Redis/Memcached)</li>
</ul>
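<p>The user-based limits mentioned above need only a different key function. A sketch, assuming upstream auth middleware stores the verified email on <code>request.state</code> (an assumption about this app's internals, not confirmed by the source):</p>
<div class="code-block">
# Sketch: rate-limit key that prefers user identity over client IP
def user_or_ip_key(request) -> str:
    user = getattr(getattr(request, "state", None), "user_email", None)
    if user:
        return f"user:{user}"        # authenticated: per-account limit
    client = getattr(request, "client", None)
    return f"ip:{client.host}" if client else "ip:unknown"

# Wiring (assumed, mirroring the Architecture snippet above):
#   limiter = Limiter(key_func=user_or_ip_key)
</div>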
</div>
<!-- Performance & Scalability -->
<div class="section">
<h2>📈 Performance & Scalability</h2>
<h3>Current Performance Metrics</h3>
<table>
<tr>
<th>Metric</th>
<th>Value</th>
<th>Notes</th>
</tr>
<tr>
<td>Total Response Time</td>
<td>2-4 seconds</td>
<td>Includes all agents + parsing</td>
</tr>
<tr>
<td>LLM Latency</td>
<td>~3.3 seconds</td>
<td>~99.8% of total time (see bottleneck analysis)</td>
</tr>
<tr>
<td>RAG Search</td>
<td>~30ms</td>
<td>Vector search across 34 docs</td>
</tr>
<tr>
<td>Query Parsing</td>
<td>1-5ms</td>
<td>NLP extraction</td>
</tr>
<tr>
<td>Tokens Per Query</td>
<td>~6,250</td>
<td>Across all agents</td>
</tr>
<tr>
<td>Cost Per Query</td>
<td>$0.0015</td>
<td>Claude Haiku pricing</td>
</tr>
</table>
<h3>Bottleneck Analysis</h3>
<p><strong>Current Architecture (Sequential):</strong></p>
<ul>
<li>Parse Query: 5ms</li>
<li>Database Agent: 800ms</li>
<li>Infrastructure Agent: 900ms</li>
<li>Cost Agent: 850ms</li>
<li>Security Agent: 700ms</li>
<li><strong>Total: 3,255ms</strong></li>
</ul>
<p><strong>Optimized Architecture (Parallel - Future):</strong></p>
<ul>
<li>Parse Query: 5ms</li>
<li>All Agents (Parallel): 900ms (slowest agent)</li>
<li><strong>Total: 905ms (3.6× faster)</strong></li>
</ul>
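<p>The parallel variant is mostly an orchestration change. A sketch with <code>asyncio.gather</code> (the agent functions are stand-ins for real LLM calls, with latencies scaled down 10× so the demo runs quickly):</p>
<div class="code-block">
# Parallel fan-out sketch: total wall time collapses to the slowest agent
import asyncio
import time

async def run_agent(name: str, latency_s: float) -> str:
    await asyncio.sleep(latency_s)  # stand-in for an LLM round trip
    return f"{name}: done"

# Measured latencies from the analysis above, scaled down 10x for the demo
AGENTS = [("database", 0.08), ("infrastructure", 0.09),
          ("cost", 0.085), ("security", 0.07)]

async def recommend_parallel():
    return await asyncio.gather(*(run_agent(n, t) for n, t in AGENTS))

start = time.perf_counter()
results = asyncio.run(recommend_parallel())
elapsed = time.perf_counter() - start  # roughly the slowest agent, not the sum
</div>
<p>The trade-off: parallel agents cannot read each other's intermediate output, so this only works while the four specialists stay independent.</p>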
<h3>Scalability Analysis</h3>
<table>
<tr>
<th>Load Level</th>
<th>Requests/Day</th>
<th>Monthly Cost</th>
<th>Infrastructure</th>
</tr>
<tr>
<td>Demo</td>
<td>100</td>
<td>$4.50 API + $5 hosting = $9.50</td>
<td>Single Railway instance</td>
</tr>
<tr>
<td>Small Business</td>
<td>1,000</td>
<td>$45 API + $5 hosting = $50</td>
<td>Single Railway instance</td>
</tr>
<tr>
<td>Growing Startup</td>
<td>10,000</td>
<td>$450 API + $25 hosting = $475</td>
<td>2-3 Railway instances + load balancer</td>
</tr>
<tr>
<td>Enterprise</td>
<td>100,000</td>
<td>$4,500 API + $500 infrastructure = $5,000</td>
<td>Kubernetes cluster, Redis cache</td>
</tr>
</table>
</div>
<!-- Monitoring & Observability -->
<div class="section">
<h2>📊 Monitoring & Observability</h2>
<p>
The Tech Stack Advisor includes comprehensive monitoring capabilities with Prometheus-format metrics, structured logging, and Grafana Cloud integration for production-grade observability.
</p>
<h3>Prometheus Metrics Endpoint</h3>
<p>
The system exposes metrics at <code style="background: #f0f4ff; padding: 4px 8px; border-radius: 4px; color: #667eea;">/metrics/prometheus</code> in Prometheus format for seamless integration with monitoring systems:
</p>
<div class="code-block">
# Access Prometheus metrics (requires JWT authentication)
curl http://localhost:8000/metrics/prometheus \
  -H "Authorization: Bearer &lt;your-jwt-token&gt;"
</div>
<h3>Available Metrics</h3>
<h4 style="color: #667eea; font-size: 1.2em; margin: 20px 0 10px;">HTTP Request Metrics</h4>
<ul>
<li><code style="background: #f0f4ff; padding: 2px 6px; border-radius: 3px;">http_requests_total{method, endpoint, status_code}</code> - Total HTTP requests counter with labels for method, endpoint, and status code</li>
<li><code style="background: #f0f4ff; padding: 2px 6px; border-radius: 3px;">http_request_duration_seconds{method, endpoint}</code> - HTTP request duration histogram for calculating p50, p95, p99 latencies</li>
</ul>
<h4 style="color: #667eea; font-size: 1.2em; margin: 20px 0 10px;">LLM Usage & Cost Tracking</h4>
<ul>
<li><code style="background: #f0f4ff; padding: 2px 6px; border-radius: 3px;">llm_tokens_total{agent, token_type}</code> - Token usage by agent (input/output tokens)</li>
<li><code style="background: #f0f4ff; padding: 2px 6px; border-radius: 3px;">llm_cost_usd_total{agent}</code> - Cumulative API cost per agent in USD</li>
<li><code style="background: #f0f4ff; padding: 2px 6px; border-radius: 3px;">llm_requests_total{agent, status}</code> - LLM request count by agent and status (success/error)</li>
<li><code style="background: #f0f4ff; padding: 2px 6px; border-radius: 3px;">llm_daily_tokens</code> - Daily token usage gauge (resets at midnight UTC)</li>
<li><code style="background: #f0f4ff; padding: 2px 6px; border-radius: 3px;">llm_daily_cost_usd</code> - Daily cost in USD gauge</li>
<li><code style="background: #f0f4ff; padding: 2px 6px; border-radius: 3px;">llm_daily_queries</code> - Daily query count gauge</li>
</ul>
<h4 style="color: #667eea; font-size: 1.2em; margin: 20px 0 10px;">Application Metrics</h4>
<ul>
<li><code style="background: #f0f4ff; padding: 2px 6px; border-radius: 3px;">active_conversation_sessions</code> - Number of active conversation sessions</li>
<li><code style="background: #f0f4ff; padding: 2px 6px; border-radius: 3px;">user_registrations_total{oauth_provider}</code> - Total user registrations by OAuth provider (local/google)</li>
<li><code style="background: #f0f4ff; padding: 2px 6px; border-radius: 3px;">user_logins_total{oauth_provider}</code> - Total user logins by provider</li>
<li><code style="background: #f0f4ff; padding: 2px 6px; border-radius: 3px;">recommendations_total{status, authenticated}</code> - Total recommendations generated with status and auth labels</li>
</ul>
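<p>In the app these series come from the metrics library; to show what the <code>/metrics/prometheus</code> response looks like on the wire, here is a hand-rolled formatter for the Prometheus text exposition format (sample values are made up for illustration):</p>
<div class="code-block">
# Hand-rolled Prometheus text exposition format (illustration only)
def prometheus_lines(name, help_text, metric_type, samples):
    """samples: list of (labels_dict, value) pairs for one metric family."""
    out = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        out.append(f"{name}{{{label_str}}} {value}" if labels
                   else f"{name} {value}")
    return out

lines = prometheus_lines(
    "llm_tokens_total", "Token usage by agent", "counter",
    [({"agent": "database", "token_type": "input"}, 41250),
     ({"agent": "database", "token_type": "output"}, 9800)])
</div>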
<h3>Grafana Cloud Integration</h3>
<p>
The application integrates seamlessly with Grafana Cloud for real-time monitoring dashboards and alerting. The free tier provides:
</p>
<div class="feature-grid">
<div class="feature-card">
<h4>📊 Metrics Storage</h4>
<p>10,000 metric series with 14-day retention</p>
</div>
<div class="feature-card">
<h4>📈 Real-time Dashboards</h4>
<p>Customizable dashboards for HTTP, LLM, and application metrics</p>
</div>
<div class="feature-card">
<h4>🔔 Alerting</h4>
<p>Alert on cost thresholds, error rates, and latency spikes</p>
</div>
<div class="feature-card">
<h4>💰 Cost</h4>
<p>$0/month for free tier (suitable for demo/small projects)</p>
</div>
</div>
<p style="margin-top: 20px;">
<strong>Setup Guide:</strong> See <a href="https://github.com/ranjanarajendran/tech-stack-advisor/blob/main/GRAFANA_CLOUD_SETUP.md" target="_blank" style="color: #667eea; font-weight: bold;">GRAFANA_CLOUD_SETUP.md</a> for complete configuration instructions. (Private repo - <a href="mailto:ranjana.rajendran@gmail.com?subject=Tech%20Stack%20Advisor%20Access%20Request" style="color: #667eea;">request access</a> if needed)
</p>
<h3>Example PromQL Queries</h3>
<p>Common queries for monitoring the application in Grafana:</p>
<div class="code-block">
# Request rate (requests per second)
rate(http_requests_total[5m])
# P95 latency across all endpoints
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
# Daily LLM cost tracking
llm_daily_cost_usd
# Error rate percentage
sum(rate(http_requests_total{status_code=~"5.."}[5m])) /
sum(rate(http_requests_total[5m])) * 100
# Token usage by agent
sum by (agent) (llm_tokens_total)
# Active sessions gauge
active_conversation_sessions
</div>
<h3>Structured Logging</h3>
<p>
All logs are emitted in structured JSON format using <code style="background: #f0f4ff; padding: 4px 8px; border-radius: 4px; color: #667eea;">structlog</code> with correlation IDs for request tracing:
</p>
<div class="code-block">
{
  "event": "recommendation_generated",
  "correlation_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "user_id": "user@example.com",
  "tokens_used": 6250,
  "cost_usd": 0.0015,
  "duration_ms": 3245,
  "timestamp": "2024-01-15T10:30:45.123Z"
}
</div>
<p><strong>Benefits:</strong></p>
<ul>
<li><strong>Request Tracing:</strong> Correlation IDs track requests through all agents and services</li>
<li><strong>Debugging:</strong> Structured logs enable powerful filtering and aggregation (e.g., "show all errors for correlation_id X")</li>
<li><strong>Performance Analysis:</strong> Track duration and cost for individual requests</li>
<li><strong>Cost Control:</strong> Monitor per-user API costs and daily spending</li>
</ul>
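<p>The same pattern (JSON lines carrying a correlation ID propagated through a context variable) can be sketched with the standard library alone; the app itself uses <code>structlog</code>, and this approximation does not reproduce its API:</p>
<div class="code-block">
# Stdlib approximation of structlog-style JSON logging with correlation IDs
import contextvars
import json
import logging
import uuid

correlation_id = contextvars.ContextVar("correlation_id", default=None)

class JsonFormatter(logging.Formatter):
    def format(self, record):
        entry = {"event": record.getMessage(),
                 "level": record.levelname.lower(),
                 "correlation_id": correlation_id.get()}
        entry.update(getattr(record, "extra_fields", {}))
        return json.dumps(entry)

# Per request: bind one ID, and every log line emitted while handling
# that request carries it automatically.
correlation_id.set(str(uuid.uuid4()))

log = logging.getLogger("advisor")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log.addHandler(handler)
log.setLevel(logging.INFO)
log.info("recommendation_generated",
         extra={"extra_fields": {"tokens_used": 6250, "cost_usd": 0.0015}})
</div>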
</div>
<!-- Technology Stack -->
<div class="section">
<h2>🛠️ Technology Stack</h2>
<h3>Backend</h3>
<div class="tech-stack">
<span class="tech-tag">Python 3.11+</span>
<span class="tech-tag">FastAPI</span>
<span class="tech-tag">Pydantic</span>
<span class="tech-tag">LangChain</span>
<span class="tech-tag">LangGraph</span>
<span class="tech-tag">Anthropic Claude</span>
<span class="tech-tag">sentence-transformers</span>
<span class="tech-tag">Qdrant</span>
<span class="tech-tag">structlog</span>
<span class="tech-tag">slowapi</span>
<span class="tech-tag">bcrypt</span>
<span class="tech-tag">PyJWT</span>
</div>
<h3>Frontend</h3>
<div class="tech-stack">
<span class="tech-tag">HTML5</span>
<span class="tech-tag">CSS3</span>
<span class="tech-tag">JavaScript (ES6+)</span>
<span class="tech-tag">JWT localStorage</span>
</div>
<h3>Development & Testing</h3>
<div class="tech-stack">
<span class="tech-tag">pytest</span>
<span class="tech-tag">mypy</span>
<span class="tech-tag">ruff</span>
<span class="tech-tag">uvicorn</span>
</div>
<h3>Infrastructure</h3>
<div class="tech-stack">
<span class="tech-tag">Railway</span>
<span class="tech-tag">GitHub Auto-deploy</span>
<span class="tech-tag">SSL/HTTPS</span>
<span class="tech-tag">SQLite</span>
</div>
</div>
<!-- Lessons Learned -->
<div class="section">
<h2>📚 Lessons Learned</h2>
<h3>1. Simplicity Wins</h3>
<p>Choosing vanilla JavaScript over React avoided weeks of framework complexity. No build step means faster iteration and simpler deployment.</p>
<h3>2. Cost-Conscious Architecture</h3>
<p>Choosing Claude Haiku over GPT-4 saved 95% on API costs without sacrificing quality. Always benchmark cheaper alternatives.</p>
<h3>3. Dependency Hell is Real</h3>
<p>The NumPy 2.0 breaking change taught us to pin major versions and test upgrades carefully.</p>
<h3>4. Platform Matters</h3>
<p>Railway's $5/month hobby plan is worth it vs fighting with free tier limits. Developer time is expensive.</p>
<h3>5. Multi-Agent Design</h3>
<p>Specialized agents with focused prompts outperform monolithic prompts for complex tasks.</p>
<h3>6. Authentication is Non-Negotiable</h3>
<p>Even for "free" services, authentication prevents abuse and enables valuable features like personalization.</p>
<h3>7. Monitor Everything</h3>
<p>Correlation IDs, structured logging, and cost tracking saved countless debugging hours.</p>
</div>
</div>
<!-- Call to Action -->
<div class="cta-section">
<h2>Ready to Learn More?</h2>
<p style="font-size: 1.2em; margin: 20px 0;">
This project showcases production-ready AI engineering, modern web development, and cloud deployment expertise.
</p>
<a href="https://ranjana-tech-stack-advisor-production.up.railway.app" class="cta-button">Try Live Demo</a>
<a href="https://github.com/ranjanarajendran/tech-stack-advisor" class="cta-button">View on GitHub</a>
<p style="font-size: 0.9em; margin-top: 20px; opacity: 0.9;">
Private repository - <a href="mailto:ranjana.rajendran@gmail.com?subject=Tech%20Stack%20Advisor%20Repository%20Access%20Request" style="color: white; text-decoration: underline;">Request access</a> if needed
</p>
</div>
<!-- Footer -->
<footer>
<p>Built by Ranjana Rajendran | <a href="mailto:ranjana.rajendran@gmail.com" style="color: white;">ranjana.rajendran@gmail.com</a></p>
<p style="margin-top: 10px; opacity: 0.8;">
Technologies: Python • FastAPI • LangGraph • Anthropic Claude • Qdrant • JWT • Google OAuth • Railway
</p>
</footer>
</body>
</html>