[Hands On] Building an AI Assistant That Selects the Best LLM for Each Question Type

1. Overview

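As AI-driven business models gain importance across every industry, a wide variety of LLMs are being developed at a rapid pace.
These language models differ widely in performance and in the kind of answers they can generate, depending on their characteristics and the needs they target.
Rather than sticking to a single LLM for every requirement, it is therefore becoming important to choose a model per situation based on expertise, performance, and response speed.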

Recent LLMs are quite general-purpose,
but the capabilities and priorities a request calls for differ with its nature, for example:

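Tasks where accuracy matters, such as math and coding,
Tasks where creativity matters, such as creative writing and storytelling,
Tasks where speed and cost matter, such as translation and summarization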

If we build an AI service that uses each model only for what it does best, we can create business models that are both more effective and more efficient.

In this hands-on, we therefore analyze the nature of each user question
and build a multi-LLM AI Assistant that selects the model (served through Amazon Bedrock) best suited to the context, calls it, and returns its response.
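
The classify, select, call, respond flow described above can be sketched in a few lines. This is only an illustration under stated assumptions: the category names, the keyword-based `classify` function, and the `fake_model` stub are hypothetical stand-ins (the actual router in app.py asks an LLM to classify), while the model IDs are the ones used in app.py later in this post.

```python
from typing import Callable, Dict

# Question category -> Bedrock model ID. The categories are illustrative;
# the model IDs are the ones used in app.py later in this post.
ROUTES: Dict[str, str] = {
    "math": "amazon.nova-pro-v1:0",
    "creative": "meta.llama3-70b-instruct-v1:0",
    "chat": "amazon.nova-lite-v1:0",
}


def classify(question: str) -> str:
    """Toy keyword classifier; a stand-in for the LLM-based router."""
    q = question.lower()
    if any(ch.isdigit() for ch in q):
        return "math"
    if "story" in q or "poem" in q:
        return "creative"
    return "chat"


def answer(question: str, call_model: Callable[[str, str], str]) -> Dict[str, str]:
    """Classify, pick a model, call it, and return the reply with routing info."""
    category = classify(question)
    model_id = ROUTES[category]
    return {
        "category": category,
        "model_id": model_id,
        "response": call_model(model_id, question),
    }


def fake_model(model_id: str, question: str) -> str:
    """Hypothetical stub used in place of a real Bedrock call."""
    return f"[{model_id}] reply to: {question}"


result = answer("What is 125 times 37?", fake_model)
print(result["model_id"])  # amazon.nova-pro-v1:0
```

Passing the model call in as a function keeps the routing logic testable without any cloud credentials.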

2. Goals

Classify user intent and question type by analyzing the question
Automatically select a model by question type in a multi-LLM environment
Build the AI Assistant API on a Flask-based web backend

3. Setup

1. Nginx reverse proxy setup for deploying the web service

1) Install NGINX

# Install NGINX
yum install nginx -y

# Start the service
systemctl start nginx

# Enable start on boot
systemctl enable nginx

# Check status
systemctl status nginx

2) Configure the proxy

# Edit the proxy configuration
nano /etc/nginx/nginx.conf

2. Python-based Flask web backend setup

1) Create index.html for the web UI

<!-- templates/index.html -->
<!DOCTYPE html>
<html lang="ko">
<head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>Saem's AI Assistant</title>

    <!-- ======================= -->
    <!--        CSS AREA         -->
    <!-- ======================= -->

    <style>
        /* Reset rules */

        * {
            margin: 0;
            padding: 0;
            box-sizing: border-box;
        }

        /* Color Theme Variables */

        :root {
            --primary-blue: #4A90E2;
            --light-blue: #E8F4F8;
            --hover-blue: #357ABD;
            --border-blue: #B8D4E8;
            --text-dark: #2C3E50;
            --text-light: #7F8C8D;
            --bg-white: #FFFFFF;
            --bg-gray: #F5F7FA;
            --shadow: rgba(74, 144, 226, 0.1);
            --assistant-bubble: #DFF0FF;
            --user-bubble: #4A90E2;
            --accent-green: #27AE60;
            --accent-purple: #9B59B6;
        }

        body {
            font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", Arial, sans-serif;
            background: linear-gradient(135deg, #E8F4F8, #F5F7FA);
            min-height: 100vh;
            display: flex;
            flex-direction: column;
        }

        /* Header block */

        .header {
            width: 100%;
            padding: 20px;
            border-bottom: 2px solid var(--border-blue);
            background: var(--bg-white);
            box-shadow: 0 2px 8px var(--shadow);
        }

        .header-content {
            max-width: 1200px;
            margin: auto;
            display: flex;
            align-items: center;
            justify-content: space-between;
        }

        .logo {
            font-size: 24px;
            font-weight: 600;
            color: var(--primary-blue);
        }

        .header-actions {
            display: flex;
            gap: 12px;
        }

        .btn {
            padding: 10px 20px;
            border-radius: 10px;
            border: none;
            cursor: pointer;
            font-size: 14px;
            transition: 0.25s;
        }

        .btn-secondary {
            background: var(--light-blue);
            color: var(--primary-blue);
        }

        .btn-secondary:hover {
            background: var(--border-blue);
        }

        .btn-primary {
            background: var(--primary-blue);
            color: var(--bg-white);
        }

        .btn-primary:hover {
            background: var(--hover-blue);
        }

        /* Main layout */

        .main-container {
            flex: 1;
            padding: 20px;
            display: flex;
            justify-content: center;
        }

        .chat-container {

            width: 100%;
            max-width: 1200px;
            background: var(--bg-white);
            height: calc(100vh - 140px);
            border-radius: 16px;
            overflow: hidden;
            box-shadow: 0 4px 20px var(--shadow);
            display: flex;
            flex-direction: column;
        }

        /* model indicator */

        .model-indicator {
            padding: 16px 20px;
            background: linear-gradient(135deg, var(--light-blue), #F0F8FF);
            border-bottom: 1px solid var(--border-blue);
            display: none;
            flex-direction: column;
            gap: 8px;
            font-size: 13px;
        }

        .model-indicator.active {
            display: flex;
        }

        .model-row {
            display: flex;
            align-items: center;
            gap: 10px;
        }

        .model-badge {
            background: var(--primary-blue);
            padding: 5px 12px;
            border-radius: 12px;
            color: white;
            font-size: 12px;
            font-weight: 600;
        }

        .specialist-badge {
            background: var(--accent-purple);
            padding: 5px 12px;
            border-radius: 12px;
            color: white;
            font-size: 12px;
            font-weight: 600;
        }

        .router-badge {
            background: var(--accent-green);
            padding: 5px 12px;
            border-radius: 12px;
            color: white;
            font-size: 11px;
            font-weight: 500;
        }

        .routing-reason {
            color: var(--text-light);
            font-size: 12px;
            font-style: italic;
        }

        /* messages display  */

        .messages-container {
            flex: 1;
            overflow-y: auto;
            padding: 25px 30px 10px;
            scroll-behavior: smooth;
        }

        .message {
            margin-bottom: 16px;
            max-width: 85%;
            display: flex;
            animation: fadeIn 0.25s ease-out;
        }

        .message.user {
            justify-content: flex-end;
        }

        .message.assistant {
            justify-content: flex-start;
            flex-direction: column;
        }

        .message .bubble {
            padding: 14px 18px;
            border-radius: 14px;
            line-height: 1.5;
            font-size: 15px;
            max-width: 100%;
            word-wrap: break-word;
        }

        .message.user .bubble {
            background: var(--user-bubble);
            color: white;
        }

        .message.assistant .bubble {
            background: var(--assistant-bubble);
            border: 1px solid var(--border-blue);
            color: var(--text-dark);
        }

        .message-meta {
            margin-top: 6px;
            font-size: 11px;
            color: var(--text-light);
            display: flex;
            gap: 6px;
            flex-wrap: wrap;
            align-items: center;
        }

        .assistant-model {
            background: var(--primary-blue);
            color: white;
            padding: 3px 8px;
            border-radius: 6px;
            font-size: 10px;
        }

        .specialist-info {
            background: var(--accent-purple);
            color: white;
            padding: 3px 8px;
            border-radius: 6px;
            font-size: 10px;
        }

        /* typing animation - improved loading indicator */

        .typing-indicator {
            background: var(--assistant-bubble);
            border: 1px solid var(--border-blue);
            padding: 16px 20px;
            border-radius: 14px;
            width: fit-content;
            display: none;
            flex-direction: column;
            gap: 8px;
        }

        .typing-indicator.active {
            display: flex;
        }

        .typing-status {
            font-size: 12px;
            color: var(--text-light);
            margin-bottom: 4px;
        }

        .typing-dots {
            display: flex;
            gap: 6px;
        }

        .typing-dot {
            width: 8px;
            height: 8px;
            background: var(--primary-blue);
            border-radius: 50%;
            display: inline-block;
            animation: typing 1.5s infinite ease-out;
        }

        .typing-dot:nth-child(2) {
            animation-delay: 0.2s;
        }

        .typing-dot:nth-child(3) {
            animation-delay: 0.4s;
        }

        @keyframes typing {
            0%, 60%, 100% { transform: translateY(0); opacity: 0.5; }
            30% { transform: translateY(-7px); opacity: 1; }
        }

        /* input box */

        .input-container {
            border-top: 2px solid var(--border-blue);
            padding: 20px;
            background: var(--bg-gray);
        }

        .input-wrapper {
            display: flex;
            gap: 12px;
        }

        #messageInput {
            flex: 1;
            padding: 15px 18px;
            font-size: 15px;
            border-radius: 12px;
            border: 2px solid var(--border-blue);
        }

        #messageInput:focus {
            outline: none;
            border-color: var(--primary-blue);
            box-shadow: 0 0 4px rgba(74, 144, 226, 0.2);
        }

        #sendButton {
            padding: 15px 25px;
            border-radius: 12px;
            border: none;
            font-size: 16px;
            cursor: pointer;
            background: var(--primary-blue);
            color: white;
            font-weight: bold;
        }

        #sendButton:hover {
            background: var(--hover-blue);
        }

        #sendButton:disabled {
            opacity: 0.4;
            cursor: not-allowed;
        }

        /* welcome state */

        .empty-state {
            text-align: center;
            margin-top: 80px;
        }

        .empty-state h2 {
            font-size: 26px;
            font-weight: bold;
            margin-bottom: 12px;
            color: var(--primary-blue);
        }

        .suggestion-chips {
            margin-top: 20px;
            display: flex;
            gap: 10px;
            flex-wrap: wrap;
            justify-content: center;
        }

        .chip {
            padding: 10px 18px;
            border-radius: 20px;
            background: var(--assistant-bubble);
            border: 1px solid var(--primary-blue);
            cursor: pointer;
            transition: 0.25s;
        }

        .chip:hover {
            background: var(--primary-blue);
            color: white;
            transform: translateY(-3px);
        }

        /* scrollbar */

        ::-webkit-scrollbar {
            width: 6px;
        }

        ::-webkit-scrollbar-thumb {
            background: var(--primary-blue);
            border-radius: 6px;
        }

        /* media query */

        @media (max-width: 700px) {
            .message { max-width: 95%; }
            #sendButton { width: 100%; }
            .input-wrapper { flex-direction: column; }
        }

        @keyframes fadeIn {
            from { opacity: 0; transform: translateY(7px); }
            to { opacity: 1; transform: translateY(0); }
        }
    </style>

</head>

<!-- ======================= -->
<!--  MAIN BODY STRUCTURE   -->
<!-- ======================= -->

<body>

    <!-- Nav/Header Zone -->
    <header class="header">
        <div class="header-content">
            <div class="logo">Saem's AI Assistant</div>
            <div class="header-actions">
                <button class="btn btn-secondary" onclick="clearChat()">Clear</button>
                <button class="btn btn-primary" onclick="showHistory()">History</button>
            </div>
        </div>
    </header>

    
    <!-- Chat Layout -->
    <div class="main-container">
        <div class="chat-container">
            <!-- Model Indicator -->
            <div class="model-indicator" id="modelIndicator">
                <div class="model-row">
                    <span style="color: var(--text-light); font-weight: 600;">Base LLM:</span>
                    <span class="router-badge" id="routerModel">-</span>
                </div>
                <div class="model-row">

                    <!-- <span style="color: var(--text-light); font-weight: 600;">🎯 Router:</span>
                    <span class="router-badge" id="routerModel">-</span> -->
                    <span style="color: var(--text-light); font-weight: 600;"> Selected LLM:</span>
                    <span class="specialist-badge" id="specialistName">-</span>
                    <span class="model-badge" id="currentModel">-</span>
                </div>
                <div class="routing-reason" id="routingReason"></div>
            </div>

            <!-- Messages -->
            <div class="messages-container" id="messagesContainer">
                <div class="empty-state" id="emptyState">
                    <h2>Welcome 👋</h2>
                    <p>Ask anything — the router will select the best AI specialist for you.</p>
                    <div class="suggestion-chips">
                        <div class="chip" onclick="sendSuggestion('Implement the quicksort algorithm in Python')">💻 Python code</div>
                        <div class="chip" onclick="sendSuggestion('What is 125 times 37?')">🔢 Math</div>
                        <div class="chip" onclick="sendSuggestion('Translate this sentence into Korean')">🌐 Translation</div>
                        <div class="chip" onclick="sendSuggestion('Analyze the future outlook for AI')">📊 Analysis</div>
                    </div>
                </div>

                <!-- Typing Indicator -->
                <div class="typing-indicator" id="typingIndicator">
                    <div class="typing-status" id="typingStatus">Analyzing your question...</div>
                    <div class="typing-dots">
                        <span class="typing-dot"></span>
                        <span class="typing-dot"></span>
                        <span class="typing-dot"></span>
                    </div>
                </div>
            </div>

            <!-- Input -->
            <div class="input-container">
                <div class="input-wrapper">
                    <input id="messageInput" placeholder="Type your message..." onkeypress="if(event.key==='Enter')sendMessage()" />
                    <button id="sendButton" onclick="sendMessage()">Send</button>
                </div>
            </div>
        </div>
    </div>

    <!-- ======================= -->
    <!--     JAVASCRIPT AREA     -->
    <!-- ======================= -->

    <script>
        function sendSuggestion(text) {
            document.getElementById("messageInput").value = text;
            sendMessage();
        }

        async function sendMessage() {
            const input = document.getElementById("messageInput");
            const text = input.value.trim();
            if (!text) return;

            addMessage("user", text);
            input.value = "";

            const btn = document.getElementById("sendButton");
            btn.disabled = true;

            // Show the loading indicator
            showTypingIndicator("Analyzing your question...");

            try {
                await sendNormal(text);
            } catch (error) {
                console.error("Error:", error);
                addMessage("assistant", "Sorry, an error occurred. Please try again.");
            } finally {
                hideTypingIndicator();
                btn.disabled = false;
            }
        }

        async function sendNormal(text) {
            // Step 1: analyzing the question
            updateTypingStatus("🔍 Router is analyzing...");
            
            const res = await fetch("/api/chat", {
                method: "POST",
                headers: { "Content-Type": "application/json" },
                body: JSON.stringify({ message: text })
            });

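            // Step 2: specialist selected
            updateTypingStatus("🤖 Specialist is generating response...");

            // Surface HTTP errors (e.g. the 400 returned for an empty message)
            if (!res.ok) throw new Error(`HTTP ${res.status}`);

            const data = await res.json();

            // Update the model info panel
            updateModelIndicator(data);
            
            // Append the response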
            addMessage("assistant", data.response, {
                model: data.model_label,
                specialist: data.specialist_name,
                description: data.specialist_description
            });
        }

        function showTypingIndicator(status) {
            const indicator = document.getElementById("typingIndicator");
            updateTypingStatus(status);
            indicator.classList.add("active");
            scrollToBottom();
        }

        function hideTypingIndicator() {
            const indicator = document.getElementById("typingIndicator");
            indicator.classList.remove("active");
        }

        function updateTypingStatus(status) {
            const statusEl = document.getElementById("typingStatus");
            statusEl.textContent = status;
        }

        function addMessage(role, text, metadata = null) {
            const container = document.getElementById("messagesContainer");

            // hide empty
            document.getElementById("emptyState").style.display = "none";

            const wrapper = document.createElement("div");
            wrapper.className = `message ${role}`;

            const bubble = document.createElement("div");
            bubble.className = "bubble";
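            // Escape HTML first so user or model text cannot inject markup
            const escaped = text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
            bubble.innerHTML = escaped.replace(/\n/g, "<br>");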

            wrapper.appendChild(bubble);

            if(role === "assistant" && metadata) {
                const meta = document.createElement("div");
                meta.className = "message-meta";
                
                const modelSpan = document.createElement("span");
                modelSpan.className = "assistant-model";
                modelSpan.textContent = `Model: ${metadata.model}`;
                
                const specialistSpan = document.createElement("span");
                specialistSpan.className = "specialist-info";
                specialistSpan.textContent = `Specialist: ${metadata.specialist}`;
                
                meta.appendChild(modelSpan);
                meta.appendChild(specialistSpan);
                
                if(metadata.description) {
                    const descSpan = document.createElement("span");
                    descSpan.style.color = "var(--text-light)";
                    descSpan.style.fontSize = "10px";
                    descSpan.textContent = `(${metadata.description})`;
                    meta.appendChild(descSpan);
                }
                
                wrapper.appendChild(meta);
            }

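            // Insert above the typing indicator so the indicator stays below the latest message
            container.insertBefore(wrapper, document.getElementById("typingIndicator"));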
            scrollToBottom();
        }

        function updateModelIndicator(data) {
            // Router info
            document.getElementById("routerModel").textContent = data.router_model || "Nova Lite";
            
            // Specialist info
            const specialistName = data.specialist_name || "-";
            document.getElementById("specialistName").textContent = 
                specialistName.replace("_", " ").toUpperCase();
            
            // Model info
            document.getElementById("currentModel").textContent = data.model_label || "-";
            
            // Routing reason
            const reasonEl = document.getElementById("routingReason");
            if(data.routing_reason) {
                reasonEl.textContent = `💡 ${data.routing_reason}`;
            }
            
            document.getElementById("modelIndicator").classList.add("active");
        }

        function scrollToBottom() {
            const container = document.getElementById("messagesContainer");
            container.scrollTop = container.scrollHeight;
        }

        async function clearChat() {
            if(!confirm("Clear all messages?")) return;
            
            document.getElementById("messagesContainer").innerHTML = `
                <div class="empty-state" id="emptyState">
                    <h2>Welcome 👋</h2>
                    <p>Ask anything — the router will select the best AI specialist for you.</p>
                    <div class="suggestion-chips">
                        <div class="chip" onclick="sendSuggestion('Implement the quicksort algorithm in Python')">💻 Python code</div>
                        <div class="chip" onclick="sendSuggestion('What is 125 times 37?')">🔢 Math</div>
                        <div class="chip" onclick="sendSuggestion('Translate this sentence into Korean')">🌐 Translation</div>
                        <div class="chip" onclick="sendSuggestion('Analyze the future outlook for AI')">📊 Analysis</div>
                    </div>
                </div>
                <div class="typing-indicator" id="typingIndicator">
                    <div class="typing-status" id="typingStatus">Analyzing your question...</div>
                    <div class="typing-dots">
                        <span class="typing-dot"></span>
                        <span class="typing-dot"></span>
                        <span class="typing-dot"></span>
                    </div>
                </div>
            `;
            document.getElementById("modelIndicator").classList.remove("active");
            document.getElementById("emptyState").style.display = "block";
        }

        async function showHistory(){
            alert("History feature coming soon!");
        }
    </script>

</body>
</html>
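
The page's JavaScript reads a fixed set of fields from the `/api/chat` JSON reply, so it can help to write that contract down on the backend side. The sketch below lists those fields as they appear in `sendNormal`, `addMessage`, and `updateModelIndicator`; the names `UI_FIELDS` and `missing_required` are hypothetical helpers introduced here for illustration, not part of the original code.

```python
from typing import Dict, List

# Fields the page's JavaScript reads from the /api/chat JSON reply.
# True  = the page has no fallback for it (a missing value shows up as
#         "undefined" in the message footer or raises an error).
# False = the page guards it with a fallback or an `if` check.
UI_FIELDS: Dict[str, bool] = {
    "response": True,
    "model_label": True,
    "specialist_name": True,
    "specialist_description": False,
    "router_model": False,
    "routing_reason": False,
}


def missing_required(payload: dict) -> List[str]:
    """Return the required UI fields that are absent from a reply payload."""
    return [
        name
        for name, required in UI_FIELDS.items()
        if required and name not in payload
    ]


print(missing_required({"response": "hi"}))  # ['model_label', 'specialist_name']
```

A check like this could run in a backend unit test so that renaming a key in the Python response does not silently break the UI.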

2) Create app.py

# app.py
from flask import Flask, render_template, request, jsonify
from strands import Agent
import asyncio
import json
from datetime import datetime

app = Flask(__name__)

class MetaRouter:
    """Meta router: a base LLM analyzes the question and selects the best specialist model"""
    
    def __init__(self):
        # Base router model (uses the fastest, cheapest model)
        self.base_router_model = 'amazon.nova-lite-v1:0'
        self.base_router_agent = Agent(model=self.base_router_model)
        
        # Available specialist models
        self.specialist_models = {
            'math_expert': {
                'model_id': 'amazon.nova-pro-v1:0',
                'description': 'Specializes in solving mathematical problems: calculation, formulas, equations, calculus, statistics',
                'strengths': 'Accurate calculation, mathematical reasoning, handling complex formulas'
            },
            'code_expert': {
                'model_id': 'amazon.nova-pro-v1:0',
                'description': 'Specializes in programming, writing code, debugging, and algorithm development',
                'strengths': 'Code generation, bug fixing, algorithm optimization, support for many programming languages'
            },
            'analysis_expert': {
                'model_id': 'meta.llama3-70b-instruct-v1:0',
                'description': 'Specializes in data analysis, comparison, evaluation, and logical reasoning',
                'strengths': 'In-depth analysis, pattern recognition, comparative evaluation, logical thinking'
            },
            'creative_expert': {
                'model_id': 'meta.llama3-70b-instruct-v1:0',
                'description': 'Specializes in creative writing, storytelling, and idea generation',
                'strengths': 'Creative writing, story construction, imagination, brainstorming'
            },
            'conversation_expert': {
                'model_id': 'amazon.nova-lite-v1:0',
                'description': 'Specializes in general conversation, simple Q&A, and everyday interaction',
                'strengths': 'Natural conversation, fast responses, friendly communication'
            },
            'translation_expert': {
                'model_id': 'amazon.nova-pro-v1:0',
                'description': 'Specializes in multilingual translation and conversion between languages',
                'strengths': 'Accurate translation, context awareness, support for many languages'
            },
            'reasoning_expert': {
                'model_id': 'meta.llama3-70b-instruct-v1:0',
                'description': 'Specializes in complex reasoning, problem solving, and step-by-step thinking',
                'strengths': 'Logical reasoning, complex problem solving, systematic approach'
            }
        }
        
        # Model labels for the UI
        self.model_labels = {
            "amazon.nova-pro-v1:0": "Nova Pro",
            "amazon.nova-lite-v1:0": "Nova Lite",
            "meta.llama3-70b-instruct-v1:0": "Llama3-70B",
            "meta.llama3-8b-instruct-v1:0": "Llama3-8B"
        }
        
        # Cache of specialist agents
        self.specialist_agents = {}
        
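    async def select_specialist(self, user_query):
        """The base model analyzes the user query and selects the best specialist"""
        
        # Format the specialist list
        specialists_info = "\n".join([
            f"- {name}: {info['description']} (strengths: {info['strengths']})"
            for name, info in self.specialist_models.items()
        ])
        
        # Ask the base router to choose a specialist
        routing_prompt = f"""You are a router that analyzes the user's question and selects the best specialist AI model.

Available specialist models:
{specialists_info}

User question: {user_query}

Select exactly one specialist best suited to the question above and respond only in the following JSON format:
{{
    "specialist": "name_of_selected_specialist",
    "reason": "reason for the choice (one sentence)"
}}

Example response:
{{"specialist": "math_expert", "reason": "The question requires a mathematical calculation"}}

Output JSON only. No other explanation is needed."""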

        try:
            routing_response = await self.base_router_agent.invoke_async(routing_prompt)
            
            # Extract text from the response
            if isinstance(routing_response, dict):
                response_text = routing_response.get('content', routing_response.get('text', str(routing_response)))
            elif hasattr(routing_response, 'content'):
                response_text = routing_response.content
            else:
                response_text = str(routing_response)
            
            # Parse JSON
            response_text = response_text.strip()
            # Strip code fences
            if '```json' in response_text:
                response_text = response_text.split('```json')[1].split('```')[0].strip()
            elif '```' in response_text:
                response_text = response_text.split('```')[1].split('```')[0].strip()
            
            routing_decision = json.loads(response_text)
            specialist_name = routing_decision.get('specialist', 'conversation_expert')
            reason = routing_decision.get('reason', 'Treated as general conversation')
            
            # Validate the choice
            if specialist_name not in self.specialist_models:
                specialist_name = 'conversation_expert'
                reason = 'Using the default conversation model'
            
            return specialist_name, reason
            
        except Exception as e:
            print(f"Routing error: {e}")
            # Fall back to the default model on error
            return 'conversation_expert', 'Routing error; using the default model'
    
    async def get_specialist_agent(self, specialist_name):
        """Return the specialist agent (cached)"""
        if specialist_name not in self.specialist_agents:
            model_id = self.specialist_models[specialist_name]['model_id']
            self.specialist_agents[specialist_name] = Agent(model=model_id)
        
        return self.specialist_agents[specialist_name]


router = MetaRouter()
conversation_history = []


@app.route('/')
def index():
    return render_template('index.html')


@app.route('/api/chat', methods=['POST'])
def chat():
    # Tolerate a missing or non-JSON body instead of raising inside Flask
    data = request.get_json(silent=True) or {}
    user_message = data.get('message', '')
    
    if not user_message:
        return jsonify({'error': 'No message provided'}), 400

    # asyncio.run creates a fresh event loop, runs the coroutine, and closes the loop
    response_data = asyncio.run(process_message(user_message))

    conversation_history.append({
        'role': 'user',
        'content': user_message,
        'timestamp': datetime.now().isoformat()
    })
    conversation_history.append({
        'role': 'assistant',
        'content': response_data['response'],
        'specialist': response_data['specialist_name'],
        'model': response_data['model_label'],
        'routing_reason': response_data['routing_reason'],
        'timestamp': datetime.now().isoformat()
    })

    return jsonify(response_data)


async def process_message(user_message):
    """Handle a message: the base router picks a specialist, then the specialist answers"""
    
    # Step 1: the base router selects the best specialist
    specialist_name, routing_reason = await router.select_specialist(user_message)
    
    # Step 2: generate the actual response with the selected specialist agent
    specialist_agent = await router.get_specialist_agent(specialist_name)
    response = await specialist_agent.invoke_async(user_message)
    
    # Extract the response text
    if isinstance(response, dict):
        response_text = response.get('content', response.get('text', response.get('output', str(response))))
    elif hasattr(response, 'content'):
        response_text = response.content
    elif hasattr(response, 'text'):
        response_text = response.text
    else:
        response_text = str(response)
    
    # Get model info
    model_id = router.specialist_models[specialist_name]['model_id']
    model_label = router.model_labels.get(model_id, model_id)
    specialist_description = router.specialist_models[specialist_name]['description']
    
    return {
        'response': response_text,
        'specialist_name': specialist_name,
        'specialist_description': specialist_description,
        'model_id': model_id,
        'model_label': model_label,
        'routing_reason': routing_reason,
        'router_model': router.model_labels.get(router.base_router_model, router.base_router_model)
    }


if __name__ == '__main__':
    app.run(debug=False, host='0.0.0.0', port=5000)
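
The router's reply handling in `select_specialist` (strip code fences, parse JSON, validate the name, fall back to `conversation_expert`) can also be written as a small pure function that is easy to unit-test without calling any model. The following is a minimal sketch of that idea, not the code used above: `parse_routing_reply` and `SPECIALISTS` are hypothetical names, and the regex-based extraction is a swapped-in alternative to the string splitting in app.py.

```python
import json
import re
from typing import Iterable, Tuple

# Specialist names as defined in MetaRouter.specialist_models
SPECIALISTS = (
    "math_expert", "code_expert", "analysis_expert", "creative_expert",
    "conversation_expert", "translation_expert", "reasoning_expert",
)


def parse_routing_reply(
    text: str,
    valid_names: Iterable[str] = SPECIALISTS,
    default: str = "conversation_expert",
) -> Tuple[str, str]:
    """Extract (specialist, reason) from a router reply, falling back to default."""
    # Grab everything from the first "{" to the last "}", which also skips
    # any surrounding code fence or extra prose the model may have added.
    match = re.search(r"\{.*\}", text, flags=re.DOTALL)
    if not match:
        return default, "no JSON object found in router reply"
    try:
        decision = json.loads(match.group(0))
    except json.JSONDecodeError:
        return default, "router reply was not valid JSON"
    name = decision.get("specialist")
    if not isinstance(name, str) or name not in set(valid_names):
        return default, "unknown specialist in router reply"
    return name, str(decision.get("reason", ""))


reply = 'Sure!\n```json\n{"specialist": "math_expert", "reason": "needs arithmetic"}\n```'
print(parse_routing_reply(reply))  # ('math_expert', 'needs arithmetic')
```

Keeping the parsing separate from the model call means malformed replies can be covered by plain assertions, while the network-dependent part stays thin.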

4. Results

1. Testing AI Assistant responses through the web UI

1) Web access test

2) Test with an everyday conversation question

Answered by Nova Lite (suited to general conversation, simple questions, and everyday interaction)
= calls the amazon.nova-lite-v1:0 model

3) Test with a question about the US stock market

Answered by Llama3-70B (suited to data analysis, comparison, evaluation, and logical reasoning)
= calls the meta.llama3-70b-instruct-v1:0 model

4) Test with a math calculation question

Answered by Nova Pro (suited to programming, writing code, debugging, and algorithm development)
= calls the amazon.nova-pro-v1:0 model
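
The same tests can be repeated from a script instead of the browser. The sketch below uses only the standard library and assumes the Flask app is reachable at `http://localhost:5000` (the port from `app.run`); behind the Nginx proxy the base URL would differ. `build_chat_request` and `ask` are hypothetical helper names introduced for illustration.

```python
import json
import urllib.request


def build_chat_request(
    message: str, base_url: str = "http://localhost:5000"
) -> urllib.request.Request:
    """Build the POST request that the web page sends to /api/chat."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(message: str, base_url: str = "http://localhost:5000") -> dict:
    """Send one question to a running server and return the parsed JSON reply."""
    request = build_chat_request(message, base_url)
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `ask("What is 125 times 37?")` returns a dict whose `specialist_name`, `model_label`, and `routing_reason` keys show which specialist the router picked and why.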

5. Conclusion

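In this hands-on, we implemented a structure that analyzes the intent of a user's question and automatically selects the most suitable model. The aim was to go beyond an AI Assistant that only returns answers and build one in which the AI itself understands the situation and chooses the LLM needed to give the user a better-suited answer.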

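Through this hands-on, we saw that automatically selecting an LLM suited to the nature of each question can create more value from a business perspective than the conventional approach of relying on a single model. In particular, automating model selection reduces unnecessary calls to high-performance models and so helps optimize cost, while applying stronger models to complex problems can raise accuracy. Together these suggest that a meaningful improvement in cost-performance efficiency can be expected.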

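Exposing the reason for each model choice to the user goes beyond a simple feature: it can raise trust in the service and give users peace of mind. The extensible model structure and modular design should also help with future feature expansion and operational efficiency. We hope you will experience business innovation by applying these advantages to the design and operation of your own AI-based services.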
