Lab 08: Multi-Agent System Design

Time: 50 minutes | Level: Architect | Docker: docker run -it --rm zchencow/innozverse-ai:latest bash

Overview

Multi-agent AI systems enable complex task automation by coordinating specialized AI agents. This lab covers ReAct patterns, agent orchestration topologies, memory architectures, inter-agent communication, evaluation frameworks, and production guardrails.

Architecture

┌─────────────────────────────────────────────────────────────┐
│              Multi-Agent Orchestration System                │
├─────────────────────────────────────────────────────────────┤
│  User Request → Orchestrator Agent                          │
│                      ↓                                      │
│         ┌────────────┴────────────┐                        │
│     Planner Agent          Tool Router                      │
│         ↓                       ↓                          │
│  Task Decomposition    ┌─────────┴─────────┐               │
│         ↓              │                   │               │
│  [Researcher] [Analyst] [Code Agent] [API Agent]           │
│         ↓         ↓         ↓           ↓                  │
│     Memory Layer (Working/Episodic/Semantic)                │
│         ↓                                                   │
│  Synthesizer Agent → Final Response                         │
└─────────────────────────────────────────────────────────────┘

Step 1: ReAct Pattern (Reason + Act)

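ReAct (Reason + Act) is a foundational pattern for tool-using agents: the model alternates between a reasoning step (Thought), a tool call (Action), and the tool's result (Observation) until it can give a final answer.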

ReAct Loop:
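A minimal sketch of the loop, assuming a plain-text `Action: tool[input]` convention. The `scripted_llm` function, the `TOOLS` registry, and the `calculator` tool are hypothetical stand-ins so the example runs without a model; a real agent would replace `scripted_llm` with an LLM call.

```python
import re
from typing import Callable, Dict

# Hypothetical tool registry; a real agent would expose search, APIs, etc.
TOOLS: Dict[str, Callable[[str], str]] = {
    "calculator": lambda expr: str(sum(int(x) for x in expr.split("+"))),
}

def scripted_llm(transcript: str) -> str:
    """Stand-in for a model call: acts once, then answers from the observation."""
    match = re.search(r"Observation: (.+)", transcript)
    if match is None:
        return "Thought: I need to add the numbers.\nAction: calculator[2+3]"
    return f"Thought: I have the result.\nFinal Answer: {match.group(1)}"

def react(question: str, llm: Callable[[str], str], max_steps: int = 5) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        reply = llm(transcript)              # reason (and possibly pick an action)
        transcript += "\n" + reply
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        action = re.search(r"Action: (\w+)\[(.*)\]", reply)
        if action is None:
            transcript += "\nObservation: could not parse an action"
            continue
        name, arg = action.groups()
        tool = TOOLS.get(name)
        observation = tool(arg) if tool else f"unknown tool {name}"
        transcript += f"\nObservation: {observation}"  # feed the result back in
    return "stopped: step limit reached"
```

The `max_steps` cap is a design choice worth keeping in any variant: without it, a model that never emits a final answer would loop indefinitely.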

ReAct Prompt Structure:

💡 The key innovation of ReAct is interleaving reasoning and acting. This lets the agent adapt its plan based on tool results, unlike chain-of-thought (CoT) prompting, which reasons without acting.


Step 2: Agent Orchestration Topologies

Sequential (Pipeline):
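One way to picture a sequential topology is as function composition: each agent consumes the previous agent's output. The three agents below are hypothetical placeholders that only wrap their input in a label.

```python
from functools import reduce
from typing import Callable, List

Agent = Callable[[str], str]

def researcher(task: str) -> str:
    return f"notes({task})"

def analyst(notes: str) -> str:
    return f"analysis({notes})"

def writer(analysis: str) -> str:
    return f"report({analysis})"

def run_pipeline(agents: List[Agent], task: str) -> str:
    # Each agent waits for the previous one, so per-agent latencies add up.
    return reduce(lambda data, agent: agent(data), agents, task)
```

For example, `run_pipeline([researcher, analyst, writer], "q3 sales")` passes the task through all three stages in order.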

Parallel:
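A parallel topology can be sketched with `asyncio.gather`, which runs independent subtasks concurrently and returns their results in input order. The `research_agent` coroutine is a hypothetical placeholder; the short sleep stands in for a slow model or tool call.

```python
import asyncio
from typing import Dict, List

async def research_agent(topic: str) -> str:
    await asyncio.sleep(0.01)  # stands in for a slow model or tool call
    return f"findings on {topic}"

async def fan_out(topics: List[str]) -> Dict[str, str]:
    # All agents run concurrently; total time is roughly the slowest one.
    results = await asyncio.gather(*(research_agent(t) for t in topics))
    return dict(zip(topics, results))

def run_parallel(topics: List[str]) -> Dict[str, str]:
    return asyncio.run(fan_out(topics))
```

This only helps when the subtasks do not depend on each other's output, which is why the comparison lists independent research tasks as the typical use case.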

Hierarchical:
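In a hierarchical topology a supervisor decomposes the request and delegates subtasks to specialized workers. The sketch below assumes a fixed, hard-coded `plan` in place of a planner model; the `WORKERS` registry and its role names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Worker = Callable[[str], str]

# Hypothetical worker agents keyed by role.
WORKERS: Dict[str, Worker] = {
    "research": lambda t: f"[research] {t}",
    "code": lambda t: f"[code] {t}",
}

def plan(request: str) -> List[Tuple[str, str]]:
    """Hypothetical planner: a fixed decomposition instead of a model call."""
    return [("research", f"gather background for {request}"),
            ("code", f"prototype {request}")]

def supervisor(request: str) -> str:
    results = []
    for role, subtask in plan(request):
        worker = WORKERS.get(role)
        if worker is None:
            results.append(f"[unassigned] {subtask}")  # no worker for this role
            continue
        results.append(worker(subtask))
    return "\n".join(results)
```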

Comparison:

| Topology | Latency | Scalability | Complexity | Use Case |
|---|---|---|---|---|
| Sequential | High | Low | Low | Step-by-step workflows |
| Parallel | Low | High | Medium | Independent research tasks |
| Hierarchical | Medium | High | High | Complex enterprise workflows |


Step 3: Memory Types

Agents use several kinds of memory, which differ in where they are stored, how long they persist, and how much they can hold.

| Memory Type | Storage | Persistence | Capacity | Use Case |
|---|---|---|---|---|
| Working | LLM context window | Session only | 4K-128K tokens | Current task context |
| Episodic | Vector database | Persistent | Unlimited | Past interactions, conversation history |
| Semantic | Knowledge base/KG | Persistent | Unlimited | Domain knowledge, facts |
| Procedural | Fine-tuned model | In weights | Fixed | Skills, how-to knowledge |

Memory Architecture for Production Agent:
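A minimal sketch of how the working, episodic, and semantic layers might sit together in one object. Everything here is an assumption for illustration: the `AgentMemory` class is hypothetical, the bounded deque stands in for a context window, and keyword overlap stands in for vector similarity search.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List

@dataclass
class AgentMemory:
    working: Deque[str] = field(default_factory=lambda: deque(maxlen=3))
    episodic: List[str] = field(default_factory=list)
    semantic: Dict[str, str] = field(default_factory=dict)

    def observe(self, message: str) -> None:
        self.working.append(message)   # bounded: oldest entries fall out
        self.episodic.append(message)  # full history; a database in practice

    def recall(self, query: str, k: int = 2) -> List[str]:
        """Rank past messages by keyword overlap (stand-in for vector search)."""
        words = set(query.lower().split())
        scored = [(len(words & set(m.lower().split())), m) for m in self.episodic]
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
        return [m for _, m in ranked[:k]]
```

The point of the split is that `working` stays small enough to fit in a prompt, while `recall` pulls older material back in only when it is relevant to the current query.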

Memory Compression:
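One common approach is to replace older turns with a summary once the history exceeds a token budget, keeping the most recent turns verbatim. The sketch below assumes a crude whitespace token count and a placeholder `naive_summary` function; in practice the summary would come from an LLM call and the count from a real tokenizer.

```python
from typing import Callable, List

def count_tokens(text: str) -> int:
    return len(text.split())  # crude whitespace proxy, not a real tokenizer

def naive_summary(messages: List[str]) -> str:
    # Placeholder for an LLM summarization call: keep the first 3 words of each.
    return "Summary: " + "; ".join(" ".join(m.split()[:3]) for m in messages)

def compress(history: List[str], budget: int, keep_recent: int = 2,
             summarize: Callable[[List[str]], str] = naive_summary) -> List[str]:
    if sum(count_tokens(m) for m in history) <= budget:
        return history                      # already fits, nothing to do
    older, recent = history[:-keep_recent], history[-keep_recent:]
    if not older:
        return history                      # nothing old enough to summarize
    return [summarize(older)] + recent      # summary replaces the older turns
```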


Step 4: Tool Use and Function Calling

Tool Design Principles:
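The summary at the end of this lab names three principles: atomic tools, a clear interface, and error handling. A minimal sketch of what that could look like, assuming a hypothetical `Tool` wrapper that validates arguments and returns failures as data rather than raising:

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

@dataclass(frozen=True)
class Tool:
    name: str
    description: str
    params: Dict[str, type]
    fn: Callable[..., Any]

    def call(self, **kwargs: Any) -> Dict[str, Any]:
        # Validate up front so the agent gets a clear, recoverable error.
        missing = [p for p in self.params if p not in kwargs]
        if missing:
            return {"ok": False, "error": f"missing parameters: {missing}"}
        for key, expected in self.params.items():
            if not isinstance(kwargs[key], expected):
                return {"ok": False, "error": f"{key} must be {expected.__name__}"}
        try:
            return {"ok": True, "result": self.fn(**kwargs)}
        except Exception as exc:  # report failures as data, do not crash the loop
            return {"ok": False, "error": str(exc)}

# An atomic tool: it does exactly one thing.
divide = Tool("divide", "Divide a by b.", {"a": float, "b": float}, lambda a, b: a / b)
```

Returning a structured error gives the agent something to reason about in its next step, for example retrying with corrected arguments.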

Tool Taxonomy:

| Category | Examples | Risk Level |
|---|---|---|
| Read-only | search, database_read, file_read | Low |
| Write | file_write, database_write, email_send | High |
| Execute | code_run, shell_command, API_call | Critical |
| External | web_browse, API_call | Medium |

Human-in-the-loop for High-Risk Tools:
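A minimal sketch of an approval gate driven by the risk levels in the taxonomy above. The `RISK` mapping, the `execute` function, and the `approve` callback are hypothetical; in a real system `approve` would prompt a human reviewer rather than return a fixed value.

```python
from typing import Callable, Dict

# Risk levels taken from the tool taxonomy; the mapping itself is illustrative.
RISK: Dict[str, str] = {
    "search": "low", "file_read": "low",
    "web_browse": "medium",
    "file_write": "high", "email_send": "high",
    "shell_command": "critical",
}
NEEDS_APPROVAL = {"high", "critical"}

def execute(tool: str, run: Callable[[], str],
            approve: Callable[[str], bool]) -> str:
    level = RISK.get(tool, "critical")  # unknown tools get the strictest level
    if level in NEEDS_APPROVAL and not approve(tool):
        return f"denied: {tool} ({level} risk) was not approved"
    return run()
```

Defaulting unknown tools to the strictest level is a deliberate fail-closed choice: a tool that was never classified should not run unreviewed.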


Step 5: Inter-Agent Communication

Message Bus Pattern:
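A message bus decouples agents: senders publish to a topic and do not need to know who is listening. The in-process `MessageBus` class below is a hypothetical sketch; production systems typically use a broker, which is outside the scope of this example.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Handler = Callable[[Dict[str, str]], None]

class MessageBus:
    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Dict[str, str]) -> int:
        """Deliver to every subscriber of the topic; return how many received it."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(message)
        return len(handlers)
```

For instance, a synthesizer agent could subscribe to a `research.done` topic and react whenever any researcher publishes there.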

Direct Communication:

Blackboard Pattern:
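In the blackboard pattern, agents do not message each other; they read and write a shared workspace, and each contributes when the board contains what it needs. A minimal sketch, assuming a plain dictionary as the board and two hypothetical agents:

```python
from typing import Callable, Dict, List

Blackboard = Dict[str, str]

def researcher(board: Blackboard) -> bool:
    if "question" in board and "facts" not in board:
        board["facts"] = f"facts about {board['question']}"
        return True
    return False

def analyst(board: Blackboard) -> bool:
    if "facts" in board and "answer" not in board:
        board["answer"] = f"conclusion from {board['facts']}"
        return True
    return False

def run_blackboard(board: Blackboard, agents: List[Callable[[Blackboard], bool]],
                   max_rounds: int = 10) -> Blackboard:
    for _ in range(max_rounds):
        # Give every agent a turn; stop after a round with no contribution.
        if not any([agent(board) for agent in agents]):
            break
    return board
```

Note that agent order does not matter here: the analyst simply waits until the researcher has posted facts.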

Agent Communication Protocols:

  • LangChain: Python-native, broad tool ecosystem

  • AutoGen (Microsoft): Multi-agent conversation framework

  • CrewAI: Role-based agents with task management

  • OpenAI Assistants API: Managed threads and tool calls


Step 6: Agent Evaluation Framework

Evaluation Dimensions:

| Dimension | Metric | Tool |
|---|---|---|
| Task completion | % tasks completed correctly | Custom benchmark |
| Tool accuracy | Correct tool selection rate | Logging |
| Reasoning quality | Step-by-step correctness | LLM judge |
| Efficiency | Avg steps to completion | Tracing |
| Safety | Harmful action rate | Red-team testing |
| Latency | P95 time to complete task | APM |

Agent Trajectory Evaluation:
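One possible way to score a trajectory is to compare the sequence of tools the agent actually called against a reference sequence. The sketch below uses the longest common subsequence (LCS) as the match count; this metric choice and the function names are assumptions, not a standard defined by this lab.

```python
from typing import Dict, List

def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two step lists."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            if x == y:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[-1][-1]

def score_trajectory(actual: List[str], expected: List[str]) -> Dict[str, float]:
    matched = lcs_length(actual, expected)
    return {
        "recall": matched / len(expected) if expected else 1.0,   # expected steps hit
        "precision": matched / len(actual) if actual else 0.0,    # useful steps taken
        "extra_steps": float(max(len(actual) - len(expected), 0)),
    }
```

Low precision with full recall indicates an agent that reaches the goal but wastes steps, which ties back to the Efficiency dimension above.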

Benchmark Datasets:

  • HotpotQA: multi-hop reasoning

  • WebArena: web navigation tasks

  • MINT: multi-turn tool use

  • SWE-bench: software engineering tasks


Step 7: Guardrails and Sandboxing

Input Guardrails:
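A minimal sketch of an input check that runs before the request reaches the agent. The patterns and the length limit are illustrative assumptions only; pattern matching alone is easy to evade, and real deployments usually combine it with a classifier.

```python
import re
from typing import List, Tuple

# Illustrative patterns only; real deployments need far broader coverage.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"reveal (your )?system prompt", re.I),
]
MAX_CHARS = 2000  # hypothetical limit

def check_input(text: str) -> Tuple[bool, List[str]]:
    """Return (allowed, reasons); an empty reasons list means the input passed."""
    reasons = []
    if len(text) > MAX_CHARS:
        reasons.append("input too long")
    if any(p.search(text) for p in INJECTION_PATTERNS):
        reasons.append("possible prompt injection")
    return (not reasons, reasons)
```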

Output Guardrails:
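An output filter can redact sensitive strings before a response leaves the system. The sketch below assumes two regular expressions: a simple email matcher and a hypothetical `sk-` prefixed key format. Neither is exhaustive.

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+")
API_KEY = re.compile(r"\bsk-[A-Za-z0-9]{16,}\b")  # hypothetical key format

def filter_output(text: str) -> str:
    """Replace email addresses and key-like strings with placeholders."""
    text = EMAIL.sub("[REDACTED_EMAIL]", text)
    return API_KEY.sub("[REDACTED_KEY]", text)
```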

Code Execution Sandboxing:
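The sketch below shows only the outermost layer of sandboxing: running agent-generated code in a separate process with a timeout. This is an assumption-laden illustration, not a real sandbox. It does not restrict file system or network access; production systems typically add container or VM isolation on top. The `run_untrusted` helper is hypothetical.

```python
import subprocess
import sys
from typing import Dict

def run_untrusted(code: str, timeout_s: float = 5.0) -> Dict[str, object]:
    try:
        proc = subprocess.run(
            # -I: isolated mode, ignores environment variables and user site dir
            [sys.executable, "-I", "-c", code],
            capture_output=True, text=True, timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising on timeout.
        return {"ok": False, "stdout": "", "error": "timeout"}
    return {"ok": proc.returncode == 0, "stdout": proc.stdout.strip(),
            "error": proc.stderr.strip()}
```

The timeout guards against infinite loops, and the separate process keeps a crash in generated code from taking down the agent itself.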

Agent Guardrail Architecture (NeMo Guardrails / Llama Guard):


Step 8: Capstone — Multi-Agent Simulation

📸 Verified Output:


Summary

| Concept | Key Points |
|---|---|
| ReAct Pattern | Interleave reasoning + tool actions; adapt plan to results |
| Orchestration | Sequential (pipeline), Parallel (speed), Hierarchical (complex) |
| Memory Types | Working (context), Episodic (history), Semantic (knowledge), Procedural (skills) |
| Tool Design | Atomic, clear interface, error handling, human approval for high-risk |
| Inter-Agent Comms | Message bus, direct delegation, blackboard pattern |
| Evaluation | Task completion rate + reasoning quality + safety + efficiency |
| Guardrails | Input validation + output filtering + code sandboxing |

Next Lab: Lab 09: AI Security Red Team →
