LLMs have revolutionized AI, but they remain stateless and reactive: you ask a question, they answer, and the interaction ends.

Agentic AI changes the game: agents that plan, act, use tools, and correct their own mistakes autonomously. According to Stanford HAI, agentic AI will be one of the three dominant trends of 2026.
The problem? Agent architecture is complex:
- How do you manage long-term memory?
- How do you avoid infinite loops?
- How do you orchestrate multiple agents?
In this article, we explore five proven architectural patterns for building reliable AI agents:

1. ReAct: reasoning + action
2. Plan-and-Execute: strategic planning
3. ReWOO: optimized orchestration
4. Multi-Agent: specialized collaboration
5. Reflection: self-critique and improvement

For each pattern, we provide:
- A detailed architecture
- A Python implementation (LangChain/LangGraph)
- Real-world use cases
- A performance comparison
The ReAct pattern (Yao et al., 2023) alternates between reasoning and action: the agent spells out its reasoning at every step, which improves traceability and reduces errors.

Use cases:
✅ Customer support: documentation + knowledge-base search
✅ Data analysis: SQL queries + computation + visualization
✅ Academic research: paper search + synthesis

Limitations:
❌ No long-term planning
❌ One action at a time (sequential)
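To make the loop concrete, here is a minimal sketch of the ReAct control loop with a mock reasoning policy and a single lookup tool. No LLM is involved: `mock_reason` and the `tools` dict are illustrative stand-ins, not part of any real framework.

```python
def react_loop(question, reason, tools, max_steps=5):
    """Alternate Thought -> Action -> Observation until a final answer."""
    observation = None
    for _ in range(max_steps):
        step = reason(question, observation)   # Thought + chosen Action
        if step["action"] == "finish":
            return step["input"]               # Final Answer
        observation = tools[step["action"]](step["input"])  # Observation
    raise RuntimeError("No final answer within max_steps")

# Toy policy: look up the capital first, then finish with the observation.
def mock_reason(question, observation):
    if observation is None:
        return {"action": "lookup", "input": "France"}
    return {"action": "finish", "input": observation}

tools = {"lookup": lambda country: {"France": "Paris"}[country]}
print(react_loop("What is the capital of France?", mock_reason, tools))  # Paris
```

Note how the `max_steps` bound doubles as the loop guard mentioned earlier: a real agent needs it for exactly the same reason.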
Inspired by BabyAGI and AutoGPT, this pattern separates planning from execution:

1. Planner: generates an action plan
2. Executor: carries out each action
3. Replanner: adjusts the plan when needed

Use cases:
✅ Complex multi-step tasks
✅ In-depth research (competitive intelligence)
✅ Business workflows (HR, finance)

Advantages:
✅ Explicit, modifiable plan
✅ Steps can be parallelized
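The planner/executor split can be sketched in a few lines. This toy version uses a hard-coded plan and fake tools (all names and figures below are illustrative, not real data):

```python
def plan(goal):
    # Toy planner: returns a fixed plan; in the real pattern an LLM writes it.
    return [("search", "GDP USA"), ("search", "GDP China"), ("sum", None)]

def execute(steps, tools):
    # Executor: runs each step, passing previous results along.
    results = []
    for tool_name, arg in steps:
        results.append(tools[tool_name](arg, results))
    return results[-1]

tools = {
    "search": lambda q, _prev: {"GDP USA": 29, "GDP China": 19}[q],  # fake data
    "sum": lambda _arg, prev: sum(r for r in prev if isinstance(r, int)),
}
print(execute(plan("total GDP of top economies"), tools))  # 48
```

Because the plan is an explicit data structure, a replanner can inspect and rewrite it between steps, which is the key advantage over ReAct's implicit step-by-step reasoning.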
ReWOO (Xu et al., 2023) optimizes ReAct by planning all actions up front, then executing them in parallel.

Key difference:
- ReAct: Thought → Action → Observation → Thought → ...
- ReWOO: Thought → [Action1, Action2, Action3] → [Obs1, Obs2, Obs3] → Answer

Performance:
- ReAct: 8 seconds (sequential)
- ReWOO: 3 seconds (parallel) ✅
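The latency win comes entirely from running independent tool calls concurrently. A minimal sketch with simulated tool latencies (the delays and tool names are made up for illustration):

```python
import asyncio
import time

async def fake_tool(name: str, delay: float) -> str:
    await asyncio.sleep(delay)  # simulate I/O-bound tool latency
    return f"{name}: done"

async def rewoo_worker(calls):
    # ReWOO-style worker phase: all observations are gathered concurrently.
    return await asyncio.gather(*(fake_tool(n, d) for n, d in calls))

calls = [("Search1", 0.1), ("Search2", 0.1), ("Search3", 0.1)]
start = time.perf_counter()
observations = asyncio.run(rewoo_worker(calls))
elapsed = time.perf_counter() - start
print(observations)
print(f"elapsed ≈ {elapsed:.2f}s (vs ~0.3s if run sequentially)")
```

Three 0.1 s tools finish in roughly 0.1 s instead of 0.3 s; with real web searches the gap is what produces the 8 s → 3 s figures above.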
Instead of a monolithic agent, delegate to specialized agents:

- Researcher: information gathering
- Analyst: data analysis
- Writer: drafting
- Critic: quality review

Use cases:
✅ Complex content generation (reports, articles)
✅ Multi-dimensional analysis (business intelligence)
✅ Creative workflows (design, marketing)
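Stripped of any framework, the role handoff above is just a pipeline over shared state. In this toy sketch each "agent" is a plain function refining a dict; the role names mirror the list above and the logic is purely illustrative:

```python
def researcher(state):
    state["facts"] = ["fact A", "fact B"]  # would come from web search
    return state

def analyst(state):
    state["insight"] = f"{len(state['facts'])} facts found"
    return state

def writer(state):
    state["draft"] = "Report: " + state["insight"]
    return state

def critic(state):
    state["approved"] = state["draft"].startswith("Report:")
    return state

state = {}
for agent in (researcher, analyst, writer, critic):
    state = agent(state)
print(state["draft"], "| approved:", state["approved"])
```

Frameworks like AutoGen (shown in the Multi-Agent section) replace the fixed `for` loop with an LLM-driven speaker-selection policy, but the shared-state idea is the same.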
The agent critiques its own work and iterates until the result is satisfactory.

How to choose a pattern:
1. Latency: ReWOO (parallel)
2. Quality: Reflection or Multi-Agent
3. Simplicity: ReAct
4. Task complexity: Plan-and-Execute or Multi-Agent
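These criteria can be encoded as a small decision helper. `choose_pattern` is a hypothetical function, and the priority order (latency before quality before task complexity) is an assumption of this sketch, not a hard rule:

```python
def choose_pattern(latency_critical=False, quality_critical=False,
                   complex_task=False):
    # Illustrative priority order; adjust to your own constraints.
    if latency_critical:
        return "ReWOO"
    if quality_critical:
        return "Reflection"
    if complex_task:
        return "Plan-and-Execute"
    return "ReAct"

print(choose_pattern())                        # ReAct
print(choose_pattern(latency_critical=True))   # ReWOO
print(choose_pattern(quality_critical=True))   # Reflection
```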
These five architectural patterns cover the full range of agentic-AI use cases:

- ReAct: the Swiss-army knife (80% of cases)
- Plan-and-Execute: structured complex tasks
- ReWOO: maximum performance
- Multi-Agent: specialized collaboration
- Reflection: maximum quality

Implementation checklist:
- [ ] Pick the pattern that fits your use case
- [ ] Implement timeouts and limits
- [ ] Add logging/tracing
- [ ] Monitor LLM costs
- [ ] Test edge cases (loops, errors)
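For the last checklist item, the most important edge case is an agent that never converges. A self-contained sketch (the `StubAgent` and `run_bounded` names are illustrative, not from any library): a stub that never emits a final answer must hit the iteration guard instead of looping forever.

```python
class StubAgent:
    def step(self, query):
        return {"tool_used": True}  # never produces "final_answer"

def run_bounded(agent, query, max_iterations=5):
    for _ in range(max_iterations):
        result = agent.step(query)
        if result.get("final_answer"):
            return result["final_answer"]
    raise RuntimeError("Max iterations reached")

try:
    run_bounded(StubAgent(), "loop forever")
except RuntimeError as err:
    print(err)  # Max iterations reached
```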
Related articles:
- Claude Agent SDK: Building an Autonomous AI Agent
- MCP: A Standardized Protocol for Agentic AI
- Securing AI-Generated Code: A Checklist
- Optimizing LLM Costs in Production
Conclusion
Production-Ready: Monitoring and Safeguards

1. Limits and Timeouts

```python
from functools import wraps
import time

def with_timeout(seconds):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.time()
            result = func(*args, **kwargs)
            duration = time.time() - start
            if duration > seconds:
                raise TimeoutError(f"Execution exceeded {seconds}s")
            return result
        return wrapper
    return decorator

class SafeAgent:
    MAX_ITERATIONS = 15
    MAX_TOOL_CALLS = 50
    TIMEOUT_SECONDS = 120

    def __init__(self, base_agent):
        self.agent = base_agent
        self.iterations = 0
        self.tool_calls = 0

    @with_timeout(TIMEOUT_SECONDS)
    def run(self, query: str):
        self.iterations = 0
        self.tool_calls = 0

        while self.iterations < self.MAX_ITERATIONS:
            self.iterations += 1

            # Run one agent step
            result = self.agent.step(query)

            # Count tool calls
            if result.get("tool_used"):
                self.tool_calls += 1
                if self.tool_calls > self.MAX_TOOL_CALLS:
                    raise RuntimeError("Tool call limit exceeded")

            if result.get("final_answer"):
                return result["final_answer"]

        raise RuntimeError("Max iterations reached")
```

2. Logging and Observability

```python
import structlog
from opentelemetry import trace

logger = structlog.get_logger()
tracer = trace.get_tracer(__name__)

class ObservableAgent:
    def run(self, query: str):
        with tracer.start_as_current_span("agent_run") as span:
            span.set_attribute("query", query)
            logger.info("agent_started", query=query)
            try:
                result = self._execute(query)
                span.set_attribute("result", result)
                logger.info("agent_completed", result=result)
                return result
            except Exception as e:
                span.set_attribute("error", str(e))
                logger.error("agent_failed", error=str(e))
                raise
```

3. Cost Tracking

```python
class CostTracker:
    COSTS = {
        "gpt-4o": {"input": 0.005, "output": 0.015},  # per 1K tokens
        "gpt-4o-mini": {"input": 0.0015, "output": 0.006}
    }

    def __init__(self, model: str):
        self.model = model
        self.total_input_tokens = 0
        self.total_output_tokens = 0

    def track(self, input_tokens: int, output_tokens: int):
        self.total_input_tokens += input_tokens
        self.total_output_tokens += output_tokens

    def get_cost(self) -> float:
        costs = self.COSTS[self.model]
        input_cost = (self.total_input_tokens / 1000) * costs["input"]
        output_cost = (self.total_output_tokens / 1000) * costs["output"]
        return input_cost + output_cost

    def report(self):
        return {
            "model": self.model,
            "input_tokens": self.total_input_tokens,
            "output_tokens": self.total_output_tokens,
            "total_cost_usd": round(self.get_cost(), 4)
        }
```
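As a sanity check on the pricing math, here is a standalone version of the per-token cost formula (prices are per 1K tokens; the figures are the gpt-4o rates assumed in this article, so verify them against your provider's current pricing):

```python
def llm_cost(input_tokens: int, output_tokens: int,
             input_price: float, output_price: float) -> float:
    # Prices are expressed per 1K tokens.
    return (input_tokens / 1000) * input_price + (output_tokens / 1000) * output_price

# 1,000 input + 500 output tokens at the assumed gpt-4o rates:
print(round(llm_cost(1000, 500, 0.005, 0.015), 4))  # 0.0125
```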
Pattern Comparison

| Pattern | Complexity | Performance | Ideal Use Case |
| --------- | ----------- | ------------- | ---------------- |
| ReAct | ⭐⭐ | Medium | Support, simple Q&A |
| Plan-and-Execute | ⭐⭐⭐ | Medium | Multi-step tasks |
| ReWOO | ⭐⭐⭐ | High ✅ | Research-intensive |
| Multi-Agent | ⭐⭐⭐⭐ | Variable | Complex content |
| Reflection | ⭐⭐⭐ | Slow | Quality-critical |
Pattern 5: Reflection

Concept

Generate → Critique → Refine → Critique → ...

Implementation

```python
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

class ReflectionAgent:
    def __init__(self, llm):
        self.llm = llm

        # Generator prompt
        self.generator_prompt = PromptTemplate.from_template("""
Task: {task}

Previous attempt: {previous_attempt}
Critique: {critique}

Generate an improved response:
""")

        # Critic prompt
        self.critic_prompt = PromptTemplate.from_template("""
Evaluate the following response to the task:

Task: {task}
Response: {response}

Provide constructive criticism focusing on:
- Accuracy
- Completeness
- Clarity
- Potential improvements

If the response is satisfactory, respond with "APPROVED".
Otherwise, provide specific feedback.

Critique:
""")

        self.generator = LLMChain(llm=self.llm, prompt=self.generator_prompt)
        self.critic = LLMChain(llm=self.llm, prompt=self.critic_prompt)

    def run(self, task: str, max_iterations: int = 5) -> str:
        """Generate with iterative reflection."""
        attempt = ""
        critique = ""
        for i in range(max_iterations):
            print(f"\n🔄 Iteration {i+1}")

            # Generate a response
            attempt = self.generator.run(
                task=task,
                previous_attempt=attempt,
                critique=critique
            )
            print(f"📝 Attempt:\n{attempt[:200]}...")

            # Critique the response
            critique = self.critic.run(
                task=task,
                response=attempt
            )
            print(f"💭 Critique:\n{critique}")

            # Stop once approved
            if "APPROVED" in critique:
                print("\n✅ Response approved!")
                return attempt

        print("\n⚠️ Max iterations reached")
        return attempt
```

Example

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", temperature=0.7)
agent = ReflectionAgent(llm)

task = """
Explain quantum computing to a 10-year-old child.
"""

final_answer = agent.run(task, max_iterations=3)
```

Sample output:

🔄 Iteration 1
📝 Attempt:
Quantum computing uses quantum bits (qubits) which can be 0, 1, or both...
💭 Critique:
Too technical. Avoid jargon like "qubits". Use simpler analogies.

🔄 Iteration 2
📝 Attempt:
Imagine a magic coin that can be heads AND tails at the same time...
💭 Critique:
Good analogy! But explain why this is useful. Add practical example.

🔄 Iteration 3
📝 Attempt:
Imagine a magic coin that can be heads AND tails simultaneously. This lets
quantum computers solve certain problems much faster, like finding the best
route for a delivery truck visiting 100 cities...
💭 Critique:
APPROVED. Clear, age-appropriate, with practical example.
Pattern 4: Multi-Agent

AutoGen Architecture

```python
from autogen import AssistantAgent, UserProxyAgent, GroupChat, GroupChatManager

class MultiAgentSystem:
    def __init__(self):
        llm_config = {
            "model": "gpt-4o",
            "temperature": 0.7
        }

        # Agent 1: Researcher
        self.researcher = AssistantAgent(
            name="Researcher",
            system_message="""
            You are a research specialist. Your role is to gather information,
            search the web, and compile relevant data. Be thorough and cite sources.
            """,
            llm_config=llm_config
        )

        # Agent 2: Analyst
        self.analyst = AssistantAgent(
            name="Analyst",
            system_message="""
            You are a data analyst. Your role is to analyze information provided
            by the Researcher, identify patterns, and draw insights.
            """,
            llm_config=llm_config
        )

        # Agent 3: Writer
        self.writer = AssistantAgent(
            name="Writer",
            system_message="""
            You are a technical writer. Your role is to synthesize research and
            analysis into clear, well-structured content. Focus on clarity.
            """,
            llm_config=llm_config
        )

        # Agent 4: Critic
        self.critic = AssistantAgent(
            name="Critic",
            system_message="""
            You are a quality reviewer. Your role is to review the Writer's output,
            identify errors, suggest improvements, and ensure accuracy.
            """,
            llm_config=llm_config
        )

        # User proxy (orchestrator)
        self.user_proxy = UserProxyAgent(
            name="User",
            human_input_mode="NEVER",
            max_consecutive_auto_reply=0
        )

        # Group chat
        self.group_chat = GroupChat(
            agents=[self.researcher, self.analyst, self.writer, self.critic, self.user_proxy],
            messages=[],
            max_round=10
        )

        self.manager = GroupChatManager(
            groupchat=self.group_chat,
            llm_config=llm_config
        )

    def run(self, task: str) -> str:
        """Run a task through multi-agent collaboration."""
        self.user_proxy.initiate_chat(
            self.manager,
            message=task
        )

        # Retrieve the final result
        last_message = self.group_chat.messages[-1]
        return last_message["content"]
```

Example usage

```python
system = MultiAgentSystem()

task = """
Write a comprehensive report on the impact of AI on software development in 2026.
Include statistics, trends, and future predictions.
"""

result = system.run(task)
```

Typical conversation:

User: Write a report on AI impact on software development in 2026

Researcher: I'll gather data on AI adoption, developer productivity metrics,
and recent studies... [Searches web, compiles sources]

Analyst: Based on the research, I observe 3 key trends:
1. 73% of developers use AI daily (up from 45% in 2024)
2. Code completion tools reduce development time by 35%
3. AI code review reduces bugs by 28%

Writer: [Drafts report synthesizing research and analysis]

Critic: The report is solid, but I suggest:
- Add concrete examples of AI tools
- Clarify the 35% productivity metric
- Include potential risks/limitations

Writer: [Revises report based on feedback]
Pattern 3: ReWOO (Reasoning WithOut Observation)

Architecture

```python
import asyncio
from typing import List, Tuple

class ReWOOAgent:
    def __init__(self, llm, tools):
        self.llm = llm
        self.tools = tools

    async def run(self, query: str) -> str:
        # 1. Planner: generate all actions at once
        plan = await self.plan(query)
        print(f"📋 Plan: {plan}")

        # 2. Worker: execute all actions in parallel
        observations = await self.execute_parallel(plan)
        print(f"👁️ Observations: {observations}")

        # 3. Solver: synthesize the answer
        answer = await self.solve(query, plan, observations)
        return answer

    async def plan(self, query: str) -> List[Tuple[str, str]]:
        """Generate the full plan (list of actions)."""
        prompt = f"""
Given the query: {query}

Plan all necessary actions upfront using these tools: {[t.name for t in self.tools]}

Format:
#E1 = Tool[input]
#E2 = Tool[input using #E1]
...

Plan:
"""
        response = await self.llm.ainvoke(prompt)
        return self.parse_plan(response.content)

    def parse_plan(self, plan_text: str) -> List[Tuple[str, str]]:
        """Parse the plan into a list of (tool, input) pairs."""
        actions = []
        for line in plan_text.split("\n"):
            if "=" in line and "[" in line:
                # e.g. #E1 = Search[Paris population]
                tool_part = line.split("=")[1].strip()
                tool_name = tool_part.split("[")[0].strip()
                tool_input = tool_part.split("[")[1].split("]")[0]
                actions.append((tool_name, tool_input))
        return actions

    async def execute_parallel(self, plan: List[Tuple[str, str]]) -> List[str]:
        """Run all actions in parallel."""
        tasks = []
        for tool_name, tool_input in plan:
            tool = next((t for t in self.tools if t.name == tool_name), None)
            if tool:
                # Create an async task
                task = asyncio.create_task(self.execute_tool(tool, tool_input))
                tasks.append(task)

        # Wait for all tasks
        results = await asyncio.gather(*tasks)
        return results

    async def execute_tool(self, tool, input_str: str) -> str:
        """Run a single tool (async)."""
        try:
            # If the tool is already async
            if asyncio.iscoroutinefunction(tool.func):
                return await tool.func(input_str)
            else:
                # Wrap sync → async
                loop = asyncio.get_event_loop()
                return await loop.run_in_executor(None, tool.func, input_str)
        except Exception as e:
            return f"Error: {str(e)}"

    async def solve(self, query: str, plan: List, observations: List[str]) -> str:
        """Synthesize the final answer."""
        prompt = f"""
Query: {query}

Plan executed:
{plan}

Observations:
{observations}

Provide the final answer:
"""
        response = await self.llm.ainvoke(prompt)
        return response.content
```

Example usage

```python
from langchain.agents import Tool
from langchain_community.tools import DuckDuckGoSearchRun
from langchain_openai import ChatOpenAI

async def main():
    llm = ChatOpenAI(model="gpt-4o")
    tools = [
        Tool(name="Search", func=DuckDuckGoSearchRun().run, description="Web search"),
        Tool(name="Calculator", func=lambda x: str(eval(x)), description="Math")  # demo only
    ]

    agent = ReWOOAgent(llm, tools)
    answer = await agent.run("What is the population of Tokyo times 3?")
    print(f"\n✅ Answer: {answer}")

asyncio.run(main())
```
Pattern 2: Plan-and-Execute

Architecture

```python
from typing import List, Dict
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

class PlanAndExecuteAgent:
    def __init__(self, llm, tools):
        self.llm = llm
        self.tools = {tool.name: tool for tool in tools}

        # Planner prompt
        self.planner_prompt = PromptTemplate.from_template("""
You are a planning agent. Given a user query, create a step-by-step plan to solve it.

Available tools: {tool_names}

Query: {query}

Create a numbered plan with specific actions:
1. [Action] Description
2. [Action] Description
...

Plan:
""")

        # Executor prompt
        self.executor_prompt = PromptTemplate.from_template("""
Execute the following step using the available tools.

Step: {step}
Tools: {tool_names}
Previous results: {previous_results}

Execute the step and return the result.
""")

        self.planner = LLMChain(llm=self.llm, prompt=self.planner_prompt)
        self.executor = LLMChain(llm=self.llm, prompt=self.executor_prompt)

    def run(self, query: str) -> str:
        # 1. Generate the plan
        plan = self.planner.run(
            query=query,
            tool_names=", ".join(self.tools.keys())
        )
        print(f"📋 Plan:\n{plan}\n")

        # 2. Parse the plan
        steps = self.parse_plan(plan)

        # 3. Execute each step
        results = []
        for i, step in enumerate(steps, 1):
            print(f"⚙️ Executing step {i}: {step}")

            # Extract the tool to use
            tool_name, tool_input = self.parse_step(step)

            if tool_name in self.tools:
                # Run the tool
                result = self.tools[tool_name].func(tool_input)
                results.append({
                    "step": step,
                    "result": result
                })
                print(f"✅ Result: {result}\n")
            else:
                # Fall back to the LLM to interpret the step
                result = self.executor.run(
                    step=step,
                    tool_names=", ".join(self.tools.keys()),
                    previous_results=results
                )
                results.append({"step": step, "result": result})

        # 4. Final synthesis
        final_answer = self.synthesize(query, results)
        return final_answer

    def parse_plan(self, plan: str) -> List[str]:
        """Parse the plan into steps."""
        lines = plan.strip().split("\n")
        steps = [line.split(". ", 1)[1] if ". " in line else line
                 for line in lines if line.strip()]
        return steps

    def parse_step(self, step: str) -> tuple:
        """Extract the tool and input from a step."""
        # Example: "[Search] Find population of Paris"
        if "[" in step and "]" in step:
            tool = step.split("[")[1].split("]")[0]
            tool_input = step.split("]")[1].strip()
            return tool, tool_input
        return None, step

    def synthesize(self, query: str, results: List[Dict]) -> str:
        """Synthesize the results into a final answer."""
        synthesis_prompt = PromptTemplate.from_template("""
Based on the following execution results, provide a final answer to the query.

Query: {query}

Results:
{results}

Final Answer:
""")

        results_text = "\n".join([
            f"Step: {r['step']}\nResult: {r['result']}\n"
            for r in results
        ])

        chain = LLMChain(llm=self.llm, prompt=synthesis_prompt)
        return chain.run(query=query, results=results_text)
```

Example usage

```python
from langchain.agents import Tool
from langchain_community.tools import DuckDuckGoSearchRun
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", temperature=0)

tools = [
    Tool(name="Search", func=DuckDuckGoSearchRun().run, description="Web search"),
    Tool(name="Calculator", func=lambda x: str(eval(x)), description="Math calculations")  # demo only
]

agent = PlanAndExecuteAgent(llm, tools)

query = """
Compare the GDP of the top 3 economies in 2025 and calculate the total.
"""

answer = agent.run(query)
```

Output:

📋 Plan:
1. [Search] Find GDP of USA in 2025
2. [Search] Find GDP of China in 2025
3. [Search] Find GDP of Japan in 2025
4. [Calculator] Sum the three GDPs

⚙️ Executing step 1: [Search] Find GDP of USA in 2025
✅ Result: USA GDP in 2025 is approximately $28.7 trillion

⚙️ Executing step 2: [Search] Find GDP of China in 2025
✅ Result: China GDP in 2025 is approximately $19.4 trillion

⚙️ Executing step 3: [Search] Find GDP of Japan in 2025
✅ Result: Japan GDP in 2025 is approximately $4.2 trillion

⚙️ Executing step 4: [Calculator] Sum the three GDPs
✅ Result: 52.3

📊 Final Answer:
The top 3 economies in 2025 are USA ($28.7T), China ($19.4T), and Japan ($4.2T).
Pattern 1: ReAct (Reason + Act)

Concept

Thought → Action → Observation → Thought → Action → ...

Architecture

```
┌─────────────────────────────────────┐
│             ReAct Agent             │
├─────────────────────────────────────┤
│  ┌───────────────┐                  │
│  │  User Query   │                  │
│  └───────┬───────┘                  │
│          ↓                          │
│  ┌───────────────┐                  │
│  │    Thought    │ "I need to..."   │
│  └───────┬───────┘                  │
│          ↓                          │
│  ┌───────────────┐                  │
│  │    Action     │ search("query")  │
│  └───────┬───────┘                  │
│          ↓                          │
│  ┌───────────────┐                  │
│  │  Observation  │ Results: [...]   │
│  └───────┬───────┘                  │
│          ↓                          │
│  ┌───────────────┐                  │
│  │    Thought    │ "Now I need..."  │
│  └───────┬───────┘                  │
│          ↓                          │
│         ...                         │
│          ↓                          │
│  ┌───────────────┐                  │
│  │ Final Answer  │                  │
│  └───────────────┘                  │
└─────────────────────────────────────┘
```

Implementation

```python
from langchain.agents import Tool, AgentExecutor, create_react_agent
from langchain_openai import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain_community.tools import DuckDuckGoSearchRun

class ReActAgent:
    def __init__(self, model_name="gpt-4o"):
        self.llm = ChatOpenAI(model=model_name, temperature=0)

        # Available tools
        self.tools = [
            Tool(
                name="Search",
                func=DuckDuckGoSearchRun().run,
                description="Useful for searching information on the web. Input should be a search query."
            ),
            Tool(
                name="Calculator",
                func=self.calculator,
                description="Useful for mathematical calculations. Input should be a mathematical expression."
            ),
            Tool(
                name="Wikipedia",
                func=self.wikipedia_search,
                description="Useful for getting detailed information about topics. Input should be a topic name."
            )
        ]

        # ReAct prompt
        self.prompt = PromptTemplate.from_template("""
Answer the following questions as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought: {agent_scratchpad}
""")

        # Create the agent
        self.agent = create_react_agent(
            llm=self.llm,
            tools=self.tools,
            prompt=self.prompt
        )

        # Executor with safeguards
        self.executor = AgentExecutor(
            agent=self.agent,
            tools=self.tools,
            verbose=True,
            max_iterations=10,       # Guards against infinite loops
            max_execution_time=60,   # 60s timeout
            handle_parsing_errors=True
        )

    def calculator(self, expression: str) -> str:
        """Safely evaluate a mathematical expression."""
        try:
            # Whitelist of allowed functions
            allowed = {
                'abs': abs, 'round': round, 'min': min, 'max': max,
                'sum': sum, 'pow': pow
            }
            result = eval(expression, {"__builtins__": {}}, allowed)
            return str(result)
        except Exception as e:
            return f"Error: {str(e)}"

    def wikipedia_search(self, query: str) -> str:
        """Search Wikipedia."""
        import wikipedia
        try:
            return wikipedia.summary(query, sentences=3)
        except Exception as e:
            return f"Error: {str(e)}"

    def run(self, query: str) -> str:
        """Run the agent on a query."""
        result = self.executor.invoke({"input": query})
        return result["output"]
```

Example usage

```python
agent = ReActAgent()

query = """
What is the population of the capital of France multiplied by 2?
"""

answer = agent.run(query)
```

Execution trace:

Thought: I need to find the capital of France first
Action: Wikipedia
Action Input: France capital
Observation: Paris is the capital and largest city of France...

Thought: Now I need to find the population of Paris
Action: Search
Action Input: population of Paris 2026
Observation: The population of Paris is approximately 2.2 million...

Thought: Now I need to multiply by 2
Action: Calculator
Action Input: 2.2 * 2
Observation: 4.4

Thought: I now know the final answer
Agentic AI turns LLMs into reliable autonomous assistants. The future is no longer reactive; it is proactive.

Which pattern will you implement first?