MAS Decision Gate
Meta-Rule
If a single agent can do the job, do not use MAS. Multi-agent systems are organizational scaling tools, not capability multipliers by default. Research shows 80% of AI projects fail due to premature architectural complexity.
12 Factor Agents Perspective (Factor 10)
The 12 Factor Agents framework reinforces this principle:
Factor 10: Small Focused Agents
Smaller focused prompts with controlled context always beat long autonomous runs.
This applies at two levels:
- Single vs Multi-Agent: Start with single agent
- Within Multi-Agent: Each agent should be small and focused
The Progression:
Level 0: Deterministic workflow (no agent)
↓ Only if judgment needed
Level 1: Single focused agent
↓ Only if tools needed
Level 2: Single agent with tools
↓ Only if verification critical
Level 3: Minimal MAS (Planner→Executor→Verifier)
↓ Only if multiple domains
Level 4: Full MAS (when justified by evidence)
Only advance levels when evidence supports it. Most tasks belong at Level 0-2.
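As a rough illustration, the ladder above can be expressed as a function that climbs one level at a time and stops at the first condition that is not met. The `TaskProfile` fields and the `recommend_level` name are hypothetical, chosen only to mirror the four "Only if" conditions. This is a sketch, not part of the 12 Factor Agents framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskProfile:
    """Hypothetical answers to the four 'Only if ...' conditions in the ladder."""
    needs_judgment: bool
    needs_tools: bool
    verification_critical: bool
    multiple_domains: bool


def recommend_level(task: TaskProfile) -> int:
    """Return the lowest level (0-4) whose entry condition the task still meets."""
    if not task.needs_judgment:
        return 0  # deterministic workflow, no agent
    if not task.needs_tools:
        return 1  # single focused agent
    if not task.verification_critical:
        return 2  # single agent with tools
    if not task.multiple_domains:
        return 3  # minimal MAS: Planner -> Executor -> Verifier
    return 4      # full MAS, still subject to evidence


# A task needing judgment and tools, but no critical verification, stays at Level 2.
print(recommend_level(TaskProfile(True, True, False, False)))
```

Because the checks run in order, a later condition cannot skip a level: a multi-domain task that needs no judgment still lands at Level 0.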
Decision Criteria
Use MAS only if at least one is true:
- Natural decomposition: Tasks split into semi-independent roles
- Parallel benefit: Concurrent reasoning materially reduces latency (≥40%)
- Distinct world models: Agents need different knowledge bases or incentives
- Internal verification: Long-horizon work requires checks and balances
If none apply → build a single agent with structured tools.
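The gate above can be sketched as a small check that returns the criteria a task satisfies; an empty result means "single agent with structured tools". The function and parameter names are illustrative assumptions, and the latency reduction is assumed to be supplied as a fraction (0.40 for 40%).

```python
from typing import List

PARALLEL_BENEFIT_THRESHOLD = 0.40  # the >=40% latency reduction named in the criteria


def mas_reasons(
    natural_decomposition: bool,
    latency_reduction: float,
    distinct_world_models: bool,
    internal_verification: bool,
) -> List[str]:
    """Return the decision criteria that hold; an empty list means none apply."""
    reasons = []
    if natural_decomposition:
        reasons.append("natural decomposition")
    if latency_reduction >= PARALLEL_BENEFIT_THRESHOLD:
        reasons.append("parallel benefit")
    if distinct_world_models:
        reasons.append("distinct world models")
    if internal_verification:
        reasons.append("internal verification")
    return reasons


def choose_architecture(**criteria) -> str:
    """Apply the gate: MAS only if at least one criterion is true."""
    return "multi-agent" if mas_reasons(**criteria) else "single agent with structured tools"
```

Returning the list of reasons rather than a bare boolean keeps the justification available for the decision record.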
Quantitative Thresholds
Deploy Single-Agent When:
| Factor | Threshold |
|---|---|
| Domain complexity | < 3 distinct domains |
| Reasoning steps | < 10 required steps |
| Context needs | < 8K tokens |
| Parallel execution | Not required |
| Budget | Tight constraints (MAS costs 2-4x more) |
| Team expertise | Limited distributed systems experience |
Deploy Multi-Agent When:
| Factor | Threshold |
|---|---|
| Domain complexity | ≥3 distinct domains requiring different expertise |
| Parallel benefit | Reduces latency by ≥40% |
| Verification needs | Long-horizon tasks requiring internal checkpoints |
| Model specialization | Need for expert ensembling (code + security, etc.) |
| Concurrency benefit | Outweighs coordination costs |
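The measurable rows of the single-agent table can be checked mechanically. The sketch below assumes "8K tokens" means 8,000 tokens and leaves out the qualitative rows (budget, team expertise), which need human judgment. `Workload` and `single_agent_violations` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Workload:
    """Hypothetical measurements for the quantitative rows of the table."""
    distinct_domains: int
    reasoning_steps: int
    context_tokens: int
    needs_parallel_execution: bool


def single_agent_violations(w: Workload) -> List[str]:
    """List the single-agent thresholds this workload exceeds (empty = within limits)."""
    violations = []
    if w.distinct_domains >= 3:
        violations.append("domain complexity: 3 or more distinct domains")
    if w.reasoning_steps >= 10:
        violations.append("reasoning steps: 10 or more required steps")
    if w.context_tokens >= 8_000:  # assumption: 8K read as 8,000
        violations.append("context needs: 8K tokens or more")
    if w.needs_parallel_execution:
        violations.append("parallel execution required")
    return violations
```

An empty list means the workload sits inside every measurable single-agent threshold. A non-empty list is a prompt to look at the multi-agent table, not an automatic decision.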
Decision Questions
To determine whether MAS is appropriate, answer these questions:
Question 1: Can a single agent complete this task?
- If yes with reasonable quality → use single agent
- If struggling with scope/quality → consider MAS
Question 2: What distinct expertise areas are needed?
- Count domains requiring specialized knowledge
- 1-2 domains → single agent with tools
- 3+ domains → MAS may be justified
Question 3: Are subtasks truly independent?
- Can work proceed in parallel without dependencies?
- Yes → MAS provides latency benefit
- No → MAS adds coordination overhead without benefit
Question 4: Is internal verification critical?
- Would self-checking be insufficient?
- Do outputs need adversarial review?
- Yes → MAS with separate verifier agent
Question 5: What is the failure cost?
- Low-stakes task → prefer simplicity (single agent)
- High-stakes task → MAS verification may justify complexity
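For Question 3, the ≥40% parallel-benefit threshold can be estimated before building anything. The sketch below rests on a simplifying assumption that is not from the source: sequential latency is the sum of subtask durations, and parallel latency is the slowest subtask plus a fixed coordination overhead.

```python
from typing import Sequence


def latency_reduction(subtask_seconds: Sequence[float], coordination_overhead: float = 0.0) -> float:
    """Fractional latency saved by running independent subtasks concurrently.

    Assumed model: sequential = sum of durations,
    parallel = slowest subtask + coordination overhead.
    """
    if not subtask_seconds:
        raise ValueError("at least one subtask duration is required")
    sequential = sum(subtask_seconds)
    if sequential <= 0:
        raise ValueError("total duration must be positive")
    parallel = max(subtask_seconds) + coordination_overhead
    return (sequential - parallel) / sequential


# Three balanced subtasks: 100 s sequential vs 40 s + 10 s overhead = 50 s parallel.
balanced = latency_reduction([30, 30, 40], coordination_overhead=10)

# One dominant subtask: 100 s sequential vs 90 s + 5 s overhead = 95 s parallel.
skewed = latency_reduction([90, 10], coordination_overhead=5)

print(balanced >= 0.40, skewed >= 0.40)
```

Under this model the balanced split clears the threshold and the skewed one does not, which shows that parallelism pays off only when no single subtask dominates the total.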
Simplicity Test
Before building any agent system, answer these sanity-check questions:
Core Questions
- Could this just be a deterministic workflow or cron job?
  - If yes → use traditional automation, not agents
- Where does uncertainty or judgment actually exist?
  - If nowhere → scripted workflow is sufficient
  - If bounded → single agent with tools
  - If distributed across domains → MAS may be justified
- What would happen if the agent vanished tomorrow? Could you survive?
  - If operations stop → high value, proceed carefully
  - If minor inconvenience → question the investment
- What's the simplest version that would provide value?
  - Build that first, then add complexity only when evidence supports it
12 Factor Simplicity Questions
In addition to the core questions, apply these 12 Factor Agents checks:
- Could you own the control flow with code? (Factor 8)
  - If yes → code + single agent likely suffices
  - If no → MAS may help distribute complexity
  - Key insight: Code-controlled DAG beats LLM-controlled DAG
- Can state be modeled as (state, event) → new_state? (Factor 12)
  - If yes → cleaner architecture possible with reducers
  - If no → complexity is intrinsic, MAS may help
  - Key insight: Stateless reducers enable debugging and replay
- Is context building well-understood? (Factor 3)
  - If yes → single agent with explicit context
  - If no → need to understand context before adding agents
  - Key insight: Context engineering is the core of agent quality
- At what level does the task belong? (Factor 10)
  - Level 0: Deterministic (script/workflow)
  - Level 1-2: Single agent (with or without tools)
  - Level 3-4: Multi-agent (only if justified)
  - Key insight: Start simple, add agents only when evidence supports it
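To make the (state, event) → new_state question concrete, here is a minimal reducer sketch. The `RunState` fields and event names are hypothetical. The point is only that a pure step function lets any run be reconstructed by replaying its event log, which is what makes debugging and replay possible.

```python
from dataclasses import dataclass, replace
from functools import reduce
from typing import Dict, Iterable


@dataclass(frozen=True)
class RunState:
    """Hypothetical agent-run state; immutable so each step returns a new value."""
    status: str = "pending"
    attempts: int = 0


def step(state: RunState, event: Dict[str, str]) -> RunState:
    """Pure reducer: (state, event) -> new_state, with no I/O and no hidden state."""
    kind = event["type"]
    if kind == "started":
        return replace(state, status="running", attempts=state.attempts + 1)
    if kind == "failed":
        return replace(state, status="failed")
    if kind == "succeeded":
        return replace(state, status="done")
    return state  # unknown events leave the state unchanged


def replay(events: Iterable[Dict[str, str]]) -> RunState:
    """Rebuild the final state from an event log by folding step over it."""
    return reduce(step, events, RunState())


log = [{"type": "started"}, {"type": "failed"}, {"type": "started"}, {"type": "succeeded"}]
final = replay(log)  # RunState(status='done', attempts=2)
```

If the state transitions of a task fit this shape, the control flow can usually stay in ordinary code, with the agent called only at the steps that need judgment.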
Red Flags (Agent May Be Overkill)
| Flag | Implication |
|---|---|
| Task can be fully specified with if/then rules | Use deterministic code |
| No variability in inputs or required responses | Use templates/scripts |
| Human oversight would be faster than building | Skip the agent |
| The "intelligence" needed is just API orchestration | Use workflow automation |
Green Flags (Agent Justified)
| Flag | Implication |
|---|---|
| Genuine ambiguity in how to respond | Agent reasoning needed |
| Need to adapt to novel situations | Learning/flexibility required |
| Complex reasoning across multiple inputs | Agent synthesis valuable |
| Learning from feedback improves outcomes | Agent adaptation worthwhile |
Simplicity Test Output
Document the simplicity assessment:
## Simplicity Assessment
**Task**: [Brief description]
**Deterministic alternative?**: [Yes/No - what would it look like?]
**Where is judgment needed?**: [Specific points]
**If agent vanished?**: [Impact assessment]
**Minimum viable version**: [Description]
**Conclusion**: [Proceed with agent / Use simpler alternative]
Common Anti-Patterns
Anti-Pattern: MAS for Capability
Wrong: "Multiple agents will be smarter than one"
Reality: Coordination overhead often exceeds capability gains. ChatDev has been reported at 25% correctness, and failure rates of 60-87% have been reported across frameworks.
Anti-Pattern: Premature Decomposition
Wrong: "Let's split this into 5 agents for better organization"
Reality: Every agent boundary introduces failure points (specification, alignment, verification). Start simple, add agents only when evidence supports it.
Anti-Pattern: Personality-Based Splitting
Wrong: "Creative agent, analytical agent, careful agent"
Reality: Split by functional orthogonality, not personality. Use roles such as Planner, Executor, and Verifier, not "smart" vs "creative."
Specialist vs Generalist
2026 consensus strongly favors specialists:
Why specialists win:
- 40-60% fewer tokens for domain tasks
- Higher accuracy in specialized domains
- Clearer audit trails and governance
- Reduced computational waste
Architecture pattern: Specialist agents orchestrated by a coordinator that handles delegation.
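A minimal sketch of that pattern is shown below, with plain callables standing in for LLM-backed specialists. `Coordinator`, `register`, and `delegate` are illustrative names, not an API from any framework. Routing happens in ordinary code, so delegation stays auditable.

```python
from typing import Callable, Dict

# A specialist is modeled as a function from task text to result text.
Specialist = Callable[[str], str]


class Coordinator:
    """Routes each task to the specialist registered for its domain."""

    def __init__(self) -> None:
        self._specialists: Dict[str, Specialist] = {}

    def register(self, domain: str, specialist: Specialist) -> None:
        self._specialists[domain] = specialist

    def delegate(self, domain: str, task: str) -> str:
        specialist = self._specialists.get(domain)
        if specialist is None:
            # Fail loudly rather than letting a generalist guess.
            raise ValueError(f"no specialist registered for domain {domain!r}")
        return specialist(task)


# Stub specialists; in a real system each would wrap a small, focused prompt.
coordinator = Coordinator()
coordinator.register("code", lambda task: f"[code] {task}")
coordinator.register("security", lambda task: f"[security] {task}")

result = coordinator.delegate("security", "review auth flow")
```

Keeping the coordinator free of domain logic means that adding a specialist is a single registration call, not a change to the routing code.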
Decision Output Format
After analysis, document the decision:
## MAS Decision
**Task**: [Brief description]
**Decision**: [Single-Agent / Multi-Agent]
**Rationale**:
- Domain count: [X] domains
- Parallel benefit: [Yes/No - expected %]
- Verification need: [Low/Medium/High]
- Failure cost: [Low/Medium/High]
**If Multi-Agent, justify each agent**:
- Agent 1: [Role] - [Why separate agent needed]
- Agent 2: [Role] - [Why separate agent needed]
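If decisions are recorded programmatically, the template above can be rendered from a small record type. This is a sketch under assumptions: the `MasDecision` fields simply mirror the template's slots, and the rule that a Multi-Agent decision must list its agents is one way to enforce the "justify each agent" requirement.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class MasDecision:
    """Hypothetical record whose fields mirror the slots of the decision template."""
    task: str
    decision: str            # "Single-Agent" or "Multi-Agent"
    domain_count: int
    parallel_benefit: str    # e.g. "Yes - 45%" or "No"
    verification_need: str   # Low / Medium / High
    failure_cost: str        # Low / Medium / High
    agents: List[Tuple[str, str]] = field(default_factory=list)  # (role, justification)


def render(d: MasDecision) -> str:
    """Render the record in the document's MAS Decision format."""
    if d.decision == "Multi-Agent" and not d.agents:
        raise ValueError("a Multi-Agent decision must justify each agent")
    lines = [
        "## MAS Decision",
        f"**Task**: {d.task}",
        f"**Decision**: {d.decision}",
        "**Rationale**:",
        f"- Domain count: {d.domain_count} domains",
        f"- Parallel benefit: {d.parallel_benefit}",
        f"- Verification need: {d.verification_need}",
        f"- Failure cost: {d.failure_cost}",
    ]
    if d.agents:
        lines.append("**If Multi-Agent, justify each agent**:")
        for i, (role, why) in enumerate(d.agents, start=1):
            lines.append(f"- Agent {i}: {role} - {why}")
    return "\n".join(lines)
```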
Additional Resources
Reference Files
For detailed decision frameworks and evidence:
- references/evidence.md - Research data supporting thresholds
- references/decision-tree.md - Step-by-step decision flowchart
- ../agent-specification/references/twelve-factor-agents.md - Quick reference for all 12 factors
Related Skills
After deciding on MAS, use:
- agent-specification - For writing proper agent specs (Factors 1, 2, 4, 7)
- coordination-patterns - For choosing architecture (Factors 3, 5/6, 8, 12)
- production-readiness - For cost/observability planning (Factors 9, 11)
