
Tree-of-Thought

Explore multiple reasoning branches in a tree structure, enabling the LLM to evaluate and backtrack on different solution paths.

Tags: tree-of-thought, tot, reasoning, search, backtracking, planning

Overview

Tree-of-Thought (ToT) extends Chain-of-Thought by exploring multiple reasoning branches at each step, evaluating the promise of each branch, and backtracking when a path leads to a dead end. Where CoT follows a single linear chain, ToT explores a tree, enabling search-like reasoning over a space of possible solutions.

When to Use

  • Planning and puzzle problems (e.g., Game of 24, crosswords)
  • Tasks requiring exploration and backtracking
  • Creative tasks with multiple valid approaches to explore
  • Problems where the first attempt often fails and revision is needed
  • Complex multi-step reasoning where early mistakes compound

Architecture

flowchart TB
    P[Problem] --> T1[Thought 1a]
    P --> T2[Thought 1b]
    P --> T3[Thought 1c]
    
    T1 --> E1{Evaluate}
    T2 --> E2{Evaluate}
    T3 --> E3{Evaluate}
    
    E1 -->|promising| T1a[Thought 2a]
    E1 -->|promising| T1b[Thought 2b]
    E2 -->|dead end| X1[✗ Pruned]
    E3 -->|promising| T3a[Thought 2c]
    
    T1a --> E4{Evaluate}
    T1b --> E5{Evaluate}
    T3a --> E6{Evaluate}
    
    E4 -->|solution| S[✓ Solution Found]
    E5 -->|dead end| X2[✗ Pruned]
    E6 -->|continue| T3a1[...]
    
    style X1 fill:#1c2128,stroke:#f85149,color:#f85149
    style X2 fill:#1c2128,stroke:#f85149,color:#f85149
    style S fill:#1c2128,stroke:#3fb950,color:#3fb950

Components

| Component | Purpose | Example |
| --- | --- | --- |
| Thought Generator | Propose next steps from the current state | Generate 2-3 candidate next moves |
| State Evaluator | Rate how promising a partial solution is | "Score 1-10: how close to solution?" |
| Search Strategy | Decide which branches to explore | BFS (breadth-first) or DFS (depth-first) |
| Backtracking | Abandon unpromising paths | Prune branches below threshold |
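The four components above can be expressed as a small set of interfaces. This is a sketch, not a library API: the `llm` callable, function names, and prompt wording are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Node:
    """A partial solution: the chain of thoughts taken so far."""
    thoughts: List[str] = field(default_factory=list)
    score: float = 0.0

def generate_thoughts(llm: Callable[[str], str], node: Node, k: int = 3) -> List[str]:
    """Thought Generator: propose k candidate next steps from the current state."""
    prompt = ("Steps so far:\n" + "\n".join(node.thoughts)
              + f"\nPropose {k} distinct next steps, one per line.")
    return llm(prompt).strip().splitlines()[:k]

def evaluate_state(llm: Callable[[str], str], node: Node) -> float:
    """State Evaluator: rate how promising a partial solution is (0-10)."""
    prompt = "Rate 0-10 how close this is to a solution:\n" + "\n".join(node.thoughts)
    try:
        return float(llm(prompt).strip())
    except ValueError:
        return 0.0  # an unparseable rating counts as unpromising

def prune(frontier: List[Node], threshold: float) -> List[Node]:
    """Backtracking: abandon branches scoring below the threshold."""
    return [n for n in frontier if n.score >= threshold]
```

A search strategy (BFS, DFS, or beam search) then decides the order in which these pieces are called.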

Implementation

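Below is a minimal, runnable sketch of breadth-first ToT with beam-style pruning. To keep it self-contained and testable, it uses a deterministic toy state space (reach a target number from a start value via +1 or *2 moves) in place of LLM calls; in a real system, thought generation and state evaluation would each be an LLM call. All names here are illustrative, not a standard API.

```python
import heapq
from typing import List, Tuple

def tot_search(start: int, target: int, beam_width: int = 3, max_depth: int = 8) -> List[str]:
    """Breadth-first Tree-of-Thought over a toy state space.

    Each 'thought' applies +1 or *2 to the current value; the evaluator
    scores states by closeness to the target; unpromising branches are
    pruned by keeping only the top `beam_width` states per level.
    """
    # The frontier holds (state_value, path_of_thoughts) pairs.
    frontier: List[Tuple[int, List[str]]] = [(start, [])]
    for _ in range(max_depth):
        candidates: List[Tuple[int, List[str]]] = []
        for value, path in frontier:
            if value == target:
                return path  # solution found
            # Thought Generator: propose successor states.
            for op_name, new_value in (("+1", value + 1), ("*2", value * 2)):
                if new_value <= target:  # prune obvious dead ends (overshoot)
                    candidates.append((new_value, path + [op_name]))
        # State Evaluator + pruning: keep the beam_width closest states.
        frontier = heapq.nlargest(beam_width, candidates,
                                  key=lambda c: -abs(target - c[0]))
    # Final check of the remaining frontier.
    for value, path in frontier:
        if value == target:
            return path
    return []  # no solution within max_depth
```

For example, `tot_search(1, 10)` returns a sequence of `"+1"`/`"*2"` moves that transforms 1 into 10, having pruned branches that overshoot the target along the way.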

Gotchas & Best Practices

🚨 Extremely High Cost

ToT needs at least one generation call and one evaluation call per node. A full tree with branching factor b and depth d contains b + b^2 + ... + b^d nodes; with branching factor 3 and depth 3 that is up to 39 nodes, so dozens of LLM calls for a single problem. Only use ToT for high-value tasks that justify the cost.
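The node count is a simple geometric sum, which makes it easy to budget before committing to a configuration (the helper name is illustrative):

```python
def tot_node_count(branching: int, depth: int) -> int:
    """Total nodes in a full ToT tree: b + b^2 + ... + b^d.

    Each node needs at least one LLM call to generate it and typically
    another to evaluate it, so real call counts are a small multiple
    of this number.
    """
    return sum(branching ** level for level in range(1, depth + 1))
```

With branching factor 3 and depth 3, `tot_node_count(3, 3)` gives 39; bumping the branching factor to 4 already more than doubles it.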

⚠️ Evaluation Quality Is Critical

The whole approach depends on the evaluator accurately scoring partial solutions. If the evaluator is unreliable, the search expands the wrong branches and prunes the right ones.

πŸ’‘ Start with Self-Consistency

For most tasks, Self-Consistency (sample N answers and take the majority) captures most of the benefit at a fraction of the complexity. Only upgrade to full ToT when you genuinely need backtracking and structured exploration.
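For comparison, self-consistency is just sample-and-vote, with no tree, evaluator, or backtracking. A sketch, assuming an `llm` callable that returns a fresh sampled answer on each call:

```python
from collections import Counter
from typing import Callable

def self_consistency(llm: Callable[[str], str], prompt: str, n: int = 5) -> str:
    """Sample n independent answers and return the majority answer.

    Works best on tasks with a single short, verifiable final answer,
    since voting requires answers to match exactly after normalization.
    """
    answers = [llm(prompt).strip() for _ in range(n)]
    most_common, _count = Counter(answers).most_common(1)[0]
    return most_common
```

Total cost is exactly n calls, versus the geometric blow-up of a full tree.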

πŸ’‘ DFS for Memory, BFS for Quality

Depth-first search uses less memory and can reach a solution sooner; breadth-first evaluates more candidates per level and tends to find better solutions. Choose based on which constraint binds.

Variations

  • BFS-ToT – Explore breadth-first, evaluate all branches at each level
  • DFS-ToT – Depth-first with backtracking
  • Beam Search – Keep top-K branches at each level
  • MCTS-ToT – Monte Carlo Tree Search for exploration/exploitation balance
  • Graph-of-Thought – Allow merging of branches (DAG instead of tree)
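The DFS-ToT variant makes backtracking explicit: explore one branch at a time, and return to the parent when a branch is pruned or exhausted. A sketch with pluggable callbacks standing in for the LLM-backed generator and evaluator (all parameter names are illustrative):

```python
from typing import Callable, List, Optional

def dfs_tot(
    state: str,
    is_solution: Callable[[str], bool],
    expand: Callable[[str], List[str]],     # Thought Generator stand-in
    score: Callable[[str], float],          # State Evaluator stand-in
    threshold: float = 0.5,
    depth: int = 0,
    max_depth: int = 5,
) -> Optional[List[str]]:
    """DFS-ToT: explore one branch at a time, backtracking on dead ends.

    Branches scoring below `threshold` are pruned; returning None from a
    recursive call is the backtracking step.
    """
    if is_solution(state):
        return [state]
    if depth >= max_depth:
        return None  # dead end: backtrack
    # Try the most promising children first.
    for child in sorted(expand(state), key=score, reverse=True):
        if score(child) < threshold:
            continue  # prune unpromising branch
        path = dfs_tot(child, is_solution, expand, score,
                       threshold, depth + 1, max_depth)
        if path is not None:
            return [state] + path
    return None  # every branch failed: backtrack to caller
```

Only the current path is held in memory, which is the memory advantage DFS offers over BFS.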

Further Reading