orchestration · inference-time scaffolding

Dynamic Workflows vs Recursive Language Models

Two ways to spend more than one model inference on a hard problem. One puts the plan in deterministic code; the other lets the model improvise the plan at runtime. They solve overlapping problems from opposite directions.

◆ Dynamic Workflows

A sandboxed JS script orchestrates subagents.

The plan is code — loops, fan-out, pipelines — written ahead of execution. Agents are stateless workers that return structured data.

◆ Recursive Language Models

A model recursively calls models over its context.

The plan is emergent — the model, inside a REPL, decides how to split a giant input and when to recurse on the pieces.

Paradigm 01: The JS sandbox as an orchestrator

A dynamic workflow is a small plain-JavaScript program the model writes and the runtime executes in a sandbox (no filesystem, no network — just orchestration hooks). The script doesn't do the reasoning; it schedules reasoning. A handful of primitives are injected:

agent(prompt, opts) spawns a subagent and returns its text or, with a schema, a validated object. pipeline(items, ...stages) streams each item through the stages with no barrier between them: an item moves on as soon as its current stage finishes. parallel(thunks) is a barrier: it waits for every thunk to finish. phase(title) groups work for progress reporting; budget caps tokens; the run is resumable and cached.

// the model emits this; the sandbox runs it deterministically
// (FINDINGS and VERDICT are schemas assumed to be defined elsewhere in the script)
const DIMENSIONS = ['bugs', 'perf', 'security']
const results = await pipeline(DIMENSIONS,
  // stage 1: review each dimension (runs concurrently)
  d => agent(`review the diff for ${d}`, {schema: FINDINGS}),
  // stage 2: adversarially verify each finding, no barrier:
  //   'bugs' findings verify while 'perf' is still being reviewed
  review => parallel(review.findings.map(f => async () => ({
    ...f,  // keep the finding and attach its verdict so the filter below can read it
    verdict: await agent(`try to REFUTE: ${f.title}`, {schema: VERDICT}),
  })))
)
return results.flat().filter(f => f.verdict.isReal)
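
The difference between the two scheduling primitives is easier to see in isolation. What follows is a minimal Python sketch using asyncio, not the sandbox's actual implementation: pipeline and parallel are re-implemented here purely for illustration, and fake_agent, review, and verify are hypothetical stand-ins that sleep instead of calling a model.

```python
import asyncio

async def parallel(thunks):
    """Barrier: start every thunk, return only once all have finished."""
    return await asyncio.gather(*(thunk() for thunk in thunks))

async def pipeline(items, *stages):
    """No barrier: each item walks through all stages on its own schedule."""
    async def run_one(item):
        value = item
        for stage in stages:
            value = await stage(value)
        return value
    return await asyncio.gather(*(run_one(item) for item in items))

events = []  # completion order, recorded so the scheduling is observable

async def fake_agent(label, seconds):
    # Hypothetical stand-in for agent(): sleeps to simulate model latency.
    await asyncio.sleep(seconds)
    events.append(label)
    return label

async def review(dimension):
    # 'perf' is deliberately slow so the ordering is visible.
    delay = 0.2 if dimension == "perf" else 0.01
    return await fake_agent(f"review:{dimension}", delay)

async def verify(reviewed):
    dimension = reviewed.split(":")[1]
    return await fake_agent(f"verify:{dimension}", 0.01)

results = asyncio.run(pipeline(["bugs", "perf"], review, verify))
barrier_results = asyncio.run(parallel([
    lambda: fake_agent("a", 0.01),
    lambda: fake_agent("b", 0.0),
]))
```

Because 'perf' is slow to review, the recorded events show the 'bugs' verification completing before the 'perf' review does. Running the same two stages as two parallel calls, one per stage, would instead hold every verification until the slowest review had finished.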

Because control flow lives in code, it is deterministic and inspectable: same script + same inputs ⇒ same structure of work. You get explicit parallelism, concurrency caps (≈ min(16, cores−2) agents at once), token budgets, live phase progress, and journal-based resume. Each agent runs in a fresh, scoped context — long-context degradation is avoided by isolation: no worker ever sees the whole problem.
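
Two of the guarantees above, the concurrency cap and journal-based resume, can be sketched with a semaphore and a keyed cache. This is a Python illustration under stated assumptions: run_workflow and fake_agent are hypothetical names, the journal is an in-memory dict rather than a persisted log, keying it by a hash of the prompt is a guess at how replay could work, and the floor of one agent on small machines is added here.

```python
import asyncio
import hashlib
import os

# Cap from the text: roughly min(16, cores - 2). The floor of 1 is an assumption.
MAX_AGENTS = max(1, min(16, (os.cpu_count() or 1) - 2))

async def run_workflow(prompts, agent_fn, journal, limit=MAX_AGENTS):
    sem = asyncio.Semaphore(limit)  # created inside the running event loop
    stats = {"running": 0, "peak": 0, "calls": 0}

    async def agent(prompt):
        key = hashlib.sha256(prompt.encode()).hexdigest()
        if key in journal:
            return journal[key]  # journal hit: replay the result, no model call
        async with sem:  # at most `limit` agents past this point at once
            stats["running"] += 1
            stats["peak"] = max(stats["peak"], stats["running"])
            try:
                result = await agent_fn(prompt)
                stats["calls"] += 1
            finally:
                stats["running"] -= 1
        journal[key] = result
        return result

    results = await asyncio.gather(*(agent(p) for p in prompts))
    return results, stats

async def fake_agent(prompt):
    # Hypothetical stand-in for a subagent call.
    await asyncio.sleep(0.01)
    return prompt.upper()

journal = {}
prompts = [f"task {i}" for i in range(10)]
first, stats1 = asyncio.run(run_workflow(prompts, fake_agent, journal, limit=3))
# "Resume": a second run with the same journal replays every result.
second, stats2 = asyncio.run(run_workflow(prompts, fake_agent, journal, limit=3))
```

In this sketch the first run never has more than three agents in flight, and the second run makes no agent calls at all because every prompt is already journaled.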

pipeline() — fan out by phase, verify each item the moment its review lands (no barrier)

Paradigm 02: The model as its own recursion

A Recursive Language Model flips the locus of control. Instead of a script driving the model, the model is dropped into an environment — typically a REPL — where its enormous context is just a variable it can inspect, slice, and act on with code. Rather than read 10M tokens in one shot (where quality rots), a root model writes code to split the context and recursively calls a language model — often itself — on each piece, then combines the answers.

# the ROOT model improvises this in a REPL, at inference time
# (ENV, THRESHOLD, split, call_llm and question are assumed to be provided by the REPL)
def answer(ctx, question):
    if len(ctx) < THRESHOLD:
        return call_llm(ctx + question)   # base case: just answer

    chunks = split(ctx, 50_000)           # ~50k per chunk; the model decides how
    notes  = [call_llm(f"distil for {question}: {c}") for c in chunks]
    #                ↑ each call_llm may ITSELF be a recursive LM
    return call_llm(f"using {notes}, answer: {question}")

answer(ENV["context"], question)          # context may be far larger than the window

Nothing here is fixed in advance. Depth, fan-out, and decomposition are decided by the model as it reasons — it might recurse twice on a dense section and not at all on boilerplate. The win is near-unbounded context handled by deferral: the root never holds the whole input in its active window; it pulls in only what each sub-question needs. The cost is that the plan is stochastic and opaque — the same prompt can decompose differently run to run.
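
To make the deferral idea concrete, here is a small runnable Python sketch with a stub in place of the model. Everything in it is an assumption for illustration: call_llm is a toy that returns the lines containing the question keyword and refuses oversize input, WINDOW is a made-up limit in characters, and the "decide where to descend" step is reduced to a cheap substring check, where a real root model would make that call by its own judgement.

```python
WINDOW = 200   # hypothetical window of the stub model, in characters
calls = []     # size of every input the stub model actually saw

def call_llm(text, question):
    """Stub model: rejects oversize input, else returns the matching lines."""
    if len(text) > WINDOW:
        raise ValueError("input exceeds the model window")
    calls.append(len(text))
    return "\n".join(line for line in text.splitlines() if question in line)

def rlm(ctx, question, depth=0, max_depth=8):
    # Cheap code-level peek: skip slices with nothing relevant (the boilerplate).
    if question not in ctx:
        return ""
    if len(ctx) <= WINDOW:
        return call_llm(ctx, question)  # base case: the slice fits, just answer
    if depth >= max_depth:
        # Guards against runaway recursion, e.g. one line longer than WINDOW.
        raise RecursionError("recursion budget exhausted")
    lines = ctx.splitlines()
    mid = len(lines) // 2
    halves = ["\n".join(lines[:mid]), "\n".join(lines[mid:])]
    notes = [rlm(half, question, depth + 1, max_depth) for half in halves]
    # A real root would synthesise the notes with another LM call;
    # here the notes are simply concatenated.
    return "\n".join(note for note in notes if note)

boilerplate = "lorem ipsum dolor sit amet\n" * 50
doc = (boilerplate + "invoice 17 is overdue\n"
       + boilerplate + "invoice 42 was paid\n" + boilerplate)
answer = rlm(doc, "invoice")
```

The document is roughly twenty times larger than the stub's window, yet the stub is invoked only on the two small slices that contain a match, and no call ever sees the whole input.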

recursion — the model decides where to descend; results bubble back up the tree

Watch the difference

Same job: “answer a question that needs six pieces of work.” The workflow fans out in structured phases you declared up front. The RLM grows a recursion tree the model shapes as it goes.

The comparison, axis by axis

Axis	Dynamic Workflows	Recursive Language Models
Who plans	Code the model writes before execution	The model, during inference, via a REPL
Orchestrator	A sandboxed JS program (not a model)	A model interacting with an environment
Unit of work	Stateless subagent, structured I/O (schema)	An LM call over a context slice — possibly recursive
Determinism	Same script+inputs → same structure; resumable, cached	Stochastic decomposition; not naturally reproducible
Long context	Avoided by isolation — each agent sees only its slice	Avoided by deferral — root reads slices on demand
Parallelism	Explicit: pipeline (no barrier), parallel (barrier), concurrency cap	Implicit/sequential unless the model writes parallel calls
Verification	First-class stages — adversarial verify, judge panels	The model's own recursive judgement
Cost control	Token budget, concurrency caps, logged drops	Bounded by the recursion depth the model chooses
Observability	Phases, labels, live progress, journal/resume	Mostly opaque recursion trace
Failure mode	Rigidity — a bad script structure caps the outcome	Instability — runaway / insufficient recursion, compounding error
Best fit	Broad fan-out: review, migration, research, audits	Deep reasoning over one enormous input

They compose

This isn't a contest — they nest. A workflow's agent() could itself be a recursive language model when one step must chew through a giant document. An RLM, conversely, could call a whole workflow as a tool when a sub-question needs structured fan-out and adversarial checks. Both are inference-time scaffolding: structure wrapped around raw next-token prediction so a model can do more than a single pass over a problem.
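
The first direction of nesting, a workflow step that turns recursive only when its input is too large, can be sketched as follows. This is a toy Python illustration, not either system's API: plain_agent, recursive_agent, agent, and workflow are hypothetical names, the "model" just counts lines containing TODO, and WINDOW is a made-up limit measured in lines.

```python
WINDOW = 4  # hypothetical window of the stub model, in lines

def plain_agent(lines):
    """Single-pass worker: fails if the input does not fit its window."""
    if len(lines) > WINDOW:
        raise ValueError("input exceeds the model window")
    return sum("TODO" in line for line in lines)

def recursive_agent(lines):
    """Recursive worker: splits until each piece fits, then sums the answers."""
    if len(lines) <= WINDOW:
        return plain_agent(lines)
    mid = len(lines) // 2
    return recursive_agent(lines[:mid]) + recursive_agent(lines[mid:])

def agent(lines):
    # The workflow sees one uniform primitive; recursion is an internal detail.
    return plain_agent(lines) if len(lines) <= WINDOW else recursive_agent(lines)

def workflow(docs):
    # The fixed, inspectable plan: exactly one agent per document.
    return {name: agent(lines) for name, lines in docs.items()}

docs = {
    "small.py": ["x = 1", "# TODO tidy"],
    "huge.py": ["# TODO a", "pass"] * 10,  # 20 lines, five times the window
}
totals = workflow(docs)
```

The outer plan stays fixed at one agent per document, while only the oversized document triggers recursion inside its step.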

The deciding question is simply where you want the plan to live — pinned in deterministic, inspectable code, or grown by the model's own judgement at runtime. Predictability and breadth pull one way; adaptivity and unbounded context pull the other.