Phase Review and Retry
Phase review lets you attach a quality gate to any phase in your pipeline. After the phase's tasks complete, a review task evaluates the output and decides whether to accept it, request a retry with feedback, or reject the phase entirely.
The reviewer is just a Task — AI-powered, deterministic, or human — using the same
infrastructure as every other task in the framework.
Quick start
// 1. Define the work tasks
Task gatherTask = Task.builder()
.description("Gather research data on the topic")
.expectedOutput("Research findings")
.build();
Task summarizeTask = Task.builder()
.description("Summarize the gathered research")
.expectedOutput("Research summary")
.context(List.of(gatherTask))
.build();
// 2. Define the review task
// Use .context() to reference the phase tasks so the reviewer can read their output.
Task reviewTask = Task.builder()
.description("Evaluate the research summary. "
+ "If sufficient, respond with: APPROVE\n"
+ "If insufficient, respond with: RETRY: <specific feedback>")
.context(List.of(summarizeTask)) // gives the reviewer access to the phase output
.build();
// 3. Attach PhaseReview to the phase
Phase research = Phase.builder()
.name("research")
.task(gatherTask)
.task(summarizeTask)
.workflow(Workflow.PARALLEL)
.review(PhaseReview.of(reviewTask)) // maxRetries defaults to 2
.build();
Phase writing = Phase.builder()
.name("writing")
.after(research) // only starts after research is approved
.task(draftTask) // draftTask is assumed to be defined elsewhere
.build();
EnsembleOutput output = Ensemble.builder()
.chatLanguageModel(llm)
.phase(research)
.phase(writing)
.build()
.run();
If the reviewer returns RETRY: Need more depth, the framework:
- Re-runs the research phase with the feedback injected into each task prompt.
- Re-runs the review task.
- Repeats until approved or maxRetries is exhausted (the last output is accepted).
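The control flow of these steps can be sketched in plain Java. This is a minimal stand-alone illustration, not the framework's implementation: ReviewLoopSketch, runWithReview, and the two functional parameters are hypothetical names, and REJECT and RETRY_PREDECESSOR handling is left out.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

/** Stand-alone illustration of a review-and-retry loop; not the framework's code. */
final class ReviewLoopSketch {

    /**
     * Runs a phase, reviews it, and retries with feedback until the reviewer
     * stops asking for a retry or maxRetries is exhausted.
     *
     * @param phase    hypothetical phase runner: (feedback or null, attempt number) -> output
     * @param reviewer hypothetical reviewer: output -> "APPROVE" or "RETRY: <feedback>"
     */
    static String runWithReview(BiFunction<String, Integer, String> phase,
                                Function<String, String> reviewer,
                                int maxRetries) {
        String feedback = null;
        String output = null;
        for (int attempt = 1; attempt <= maxRetries + 1; attempt++) {
            output = phase.apply(feedback, attempt);
            String decision = reviewer.apply(output).strip();
            // Case-insensitive check for the "RETRY:" prefix.
            if (!decision.regionMatches(true, 0, "RETRY:", 0, 6)) {
                return output; // anything other than RETRY ends the loop in this sketch
            }
            feedback = decision.substring(6).strip();
        }
        return output; // retries exhausted: the last output is accepted
    }

    public static void main(String[] args) {
        // The fake reviewer approves only the third attempt.
        String result = runWithReview(
                (fb, attempt) -> "draft v" + attempt + (fb == null ? "" : " [" + fb + "]"),
                out -> out.startsWith("draft v3") ? "APPROVE" : "RETRY: Need more depth",
                2);
        System.out.println(result); // prints: draft v3 [Need more depth]
    }
}
```

With maxRetries = 2 the loop body runs at most three times, which matches the default of two retries after the initial attempt.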
How the review task reads phase outputs
The review task is a standard task. To access the outputs of the reviewed phase's tasks,
declare them in the review task's .context() list. Context resolution keeps working
across retries: the framework maps both the original and the rebuilt task object
identities, so outputs from retried attempts are available through the same .context()
references.
Task gatherTask = Task.builder().description("Gather data")...build();
Task analyzeTask = Task.builder().description("Analyze data")
.context(List.of(gatherTask)).build();
// Review task declares context to read the phase outputs
Task reviewTask = Task.builder()
.description("Quality gate")
.context(List.of(gatherTask, analyzeTask)) // read both task outputs
.handler(ctx -> {
// ctx.contextOutputs() contains gatherTask and analyzeTask outputs in order
String analysisOutput = ctx.contextOutputs().getLast().getRaw();
if (analysisOutput.length() < 300) {
return ToolResult.success(
PhaseReviewDecision.retry("Analysis too brief. Expand all sections.").toText());
}
return ToolResult.success(PhaseReviewDecision.approve().toText());
})
.build();
When only the final phase output matters, reference only the last task:
Task reviewTask = Task.builder()
.description("Quality gate")
.context(List.of(summarizeTask)) // only need the summary
.handler(ctx -> {
String output = ctx.contextOutputs().getFirst().getRaw();
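// evaluate output, then return a decision (approving here is only a placeholder)
return ToolResult.success(PhaseReviewDecision.approve().toText());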
})
.build();
Three reviewer types
AI reviewer
Use a task with an LLM-backed agent. Declare .context() so the LLM sees the phase
output in its ## Context from Previous Tasks prompt section. Instruct the LLM on the
expected response format:
Task reviewTask = Task.builder()
.description("Evaluate the research summary below.\n\n"
+ "Criteria:\n"
+ "- At least 5 distinct sources cited\n"
+ "- Quantitative data for every major claim\n\n"
+ "If ALL criteria are met, respond with exactly: APPROVE\n"
+ "Otherwise, respond with: RETRY: <specific actionable feedback>")
.context(List.of(summarizeTask)) // LLM sees the summary in its prompt
.build();
The LLM receives the phase output as a prior-task context section and evaluates it
against the stated criteria. Its response (APPROVE or RETRY: <feedback>) is parsed
into a PhaseReviewDecision.
Deterministic reviewer
Use a task with a handler for programmatic quality checks. Declare .context() to
access the phase outputs inside the handler:
Task reviewTask = Task.builder()
.description("Quality gate")
.context(List.of(summarizeTask)) // provides output via ctx.contextOutputs()
.handler(ctx -> {
String output = ctx.contextOutputs().getFirst().getRaw();
if (output.length() < 500) {
return ToolResult.success(
PhaseReviewDecision.retry("Output too short. Expand each section.").toText());
}
if (!output.contains("source") && !output.contains("reference")) {
return ToolResult.success(
PhaseReviewDecision.retry("No sources cited. Add at least 3.").toText());
}
return ToolResult.success(PhaseReviewDecision.approve().toText());
})
.build();
Human reviewer
Use a task with a Review gate that pauses for console input. Declare .context() and
echo the output so the human can read it before deciding:
Task reviewTask = Task.builder()
.description("Review the research output below and decide on quality.")
.context(List.of(summarizeTask))
.handler(ctx -> {
// Echo the phase output for the human to see during the review gate
String output = ctx.contextOutputs().getFirst().getRaw();
return ToolResult.success(output);
})
.review(Review.required(
"Type APPROVE, RETRY: <feedback>, or REJECT: <reason>"))
.build();
The human sees the phase output displayed in the console review gate, then types their
decision. The typed response is parsed as a PhaseReviewDecision.
Feedback injection
When a retry is requested, the reviewer's feedback text is injected into every task in
the phase as a ## Revision Instructions section in the LLM prompt, before the
## Task section:
## Revision Instructions (Attempt 2)
This task is being re-executed based on reviewer feedback.
Incorporate the feedback below into your response.
### Feedback
Need more depth on quantum computing applications. Include at least 3 peer-reviewed sources.
### Previous Output
[the raw output from the prior attempt]
## Task
Research the latest developments in quantum computing...
The original task description is unchanged. The LLM sees the feedback and its prior
output, enabling targeted improvement. Deterministic handler tasks receive the feedback
in the task prompt (visible in logs) but not in ctx.description() — for deterministic
tasks the handler makes its own decisions programmatically.
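To make the layout concrete, here is a small sketch that assembles a prompt in the same order as the example above. It only illustrates the section ordering: RevisionPromptSketch and build are hypothetical names, and the framework's actual prompt builder may differ in whitespace and detail.

```java
/** Illustrative only: mirrors the prompt layout shown above, not the framework's code. */
final class RevisionPromptSketch {

    /** Hypothetical helper that assembles the retry prompt for one task. */
    static String build(int attempt, String feedback, String previousOutput, String taskDescription) {
        return String.join("\n",
                "## Revision Instructions (Attempt " + attempt + ")",
                "This task is being re-executed based on reviewer feedback.",
                "Incorporate the feedback below into your response.",
                "### Feedback",
                feedback,                 // reviewer-controlled text
                "### Previous Output",
                previousOutput,           // raw output of the prior attempt
                "## Task",
                taskDescription);         // the original description, unchanged
    }

    public static void main(String[] args) {
        System.out.println(build(2,
                "Need more depth on quantum computing applications.",
                "[the raw output from the prior attempt]",
                "Research the latest developments in quantum computing..."));
    }
}
```

The revision block sits ahead of the ## Task section, so the model reads the feedback and its previous output before the task description.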
What is and isn't controllable in the feedback prompt
Controllable: the feedback content
The text inside ### Feedback is fully controlled by the reviewer: it is exactly
the string that follows the RETRY: prefix in the review task's response:
// Whatever you write here becomes the ### Feedback content
PhaseReviewDecision.retry("Need more depth on section 3. Add quantitative data.").toText()
// -> "RETRY: Need more depth on section 3. Add quantitative data."
You can write short one-liners or structured multi-point instructions:
PhaseReviewDecision.retry("""
The output is missing two key elements:
1. Quantitative data -- add numbers/percentages for every major claim.
2. Source citations -- cite at least 3 peer-reviewed papers.
Keep the structure otherwise intact.
""").toText()
Fixed: the prompt structure
The surrounding structure is determined by the framework:
| Element | Value | Controllable? |
|---|---|---|
| Section header | ## Revision Instructions (Attempt N) | No |
| Preamble | "This task is being re-executed based on reviewer feedback. Incorporate the feedback below into your response." | No |
| Feedback label | ### Feedback | No |
| Feedback content | Whatever the reviewer returns | Yes |
| Prior output label | ### Previous Output | No |
| Prior output content | The task's raw output from the previous attempt | No |
Full control via feedback text
If you need to change the framing (e.g. different tone or instructions for the LLM), embed your custom instruction directly at the start of the feedback text:
PhaseReviewDecision.retry("""
IMPORTANT: Discard your previous approach entirely.
The task requires a completely different structure:
- Start with an executive summary (2 sentences)
- Follow with detailed sections for each sub-topic
- End with a bullet-point action list
Specific gaps to address: the current output lacks quantitative data.
""").toText()
The LLM receives this text verbatim under ### Feedback, and the fixed preamble instructs it to incorporate the feedback.
Retry limits
By default, a phase is allowed up to 2 self-retries (maxRetries = 2), meaning 3 total
attempts. When the limit is exhausted, the last output is accepted and the pipeline
continues.
PhaseReview.builder()
.task(reviewTask)
.maxRetries(3) // up to 3 retries (4 total attempts)
.build();
To reject the phase and stop the pipeline when quality cannot be achieved, use REJECT:
// In the review task handler:
if (criticalFailure) {
return ToolResult.success(
PhaseReviewDecision.reject("Data is corrupted. Pipeline cannot continue.").toText());
}
A rejection throws a TaskExecutionException and skips all downstream phases.
Predecessor retry
A phase can request that a direct predecessor be re-run when it discovers the predecessor's output was insufficient:
// Writing review: if research was lacking, request research redo.
// Declare context on the review task to see the draft and evaluate research quality.
Task writingReviewTask = Task.builder()
.description("Evaluate the draft. "
+ "If the research backing is weak or missing quantitative data, respond with:\n"
+ "RETRY_PREDECESSOR research: <feedback for the research phase>\n"
+ "If the draft quality is acceptable, respond with: APPROVE")
.context(List.of(draftTask)) // read the draft to evaluate research backing
.build();
Phase writing = Phase.builder()
.name("writing")
.after(research)
.task(draftTask)
.review(PhaseReview.builder()
.task(writingReviewTask)
.maxRetries(2)
.maxPredecessorRetries(1) // research can be retried once
.build())
.build();
When RETRY_PREDECESSOR research: <feedback> is returned:
- The research phase is re-run with the feedback injected into its tasks.
- The writing phase is re-run with the updated research outputs.
- The writing review fires again.
The predecessor must be a direct predecessor listed in the phase's .after(). If the
named phase is not a direct predecessor, the decision is treated as APPROVE.
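The direct-predecessor rule can be sketched as a small check. This is an illustration under stated assumptions rather than the framework's logic: PredecessorRetrySketch and phasesToRerun are hypothetical names, phase names are compared with exact string equality (the real matching rules are not specified here), and retry budgets are ignored.

```java
import java.util.List;

/** Illustrative only: models the direct-predecessor rule described above. */
final class PredecessorRetrySketch {

    /**
     * Returns the phases to re-run, in order, for a RETRY_PREDECESSOR request.
     * An empty list means the request is treated as APPROVE.
     */
    static List<String> phasesToRerun(String current, String requested, List<String> directPredecessors) {
        if (!directPredecessors.contains(requested)) {
            return List.of(); // not listed in .after(): decision is treated as APPROVE
        }
        return List.of(requested, current); // predecessor first, then the requesting phase
    }

    public static void main(String[] args) {
        System.out.println(phasesToRerun("writing", "research", List.of("research"))); // prints: [research, writing]
        System.out.println(phasesToRerun("writing", "planning", List.of("research"))); // prints: []
    }
}
```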
Decision format reference
| Output text | Decision |
|---|---|
| APPROVE | Accept output, proceed |
| RETRY: <feedback> | Retry this phase with feedback |
| RETRY_PREDECESSOR <name>: <feedback> | Retry the named predecessor, then retry this phase |
| REJECT: <reason> | Fail this phase and stop downstream phases |
Parsing is case-insensitive. Unrecognised text is treated as APPROVE.
The colon split for RETRY and REJECT is on the first colon only, so feedback
text may contain additional colons (RETRY: issue: too brief → feedback = issue: too brief).
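The parsing rules above can be modelled with a short stand-alone parser. This is a sketch of the documented behaviour, not the framework's parser: DecisionParserSketch, Kind, and Decision are hypothetical names, and treating a bare RETRY or REJECT without a colon as unrecognised is an assumption of this sketch.

```java
/** Illustrative parser for the decision format; not the framework's implementation. */
final class DecisionParserSketch {

    enum Kind { APPROVE, RETRY, RETRY_PREDECESSOR, REJECT }

    /** Parsed decision; target is only set for RETRY_PREDECESSOR. */
    record Decision(Kind kind, String target, String text) {}

    static Decision parse(String raw) {
        String s = raw == null ? "" : raw.strip();
        int colon = s.indexOf(':'); // split on the first colon only
        if (colon >= 0) {
            String head = s.substring(0, colon).strip();
            String tail = s.substring(colon + 1).strip();
            if (head.equalsIgnoreCase("RETRY")) {
                return new Decision(Kind.RETRY, null, tail);
            }
            if (head.equalsIgnoreCase("REJECT")) {
                return new Decision(Kind.REJECT, null, tail);
            }
            String prefix = "RETRY_PREDECESSOR";
            if (head.length() > prefix.length()
                    && head.regionMatches(true, 0, prefix, 0, prefix.length())
                    && Character.isWhitespace(head.charAt(prefix.length()))) {
                // Everything between the keyword and the colon is the phase name.
                String target = head.substring(prefix.length()).strip();
                return new Decision(Kind.RETRY_PREDECESSOR, target, tail);
            }
        }
        // APPROVE and any unrecognised text both end up here.
        return new Decision(Kind.APPROVE, null, "");
    }

    public static void main(String[] args) {
        System.out.println(parse("RETRY: issue: too brief").text()); // prints: issue: too brief
    }
}
```

Because unrecognised text falls through to APPROVE, a reviewer prompt that does not pin down the response format can let weak output pass silently, which is why the examples ask the LLM to respond with exact keywords.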
Builder reference
// Minimal (task + defaults)
PhaseReview.of(reviewTask)
// With custom max retries
PhaseReview.of(reviewTask, 3)
// Full control
PhaseReview.builder()
.task(reviewTask)
.maxRetries(3)
.maxPredecessorRetries(2)
.build()
| Field | Default | Description |
|---|---|---|
| task | required | The review task; use .context() to access phase outputs |
| maxRetries | 2 | Maximum self-retries (0 = review once, no retries) |
| maxPredecessorRetries | 2 | Maximum predecessor retries per predecessor |