AI Sprint Refinement

Stop spending Friday afternoons on backlog grooming.
Let AI refine the backlog.

Agentopia by CynetIQ expands ticket descriptions, writes Given/When/Then acceptance criteria, estimates story points, and suggests an assignee based on prior code expertise. Output is written back to Jira, Azure DevOps, or GitHub Issues, so your tracker stays your single source of truth.

What gets generated

📝
Expanded description
Edge cases, error states, and prerequisite work spelled out. No more two-line tickets that explode into a sprint.
Acceptance criteria
Given/When/Then format. Each criterion has its own row so QA can write tests directly off it.
🔢
Story point estimate
Fibonacci, T-shirt, or hours. Configurable per workspace. Falls back to "Needs human breakdown" for novel or 13+ point tickets.
🧑‍💻
Suggested assignee
Based on who has committed to the likely files in the last 90 days. If there is a tie, the alternative is shown.
⚠️
Risk indicators
Flags for security, performance, third-party blocker, breaking change. Colour-coded badges in the UI.
↩️
Writes back to source
Jira: comment + Story Points field. Azure DevOps: discussion + Story Points field. GitHub Issues: comment.
📊
Refinement runs page
Every refinement run is logged with its input, output, model used, and cost. Click into any past run to audit it.
🎯
Per-team config
Each team can set its own scale, prompt overrides, and assignee preference.

Sample refinement output

Original ticket: "User can't pay with saved card"

## Refined Description
The /api/orders/checkout endpoint fails with HTTP 500 when the user selects
a previously saved card from their wallet. The Stripe API returns 402 because
the saved-card token is being passed to confirm-payment without re-attaching
to the current PaymentIntent. Affects all returning customers with at least
one saved card. Likely introduced in #4318 (wallet refactor).

## Acceptance Criteria
- GIVEN a logged-in user with one or more saved cards
  WHEN they pick a saved card on /checkout
  THEN the order completes with a 200 and a charge appears in Stripe
- GIVEN the saved card is expired
  WHEN the user picks it
  THEN the user sees a clear "card expired" message and can pick another
- GIVEN the user has no saved cards
  WHEN they reach /checkout
  THEN the saved-card section is hidden, no JS error in console

## Story Points
3

## Risk
- security: medium (touches payment flow)
- third-party-blocker: low (Stripe is reachable)

## Suggested Assignee
@erin (last 5 commits to packages/services/.../payment_service.py)
Alternate: @daniel
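
Because every criterion in the sample follows the same GIVEN/WHEN/THEN shape, the block can be split into one row per criterion mechanically. The following is a minimal Python sketch, assuming the output format shown above; `Criterion` and `parse_criteria` are hypothetical names for illustration, not part of the product's API.

```python
import re
from dataclasses import dataclass


@dataclass
class Criterion:
    given: str
    when: str
    then: str


# An optional leading "- ", then a keyword and the clause that follows it.
_LINE = re.compile(r"^\s*(?:-\s*)?(GIVEN|WHEN|THEN)\s+(.*\S)\s*$")


def parse_criteria(block: str) -> list[Criterion]:
    """Split a Given/When/Then block into one Criterion per row."""
    rows: list[Criterion] = []
    current: dict[str, str] = {}
    for line in block.splitlines():
        match = _LINE.match(line)
        if not match:
            continue  # skip blank lines and anything that is not a clause
        keyword, clause = match.group(1).lower(), match.group(2)
        if keyword == "given" and current:
            current = {}  # a new GIVEN starts a new criterion
        current[keyword] = clause
        if keyword == "then":
            rows.append(Criterion(current.get("given", ""),
                                  current.get("when", ""),
                                  clause))
            current = {}
    return rows


sample = """\
- GIVEN a logged-in user with one or more saved cards
  WHEN they pick a saved card on /checkout
  THEN the order completes with a 200 and a charge appears in Stripe
- GIVEN the saved card is expired
  WHEN the user picks it
  THEN the user sees a clear "card expired" message and can pick another
"""

for row in parse_criteria(sample):
    print(row.given, "|", row.when, "|", row.then)
```

Each printed row maps to one test case, which is the point of keeping criteria on separate rows.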

Frequently asked

What does AI Sprint Refinement do exactly?

For each task you click ✨ Refine on, the PM agent: (1) expands the description with concrete edge cases, (2) writes a Given/When/Then acceptance criteria block, (3) estimates story points on your configured scale (Fibonacci, T-shirt, or hours), (4) suggests an assignee based on who has recently committed to the files the task is likely to touch, and (5) flags risk indicators (security, performance, third-party blocker, breaking change). The output is written back to the source ticket as a comment; for Jira and Azure DevOps, the estimate also goes into the Story Points field.

Does it write story points back to Jira / Azure DevOps?

Yes. For Jira-sourced tasks, Agentopia by CynetIQ writes to the configured Story Points custom field (default customfield_10016). For Azure DevOps, it writes to Microsoft.VSTS.Scheduling.StoryPoints. Both also receive the AI refinement output for human review: as a comment in Jira and as a discussion entry in Azure DevOps.
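
For reference, this is roughly what an equivalent write looks like against each tracker's public REST API. It is a sketch under stated assumptions, not a description of how Agentopia makes the call internally: the function names, the example site, issue key, organization, and project are all hypothetical, and authentication is left out. The requests are built as plain dictionaries and never sent.

```python
import json


def jira_story_points_request(base_url: str, issue_key: str, points: float,
                              field_id: str = "customfield_10016") -> dict:
    """Describe a Jira Cloud 'edit issue' call that sets the Story Points field."""
    return {
        "method": "PUT",
        "url": f"{base_url}/rest/api/3/issue/{issue_key}",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"fields": {field_id: points}}),
    }


def azure_story_points_request(organization: str, project: str,
                               work_item_id: int, points: float) -> dict:
    """Describe an Azure DevOps work item update as a JSON Patch document."""
    patch = [{
        "op": "add",
        "path": "/fields/Microsoft.VSTS.Scheduling.StoryPoints",
        "value": points,
    }]
    return {
        "method": "PATCH",
        "url": (f"https://dev.azure.com/{organization}/{project}"
                f"/_apis/wit/workitems/{work_item_id}?api-version=7.1"),
        # Azure DevOps expects the JSON Patch media type for work item updates.
        "headers": {"Content-Type": "application/json-patch+json"},
        "body": json.dumps(patch),
    }


# Hypothetical identifiers, for illustration only.
jira = jira_story_points_request("https://example.atlassian.net", "PAY-123", 3)
azure = azure_story_points_request("example-org", "example-project", 42, 3)
print(jira["method"], jira["url"])
print(azure["method"], azure["url"])
```

If your Jira site stores story points in a different custom field, pass its ID as `field_id` instead of relying on the default.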

How accurate are the story point estimates?

The PM agent looks at the task description, your team’s velocity history, prior similar tasks, and the codebase context. In our internal testing, AI estimates land within ±1 Fibonacci point of the team’s actual estimate 78% of the time, and within ±2 points 96% of the time. The AI is a starting point — you can override before sprint planning.
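
If you want to check agreement on your own backlog, one way to compute a comparable figure is sketched below. It assumes that "±1 Fibonacci point" means one position on the scale 1, 2, 3, 5, 8, 13, which is an interpretation rather than a documented definition; the function names and the sample pairs are made up for illustration.

```python
# Assumed scale; adjust to whatever your workspace actually uses.
FIBONACCI_SCALE = [1, 2, 3, 5, 8, 13]


def within_steps(ai: int, team: int, steps: int) -> bool:
    """True if the two estimates are at most `steps` positions apart on the scale."""
    return abs(FIBONACCI_SCALE.index(ai) - FIBONACCI_SCALE.index(team)) <= steps


def agreement_rate(pairs: list[tuple[int, int]], steps: int) -> float:
    """Share of (ai, team) estimate pairs that fall within `steps` scale positions."""
    if not pairs:
        return 0.0
    hits = sum(within_steps(ai, team, steps) for ai, team in pairs)
    return hits / len(pairs)


# Made-up (AI estimate, team estimate) pairs, for illustration only.
history = [(3, 3), (5, 3), (8, 3), (2, 1), (13, 5)]
print(agreement_rate(history, steps=1))
print(agreement_rate(history, steps=2))
```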

Can I use this without committing to AI code generation?

Yes — refinement is decoupled from code generation. Many teams start with refinement-only to demo the AI’s value at sprint planning, then expand to code generation once trust builds.

What about "big" or "novel" tasks where the AI shouldn’t guess?

For tasks that are genuinely novel (no analogue in the codebase) or large (would be 13+ points), the prompt explicitly asks the agent to flag the task as needing human breakdown rather than emitting a confident point estimate. You see a "Needs human breakdown" badge instead of a number.
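
The rule itself is simple enough to state as code. The answer above describes it as a prompt instruction; the sketch below expresses the same rule as an explicit check, which is an assumption about how one might enforce it rather than the product's implementation. `display_estimate` and its parameters are hypothetical names.

```python
from typing import Optional

NEEDS_BREAKDOWN = "Needs human breakdown"
LARGE_TICKET_THRESHOLD = 13  # points; mirrors the "13+ points" rule described above


def display_estimate(points: Optional[int], is_novel: bool) -> str:
    """Return the badge text for a ticket: a number, or the breakdown flag."""
    if is_novel or points is None or points >= LARGE_TICKET_THRESHOLD:
        return NEEDS_BREAKDOWN
    return str(points)


print(display_estimate(3, is_novel=False))
print(display_estimate(13, is_novel=False))
print(display_estimate(5, is_novel=True))
```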

How does the suggested assignee work?

The agent looks at the file paths the task is likely to touch, then queries the Refinement Service for engineers who have committed to those paths recently (last 90 days). The top match is suggested, with the alternative shown if there’s a tie. You can disable this if your team uses pure pull-based assignment.
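
The ranking step can be pictured as a count of recent commits per author over the likely paths. This is a minimal sketch, not the Refinement Service's actual query: the `Commit` type, `suggest_assignee`, the prefix-based path matching, the alphabetical ordering of tied authors, and the sample data are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Commit:
    author: str
    path: str
    committed_at: datetime


def suggest_assignee(commits: list[Commit], likely_paths: list[str],
                     now: datetime, window_days: int = 90
                     ) -> tuple[Optional[str], list[str]]:
    """Rank authors by commits to the likely paths inside the window.

    Returns (top_author, alternates), where alternates are the authors tied
    with the top match. Returns (None, []) when nobody qualifies.
    """
    cutoff = now - timedelta(days=window_days)
    counts = Counter(
        c.author for c in commits
        if c.committed_at >= cutoff
        and any(c.path.startswith(p) for p in likely_paths)
    )
    if not counts:
        return None, []
    best = max(counts.values())
    # Assumed tie-break: alphabetical order among equally ranked authors.
    tied = sorted(a for a, n in counts.items() if n == best)
    return tied[0], tied[1:]


# Made-up commit history, for illustration only.
now = datetime(2024, 6, 1)
commits = [
    Commit("erin", "services/payments/payment_service.py", datetime(2024, 5, 20)),
    Commit("erin", "services/payments/wallet.py", datetime(2024, 5, 2)),
    Commit("daniel", "services/payments/payment_service.py", datetime(2024, 4, 15)),
    Commit("daniel", "services/payments/payment_service.py", datetime(2023, 11, 1)),
]
print(suggest_assignee(commits, ["services/payments/"], now))
```

The last commit in the sample falls outside the 90-day window, so it does not count toward the ranking.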

Refine 50 tickets in 90 seconds

Free tier covers 5,000 imported tasks per month with refinement included. Bring your own LLM key.

Start free