Because the agent will be creative wherever we leave it room to be, test questions need to be quite specific. For example, the following test question is not good enough:
Which opportunities did Joe Smith win in season 2025 S2?
Why it's not good enough:
- It does not specify which Opportunity attributes to return, so sometimes the agent gives you just the Opportunity name and sometimes it gives you more columns. This would make the test fail, because the Curated DAX is fixed.
- It does not specify a sort order. This may not matter to you as a person, but CAT needs the dataset to be sorted for comparison.
Therefore, a better version of this question for testing could be:
Which opportunities did Joe Smith win in season 2025 S2? Return only the Opportunity name and sort by it ascending.
This question will give us a more consistent answer.
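To illustrate why the columns and the sort order both matter, here is a minimal Python sketch of a strict, order-sensitive comparison between an expected result and the agent's answer. It is an illustration only: how CAT actually compares datasets is not described here, and the function `results_match`, the column names, and the sample rows are all hypothetical.

```python
def results_match(expected, actual):
    """Strict comparison: same rows, same columns, same order.

    Assumption: this only mimics an order-sensitive dataset comparison;
    it is not CAT's real implementation.
    """
    return expected == actual


# Hypothetical result of the fixed curated query: one column, sorted ascending.
expected = [
    {"Opportunity Name": "Alpha Deal"},
    {"Opportunity Name": "Beta Deal"},
]

# Answer to the precise question: same column, same order.
precise_answer = [
    {"Opportunity Name": "Alpha Deal"},
    {"Opportunity Name": "Beta Deal"},
]

# Answer to the vague question: correct rows, but in a different order.
unsorted_answer = [
    {"Opportunity Name": "Beta Deal"},
    {"Opportunity Name": "Alpha Deal"},
]

# Answer to the vague question: correct rows, but with an extra column.
extra_column_answer = [
    {"Opportunity Name": "Alpha Deal", "Amount": 1000},
    {"Opportunity Name": "Beta Deal", "Amount": 2500},
]

print(results_match(expected, precise_answer))       # matches
print(results_match(expected, unsorted_answer))      # differs: row order
print(results_match(expected, extra_column_answer))  # differs: extra column
```

In this sketch, both vague-question answers contain the right opportunities, yet neither passes the comparison. That is the kind of failure a question with explicit columns and an explicit sort order is meant to avoid.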
After a test run completes, you can review the results in the test dashboard and potentially identify why a test failed.

