
What are Scenarios?

Scenarios in Galtea evaluate the conversational capabilities of your product through multi-turn dialogue. These tests use the Conversation Simulator to create realistic conversations with synthetic users that have specific goals, personalities, and behaviors. Unlike Quality Tests, which focus on single-turn question-answer pairs, and Red Teaming Tests, which probe security vulnerabilities, scenario-based tests evaluate how well your AI product can:
  • Maintain context across multiple conversation turns
  • Guide users toward successful task completion
  • Handle unexpected but realistic user inputs
  • Stay in character and follow conversation guidelines
  • Manage complex dialogue flows

Creating Scenarios

You can create scenarios in Galtea through two methods:
  • Scenario Generator
  • Upload Your Own Scenarios
With the Scenario Generator:

1. Define Your Product

Start by defining your product in Galtea. The detailed product properties are used as the foundation for generating realistic scenarios.

2. Configure the test

Select User Scenarios as the test type and Generated as the test origin. You can create the test via the SDK or the Galtea dashboard.

3. Generate Scenarios

Galtea will automatically analyze your product definition to create tailored conversation scenarios, complete with:
  • Realistic user personas based on your target audience
  • Goals aligned with your product’s use cases
  • Natural conversation flows that test key features

Conversation Flow Categories

Scenarios can cover various types of conversational interactions:
  • Scenarios where the synthetic user has a specific goal to accomplish, such as booking a service, making a purchase, or getting support for an issue.
  • Scenarios focused on retrieving specific information or answers, testing the AI’s knowledge and ability to provide helpful responses.
  • Scenarios that simulate customer service interactions, including complaints, troubleshooting, and problem resolution.
  • Scenarios where the user is exploring options, comparing products, or seeking advice before making decisions.
  • Scenarios that involve multiple stages or require gathering various pieces of information over several turns.
  • Scenarios that test how the AI handles difficult, unclear, or unexpected user behavior while maintaining conversation quality.

Example Scenarios and File Format

Here are examples of scenario content and structure:
Goal: Book a one-way flight from SFO to JFK for next Tuesday
User Persona: A busy professional who is direct and values efficiency
Initial Prompt: I need to book a flight
Stopping Criterias: The user has confirmed the flight booking|The chatbot indicates it cannot fulfill the request
Max Iterations: 10
Scenario: Flight booking scenario

Goal: Find the cheapest round-trip flight to Europe in summer
User Persona: A budget-conscious student who asks many questions
Initial Prompt: Hi, I’m looking for cheap flights to Europe
Stopping Criterias: Flight is booked|User decides not to book|Maximum budget is exceeded
Max Iterations: 15
Scenario: Budget travel scenario

Goal: Change an existing flight reservation
User Persona: A frustrated customer whose plans have changed
Initial Prompt: I need to change my flight immediately
Stopping Criterias: Reservation is successfully modified|Customer is transferred to agent
Max Iterations: 8
Scenario: Flight modification scenario

Goal: Get help with a defective product
User Persona: An upset customer who received a broken item
Initial Prompt: My product arrived broken and I’m very disappointed
Stopping Criterias: Issue is resolved|Customer requests supervisor|Refund is processed
Max Iterations: 12
Scenario: Product defect resolution

Goal: Understand how to use a new feature
User Persona: A curious but non-technical user
Initial Prompt: I heard about a new feature but don’t know how to use it
Stopping Criterias: User successfully uses the feature|User gives up|Technical support is escalated
Max Iterations: 10
Scenario: Feature education scenario

Goal: Cancel a subscription
User Persona: A polite customer who wants to downgrade service
Initial Prompt: I’d like to cancel my subscription please
Stopping Criterias: Subscription is cancelled|Alternative plan is accepted|Retention offer is declined
Max Iterations: 8
Scenario: Subscription cancellation

Goal: Find the right product for specific needs
User Persona: A thorough researcher who compares many options
Initial Prompt: I’m looking for a solution but need help choosing
Stopping Criterias: Product recommendation is accepted|User requests human consultation|User decides to research more
Max Iterations: 15
Scenario: Product recommendation scenario

Goal: Get pricing information for enterprise solution
User Persona: A business decision maker focused on ROI
Initial Prompt: What are your enterprise pricing options?
Stopping Criterias: Quote is requested|Meeting is scheduled|User indicates budget constraints
Max Iterations: 10
Scenario: Enterprise sales scenario

Goal: Compare different service tiers
User Persona: An analytical customer who wants detailed comparisons
Initial Prompt: Can you help me understand the differences between your plans?
Stopping Criterias: Plan is selected|User requests trial|User needs more time to decide
Max Iterations: 12
Scenario: Service comparison scenario

Galtea requires this structure to automatically generate test cases from your scenarios within the platform. If you do not provide a file in this format, you can still create the test cases manually.
For automatic processing, the file format must be CSV.
The examples above are simplified demonstrations. Real CSV files can contain much more detailed scenarios and many more test cases (rows) to provide comprehensive conversation testing coverage.
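
As an illustration of the CSV requirement, here is a minimal sketch that serializes scenario rows with Python's standard csv module. The header names are an assumption: they mirror the field names listed under Structure of Scenarios and are not a confirmed Galtea header row. The helper scenarios_to_csv is hypothetical and is not part of the Galtea SDK.

```python
import csv
import io

# ASSUMPTION: header names mirror the fields in "Structure of Scenarios";
# check the header row Galtea actually expects before uploading.
FIELDNAMES = [
    "goal",
    "user_persona",
    "initial_prompt",
    "stopping_criterias",
    "max_iterations",
    "scenario",
]


def scenarios_to_csv(rows):
    """Serialize scenario dicts to CSV text (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    {
        "goal": "Book a one-way flight from SFO to JFK for next Tuesday",
        "user_persona": "A busy professional who is direct and values efficiency",
        "initial_prompt": "I need to book a flight",
        "stopping_criterias": "The user has confirmed the flight booking|"
        "The chatbot indicates it cannot fulfill the request",
        "max_iterations": 10,
        "scenario": "Flight booking scenario",
    },
    {
        "goal": "Find the cheapest round-trip flight to Europe in summer",
        "user_persona": "A budget-conscious student who asks many questions",
        "initial_prompt": "Hi, I’m looking for cheap flights to Europe",
        "stopping_criterias": "Flight is booked|User decides not to book|"
        "Maximum budget is exceeded",
        "max_iterations": 15,
        "scenario": "Budget travel scenario",
    },
]

csv_text = scenarios_to_csv(rows)
print(csv_text)
```

Using a CSV writer instead of joining strings by hand means that fields containing commas, such as the second initial prompt, are quoted automatically.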

Structure of Scenarios

Scenarios have a specific structure designed to enable realistic conversation simulation:
goal (Text, required)
The overall objective the synthetic user is trying to achieve during the conversation. Example: “Book a one-way flight from San Francisco (SFO) to New York (JFK) for next Tuesday”

user_persona (Text, required)
The personality and communication style of the synthetic user that will be simulated throughout the conversation. Example: “A busy professional who is direct and values efficiency. They prefer to get things done quickly without much small talk.”

initial_prompt (Text)
The first message the synthetic user sends to start the conversation. If not provided, the system will generate an appropriate opening based on the goal and persona. Example: “Hello, I need to book a flight.”

stopping_criterias (Text)
A delimited string of conditions that, if met, will cause the conversation simulation to end. Use ; or | as delimiters to separate multiple criteria. Example: “The user has confirmed the flight booking|The chatbot indicates it cannot fulfill the request;User expresses satisfaction with the service”

max_iterations (Number)
The maximum number of conversation turns before the simulation automatically ends. This prevents infinite loops and controls test duration. Example: 10

scenario (Text)
A brief description of the scenario for documentation and organizational purposes. This helps categorize and manage different types of conversation tests. Example: “Flight booking scenario for business travelers”
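
To make the field rules concrete, the following sketch models one scenario row and checks it locally before upload. It is an illustration under stated assumptions, not Galtea's own validation: the Scenario class and both helper functions are hypothetical, the rule that max_iterations must be positive is assumed, and splitting a string that mixes ; and | is inferred from the example above.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


def split_stopping_criteria(raw: str) -> List[str]:
    """Split on ';' or '|' and drop empty entries (assumed behavior)."""
    return [part.strip() for part in re.split(r"[;|]", raw) if part.strip()]


@dataclass
class Scenario:
    """Hypothetical local model of one scenario row."""

    goal: str
    user_persona: str
    initial_prompt: Optional[str] = None
    stopping_criterias: List[str] = field(default_factory=list)
    max_iterations: Optional[int] = None
    scenario: Optional[str] = None

    def __post_init__(self):
        # goal and user_persona are the two fields marked as required.
        if not self.goal.strip():
            raise ValueError("goal is required")
        if not self.user_persona.strip():
            raise ValueError("user_persona is required")
        # ASSUMPTION: a turn limit below 1 is treated as invalid.
        if self.max_iterations is not None and self.max_iterations < 1:
            raise ValueError("max_iterations must be a positive integer")


def scenario_from_row(row: dict) -> Scenario:
    """Build a Scenario from a dict of strings, e.g. a csv.DictReader row."""
    raw_max = (row.get("max_iterations") or "").strip()
    return Scenario(
        goal=row.get("goal") or "",
        user_persona=row.get("user_persona") or "",
        initial_prompt=(row.get("initial_prompt") or "").strip() or None,
        stopping_criterias=split_stopping_criteria(row.get("stopping_criterias") or ""),
        max_iterations=int(raw_max) if raw_max else None,
        scenario=(row.get("scenario") or "").strip() or None,
    )


row = {
    "goal": "Book a one-way flight from SFO to JFK for next Tuesday",
    "user_persona": "A busy professional who is direct and values efficiency",
    "initial_prompt": "",
    "stopping_criterias": "The user has confirmed the flight booking|"
    "The chatbot indicates it cannot fulfill the request;"
    "User expresses satisfaction with the service",
    "max_iterations": "10",
    "scenario": "Flight booking scenario",
}
example = scenario_from_row(row)
print(example.stopping_criterias)
```

An empty initial_prompt is stored as None here, which matches the documented behavior that the system generates an opening message when none is provided.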

Using Scenarios with the Conversation Simulator

Scenarios are specifically designed to work with Galtea’s Conversation Simulator. When you run evaluations using scenarios, the system will:
  1. Initialize the conversation using the initial_prompt and user_persona
  2. Generate realistic user responses based on the goal and conversation history
  3. Continue the dialogue until one of the stopping_criterias is met or max_iterations is reached
  4. Evaluate the conversation using your selected metrics to assess performance
For detailed implementation examples, see the Conversation Simulator Tutorial.
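
The first three steps can be sketched as a plain loop. This is a toy model, not the Conversation Simulator's API: run_simulation and the three callables are hypothetical stand-ins, the synthetic user and the criteria check are simple stubs, and treating one iteration as one user message plus one product reply is an assumption. Metric evaluation (step 4) is left out.

```python
from typing import Callable, List, Optional, Tuple

Message = Tuple[str, str]  # (role, text)


def run_simulation(
    initial_prompt: str,
    max_iterations: int,
    agent: Callable[[List[Message]], str],
    next_user_message: Callable[[List[Message]], str],
    criterion_met: Callable[[List[Message]], Optional[str]],
) -> Tuple[List[Message], str]:
    """Run a conversation until a stopping criterion or the turn limit."""
    history: List[Message] = []
    user_text = initial_prompt  # step 1: open with the initial prompt
    for _ in range(max_iterations):
        history.append(("user", user_text))
        history.append(("assistant", agent(history)))
        met = criterion_met(history)  # step 3: check the stopping criteria
        if met is not None:
            return history, met
        user_text = next_user_message(history)  # step 2: next user reply
    return history, "max_iterations reached"


# Stubs standing in for your product and the synthetic user.
def toy_agent(history: List[Message]) -> str:
    user_turns = sum(1 for role, _ in history if role == "user")
    if user_turns >= 2:
        return "Your booking is confirmed."
    return "Where would you like to fly?"


def toy_user(history: List[Message]) -> str:
    return "From SFO to JFK next Tuesday."


def toy_criterion(history: List[Message]) -> Optional[str]:
    last_reply = history[-1][1]
    return "booking confirmed" if "confirmed" in last_reply else None


history, reason = run_simulation(
    "I need to book a flight", 10, toy_agent, toy_user, toy_criterion
)
print(reason, len(history))
```

The loop returns the reason it stopped along with the transcript, so a later evaluation step can tell a conversation that reached a stopping criterion from one that ran out of turns.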

Best Practices for Manually Creating Scenarios

  • Make goals specific and measurable
  • Include realistic constraints (time, budget, preferences)
  • Vary complexity across different scenarios
  • Consider both successful and unsuccessful outcomes
  • Include personality traits that affect communication style
  • Specify technical knowledge level and domain expertise
  • Define emotional states and motivations
  • Consider cultural and contextual factors
  • Include both positive outcomes (goal achieved) and negative outcomes (failure, frustration)
  • Account for edge cases where the conversation might stall
  • Use multiple criteria separated by “|” or “;” to cover various outcomes
  • Make criteria specific enough to be clearly identifiable
  • Test common user journeys and edge cases
  • Include scenarios with different complexity levels
  • Cover various user types and use cases
  • Test both cooperative and challenging user behaviors