# Run Test-Based Evaluations

> Learn how to run evaluations for a single-turn, test-based workflow.

**Using specifications?** If your tests and metrics are linked to [Specifications](/concepts/product/specification), you can use [Specification-Based Evaluation](#specification-based-evaluation) below, which resolves metrics automatically. This tutorial also covers the manual loop approach for full control over each test case.

To evaluate a product version against a predefined [Test](/concepts/product/test), you can loop through its [Test Cases](/concepts/product/test/case) and run an evaluation for each one. This workflow is ideal for regression testing and standardized quality checks.

A [session](/concepts/product/version/session) is created **implicitly** the first time you run an evaluation for a specific `version_id` and `test_id` combination. You do not need to create it manually.

### Workflow

1. Identify the `test_id` and `version_id` you want to evaluate.
2. Fetch all test cases associated with the test using `galtea.test_cases.list()`.
3. For each test case, call your product to get its output, create a session with `galtea.sessions.create()`, then use `galtea.inference_results.create_and_evaluate()` to log and evaluate the result.

### Example

This example demonstrates how to run an evaluation on all test cases from a specific test.

```python theme={"system"}
# Assumes `galtea` is an initialized SDK client and that `test_id`, `version_id`,
# `METRICS_TO_EVALUATE`, and `your_product_function` are already defined.

# 1. Fetch the test cases for the specified test
test_cases = galtea.test_cases.list(test_id=test_id)
print(f"Found {len(test_cases)} test cases for Test ID: {test_id}")

# 2. Loop through each test case and run an evaluation
for test_case in test_cases:
    # Get your product's actual response to the input
    actual_output = your_product_function(test_case.input)

    # Create a session and evaluate
    session = galtea.sessions.create(version_id=version_id, test_case_id=test_case.id)
    galtea.inference_results.create_and_evaluate(
        session_id=session.id,
        output=actual_output,
        metrics=METRICS_TO_EVALUATE,
    )

print(f"\nAll evaluations submitted for Version ID: {version_id}")
```

A [session](/concepts/product/version/session) is automatically created behind the scenes to link this `version_id` and `test_id` with the provided [inference result](/concepts/product/version/session/inference-result) (the `actual_output` and the [Test Case](/concepts/product/test/case)'s input).

### Specification-Based Evaluation

Instead of passing metrics explicitly, you can evaluate against [Specifications](/concepts/product/specification). Each specification has linked metrics that the API resolves automatically.

```python theme={"system"}
# Reuses `galtea`, `version_id`, `test_cases`, and `your_product_function` from
# the example above; `product_id` is assumed to be defined.

# 1. List the specifications for your product
specifications = galtea.specifications.list(product_id=product_id)
spec_ids = [spec.id for spec in specifications]
print(f"Found {len(spec_ids)} specifications for product {product_id}")

# 2. Create a session and record an inference result
session = galtea.sessions.create(version_id=version_id, test_case_id=test_cases[0].id)
inference_result = galtea.inference_results.create(
    session_id=session.id,
    input=test_cases[0].input,
    output=your_product_function(test_cases[0].input),
)

# 3. Evaluate using specifications; the API resolves linked metrics automatically
evaluations = galtea.evaluations.create(
    inference_result_id=inference_result.id,
    specification_ids=spec_ids,
)
print(f"Created {len(evaluations)} evaluations from specifications")
```

If you omit both `metrics` and `specification_ids`, the API falls back to all metrics from every specification linked to the product.
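For regression runs, it can help to wrap the manual loop from the Example section in a helper so that one failing test case does not abort the whole run. The following is a minimal sketch, not part of the SDK: `run_test_evaluations`, the stand-in client, `fake_product`, and the metric name `"metric_a"` are all hypothetical names introduced for illustration. The helper only uses the three calls shown in the example (`test_cases.list`, `sessions.create`, `inference_results.create_and_evaluate`), and the stand-in client lets you dry-run the control flow offline before pointing it at a real client.

```python
from types import SimpleNamespace


def run_test_evaluations(client, test_id, version_id, metrics, product_fn):
    """Evaluate every test case of a test.

    Returns (submitted, failures): the IDs of test cases that were submitted,
    and (test_case_id, error_message) pairs for the ones that raised.
    """
    submitted, failures = [], []
    for test_case in client.test_cases.list(test_id=test_id):
        try:
            output = product_fn(test_case.input)
            session = client.sessions.create(
                version_id=version_id, test_case_id=test_case.id
            )
            client.inference_results.create_and_evaluate(
                session_id=session.id, output=output, metrics=metrics
            )
            submitted.append(test_case.id)
        except Exception as exc:
            # Record the failure and keep going with the remaining test cases.
            failures.append((test_case.id, str(exc)))
    return submitted, failures


# Offline dry run with a stand-in client (hypothetical; it only mimics the
# attribute and argument names used in the example above).
calls = []
stub = SimpleNamespace(
    test_cases=SimpleNamespace(
        list=lambda test_id: [
            SimpleNamespace(id="tc_1", input="hello"),
            SimpleNamespace(id="tc_2", input="boom"),
        ]
    ),
    sessions=SimpleNamespace(
        create=lambda version_id, test_case_id: SimpleNamespace(id=f"s_{test_case_id}")
    ),
    inference_results=SimpleNamespace(
        create_and_evaluate=lambda session_id, output, metrics: calls.append(
            (session_id, output)
        )
    ),
)


def fake_product(text):
    # Simulates a product call that fails for one particular input.
    if text == "boom":
        raise RuntimeError("product failed")
    return text.upper()


submitted, failures = run_test_evaluations(
    stub, "test_1", "version_1", ["metric_a"], fake_product
)
print(f"Submitted: {submitted}")
print(f"Failed: {failures}")
```

With a real client, you would pass your initialized `galtea` object, your own `test_id` and `version_id`, and `METRICS_TO_EVALUATE` in place of the stand-ins. Collecting failures instead of raising is a design choice for batch runs; if you prefer to stop at the first error, drop the `try`/`except`.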
## Next Steps

- Automate test resolution, agent execution, and evaluation with specifications.
- Evaluate multi-turn conversations and production sessions.