# Waiting for Evaluations

> Poll evaluations until they leave PENDING status, then return the completed results.

After creating evaluations (via [`create()`](/sdk/api/evaluation/create) or [`run()`](/sdk/api/evaluation/run)), they start in `PENDING` status while the evaluation engine processes them. Use `wait_for()` to block until all evaluations have completed.

An evaluation is considered **complete** as soon as its status is anything other than `PENDING`, that is, `SUCCESS`, `FAILED`, `SKIPPED`, or `PENDING_HUMAN`. Complete does not mean successful, so check `status` on each returned evaluation.
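Since a completed batch can mix several statuses, it can help to tally them before reading scores. The following is a minimal sketch, not part of the SDK: `summarize_statuses` is a hypothetical helper, the `SimpleNamespace` objects are stand-ins for returned evaluations, and it assumes `status` compares equal to the plain strings listed above.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_statuses(evaluations):
    """Count evaluations per status (hypothetical helper, not an SDK function)."""
    return Counter(e.status for e in evaluations)


# Stand-ins for the objects returned by wait_for(); only `.id` and `.status` are used.
completed = [
    SimpleNamespace(id="eval-1", status="SUCCESS"),
    SimpleNamespace(id="eval-2", status="FAILED"),
    SimpleNamespace(id="eval-3", status="PENDING_HUMAN"),
    SimpleNamespace(id="eval-4", status="SUCCESS"),
]

summary = summarize_statuses(completed)

# Anything that is not SUCCESS probably deserves a closer look.
needs_attention = [e.id for e in completed if e.status != "SUCCESS"]

print(dict(summary))
print(needs_attention)
```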

## Usage

<Tabs>
  <Tab title="By Evaluation IDs">
    Wait for specific evaluations whose IDs you already have, typically from [`create()`](/sdk/api/evaluation/create) or from a [`run()`](/sdk/api/evaluation/run) with an agent.

    Returns evaluations in the **same order** as the input `evaluation_ids`.

    **Create and wait:**

    ```python theme={"system"}
    # Create evaluations and wait for them to complete
    evaluations = galtea.evaluations.create(
        session_id=session.id,
        metrics=[{"name": "Non-Toxic"}, {"name": "Unbiased"}],
    )

    # Wait for all evaluations to leave PENDING status
    completed = galtea.evaluations.wait_for(
        evaluation_ids=[e.id for e in evaluations],
    )

    for evaluation in completed:
        print(f"{evaluation.id}: {evaluation.status} — score: {evaluation.score}")
    ```

    **Custom timeout and poll interval:**

    ```python theme={"system"}
    # Wait with a custom timeout and poll interval
    completed = galtea.evaluations.wait_for(
        evaluation_ids=[e.id for e in evaluations],
        timeout=600,  # wait up to 10 minutes
        poll_interval=10,  # check every 10 seconds
    )
    ```

    **Full lifecycle — `run()` with agent, then `wait_for()`:**

    ```python theme={"system"}
    # Full lifecycle: run with agent, then wait for evaluations to finish processing
    result = galtea.evaluations.run(
        version_id=version_id,
        agent=my_agent,
    )

    # run() with agent returns evaluations in PENDING status — wait for them to complete
    evaluation_ids = [e.id for e in result["evaluations"]]
    completed = galtea.evaluations.wait_for(evaluation_ids=evaluation_ids)

    for evaluation in completed:
        print(f"Metric {evaluation.metric_id}: {evaluation.status} — {evaluation.score}")
    ```
  </Tab>

  <Tab title="By Job ID">
    Wait for an endpoint-connection job to complete, then automatically discover and collect all evaluations it produced. Use this when calling [`run()`](/sdk/api/evaluation/run) **without an agent**, since evaluation IDs are not available until the job finishes.

    The method handles the full lifecycle:

    1. Polls the job status until it completes
    2. Discovers all sessions created by the job
    3. Waits for evaluations to leave `PENDING` status
    4. Paginates through all results

    No specific result ordering is guaranteed.

    ```python theme={"system"}
    # Endpoint-connection mode: run() returns a jobId instead of evaluations
    result = galtea.evaluations.run(version_id=version_id)
    job_id = result["jobId"]

    # Wait for the job to complete and all evaluations to finish
    completed = galtea.evaluations.wait_for(job_id=job_id, timeout=600)

    for evaluation in completed:
        print(f"Metric {evaluation.metric_id}: {evaluation.status} — {evaluation.score}")
    ```

    <Tip>
      If you need to stop the job before it finishes, use [`galtea.jobs.cancel(job_id)`](/sdk/api/job/cancel) instead of waiting.
    </Tip>
  </Tab>
</Tabs>

## Returns

A list of [Evaluation](/concepts/product/version/session/evaluation) objects once all have left `PENDING` status. When using `evaluation_ids`, results are in the same order as the input. When using `job_id`, no specific result ordering is guaranteed.
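Because `job_id` results arrive in no guaranteed order, reports or snapshot tests built on them may want to impose one. The sketch below sorts by `metric_id` and then `id`, the attributes used in the examples on this page. `sort_for_report` is a hypothetical helper and the objects are stand-ins, not real SDK return values.

```python
from types import SimpleNamespace


def sort_for_report(evaluations):
    """Return evaluations in a stable order: by metric_id, then by id.

    Hypothetical helper; it assumes both attributes are comparable strings.
    """
    return sorted(evaluations, key=lambda e: (e.metric_id, e.id))


# Stand-ins for evaluations returned by wait_for(job_id=...), in arbitrary order.
completed = [
    SimpleNamespace(id="eval-c", metric_id="metric-2", status="SUCCESS"),
    SimpleNamespace(id="eval-b", metric_id="metric-1", status="FAILED"),
    SimpleNamespace(id="eval-a", metric_id="metric-2", status="SUCCESS"),
]

ordered = sort_for_report(completed)
ordered_ids = [e.id for e in ordered]
print(ordered_ids)
```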

## Parameters

<ResponseField name="evaluation_ids" type="list[str]">
  A list of evaluation IDs to wait for. Mutually exclusive with `job_id`.
</ResponseField>

<ResponseField name="job_id" type="str">
  The job ID returned by [`run()`](/sdk/api/evaluation/run) in endpoint-connection mode. Mutually exclusive with `evaluation_ids`.
</ResponseField>

<ResponseField name="timeout" type="int" default="300">
  Maximum number of seconds to wait before raising `TimeoutError`. When using `job_id`, this single budget covers both the job-polling and evaluation-polling phases.
</ResponseField>

<ResponseField name="poll_interval" type="int" default="5">
  Seconds to sleep between polling cycles.
</ResponseField>
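To illustrate how `timeout` and `poll_interval` interact, here is a generic polling loop. It is a sketch of the general technique under stated assumptions, not the SDK's actual implementation: `poll_until_done` and `fetch_statuses` are hypothetical names, and statuses are modelled as plain strings.

```python
import time


def poll_until_done(fetch_statuses, timeout=300, poll_interval=5):
    """Call fetch_statuses() until no status is PENDING or the deadline passes.

    Illustrative only: fetch_statuses is a hypothetical callable that returns
    the current list of status strings.
    """
    deadline = time.monotonic() + timeout
    while True:
        statuses = fetch_statuses()
        if all(status != "PENDING" for status in statuses):
            return statuses
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still PENDING after {timeout} seconds")
        time.sleep(poll_interval)


# Simulated backend: two polls with pending work, then everything has finished.
responses = iter([
    ["PENDING", "PENDING"],
    ["SUCCESS", "PENDING"],
    ["SUCCESS", "FAILED"],
])

# poll_interval=0 keeps the demonstration instant.
result = poll_until_done(lambda: next(responses), timeout=5, poll_interval=0)
print(result)
```

In a loop like this, a shorter `poll_interval` notices completion sooner at the cost of more requests, while `timeout` bounds the total wait regardless of how many polls fit inside it.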

## Errors

| Error          | Cause                                                            |
| -------------- | ---------------------------------------------------------------- |
| `ValueError`   | Neither `evaluation_ids` nor `job_id` provided, or both provided |
| `RuntimeError` | The job failed (only when using `job_id`)                        |
| `TimeoutError` | Timeout exceeded before all evaluations completed                |
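One way to handle the three documented errors is to wrap the call and turn each into a readable reason. This sketch assumes the errors are Python's built-in exception classes. `wait_safely` and `fake_wait_for` are hypothetical: the latter is a stub standing in for `galtea.evaluations.wait_for` so the example runs without the SDK.

```python
def wait_safely(wait_for, **kwargs):
    """Call a wait_for-style function; return (result, None) or (None, reason)."""
    try:
        return wait_for(**kwargs), None
    except ValueError as exc:
        return None, f"bad arguments: {exc}"
    except RuntimeError as exc:
        return None, f"job failed: {exc}"
    except TimeoutError as exc:
        return None, f"timed out: {exc}"


def fake_wait_for(evaluation_ids=None, job_id=None, timeout=300, poll_interval=5):
    """Stub that mimics the documented error conditions (not the real SDK method)."""
    if (evaluation_ids is None) == (job_id is None):
        raise ValueError("provide exactly one of evaluation_ids or job_id")
    if job_id == "failing-job":
        raise RuntimeError("job ended in a failed state")
    if timeout == 0:
        raise TimeoutError("evaluations still PENDING")
    return ["evaluation-stub"]


ok_result, ok_reason = wait_safely(fake_wait_for, evaluation_ids=["eval-1"])
_, bad_args_reason = wait_safely(fake_wait_for)
_, failed_reason = wait_safely(fake_wait_for, job_id="failing-job")
_, timeout_reason = wait_safely(fake_wait_for, job_id="slow-job", timeout=0)

print(ok_result, ok_reason)
print(bad_args_reason)
print(failed_reason)
print(timeout_reason)
```

With the real SDK, the same wrapper would presumably be called as `wait_safely(galtea.evaluations.wait_for, job_id=job_id)`.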