Runs Ensembling

An overview of runs ensembling.

Run ensembling is a technique where, from a single input, multiple runs are generated in parallel. Once all the runs succeed, the best run is selected based on some criteria. This technique is useful in situations such as:

  • Taking into account randomness in the model, and using repetitions to get a more reliable result.
  • Running the same model with different configurations to obtain the best results.
  • Comparing different models to determine which one is the best for a given input.

A normal run looks like the following image:

[Figure: Normal run]

In a normal run, the input is sent to an instance. This creates a unique run within the application. The run is then executed and the results are returned as an output.

An ensemble run works differently, as shown in the following image:

[Figure: Ensemble run]

In an ensemble run, the input is processed by an ensemble definition. The ensemble definition contains run groups, and these groups make it possible to create one or more runs from a single group, by means of repetitions.

Once the input is processed by the ensemble definition, one or more runs are created for each of the run groups. All the runs are executed in parallel, and once all the runs succeed, the best run is selected. Finally, the result is returned as an output.

An ensemble run uses the same mechanics as a normal run: it is identified by a single run_id, and it has a single input and a single output. The difference is that the ensemble run acts as a parent run that produces multiple child runs.

To work with ensemble runs, please follow these steps:

  1. Understand how it works. This is a detailed overview of how ensemble runs work and how the results are selected.

  2. Create an ensemble definition. Ensemble definitions are meant to be reusable among ensemble runs, and you only need to create them once.

  3. Run with ensembling. Execute an ensemble run.

More information about ensembling can be found in the sections below.

How it works

An ensemble run relies on the ensemble definition to create multiple child runs and then select the best one. As such, the ensemble definition is the key component of the ensemble run. The ensemble definition contains two core parts:

  • Run groups: These define how to generate runs from a single input.
  • Rules: These define how to select the best run from the generated runs.

After all the runs are executed, as defined by the run groups, the rules are evaluated and a single run is selected as the best run.

Run groups

A run group, as its name suggests, is a set of runs. The run group contains an instance which is responsible for running the input. The run group also contains the number of repetitions that will be executed. If the number of repetitions is greater than one, the input will be executed multiple times. A run group can also contain options that are used to configure each run in the group.

Consider the following example of run groups present in an ensemble definition:

{
  "run_groups": [
    {
      "id": "run-group-1",
      "instance_id": "production",
      "options": {
        "duration": "30",
        "threads": "2"
      },
      "repetitions": 3
    },
    {
      "id": "run-group-2",
      "instance_id": "production",
      "options": {
        "duration": "20",
        "threads": "4"
      },
      "repetitions": 2
    }
  ]
}

As a result, the following 5 runs would be created:

| Run Index | Run Group ID | Instance ID | Options | Repetition ID |
|-----------|--------------|-------------|---------|---------------|
| 1 | run-group-1 | production | {"duration": "30", "threads": "2"} | 1 |
| 2 | run-group-1 | production | {"duration": "30", "threads": "2"} | 2 |
| 3 | run-group-1 | production | {"duration": "30", "threads": "2"} | 3 |
| 4 | run-group-2 | production | {"duration": "20", "threads": "4"} | 1 |
| 5 | run-group-2 | production | {"duration": "20", "threads": "4"} | 2 |
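
The expansion from run groups to runs can be sketched in Python. This is a minimal illustration, not the platform's implementation: the function name expand_run_groups and the output keys (run_index, run_group_id, repetition_id) are hypothetical names chosen to mirror the table columns.

```python
def expand_run_groups(definition):
    """Create one planned run per repetition of every run group."""
    planned = []
    for group in definition["run_groups"]:
        for repetition in range(1, group["repetitions"] + 1):
            planned.append({
                "run_index": len(planned) + 1,  # hypothetical key, 1-based
                "run_group_id": group["id"],
                "instance_id": group["instance_id"],
                "options": dict(group.get("options", {})),  # options are optional
                "repetition_id": repetition,
            })
    return planned


definition = {
    "run_groups": [
        {"id": "run-group-1", "instance_id": "production",
         "options": {"duration": "30", "threads": "2"}, "repetitions": 3},
        {"id": "run-group-2", "instance_id": "production",
         "options": {"duration": "20", "threads": "4"}, "repetitions": 2},
    ]
}

planned_runs = expand_run_groups(definition)
for run in planned_runs:
    print(run["run_index"], run["run_group_id"], run["repetition_id"])
```

The total number of child runs is the sum of the repetitions over all run groups, which is 3 + 2 = 5 here.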

Rules

Rules are used to select the best run from the generated runs. The input before evaluating the rules is a set of runs. The output after evaluating the rules is a single run, which is the best one. Each rule has an index that gives the rules a priority order. The logic works as follows:

  • Rules are ordered by their index.

  • For the first rule: all the runs are sorted based on the objective and statistics_path of the rule. Each run should have a metric that can be extracted from its statistics, based on the path given by the rule’s statistics_path. If the objective is minimize, the runs are sorted in ascending order based on the metric. If the objective is maximize, the runs are sorted in descending order based on the metric.

  • The first run of the sorted list of runs is the best one for the first rule. For the remaining runs, the ones that fall within a tolerance of the first one (the best) are picked as well. A run falls within the tolerance if the delta (or difference) is less than or equal to the rule’s value. The difference can be absolute or relative.

    Here are the formulas used for calculating the delta, depending on the rule.

    Consider the following.

    \begin{flalign} & x^* \text{ is the metric of the best run (the first run).} &\\ & x \text{ is the metric of the run being compared.} & \end{flalign}
    • absolute difference and minimize objective:

      \begin{equation} \Delta_{abs}^{min} = x - x^* \end{equation}
    • absolute difference and maximize objective:

      \begin{equation} \Delta_{abs}^{max} = x^* - x \end{equation}
    • relative difference and minimize objective:

      \begin{equation} \Delta_{rel}^{min} = \frac{x - x^*}{x^*} \end{equation}
    • relative difference and maximize objective:

      \begin{equation} \Delta_{rel}^{max} = \frac{x^* - x}{x^*} \end{equation}
  • If no other runs fall within the tolerance, the search is complete and the first run is the best run.

  • If at least one other run falls within the tolerance of the best one, move on to the next rule. The list of competing runs is now the best run plus the runs that fell within the tolerance.

  • The next rule is applied in the same way as the previous one.

  • After all rules are applied, if there is more than one run left, the best one is picked randomly.
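
The selection logic described in this list can be sketched in Python. This is an illustration under stated assumptions, not the platform's implementation: the names delta, select_best, and metric_of are hypothetical, the example runs and metric names (profit, late_stops) are made up, and the sketch assumes the best metric is non-zero when a relative tolerance is used, because the text does not say how a zero metric is handled.

```python
import random


def delta(best, other, objective, tolerance_type):
    """Delta of `other` with respect to the best metric, per the four formulas."""
    diff = other - best if objective == "minimize" else best - other
    if tolerance_type == "relative":
        # Assumption: best != 0 (zero handling is not described in the text).
        return diff / best
    return diff


def select_best(runs, rules, metric_of, rng=random):
    """Apply the rules in index order and return the selected run.

    `metric_of(run, statistics_path)` is a hypothetical callback that returns
    the numeric metric of a run for a given rule.
    """
    candidates = list(runs)
    for rule in sorted(rules, key=lambda r: r["index"]):
        if len(candidates) == 1:
            break  # a single run is left, so the search is complete
        path, objective = rule["statistics_path"], rule["objective"]
        tolerance = rule["tolerance"]
        # Ascending order for minimize, descending order for maximize.
        ranked = sorted(candidates, key=lambda run: metric_of(run, path),
                        reverse=(objective == "maximize"))
        best_metric = metric_of(ranked[0], path)
        # Keep the best run plus every run within the tolerance of it.
        candidates = [ranked[0]] + [
            run for run in ranked[1:]
            if delta(best_metric, metric_of(run, path), objective,
                     tolerance["type"]) <= tolerance["value"]
        ]
    # If several runs survive all the rules, one is picked at random.
    return rng.choice(candidates)


runs = [
    {"id": "a", "metrics": {"profit": 100, "late_stops": 5}},
    {"id": "b", "metrics": {"profit": 95, "late_stops": 2}},
    {"id": "c", "metrics": {"profit": 60, "late_stops": 0}},
]
rules = [
    {"index": 0, "statistics_path": "profit", "objective": "maximize",
     "tolerance": {"type": "absolute", "value": 10}},
    {"index": 1, "statistics_path": "late_stops", "objective": "minimize",
     "tolerance": {"type": "relative", "value": 0.5}},
]
winner = select_best(runs, rules, lambda run, path: run["metrics"][path])
print(winner["id"])  # b
```

In this example the first rule keeps runs a and b (run c is 40 away from the best profit, above the tolerance of 10). The second rule ranks b first and drops a, whose relative delta is (5-2)/2=1.5, so b is selected.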

All the runs must have a status_v2 of succeeded and must contain JSON statistics in order to be successfully evaluated by the rules.
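
A check for this requirement could look like the following sketch. The helper name is_evaluable is hypothetical, the placement of status_v2 and statistics as top-level keys of a run object is an assumption, and the "failed" status value is made up for illustration.

```python
def is_evaluable(run):
    """Return True when a child run can be evaluated by the rules."""
    succeeded = run.get("status_v2") == "succeeded"
    has_json_statistics = isinstance(run.get("statistics"), dict)
    return succeeded and has_json_statistics


child_runs = [
    {"id": "run-1", "status_v2": "succeeded",
     "statistics": {"result": {"value": 500}}},
    {"id": "run-2", "status_v2": "succeeded"},  # no statistics at all
    {"id": "run-3", "status_v2": "failed",      # hypothetical status value
     "statistics": {"result": {"value": 325}}},
]

not_evaluable = [run["id"] for run in child_runs if not is_evaluable(run)]
print(not_evaluable)  # ['run-2', 'run-3']
```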

Consider the following example. You have the following list of child runs that were created for the parent ensemble run:

{
  "runs": [
    {
      "id": "run-1",
      "statistics": {
        "result": {
          "value": 500,
          "custom": {
            "unplanned_stops": 4,
            "max_travel_duration": 3600
          }
        }
      }
    },
    {
      "id": "run-2",
      "statistics": {
        "result": {
          "value": 298,
          "custom": {
            "unplanned_stops": 3,
            "max_travel_duration": 1200
          }
        }
      }
    },
    {
      "id": "run-3",
      "statistics": {
        "result": {
          "value": 325,
          "custom": {
            "unplanned_stops": 2,
            "max_travel_duration": 7200
          }
        }
      }
    }
  ]
}

Here are some cases of applying different rules and what the expected outcome would be.

Please note that the path in the statistics_path is specified using JSONPath notation.
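
To illustrate how a statistics_path maps to a metric, here is a minimal Python sketch. It is deliberately limited: the hypothetical helper resolve_path only handles simple dotted paths of the form $.a.b.c, which is the form used in the cases below, and it does not implement the rest of JSONPath (wildcards, filters, array indexes).

```python
def resolve_path(statistics, path):
    """Resolve a simple dotted JSONPath such as "$.result.value"."""
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path!r}")
    node = statistics
    for key in path[2:].split("."):
        node = node[key]  # raises KeyError if the metric is missing
    return node


# Statistics of run-2 from the example list of child runs.
statistics = {
    "result": {
        "value": 298,
        "custom": {"unplanned_stops": 3, "max_travel_duration": 1200},
    }
}

print(resolve_path(statistics, "$.result.value"))                   # 298
print(resolve_path(statistics, "$.result.custom.unplanned_stops"))  # 3
```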

  • Case 1

    • Best run: run-1
    • Explanation: After evaluating rule-1, run-1 is the best because it has the highest .result.value. run-2 has a difference of 500-298=202 and run-3 has a difference of 500-325=175. The tolerance is an absolute value of 100, so no other runs are within the tolerance of run-1, and run-1 is selected as the best.
    {
      "rules": [
        {
          "id": "rule-1",
          "statistics_path": "$.result.value",
          "tolerance": {
            "type": "absolute",
            "value": 100
          },
          "objective": "maximize",
          "index": 0
        },
        {
          "id": "rule-2",
          "statistics_path": "$.result.custom.unplanned_stops",
          "tolerance": {
            "type": "relative",
            "value": 0.2
          },
          "objective": "minimize",
          "index": 1
        },
        {
          "id": "rule-3",
          "statistics_path": "$.result.custom.max_travel_duration",
          "tolerance": {
            "type": "relative",
            "value": 0.3
          },
          "objective": "minimize",
          "index": 2
        }
      ]
    }
    
  • Case 2

    • Best run: run-3
    • Explanation: After evaluating rule-1, run-1 is the best given that it has the highest .result.value. The tolerance is now 180, which means that run-2 is not within the tolerance (as it has a difference of 500-298=202), but run-3 is, with a difference of 500-325=175. The incumbent runs are now run-1 and run-3. For rule-2, run-3 is the best run because it has the lowest number of unplanned stops, and the rule says to minimize the metric. The relative difference for run-1, with respect to run-3, is (4-2)/2=1, which is greater than the tolerance of 0.2. Given that run-1 does not fall within the tolerance, and only run-3 remains, it is selected as the best.
    {
      "rules": [
        {
          "id": "rule-1",
          "statistics_path": "$.result.value",
          "tolerance": {
            "type": "absolute",
            "value": 180
          },
          "objective": "maximize",
          "index": 0
        },
        {
          "id": "rule-2",
          "statistics_path": "$.result.custom.unplanned_stops",
          "tolerance": {
            "type": "relative",
            "value": 0.2
          },
          "objective": "minimize",
          "index": 1
        },
        {
          "id": "rule-3",
          "statistics_path": "$.result.custom.max_travel_duration",
          "tolerance": {
            "type": "relative",
            "value": 0.3
          },
          "objective": "minimize",
          "index": 2
        }
      ]
    }
    
  • Case 3

    • Best run: run-2
    • Explanation: After evaluating rule-1, run-1 is the best given that it has the highest .result.value. The tolerance is now 250, which means that both run-2 and run-3 are within the tolerance, with absolute differences of 500-298=202 and 500-325=175, respectively. The incumbent runs are now run-1, run-2, and run-3. For rule-2, run-3 is the best run because it has the lowest number of unplanned stops, and the rule says to minimize the metric. The tolerance is now 0.7, which means that only run-2 is within the tolerance, with a relative difference of (3-2)/2=0.5. run-1 has a relative difference of (4-2)/2=1, which is greater than the tolerance. The incumbent runs are now run-2 and run-3. For the last rule, rule-3, run-2 is the best run because it has the lowest max_travel_duration, and the rule says to minimize the metric. The relative difference for run-3, with respect to run-2, is (7200-1200)/1200=5, which is greater than the tolerance of 0.8. Given that only run-2 remains, it is selected as the best.
    {
      "rules": [
        {
          "id": "rule-1",
          "statistics_path": "$.result.value",
          "tolerance": {
            "type": "absolute",
            "value": 250
          },
          "objective": "maximize",
          "index": 0
        },
        {
          "id": "rule-2",
          "statistics_path": "$.result.custom.unplanned_stops",
          "tolerance": {
            "type": "relative",
            "value": 0.7
          },
          "objective": "minimize",
          "index": 1
        },
        {
          "id": "rule-3",
          "statistics_path": "$.result.custom.max_travel_duration",
          "tolerance": {
            "type": "relative",
            "value": 0.8
          },
          "objective": "minimize",
          "index": 2
        }
      ]
    }
    
  • Case 4

    • Best run: run-2 or run-3
    • Explanation: This is almost the same example as showcased in Case 3. The only difference is the tolerance value of rule-3, which is now 7.5. Given that for rule-3, run-2 was the best run, and the relative difference of run-3 was 5, we can now see that run-3 is within the tolerance of 7.5. As such, all the rules were evaluated, and the best run is selected randomly between run-2 and run-3.
    {
      "rules": [
        {
          "id": "rule-1",
          "statistics_path": "$.result.value",
          "tolerance": {
            "type": "absolute",
            "value": 250
          },
          "objective": "maximize",
          "index": 0
        },
        {
          "id": "rule-2",
          "statistics_path": "$.result.custom.unplanned_stops",
          "tolerance": {
            "type": "relative",
            "value": 0.7
          },
          "objective": "minimize",
          "index": 1
        },
        {
          "id": "rule-3",
          "statistics_path": "$.result.custom.max_travel_duration",
          "tolerance": {
            "type": "relative",
            "value": 7.5
          },
          "objective": "minimize",
          "index": 2
        }
      ]
    }
    

Create an ensemble definition

Note that all requests must be authenticated with Bearer Authentication. Make sure your request has a header containing your Nextmv Cloud API key, as follows:

  • Key: Authorization
  • Value: Bearer <YOUR-API-KEY>
Authorization: Bearer <YOUR-API-KEY>

The ensemble definition contains the properties that make it possible to create one or more runs from a single input. You only need to create an ensemble definition once and use the definition when creating ensemble runs.

To create an ensemble definition, use the following endpoint:

POST https://api.cloud.nextmv.io/v1/applications/{application_id}/ensembles

Create an ensemble definition.

curl -L -X POST \
    "https://api.cloud.nextmv.io/v1/applications/$APP_ID/ensembles" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $NEXTMV_API_KEY" \
    -d "{\"replace\": \"me\"}"

The request body must be a JSON object that follows the ensemble definition schema.
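
As an alternative to filling in the placeholder body of the curl command by hand, the request can be assembled in Python with the standard library. This is a sketch, not an official client: the function name build_create_ensemble_request is hypothetical, and my-app and my-api-key are placeholder values. The request is only built here, not sent.

```python
import json
import urllib.request


def build_create_ensemble_request(app_id, api_key, definition):
    """Build (but do not send) the POST request that creates a definition."""
    url = f"https://api.cloud.nextmv.io/v1/applications/{app_id}/ensembles"
    return urllib.request.Request(
        url,
        data=json.dumps(definition).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


definition = {
    "id": "ensemble-definition-1",
    "name": "First ensemble definition",
    "run_groups": [
        {"id": "run-group-1", "instance_id": "production", "repetitions": 3},
    ],
    "rules": [
        {"id": "rule-1", "statistics_path": "$.result.value",
         "tolerance": {"type": "absolute", "value": 500},
         "objective": "maximize", "index": 0},
    ],
}

request = build_create_ensemble_request("my-app", "my-api-key", definition)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```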

Run with ensembling

Once an ensemble definition is created, and you have the definition_id, you can use it to run with ensembling. To start an ensemble run, the process is very similar to starting a normal run. The difference lies in the configuration that is sent in the request body.

To run with ensembling, the run_type object must be present in the configuration:

{
  "configuration": {
    "run_type": {
      "type": "ensemble",
      "definition_id": "ensemble-definition-1"
    }
  }
}

Note that the definition_id must be the ID of an existing ensemble definition. The ensemble definition is meant to be reusable among ensemble runs.
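
The following Python sketch shows one way to assemble a request body that carries this configuration. The helper name build_ensemble_run_body is hypothetical, and placing configuration next to a top-level input key is an assumption based on the shape of the run request shown in the curl example of this section.

```python
import json


def build_ensemble_run_body(input_data, definition_id):
    """Combine the run input with the ensemble run_type configuration."""
    return {
        "input": input_data,  # assumed top-level key, as in the curl example
        "configuration": {
            "run_type": {
                "type": "ensemble",
                "definition_id": definition_id,
            }
        },
    }


# The input content is made up; use your own model input here.
body = build_ensemble_run_body({"stops": []}, "ensemble-definition-1")
print(json.dumps(body, indent=2))
```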

This configuration is used within the context of the endpoint for starting a new application run.

POST https://api.cloud.nextmv.io/v1/applications/{application_id}/runs

New application run.

Create new application run.

curl -sS -L -X POST \
  "https://api.cloud.nextmv.io/v1/applications/$APP_ID/runs?instance_id=$INSTANCE_ID" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $NEXTMV_API_KEY" \
  -d "$(jq -c '{"input": ., "options": {"duration": "2"}}' "$INPUT_FILE")" | jq

You can poll or use webhooks to wait for the ensemble run to finish, signaled by a status_v2 of succeeded. Once the ensemble run is finished, you can get the results.
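
A polling loop can be sketched as follows. This is a generic illustration, not an official client: wait_for_success and fetch_status are hypothetical names, fetch_status stands for whatever call you use to read the status_v2 of a run, and the "queued" and "running" values in the example are made up. Since only the succeeded status is described here, the sketch stops on success or on a timeout.

```python
import time


def wait_for_success(fetch_status, run_id, interval=2.0, timeout=300.0,
                     sleep=time.sleep):
    """Poll until the run reports a status_v2 of succeeded, or time out."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status(run_id)
        if status == "succeeded":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"run {run_id} did not succeed within {timeout} seconds "
                f"(last status: {status})"
            )
        sleep(interval)


# Fake status source for demonstration; a real one would call the API.
fake_statuses = iter(["queued", "running", "succeeded"])
result = wait_for_success(lambda run_id: next(fake_statuses), "run-123",
                          sleep=lambda seconds: None)
print(result)  # succeeded
```

Passing sleep as a parameter keeps the loop easy to test without real waiting.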

When using the endpoint to get the results of an ensemble run, the results of the best child run are returned. This lets you use the same workflow for ensemble runs as for normal runs.

GET https://api.cloud.nextmv.io/v1/applications/{application_id}/runs/{run_id}

Get run result.

Get the result of a run.

curl -sS -L -X GET \
  "https://api.cloud.nextmv.io/v1/applications/$APP_ID/runs/$RUN_ID" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $NEXTMV_API_KEY" | jq

To get the ensemble run result, which is the detailed information about how the best run was chosen, you can use the following endpoint.

GET https://api.cloud.nextmv.io/v1/applications/{application_id}/runs/{run_id}/ensemble

Get ensemble run results.

Get ensemble run results specified by application and run ID.

curl -L -X GET \
    "https://api.cloud.nextmv.io/v1/applications/$APP_ID/runs/$RUN_ID/ensemble" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $NEXTMV_API_KEY"

Manage ensemble definitions

To manage the ensemble definitions, you can use the following endpoints:

  • List existing ensemble definitions.

    GET https://api.cloud.nextmv.io/v1/applications/{application_id}/ensembles

    List all ensemble definitions.

    List all ensemble definitions for an application.

    curl -L -X GET \
        "https://api.cloud.nextmv.io/v1/applications/$APP_ID/ensembles" \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer $NEXTMV_API_KEY"
    
  • Get an ensemble definition.

    GET https://api.cloud.nextmv.io/v1/applications/{application_id}/ensembles/{ensemble_id}

    Get ensemble definition.

    Get ensemble definition specified by application and ensemble ID.

    curl -L -X GET \
        "https://api.cloud.nextmv.io/v1/applications/$APP_ID/ensembles/$ENSEMBLE_ID" \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer $NEXTMV_API_KEY"
    
  • Update an ensemble definition.

    PATCH https://api.cloud.nextmv.io/v1/applications/{application_id}/ensembles/{ensemble_id}

    Update an ensemble definition.

    curl -L -X PATCH \
        "https://api.cloud.nextmv.io/v1/applications/$APP_ID/ensembles/$ENSEMBLE_ID" \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer $NEXTMV_API_KEY" \
        -d "{\"replace\": \"me\"}"
    
  • Delete an ensemble definition.

    DELETE https://api.cloud.nextmv.io/v1/applications/{application_id}/ensembles/{ensemble_id}

    Delete ensemble definition.

    Delete an ensemble definition.

    curl -L -X DELETE \
        "https://api.cloud.nextmv.io/v1/applications/$APP_ID/ensembles/$ENSEMBLE_ID" \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer $NEXTMV_API_KEY"
    

Ensemble definition schema

Here is an example of an ensemble definition:

{
  "id": "ensemble-definition-1",
  "name": "First ensemble definition",
  "description": "Ensemble definition description",
  "run_groups": [
    {
      "id": "run-group-1",
      "instance_id": "production",
      "options": {
        "duration": "30",
        "threads": "2"
      },
      "repetitions": 3
    },
    {
      "id": "run-group-2",
      "instance_id": "production",
      "options": {
        "duration": "20",
        "threads": "4"
      },
      "repetitions": 2
    }
  ],
  "rules": [
    {
      "id": "rule-1",
      "statistics_path": "$.result.value",
      "tolerance": {
        "type": "absolute",
        "value": 500
      },
      "objective": "maximize",
      "index": 0
    },
    {
      "id": "rule-2",
      "statistics_path": "$.result.custom.unplanned_stops",
      "tolerance": {
        "type": "relative",
        "value": 0.15
      },
      "objective": "minimize",
      "index": 1
    }
  ]
}

The ensemble definition follows this schema:

| Field name | Required | Data type | Description |
|------------|----------|-----------|-------------|
| id | Yes | string | A unique identifier for the ensemble definition. |
| name | Yes | string | A human-readable name for the ensemble definition. |
| description | No | string | An optional description of the ensemble definition. |
| run_groups | Yes | array of run_group | An array of run groups that define how to generate runs. |
| rules | Yes | array of rule | An ordered array of rules that define how to select the best run. |

A run_group determines how to create a group of runs from a single input. It follows this schema:

| Field name | Required | Data type | Description |
|------------|----------|-----------|-------------|
| id | Yes | string | A unique identifier for the run group. |
| instance_id | Yes | string | The instance that will process the input. |
| options | No | object | The options that will be used to configure the run. These options are applied on top of the instance’s options. |
| repetitions | Yes | integer | The number of times the input will be executed. |

A rule performs an evaluation on a set of runs and chooses the runs that best fulfill the given objective on the specified metric. It follows this schema:

| Field name | Required | Data type | Description |
|------------|----------|-----------|-------------|
| id | Yes | string | A unique identifier for the rule. |
| statistics_path | Yes | string | The JSONPath path to the metric contained in the .statistics of the run. |
| tolerance | Yes | tolerance | The tolerance that will be used to compare the metric values. |
| objective | Yes | string | The objective that will be used to select the best run. Must be one of: maximize, minimize. |
| index | Yes | integer | The index of the rule. The index gives different rules a priority order. |

The tolerance specifies how to compare a run against the best run in a rule. It follows this schema:

| Field name | Required | Data type | Description |
|------------|----------|-----------|-------------|
| type | Yes | string | The type of tolerance. Must be one of: absolute, relative. |
| value | Yes | float | The value of the tolerance. |
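
Before sending a definition, it can be useful to check it locally against the tables above. The following Python sketch covers required fields, allowed values, and rule ID uniqueness only. The names REQUIRED, missing_fields, and validate_definition are hypothetical, and the server may apply additional validation that is not described here.

```python
REQUIRED = {
    "definition": ("id", "name", "run_groups", "rules"),
    "run_group": ("id", "instance_id", "repetitions"),
    "rule": ("id", "statistics_path", "tolerance", "objective", "index"),
    "tolerance": ("type", "value"),
}


def missing_fields(obj, kind, where):
    """List an error for every required field of `kind` absent from `obj`."""
    return [f"{where}: missing required field '{field}'"
            for field in REQUIRED[kind] if field not in obj]


def validate_definition(definition):
    """Return a list of human-readable problems (empty when none are found)."""
    errors = missing_fields(definition, "definition", "definition")
    for i, group in enumerate(definition.get("run_groups", [])):
        errors += missing_fields(group, "run_group", f"run_groups[{i}]")
    seen_rule_ids = set()
    for i, rule in enumerate(definition.get("rules", [])):
        where = f"rules[{i}]"
        errors += missing_fields(rule, "rule", where)
        rule_id = rule.get("id")
        if rule_id is not None and rule_id in seen_rule_ids:
            errors.append(f"{where}: duplicate rule id '{rule_id}'")
        seen_rule_ids.add(rule_id)
        if "objective" in rule and rule["objective"] not in ("maximize", "minimize"):
            errors.append(f"{where}: objective must be maximize or minimize")
        if "tolerance" in rule:
            tolerance = rule["tolerance"]
            errors += missing_fields(tolerance, "tolerance", f"{where}.tolerance")
            if "type" in tolerance and tolerance["type"] not in ("absolute", "relative"):
                errors.append(f"{where}.tolerance: type must be absolute or relative")
    return errors


# A deliberately broken definition: no repetitions, and an invalid objective.
broken = {
    "id": "ensemble-definition-1",
    "name": "First ensemble definition",
    "run_groups": [{"id": "run-group-1", "instance_id": "production"}],
    "rules": [
        {"id": "rule-1", "statistics_path": "$.result.value",
         "tolerance": {"type": "absolute", "value": 500},
         "objective": "min", "index": 0},
    ],
}
problems = validate_definition(broken)
for problem in problems:
    print(problem)
```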
