Runs ensembling

Run ensembling is a technique where, from a single input, multiple runs are generated in parallel. Once all the runs succeed, the best run is selected based on some criteria. This technique is useful in situations such as:

Taking into account randomness in the model, and using repetitions to get a more reliable result.
Running the same model with different configurations to obtain the best results.
Comparing different models to determine which one is the best for a given input.

A normal run looks like the following image:

Normal run

In a normal run, the input is sent to an instance. This creates a unique run within the application. The run is then executed and the results are returned as an output.

An ensemble run works differently, as shown in the following image:

Ensemble run

In an ensemble run, the input is processed by an ensemble definition. The ensemble definition contains run groups, and these groups make it possible to create one or more runs from a single group, by means of repetitions.

Once the input is processed by the ensemble definition, one or more runs are created for each of the run groups. All the runs are executed in parallel, and once all the runs succeed, the best run is selected. Finally, the result is returned as an output.

An ensemble run uses the same mechanics as a normal run: it is identified by a single run_id and has a single input and a single output. The difference is that the ensemble run acts as a parent run that produces multiple child runs.

To work with ensemble runs, please follow these steps:

Understand how it works. This is a detailed overview of how ensemble runs work and how the results are selected.
Create an ensemble definition. Ensemble definitions are meant to be reusable among ensemble runs, and you only need to create them once.
Run with ensembling. Execute an ensemble run.

More information about ensembling can be found in these sections:

Manage ensemble definitions. This is the process of creating, updating, and deleting ensemble definitions.
Ensemble definition schema. This is the schema that defines the structure of an ensemble definition.

How it works

An ensemble run relies on the ensemble definition to create multiple child runs and then select the best one. As such, the ensemble definition is the key component of the ensemble run. The ensemble definition contains two core parts:

Run groups: These define how to generate runs from a single input.
Rules: These define how to select the best run from the generated runs.

After all the runs are executed, as defined by the run groups, the rules are evaluated and a single run is selected as the best run.

Run groups

A run group, as its name suggests, is a set of runs. The run group contains an instance which is responsible for running the input. The run group also contains the number of repetitions that will be executed. If the number of repetitions is greater than one, the input will be executed multiple times. A run group can also contain options that are used to configure each run in the group.

Consider the following example of run groups present in an ensemble definition:

JSON

{
  "run_groups": [
    {
      "id": "run-group-1",
      "instance_id": "production",
      "options": {
        "duration": "30",
        "threads": "2"
      },
      "repetitions": 3
    },
    {
      "id": "run-group-2",
      "instance_id": "production",
      "options": {
        "duration": "20",
        "threads": "4"
      },
      "repetitions": 2
    }
  ]
}

As a result, the following 5 runs would be created:

Run Index	Run Group ID	Instance ID	Options	Repetition ID
1	`run-group-1`	`production`	`{"duration": "30", "threads": "2"}`	1
2	`run-group-1`	`production`	`{"duration": "30", "threads": "2"}`	2
3	`run-group-1`	`production`	`{"duration": "30", "threads": "2"}`	3
4	`run-group-2`	`production`	`{"duration": "20", "threads": "4"}`	1
5	`run-group-2`	`production`	`{"duration": "20", "threads": "4"}`	2
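The expansion of run groups into individual runs can be sketched in Python. This is only an illustration of the table above, not the platform's implementation; the function name `expand_run_groups` and the output keys (`run_index`, `run_group_id`, `repetition_id`) are hypothetical names chosen to mirror the table columns.

```python
def expand_run_groups(run_groups):
    """Return one planned run per repetition of every run group, in order."""
    runs = []
    for group in run_groups:
        for repetition in range(1, group["repetitions"] + 1):
            runs.append(
                {
                    "run_index": len(runs) + 1,
                    "run_group_id": group["id"],
                    "instance_id": group["instance_id"],
                    # Copy the options so each planned run owns its own dict.
                    "options": dict(group.get("options", {})),
                    "repetition_id": repetition,
                }
            )
    return runs


# The same run groups as in the example definition above.
run_groups = [
    {"id": "run-group-1", "instance_id": "production",
     "options": {"duration": "30", "threads": "2"}, "repetitions": 3},
    {"id": "run-group-2", "instance_id": "production",
     "options": {"duration": "20", "threads": "4"}, "repetitions": 2},
]

planned = expand_run_groups(run_groups)
for run in planned:
    print(run["run_index"], run["run_group_id"], run["repetition_id"])
```

The total number of runs is the sum of the repetitions over all run groups, which is 3 + 2 = 5 here.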

Rules

Rules are used to select the best run from the generated runs. The input before evaluating the rules is a set of runs. The output after evaluating the rules is a single run, which is the best one. Each rule has an index that gives the rules a priority order. The logic works as follows:

Rules are ordered by their index.
For the first rule: all the runs are sorted based on the objective and statistics_path of the rule. Each run should have a metric that can be extracted from its statistics, based on the path given by the rule’s statistics_path. If the objective is minimize, the runs are sorted in ascending order based on the metric. If the objective is maximize, the runs are sorted in descending order based on the metric.
The first run of the sorted list of runs is the best one for the first rule. For the remaining runs, the ones that fall within a tolerance of the first one (the best) are picked as well. A run falls within the tolerance if the delta (or difference) is less than or equal to the rule’s tolerance value. The difference can be absolute or relative, depending on the tolerance type.
Here are the formulas used for calculating the delta, depending on the rule. In these formulas:
\begin{flalign} & x^* \text{ is the metric of the best run (the first run).} &\\ & x \text{ is the metric of the run being compared.} & \end{flalign}
- Absolute difference and minimize objective:
  \begin{equation} \Delta_{abs}^{min} = x - x^* \end{equation}
- Absolute difference and maximize objective:
  \begin{equation} \Delta_{abs}^{max} = x^* - x \end{equation}
- Relative difference and minimize objective:
  \begin{equation} \Delta_{rel}^{min} = \frac{x - x^*}{x^*} \end{equation}
- Relative difference and maximize objective:
  \begin{equation} \Delta_{rel}^{max} = \frac{x^* - x}{x^*} \end{equation}
If no other runs fall within the tolerance, the search is complete and the first run is the best run.
If there is more than one run within the tolerance of the best one, move on to the next rule. Now, the list of competing runs is the best run and the runs that fell within the tolerance.
The next rule is applied in the same way as the previous one.
After all rules are applied, if there is more than one run left, the best one is picked randomly.

All the runs must have a status_v2 of succeeded and must contain JSON statistics in order to be successfully evaluated by the rules.
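The delta formulas in the list above can be written as a small Python function. This is a minimal sketch: the names `delta` and `within_tolerance` are hypothetical, and the behavior when the best metric is zero under a relative tolerance is not described in this section, so the sketch simply lets Python raise a `ZeroDivisionError` in that case.

```python
def delta(best, other, objective, tolerance_type):
    """Difference between a run's metric (other) and the best run's metric."""
    if objective == "minimize":
        diff = other - best
    elif objective == "maximize":
        diff = best - other
    else:
        raise ValueError(f"unknown objective: {objective}")

    if tolerance_type == "absolute":
        return diff
    if tolerance_type == "relative":
        # Assumption: best is non-zero; otherwise this division fails.
        return diff / best
    raise ValueError(f"unknown tolerance type: {tolerance_type}")


def within_tolerance(best, other, rule):
    """True if the run's delta is less than or equal to the rule's tolerance value."""
    tolerance = rule["tolerance"]
    return delta(best, other, rule["objective"], tolerance["type"]) <= tolerance["value"]


# Worked numbers: maximize with an absolute tolerance, minimize with a relative one.
print(delta(500, 325, "maximize", "absolute"))  # 500 - 325
print(delta(2, 3, "minimize", "relative"))      # (3 - 2) / 2
```

Because the runs are sorted by the rule's objective before the comparison, the delta of every remaining run is non-negative, so a single "less than or equal" check is enough.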

Consider the following example. You have the following list of child runs that were created for the parent ensemble run:

JSON

{
  "runs": [
    {
      "id": "run-1",
      "statistics": {
        "result": {
          "value": 500,
          "custom": {
            "unplanned_stops": 4,
            "max_travel_duration": 3600
          }
        }
      }
    },
    {
      "id": "run-2",
      "statistics": {
        "result": {
          "value": 298,
          "custom": {
            "unplanned_stops": 3,
            "max_travel_duration": 1200
          }
        }
      }
    },
    {
      "id": "run-3",
      "statistics": {
        "result": {
          "value": 325,
          "custom": {
            "unplanned_stops": 2,
            "max_travel_duration": 7200
          }
        }
      }
    }
  ]
}

Here are some cases of applying different rules and what the expected outcome would be.

Please note that the path in the statistics_path is specified using JSONPath notation.

Case 1

Best run: run-1
Explanation: After evaluating rule-1, there were no runs that fell within the tolerance. run-2 has a difference of 500-298=202 and run-3 has a difference of 500-325=175. The tolerance is an absolute value of 100, so no other runs are within the tolerance of run-1.

JSON

{
  "rules": [
    {
      "id": "rule-1",
      "statistics_path": "$.result.value",
      "tolerance": {
        "type": "absolute",
        "value": 100
      },
      "objective": "maximize",
      "index": 0
    },
    {
      "id": "rule-2",
      "statistics_path": "$.result.custom.unplanned_stops",
      "tolerance": {
        "type": "relative",
        "value": 0.2
      },
      "objective": "minimize",
      "index": 1
    },
    {
      "id": "rule-3",
      "statistics_path": "$.result.custom.max_travel_duration",
      "tolerance": {
        "type": "relative",
        "value": 0.3
      },
      "objective": "minimize",
      "index": 2
    }
  ]
}

Case 2

Best run: run-3
Explanation: After evaluating rule-1, run-1 is the best given that it has the highest .result.value. The tolerance is now 180, which means that run-2 is not within the tolerance (as it has a difference of 500-298=202), but run-3 is, with a difference of 500-325=175. The incumbent runs are now run-1 and run-3. For rule-2, run-3 is the best run because it has the lowest number of unplanned stops, and the rule says to minimize the metric. The relative difference for run-1, with respect to run-3, is (4-2)/2=1, which is greater than the tolerance of 0.2. Given that run-1 does not fall within the tolerance, and only run-3 remains, it is selected as the best.

JSON

{
  "rules": [
    {
      "id": "rule-1",
      "statistics_path": "$.result.value",
      "tolerance": {
        "type": "absolute",
        "value": 180
      },
      "objective": "maximize",
      "index": 0
    },
    {
      "id": "rule-2",
      "statistics_path": "$.result.custom.unplanned_stops",
      "tolerance": {
        "type": "relative",
        "value": 0.2
      },
      "objective": "minimize",
      "index": 1
    },
    {
      "id": "rule-3",
      "statistics_path": "$.result.custom.max_travel_duration",
      "tolerance": {
        "type": "relative",
        "value": 0.3
      },
      "objective": "minimize",
      "index": 2
    }
  ]
}

Case 3

Best run: run-2
Explanation: After evaluating rule-1, run-1 is the best given that it has the highest .result.value. The tolerance is now 250, which means that both run-2 and run-3 are within the tolerance, with absolute differences of 500-298=202 and 500-325=175, respectively. The incumbent runs are now run-1, run-2, and run-3. For rule-2, run-3 is the best run because it has the lowest number of unplanned stops, and the rule says to minimize the metric. The tolerance is now 0.7, which means that only run-2 is within the tolerance, with a relative difference of (3-2)/2=0.5. run-1 has a relative difference of (4-2)/2=1, which is greater than the tolerance. The incumbent runs are now run-2 and run-3. For the last rule, rule-3, run-2 is the best run because it has the lowest max_travel_duration, and the rule says to minimize the metric. The relative difference for run-3, with respect to run-2, is (7200-1200)/1200=5, which is greater than the tolerance of 0.8. Given that only run-2 remains, it is selected as the best.

JSON

{
  "rules": [
    {
      "id": "rule-1",
      "statistics_path": "$.result.value",
      "tolerance": {
        "type": "absolute",
        "value": 250
      },
      "objective": "maximize",
      "index": 0
    },
    {
      "id": "rule-2",
      "statistics_path": "$.result.custom.unplanned_stops",
      "tolerance": {
        "type": "relative",
        "value": 0.7
      },
      "objective": "minimize",
      "index": 1
    },
    {
      "id": "rule-3",
      "statistics_path": "$.result.custom.max_travel_duration",
      "tolerance": {
        "type": "relative",
        "value": 0.8
      },
      "objective": "minimize",
      "index": 2
    }
  ]
}

Case 4

Best run: run-2 or run-3
Explanation: This is almost the same example as showcased in Case 3. The only difference is the tolerance value of rule-3, which is now 7.5. Given that for rule-3, run-2 was the best run, and the relative difference of run-3 was 5, we can now see that run-3 is within the tolerance of 7.5. As such, all the rules were evaluated, and the best run is selected randomly between run-2 and run-3.

JSON

{
  "rules": [
    {
      "id": "rule-1",
      "statistics_path": "$.result.value",
      "tolerance": {
        "type": "absolute",
        "value": 250
      },
      "objective": "maximize",
      "index": 0
    },
    {
      "id": "rule-2",
      "statistics_path": "$.result.custom.unplanned_stops",
      "tolerance": {
        "type": "relative",
        "value": 0.7
      },
      "objective": "minimize",
      "index": 1
    },
    {
      "id": "rule-3",
      "statistics_path": "$.result.custom.max_travel_duration",
      "tolerance": {
        "type": "relative",
        "value": 7.5
      },
      "objective": "minimize",
      "index": 2
    }
  ]
}
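The four cases above can be reproduced with a short Python sketch of the selection logic. It is an illustration under stated assumptions rather than the platform's implementation: the helper names (`metric`, `delta`, `select_candidates`, `select_best`, `make_run`, `make_rules`) are hypothetical, and `metric` only understands simple dotted paths such as `$.result.value`, not full JSONPath.

```python
import random


def metric(run, statistics_path):
    """Resolve a simple dotted path such as "$.result.value" in a run's statistics."""
    if not statistics_path.startswith("$."):
        raise ValueError(f"unsupported path: {statistics_path}")
    value = run["statistics"]
    for key in statistics_path[2:].split("."):
        value = value[key]
    return value


def delta(best, other, rule):
    """Absolute or relative difference to the best metric, per the rule."""
    diff = other - best if rule["objective"] == "minimize" else best - other
    return diff / best if rule["tolerance"]["type"] == "relative" else diff


def select_candidates(runs, rules):
    """Apply the rules in index order and return the runs that are still tied."""
    candidates = list(runs)
    for rule in sorted(rules, key=lambda r: r["index"]):
        path = rule["statistics_path"]
        candidates.sort(key=lambda run: metric(run, path),
                        reverse=rule["objective"] == "maximize")
        best = metric(candidates[0], path)
        candidates = [candidates[0]] + [
            run for run in candidates[1:]
            if delta(best, metric(run, path), rule) <= rule["tolerance"]["value"]
        ]
        if len(candidates) == 1:
            break  # Nothing within the tolerance: the search is complete.
    return candidates


def select_best(runs, rules, rng=random):
    """Pick randomly among the runs left after all rules are applied."""
    return rng.choice(select_candidates(runs, rules))


def make_run(run_id, value, stops, duration):
    custom = {"unplanned_stops": stops, "max_travel_duration": duration}
    return {"id": run_id, "statistics": {"result": {"value": value, "custom": custom}}}


def make_rules(tol_1, tol_2, tol_3):
    specs = [
        ("rule-1", "$.result.value", "absolute", tol_1, "maximize"),
        ("rule-2", "$.result.custom.unplanned_stops", "relative", tol_2, "minimize"),
        ("rule-3", "$.result.custom.max_travel_duration", "relative", tol_3, "minimize"),
    ]
    return [
        {"id": rule_id, "statistics_path": path, "objective": objective,
         "tolerance": {"type": kind, "value": value}, "index": index}
        for index, (rule_id, path, kind, value, objective) in enumerate(specs)
    ]


runs = [make_run("run-1", 500, 4, 3600), make_run("run-2", 298, 3, 1200),
        make_run("run-3", 325, 2, 7200)]
cases = {"Case 1": (100, 0.2, 0.3), "Case 2": (180, 0.2, 0.3),
         "Case 3": (250, 0.7, 0.8), "Case 4": (250, 0.7, 7.5)}

results = {name: sorted(run["id"] for run in select_candidates(runs, make_rules(*tols)))
           for name, tols in cases.items()}
for name, ids in results.items():
    print(name, ids)
```

In Case 4 two runs remain after the last rule, so `select_best` would return either of them at random, which matches the "run-2 or run-3" outcome described above.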

Create an ensemble definition

Note that all requests must be authenticated with Bearer authentication. Make sure each request includes a header containing your Nextmv Cloud API key, as follows:

Key: Authorization
Value: Bearer <YOUR-API-KEY>

Bash

Authorization: Bearer <YOUR-API-KEY>

The ensemble definition contains the properties that make it possible to create one or more runs from a single input. You only need to create an ensemble definition once and use the definition when creating ensemble runs.

To create an ensemble definition, use the following endpoint:

POST https://api.cloud.nextmv.io/v1/applications/{application_id}/ensembles

Bash

curl -L -X POST \
    "https://api.cloud.nextmv.io/v1/applications/$APP_ID/ensembles" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $NEXTMV_API_KEY" \
    -d "{\"replace\": \"me\"}"

The request body must be a JSON object that follows the ensemble definition schema.
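As an illustration of a concrete request body, the following Python sketch builds the same POST request with the standard library instead of the placeholder payload. The helper name `build_create_request`, the application ID `my-app`, and the API key `my-api-key` are hypothetical placeholders; the definition values are taken from the examples in this page. The request is only constructed here, not sent.

```python
import json
import urllib.request

API_BASE = "https://api.cloud.nextmv.io/v1"


def build_create_request(application_id, api_key, definition):
    """Build (but do not send) the request that creates an ensemble definition."""
    url = f"{API_BASE}/applications/{application_id}/ensembles"
    return urllib.request.Request(
        url,
        data=json.dumps(definition).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


definition = {
    "id": "ensemble-definition-1",
    "name": "First ensemble definition",
    "run_groups": [
        {"id": "run-group-1", "instance_id": "production", "repetitions": 3},
    ],
    "rules": [
        {"id": "rule-1", "statistics_path": "$.result.value",
         "tolerance": {"type": "absolute", "value": 500},
         "objective": "maximize", "index": 0},
    ],
}

request = build_create_request("my-app", "my-api-key", definition)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```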

Run with ensembling

Once an ensemble definition is created, and you have the definition_id, you can use it to run with ensembling. To start an ensemble run, the process is very similar to starting a normal run. The difference lies in the configuration that is sent in the request body.

To run with ensembling, the run_type object must be present in the configuration:

JSON

{
  "configuration": {
    "run_type": {
      "type": "ensemble",
      "definition_id": "ensemble-definition-1"
    }
  }
}

Note that the definition_id must be the ID of an existing ensemble definition. The ensemble definition is meant to be reusable among ensemble runs.

This configuration is used within the context of the endpoint for starting a new application run.

POST https://api.cloud.nextmv.io/v1/applications/{application_id}/runs

Bash

curl -sS -L -X POST \
  "https://api.cloud.nextmv.io/v1/applications/$APP_ID/runs?instance_id=$INSTANCE_ID" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $NEXTMV_API_KEY" \
  -d "$(jq -c '{"input": ., "options": {"duration": "2"}}' "$INPUT_FILE")" | jq
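For an ensemble run, the request body also needs the run_type configuration described earlier. The Python sketch below assembles such a body. It assumes that the `configuration` object sits at the top level of the body next to `input` and `options`, which is inferred from the snippets in this section rather than confirmed by it; the helper name `build_ensemble_run_body` and the sample input are hypothetical.

```python
import json


def build_ensemble_run_body(input_data, definition_id, options=None):
    """Assemble a run request body that asks for an ensemble run."""
    body = {
        "input": input_data,
        # Assumption: "configuration" is a top-level key of the request body.
        "configuration": {
            "run_type": {"type": "ensemble", "definition_id": definition_id},
        },
    }
    if options:
        body["options"] = options
    return body


body = build_ensemble_run_body({"example": "input"}, "ensemble-definition-1")
print(json.dumps(body, indent=2))
```

The resulting JSON string can be passed as the `-d` payload of the request that starts a new application run.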

You can poll or use webhooks to wait for the ensemble run to finish, which is signaled by a status_v2 of succeeded. Once the ensemble run is finished, you can get the results.
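A polling loop can be sketched in Python as follows. The function `wait_for_run` is hypothetical, and so is its `get_status` argument: a callable you provide that fetches the run and returns its status_v2 value. This section only names the succeeded status, so other terminal statuses are left to the caller through `stop_statuses`; the `running` status in the usage example is a made-up placeholder.

```python
import time


def wait_for_run(get_status, stop_statuses=("succeeded",), interval_seconds=5.0,
                 timeout_seconds=600.0, sleep=time.sleep):
    """Call get_status() until it returns a stop status or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in stop_statuses:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"run still has status {status!r} after {timeout_seconds} seconds"
            )
        sleep(interval_seconds)


# Usage with a fake status source, so the example runs without network access.
fake_statuses = iter(["running", "running", "succeeded"])
final = wait_for_run(lambda: next(fake_statuses), sleep=lambda _: None)
print(final)
```

Passing `sleep` as a parameter keeps the loop easy to test; in real use the default `time.sleep` applies.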

When using the endpoint to get the results of an ensemble run, the results for the best child run are returned. This lets you use the same workflow for ensemble runs as for normal runs.

GET https://api.cloud.nextmv.io/v1/applications/{application_id}/runs/{run_id}

Bash

curl -sS -L -X GET \
  "https://api.cloud.nextmv.io/v1/applications/$APP_ID/runs/$RUN_ID" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $NEXTMV_API_KEY" | jq

To get the ensemble run result, which is the detailed information about how the best run was chosen, you can use the following endpoint.

GET https://api.cloud.nextmv.io/v1/applications/{application_id}/runs/{run_id}/ensemble

Bash

curl -L -X GET \
    "https://api.cloud.nextmv.io/v1/applications/$APP_ID/runs/$RUN_ID/ensemble" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $NEXTMV_API_KEY"

Manage ensemble definitions

To manage the ensemble definitions, you can use the following endpoints:

List existing ensemble definitions.

GET https://api.cloud.nextmv.io/v1/applications/{application_id}/ensembles

Bash

curl -L -X GET \
    "https://api.cloud.nextmv.io/v1/applications/$APP_ID/ensembles" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $NEXTMV_API_KEY"

Get an ensemble definition.

GET https://api.cloud.nextmv.io/v1/applications/{application_id}/ensembles/{ensemble_id}

Bash

curl -L -X GET \
    "https://api.cloud.nextmv.io/v1/applications/$APP_ID/ensembles/$ENSEMBLE_ID" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $NEXTMV_API_KEY"

Update an ensemble definition.

PATCH https://api.cloud.nextmv.io/v1/applications/{application_id}/ensembles/{ensemble_id}

Bash

curl -L -X PATCH \
    "https://api.cloud.nextmv.io/v1/applications/$APP_ID/ensembles/$ENSEMBLE_ID" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $NEXTMV_API_KEY" \
    -d "{\"replace\": \"me\"}"

Delete an ensemble definition.

DELETE https://api.cloud.nextmv.io/v1/applications/{application_id}/ensembles/{ensemble_id}

Bash

curl -L -X DELETE \
    "https://api.cloud.nextmv.io/v1/applications/$APP_ID/ensembles/$ENSEMBLE_ID" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $NEXTMV_API_KEY"

Ensemble definition schema

Here is an example of an ensemble definition:

JSON

{
  "id": "ensemble-definition-1",
  "name": "First ensemble definition",
  "description": "Ensemble definition description",
  "run_groups": [
    {
      "id": "run-group-1",
      "instance_id": "production",
      "options": {
        "duration": "30",
        "threads": "2"
      },
      "repetitions": 3
    },
    {
      "id": "run-group-2",
      "instance_id": "production",
      "options": {
        "duration": "20",
        "threads": "4"
      },
      "repetitions": 2
    }
  ],
  "rules": [
    {
      "id": "rule-1",
      "statistics_path": "$.result.value",
      "tolerance": {
        "type": "absolute",
        "value": 500
      },
      "objective": "maximize",
      "index": 0
    },
    {
      "id": "rule-2",
      "statistics_path": "$.result.custom.unplanned_stops",
      "tolerance": {
        "type": "relative",
        "value": 0.15
      },
      "objective": "minimize",
      "index": 1
    }
  ]
}

The ensemble definition follows this schema:

Field name	Required	Data type	Description
`id`	Yes	`string`	A unique identifier for the ensemble definition.
`name`	Yes	`string`	A human-readable name for the ensemble definition.
`description`	No	`string`	An optional description of the ensemble definition.
`run_groups`	Yes	`array` of `run_group`	An array of run groups that define how to generate runs.
`rules`	Yes	`array` of `rule`	An ordered array of rules that define how to select the best run.

A run_group determines how to create a group of runs from a single input. It follows this schema:

Field name	Required	Data type	Description
`id`	Yes	`string`	A unique identifier for the run group.
`instance_id`	Yes	`string`	The instance that will process the input.
`options`	No	`object`	The options that will be used to configure the run. These options are applied on top of the instance’s options.
`repetitions`	Yes	`integer`	The number of times the input will be executed.

A rule performs an evaluation on a set of runs and chooses the runs that best fulfill the given objective on the specified metric. It follows this schema:

Field name	Required	Data type	Description
`id`	Yes	`string`	A unique identifier for the rule.
`statistics_path`	Yes	`string`	The JSONPath path to the metric contained in the `.statistics` of the run.
`tolerance`	Yes	`tolerance`	The tolerance that will be used to compare the metric values.
`objective`	Yes	`string`	The objective that will be used to select the best run. Must be one of: `maximize`, `minimize`.
`index`	Yes	`integer`	The index of the rule. The index gives different rules a priority order.

The tolerance specifies how to compare a run against the best run in a rule. It follows this schema:

Field name	Required	Data type	Description
`type`	Yes	`string`	The type of tolerance. Must be one of: `absolute`, `relative`.
`value`	Yes	`float`	The value of the tolerance.
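The schema tables above can be turned into a quick local check before sending a definition to the API. The Python sketch below is a hypothetical helper, not part of any Nextmv library: it checks the required fields and their types, the allowed values of `objective` and tolerance `type`, and it assumes that run group IDs and rule IDs must be unique within a single definition.

```python
REQUIRED_FIELDS = {
    "definition": {"id": str, "name": str, "run_groups": list, "rules": list},
    "run_group": {"id": str, "instance_id": str, "repetitions": int},
    "rule": {"id": str, "statistics_path": str, "tolerance": dict,
             "objective": str, "index": int},
    "tolerance": {"type": str, "value": (int, float)},
}
ALLOWED_VALUES = {"objective": ("maximize", "minimize"),
                  "type": ("absolute", "relative")}


def check_fields(obj, kind, where, errors):
    """Append an error for every missing, mistyped, or disallowed required field."""
    for field, expected in REQUIRED_FIELDS[kind].items():
        if field not in obj:
            errors.append(f"{where}: missing required field '{field}'")
        elif isinstance(obj[field], bool) or not isinstance(obj[field], expected):
            errors.append(f"{where}: field '{field}' has the wrong type")
        elif field in ALLOWED_VALUES and obj[field] not in ALLOWED_VALUES[field]:
            errors.append(f"{where}: field '{field}' must be one of {ALLOWED_VALUES[field]}")


def validate_definition(definition):
    """Return a list of problems found in an ensemble definition (empty if none)."""
    errors = []
    check_fields(definition, "definition", "definition", errors)
    for kind, key in (("run_group", "run_groups"), ("rule", "rules")):
        items = definition.get(key)
        if not isinstance(items, list):
            continue
        seen_ids = set()
        for position, item in enumerate(items):
            where = f"{key}[{position}]"
            check_fields(item, kind, where, errors)
            item_id = item.get("id")
            if isinstance(item_id, str):
                # Assumption: IDs are unique within one definition.
                if item_id in seen_ids:
                    errors.append(f"{where}: duplicate id '{item_id}'")
                seen_ids.add(item_id)
            tolerance = item.get("tolerance")
            if kind == "rule" and isinstance(tolerance, dict):
                check_fields(tolerance, "tolerance", f"{where}.tolerance", errors)
    return errors


valid = {
    "id": "ensemble-definition-1",
    "name": "First ensemble definition",
    "run_groups": [{"id": "run-group-1", "instance_id": "production", "repetitions": 3}],
    "rules": [
        {"id": "rule-1", "statistics_path": "$.result.value",
         "tolerance": {"type": "absolute", "value": 500},
         "objective": "maximize", "index": 0},
    ],
}
# Same definition with a second rule that reuses the first rule's ID.
duplicated = dict(valid, rules=valid["rules"] + [dict(valid["rules"][0], index=1)])

print(validate_definition(valid))
print(validate_definition(duplicated))
```

Optional fields such as `description` and `options` are not checked in this sketch.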
