Available on: >= 0.16.0

Run tasks as containers on Azure Batch VMs.

This runner will deploy the container for the task to a specified Azure Batch pool.

To launch the task on Azure Batch, there are only two main concepts you need to be aware of:

  1. Pool — mandatory, and not created by the task. This is a set of nodes on which your task will run; the runner only targets an existing pool via its `poolId` (see the sketch below).
  2. Job — created by the task runner; it holds information about the container image, commands, and resources used to run the task.
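
For illustration, here is a minimal sketch of that runner configuration, assuming the pool already exists; the pool ID and secret names are placeholders:

```yaml
taskRunner:
  type: io.kestra.plugin.azure.runner.Batch
  # Batch account credentials (placeholder secret names)
  account: "{{secrets.account}}"
  accessKey: "{{secrets.accessKey}}"
  endpoint: "{{secrets.endpoint}}"
  # The pool is not created by the runner; it must already exist
  poolId: my-existing-pool
```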

## How does it work

To support `inputFiles`, `namespaceFiles`, and `outputFiles`, the Azure Batch task runner currently relies on resource files and output files that transit through Azure Blob Storage.

As with the Process and Docker task runners, you can reference input, namespace, and output files directly, since everything lives in the same working directory: `cat myFile.txt`.
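
As a sketch (reusing the same Batch and Blob Storage configuration as in the full example below, with placeholder image and file names), a shell task can read an input file and declare an output file using plain relative paths:

```yaml
- id: work_with_files
  type: io.kestra.plugin.scripts.shell.Commands
  containerImage: ubuntu:latest
  taskRunner:
    type: io.kestra.plugin.azure.runner.Batch
    # account, accessKey, endpoint, poolId and blobStorage as in the full example below
  inputFiles:
    myFile.txt: "Hello from Azure Blob Storage"
  outputFiles:
    - copy.txt
  commands:
    # both files live in the task's working directory
    - cat myFile.txt
    - cp myFile.txt copy.txt
```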

## A full flow example

```yaml
id: azure_batch_runner
namespace: company.team

tasks:
  - id: scrape_environment_info
    type: io.kestra.plugin.scripts.python.Commands
    containerImage: ghcr.io/kestra-io/pydata:latest
    taskRunner:
      type: io.kestra.plugin.azure.runner.Batch
      account: "{{secrets.account}}"
      accessKey: "{{secrets.accessKey}}"
      endpoint: "{{secrets.endpoint}}"
      poolId: "{{vars.poolId}}"
      blobStorage:
        containerName: "{{vars.containerName}}"
        connectionString: "{{secrets.connectionString}}"
    commands:
      - python main.py
    namespaceFiles:
      enabled: true
    outputFiles:
      - "environment_info.json"
    inputFiles:
      main.py: |
        import platform
        import socket
        import sys
        import json
        from kestra import Kestra

        print("Hello from Azure Batch and kestra!")

        def print_environment_info():
            print(f"Host's network name: {platform.node()}")
            print(f"Python version: {platform.python_version()}")
            print(f"Platform information (instance type): {platform.platform()}")
            print(f"OS/Arch: {sys.platform}/{platform.machine()}")

            env_info = {
                "host": platform.node(),
                "platform": platform.platform(),
                "OS": sys.platform,
                "python_version": platform.python_version(),
            }
            Kestra.outputs(env_info)

            filename = 'environment_info.json'
            with open(filename, 'w') as json_file:
                json.dump(env_info, json_file, indent=4)

        if __name__ == '__main__':
            print_environment_info()
```
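
Once this flow runs, a downstream task appended to the same tasks list can use both the key-value outputs emitted via `Kestra.outputs()` and the uploaded output file. The sketch below uses standard Kestra output expressions; note that on older Kestra versions the Log task type is `io.kestra.core.tasks.log.Log` rather than `io.kestra.plugin.core.log.Log`:

```yaml
  - id: log_results
    type: io.kestra.plugin.core.log.Log
    message: |
      Host: {{ outputs.scrape_environment_info.vars.host }}
      Report file: {{ outputs.scrape_environment_info.outputFiles['environment_info.json'] }}
```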
