coiled

Caution

Currently, the coiled backend can only be used if your workflow code is organized in a package due to how pytask imports your code and dask serializes task functions (issue).

coiled is a product built on top of dask that eases the deployment of your workflow to many cloud providers like AWS, GCP, and Azure.

Note that coiled is a paid service. It offers a free monthly tier in which you pay only the costs charged by your cloud provider, and you can get started without a credit card.

coiled provides the following benefits, which are especially helpful to people who are not familiar with cloud providers or remote computing.

  • coiled manages your resources by spawning workers when you need them and shutting them down when they are idle.

  • coiled synchronizes your local environment to remote workers.

  • coiled scales adaptively if your workflow takes a long time to finish.

There are two ways to use coiled with pytask and pytask-parallel.

  1. Run individual tasks in the cloud.

  2. Run your whole workflow in the cloud.

Both approaches are explained below after the setup.

Setup

Follow coiled’s short four-step process to set up your local environment and configure your cloud provider.

Running individual tasks

In most projects, there are just a couple of tasks that require a lot of resources and that you would like to run in a virtual machine in the cloud.

With coiled’s serverless functions, you can define the hardware and software environment for your task. Just decorate the task function with @coiled.function.

import coiled


@coiled.function()
def task_example() -> None:
    pass

To execute the workflow, you need to turn on parallelization by requesting two or more workers or specifying one of the parallel backends. Otherwise, the decorated task is run locally.

pytask -n 2
pytask --parallel-backend loky

When you also apply the @task decorator to the task, make sure the @coiled.function decorator is applied first, that is, placed closer to the function. Otherwise, it will be ignored. Add more arguments to the decorator to configure the hardware and software environment.

import coiled
from pytask import task


@task
@coiled.function(
    region="eu-central-1",  # Run the task close to you.
    memory="512 GB",  # Use a lot of memory.
    cpu=128,  # Use a lot of CPU.
    vm_type="p3.2xlarge",  # Run a GPU instance.
)
def task_example() -> None: ...
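Why the decorator order matters can be illustrated in plain Python. The sketch below uses two hypothetical stand-in decorators, register and remote, and assumes that the outer decorator records the function at decoration time. It is only a toy model and not the actual implementation of @task or @coiled.function.

```python
from typing import Callable

# Toy registry playing the role of a task collection (hypothetical).
REGISTRY: list[Callable[[], str]] = []


def register(func: Callable[[], str]) -> Callable[[], str]:
    """Hypothetical stand-in for a collecting decorator such as @task."""
    REGISTRY.append(func)  # Stores the function exactly as it is right now.
    return func


def remote(func: Callable[[], str]) -> Callable[[], str]:
    """Hypothetical stand-in for a wrapping decorator such as @coiled.function."""

    def wrapper() -> str:
        return f"remote({func()})"

    return wrapper


@register
@remote  # Applied first, so `register` stores the wrapped function.
def task_right() -> str:
    return "right"


@remote  # Applied last, after `register` already stored the bare function.
@register
def task_wrong() -> str:
    return "wrong"


results = [func() for func in REGISTRY]
print(results)
```

In this toy model, only the first registered function carries the remote wrapper. The second one was recorded before the wrapper existed, which mirrors the situation where the decorator is ignored.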

By default, @coiled.function scales adaptively to the workload. This means that coiled infers from the number of submitted tasks and previous runtimes how many additional remote workers it should deploy to handle the workload. This provides a convenient mechanism to scale without intervention. Also, workers launched by @coiled.function shut down more quickly than the workers of a cluster.

See also

Serverless functions are more thoroughly explained in coiled’s guide.

Running a cluster

It is also possible to launch a cluster and run each task in a worker provided by coiled. Usually, this is not necessary, and you are better off using coiled’s serverless functions.

If you want to launch a cluster managed by coiled, register a function that builds an executor using coiled.Cluster.

from concurrent.futures import Executor

import coiled
from pytask_parallel import ParallelBackend
from pytask_parallel import registry


def _build_coiled_executor(n_workers: int) -> Executor:
    # Launch a coiled cluster and expose its dask client as an executor.
    return coiled.Cluster(n_workers=n_workers).get_client().get_executor()


registry.register_parallel_backend(ParallelBackend.CUSTOM, _build_coiled_executor)

Then, execute your workflow with:

pytask --parallel-backend custom
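The builder passed to registry.register_parallel_backend is an ordinary function that takes the number of workers and returns a concurrent.futures.Executor. As a minimal sketch of that interface, the hypothetical build_local_executor below has the same signature as _build_coiled_executor but returns a thread pool from the standard library, so the shape of the builder can be tried out without launching cloud resources. The names build_local_executor and square are assumptions made for illustration, and nothing here claims that a thread pool is a suitable backend for a real workflow.

```python
from concurrent.futures import Executor, ThreadPoolExecutor


def build_local_executor(n_workers: int) -> Executor:
    """Hypothetical local builder with the same signature as _build_coiled_executor."""
    return ThreadPoolExecutor(max_workers=n_workers)


def square(x: int) -> int:
    """Small example workload submitted to the executor."""
    return x * x


# Any executor returned by such a builder is driven through the same interface.
with build_local_executor(n_workers=2) as executor:
    futures = [executor.submit(square, i) for i in range(4)]
    results = [future.result() for future in futures]

print(results)
```

Because both builders return an Executor, code that consumes the executor does not need to know whether the work runs locally or on a coiled cluster.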