Custom Executors

Note

Please consider contributing your executor to pytask-parallel if you believe it could be helpful to other people. Start by creating an issue or a draft PR.

pytask-parallel allows you to use any parallel backend as long as it follows the interface defined by concurrent.futures.Executor.
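To make "follows the interface" concrete, here is a minimal sketch of an executor that subclasses concurrent.futures.Executor. The class name InlineExecutor is hypothetical and not part of pytask-parallel. It runs every call immediately in the calling thread, so it only illustrates the two methods a subclass provides (submit and shutdown), not real parallelism. Whether pytask-parallel relies on behavior beyond this interface is not covered here.

```python
from concurrent.futures import Executor, Future
from typing import Any, Callable, Optional


class InlineExecutor(Executor):
    """Hypothetical executor that runs each call right away in the calling thread."""

    def __init__(self, max_workers: Optional[int] = None) -> None:
        # Accepted so the constructor resembles the standard pools;
        # this sketch never runs anything in parallel.
        self.max_workers = max_workers
        self._is_shut_down = False

    def submit(self, fn: Callable[..., Any], /, *args: Any, **kwargs: Any) -> Future:
        if self._is_shut_down:
            raise RuntimeError("cannot schedule new futures after shutdown")
        future: Future = Future()
        try:
            # Run the callable synchronously and store its outcome on the future.
            future.set_result(fn(*args, **kwargs))
        except Exception as exc:
            future.set_exception(exc)
        return future

    def shutdown(self, wait: bool = True, *, cancel_futures: bool = False) -> None:
        # Nothing is pending in this sketch, so only refuse further submissions.
        self._is_shut_down = True


# The context manager and map() are inherited from Executor.
with InlineExecutor(max_workers=1) as executor:
    future = executor.submit(sum, [1, 2, 3])
    print(future.result())  # 6
```

A real backend would schedule the call on a worker and return the future before the work has finished. The shape of submit and shutdown stays the same.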

In some cases, adding a new backend can be as easy as registering a builder function that receives n_workers and returns the instantiated executor.

Important

Place the following code in any module that is imported when pytask runs, for example src/project/config.py, src/project/__init__.py, or the task module itself.

from concurrent.futures import Executor

from my_project.executor import CustomExecutor  # ty: ignore[unresolved-import]

from pytask_parallel import ParallelBackend
from pytask_parallel import WorkerType
from pytask_parallel import registry


def build_custom_executor(n_workers: int) -> Executor:
    return CustomExecutor(max_workers=n_workers)


registry.register_parallel_backend(
    ParallelBackend.CUSTOM,
    build_custom_executor,
    # Optional defaults.
    worker_type=WorkerType.PROCESSES,
    remote=False,
)

Given the optional WorkerType, pytask automatically wraps the task function to collect tracebacks, capture stdout/stderr, and the like. Possible values are WorkerType.PROCESSES (default) and WorkerType.THREADS.
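As a sketch of the non-default value, the registration below passes WorkerType.THREADS together with a builder that returns the standard library's ThreadPoolExecutor. The function names build_thread_executor and register_thread_backend are hypothetical. The example only illustrates the worker_type argument and reuses the register_parallel_backend call shown above. It assumes that omitting remote leaves it at the default.

```python
from concurrent.futures import Executor, ThreadPoolExecutor


def build_thread_executor(n_workers: int) -> Executor:
    # Standard-library thread pool; workers share memory with the main process.
    return ThreadPoolExecutor(max_workers=n_workers)


def register_thread_backend() -> None:
    # Imported lazily so the builder can be tried without pytask-parallel installed.
    from pytask_parallel import ParallelBackend, WorkerType, registry

    registry.register_parallel_backend(
        ParallelBackend.CUSTOM,
        build_thread_executor,
        # Tell pytask that tasks run in threads rather than processes.
        worker_type=WorkerType.THREADS,
    )
```

Call register_thread_backend() at import time in one of the modules that pytask loads so that the backend is registered before the build starts.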

The remote keyword tells pytask that tasks run on remote workers without access to the local filesystem. pytask then automatically syncs local files to the workers. By default, pytask assumes workers have access to the local filesystem.

Now, build the project with your custom backend.

pytask --parallel-backend custom