Welcome to Brei
Brei is a small workflow system in Python. The primary reason for creating Brei is to replace GNU Make, so that workflows also run on systems that are naturally deprived of this wonder of human ingenuity. In a nutshell:
[[task]]
description = "Greet the Globe"
stdout = "hello.txt"
script = "echo 'Hello, World!'"
- No new syntax: programmable workflows in TOML or JSON files.
- Efficient: Runs tasks lazily and in parallel.
- Feature complete: Supports templates, variables, includes and configurable runners.
- Few dependencies: Only needs Python ≥3.11.
- Small codebase: Brei is around 1000 lines of Python.
pip install brei
Why
Why yet another workflow tool? This tool was developed as part of the Entangled project, but can be used on its own. Brei is meant to perform small-scale automations for literate programming in Entangled, like generating figures and performing computations locally. It requires no setup to work with, and workflows are easy for novice users to understand. If you have any more serious needs than that, we’d recommend using a more tried and proven system, of which there are too many to count.
When to use
You’re running a project and there are lots of odds and ends that need automation. You’d use a Makefile, but your friend is on Windows and doesn’t have GNU Make installed. Or you try to ship a product that needs this, but don’t want to confront people trying it for the first time with a tonne of stuff they’ve never heard of.
Running
Brei is available on PyPI:
pip install brei
For Python we recommend using virtual environments, for example with Poetry. Once you’ve set up a project in Poetry, add Brei to it:
poetry add brei
Then brei should be available as a command-line executable.
brei --help
Usage: brei [-h] [-i INPUT_FILE] [-f] [-j JOBS] [-v] [-l] [-d] [targets ...]
Build one of the configured targets.
Positional Arguments:
targets names of tasks to run
Options:
-h, --help show this help message and exit
-i, --input-file INPUT_FILE
Brei TOML or JSON file, use a `[...]` suffix to
indicate a subsection.
-f, --force-run, -B rebuild all dependencies
-j, --jobs JOBS limit number of concurrent jobs
-v, --version print version number and exit
-l, --list-runners show default configured runners
-d, --debug more verbose logging
How it works
You give Brei a list of tasks that may depend on one another. Brei will run these when input files are newer than the target. Execution is lazy and in parallel.
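The "newer than the target" rule can be illustrated with a short Python sketch. This is not Brei's actual implementation: the function name needs_run and the use of file modification times are assumptions made for illustration, and the sketch assumes every dependency already exists on disk.

```python
import os
import tempfile
from pathlib import Path


def needs_run(targets: list[Path], dependencies: list[Path]) -> bool:
    """Hypothetical staleness check: True if the task should be executed."""
    if not targets:
        return True  # tasks with no targets are always run
    if any(not t.exists() for t in targets):
        return True  # a missing target must be created
    oldest_target = min(t.stat().st_mtime for t in targets)
    # Stale if any dependency was modified after the oldest target.
    return any(d.stat().st_mtime > oldest_target for d in dependencies)


with tempfile.TemporaryDirectory() as tmp:
    src, out = Path(tmp) / "input.txt", Path(tmp) / "output.txt"
    src.write_text("data")
    missing_target = needs_run([out], [src])  # target does not exist yet

    out.write_text("result")
    os.utime(src, (1_000, 1_000))  # input older than output
    os.utime(out, (2_000, 2_000))
    up_to_date = needs_run([out], [src])

    os.utime(src, (3_000, 3_000))  # input touched after output was built
    stale = needs_run([out], [src])
```

Only the first and last checks report that work is needed; the middle one is skipped, which is the laziness described above.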
Tasks
Tasks are the elemental units of work in Brei. A task is the single execution of a given script, and can be made to depend on other tasks by explicitly listing targets and dependencies.
[[task]]
creates = ["hello.txt"]
runner = "bash"
script = "echo 'Hello, World!' > hello.txt"
[[task]]
name = "all"
requires = ["hello.txt"]
[[task]]
name = "clean"
script = "rm hello.txt"
This defines two named tasks, all and clean, where all depends on the creation of a file hello.txt. Giving a name to a task is similar to creating a ‘phony’ target in Make.
Templates
We can use patterns to create reusable items. Variables follow Python’s string.Template syntax (similar to many scripting languages): ${var_name} substitutes for the contents of the var_name variable. Use two dollar signs $$ to make a $ literal.
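To see the substitution rules in isolation, here is a small example using Python's standard string.Template directly. It only demonstrates the syntax; how Brei applies it internally is not shown, and the variable names text and stdout are just sample values.

```python
from string import Template

# ${name} is replaced by a variable; $$ produces a literal dollar sign.
template = Template("echo ${text} > ${stdout}  # costs $$5")
command = template.substitute(text="hello", stdout="out.txt")
print(command)
```

A missing variable makes substitute raise a KeyError, so typos in variable names surface early.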
[[task]]
description = "Creating temporary directory"
stdout = "var(dir)"
script = "mktemp -d"
[template.echo]
stdout = "${stdout}"
script = "echo ${text}"
[template.rot13]
stdout = "${stdout}"
stdin = "${stdin}"
script = "tr a-zA-Z n-za-mN-ZA-M"
[[call]]
template = "echo"
[call.args]
stdout = "${dir}/secret.txt"
text = "Uryyb, Jbeyq!"
[[call]]
template = "rot13"
[call.args]
stdin = "${dir}/secret.txt"
stdout = "${dir}/msg.txt"
[[task]]
name = "all"
requires = ["${dir}/msg.txt"]
script = "cat ${dir}/msg.txt"
brei -i examples/rot13.toml all
[13:06:56] INFO Creating temporary directory
INFO creating `/tmp/tmp.8jWt35EFHl/secret.txt`
INFO creating `/tmp/tmp.8jWt35EFHl/msg.txt`
INFO #all Hello, World!
Multiplexing
You can call templates with lists of arguments to create many tasks.
There are two ways to combine multiple arguments: inner and outer, configured with the join argument. The inner product uses zip to join the arguments, while outer uses itertools.product to join. The default is inner.
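The difference between the two joins can be sketched in plain Python. The expand helper below is hypothetical and only handles list-valued arguments; it shows what zip and itertools.product do to the argument lists, not how Brei builds tasks from them.

```python
from itertools import product


def expand(args: dict[str, list[str]], join: str = "inner") -> list[dict[str, str]]:
    """Hypothetical helper: turn lists of arguments into one dict per task."""
    combine = zip if join == "inner" else product
    return [dict(zip(args.keys(), values)) for values in combine(*args.values())]


args = {"a": ["x", "y"], "b": ["1", "2"]}
inner = expand(args, "inner")  # pairs items position by position
outer = expand(args, "outer")  # every combination of a and b
```

With two items per argument, inner yields two tasks (x with 1, y with 2), while outer yields four.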
[[task]]
description = "Creating temporary directory"
stdout = "var(dir)"
script = "mktemp -d"
[template.touch]
description = "${pre} ${a} ${b}"
creates = ["${dir}/${pre}-${a}-${b}"]
script = "touch '${dir}/${pre}-${a}-${b}'"
[[call]]
template = "touch"
collect = "inner"
[call.args]
pre = "inner"
a = ["x", "y", "z"]
b = ["1", "2", "3"]
[[call]]
template = "touch"
collect = "outer"
join = "outer"
[call.args]
pre = "outer"
a = ["x", "y"]
b = ["1", "2"]
[[task]]
name = "all"
requires = ["#inner", "#outer"]
The collect argument creates a collection phony task containing all items in the call.
brei -i examples/template_multiplexing.toml all
[13:06:57] INFO Creating temporary directory
INFO inner x 1
INFO inner y 2
INFO inner z 3
INFO outer x 1
INFO outer x 2
INFO outer y 1
INFO outer y 2
Variables
You may write the output of a command to the contents of a variable by using "var(name)" as a target. For instance, in many science applications it’s desirable to know which version of a piece of software generated some output.
[environment]
data_dir = "./data"
output_dir = "./output/${commit}"
[[task]]
stdout = "var(commit)"
script = "git rev-parse HEAD"
[[task]]
creates = ["${output_dir}/data.h5"]
requires = ["${data_dir}/input.h5", "#prepare"]
runner = "python"
path = "scripts/run.py"
[[task]]
name = "prepare"
script = """
mkdir -p ${output_dir}
ln -sf ${output_dir} output/latest
"""
[[task]]
name = "all"
requires = ["${output_dir}/data.h5"]
Also note the following:
- Instead of listing a script you can give a path to an existing script.
- Named (phony) targets are referenced with a hash symbol #.
- Tasks with no targets are always run.
- The [environment] item lists global variables.
- All string substitution is done lazily.
Includes
You can include parts of a workflow from other files, both TOML and JSON.
[template.echo]
stdout = "${stdout}"
script = "echo '${text}'"
include = [
"./echo.toml"
]
[[call]]
template = "echo"
[call.args]
stdout = "hello.txt"
text = "Hello, World!"
[[task]]
name = "all"
requires = ["hello.txt"]
It is even possible to include files that still need to be generated. The following generates a file with ten tasks, includes that file and runs those tasks.
include = [
"${dir}/gen.json"
]
[[task]]
description = "Creating temporary directory"
stdout = "var(dir)"
script = "mktemp -d"
[[task]]
description = "Generating workflow"
stdout = "${dir}/gen.json"
runner = "python"
script = """
import json
tasks = [
{"stdout": f"${dir}/out{i}.dat",
"script": f"echo '{i}'"} for i in range(10)
]
tasks.append({"name": "write-outs", "requires": [
f"${dir}/out{i}.dat" for i in range(10)
]})
print(json.dumps({"task": tasks}))
"""
[[task]]
name = "all"
requires = ["#write-outs"]
brei -i examples/include-gen.toml all
[13:06:57] INFO Creating temporary directory
INFO Generating workflow
INFO creating `/tmp/tmp.mnHEQwi0vC/out0.dat`
INFO creating `/tmp/tmp.mnHEQwi0vC/out1.dat`
INFO creating `/tmp/tmp.mnHEQwi0vC/out2.dat`
INFO creating `/tmp/tmp.mnHEQwi0vC/out3.dat`
INFO creating `/tmp/tmp.mnHEQwi0vC/out4.dat`
INFO creating `/tmp/tmp.mnHEQwi0vC/out5.dat`
INFO creating `/tmp/tmp.mnHEQwi0vC/out6.dat`
INFO creating `/tmp/tmp.mnHEQwi0vC/out7.dat`
INFO creating `/tmp/tmp.mnHEQwi0vC/out8.dat`
INFO creating `/tmp/tmp.mnHEQwi0vC/out9.dat`
Custom Runner
By default, the contents of script are split into lines, then each line is passed through Python’s shlex.split function and run using asyncio.create_subprocess_exec. This means that the script performs the same operations on all platforms, and arguments are collected similarly to a normal Unix shell or Windows command prompt. However, you can choose to have the script run by any other means by providing the runner argument.
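The default behaviour described above can be approximated with a few lines of Python. This is a minimal sketch under stated assumptions, not Brei's own runner: the function run_script is hypothetical, it runs lines one after another, captures standard output, and ignores exit codes.

```python
import asyncio
import shlex
import sys


async def run_script(script: str) -> list[str]:
    """Hypothetical runner: execute each line of a script without a shell."""
    outputs = []
    for line in script.splitlines():
        args = shlex.split(line)  # shell-like tokenising, but no shell involved
        if not args:
            continue  # skip blank lines
        proc = await asyncio.create_subprocess_exec(
            *args, stdout=asyncio.subprocess.PIPE
        )
        stdout, _ = await proc.communicate()
        outputs.append(stdout.decode().strip())
    return outputs


# Use the current Python interpreter as the command so the demo does not
# depend on which other executables are installed.
py = shlex.quote(sys.executable)
script = f"""{py} -c "print('Hello, World!')"
{py} -c "print(6 * 7)"
"""
results = asyncio.run(run_script(script))
print(results)
```

Because no shell is involved, shell features such as pipes, redirection with > or variable expansion are not available in this mode; that is the situation in which an explicit runner such as bash is useful.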
[[task]]
description = "Creating temporary directory"
stdout = "var(dir)"
script = "mktemp -d"
[runner.lua]
command = "lua"
args = ["${script}"]
[[task]]
runner = "lua"
stdout = "${dir}/hello.txt"
script = """
function fact (n)
if n == 0 then
return 1
else
return n * fact(n-1)
end
end
print("10! = ", fact(10))
"""
[[task]]
name = "all"
requires = ["${dir}/hello.txt"]
script = "cat ${dir}/hello.txt"
brei -i examples/custom-runner.toml all
[13:06:57] INFO Creating temporary directory
INFO creating `/tmp/tmp.aR1Hs5SO2H/hello.txt`
INFO #all 10! = 3628800
There are a number of runners configured by default.
brei --list-runners
Default Runners
runner ┃ executable ┃ arguments
━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━
python │ python │ ['${script}']
bash │ bash │ ['${script}']
Forcing a rerun
Sometimes you want to always rerun a task no matter what. Setting force = true on a task does this.
[[task]]
name = "test"
requires = ["#coverage-report"]
[[task]]
description = "Print coverage info"
name = "coverage-report"
requires = [".coverage"]
script = "coverage report"
[[task]]
creates = [".coverage"]
force = true
description = "Run tests"
script = "coverage run --source=brei -m pytest"
brei -i examples/force_run.toml test
[13:06:57] INFO Run tests
============================= test session starts ==============================
platform linux -- Python 3.11.8, pytest-7.4.4, pluggy-1.5.0
rootdir: /mnt/data/Code/entangled/brei
plugins: asyncio-0.21.2, hypothesis-6.115.3, timeout-2.3.1
asyncio: mode=Mode.STRICT
collected 21 items
test/test_construct.py . [ 4%]
test/test_history.py .. [ 14%]
test/test_lazy.py .. [ 23%]
test/test_loom.py .... [ 42%]
test/test_phony.py . [ 47%]
test/test_program.py ........ [ 85%]
test/test_result.py . [ 90%]
test/test_template_strings.py .. [100%]
============================== 21 passed in 0.84s ==============================
[13:06:58] INFO Print coverage info
Name Stmts Miss Cover
----------------------------------------------
brei/__init__.py 7 0 100%
brei/async_timer.py 12 0 100%
brei/cli.py 87 55 37%
brei/construct.py 86 16 81%
brei/errors.py 28 5 82%
brei/lazy.py 102 11 89%
brei/logging.py 13 6 54%
brei/program.py 116 16 86%
brei/result.py 28 2 93%
brei/runner.py 6 0 100%
brei/task.py 260 15 94%
brei/template_strings.py 38 0 100%
brei/utility.py 21 0 100%
brei/version.py 2 0 100%
----------------------------------------------
TOTAL 806 126 84%
Remarks
- If you need your workflows to also execute on a Windows machine, it is advised to write scripts for the default runner (lists of commands) or in Python.
- Brei is not meant for building programs, so it doesn’t have the same feature set as GNU Make. If you need more complex logic, you can write a Brei generator. The generator creates tasks and writes them to JSON; Brei can then include the result from your generator task.
- TOML is nice but not ideal: it can be tricky to see the difference between single [...] and double [[...]] square brackets. Sometimes TOML syntax leads to very verbose notation; however, current alternatives are all worse.
- Many modern programming languages that we like (Python, Rust, Julia) have their project settings in a TOML file. This way your Brei workflow can piggy-back on project files that are already there.
License
Copyright 2023 Netherlands eScience Center, Licensed under the Apache License, Version 2.0, see LICENSE.