Containers
Quickstart: Run a worker
Step 1: run a worker on your computer(s)
To run containers, you need to run a worker:
mode | description |
---|---|
run in local mode | Start here. All data stays on this computer and is never sent to cloud storage. Also quicker and simpler. Desktop only; mobile/tablet not yet supported. |
run in remote mode | Use this if you want access to scalable compute resources, cloud storage, or viewing workflows on phones and tablets. |
On our roadmap: on-demand cloud compute resources with zero configuration. Until then, you must run your own workers.
Step 2: Copy queue to metapage.io settings
Go to settings
Set the “Remote Queue” to the queue created when starting the worker in Step 1.
If you ran the worker in local mode, you can skip this step for now.
Step 3: Add a container
Either add a container via search, or add one directly.
Via search:
Via [+ Add] a “Docker Container”:
Step 4: Configure and run the container
Configure your docker job, then run it.
Run docker containers in the browser
Metapage workflows run in the browser. However, many scientific workflows need to run languages like Python or R, which do not (yet) run natively in the browser. The container metaframe (container.mtfm.io) solves this problem by providing public compute queues (a “grid”) that metapage workflows can “plug” into like an electrical grid.
You can provide your own computer(s) to your personal grid (queue), or plug a cluster into the grid so your entire team can share the same resources (coming soon). When someone else runs your workflows, the workflows automatically run on their own grid, or their own computer.
We abstract away where compute is done, so you don’t have to think about it.
Docker environment
System Environment Variables:
environment variable | description |
---|---|
JOB_INPUTS | Default: /inputs. The directory where job input files are copied. |
JOB_OUTPUTS | Default: /outputs. The directory where job output files will be copied when the job finishes successfully. |
JOB_CACHE | Default: /job-cache. Shared directory for caching, e.g. large models. |
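For example, job code can resolve these directories from the environment, falling back to the documented defaults when the variables are unset (a minimal Python sketch):

```python
import os

# Fall back to the documented defaults when the variables are unset.
inputs_dir = os.environ.get("JOB_INPUTS", "/inputs")
outputs_dir = os.environ.get("JOB_OUTPUTS", "/outputs")
cache_dir = os.environ.get("JOB_CACHE", "/job-cache")

print(f"inputs:  {inputs_dir}")
print(f"outputs: {outputs_dir}")
print(f"cache:   {cache_dir}")
```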
Inputs, outputs, and caching
- Inputs are copied into the directory /inputs. The env var JOB_INPUTS is set to this directory.
- Copy output files to /outputs. The env var JOB_OUTPUTS is set to this directory.
Define Inputs and Outputs
In Settings / Definition
you can define inputs and outputs. This doesn't change how the code runs, but it helps to quickly connect other metaframes in metapages.
In this example, we defined an input input.json and an output data.csv:
You will see these inputs and outputs automatically in the metapage editor.
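As an illustration, a container script for this example could read input.json from the inputs directory and write data.csv to the outputs directory. This is a hypothetical sketch (the file names are just the ones from the example above, and the demo uses throwaway directories so it can run outside a container); inside a real job the directories come from JOB_INPUTS and JOB_OUTPUTS:

```python
import csv
import json
import os
import tempfile


def process(inputs_dir: str, outputs_dir: str) -> None:
    """Read input.json from the inputs dir, write data.csv to the outputs dir."""
    with open(os.path.join(inputs_dir, "input.json")) as f:
        records = json.load(f)  # e.g. a list of {"x": ..., "y": ...} rows
    with open(os.path.join(outputs_dir, "data.csv"), "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=sorted(records[0]))
        writer.writeheader()
        writer.writerows(records)


# Demo outside a container: throwaway dirs stand in for /inputs and /outputs.
inputs_dir = tempfile.mkdtemp()
outputs_dir = tempfile.mkdtemp()
with open(os.path.join(inputs_dir, "input.json"), "w") as f:
    json.dump([{"x": 1, "y": 2}, {"x": 3, "y": 4}], f)

process(inputs_dir, outputs_dir)
```

Inside the container you would call process with the JOB_INPUTS and JOB_OUTPUTS directories instead of the temporary ones.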
Directory for caching data and large ML models
The directory defined in the env var JOB_CACHE (defaults to /job-cache) is shared between all jobs running on a host. Use this location to store large data sets and models.
The cache is not shared between worker instances, only between jobs running on a single instance or computer. It comes with no guarantees.
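A sketch of a common caching pattern: fetch a large file only if it is not already present in the cache directory. The cached helper and fake_fetch function here are illustrative, not part of the metaframe API, and the demo uses a temporary directory standing in for /job-cache so it runs anywhere:

```python
import os
import tempfile


def cached(cache_dir: str, name: str, fetch) -> str:
    """Return the cached path for `name`, calling fetch(path) only on a miss."""
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, name)
    if not os.path.exists(path):
        fetch(path)  # e.g. download a large model into the cache
    return path


# Demo: a throwaway dir stands in for os.environ["JOB_CACHE"] (/job-cache).
cache_dir = tempfile.mkdtemp()
calls = []


def fake_fetch(path):
    calls.append(path)
    with open(path, "w") as f:
        f.write("model-weights")


first = cached(cache_dir, "model.bin", fake_fetch)
second = cached(cache_dir, "model.bin", fake_fetch)  # cache hit: no second fetch
```

Because the cache comes with no guarantees, treat it strictly as an optimization: jobs must still work correctly (if more slowly) when the cache is empty.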
Description
container.mtfm.io runs docker containers on workers. It is currently in beta.
- Run any publicly available docker image: Python, R, C++, Java, ... anything.
- Bring your own workers
  - Currently individual machines are supported, but kubernetes and nomad support is coming soon
- Your queue is simply an unguessable hash. Do not share it without consideration.
Use cases:
- machine learning pipelines
- data analysis workflows
Any time the inputs change (and on start), the configured docker container is run:
- /inputs : the location where inputs are copied as files
- /outputs : any files here when the container exits are passed on as metaframe outputs
Versioned. Reproducible. No client install requirements: as long as you have at least one worker running somewhere, you can run any programming language.
Getting started
- Create a queue
  - Click the connect button in the bottom-right
  - A "queue" is simply a string or key
  - It is the part of the URL that looks like #?queue=my-queue-e7eebea2-c607-11ee-84de-b7a272dd08fc
  - Best if the queue value is a long, impossible-to-guess string, e.g. a GUID
  - Workers point to this queue, and run the configured docker jobs
- Configure the docker job
- Run a worker (or a bunch) pointing to a queue, e.g. public1 (warning: this is a public shared compute queue):

docker run --pull always --restart unless-stopped -tid -v /var/run/docker.sock:/var/run/docker.sock -v /tmp:/tmp metapage/metaframe-docker-worker:0.54.0 run --cpus=4 --gpus=0 public1
If you have GPUs, you can add --gpus=1 (or more) to the worker command.
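Since the docs recommend a long, impossible-to-guess queue value, one way (an illustrative sketch, not a required format) to generate a string in the same spirit as the my-queue-… example above:

```python
import uuid

# An unguessable queue key: a readable prefix plus a random UUID.
queue = f"my-queue-{uuid.uuid4()}"
print(queue)
```

Pass the generated value both in the metapage URL (#?queue=...) and to the worker command so they share the same queue.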
Example URL
Click here to run a python command in a container