
Containers


Quickstart: Run a worker

Step 1: Run a worker on your computer(s)

To run containers, you need to run a worker:

  • run in local mode: Start here. All data stays local to this computer and is never sent to cloud storage. Quicker and simpler. Desktop only; mobile/tablet is not yet supported.
  • run in remote mode: Use this if you want access to scalable compute resources, cloud storage, or to view workflows on phones and tablets.
tip

Cloud compute resources on demand, with zero configuration, are on our roadmap. Until then, you must run your own workers.

Step 2: Copy queue to metapage.io settings

Go to settings



Set the “Remote Queue” to the queue you created when starting the worker in Step 1

If you ran the worker in local mode, you can ignore this step for now.

Step 3: Add a container

Either add a container via search or add one directly.

Via search:

Via [+ Add] a “Docker Container”:

Step 4: Configure and run the container

Configure your docker job, then run it

Run docker containers in the browser

Metapage workflows run in the browser. However, many scientific workflows need to run languages like Python or R, which do not (yet) run natively in the browser. The container metaframe (container.mtfm.io) solves this problem by providing public compute queues (a “grid”) that metapage workflows can “plug” into, like an electrical grid.

You can provide your own computer(s) to your personal grid (queue), or plug a cluster into the grid so your entire team can share the same resources (coming soon). When someone else runs your workflows, the workflows automatically run on that person's own grid or computer.

We abstract away where compute is done, so you don’t have to think about it.

Docker environment

System Environment Variables:

  • JOB_INPUTS (default: /inputs): the directory where job input files are copied.
  • JOB_OUTPUTS (default: /outputs): the directory from which job output files are copied when the job finishes successfully.
  • JOB_CACHE (default: /job-cache): a shared directory for caching, e.g. large models.

Inputs, outputs, and caching

  • Inputs are copied into the directory /inputs. The env var JOB_INPUTS is set to this directory.
  • Copy output files to /outputs. The env var JOB_OUTPUTS is set to this directory.
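As a sketch, a container entrypoint script that honors these conventions might look like the following. Only the JOB_INPUTS/JOB_OUTPUTS handling reflects the documented contract; the JSON-to-CSV processing is a hypothetical job body.

```python
import json
import os
from pathlib import Path

# Inside a container job these env vars point at the mounted directories;
# the fallbacks match the documented defaults.
inputs_dir = Path(os.environ.get("JOB_INPUTS", "/inputs"))
outputs_dir = Path(os.environ.get("JOB_OUTPUTS", "/outputs"))

def process(inputs_dir: Path, outputs_dir: Path) -> None:
    # Hypothetical job: summarize every .json input as one CSV row.
    outputs_dir.mkdir(parents=True, exist_ok=True)
    rows = []
    for path in sorted(inputs_dir.glob("*.json")):
        data = json.loads(path.read_text())
        rows.append(f"{path.name},{len(data)}")
    # Anything written under JOB_OUTPUTS becomes a metaframe output.
    (outputs_dir / "summary.csv").write_text("\n".join(rows) + "\n")

if __name__ == "__main__" and inputs_dir.is_dir():
    process(inputs_dir, outputs_dir)
```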

Define Inputs and Outputs

In Settings / Definition you can define inputs and outputs. This doesn't change how the code runs, but it helps you quickly connect other metaframes in metapages.

In this example, we define an input, input.json, and an output, data.csv:

You will see these inputs and outputs automatically in the metapage editor.

Directory for caching data and large ML models

The directory defined in the env var JOB_CACHE (defaults to /job-cache) is shared between all jobs running on a host. Use this location to store large data sets and models.

The cache is not shared between worker instances, only between jobs running on a single instance or computer. It comes with no guarantees.
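A minimal sketch of using JOB_CACHE from inside a job, assuming a hypothetical `fetch` callable that downloads the file's bytes (the helper name `cached_fetch` is illustrative, not part of the platform):

```python
import os
from pathlib import Path

# JOB_CACHE is shared between all jobs on the same worker host, so an
# expensive download (e.g. large ML model weights) only happens once per host.
cache_dir = Path(os.environ.get("JOB_CACHE", "/job-cache"))

def cached_fetch(name, fetch):
    """Return the path to the cached file, calling fetch() only on a miss.

    `fetch` is a hypothetical callable returning the file's bytes
    (e.g. an HTTP download); it stands in for however your job gets the data.
    """
    target = cache_dir / name
    if not target.exists():  # cache miss: pay the cost once per host
        cache_dir.mkdir(parents=True, exist_ok=True)
        tmp = target.with_name(target.name + ".tmp")
        tmp.write_bytes(fetch())
        # Rename atomically so concurrent jobs never observe a partial file.
        tmp.replace(target)
    return target
```

Because the cache comes with no guarantees, treat it strictly as an optimization: the job must still work correctly when the cache starts empty.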

Description

container.mtfm.io runs docker containers on workers. It is currently in beta.

  • Run any publicly available docker image: Python, R, C++, Java, ... anything.
  • Bring your own workers
    • Currently individual machines are supported; Kubernetes and Nomad support is coming soon
  • Your queue is simply an unguessable hash. Do not share it without consideration.

Use cases:

  • machine learning pipelines
  • data analysis workflows

Any time the inputs change (and on start), the configured docker container is run:

  • /inputs: the location where inputs are copied as files
  • /outputs: any files here when the container exits are passed on as metaframe outputs

Versioned. Reproducible. No client install required: as long as you have at least one worker running somewhere, you can run any programming language.

Getting started

  1. Create a queue

    • Click the connect button in the bottom-right
    • A "queue" is simply a string or key
    • The part of the URL that looks like #?queue=my-queue-e7eebea2-c607-11ee-84de-b7a272dd08fc
    • Best if the queue value is a long, hard-to-guess string, e.g. a GUID
    • Workers point to this queue, and run the configured docker jobs
  2. Configure the docker job

  3. Run a worker (or several) pointing to a queue, e.g. public1 (warning: public1 is a public, shared compute queue)

    docker run --pull always --restart unless-stopped -tid -v /var/run/docker.sock:/var/run/docker.sock -v /tmp:/tmp metapage/metaframe-docker-worker:0.54.0 run --cpus=4 --gpus=0 public1

If you have GPUs, you can add --gpus=1 (or more) to the worker command.
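For example, a GPU-enabled variant of the worker command above, pointed at a private queue (the queue value here is the placeholder GUID from the "Create a queue" step; this assumes GPU support, e.g. the NVIDIA container toolkit, is set up on the host):

```shell
# Same worker invocation as above, but exposing one GPU to jobs
# and using a private, hard-to-guess queue instead of public1.
docker run --pull always --restart unless-stopped -tid \
  -v /var/run/docker.sock:/var/run/docker.sock -v /tmp:/tmp \
  metapage/metaframe-docker-worker:0.54.0 \
  run --cpus=4 --gpus=1 my-queue-e7eebea2-c607-11ee-84de-b7a272dd08fc
```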

Example URL

Click here to run a Python command in a container
