A public compute grid for the age of AI and scientific social media
An unsolved problem in scientific computing is that, despite the enormous amount of compute theoretically available, these resources are often incompatible or accessed in such different ways that it is extraordinarily onerous and time-consuming to share compute-heavy workflows between collaborators with different types of compute resources, even when those resources are functionally equivalent.
In other words, I can't just give you a scientific workflow and expect you to run it out of the box.
There is some progress in this space, for example Jupyter notebooks and Nextflow, but these tech stacks are not built for sharing compute resources between individuals who are not in the same institution.
Jupyter notebooks do not define their compute environment in a reproducible, automated way; automating them requires a lot of extra machinery that is a whole other stack and process in itself.
Technology stacks like Nextflow, meanwhile, are only really portable within the same organization; it is quite difficult to share Nextflow workflows. Nextflow workflows also do not automatically come with visualization: they might generate static images, but these aren't interactive, and there is no clear, publicly accessible way to get at state or inputs. So again, I can't just publish a Nextflow workflow and expect anyone else to run it.
This really limits our ability to collaborate, publish, and share. If reproducing a workflow you published requires me to install and configure a whole new stack, which is complex and far from trivial, that workflow is essentially closed off to me.
Metapages solves this problem with a public grid. We provide open-source, publicly available, publicly addressable, instantly defined queues, similar to GitHub Actions, GitLab Workflows, Google Cloud Build, AWS CodePipeline, or Azure Pipelines if you have used those, but with long-term, deep-time persistence and the execution of science as the focus.
The goal is to represent scientific computation in a way that runs in the browser and requires little to no setup (depending on your use case). The only decision a user might need to make is whether to add resources ($$) for large-scale, expensive computation: a simple dial.
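As a minimal sketch of what "publicly addressable queue" could mean in practice, a queue can be thought of as an HTTP endpoint keyed by a queue name, to which a job definition is posted. The endpoint, queue name, and payload shape below are illustrative assumptions, not the actual Metapages API.

```typescript
// Hypothetical sketch: submitting a job to a publicly addressable queue.
// The endpoint, queue name, and payload shape are assumptions for illustration.

interface JobDefinition {
  image: string;      // container image that defines the environment
  command: string[];  // command to run inside the container
  inputs?: Record<string, string>; // named inputs, e.g. URLs to data
}

async function submitJob(queue: string, job: JobDefinition): Promise<string> {
  const res = await fetch(`https://example-grid.org/q/${queue}/jobs`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(job),
  });
  if (!res.ok) throw new Error(`Failed to submit job: ${res.status}`);
  const { jobId } = await res.json();
  return jobId; // addressable id for polling status and fetching outputs
}

// Anyone who knows the queue name can submit to (or work) the same queue:
// submitJob("my-lab-public-queue", {
//   image: "python:3.12",
//   command: ["python", "-c", "print(42)"],
// });
```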
Additional design decisions:
A unit of code and its context is defined as a URL; everything is represented via a URL
Jobs are addressable by the SHA hash of their definition (see the sketch after this list)
Jobs and their outputs expire after a month
But you can bring your own queue, or implement and serve your own
This is an efficient way to serve a public good: data expires after a defined time, the service stays lean, and users can supply their own resources
This allows interesting possibilities, such as safely and transparently sharing expensive compute resources among small groups of researchers and engineers
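To make the content-addressing idea concrete, here is a minimal sketch of deriving a job id as the SHA-256 hash of a canonicalized job definition. The canonicalization scheme and URL layout are assumptions for illustration, not the exact grid implementation.

```typescript
// Hypothetical sketch: content-addressing a job by the SHA-256 of its definition.
// The canonicalization and URL layout are illustrative assumptions.
import { createHash } from "node:crypto";

interface JobDefinition {
  image: string;
  command: string[];
  inputs?: Record<string, string>;
}

function canonicalize(value: unknown): string {
  // Recursively sort object keys so equivalent definitions serialize identically.
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const entries = Object.entries(value as Record<string, unknown>)
      .sort(([a], [b]) => a.localeCompare(b))
      .map(([k, v]) => `${JSON.stringify(k)}:${canonicalize(v)}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

function jobId(def: JobDefinition): string {
  return createHash("sha256").update(canonicalize(def)).digest("hex");
}

function jobUrl(queue: string, def: JobDefinition): string {
  // The same definition always maps to the same address, so results can be
  // cached, shared, and re-fetched until they expire.
  return `https://example-grid.org/q/${queue}/j/${jobId(def)}`;
}
```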
This public grid has the following properties:
It's open source, which of course means you can take it and run it yourself.
But it's also deployed to a public endpoint. It's a new kind of open-source system: one that provides, at low cost, a publicly accessible resource in the form of auto-scaling queues.
It's a public access point to separate what you need to run from where it runs.
This way, resources can be provided in many different ways, and we always aim to provide a local-access equivalent.
This way, users have more control and power over their own data. It's trivial to add OAuth and other kinds of access control on top of these public queues. If the system is taken up, some organizations will run their own implementations of these queues for resource-access and security reasons. That lets organizations keep ownership over what runs on their systems and what those workloads can access, while still allowing safe interoperability with user-defined code and visualization.
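As a sketch of what supplying your own resources might look like, a worker is just a process that points at a queue, claims jobs, and posts results back. The endpoints, payloads, and the runContainer helper below are illustrative assumptions, not the real worker protocol.

```typescript
// Hypothetical sketch of a worker loop attached to a queue.
// Endpoints, payloads, and the runContainer helper are illustrative assumptions.

const QUEUE_URL = "https://example-grid.org/q/my-lab-public-queue";

async function runContainer(image: string, command: string[]): Promise<string> {
  // Placeholder: a real worker would run the job in a container runtime
  // (e.g. Docker) and capture its outputs.
  return `ran ${command.join(" ")} in ${image}`;
}

async function workOnce(): Promise<void> {
  // Claim the next pending job, if any.
  const res = await fetch(`${QUEUE_URL}/claim`, { method: "POST" });
  if (res.status === 204) return; // queue is empty
  const job = await res.json();   // { id, image, command, ... }

  const output = await runContainer(job.image, job.command);

  // Post the result back so anyone addressing the job can fetch it.
  await fetch(`${QUEUE_URL}/j/${job.id}/result`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ output }),
  });
}

// A worker is just this loop running on whatever machine you own.
setInterval(() => workOnce().catch(console.error), 5000);
```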
It gives users safe and easy access to serious amounts of compute, while also expressing that compute in a form that can be publicly shared if desired. The goal is for each of these units to be a kind of web tile that can be connected with other similar components. Even if never shared or connected, it is an extremely convenient record of how to reproduce the code in a given environment.
Without a consistent definition of how to build an environment to run code, reproduction involves an enormous amount of manual process. We believe that how easily a unit of computation can be shared and reproduced is tied to how it is expressed, and in this case that is URLs and web containers. Expressed this way, the implementation is easy to replace if desired, while the public specification of how to build and run remains.
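To illustrate the "everything is a URL" idea, a whole job definition can itself live inside a URL, so that the URL alone specifies how to build the environment and what to run. The host, path, and base64url encoding below are assumptions for illustration, not the exact scheme used.

```typescript
// Hypothetical sketch: carrying an entire job definition inside a URL.
// The host, path, and base64url encoding are illustrative assumptions.

interface JobDefinition {
  image: string;      // environment: a container image
  command: string[];  // what to run in that environment
}

function definitionToUrl(queue: string, def: JobDefinition): string {
  const encoded = Buffer.from(JSON.stringify(def)).toString("base64url");
  return `https://example-grid.org/q/${queue}#${encoded}`;
}

function urlToDefinition(url: string): JobDefinition {
  const encoded = new URL(url).hash.slice(1);
  return JSON.parse(Buffer.from(encoded, "base64url").toString("utf8"));
}

// Sharing the URL is sharing the computation: anyone who opens it can rebuild
// the environment and re-run the command against the same queue.
const url = definitionToUrl("my-lab-public-queue", {
  image: "python:3.12",
  command: ["python", "-c", "print('hello, grid')"],
});
```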
Here are some examples.
