The whole reason I started developing Molnett late last year was Fly.io's amazing blog post about scheduling and orchestration. That post really resonated with me. So, I went on an even deeper dive into their content, reading about "Docker without Docker" and a public snapshot of their entire init process.

Inspired by this whole ordeal, I set out to implement my own orchestrator and accompanying services! This turned out to be a... How should I say it? A time-consuming and sleep-depriving endeavor. Essentially all of my nights in January and February of 2023 (usually until 2AM every night) were spent reading and experimenting with everything Fly.io has been writing about. And it was probably the most fun I've had with programming in years!

A Load Balancer? You got it.

I started working on a small proxy in Rust that accepts traffic over HTTP/2 (which supports multiplexing) and efficiently proxies it to VMs/processes that might be running HTTP/1.1 or HTTP/2, wherever they might be. The overhead was very minor, and it's essentially a load balancer without all the fluff around it (who cares about security anyway!?).
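
To make the idea concrete, here's a minimal sketch in Go rather than Rust (so not Molnett's actual code): an h2c listener lets clients multiplex requests over HTTP/2, and a plain reverse proxy fans those streams out to a backend. The hard-coded backend address is just a placeholder for whatever VM or process the real proxy would discover and route to.

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"

	"golang.org/x/net/http2"
	"golang.org/x/net/http2/h2c"
)

func main() {
	// Placeholder backend; the real proxy would look this up per request
	// from whatever service discovery is in place.
	backend, err := url.Parse("http://127.0.0.1:8081")
	if err != nil {
		log.Fatal(err)
	}

	// httputil.ReverseProxy speaks whatever the backend speaks (HTTP/1.1 here),
	// while the listener below accepts multiplexed HTTP/2 from clients.
	proxy := httputil.NewSingleHostReverseProxy(backend)

	// h2c accepts HTTP/2 without TLS, so one client connection can carry
	// many concurrent streams that each become a backend request.
	handler := h2c.NewHandler(proxy, &http2.Server{})

	log.Println("listening on :8080")
	log.Fatal(http.ListenAndServe(":8080", handler))
}
```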

An Orchestrator? Yes ma'am!

The second project of the year was reading all the white papers about orchestration referenced in the Fly.io blog post mentioned earlier and seeing how far I could get implementing my own! This has likely been the most challenging task I've taken on. Distributed programming is no joke, and this was full of it!

I probably spent well over two weeks on it and actually got it working, in a way. I decided to go down the route of a bidding scheduler (is there a better word for this?)! Essentially, each scheduling request is handled synchronously and can either fail or succeed. When a client wants to schedule a new workload, it reaches out to the Orchestrator and asks it to run a certain workload with certain parameters. The Orchestrator in turn contacts all the worker nodes and asks them how much capacity they have left and whether they are capable of running the workload, and they answer either yes or no. During this time they are locked to prevent competition for resources. The Orchestrator then makes a decision based on the information from all the workers, picks the best one based on what you as an operator deem important (this is the scheduling part), tells that worker to run it, and returns the result back to the client.
If the process goes astray at any point, an error is returned all the way to the client so they can decide what to do next. There's no state saved in a process somewhere that continuously tries to schedule workloads. Either you get one, or you don't.
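
To give a feel for the flow, here's a rough sketch of that synchronous bidding loop. It's not Molnett's actual code: the types and the "most free CPU wins" rule are illustrative stand-ins, and a real implementation would fan the bids out concurrently and release the losing workers' reservations.

```go
package scheduler

import (
	"context"
	"errors"
)

// Workload and Bid are illustrative types, not an actual API.
type Workload struct {
	Image     string
	CPUMillis int
	MemoryMB  int
}

type Bid struct {
	WorkerID string
	CanRun   bool
	FreeCPU  int // the "pick the best" criterion in this sketch
}

// Worker is whatever RPC client talks to a node. AskForBid is expected to
// reserve (lock) the offered resources until Run or a timeout releases them.
type Worker interface {
	AskForBid(ctx context.Context, w Workload) (Bid, error)
	Run(ctx context.Context, w Workload) error
}

// Schedule is fully synchronous: it either places the workload and returns
// the winning worker, or returns an error straight back to the caller.
// There is no background loop retrying on the client's behalf.
func Schedule(ctx context.Context, workers []Worker, w Workload) (string, error) {
	var best Bid
	var bestWorker Worker

	// Collect a bid from every worker and remember the most attractive one.
	for _, wk := range workers {
		bid, err := wk.AskForBid(ctx, w)
		if err != nil || !bid.CanRun {
			continue
		}
		if bestWorker == nil || bid.FreeCPU > best.FreeCPU {
			best, bestWorker = bid, wk
		}
	}

	if bestWorker == nil {
		return "", errors.New("no worker can run this workload")
	}

	// Tell the winner to actually start the workload; any failure here is
	// surfaced to the client rather than retried by the orchestrator.
	if err := bestWorker.Run(ctx, w); err != nil {
		return "", err
	}
	return best.WorkerID, nil
}
```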

This made so much sense to me and solved many fundamental problems I didn't want to deal with. There's no continuous process trying to make the world look and run perfectly, no distributed state, and no consensus algorithms. It feels much more like traditional programming, and if you remember what I wrote about in the previous post, I wanted to get away from that type of complexity.

In the end, there was light.

These experiments taught me a lot about what I can do and what I shouldn't do! It was obvious to me that building all of this from scratch on my own just wasn't going to work. It would take way too much time for me to get somewhere where I could run actual workloads on it.

Luckily, there is an alternative!

Enter the next contender, Kubernetes!

This is a hot topic! If you actively read Hacker News, you will have read about how Kubernetes is a complex BEAST and completely unnecessary for almost everyone. And they are absolutely correct!

Where Kubernetes shines is in the hands of an Operator, not in the hands of a Developer. The reason Kubernetes has become so widely adopted in recent years is, in my eyes, the powerful ecosystem around it. There are countless companies and individuals working on open-source projects in every problem space you can think of, and we can take advantage of these building blocks to make something cohesive!

When my co-founder Mikael came into the picture in March, he was quick to question why I wasn't just using Kubernetes. I think I was just burned out from using it so much as a developer, rather than building a platform on top of it as an operator, but I gave it a chance anyway. I started experimenting with building an Operator using Kubebuilder and it took me less than a month to get to a point where I had all the pieces in place to run a serverless platform. At that point it was obvious to me that Kubernetes was the way forward for us!
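
For a sense of what "building an Operator using Kubebuilder" looks like in practice, here's a trimmed-down controller-runtime reconciler. The `App` resource and its `example.com/platform/api/v1` package are hypothetical placeholders for whatever CRD a serverless platform would define; the real reconciler would create the Deployments, Services, certificates, and so on that back it.

```go
package controllers

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"

	// Hypothetical API package Kubebuilder would generate for a custom
	// "App" resource standing in for a serverless workload.
	platformv1 "example.com/platform/api/v1"
)

type AppReconciler struct {
	client.Client
}

// Reconcile is called by controller-runtime whenever an App (or something it
// owns) changes, and drives the cluster toward the desired state.
func (r *AppReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	var app platformv1.App
	if err := r.Get(ctx, req.NamespacedName, &app); err != nil {
		// Deleted objects are not an error; just stop reconciling them.
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// Here you would create or update the resources that back the App.
	// Sketched as a single existence check for a Deployment of the same name.
	var deploy appsv1.Deployment
	if err := r.Get(ctx, req.NamespacedName, &deploy); apierrors.IsNotFound(err) {
		// build the Deployment from app.Spec and r.Create(ctx, &deploy) ...
	}

	return ctrl.Result{}, nil
}

func (r *AppReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&platformv1.App{}).
		Owns(&appsv1.Deployment{}).
		Complete(r)
}
```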

At this point, I'm working on all the parts that are required to run external workloads on Kubernetes. This is where we will have to spend all of our energy to make something good while keeping it safe, which is not an easy task in any regard. Luckily for us, we can use projects such as Cilium, the Ory Stack, and Kata Containers to get the features we need to run a multi-tenant platform safely.

What's next?

After a much-needed summer vacation on beautiful Öland, I'm working hard on getting Molnett on its feet so we can get users on it! This currently involves implementing an organization-based ACL in our API using Ory Keto. Keto has proven to be a really amazing piece of software, and I can't wait to write more about it in the coming weeks!
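
As a small teaser of what that looks like, here's a sketch of asking Keto whether a user belongs to an organization. It assumes Keto's read API is reachable on its default port and exposes the relation-tuple check endpoint (the exact path can differ between Keto versions); the `organizations` namespace and `member` relation are made-up names, not our actual permission model.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/url"
)

// checkMember asks Keto's read API whether a user is a member of an
// organization. Namespace and relation names here are purely illustrative.
func checkMember(ketoReadURL, org, user string) (bool, error) {
	q := url.Values{}
	q.Set("namespace", "organizations")
	q.Set("object", org)
	q.Set("relation", "member")
	q.Set("subject_id", user)

	resp, err := http.Get(ketoReadURL + "/relation-tuples/check?" + q.Encode())
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()

	var body struct {
		Allowed bool `json:"allowed"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return false, err
	}
	return body.Allowed, nil
}

func main() {
	// Assumes a Keto read endpoint on the default port 4466.
	ok, err := checkMember("http://127.0.0.1:4466", "molnett", "alice")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("allowed:", ok)
}
```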

Keep an eye on our LinkedIn page or our newsletter for upcoming posts.

Thank you for reading ❤️
