Last active Oct 23, 2022
Cloud Operating Systems and Reconstituting the Monolith. tweet responses:


The first post has been published:

The second post has been adapted for Temporal:

these are bullet points of a blogpost on a topic i know very little about but i feel like there is something there that is happening as we speak

this might be better as two blog posts

Cloud Operating Systems

  • the Big 3 Cloud Providers are mostly (not exclusively) racing each other towards providing good cloud primitives.
    • arguably this is not the best way to perceive their strategy as it seems GCP/Azure are verticalizing rather than matching AWS horizontally, but that's not relevant here
  • Applications were originally envisioned to be run directly on these clouds, but, increasingly, intermediate providers are rising up to provide a better developer experience and enforce opinionated architectures (like JAMstack)
    • Netlify
    • Zeit
    • Glitch
    • Amplify
    • Binaris
    • Stackery
    • ???
  • The working name for this new generation of cloud providers, used by Martin Casado, Amjad Masad, and Guillermo Rauch, is "second layer" or "higher level" cloud providers.
  • Nobody loves these names. They don't tell you the value add. They also imply that more layers will be built atop these layers, and that is doubtful.
  • In the first (serverful) wave of Cloud, the abstraction from hardware to software was often explained as a 3 layer model: IaaS -> PaaS -> SaaS

  • But all the big clouds are essentially PaaSes now - OSes are increasingly being abstracted away. So maybe we can use "second layer PaaS"?
  • if we view the Big 3 as providing new "cloud primitives", then maybe a better name for "second layer clouds" is "Cloud Operating Systems". especially if the premise (if not the current reality) is your application seamlessly running across multiple clouds.

Reconstituting the Monolith

  • Serverless cannot proclaim total victory until we can recreate DHH's demo from 2005 in 15 minutes.
  • The plain fact is that it has been hard to break up with the monolith - it is simply too handy to have everything in one place.
  • Serverless functions (Lambda) are nice, but not nearly enough to replace everything we used to do in a single runtime.
  • We can piece back everything with services and APIs, but this architecture is still far too bespoke and brittle and slow and leaky. (altho in theory we still get the benefits of everything being distributed, not worrying about horiz/vertical scaling, and pay-per-use pricing)
  • the jobs that monoliths do that we have to reconstitute in serverless-land:
    • static fileserving: often relegated to CDNs anyway
    • functions: marginal compute
    • gateway: for auth/sessions/rate limiting, etc
      • auth is a hard enough problem on its own that it is offered as a standalone service, altho really it is made up of other elements
    • socket management: for live subscriptions, maybe part of the gateway
    • jobrunners: for long running compute (aka batch processing?)
    • queue: for not dropping messages and jobs (aka stream processing?)
    • scheduler: for coordinating functions and jobrunners. at most basic level this is a cronjob, but you will eventually want a smarter scheduler for prioritizing work across limited allocated resources.
    • object/cold storage: slower, immutable, large, (long lived ?) persistence
    • database/hot storage: fast, mutable, small, (short lived ?) persistence
      • related jobs: searching, caching
    • (metajobs: error logging, usage logging, dashboarding, CI/CD)
    • (unique to cloud: latency aka edge computing. see victor bahl at msft)
  • each has to be able to talk to and make use of each other EASILY to match the DX of monoliths
  • keeping up with this stuff is a fulltime job, the media company covering this is literally called The New Stack
  • infinite scalability is nice, but not at the expense of infinite potential cost. a good cost cap + failover story is also important to DX. Users understand "sorry our service is temporarily down because of a sudden surge in demand", but the opposite, "sorry your bill this month is $1m because of a sudden surge in usage and it's up to you to figure out why", is less well accepted by developers and their employers
  • so maybe the answer to breaking the monolith up is to reconstitute the monolith inside the application framework - standard APIs that expose the various functions of a monolith.
  • the Serverless Framework is an early pioneer of this, but seems focused on the IaC (infrastructure as code) job rather than the unified interface job (and doesn't have as good an answer for non serverless stuff)
  • Zeit and Next.js take the monorepo -> microservices split rather seriously and have vertically aligned themselves all the way down to the frontend library layer - is there more to do here?
  • Redwood is TPW and team's effort to do this atop Netlify, but the db layer is currently on Heroku.
  • i think Cloud Operating Systems are well positioned to offer and coordinate these jobs and expose a good DX layer for users.
    • Binaris focuses on functions
    • Zeit and Netlify combine static fileserving with functions
    • Begin combines data with the above
    • Amplify adds storage with the above (and, for some reason, XR?!)
    • what about the other jobs of the monolith? currently, we are told to spin up services the regular old way. or duct tape together a bunch of solutions not designed for this task and not integrated with anything else.
    • not. good. enough.
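
a rough sketch of what "standard APIs that expose the various functions of a monolith" could look like, covering only three of the jobs above (queue, hot storage, functions) plus a cost cap. everything here is hypothetical: `CloudOS`, `createInMemoryCloudOS`, and `withCostCap` are names made up for illustration, not any real platform's API, and the in-memory backing exists only so the sketch runs locally.

```typescript
// Hypothetical sketch: none of these names come from a real platform or SDK.

type Job = { id: string; run: () => string };

// One small interface per "job of the monolith" (a subset, for brevity).
interface Queue {
  enqueue(job: Job): void;
  dequeue(): Job | undefined;
}
interface HotStorage {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}
interface Functions {
  invoke(name: string, input: string): string;
}

// The unified surface an application framework could expose to developers.
interface CloudOS {
  queue: Queue;
  db: HotStorage;
  functions: Functions;
}

// In-memory stand-in so the example is self-contained and runnable.
function createInMemoryCloudOS(
  handlers: Record<string, (input: string) => string>
): CloudOS {
  const jobs: Job[] = [];
  const rows = new Map<string, string>();
  return {
    queue: {
      enqueue: (job) => {
        jobs.push(job);
      },
      dequeue: () => jobs.shift(), // first in, first out
    },
    db: {
      get: (key) => rows.get(key),
      set: (key, value) => {
        rows.set(key, value);
      },
    },
    functions: {
      invoke: (name, input) => {
        const handler = handlers[name];
        if (!handler) throw new Error(`unknown function: ${name}`);
        return handler(input);
      },
    },
  };
}

// Cost cap: refuse further invocations once a fixed budget would be exceeded.
// Costs are whole units (e.g. cents) to avoid floating point drift.
function withCostCap(
  fns: Functions,
  costPerInvoke: number,
  budget: number
): Functions {
  let spent = 0;
  return {
    invoke: (name, input) => {
      if (spent + costPerInvoke > budget) {
        throw new Error("temporarily down: cost cap reached");
      }
      spent += costPerInvoke;
      return fns.invoke(name, input);
    },
  };
}

// Usage: the pieces talk to each other through one object.
const cloud = createInMemoryCloudOS({ greet: (name) => `hello ${name}` });
const capped = withCostCap(cloud.functions, 1, 2);
cloud.db.set("greeting", capped.invoke("greet", "dhh"));
```

the point of the wrapper is the tradeoff described above: once the budget is used up the caller gets a "temporarily down" error instead of an unbounded bill. a real platform would also need a failover story, which this sketch does not attempt.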

I think the Cloud OS that reconstitutes the monolith earliest will be a natural aggregator of every application developer moving to a serverless first world.

note - kevin scott - reprogramming the american dream, AI given infinite compute. the guy who built a supercomputer on aws.

again the mega caveat to all of the above is that i am a novice in this industry and am ignorant of both how hard it is to do all of this and the full capabilities of every platform

rylandg commented Mar 4, 2020

@sw-yx I've spent some time around HCI and have worked with quite a few customers that were either actively using it or had used it. You were wondering if it's overhyped; the answer is yes. That being said, I can't think of a single emerging tech that hasn't been overhyped in recent years. It's a real problem in the industry/world IMO. Hype aside, I think HCI is a real indicator of where the market wants to go.

In your original post, you describe an offering that provides all of the individual building blocks needed to create a holistic product, while maintaining your ability to choose which of those blocks are actually used. Even though all of the blocks have been wrapped in a nice "whole product offering" for you (Heroku, Firebase, Amplify, Netlify even), you're still the one who ultimately makes decisions about what services are used. While HCI relates partially to this conversation, there is a key difference. The premise of HCI is that compute and storage are inseparable and if you want one you implicitly want the other. This usually means that by using HCI you forfeit the ability to scale compute and storage independently.

HCI is really about simplifying things down to a single unit. Because of this, HCI offerings tend to manifest as appliances (I've heard the term "datacenter in a box" before). It's easy to see why an appliance is the path of least resistance for HCI if you understand the basic rationale for HCI in the first place (fewer moving parts). I've personally worked with two different HCI appliances before (albeit minimally), the Hyperflex from Cisco and Nutanix ??? (maybe AOS). I didn't use either solution enough to form a confident opinion, but anecdotally Nutanix was very user friendly and the customer seemed very happy with it. On the other side of the coin, the Hyperflex customer I worked with was in the process of moving away from the solution and was quite frustrated. While I think the appliance route is the most convenient manifestation of HCI, it suffers from a big problem. Generally speaking, large enterprises are the ones who tend to invest in appliances. Unfortunately, HCI is not a great fit for many enterprises, who often need a strong devops team anyway and also have a cost-driven need to scale compute resources independently. OTOH, an HCI appliance might actually make sense for a smaller shop that wants a more hands-off approach to scaling on-prem infrastructure. From my observations, though, smaller shops tend not to go the appliance route.

What you should really keep your eye on is vSAN from VMware. Someone might yell at me and tell me that vSAN is not really an HCI, which is obviously true. In practice, most people who want HCI would be just as well served by vSAN. With vSAN, instead of going the appliance route, VMware offers a virtualized SAN layer that allows you to access storage from any DAS across the nodes. This provides a lot of the benefits people are looking for with HCI, without needing to physically bundle your compute and storage into a single unit.

I hope I made things clearer!

sw-yx commented Mar 4, 2020

whoa, you really did, thank you!! TIL you worked with Nutanix.. old coworker of mine was a senior sales person there and i confess i never really understood what they did until you explained it in terms of HCI (which, btw, to me stands for Human Computer Interaction, haha).

rylandg commented Mar 5, 2020

Glad it made sense. In my short time I've somehow been lucky enough to work with a lot of very diverse types of technology. Even then, I see at least 1 startup a day whose value prop completely goes over my head. When it happens it just excites me because it's another reminder of how much new stuff there is to learn.

In regards to Nutanix, when I worked with their HCI offering I wasn't working directly with them (rather one of their customers). More recently my last company had a small engagement with Nutanix itself and while it didn't materialize, the Nutanix team members I worked with were incredibly intelligent and down to earth people. Left a good impression on me.

HCI used to mean human computer interfaces for me which I think is actually the same thing as human computer interaction. And although I do understand what Nutanix offers, I'm still waiting for someone to explain to me what ServiceNow does (10% joking 90% not lol).

sw-yx commented Mar 22, 2020

tagging more info as i learn it - here's Joe Duffy of Pulumi talking about how the Cloud OS idea was first proposed by Dave Cutler (Windows NT architect)

(18 mins in)

and some links i found:


Terraform is kind of "self-rolled distros":

sw-yx commented Apr 21, 2020

sw-yx commented Jun 22, 2020

"we must treat the data center itself as one massive warehouse scale computer." - Urs Hölzle
