Demonstrating the Datacenter Operating System – Mesosphere

ninkendo · on Dec 11, 2014

Definitely a nice UI to visualize the kind of abstractions that services like Marathon and Chronos offer. Kudos!

I'm interested in what's happening with the Cassandra/Kafka/etc. "add-ons". Are they pure Mesos tasks running on ephemeral filesystems? Or are they hooking up to an external storage service to persist their data? (Or is this just a POC demo that isn't actually running Cassandra?)

In a typical Mesos setup, persistent storage is punted and left to be implemented outside of Mesos. If you launch an application in a stock Mesos environment, Docker or not, you get an empty directory for your app to run out of. So with a naive implementation of Cassandra on Mesos, you'd lose your data if all your nodes had to be restarted at once.

You can mitigate this by having a Cassandra-aware Mesos scheduler that knows enough about Cassandra's data redundancy to maintain a "health level" for your database, but this still doesn't solve the issue of what happens when all the nodes go down at once. You could do regular backups, but then you can still end up with stale data: if the last backup ran X minutes ago and you have to bring everything back up, anything written since then is gone.
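
To make the "health level" idea concrete, here is a rough sketch of the check such a scheduler might run before taking a node down. It is not a Mesos or Cassandra API: `safe_to_restart` and its parameters are made up for illustration, and it assumes (pessimistically, ignoring token ranges) that any `replication_factor` nodes could together hold every replica of some row.

```python
def safe_to_restart(node, down_nodes, replication_factor):
    """Return True if taking `node` down keeps the cluster above the health level.

    Hypothetical helper, not part of Mesos or Cassandra.
    Simplifying assumption: any `replication_factor` nodes might hold all
    replicas of some row, so we never allow that many nodes down at once.
    """
    would_be_down = set(down_nodes) | {node}
    return len(would_be_down) < replication_factor


# With a replication factor of 3, a second node may go down, but not a third.
print(safe_to_restart("node-b", {"node-a"}, 3))
print(safe_to_restart("node-c", {"node-a", "node-b"}, 3))
```

A check like this only helps with rolling restarts. It has nothing to say about the case where every node goes down together.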

The Mesos project (part of Apache, and distinct from Mesosphere) seems to be aware of this and is planning to offer the primitives necessary for a scheduler to keep tasks "sticky" to a set of data (so you can re-launch Cassandra instances on the same nodes they used to be on): https://issues.apache.org/jira/browse/MESOS-1554. It's not implemented or even fleshed out yet, though. I would be incredibly excited to see how Mesosphere solved it if they're doing Cassandra in a "pure Mesos" way.
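
For illustration, the "sticky" placement could look roughly like this on the scheduler side. This is only a sketch: `pick_offer`, the offer dicts, and the `last_host` map are hypothetical and are not the Mesos scheduler API. It also assumes the data from the previous run is still on the old host, which is exactly the part stock Mesos doesn't give you today.

```python
def pick_offer(instance_id, offers, last_host):
    """Prefer a resource offer from the host this instance last ran on.

    Hypothetical shapes, not the real Mesos API:
      offers:    list of dicts like {"id": "o1", "hostname": "node-a"}
      last_host: dict mapping instance id -> hostname of its previous run
    Returns the chosen offer, or None if there are no offers.
    """
    preferred = last_host.get(instance_id)
    for offer in offers:
        if offer["hostname"] == preferred:
            # Assumption: the old data directory survived on this host.
            return offer
    # No offer from the old host: fall back to any node and start empty.
    return offers[0] if offers else None


offers = [
    {"id": "o1", "hostname": "node-a"},
    {"id": "o2", "hostname": "node-b"},
]
chosen = pick_offer("cassandra-1", offers, {"cassandra-1": "node-b"})
print(chosen["id"])
```

The fallback branch is where the data loss happens, so a real scheduler would probably want to wait a while for an offer from the old host before giving up on it.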

ssk2 · on Dec 11, 2014

The DCOS utility looks great. I've spent way too long trying to get distributed systems installed and configured in the past. If this works as well as the demo shows, it'll be as magical as brew install for OS X.

I wonder how the interface scales with thousands of nodes - at some point you'd have too many doughnut charts for this to be meaningful, right?

preillyme · on Dec 11, 2014

If you want early access to the Mesosphere DCOS (Datacenter Operating System), fill out this form: http://www.mesosphere.com/product/#signup