As an SRE managing one of the world's largest Aurora/Mesos compute clusters, I find capacity management, including demand forecasting, a challenging routine, all the more so while Twitter Engineering moves forward with re-platforming to Kubernetes. Come hear me speak about implementing a technical solution that increased our existing clusters' capacity by 9%, resulting in a multi-year, nine-figure CAPEX saving by solving inefficiency challenges at datacenter scale.
- Discovering underutilized resources (CPU, memory, and network) within our storage platforms
- Implementing a modern solution based on past experience with:
  - Solaris Zones
  - Cgroups
  - Namespaces
- Co-locating compute and storage workloads on the same bare-metal machine