Linux System Performance in the Cloud and Data Center

Linux is the dominant operating system in data centers and the cloud. Its robust networking and IO stacks support high-volume transaction processing, and it offers a rich set of resource management, monitoring and tracing capabilities. Well-tuned Linux systems can deliver low-latency transactions and high-throughput computing, even on commodity servers. This course introduces common performance methodologies for workloads hosted on Linux in the cloud and in data centers, including workload characterization, system profiling, performance management and benchmarking. The course is ideal for system administrators and solution integrators who want to learn the fundamentals of the performance measurement, debugging and optimization methods used in these environments.

The course begins with measurement and tuning concepts. It reviews how the components of the Linux kernel (scheduler, network and IO stacks) and application APIs (with asynchronous and multi-threaded programming) interact to form scalable solutions. You will learn how to identify resource contention issues that result in lower throughput and higher latency. You’ll also learn how to use the Linux resource management framework (cgroups, containers) and server virtualization technologies to improve agility in resource provisioning. Additionally, you’ll gain experience simulating production workloads for problem isolation and benchmarking.
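As a preview of the resource management labs, here is a minimal Python sketch of capping a process's CPU time with cgroups v2. The group name "perf-lab" and the 25% quota are illustrative choices, and the sketch assumes a cgroup v2 hierarchy mounted at /sys/fs/cgroup, root privileges, and the cpu controller enabled in the parent group; it is not part of the official course materials.

#!/usr/bin/env python3
"""Illustrative sketch: cap a process's CPU share using cgroups v2.

Assumes /sys/fs/cgroup is a cgroup v2 mount, the script runs as root,
and the cpu controller is enabled in the parent's cgroup.subtree_control.
"""
import os
from pathlib import Path

CGROUP = Path("/sys/fs/cgroup/perf-lab")  # hypothetical lab group name


def cap_cpu(pid: int, quota_us: int = 25000, period_us: int = 100000) -> None:
    """Limit the group to quota_us of CPU time per period_us (25% by default)."""
    CGROUP.mkdir(exist_ok=True)
    (CGROUP / "cpu.max").write_text(f"{quota_us} {period_us}\n")
    (CGROUP / "cgroup.procs").write_text(f"{pid}\n")  # move the process into the group


if __name__ == "__main__":
    cap_cpu(os.getpid())  # cap this script itself
    while True:           # busy-loop so the cap is visible in top or pidstat
        pass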


You will gain hands-on experience with the rich set of monitoring and tracing tools available in Linux, including pidstat, iotop, fio and sysbench, as well as advanced tools for full software-stack analysis such as SystemTap, perf and sysdig. Students will also be exposed to key cloud technologies such as data sharding, auto-scaling, Service-Oriented Architecture (SOA) and the DevOps model, which allow companies to deploy cloud-native services at a scale not possible in a traditional data center environment.
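To give a flavor of the scripted monitoring covered in the labs, the short Python sketch below wraps pidstat (from the sysstat package) to sample CPU, memory and disk I/O statistics for a single process. The target PID, one-second interval and five-sample count are illustrative defaults, not values prescribed by the course.

#!/usr/bin/env python3
"""Illustrative sketch: collect per-process statistics by wrapping pidstat.

Assumes the sysstat package (which provides pidstat) is installed.
"""
import subprocess
import sys


def sample(pid: int, interval_s: int = 1, count: int = 5) -> str:
    """Return pidstat output: -u (CPU), -r (memory) and -d (disk I/O) for one PID."""
    cmd = ["pidstat", "-u", "-r", "-d", "-p", str(pid), str(interval_s), str(count)]
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


if __name__ == "__main__":
    target = int(sys.argv[1]) if len(sys.argv) > 1 else 1  # default: PID 1
    print(sample(target))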


Topics Include:



  • Linux performance metrics, management and tuning principles

  • Linux kernel (scheduler, network and IO stacks)

  • Application API (with asynchronous and multi-threaded programming)

  • How to use Linux performance monitoring and tracing tools and interpret results

  • How to simulate production workloads for problem isolation and benchmarking (see the sketch after this list)

  • Finding performance bottlenecks and sources of application latency with advanced tool sets

  • Industry trends: data sharding and auto-scaling in public and private cloud
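For the workload-simulation topic above, the sketch below shows one way to script a synthetic random-read I/O load with fio from Python. The job parameters (4 KiB blocks, a 128 MB file under /tmp, a 30-second time-based run) are illustrative assumptions, not course-prescribed values.

#!/usr/bin/env python3
"""Illustrative sketch: drive a synthetic random-read workload with fio.

Assumes fio is installed; all job parameters are placeholder values.
"""
import subprocess


def run_randread(directory: str = "/tmp", runtime_s: int = 30) -> str:
    """Run a time-based 4 KiB random-read fio job and return its report."""
    cmd = [
        "fio",
        "--name=randread-sim",     # arbitrary job name
        "--rw=randread",           # random-read access pattern
        "--bs=4k",                 # 4 KiB block size
        "--size=128M",             # per-job file size
        "--time_based",            # run for the full runtime, looping over the file
        f"--runtime={runtime_s}",
        f"--directory={directory}",
    ]
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


if __name__ == "__main__":
    print(run_randread())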



NOTE: Students are required to bring their own laptops to do labs in class.