OpsMotive Academy

Cloud Infrastructure

Linux Performance Triage for Platform Engineers

Read latency, memory pressure, and disk stalls with tooling that fits SSH-first workflows.

Duration: 3 weeks
Format: Async labs with two live deep dives
Skill focus: Beginner
Project intensity: Medium

280,000 KRW

Tuition is listed for reference only; this static site has no checkout.

Program narrative

A pragmatic tour of performance tooling on modern kernels, aimed at engineers who work on nodes but are not kernel developers. Labs cover unexpected cgroup behavior, clues from the I/O stack, and concise write-ups your incident channel will appreciate.

Included focus areas

  • Latency histogram interpretation without chart overload
  • cgroup v2 memory pressure signals
  • Disk stall patterns and when to escalate to storage teams
  • CPU scheduler basics tied to noisy neighbor cases
  • Safe data collection during incidents
  • Short postmortem sections that link evidence
  • Checklists for handoffs to vendor support

Outcomes you can show

  1. Produce a triage checklist adopted in a lab incident drill
  2. Capture a three-step remediation with measured impact
  3. Pair-present findings without relying on dashboard-only storytelling

Mentor of record

Marcus Reid

Senior DevOps instructor focused on measurable node behavior.

Participant notes

IO stack week mirrored a production mystery we had last quarter. Checklists are now in our wiki.
Talia · SRE

Straight answers

No; labs use VMs with injected slowdowns.