kubectl Command Center: The Sage's Essentials

Essential Kubernetes Troubleshooting Guide Using kubectl

When working with Kubernetes from notebooks, CI/CD pipelines, or day-to-day operations, troubleshooting quickly is essential. A few well-chosen kubectl commands can help you identify failing workloads, inspect logs, navigate namespaces, and recover services faster.

This guide expands on the kubectl cheat sheet and provides detailed explanations, practical examples, and troubleshooting workflows that notebook users can reference when issues arise.

1. Create a Faster Workflow with Aliases

Typing kubectl repeatedly can slow you down. Create an alias:

alias k='kubectl'

Add it to your shell profile (~/.bashrc, ~/.zshrc) to make it permanent.

Why it helps

Less typing per command

Easier notebook snippets

Cleaner examples in documentation
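One caveat for notebook cells and CI scripts: bash does not expand aliases in non-interactive shells unless expand_aliases is set, so the alias may be missing there. A minimal sketch of an alternative, assuming bash and that kubectl is on the PATH, is a small wrapper function:

```shell
# Remove any existing alias first; otherwise an interactive shell would expand
# the "k" in the function definition below into "kubectl".
unalias k 2>/dev/null || true

# A function behaves the same in interactive shells, scripts, and notebook cells.
k() {
  kubectl "$@"
}
```

Use either the alias or the function, not both, in the same shell profile.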

2. The “What’s Broken?” View

When something fails, start by checking which pods are not healthy.

k get pods -A --field-selector=status.phase!=Running

What it does

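-A: checks all namespaces
--field-selector=status.phase!=Running: hides pods whose phase is Running

Quickly exposes pods in the Pending, Failed, Succeeded (shown as Completed), or Unknown phase. A pod in CrashLoopBackOff usually still reports the Running phase, so also scan the STATUS and RESTARTS columns of k get pods -A.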
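The phase filter works on status.phase only, which is why a crash-looping pod can slip through it. The sketch below filters on the STATUS and READY columns of the table output instead. The function name unhealthy_pods is made up for illustration, and the column positions assume the default layout of kubectl get pods -A.

```shell
# List pods whose STATUS is neither Running nor Completed, plus Running pods
# whose READY count is short (for example 0/1).
# Assumed columns: NAMESPACE NAME READY STATUS RESTARTS AGE
unhealthy_pods() {
  kubectl get pods -A --no-headers | awk '
    {
      split($3, ready, "/")             # READY looks like "1/2"
      if ($4 == "Completed") next       # finished Jobs are usually fine
      if ($4 != "Running" || ready[1] != ready[2]) print $1, $2, $3, $4
    }'
}
```

Calling unhealthy_pods prints one line per suspicious pod: namespace, name, ready count, and status.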

Typical Issues

Image pull failures
Resource shortages
Startup crashes
Misconfigured secrets/configmaps

Recommended Next Step

Pick a failing pod and inspect details:

k describe pod <pod-name> -n <namespace>

3. Sort Events to Find Recent Failures

Events often explain what happened moments before an outage.

k get events -A --sort-by='.lastTimestamp'
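On a busy cluster the full event list can be noisy. A sketch that keeps only Warning events by adding the type field selector follows; the function name recent_warnings and the default of 20 lines are arbitrary choices, not part of kubectl.

```shell
# Show the most recent Warning events across all namespaces.
# $1: number of lines to keep (default 20, an arbitrary choice)
recent_warnings() {
  kubectl get events -A --field-selector type=Warning --sort-by=.lastTimestamp \
    | tail -n "${1:-20}"
}
```

Because the list is sorted oldest first, tail keeps the newest entries (and drops the header row once the list is longer than the limit).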

4. Context & Namespace Navigation

Managing multiple clusters or namespaces can cause accidental deployments to the wrong environment.

View Current Context

k config current-context

To list every configured context (the active one is marked with *):

k config get-contexts

Switch Namespace (with kubens)

kubens <namespace>
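kubens is a separate utility from the kubectx project and may not be installed everywhere. A sketch of the same switch using kubectl alone is shown here; the function names use_namespace and current_namespace are hypothetical helpers.

```shell
# Set the default namespace on the active context.
# $1: namespace to switch to
use_namespace() {
  kubectl config set-context --current --namespace="$1"
}

# Print the namespace the active context points at.
# Empty output means no namespace is set, so "default" applies.
current_namespace() {
  kubectl config view --minify -o jsonpath='{..namespace}'
}
```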

Why This Matters in Notebooks

If your notebook runs commands against the wrong cluster, results may be misleading. Always verify context before troubleshooting.
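One way to make that check hard to skip is a guard at the top of a notebook cell or script. This is a minimal sketch; require_context is a hypothetical helper, and the context name passed to it is whatever your kubeconfig uses.

```shell
# Stop early unless the active context matches the expected one.
# $1: expected context name
require_context() {
  local expected="$1"
  local current
  current="$(kubectl config current-context)"
  if [ "$current" != "$expected" ]; then
    echo "Refusing to continue: context is '$current', expected '$expected'" >&2
    return 1
  fi
}
```

A typical use is require_context <context-name> || exit 1 before any troubleshooting commands.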

5. Deep Dive Debugging

Stream Logs

k logs -f <pod-name> --tail=100

Why it helps

Shows last 100 lines immediately
Continues streaming live logs
Useful for startup failures and intermittent crashes

If Multiple Containers Exist

k logs <pod-name> -c <container-name>
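For a container that keeps restarting, the live log often begins after the interesting part. The --previous flag asks for the log of the prior container instance. In this sketch the function name previous_logs and the fallback to the default namespace are assumptions.

```shell
# Print the last 100 log lines of the previous (crashed) container instance.
# $1: pod name, $2: namespace (falls back to "default")
previous_logs() {
  kubectl logs "$1" -n "${2:-default}" --previous --tail=100
}
```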

6. Launch a Temporary Debug Container

Sometimes production containers are minimal and lack tools like curl, ping, or nslookup.

k run debug-shell --rm -it --image=busybox -- /bin/sh

Use It For

DNS resolution tests
Network connectivity checks
Service reachability
Internal API validation

Example

wget -qO- http://my-service:8080/health
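The same check can run as a one-shot pod that is deleted when the command finishes, which suits notebook cells where an interactive shell is awkward. This is a sketch: fetch_in_cluster and the net-check pod name are made up, and the 5-second timeout is an arbitrary choice.

```shell
# Fetch a URL from inside the cluster using a throwaway busybox pod.
# $1: URL as seen from inside the cluster, e.g. http://my-service:8080/health
fetch_in_cluster() {
  # $$ (the shell's PID) keeps the pod name reasonably unique per session.
  kubectl run "net-check-$$" --rm -i --restart=Never --image=busybox -- \
    wget -qO- -T 5 "$1"
}
```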

7. Inspect Pod Lifecycle & Exit Codes

k describe pod <pod-name> | grep -C 5 State

Common Exit Codes

Code	Meaning
1	Generic application error
127	Command not found
137	Killed with SIGKILL (128 + 9), often OOMKilled
143	Terminated with SIGTERM (128 + 15), usually a graceful shutdown

If You See 137: Confirm that the container's last state shows Reason: OOMKilled, then raise the memory limit or investigate memory leaks.
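Exit codes above 128 generally mean the process was ended by a signal, with the signal number equal to the code minus 128. The sketch below decodes a code and pulls the last termination details straight from the pod status; both function names are hypothetical, and the default namespace fallback is an assumption.

```shell
# Translate a container exit code into a short explanation.
explain_exit_code() {
  local code="$1" sig name
  if [ "$code" -gt 128 ]; then
    sig=$((code - 128))
    name="$(kill -l "$sig" 2>/dev/null)" || name="unknown"
    echo "$code: killed by signal $sig ($name)"
  else
    echo "$code: exit status set by the process, not a signal"
  fi
}

# Print name, exit code, and reason of each container's last termination.
# $1: pod name, $2: namespace (falls back to "default")
last_termination() {
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.lastState.terminated.exitCode}{"\t"}{.lastState.terminated.reason}{"\n"}{end}'
}
```

A reason of OOMKilled next to code 137 points at the memory limit; 137 with a different reason suggests the container was killed for another cause.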

8. Resource Management

Force Delete a Stuck Pod

k delete pod <pod-name> --grace-period=0 --force

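Use Carefully: Force deletion removes the pod from the API server without waiting for confirmation that its containers have stopped, so they may keep running on the node. Only use it when a pod is stuck in Terminating and normal deletion fails.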
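Before forcing the deletion, it can help to see why the pod is stuck. Leftover finalizers and an unreachable node are two common causes. A sketch of a quick inspection, with stuck_pod_info as a hypothetical helper name:

```shell
# Show when deletion was requested, which finalizers remain, and the node.
# $1: pod name, $2: namespace (falls back to "default")
stuck_pod_info() {
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{"deletionTimestamp: "}{.metadata.deletionTimestamp}{"\n"}{"finalizers: "}{.metadata.finalizers}{"\n"}{"node: "}{.spec.nodeName}{"\n"}'
}
```

If finalizers are listed, the controller that owns them is the place to look next; if the node is down, k get nodes will usually show it as NotReady.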

9. Check Cluster Resource Pressure

k top nodes
k top pods

Requires

Metrics Server installed in the cluster.

Useful For

CPU bottlenecks
Memory pressure
Identifying noisy neighbors
Capacity planning
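To spot noisy neighbors quickly, the pod list can be sorted by usage across all namespaces. In this sketch, top_memory_pods and the default of 10 rows are illustrative choices; --sort-by also accepts cpu.

```shell
# List the pods using the most memory, cluster-wide.
# $1: how many pods to show (default 10); the +1 keeps the header row
top_memory_pods() {
  kubectl top pods -A --sort-by=memory | head -n "$(( ${1:-10} + 1 ))"
}
```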

10. Port Forwarding: The Quick Fix

Need temporary access to a service without exposing it through an Ingress?

k port-forward svc/<service-name> 8080:<service-port>

Then open:

http://localhost:8080

Great For

Testing APIs locally
Accessing dashboards
Debugging internal services
Secure temporary access
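In a notebook or script, the forward usually has to run in the background and be cleaned up afterwards. The sketch below wraps that pattern; with_port_forward is a hypothetical helper, and the fixed 2-second wait is a crude assumption that may need tuning.

```shell
# Run a command while a port-forward is active, then tear the forward down.
# $1: target such as svc/<service-name>, $2: LOCAL:REMOTE ports, rest: command
with_port_forward() {
  local target="$1" ports="$2"
  shift 2
  kubectl port-forward "$target" "$ports" >/dev/null 2>&1 &
  local pf_pid=$!
  sleep 2                      # give the tunnel a moment to come up
  "$@"
  local rc=$?
  kill "$pf_pid" 2>/dev/null   # stop the background port-forward
  return "$rc"
}
```

For example, with_port_forward svc/<service-name> 8080:<service-port> followed by an HTTP client command against http://localhost:8080 runs the request and closes the tunnel when it returns.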

Conclusion

Kubernetes troubleshooting becomes much easier when you follow a consistent workflow. These kubectl commands help reduce mean time to resolution (MTTR) and give notebook users a reliable path from symptom to solution.

Keep this guide linked in your notebook so users can jump directly into deeper troubleshooting whenever needed.