How to Scale Kubernetes Applications Based on HTTP Request Count Using KEDA

Nikhil Raj
  • 9 min read

Kubernetes has transformed how we deploy and manage applications, but one challenge persists: how do you scale applications precisely based on actual traffic patterns?

Kubernetes' native Horizontal Pod Autoscaler (HPA) can scale workloads based on CPU, memory, and even custom metrics. However, configuring HPA to scale from event sources such as HTTP request rates, message queues, or external systems often requires additional components and custom metric adapters, increasing operational complexity. Additionally, native HPA cannot scale deployments down to zero replicas, which limits cost efficiency during idle periods.

This is where KEDA (Kubernetes Event-Driven Autoscaling) comes in. KEDA simplifies event-driven autoscaling by integrating directly with external event sources and feeding those metrics into the HPA automatically, and it adds the scale-to-zero capability described above. In this post, we'll explore how KEDA works and demonstrate how to configure HTTP request-based autoscaling on a real Kubernetes cluster.

Why Traditional Kubernetes Autoscaling Is Not Always Enough

The Kubernetes Horizontal Pod Autoscaler (HPA) automatically adjusts the number of pod replicas based on resource metrics such as CPU and memory utilization. For compute-intensive applications, this approach works well because increased workload generally results in higher resource consumption.

However, many modern cloud-native applications do not exhibit this behavior.

Consider an API service that primarily waits for responses from a database or communicates with external services. Even while handling thousands of concurrent requests, CPU utilization may remain relatively low because the application spends most of its time waiting for I/O operations rather than performing computation.

As a result, the HPA may not detect the increasing workload quickly enough, leading to:

  • Increased request latency
  • Higher response times
  • Request timeouts
  • Poor end-user experience

In these situations, resource utilization becomes an indirect and often delayed indicator of application demand.
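To make that gap concrete, here is a minimal Python sketch of the replica formula documented for the HPA: desired replicas = ceil(current replicas * current metric / target metric). The traffic figures, the 70% CPU target, and the function name are illustrative assumptions, and the sketch ignores the HPA's tolerance band and stabilization behavior.

```python
import math


def hpa_desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """Replica count from the documented HPA ratio formula (tolerance ignored)."""
    return math.ceil(current_replicas * current_metric / target_metric)


# Hypothetical I/O-bound API running 2 pods: traffic has tripled, but CPU sits at 35%
# against a 70% utilization target, so a CPU-based HPA sees no reason to add pods.
cpu_based = hpa_desired_replicas(2, current_metric=35.0, target_metric=70.0)

# The same moment judged by requests per second per pod (300 observed, 100 targeted).
request_based = hpa_desired_replicas(2, current_metric=300.0, target_metric=100.0)

print("CPU signal wants", cpu_based, "replica(s)")
print("Request signal wants", request_based, "replica(s)")
```

With these assumed numbers, the CPU signal would even allow a scale-down while the request signal asks for three times the current capacity.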

Why Choose KEDA?

KEDA complements the Kubernetes HPA by introducing event-driven autoscaling capabilities. Instead of relying exclusively on CPU or memory utilization, KEDA evaluates external workload signals and automatically adjusts the number of running replicas.

This approach is particularly valuable for:

  • Highly variable traffic patterns
  • Event-driven processing
  • Queue-based workloads
  • Scheduled workloads
  • HTTP-based services
  • Workloads that can cut costs by scaling to zero when idle

What Is KEDA?

Kubernetes Event-Driven Autoscaling (KEDA) is an open-source autoscaling framework that extends Kubernetes with event-driven scaling capabilities.

Originally developed through collaboration between Microsoft and Red Hat, KEDA has since become a CNCF Graduated Project, reflecting its maturity, stability, and widespread production adoption.

KEDA continuously monitors external event sources and automatically scales Kubernetes workloads according to real-time demand.

Key Features

KEDA provides several capabilities beyond native Kubernetes autoscaling:

  • Event-driven autoscaling using external metrics
  • Scale-to-zero for cost-efficient workloads
  • Integration with more than 60 event sources
  • Lightweight architecture with minimal cluster overhead
  • Native integration with Kubernetes Horizontal Pod Autoscaler
  • Declarative configuration through Kubernetes Custom Resources

How HTTP Request-Based Scaling Works

For HTTP workloads, KEDA provides an HTTP add-on with three components:

Component | Purpose
Interceptor | Acts as a lightweight proxy that receives incoming HTTP requests, forwards them to the application, and simultaneously tracks request counts.
External Scaler | Collects request metrics from the Interceptor and converts them into a format that KEDA's HPA integration can consume for scaling decisions.
HTTPScaledObject | A Kubernetes Custom Resource that defines the scaling configuration, including the target workload, request thresholds, hosts, path prefixes, and minimum/maximum replica counts.
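As a rough mental model of the counting side of this design, the Python sketch below records request timestamps and reports an average rate over a time window. It is not the add-on's implementation; the class name, method names, and the 60-second window are assumptions made for illustration.

```python
from collections import deque


class RequestRateWindow:
    """Records request timestamps and reports the average rate over a time window."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, now: float) -> None:
        """Register one request observed at time `now` (in seconds)."""
        self._timestamps.append(now)

    def rate(self, now: float) -> float:
        """Average requests per second over the last `window_seconds`."""
        cutoff = now - self.window_seconds
        # Drop requests that have aged out of the window.
        while self._timestamps and self._timestamps[0] < cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) / self.window_seconds


# Hypothetical traffic: 120 requests spread evenly across one minute.
window = RequestRateWindow(window_seconds=60)
for i in range(120):
    window.record(now=i * 0.5)
print(window.rate(now=60.0))  # 2.0
```

A scaler in this model would only need to poll `rate()` and hand the number to the autoscaler, which is the division of labor the table describes.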

Demo: Scaling an HTTP Application with KEDA

Let's put KEDA into action! In this hands-on demo, we'll deploy a sample application, configure HTTP request-based autoscaling, and watch KEDA automatically scale the application up and down based on real traffic.

Prerequisites

Before proceeding, ensure you have:

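  • A Kubernetes cluster (v1.16 or later at minimum; check the compatibility notes for your KEDA release, since recent versions require a newer cluster)
  • kubectl configured to communicate with your cluster
  • Helm v3 installed
  • Basic familiarity with Kubernetes Deployments and Services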

Step 1: Install KEDA

Add the KEDA Helm repository and install:

helm repo add kedacore https://kedacore.github.io/charts
helm repo update
kubectl create namespace keda
helm install keda kedacore/keda --namespace keda

Verify that the KEDA pods are running:

kubectl get pods -n keda

Step 2: Install the HTTP Add-on

helm install http-add-on kedacore/keda-add-ons-http --namespace keda

Confirm the installation:

kubectl get pods -n keda | grep http

Verify the available configuration fields for your version:

kubectl explain httpscaledobject.spec --recursive

This command displays all supported fields, helping you avoid schema validation errors.

Step 3: Deploy a Sample Application

Create a Deployment and Service for a simple web application:

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      containers:
      - name: app
        image: nginx:alpine
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 200m
            memory: 256Mi
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 3
          periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: sample-app-service
  namespace: default
spec:
  selector:
    app: sample-app
  ports:
  - port: 80
    targetPort: 80

Apply the configuration:

kubectl apply -f deployment.yaml

Step 4: Create the HTTPScaledObject

This Custom Resource defines how KEDA should scale the application:

# httpscaledobject.yaml
apiVersion: http.keda.sh/v1alpha1
kind: HTTPScaledObject
metadata:
  name: sample-app-scaler
  namespace: default
spec:
  hosts:
  - sample-app.example.com
  pathPrefixes:
  - /
  scaleTargetRef:
    name: sample-app
    service: sample-app-service
    port: 80
  replicas:
    min: 0
    max: 10
  scaledownPeriod: 300
  scalingMetric:
    requestRate:
      targetValue: 100
      window: "1m"

Field descriptions:

Field | Purpose
hosts | Hostname(s) the interceptor monitors
pathPrefixes | URL paths to include in metrics
scaleTargetRef.name | Target Deployment name
scaleTargetRef.service | Service routing to the pods
scaleTargetRef.port | Service port that receives the forwarded traffic
replicas.min | Minimum replicas (0 enables scale-to-zero)
replicas.max | Maximum replicas
scaledownPeriod | Seconds to wait after the last observed activity before scaling back to zero
scalingMetric.requestRate.targetValue | Target requests/second per pod
scalingMetric.requestRate.window | Time window for rate averaging

This configuration instructs KEDA to target roughly one pod for every 100 requests per second, up to a maximum of 10 pods, and to wait 5 minutes after the last observed activity before scaling back to zero.
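The arithmetic behind that sentence can be sketched in a few lines of Python. This is an approximation under stated assumptions: it divides the total request rate by the per-pod target and clamps the result to the replica bounds, while the real autoscaler also applies tolerance, stabilization, and the cooldown before scaling to zero. The function name and the sample rates are hypothetical.

```python
import math


def replicas_for_rate(total_rps: float, target_per_pod: float,
                      min_replicas: int, max_replicas: int) -> int:
    """ceil(total rate / per-pod target), clamped to [min_replicas, max_replicas]."""
    if target_per_pod <= 0:
        raise ValueError("target_per_pod must be positive")
    wanted = math.ceil(total_rps / target_per_pod)
    return max(min_replicas, min(max_replicas, wanted))


# Values taken from the HTTPScaledObject above: targetValue 100, min 0, max 10.
for rps in (0, 50, 250, 1500):
    print(rps, "req/s ->", replicas_for_rate(rps, 100, 0, 10), "replica(s)")
```

Note how 1,500 requests per second would call for 15 pods but is capped at the configured maximum of 10.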

Apply:

kubectl apply -f httpscaledobject.yaml

Verify creation:

kubectl get httpscaledobject

Step 5: Test Autoscaling Behavior

Since this is a local demonstration, use port forwarding to route traffic through the interceptor:

kubectl port-forward -n keda svc/keda-add-ons-http-interceptor-proxy 8080:8080

Generate load:

For this demonstration, we'll use hey, a lightweight HTTP load-testing tool. It sends concurrent HTTP requests to simulate user traffic, which lets us observe how KEDA scales the application as the request rate increases. Install it first if it is not already available on your machine.

hey -n 10000 -c 50 -host "sample-app.example.com" http://localhost:8080/

Observe scaling in a separate terminal:

kubectl get pods -l app=sample-app -w

You will see KEDA create additional pods as the request rate exceeds the threshold, then scale them down after the cooldown period once load stops.

Step 6: Verify Metrics

Check the HPA that KEDA automatically creates:

kubectl get hpa

Inspect the scaler status:

kubectl describe httpscaledobject sample-app-scaler

Production Best Practices

When deploying to production, follow these recommendations:

1. Set Resource Requests and Limits

Always define resource boundaries to ensure predictable scheduling:

resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi
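Requests, not limits, are what the scheduler uses to place pods, so they also determine how many replicas fit on a node. The Python sketch below estimates that number from the requests above. The node size (2 CPUs, 4Gi) is a hypothetical example, the helper names are made up for illustration, and the estimate ignores system reservations, other workloads, and per-node pod limits.

```python
def cpu_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '100m' or '2' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


_MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def memory_bytes(quantity: str) -> int:
    """Convert a memory quantity such as '128Mi' to bytes (binary suffixes only)."""
    for suffix, factor in _MEMORY_UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


def pods_that_fit(node_cpu: str, node_memory: str, pod_cpu: str, pod_memory: str) -> int:
    """Pods schedulable by requests alone: the tighter of the CPU and memory budgets."""
    by_cpu = cpu_millicores(node_cpu) // cpu_millicores(pod_cpu)
    by_memory = memory_bytes(node_memory) // memory_bytes(pod_memory)
    return min(by_cpu, by_memory)


# Hypothetical 2-CPU / 4Gi node with the requests shown above (100m CPU, 128Mi memory).
print(pods_that_fit("2", "4Gi", "100m", "128Mi"))
```

Running a check like this against your maximum replica count is a quick way to see whether the cluster can actually host a full scale-out.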

2. Configure Readiness Probes

Prevent traffic from reaching pods that aren't ready:

readinessProbe:
  httpGet:
    path: /healthz
    port: 80
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 3
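The interplay of consecutive failures and successes can be sketched as a small state machine. The Python below is a simplified model, not the kubelet's code: it assumes a pod starts as not ready, becomes ready after `success_threshold` consecutive successes, and is marked not ready after `failure_threshold` consecutive failures. The function name and the probe sequence are illustrative.

```python
def readiness_timeline(probe_results, failure_threshold=3, success_threshold=1):
    """Return the ready/not-ready state after each probe result."""
    ready = False  # a pod is not ready until a probe succeeds
    failures = successes = 0
    states = []
    for ok in probe_results:
        if ok:
            successes += 1
            failures = 0
            if successes >= success_threshold:
                ready = True
        else:
            failures += 1
            successes = 0
            if failures >= failure_threshold:
                ready = False
        states.append(ready)
    return states


# Two isolated failures are tolerated; three in a row take the pod out of rotation.
probes = [True, False, False, True, False, False, False, True]
print(readiness_timeline(probes))
```

With `periodSeconds: 5` and `failureThreshold: 3`, a pod that stops responding keeps receiving traffic for roughly 15 seconds before it is removed from the Service endpoints.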

3. Adjust Scaling Thresholds

Match thresholds to your application's capacity through load testing:

scalingMetric:
  requestRate:
    targetValue: 100    # Adjust based on actual pod capacity
    window: "1m"        # Longer windows smooth out spikes
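To see why a longer window smooths out spikes, the Python sketch below computes a trailing average over a synthetic traffic trace. The trace (a steady 50 requests per second with a 5-second burst of 600) and the function name are assumptions for illustration, and time before the trace starts is treated as idle.

```python
def windowed_rates(per_second_counts, window):
    """Trailing average request rate at each second, over the last `window` seconds."""
    rates = []
    for i in range(len(per_second_counts)):
        start = max(0, i - window + 1)
        chunk = per_second_counts[start : i + 1]
        rates.append(sum(chunk) / window)
    return rates


# Hypothetical trace: steady 50 req/s, a 5-second burst of 600 req/s, then steady again.
traffic = [50] * 60 + [600] * 5 + [50] * 55

peak_short = max(windowed_rates(traffic, window=5))
peak_long = max(windowed_rates(traffic, window=60))
print("peak over 5 s window:", peak_short)
print("peak over 60 s window:", round(peak_long, 2))
```

Against a target of 100, the 5-second window would see the burst at full strength, while the 60-second window never crosses the threshold. Which behavior you want depends on whether short bursts genuinely overload a pod.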

4. Use Appropriate Cooldown Periods

Prevent rapid scaling oscillations:

scaledownPeriod: 300    # Wait 5 minutes after the last activity before scaling to zero
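The effect of the cooldown on sporadic traffic can be sketched as follows. This Python model is idealized: it assumes the workload drops to zero exactly `scaledown_period` seconds after the last request, whereas the real add-on judges activity from in-flight requests and polling, so actual timing will differ. The function name and request times are hypothetical.

```python
def scale_to_zero_times(request_times, scaledown_period=300):
    """Times (in seconds) at which an idle gap reaches `scaledown_period`."""
    times = sorted(request_times)
    shutdowns = []
    for current, following in zip(times, times[1:]):
        if following - current > scaledown_period:
            shutdowns.append(current + scaledown_period)
    if times:
        # No traffic arrives after the last request, so the cooldown eventually expires.
        shutdowns.append(times[-1] + scaledown_period)
    return shutdowns


# Hypothetical request arrival times, in seconds.
arrivals = [0, 100, 250, 900, 950]
print(scale_to_zero_times(arrivals, scaledown_period=300))
print(scale_to_zero_times(arrivals, scaledown_period=700))
```

With a 300-second period, the quiet stretch between 250 s and 900 s triggers a scale to zero (and a cold start on the next request); extending the period to 700 seconds keeps the workload warm through that gap.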

5. Validate Schema Before Applying

Always check the correct field names for your KEDA version:

kubectl explain httpscaledobject.spec --recursive
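Field names are only part of the picture; values can also be inconsistent. As a complement to the schema check, here is a hypothetical pre-flight check in Python over the spec as a dictionary. The rules are assumptions drawn from the manifest used in this post, not from the CRD schema, and they do not replace validation by the API server.

```python
def check_http_scaled_object(spec):
    """Return a list of human-readable problems found in an HTTPScaledObject spec dict."""
    problems = []
    if not spec.get("hosts"):
        problems.append("hosts must list at least one hostname")
    target = spec.get("scaleTargetRef", {})
    for key in ("name", "service", "port"):
        if key not in target:
            problems.append(f"scaleTargetRef.{key} is missing")
    replicas = spec.get("replicas", {})
    low, high = replicas.get("min", 0), replicas.get("max", 0)
    if low > high:
        problems.append(f"replicas.min ({low}) exceeds replicas.max ({high})")
    rate = spec.get("scalingMetric", {}).get("requestRate", {})
    if rate and rate.get("targetValue", 0) <= 0:
        problems.append("scalingMetric.requestRate.targetValue must be positive")
    return problems


# The spec from Step 4, expressed as a Python dict.
spec = {
    "hosts": ["sample-app.example.com"],
    "pathPrefixes": ["/"],
    "scaleTargetRef": {"name": "sample-app", "service": "sample-app-service", "port": 80},
    "replicas": {"min": 0, "max": 10},
    "scaledownPeriod": 300,
    "scalingMetric": {"requestRate": {"targetValue": 100, "window": "1m"}},
}
print(check_http_scaled_object(spec))
```

An empty list means none of these assumed rules were violated; any messages point to the field worth double-checking before you run kubectl apply.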

6. Monitor KEDA Health

Regularly check KEDA component logs and metrics:

kubectl logs -n keda -l app=keda-operator
kubectl logs -n keda -l app=keda-add-ons-http-interceptor

Summary

KEDA is not a replacement for the Kubernetes Horizontal Pod Autoscaler — it's an extension of it. Where the HPA reacts to resource utilization, KEDA reacts to the real-world signals that actually drive demand: HTTP request rates, message queue depths, scheduled events, and more. For modern cloud-native applications, this distinction matters.

In this guide, we deployed the KEDA HTTP Add-on and configured an application to scale automatically based on incoming request traffic — going from zero replicas under no load to multiple pods under sustained traffic, and back to zero once demand subsided. No application code changes were required.

The result is an autoscaling setup that is both cost-efficient and responsive: you only run what you need, when you need it.

KEDA's 60+ built-in scalers mean this same pattern works beyond HTTP as well. Whether you're consuming messages from Kafka, processing an SQS queue, or reacting to custom metrics, KEDA handles it the same way — monitor the signal, scale to meet demand, scale back when it's done. If your application workload is better described by events than by CPU usage, KEDA is worth adding to your stack.

Additional Resources

  • KEDA Official Documentation
  • HTTP Add-on Reference
  • Kubernetes HPA Documentation
