Skip to content

Knative Practices

In this section, we will delve into learning Knative through several practical exercises.

case 1 - Hello World

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
spec:
  template:
    spec:
      containers:
        - image: m.daocloud.io/ghcr.io/knative/helloworld-go:latest
          ports:
            - containerPort: 8080
          env:
            - name: TARGET
              value: "World"

You can use kubectl to check the status of a deployed application that has been automatically configured with ingress and scalers by Knative.

~ kubectl get service.serving.knative.dev/hello
NAME    URL                                              LATESTCREATED   LATESTREADY   READY   REASON
hello   http://hello.knative-serving.knative.loulan.me   hello-00001     hello-00001   True

The deployed Pod YAML is as follows, consisting of two Pods: user-container and queue-proxy.

apiVersion: v1
kind: Pod
metadata:
  name: hello-00003-deployment-5fcb8ccbf-7qjfk
spec:
  containers:
  - name: user-container
  - name: queue-proxy

knative-request-flow

Request Flow:

  1. case1 When there is low traffic or no traffic, traffic will be routed to the activator.
  2. case2 When there is high traffic, traffic will be routed directly to the Pod only if it exceeds the target-burst-capacity.
    1. Configured as 0, expansion from 0 is the only scenario.
    2. Configured as -1, the activator will always be present in the request path.
    3. Configured as >0, the number of additional concurrent requests that the system can handle before triggering scaling.
  3. case3 When the traffic decreases again, traffic will be routed back to the activator if the traffic is lower than current_demand + target-burst-capacity > (pods * concurrency-target).

    The total number of pending requests + the number of requests that can exceed the target concurrency > the target concurrency per Pod * number of Pods.

case 2 - Based on Concurrent Scaling

We first apply the following YAML definition under the cluster.

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/target: "1"
        autoscaling.knative.dev/class: "kpa.autoscaling.knative.dev"
    spec:
      containers:
        - image: m.daocloud.io/ghcr.io/knative/helloworld-go:latest
          ports:
            - containerPort: 8080
          env:
            - name: TARGET
              value: "World"

Execute the following command for testing, and you can observe the scaling of the Pods by using kubectl get pods -A -w.

wrk -t2 -c4 -d6s http://hello.knative-serving.knative.daocloud.io/

case 3 - Based on concurrent scaling, scale out in advance to reach a specific ratio.

We can easily achieve this, for example, by limiting the concurrency to 10 per container. This can be implemented through autoscaling.knative.dev/target-utilization-percentage: 70, starting to scale out the Pods when 70% is reached.

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/target: "10"
        autoscaling.knative.dev/class: "kpa.autoscaling.knative.dev"
        autoscaling.knative.dev/target-utilization-percentage: "70" 
        autoscaling.knative.dev/metric: "concurrency"
     spec:
      containers:
        - image: m.daocloud.io/ghcr.io/knative/helloworld-go:latest
          ports:
            - containerPort: 8080
          env:
            - name: TARGET
              value: "World"

case 4 - Canary Release/Traffic Percentage

We can control the distribution of traffic to each version through spec.traffic.

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/target: "1"  
        autoscaling.knative.dev/class: "kpa.autoscaling.knative.dev"         
    spec:
      containers:
        - image: m.daocloud.io/ghcr.io/knative/helloworld-go:latest
          ports:
            - containerPort: 8080
          env:
            - name: TARGET
              value: "World"
  traffic:
  - latestRevision: true
    percent: 50
  - latestRevision: false
    percent: 50
    revisionName: hello-00001

Comments