Skip to content

Kubernetes officially released v1.26, the stability has been significantly improved

1.26 release

On December 8, 2022 Pacific Time, Kubernetes officially released v1.26 with the topic Electrifying .

As the last version in 2022, many new features have been added, and the stability has also been significantly improved. We will introduce the update of version 1.26 from the following perspectives.

Update overview:

  • Kube APIServer: As the entry point for Kubernetes requests, this update adds 4 new KEP features, and makes some optimizations in response compression, and another 2 features have also been upgraded from Alpha to Beta.
  • Node: We put the updates most closely related to kubelet here, mainly including 4 new KEP features, and 4 features are GA in this version.
  • Storage: In terms of storage, the feature of allocating volumes from snapshots of other namespaces (cross-namespaces) has been added, and 2 storage-related features have entered the Beta stage, and 3 features are officially GA.
  • Network: Mainly updates to Kube Proxy, including a performance-optimized KEP, and 2 features have entered Beta, and 4 features are officially GA.
  • Resource Control and Coordination: Mainly for the update of related resource controllers in kube-controller-manager, there are also 2 new KEP features, 2 features are upgraded from Alpha to Beta, and 1 This feature is officially GA.
  • Scheduler: Mainly added an important KEP feature - PodSchedulingReadiness is used to control when a Pod can be scheduled by the scheduler, and one feature has been upgraded from Alpha to Beta.
  • Observability: In terms of observability, it mainly adds a new mechanism for exposing component status - Component Health SLIs. In addition, many metrics are added to each component.
  • kubectl command, kubeadm and client-go also have some optimizations and Bug Fixes.
  • For features that have been GA, according to the version iteration policy of Kubernetes, 11 feature gates are also removed in 1.26. If these feature gates are continued to be set in the component command, the component will not start normally.

Next, let's take a look at some of the more important API deprecations and changes that affect upgrades.

API deprecations and changes

PR#110618 Kubelet no longer supports the v1alpha2 version of CRI, and the connected container runtime must implement the v1 version of the container runtime interface.

This means that Kubernetes v1.26 will not support containerd 1.5.x and earlier versions; you need to upgrade to containerd 1.6.x or later before you can upgrade the node's kubelet to 1.26.

PR#112306 flowcontrol.apiserver.k8s.io increases v1beta3 version, and sets v1beta2 as the optimal version, v1beta3 will be set in 1.27 is the optimal version.

PR#113336 The CSIMigrationvSphere feature is GA and cannot be turned off.

Tip

Official tip: Do not upgrade to Kubernetes v1.26 if you need to use Windows, XFS or raw blocks. You can upgrade after vSphere CSI Driver adds related support in versions after v2.7.x.

PR#113710 The --pod-eviction-timeoutflag in the kube-controller-manager command is deprecated and is compatible with --enable- taint-managerflag was removed in 1.27.

PR#112643 DynamicKubeletConfig is deprecated in 1.23, the logic in kubelet has been removed in 1.24, this update will FeatureGateDynamicKubeletConfig and APIServer The logic within is removed.

PR#112120 Remove some invalid klog related flags in Kube component.

Kubernetes APIServer

KEP-2799 Reduction of Secret-based Service Account Tokens

Added Alpha Feature Gate —— LegacyServiceAccountTokenTracking to control whether to enable this feature.

When LegacyServiceAccountTokenTracking is enabled, the secret-based sa token will use the label kubernetes.io/legacy-token-last-used to record the last usage time.

KEP-3488 CEL for Admission Control

Related PRs: PR#113314, PR#113349, PR#112994, PR#112792, PR#112926, PR#112858

Based on the KEP-2876 CRD validation expression language provided by Kubernetes v1.25, This feature adds a new resource under admissionregistration.k8s.io/v1alpha1 - ValidatingAdmissionPolicy , which allows field validation when not using a Validation Webhook.

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: "demo-policy.example.com"
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: ["apps"]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["deployments"]
  validations:
    - expression: "object.spec.replicas <= 5"

This will require the resource's spec.replicas field to be less than or equal to 5.

Added Alpha Feature Gate —— ValidatingAdmissionPolicy to control whether to enable this feature

KEP-3352 Aggregated Discovery PR#113171

Currently, users can only traverse the Group and Version APIs to obtain the Discovery API, but this feature reduces these calls to only two interfaces: /api and /apis.

Added Alpha Feature Gate —— AggregatedDiscoveryEndpoint to control whether to enable this feature.

KEP-3325 Auth API to get self user attributes PR#111333

Add resource SelfSubjectReview in authentication/v1alpha1 group to provide users with querying their own user information mapped in Kubernetes.

And the kubectl alpha auth whoami command has been added to facilitate query.

kubectl alpha auth whoami -o yaml
apiVersion: authentication.k8s.io/v1alpha1
kind: SelfSubjectReview
status:
  userInfo:
    username: jane.doe
    uid: b79dbf30-0c6a-11ed-861d-0242ac120002
    groups:
      - students
      - teachers
      - system:authenticated
    extra:
      skills:
        - reading
        - learning
      subjects:
        - math
        - sports

Added Alpha Feature Gate —— APISelfSubjectAttributesReview to control whether to enable this feature.

PR#112193 APIServer adds --aggregator-reject-forwarding-redirect flag, Users can set it to false to continue forwarding the redirection response of AA (Aggregated API) Server, the default is true.

PR#113015 You can specify custom resources through the --encryption-provider-config file, and these custom resources can be stored encrypted in etcd .

Response Compression

PR#112299 Based on load testing and production data collected from thousands of production Kubernetes clusters, the community observed that gzip compression in Kubernetes APIServer is currently suboptimal.

ISSUE: kubernetes/kubernetes#112296

PR#112309 Add DisableCompression field in kubeconfig, when set to true, it is required to no longer compress the response.

PR#112580 Add --disable-compression flag in kubectl, when set to true, request no longer compress the response.

Function stability upgrade

Alpha -> Beta

Feature Gate KEP
LegacyServiceAccountTokenNoAutoGeneration KEP-2799 Reduction of Secret-based Service Account Tokens
APIServerIdentity KEP-1965 kube-apiserver identity

Node (Kubelet)

KEP-3063 dynamic resource allocation PR#111023

Add resource.k8s.io/v1alpha1 group and add resources related to dynamic resource allocation under this group: 'ResourceClaim', 'ResourceClass', 'ResourceClaimTemplate', 'PodScheduling'.

Added Alpha Feature Gate —— DynamicResourceAllocation to control whether to enable this feature.

The new API is more flexible than Kubernetes' existing Device Plugins functionality, because it allows Pods to request special types of resources, which can be provided at the node level, cluster level, or according to other modes set by the user.

Similarly, the Pod structure also adds corresponding support for dynamic resource allocation.

apiVersion: v1
kind: Pod
spec:
  containers:
    - name: with-resource
      image: busybox
      command: ["sh", "-c", "set && mount && ls -la /dev/"]
      resources:
        claims:
          - name: resource
  resourceClaims:
    - name: resource
      source:
        resourceClaimName: shared-claim
        # resourceClaimTemplateName: test-inline-claim-template

KEP-3545 Improved multi-numa alignment in Topology Manager PR#112914

This feature better handles NUMA (Non-Uniform Memory Access) nodes by optimizing the TopologyManager.

Add a new configurable item topologyManagerPolicyOptions field and --topology-manager-policy-options flag to set additional configuration for Topology Manager Policy.

And add three Alpha feature gates to control the configuration of Topology Manager Policy:

  • TopologyManagerPolicyOptions
  • TopologyManagerPolicyAlphaOptions
  • TopologyManagerPolicyBetaOptions

TopologyManagerPolicyOptions controls whether the TopologyManagerPolicyOptions feature is supported, and the other two are used to control whether the Alpha and Beta level Options of TopologyManagerPolicy can be set respectively.

Of course, to use the TopologyManager function, you need to enable the TopologyManager Feature Gate, but the Feature Gate is already in the Beta stage and does not need to be actively set.

KEP-3386 Kubelet Evented PLEG for Better Performance PR#111384

This feature allows kubelet to reduce periodic polling by relying as much as possible on notifications from the Container Runtime Interface (CRI) when tracking Pod status in a node, which reduces kubelet's CPU usage

Added Alpha Feature Gate —— EventedPLEG to control whether to enable this feature.

PR#86139 In past versions, when using httpGet in container preStop and postStart lifecycle callbacks, Even if the schemes are set to HTTPS, http is still used to access, and the headers set by the user will not be applied in the setting request.

lifecycle:
  postStart:
    httpGet:
      port: 443
      scheme: HTTPS
    httpHandlers:
      - name: HEADER
        value: VALUE

This PR fixes these problems, and when the https access is abnormal, it will fall back to the http request. When the fallback occurs, A LifecycleHTTPFallback event is created for the Pod and the kubelet_lifecycle_handler_http_fallbacks_total metric is updated.

In addition, Added an Alpha Feature Gate - ConsistentHTTPGetHandlers , users can set --feature-gates=ConsistentHTTPGetHandlers=false in the kubelet to turn off the fallback behavior.

KEP-3503 Windows allows specifying whether a Pod is added to a node's network namespace PR#112961

Added Alpha Feature Gate —— WindowsHostNetworking to control whether to enable this feature.

PR#112414 allows to combine multiple line options in /etc/resolv.conf into a single line setting in Pod.

options ndots:1 attempts:3
options ndots:1 attempts:3 ndots:5

changed to:

options ndots:5 attempts:3

BUG FIX

PR#113041 Fixed kubectl exec causing kubelet to pick wrong container due to duplicate container name due to lifecycle.preStop .

PR#108832 When the container sets limit.cpu, but requests.cpu is "0", cgroups cpuShares takes the minimum value of 2 instead of using limit.cpu.

PR#112184 When kubelet only sets --cloud-provider or --node-ip , it will ensure that invalid annotations in the node are cleared - alpha.kubernetes.io/provided-node-ip .

PR#113481 Fix kubectl logs --timestamps when viewing logs, the problem of confusing random timestamps appears.

PR#112518 Fixed an issue where pods would continue to run on nodes tainted with NoExecute taints when the PodDisruptionConditions feature gate was enabled.

PR#112123 Set the minimum value of cpuCFSQuotaPeriod from 1us to 1ms, setting the value below 1ms will fail the verification.

Related PRs: PR#112077, PR#111520, PR#63437

Function stability upgrade

Beta -> GA

Feature Gate KEP
CPU Manager KEP-3057 Graduate to CPUManager to GA
DevicePlugins KEP-3573 Graduate DeviceManager to GA
Kubelet CredentialProviders KEP-2133 Kubelet Credential Provider
WindowsHostProcessContainers KEP-1981 Support for Windows privileged containers

Storage

KEP-3294 Provision volumes from cross-namespace snapshots

Related PRs: PR#113186, PR#kubernetes-csi/external-rpovisioner#805

Prior to Kubernetes 1.26, with VolumeSnapshot , users could allocate volumes from snapshots. But it fails to bind to VolumeSnapshots in other namespaces in a PersistentVolumeClaim .

apiVersion: v1
kind: PersistentVolumeClaim
spec:
  storageClassName: csi-hostpath-sc
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  dataSourceRef:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: new-snapshot-demo
    namespace: prod
  volumeMode: Filesystem

This feature supports users to allocate volumes from snapshots across namespaces by setting the newly added field spec.dataSourceRef.namespace in PVC

Added Alpha Feature Gate —— CrossNamespaceVolumeDataSource to control whether to enable this feature

Function stability upgrade

Alpha -> Beta

Feature Gate KEP
RetroactiveDefaultStorageClass KEP-3333 Retroactive default StorageClass assignment
NodeOutOfServiceVolumeDetach KEP-2268 Non-graceful node shutdown

Beta -> GA

Feature Gate KEP
CSIMigrationAzureFile KEP-625 In-tree storage plugin to CSI Driver Migration
CSIMigrationvSphere KEP-1491 vSphere in-tree to CSI driver migration
DelegateFSGroupToCSIDriver KEP-2317 Allow Kubernetes to supply pod's fsgroup to CSI driver on mount

network

PR#110268 Optimize kube-proxy performance, it only sends rules changed in call to iptables-restore instead of whole ruleset

Added Alpha Feature Gate —— MinimizeIPTablesRestore to control whether to enable this feature.

PR#108250 kube-proxy adds flag--iptables-localhost-nodeports to allow users to prohibit access to NodePort services through localhost.

PR#111806 If the user requests to use ipvs, but the system is not configured correctly, it will no longer fall back to iptables mode, but return an error.

PR#113363 If kube-proxy detects that the pod.Spec.PodCIDRs assigned to Nodes have changed, it will restart.

PR#112133 Removed the deprecated "userspace" proxy mode.

Function stability upgrade

Alpha -> Beta

Feature Gate KEP
Proxy Terminating Endpoints KEP-1669 Proxy Terminating Endpoints
ExpandedDNSConfig KEP-2595 Expanded DNS configuration

Beta -> GA

Feature Gate KEP
MixedProtocolLBService KEP-1435 Support of mixed protocols in Services with type=LoadBalancer
ServiceIPStaticSubrange KEP-3070 Reserve Service IP Ranges For Dynamic and Static IP Allocation
Service Internal Traffic Policy KEP-2086 Service Internal Traffic Policy
EndpointSliceTerminatingCondition KEP-1672 Tracking Terminating Endpoints

Resource Control and Coordination

KEP-3017 PodHealthyPolicy for PodDisruptionBudget PR#113375

Added spec.unhealthyPodEvictionPolicy field to PodDisruptionBudget resource to control when unhealthy Pods should be considered for eviction.

The spec.unhealthyPodEvictionPolicy field currently has two values that can be set - IfHealthyBudget and AlwaysAllow

spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: zookeeper
  unhealthyPodEvictionPolicy: IfHealthyBudget

Add Alpha Feature Gate —— PDBUnhealthyPodEvictionPolicy to control whether to enable this feature

KEP-3335 StatefulSet Start Ordinal PR#112744

StatefulSets currently number pods starting at 0.

This feature adds a spec.ordinals.start field to StatefulSet to control the starting number of Pods.

apiVersion: apps/v1
kind: StatefulSet
spec:
  ordinals:
    start: 1

Add Alpha Feature Gate —— StatefulSetStartOrdinal to control whether to enable this feature

PR#112011 Stops working if multiple HPAs involve the same pod, And set HPA's ScalingAction Condition to False, Reason is AmbiguousSelector, this PR also includes multiple HPAs pointing to the same Deployment.

Function stability upgrade

Alpha -> Beta

Feature Gate KEP
JobPodFailurePolicy KEP-3329 Retriable and non-retriable Pod failures for Jobs
PodDisruptionConditions KEP-3329 Retriable and non-retriable Pod failures for Jobs

Beta -> GA

Feature Gate KEP
JobTrackingWithFinalizers KEP-2307 Job tracking without lingering Pods

Pod Scheduling

KEP-3521 Pod Scheduling Readiness

Related PR: PR#113275, PR#113274, PR#113442

Not all Pods in the Pending state are ready to be scheduled, and some Pods cannot be successfully scheduled due to the lack of necessary resources, which will also bring additional work to the scheduler.

This feature adds the spec.schedulingGates field in the Pod to control whether the Pod is ready for actual scheduling.

spec:
  scheduling Gates:
  - name: <value>

Scheduling will only start if spec.schedulingGates is cleared:

$ kubectl get pod example-po
NAME READY STATUS RESTARTS AGE
example-po 0/1 SchedulingGated 0 30s

Added Alpha Feature Gate —— PodSchedulingReadiness to control whether to enable this feature

PR#111726 Output Pending status Pod information in the Scheduler's debug Dummer.

Optimization and BUG FIX

PR#111809 When using Patch to update Pod status, In addition to net.ConnectionRefused , retries are also performed on ServiceUnavailable and InternalError errors.

The ServiceUnavailable error occurs when the APIServer is temporarily unable to process a request.

E0805 20:54:21.624945 123623 scheduler.go:356] Error updating pod foo: the server is currently unable to handle the request (patch pods foo)

InternalError usually occurs due to a temporary failure of the webhook.

E0811 23:32:30.886582 213747 scheduler.go:357] Error updating pod foo: Internal error occurred: failed calling webhook "xyz": Post "xyz": context deadline exceeded

Function stability upgrade

Alpha -> Beta

Feature Gate KEP
NodeInclusionPolicy KEP-3094 Take taints/tolerations into consideration when calculating PodTopologySpread skew

Observability

KEP-3466 Kubernetes Component Health SLIs

There was no standard format to expose the health information of Kubernetes components before. This feature adds a new path /metrics/slis to each component to expose the service level metric (ServiceLevelIndicator) in the Prometheus format.

Every component needs to expose two metrics:

  • gauge: the current health check status of the component

    # HELP kubernetes_healthcheck [ALPHA] This metric records the result of a single healthcheck.
    # TYPE kubernetes_healthcheck gauge
    kubernetes_healthcheck{name="etcd", type="healthz"} 1
    kubernetes_healthcheck{name="etcd", type="readyz"} 1
    
  • counter: Cumulative count of detected states for each health check

    # HELP kubernetes_healthchecks_total [ALPHA] This metric records the results of all healthchecks.
    # TYPE kubernetes_healthchecks_total counter
    kubernetes_healthchecks_total{name="etcd", status="success", type="healthz"} 15
    kubernetes_healthchecks_total{name="etcd", status="success", type="readyz"} 15
    
  • Kube APIServer: PR#112741

  • Kube Controller Manager: PR#112978
  • Kube Scheduler: PR#113026
  • Kubelet: PR#113030
  • Kube Proxy: PR#113057
  • Cloud Controller Manager: PR#113340

Added Alpha Feature Gate - ComponentSLIs to control whether to enable this feature.

Some metrics have been added to each component, and some metrics calculation problems have been fixed.

Kubectl commands

Subcommand enhancements and fixes

PR#109525 kubectl wait command supports setting non-existent fields in -o jsonpath=, which can be useful when some fields are set asynchronously

PR#111096 kubectl api-resources adds categories column in -o wide output, and adds --categories parameter to support filtering based on categories.

PR#113819 kubectl alpha event moved to top-level command kubectl events.

PR#111093 Fixed kubectlrollouthistory --revision=-ojson|yaml to output json/yaml, return the latest version instead Not the specified revision.

PR#111571 Optimize the prompt information of the kubectl label --dry-run command to prevent users from misunderstanding that the label has been set.

  • before

    $ kubectl label pod foo bar=baz --dry-run=server
    pod/foo labeled
    
  • after

    $ kubectl label pod foo bar=baz --dry-run=server
    pod/foo labeled (server dry run)
    

PR#112556 Optimize the error message when kubectl patch uses StrategicMerigePatch to update custom resources.

PR#112700 Fix kubectl covert choose wrong api version.

PR#109505 kubectl annotate no longer throws an error when setting an annotation with the same value as the original value.

PR#110907 When executing kubectl apply, if --namespace is specified, but --prune-allowlist is not specified, non-namespace resources will be deleted. The pr is just to print a warning. In 1.28, when kubectl apply specifies a namespace, it no longer deletes the resource pv & namespace without a namespace.

PR#113116 kubectl apply adds --prune-allowlist flag, used with --prune flag to replace the deprecated --prune-whitelist flag.

Other

PR#113146 The kubectl explain command can use OpenAPIv3 via the environment variable KUBECTL_EXPLAIN_OPENAPIV3.

PR#112553 Kubectl escapes terminal special characters in output. Fixes CVE-2021-25743.

PR#112150 Optimize kubectl's display of invalid requests returned by APIServer.

PR#112243, PR#112261 deprecated kubectl run command Multiple flags, even if set they will be ignored.

Shell Completion

PR#113636 kubectl shell completion supports showing command description in bash.

bash-5.1$ kubectl a[tab][tab]
alpha                   (Commands for features in alpha)
annotate                (Update the annotations on a resource)
api-resources           (Print the supported API resources on the server)
api-versions            (Print the supported API versions on the server, in the form of "group/version")
apply                   (Apply a configuration to a resource by file name or stdin)
argo                    (The command argo is a plugin installed by the user)
attach                  (Attach to a running container)
auth                    (Inspect authorization)
autoscale               (Auto-scale a deployment, replica set, stateful set, or replication controller)

bash-5.1$ kubectl --c[tab][tab]
--cache-dir            (Default cache directory)
--certificate-authority (Path to a cert file for the certificate authority)
--client-certificate    (Path to a client certificate file for TLS)
--client-key           (Path to a client key file for TLS)
--cluster              (The name of the kubeconfig cluster to use)
--context              (The name of the kubeconfig context to use)

PR#105867 provides shell completion for the kubectl plugin, and the plugin can pass kube_complete- to provide shell completion for plugin commands.

Kubeadm

Command fixes and enhancements

PR#113005 kubeadm join phase control-plane-preapare certs supports running with --dry-run .

PR#112945 supports dry-run mode for sub-phases, e.g. kubeadm reset phase cleanup-node --dry-run .

PR#111512 A new p hase —— show-join-command has been added to the kubeadm init command. Users can pass kubeadm init --skip -phase=show-join-command to skip printing the join information, this phase cannot be executed alone.

PR#112172 The kubeadm reset command adds the --cleanup-tmp-dir flag, which will clean up the contents of /etc/kubernetes/tmp , The default is false.

PR#112732 kubeadm adds validation for container registry format in configuration.

PR#111783 When the CertificateAuthorityData of the kubeconfig read by kubeadm is empty, it will try to load the CA certificate from the external CertificateAuthority file.

PR#112508 Allow keys in RSA and ECDSA formats during preflight check.

PR#110972 kubeadm reset will try to clean up stale data when executing, and the old data will be cleared when each reset phase is executed , the default etcd data directory will be deleted when the remove-etcd-member phase is executed.

PR#112751 Fixed a bug when validating ClusterConfiguration network-related fields (dnsDomain, serviceSubnet, podSubnet).

Other

PR#111277 Optimize the error message when kubeadm runs subcommands.

PR#112008 Since the node-role.kubernetes.io/master taint is no longer set in the control plane node in 1.25, kubeadm is no longer for CoreDNSDeployment Set node-role.kubernetes.io/master toleration.

PR#112000 The kubeadm init|join|upgrade command removes the --container-runtime flag since there is only one flag since dockershim was removed A value that can be set with --container-runtime=remote.

Client-Go

PR#112200 The SharedInformerFactory of client-go adds a Shutdown method to wait for all running informers in the Factory to end.

Remove features

The new version removes the GA Feature Gate:

  • ServiceLoadBalancerClass
  • ServiceLBNodePortControl
  • CSRDuration
  • DefaultPodTopologySpread
  • NonPreemptingPriority
  • PodAffinityNamespaceSelector
  • PreferNominatedNode
  • PodOverhead
  • UnversionedKubeletConfigMap
  • Indexed Job
  • SuspendJob

Function downgrade

Beta -> Alpha

LocalStorageCapacityIsolationFSQuotaMonitoring was upgraded to Beta in v1.25, but was rolled back to Alpha due to the problem that the ConfigMap would not be properly synced to the Pod file system after the update.

Release History

References

1.26 release

Comments