
High Availability Solution for Bootstrap Nodes

This page explains the concept of a "backup bootstrap node." This node is a virtual machine that can perform all the duties of the primary bootstrap node but is activated only when the primary node fails. Once the primary node is restored, operations should promptly revert to it. To make the bootstrap service highly available, you can use approaches such as pre-configured DNS, HAProxy + KeepAlived, or an nginx reverse proxy. This article focuses on the pre-configured DNS approach.

The overall solution architecture is as follows:

arch

Prepare Environment

  • Bootstrap node: operating system CentOS 7.9, IP xxx.xx.xx.193
  • Global single cluster: operating system CentOS 7.9, IP xxx.xx.xx.194, CRI containerd
  • Backup bootstrap node: operating system CentOS 7.9, IP xxx.xx.xx.194 (the backup bootstrap node can be colocated with the master node of the global cluster)
  • dnsServer: IP xxx.xx.xx.192
  • Domain name: www.tinder-node-server.com (initially resolved to the bootstrap node xxx.xx.xx.193)

Steps

Simulate a dnsServer (Configure According to Your Actual Environment)

Set up the dnsServer on the machine xxx.xx.xx.192. The following configuration is for demonstration purposes only:

  1. Configure /etc/named.conf

    zone "tinder-node-server.com" IN {
        type master;
        file "tinder-node-server.com.zone";
        allow-update { none; };
    };
    
    zone "41.30.172.in-addr.arpa" IN {
        type master;
        file "tinder-node-server.com.local";
        allow-update { none; };
    };
    
  2. Configure /var/named/tinder-node-server.com.zone

    $TTL 1D
    @   IN SOA  tinder-node-server.com. rname.invalid. (
                        0   ; serial
                        1D  ; refresh
                        1H  ; retry
                        1W  ; expire
                        3H )    ; minimum
        NS  @
        A   127.0.0.1
        AAAA    ::1
        NS  ns.tinder-node-server.com.
    ns  IN A    xxx.xx.xx.193
    www IN A    xxx.xx.xx.193
    email   IN A    xxx.xx.xx.193
    
  3. Configure /var/named/tinder-node-server.com.local

    $TTL 1D
    @   IN SOA  tinder-node-server.com. rname.invalid. (
                        0   ; serial
                        1D  ; refresh
                        1H  ; retry
                        1W  ; expire
                        3H )    ; minimum
        NS  @
        A   127.0.0.1
        AAAA    ::1
        PTR localhost.
        NS  ns.tinder-node-server.com.
    ns  A   xxx.xx.xx.193
    201 PTR www.tinder-node-server.com.
    201 PTR email.tinder-node-server.com.
    
  4. Configure /etc/resolv.conf on the DNS machine

    # Generated by NetworkManager
    search default.svc.cluster.local svc.cluster.local
    nameserver xxx.xx.xx.192
    nameserver 223.6.6.6
    options ndots:2 timeout:2 attempts:2
    
  5. Verify that dnsServer resolution is correct

    nslookup www.tinder-node-server.com xxx.xx.xx.192
    

    dnsserver

Note

  • If an external DNS service resolves the domain name, make sure that the /etc/hosts file of every node, including the bootstrap node, contains no entry for the domain name.
  • Use the nslookup command to check domain name resolution, and make sure that it succeeds on every node, including the bootstrap node.
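The two checks in the note can be scripted and run on every node. The following is a minimal sketch, not part of the DCE 5.0 tooling: the function names hosts_mentions_domain and nslookup_answer are hypothetical, and the parser assumes the common nslookup output layout (a "Name:" line followed by an "Address:" line).

```shell
# Hypothetical helpers, for illustration only.

# Succeed (exit 0) if the hosts file has a non-comment line that mentions the domain.
# $1 = domain name, $2 = hosts file (defaults to /etc/hosts)
hosts_mentions_domain() {
    grep -v '^[[:space:]]*#' "${2:-/etc/hosts}" | grep -qwF -- "$1"
}

# Read nslookup output on stdin and print the first address listed after the "Name:" line.
nslookup_answer() {
    awk '/^Name:/ { found = 1; next } found && /^Address/ { print $NF; exit }'
}

# Usage on a node (commented out because it needs the real DNS server):
#   hosts_mentions_domain www.tinder-node-server.com && echo "remove the entry from /etc/hosts"
#   nslookup www.tinder-node-server.com xxx.xx.xx.192 | nslookup_answer
```

Comparing the printed address with the expected bootstrap node IP on each node gives a quick indication of whether that node sees the same resolution as the dnsServer.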

Install DCE 5.0 Based on External Domain Name Mode

Overall architecture:

dce01

  1. Refer to the installation process in Offline Installation of DCE 5.0 Enterprise

  2. Example of a well-defined clusterConfig file:

    # clusterConfig.yaml
    apiVersion: provision.daocloud.io/v1alpha3
    kind: ClusterConfig
    metadata:
      creationTimestamp: null
    spec:
      clusterName: my-cluster
      bootstrapNode: www.tinder-node-server.com # Based on external domain name mode
    
      masterNodes:
        - nodeName: "g-master1"
          ip: xxx.xx.xx.194
          ansibleUser: "root"
          ansiblePass: "admin"
    
      fullPackagePath: "/home/offline-fix-dns"
      osRepos:
        type: builtin
        isoPath: "/home/CentOS-79-x86_64-DVD-2009.iso"
        osPackagePath: "/home/os-pkgs-centos7-v0.4.8.tar.gz"
    
      imagesAndCharts:
        type: builtin
        additionalSSLSubjectAltName: "xxx.xx.xx.193" # IP address that the domain name resolves to through the DNS service
    
      addonPackage:
      binaries:
        type: builtin
    
  3. Start the installation

    ./dce5-installer cluster-create -c sample/clusterConfig.yaml -m sample/manifest.yaml
    
  4. After the cluster is successfully installed, check the pod image addresses

    check01

  5. Check the hosts file of the global node and the configuration of the coredns pod. Neither should contain additional domain name entries.

    cat /etc/hosts
    

    hosts

    kubectl -n kube-system get cm coredns -o yaml
    

    coredns

Simulate the Activation of the Backup Bootstrap Node

  1. Ensure that the backup bootstrap node has the necessary dependencies installed; refer to Install Tools.

  2. Copy the offline package from the bootstrap node to the backup bootstrap node (for example, with scp).

  3. Prepare the clusterConfig file for the backup bootstrap node. The backup bootstrap node is installed in IP mode (bootstrapNode is set to auto or to the specific IP address of the bootstrap node). Example:

    # clusterConfig.yaml
    apiVersion: provision.daocloud.io/v1alpha3
    kind: ClusterConfig
    metadata:
      creationTimestamp: null
    spec:
      clusterName: my-cluster
      bootstrapNode: 172.30.41.194 # IP-based mode
    
      masterNodes:
        - nodeName: "g-master1"
          ip: xxx.xx.xx.194
          ansibleUser: "root"
          ansiblePass: "admin"
    
      fullPackagePath: "/home/offline-fix-dns"
      osRepos:
        type: builtin
        isoPath: "/home/CentOS-79-x86_64-DVD-2009.iso"
        osPackagePath: "/home/os-pkgs-centos7-v0.4.8.tar.gz"
    
      imagesAndCharts:
        type: builtin
        additionalSSLSubjectAltName: "www.tinder-node-server.com" # Domain name resolved by the dns service
      addonPackage:
      binaries:
        type: builtin
    
  4. Run specific steps 1,2,3,4,5 on the backup bootstrap node

    ./dce5-installer cluster-create -c sample/clusterConfig.yaml -m sample/manifest.yaml -j 1,2,3,4,5
    

    Note

    The -j parameter is required here. It limits the run to the listed steps, which install only the bootstrap node itself.

Test the High Availability of the Bootstrap Node Based on DNS Resolution

Prerequisite: Update dnsServer to switch DNS resolution to the backup bootstrap node and perform verification after the switch.
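On the demonstration dnsServer configured earlier, switching the resolution amounts to editing the A record in the zone file and reloading named. The following is a minimal sketch under that assumption: the function name switch_a_record is hypothetical, while named-checkzone and rndc reload are the standard BIND utilities, shown only as comments because they need a running server.

```shell
# Hypothetical helper: point the A record of one host in a BIND zone file at a new IP.
# $1 = zone file, $2 = host label (for example www), $3 = new IP address
switch_a_record() {
    zone_file=$1
    host=$2
    new_ip=$3
    tmp=$(mktemp) || return 1
    # Match lines such as "www IN A    <ip>" (the IN class is optional) and replace the address.
    if sed -E "s/^(${host}[[:space:]]+(IN[[:space:]]+)?A[[:space:]]+).*/\1${new_ip}/" "$zone_file" > "$tmp"; then
        cat "$tmp" > "$zone_file"   # overwrite in place, keeping owner and permissions
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Usage on the dnsServer (commented out because it needs BIND):
#   switch_a_record /var/named/tinder-node-server.com.zone www xxx.xx.xx.194
#   named-checkzone tinder-node-server.com /var/named/tinder-node-server.com.zone
#   rndc reload
```

If secondary DNS servers replicate the zone, the SOA serial must also be increased so that they pick up the change. Running the same function with the original IP restores the resolution after the primary bootstrap node recovers.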

arch02

check02

  1. Verify that file downloads are normal

    check03

  2. Verify that image pulls are normal

    check04

  3. Verify that the images on both the original bootstrap node and the backup bootstrap node are normal

    check05

  4. Verify that the charts repository is normal

    check06

FAQs

Configure bootstrapNode and AdditionalSubjectAltName Fields

Explanation of the configuration of bootstrapNode and imagesAndCharts.additionalSSLSubjectAltName in clusterConfig.yaml for the bootstrap node and the backup bootstrap node:

Node                            SubjectAltName (bootstrapNode)    AdditionalSubjectAltName
Original bootstrap node (193)   www.tinder-node-server.com        172.30.41.193
Backup bootstrap node (194)     172.30.41.194                     www.tinder-node-server.com
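The pairing above can be sanity-checked before an installation. The following is a rough sketch, not an installer feature: config_field and san_pair_ok are hypothetical helper names, the parsing only handles simple single-line "key: value" entries like those in the examples on this page, and the bootstrapNode value auto is not handled.

```shell
# Hypothetical helpers for checking a clusterConfig.yaml against the pairing described here.

# Print the scalar value of a "key: value" line, stripping quotes and trailing comments.
# $1 = file, $2 = key
config_field() {
    sed -n -E "s/^[[:space:]]*$2:[[:space:]]*\"?([^\"#[:space:]]+)\"?.*/\1/p" "$1" | head -n 1
}

# Succeed if bootstrapNode and additionalSSLSubjectAltName together cover
# both the domain name and the node IP, in either order.
# $1 = file, $2 = domain name, $3 = IP of the node being installed
san_pair_ok() {
    node=$(config_field "$1" bootstrapNode)
    san=$(config_field "$1" additionalSSLSubjectAltName)
    { [ "$node" = "$2" ] && [ "$san" = "$3" ]; } ||
    { [ "$node" = "$3" ] && [ "$san" = "$2" ]; }
}

# Usage on the backup bootstrap node:
#   san_pair_ok sample/clusterConfig.yaml www.tinder-node-server.com 172.30.41.194 \
#       || echo "bootstrapNode / additionalSSLSubjectAltName do not match the expected pair"
```

The check accepts either order because the two nodes mirror each other: one uses the domain name as bootstrapNode and its own IP as the additional name, and the other does the opposite.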

Synchronize the Upgrade of the Backup Bootstrap Node in an Upgrade Scenario

Prerequisites:

  • Restore the DNS resolution to its original state, that is, the domain name points to the original bootstrap node xxx.xx.xx.193.

    Note

    Why restore the resolution to its original state? When the bootstrap node is started or updated in domain name mode with an unmodified clusterConfig, the installer checks whether the given domain name resolves to the IP address of the current node. If it does, nothing is changed; if it does not, the installer updates the hosts file of the bootstrap node so that domain name mode still works. If the resolution is not restored first, the hosts file of the bootstrap node is therefore modified unnecessarily.

  • The bootstrap node and the backup bootstrap node have both downloaded the offline upgrade package.

    1. On both the bootstrap node and the backup bootstrap node, run the following command to upgrade the images, MinIO files, and charts:

      ./dce5-installer cluster-create -c sample/clusterConfig.yaml -m sample/manifest.yaml -u tinder
      

      Before running the command, modify fullPackagePath in clusterConfig.yaml to point to the location of the offline upgrade package.

    2. After the upgrade, check if the images, files, and charts repositories of the bootstrap node and the backup node can be downloaded normally.
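The domain name check described in the prerequisites note can be pictured roughly as follows. This is an illustrative sketch only, not the installer's actual code: the function names are hypothetical, and getent hosts and hostname -I are assumed to be available, as they are on CentOS 7.

```shell
# Succeed if the first argument (an IP) equals any of the remaining arguments.
ip_in_list() {
    wanted=$1
    shift
    for candidate in "$@"; do
        [ "$candidate" = "$wanted" ] && return 0
    done
    return 1
}

# Report what such a check would decide for a domain on the current node.
# $1 = domain name used as bootstrapNode
domain_check() {
    # getent follows the system resolver order, so /etc/hosts entries are consulted too.
    resolved=$(getent hosts "$1" | awk '{ print $1; exit }')
    # hostname -I prints all addresses of this host separated by spaces;
    # the unquoted expansion splits them into separate arguments on purpose.
    if [ -n "$resolved" ] && ip_in_list "$resolved" $(hostname -I); then
        echo "resolves to this node: hosts file left unchanged"
    else
        echo "does not resolve to this node: hosts file would be updated"
    fi
}

# Usage on the original bootstrap node before upgrading:
#   domain_check www.tinder-node-server.com
```

Running the check on the original bootstrap node before the upgrade shows whether the DNS resolution has really been restored.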
