Asked Tuesday, 1:56 AM

EOF error when deploying an OCP cluster


I am currently deploying an OCP cluster with 3 master nodes. I am using 5 on-prem VMs: 3 as master nodes, 1 as the bootstrap node, and 1 as the bastion host. I installed HAProxy as a layer 4 load balancer and keepalived to provide a VIP for the load balancer, on all 3 master nodes.

haproxy.cfg:

global
    log /dev/log local0
    log /dev/log local1 notice
    log 127.0.0.1 local2
    pidfile /var/run/haproxy.pid
    maxconn 4000
    chroot /var/lib/haproxy
    stats socket /var/run/haproxy.sock mode 600 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

defaults
    log global
    mode tcp
    option tcplog
    option dontlognull
    option redispatch
    retries 3
    timeout connect 10s
    timeout client 30s
    timeout server 30s
    timeout check 10s
    maxconn 3000

listen stats
    bind *:9000
    mode http
    stats enable
    stats uri /
    monitor-uri /readyz
    stats show-node
    stats show-legends
    stats refresh 30s

# API Load Balancer
frontend openshift-api-server
    bind *:6443
    default_backend openshift-api-server
    mode tcp
    option tcplog

backend openshift-api-server
    balance roundrobin
    mode tcp
    option ssl-hello-chk
    option tcp-check
    option httpchk GET /readyz
    http-check expect status 200
    default-server inter 5s fall 3 rise 2
    server bootstrap 10.10.2.104:6443 check check-ssl verify none inter 5s
    server master-1 10.10.2.100:6443 check check-ssl verify none inter 5s
    server master-2 10.10.2.101:6443 check check-ssl verify none inter 5s
    server master-3 10.10.2.102:6443 check check-ssl verify none inter 5s

# Machine Config Server
frontend machine-config-server
    bind *:22623
    default_backend machine-config-server
    mode tcp
    option tcplog

backend machine-config-server
    balance roundrobin
    mode tcp
    option ssl-hello-chk
    option tcp-check
    server bootstrap 10.10.2.104:22623 check check-ssl verify none inter 5s
    server master-1 10.10.2.100:22623 check check-ssl verify none inter 5s
    server master-2 10.10.2.101:22623 check check-ssl verify none inter 5s
    server master-3 10.10.2.102:22623 check check-ssl verify none inter 5s

# Application Ingress Load Balancer
frontend ingress-http
    bind *:80
    default_backend ingress-http
    mode tcp
    option tcplog

backend ingress-http
    balance source
    mode tcp
    server master-1 10.10.2.100:80 check
    server master-2 10.10.2.101:80 check
    server master-3 10.10.2.102:80 check

frontend ingress-https
    bind *:443
    default_backend ingress-https
    mode tcp
    option tcplog

backend ingress-https
    balance source
    mode tcp
    server master-1 10.10.2.100:443 check
    server master-2 10.10.2.101:443 check
    server master-3 10.10.2.102:443 check
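
One thing that can be checked mechanically in the configuration above is whether a backend stacks several health-check options. As far as I understand HAProxy, a proxy runs a single health-check method and a later check `option` replaces an earlier one, so stacking them is at least ambiguous. The following is a minimal Python sketch, not an HAProxy tool: the function name `stacked_check_options` and the list of options it looks for are assumptions made for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Health-check options looked for; an illustrative subset, not HAProxy's full list.
CHECK_OPTIONS = ("ssl-hello-chk", "tcp-check", "httpchk")
SECTION_RE = re.compile(r"^(global|defaults|frontend|backend|listen)\b\s*(\S*)")


def stacked_check_options(cfg_text: str) -> Dict[str, List[str]]:
    """Return {section: [check options]} for sections declaring more than one."""
    found = defaultdict(list)
    section = None
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and indentation
        if not line:
            continue
        match = SECTION_RE.match(line)
        if match:
            section = f"{match.group(1)} {match.group(2)}".strip()
            continue
        parts = line.split()
        if section and len(parts) >= 2 and parts[0] == "option" and parts[1] in CHECK_OPTIONS:
            found[section].append(parts[1])
    return {name: opts for name, opts in found.items() if len(opts) > 1}
```

Run on the file above, for example with `stacked_check_options(open("haproxy.cfg").read())` (the file path is an assumption), this would list `backend openshift-api-server` with three check options and `backend machine-config-server` with two.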

keepalived.conf:

global_defs {
    router_id OCP
    enable_script_security
}

vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
}

vrrp_instance VI_1 {
    state MASTER
    interface ens33
    virtual_router_id 51
    priority 101
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass kmtls123
    }
    virtual_ipaddress {
        10.10.2.250/24
    }
    track_script {
        chk_haproxy
    }
}
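
To make the effect of `weight 2` in `chk_haproxy` concrete, here is a small Python sketch of how keepalived adjusts priority for a tracked script: a positive weight is added while the script succeeds, a negative weight is added (lowering the priority) while it fails. Only `priority 101` and `weight 2` come from the file above. The base priorities 100 and 99 for the other two nodes are assumptions, since their keepalived.conf files are not shown, and the helper names are hypothetical. Tie-breaking and `nopreempt` are not modelled.

```python
from typing import Dict


def effective_priority(base: int, weight: int, script_ok: bool) -> int:
    """Priority after applying a tracked script's weight."""
    if weight > 0:
        return base + weight if script_ok else base
    if weight < 0:
        return base if script_ok else base + weight
    raise ValueError("weight 0 (instance goes to FAULT on failure) is not modelled here")


def vip_holder(base_priorities: Dict[str, int], weight: int,
               haproxy_up: Dict[str, bool]) -> str:
    """Node with the highest effective priority, assuming preemption is enabled."""
    scores = {
        node: effective_priority(prio, weight, haproxy_up[node])
        for node, prio in base_priorities.items()
    }
    return max(scores, key=scores.get)


# 101 is from the post; 100 and 99 are assumed values for the other nodes.
nodes = {"master-1": 101, "master-2": 100, "master-3": 99}
print(vip_holder(nodes, 2, {"master-1": True, "master-2": True, "master-3": True}))
print(vip_holder(nodes, 2, {"master-1": False, "master-2": True, "master-3": True}))
```

Under these assumed priorities the VIP moves only if the gap between two nodes' base priorities is smaller than the weight, so the priorities chosen on the other nodes matter.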

After that I configured DNS, created the manifest and ignition files, and then monitored the deployment with ./openshift-install --dir ocp/ wait-for bootstrap-complete --log-level=info, but I ran into this error:

time="2024-12-30T11:35:36+07:00" level=debug msg="OpenShift Installer 4.17.7"
time="2024-12-30T11:35:36+07:00" level=debug msg="Built from commit 433937af28ce3755671643ad73999544ab4e2a61"
time="2024-12-30T11:35:36+07:00" level=info msg="Waiting up to 20m0s (until 11:55AM +07) for the Kubernetes API at https://api.test.dat-tech.site:6443..."
time="2024-12-30T11:35:36+07:00" level=debug msg="Loading Agent Config..."
time="2024-12-30T11:35:46+07:00" level=debug msg="Still waiting for the Kubernetes API: Get \"https://api.test.dat-tech.site:6443/version\": EOF"
time="2024-12-30T11:38:17+07:00" level=debug msg="Still waiting for the Kubernetes API: Get \"https://api.test.dat-tech.site:6443/version\": EOF"
time="2024-12-30T11:40:47+07:00" level=debug msg="Still waiting for the Kubernetes API: Get \"https://api.test.dat-tech.site:6443/version\": EOF"
time="2024-12-30T11:43:17+07:00" level=debug msg="Still waiting for the Kubernetes API: Get \"https://api.test.dat-tech.site:6443/version\": EOF"
time="2024-12-30T11:45:48+07:00" level=debug msg="Still waiting for the Kubernetes API: Get \"https://api.test.dat-tech.site:6443/version\": EOF"
time="2024-12-30T11:48:18+07:00" level=debug msg="Still waiting for the Kubernetes API: Get \"https://api.test.dat-tech.site:6443/version\": EOF"
time="2024-12-30T11:50:48+07:00" level=debug msg="Still waiting for the Kubernetes API: Get \"https://api.test.dat-tech.site:6443/version\": EOF"
time="2024-12-30T11:53:19+07:00" level=debug msg="Still waiting for the Kubernetes API: Get \"https://api.test.dat-tech.site:6443/version\": EOF"
time="2024-12-30T11:55:49+07:00" level=error msg="Attempted to gather ClusterOperator status after wait failure: listing ClusterOperator objects: Get \"https://api.test.dat-tech.site:6443/apis/config.openshift.io/v1/clusteroperators\": EOF"
time="2024-12-30T11:55:49+07:00" level=info msg="Use the following commands to gather logs from the cluster"
time="2024-12-30T11:55:49+07:00" level=info msg="openshift-install gather bootstrap --help"
time="2024-12-30T11:55:49+07:00" level=error msg="Bootstrap failed to complete: Get \"https://api.test.dat-tech.site:6443/version\": EOF"
time="2024-12-30T11:55:49+07:00" level=error msg="Failed waiting for Kubernetes API. This error usually happens when there is a problem on the bootstrap host that prevents creating a temporary control plane."
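
In Go clients such as the installer, a bare `EOF` generally means the connection was opened and then closed by the peer before any response arrived, which is different from a refused connection or a timeout. A way to narrow this down is to compare what a TLS client sees through the load-balanced address and directly on a backend. The sketch below is a minimal Python probe under that assumption. `probe_tls` is a hypothetical helper, it skips certificate verification, and it only classifies the connection outcome.

```python
import socket
import ssl


def probe_tls(host: str, port: int, timeout: float = 5.0) -> str:
    """Classify what a TLS client sees when connecting to host:port."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"  # nothing is listening on that port
    except (socket.timeout, TimeoutError):
        return "timeout"  # packets dropped or host unreachable
    except OSError as exc:
        return f"connect-error: {exc.__class__.__name__}"

    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # diagnostic only: do not verify the certificate
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with sock, ctx.wrap_socket(sock, server_hostname=host) as tls:
            return f"tls-ok ({tls.version()})"
    except (ssl.SSLError, OSError) as exc:
        # TCP connected, but the peer closed or reset before TLS completed.
        return f"handshake-failed: {exc.__class__.__name__}"
```

For example, `probe_tls("10.10.2.250", 6443)` goes through the VIP and `probe_tls("10.10.2.104", 6443)` goes to the bootstrap node directly. If the direct probe completes the handshake while the VIP probe does not, the load balancer path is the more likely suspect. If both fail, the bootstrap node itself is worth inspecting first.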

I have done a lot of troubleshooting and research but still have not been able to fix the error above. I would appreciate any help.
