EOF error when deploying an OCP cluster
I am currently deploying an OCP cluster with 3 master nodes. I am using 5 on-prem VMs: 3 as master nodes, 1 as the bootstrap node, and 1 as the bastion host. I installed HAProxy as a layer 4 load balancer and keepalived to provide a VIP for the LB, on all 3 master nodes.
haproxy.cfg:
global
    log /dev/log local0
    log /dev/log local1 notice
    log 127.0.0.1 local2
    pidfile /var/run/haproxy.pid
    maxconn 4000
    chroot /var/lib/haproxy
    stats socket /var/run/haproxy.sock mode 600 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon
defaults
    log global
    mode tcp
    option tcplog
    option dontlognull
    option redispatch
    retries 3
    timeout connect 10s
    timeout client 30s
    timeout server 30s
    timeout check 10s
    maxconn 3000
listen stats
    bind *:9000
    mode http
    stats enable
    stats uri /
    monitor-uri /readyz
    stats show-node
    stats show-legends
    stats refresh 30s
# API Load Balancer
frontend openshift-api-server
    bind *:6443
    default_backend openshift-api-server
    mode tcp
    option tcplog
backend openshift-api-server
    balance roundrobin
    mode tcp
    option ssl-hello-chk
    option tcp-check
    option httpchk GET /readyz
    http-check expect status 200
    default-server inter 5s fall 3 rise 2
    server bootstrap 10.10.2.104:6443 check check-ssl verify none inter 5s
    server master-1 10.10.2.100:6443 check check-ssl verify none inter 5s
    server master-2 10.10.2.101:6443 check check-ssl verify none inter 5s
    server master-3 10.10.2.102:6443 check check-ssl verify none inter 5s
# Machine Config Server
frontend machine-config-server
    bind *:22623
    default_backend machine-config-server
    mode tcp
    option tcplog
backend machine-config-server
    balance roundrobin
    mode tcp
    option ssl-hello-chk
    option tcp-check
    server bootstrap 10.10.2.104:22623 check check-ssl verify none inter 5s
    server master-1 10.10.2.100:22623 check check-ssl verify none inter 5s
    server master-2 10.10.2.101:22623 check check-ssl verify none inter 5s
    server master-3 10.10.2.102:22623 check check-ssl verify none inter 5s
# Application Ingress Load Balancer
frontend ingress-http
    bind *:80
    default_backend ingress-http
    mode tcp
    option tcplog
backend ingress-http
    balance source
    mode tcp
    server master-1 10.10.2.100:80 check
    server master-2 10.10.2.101:80 check
    server master-3 10.10.2.102:80 check
frontend ingress-https
    bind *:443
    default_backend ingress-https
    mode tcp
    option tcplog
backend ingress-https
    balance source
    mode tcp
    server master-1 10.10.2.100:443 check
    server master-2 10.10.2.101:443 check
    server master-3 10.10.2.102:443 check
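Two things in a configuration like this may be worth checking before looking elsewhere. First, as far as the HAProxy documentation describes it, `option ssl-hello-chk`, `option tcp-check`, and `option httpchk` each select the health-check type for a backend, so declaring several in one backend probably leaves only one of them in effect. Second, a wildcard `bind *:6443` on a host that is also listed as a backend server on port 6443 means HAProxy and the backend service would both want the same port on that host. The Python sketch below flags both patterns in a config text. The function names and the `lb_hosts` parameter are made up for illustration, and the parser is deliberately naive (it only understands section keywords, `bind`, `option`, and `server` lines).

```python
# Health-check options that select the check type for a proxy section.
CHECK_OPTIONS = ("ssl-hello-chk", "tcp-check", "httpchk")
SECTION_KEYWORDS = ("global", "defaults", "listen", "frontend", "backend")


def parse_sections(cfg_text):
    """Split haproxy.cfg text into (kind, name, [directive word lists]) tuples."""
    sections = []
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and indentation
        if not line:
            continue
        words = line.split()
        if words[0] in SECTION_KEYWORDS:
            name = words[1] if len(words) > 1 else ""
            sections.append((words[0], name, []))
        elif sections:
            sections[-1][2].append(words)
    return sections


def competing_checks(sections):
    """Return {section name: [check options]} where more than one is declared."""
    result = {}
    for kind, name, directives in sections:
        if kind not in ("backend", "listen"):
            continue
        found = [d[1] for d in directives
                 if len(d) >= 2 and d[0] == "option" and d[1] in CHECK_OPTIONS]
        if len(found) > 1:
            result[name] = found
    return result


def self_port_overlaps(sections, lb_hosts):
    """Return (port, server name) pairs where a wildcard bind port is also the
    port of a backend server that lives on one of the load balancer hosts."""
    bind_ports = set()
    for _kind, _name, directives in sections:
        for d in directives:
            if d[0] == "bind" and len(d) >= 2 and d[1].startswith("*:"):
                bind_ports.add(d[1][2:])
    overlaps = set()
    for _kind, _name, directives in sections:
        for d in directives:
            if d[0] == "server" and len(d) >= 3 and ":" in d[2]:
                host, port = d[2].rsplit(":", 1)
                if host in lb_hosts and port in bind_ports:
                    overlaps.add((int(port), d[1]))
    return sorted(overlaps)


# Excerpt of the API section from the post, used as sample input.
SAMPLE = """
frontend openshift-api-server
    bind *:6443
    default_backend openshift-api-server
backend openshift-api-server
    option ssl-hello-chk
    option tcp-check
    option httpchk GET /readyz
    server bootstrap 10.10.2.104:6443 check check-ssl verify none inter 5s
    server master-1 10.10.2.100:6443 check check-ssl verify none inter 5s
"""

sections = parse_sections(SAMPLE)
print(competing_checks(sections))
# lb_hosts is an assumption: the post says HAProxy runs on the three masters.
print(self_port_overlaps(sections, {"10.10.2.100", "10.10.2.101", "10.10.2.102"}))
```

On the excerpt, the first call reports three check options on `openshift-api-server` and the second reports that `master-1` is expected to serve port 6443 on a host where HAProxy also binds it. Whether either finding explains the EOF is not established here; they are only candidates to rule out.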
keepalived.conf:
global_defs {
    router_id OCP
    enable_script_security
}
vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
}
vrrp_instance VI_1 {
    state MASTER
    interface ens33
    virtual_router_id 51
    priority 101
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass kmtls123
    }
    virtual_ipaddress {
        10.10.2.250/24
    }
    track_script {
        chk_haproxy
    }
}
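To reason about which node ends up holding the VIP 10.10.2.250, here is a small Python sketch of how a tracked script with a positive `weight` is documented to affect VRRP priority: the weight is added while the script succeeds, and a negative weight is subtracted while it fails. The sketch assumes the same file (priority 101, weight 2) is deployed on all three masters, which the post does not state, and it breaks ties in favour of the higher IP address as VRRP does. It ignores preemption settings, timing, and the `weight 0` case, and the function names are illustrative only.

```python
import ipaddress


def effective_priority(base, weight, script_ok):
    """Priority after applying a tracked script's weight."""
    if weight > 0 and script_ok:
        return base + weight  # positive weight: bonus while the script succeeds
    if weight < 0 and not script_ok:
        return base + weight  # negative weight: penalty while the script fails
    return base


def elect_master(nodes):
    """nodes maps an IP string to (base, weight, script_ok).

    The highest effective priority wins; ties go to the higher IP address.
    """
    return max(
        nodes,
        key=lambda ip: (effective_priority(*nodes[ip]), ipaddress.ip_address(ip)),
    )


# Assumed scenario: identical config on every master, HAProxy running everywhere.
all_healthy = {
    "10.10.2.100": (101, 2, True),
    "10.10.2.101": (101, 2, True),
    "10.10.2.102": (101, 2, True),
}
# Same scenario, but `killall -0 haproxy` fails on 10.10.2.102.
haproxy_down_on_102 = dict(all_healthy, **{"10.10.2.102": (101, 2, False)})

print(elect_master(all_healthy))
print(elect_master(haproxy_down_on_102))
```

Under these assumptions every healthy node sits at priority 103, so the VIP would land on 10.10.2.102 and move to 10.10.2.101 if HAProxy stops there. Running `ip addr show ens33` on each master shows where the VIP actually is.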
After that I configured DNS, created the manifest and ignition files, and monitored the deployment with the command ./openshift-install --dir ocp/ wait-for bootstrap-complete --log-level=info
but ran into this error:
time="2024-12-30T11:35:36+07:00" level=debug msg="OpenShift Installer 4.17.7"
time="2024-12-30T11:35:36+07:00" level=debug msg="Built from commit 433937af28ce3755671643ad73999544ab4e2a61"
time="2024-12-30T11:35:36+07:00" level=info msg="Waiting up to 20m0s (until 11:55AM +07) for the Kubernetes API at https://api.test.dat-tech.site:6443..."
time="2024-12-30T11:35:36+07:00" level=debug msg="Loading Agent Config..."
time="2024-12-30T11:35:46+07:00" level=debug msg="Still waiting for the Kubernetes API: Get \"https://api.test.dat-tech.site:6443/version\": EOF"
time="2024-12-30T11:38:17+07:00" level=debug msg="Still waiting for the Kubernetes API: Get \"https://api.test.dat-tech.site:6443/version\": EOF"
time="2024-12-30T11:40:47+07:00" level=debug msg="Still waiting for the Kubernetes API: Get \"https://api.test.dat-tech.site:6443/version\": EOF"
time="2024-12-30T11:43:17+07:00" level=debug msg="Still waiting for the Kubernetes API: Get \"https://api.test.dat-tech.site:6443/version\": EOF"
time="2024-12-30T11:45:48+07:00" level=debug msg="Still waiting for the Kubernetes API: Get \"https://api.test.dat-tech.site:6443/version\": EOF"
time="2024-12-30T11:48:18+07:00" level=debug msg="Still waiting for the Kubernetes API: Get \"https://api.test.dat-tech.site:6443/version\": EOF"
time="2024-12-30T11:50:48+07:00" level=debug msg="Still waiting for the Kubernetes API: Get \"https://api.test.dat-tech.site:6443/version\": EOF"
time="2024-12-30T11:53:19+07:00" level=debug msg="Still waiting for the Kubernetes API: Get \"https://api.test.dat-tech.site:6443/version\": EOF"
time="2024-12-30T11:55:49+07:00" level=error msg="Attempted to gather ClusterOperator status after wait failure: listing ClusterOperator objects: Get \"https://api.test.dat-tech.site:6443/apis/config.openshift.io/v1/clusteroperators\": EOF"
time="2024-12-30T11:55:49+07:00" level=info msg="Use the following commands to gather logs from the cluster"
time="2024-12-30T11:55:49+07:00" level=info msg="openshift-install gather bootstrap --help"
time="2024-12-30T11:55:49+07:00" level=error msg="Bootstrap failed to complete: Get \"https://api.test.dat-tech.site:6443/version\": EOF"
time="2024-12-30T11:55:49+07:00" level=error msg="Failed waiting for Kubernetes API. This error usually happens when there is a problem on the bootstrap host that prevents creating a temporary control plane."
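An `EOF` from the installer's HTTP client generally means the TCP connection was opened but then closed before any response arrived. With a TCP load balancer in front, one possible explanation is that the load balancer accepts the connection and drops it because no backend is usable, so it can help to probe the VIP and each backend separately. The Python sketch below separates "TCP connect failed" from "connected, but the TLS handshake failed or the peer closed the connection" from "TLS handshake succeeded". The function name `probe_tls` is made up for illustration, and certificate verification is disabled on purpose because this is only a reachability probe.

```python
import socket
import ssl


def probe_tls(host, port, timeout=5.0):
    """Classify an endpoint as tcp-failed, tls-failed, or tls-ok."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        # Covers refused connections, timeouts, and DNS resolution errors.
        return f"tcp-failed: {type(exc).__name__}"

    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # reachability probe only,
    ctx.verify_mode = ssl.CERT_NONE  # so the certificate is not verified

    try:
        # The handshake runs inside wrap_socket; a peer that closes the
        # connection at this point surfaces as an SSLError or another OSError.
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return f"tls-ok: {tls.version()}"
    except OSError as exc:  # ssl.SSLError is a subclass of OSError
        return f"tls-failed: {type(exc).__name__}"
```

For the setup in this post, calling `probe_tls("10.10.2.250", 6443)` for the VIP and `probe_tls("10.10.2.104", 6443)` for the bootstrap node would show whether the connection dies at the load balancer or at the bootstrap API server itself. A `tls-failed` result on the VIP together with `tcp-failed` on the bootstrap node would point at the bootstrap host rather than at HAProxy.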
I have done a lot of troubleshooting and research but still have not been able to fix the error above. I would really appreciate any help.