I. Version: v3.11.2, deployed in high-availability mode.
II. One compute node is offline
1. Pod information:
[root@master1 ~]# kubectl get pods -n onecloud -owide -w | grep node10
default-host-deployer-h7zff   0/1   CrashLoopBackOff        251   19h   172.16.1.234   node10   <none>   <none>
default-host-health-t9vc5     0/1   CrashLoopBackOff        251   19h   172.16.1.234   node10   <none>   <none>
default-host-image-gbgn5      0/1   CrashLoopBackOff        251   19h   172.16.1.234   node10   <none>   <none>
default-host-xvztn            1/3   CrashLoopBackOff        494   19h   172.16.1.234   node10   <none>   <none>
default-telegraf-f52sw        0/1   Init:CrashLoopBackOff   228   19h   172.16.1.234   node10   <none>   <none>
2. host logs:
[root@master1 ~]# kubectl logs default-host-xvztn -n onecloud -c host
[info 240520 02:46:56 procutils.WaitZombieLoop(zombie_others.go:36)] My pid is not 1 and no need to wait zombies
[info 240520 02:46:56 options.parseOptions(options.go:334)] Use configuration file: /etc/yunion/host.conf
[info 240520 02:46:56 options.parseOptions(options.go:357)] Set log level to "info"
[info 2024-05-20 02:46:56 options.parseOptions(options.go:334)] Use configuration file: /etc/yunion/common/common.conf
[info 2024-05-20 02:46:56 options.parseOptions(options.go:357)] Set log level to "info"
[info 2024-05-20 02:46:56 hostman.(*SHostService).InitService(host_services.go:64)] exec socket path: /var/run/onecloud/exec.sock
[info 2024-05-20 02:46:56 app.InitApp(app.go:32)] RequestWorkerCount: 8
[info 2024-05-20 02:46:56 appsrv.NewApplication(appsrv.go:121)] App hostId: 4bhtR-oqKZELSL1qp4GCmt0ZpOM= (host,node10,172.16.1.234)
2024/05/20 02:46:56 Allow hosts []
[info 2024-05-20 02:46:56 appsrv.(*Application).SetDefaultTimeout(appsrv.go:137)] adjust application default timeout to 60.000000 seconds
[info 2024-05-20 02:46:56 hostinfo.DetectCpuInfo(hostinfohelper.go:78)] cpuinfo freq 2700
[info 2024-05-20 02:46:56 hostinfo.NewHostInfo(hostinfo.go:2446)] CPU Model Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz Microcode 0x2006e05
[info 2024-05-20 02:46:56 hostinfo.NewHostInfo(hostinfo.go:2466)] Get kubelet container image Fs: /opt/docker, eviction config: {"evictionHard":{"imagefs.available":{"Signal":"imagefs.available","Operator":"LessThan","Value":{"Quantity":null,"Percentage":0.05}},"memory.available":{"Signal":"memory.available","Operator":"LessThan","Value":{"Quantity":"100Mi","Percentage":0}},"nodefs.available":{"Signal":"nodefs.available","Operator":"LessThan","Value":{"Quantity":null,"Percentage":0.05}},"nodefs.inodesFree":{"Signal":"nodefs.inodesFree","Operator":"LessThan","Value":{"Quantity":null,"Percentage":0.05}}}}
[error 2024-05-20 02:46:59 fileutils2.GetAllBlkdevsIoSchedulers(fileutils.go:171)] no block device avaiable
[info 2024-05-20 02:46:59 hostinfo.(*SHostInfo).prepareEnv(hostinfo.go:411)] I/O Scheduler switch to none
[info 2024-05-20 02:46:59 hostinfo.(*SHostInfo).getKubeReservedMemMb(hostinfo.go:1572)] Kubelet memory threshold subtracted: 100MB
[info 2024-05-20 02:46:59 hostinfo.(*SHostInfo).Init(hostinfo.go:196)] Start detectHostInfo
[info 2024-05-20 02:46:59 hostinfo.(*SHostInfo).detectKVMMaxCpus(hostinfo.go:885)] KVM API VERSION 12
[info 2024-05-20 02:46:59 hostinfo.(*SHostInfo).detectKVMMaxCpus(hostinfo.go:890)] KVM CAP MAX VCPUS: 288
[info 2024-05-20 02:46:59 hostinfo.(*SHostInfo).detectKVMMaxCpus(hostinfo.go:898)] KVM CAP NR VCPUS: 240
[info 2024-05-20 02:46:59 sysutils.detectNestSupport(kvm.go:146)] Host is support kvm nest ...
[info 2024-05-20 02:46:59 sysutils.detectNestSupport(kvm.go:151)] Host kvm nest is enabled ...
[info 2024-05-20 02:46:59 hostinfo.(*SHostInfo).detectOsDist(hostinfo.go:778)] DetectOsDist CentOS Linux 7.9.2009
[info 2024-05-20 02:46:59 hostinfo.(*SHostInfo).detectQemuVersion(hostinfo.go:852)] Detect qemu version is 4.2.0
[info 2024-05-20 02:46:59 hostinfo.(*SHostInfo).detectOvsVersion(hostinfo.go:993)] Detect OVS version is 2.12.4
[info 2024-05-20 02:46:59 hostinfo.(*SHostInfo).detectOvsKOVersion(hostinfo.go:1010)] kernel module openvswitch vermagic: 5.4.130-1.yn20230805.el7.x86_64 SMP mod_unload modversions
[info 2024-05-20 02:46:59 hostinfo.(*SHostInfo).Init(hostinfo.go:205)] Start parseConfig
[info 2024-05-20 02:46:59 hostinfo.NewNIC(hostinfohelper.go:241)] IP 172.16.1.234/br0/bond1
[info 2024-05-20 02:46:59 hostbridge.(*SBaseBridgeDriver).ConfirmToConfig(hostbridge.go:180)] bridge br0 already has ip 172.16.1.234
[info 2024-05-20 02:46:59 hostinfo.NewNIC(hostinfohelper.go:291)] Confirm to configuration!!
[info 2024-05-20 02:46:59 hostinfo.NewNIC(hostinfohelper.go:241)] IP 10.0.1.234/br1/bond0
[info 2024-05-20 02:46:59 hostbridge.(*SBaseBridgeDriver).ConfirmToConfig(hostbridge.go:180)] bridge br1 already has ip 10.0.1.234
[info 2024-05-20 02:46:59 hostinfo.NewNIC(hostinfohelper.go:291)] Confirm to configuration!!
[info 2024-05-20 02:46:59 hostinfo.(*SNIC).SetupDhcpRelay(hostinfohelper.go:203)] Not enable dhcp relay on nic: &hostinfo.SNIC{Inter:"bond1", Bridge:"br0", Ip:"172.16.1.234", Wire:"", WireId:"", Mask:24, Bandwidth:1000, BridgeDev:(*hostbridge.SOVSBridgeDriver)(0xc00151ec60), dhcpServer:(*hostdhcp.SGuestDHCPServer)(0xc00151f5f0)}
[info 2024-05-20 02:46:59 hostinfo.(*SNIC).SetupDhcpRelay(hostinfohelper.go:203)] Not enable dhcp relay on nic: &hostinfo.SNIC{Inter:"bond0", Bridge:"br1", Ip:"10.0.1.234", Wire:"", WireId:"", Mask:24, Bandwidth:1000, BridgeDev:(*hostbridge.SOVSBridgeDriver)(0xc0016e5590), dhcpServer:(*hostdhcp.SGuestDHCPServer)(0xc0016e5ec0)}
[info 2024-05-20 02:46:59 hostinfo.(*SHostInfo).setupOvnChassis(hostinfo.go:223)] Start setting up ovn chassis
goroutine 1 [running]:
runtime/debug.Stack()
        /usr/lib/go/src/runtime/debug/stack.go:24 +0x65
runtime/debug.PrintStack()
        /usr/lib/go/src/runtime/debug/stack.go:16 +0x19
yunion.io/x/onecloud/pkg/util/ovnutils.InitOvn.func1()
        /root/go/src/yunion.io/x/onecloud/pkg/util/ovnutils/ovnutils.go:125 +0x3b
panic({0x2c24140, 0xc000b9e810})
        /usr/lib/go/src/runtime/panic.go:838 +0x207
yunion.io/x/onecloud/pkg/util/ovnutils.mustPrepOvsdbConfig({{0xc0016b9b40, 0x1b}, {0xc0016b7fa8, 0x5}, {0x0, 0x0}, {0xc0016b7f80, 0xa}, 0x5dc, {0xc0016b7fd0, ...}, ...})
        /root/go/src/yunion.io/x/onecloud/pkg/util/ovnutils/ovnutils.go:93 +0x645
yunion.io/x/onecloud/pkg/util/ovnutils.InitOvn({{0xc0016b9b40, 0x1b}, {0xc0016b7fa8, 0x5}, {0x0, 0x0}, {0xc0016b7f80, 0xa}, 0x5dc, {0xc0016b7fd0, ...}, ...})
        /root/go/src/yunion.io/x/onecloud/pkg/util/ovnutils/ovnutils.go:130 +0xb8
yunion.io/x/onecloud/pkg/hostman/hostinfo.(*OvnHelper).Init(...)
        /root/go/src/yunion.io/x/onecloud/pkg/hostman/hostinfo/hostovn.go:41
yunion.io/x/onecloud/pkg/hostman/hostinfo.(*SHostInfo).setupOvnChassis(0xc000e82000?)
        /root/go/src/yunion.io/x/onecloud/pkg/hostman/hostinfo/hostinfo.go:225 +0xb8
yunion.io/x/onecloud/pkg/hostman/hostinfo.(*SHostInfo).Init(0x5674ad0?)
        /root/go/src/yunion.io/x/onecloud/pkg/hostman/hostinfo/hostinfo.go:210 +0xdc
yunion.io/x/onecloud/pkg/hostman.(*SHostService).RunService(0xc000010160?)
        /root/go/src/yunion.io/x/onecloud/pkg/hostman/host_services.go:80 +0x6f
yunion.io/x/onecloud/pkg/cloudcommon/service.(*SServiceBase).StartService(0xc00000e108)
        /root/go/src/yunion.io/x/onecloud/pkg/cloudcommon/service/services.go:58 +0xe4
yunion.io/x/onecloud/pkg/hostman.StartService(...)
        /root/go/src/yunion.io/x/onecloud/pkg/hostman/host_services.go:163
main.main()
        /root/go/src/yunion.io/x/onecloud/cmd/host/main.go:30 +0x10a
goroutine 1 [running]:
runtime/debug.Stack()
        /usr/lib/go/src/runtime/debug/stack.go:24 +0x65
runtime/debug.PrintStack()
        /usr/lib/go/src/runtime/debug/stack.go:16 +0x19
yunion.io/x/log.Fatalf({0x30fd118, 0x1c}, {0xc0016dfea8, 0x1, 0x1})
        /root/go/src/yunion.io/x/onecloud/vendor/yunion.io/x/log/log.go:138 +0x32
yunion.io/x/onecloud/pkg/hostman.(*SHostService).RunService(0xc000010160?)
        /root/go/src/yunion.io/x/onecloud/pkg/hostman/host_services.go:81 +0xb4
yunion.io/x/onecloud/pkg/cloudcommon/service.(*SServiceBase).StartService(0xc00000e108)
        /root/go/src/yunion.io/x/onecloud/pkg/cloudcommon/service/services.go:58 +0xe4
yunion.io/x/onecloud/pkg/hostman.StartService(...)
        /root/go/src/yunion.io/x/onecloud/pkg/hostman/host_services.go:163
main.main()
        /root/go/src/yunion.io/x/onecloud/cmd/host/main.go:30 +0x10a
[fatal 2024-05-20 02:46:59 hostman.(*SHostService).RunService(host_services.go:81)] Host instance init error: Setup OVN Chassis: normalize db host: dns lookup (default-ovn-north) failed: lookup default-ovn-north on 10.96.0.10:53: no such host
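The last line of the log suggests the crash comes from name resolution rather than from the bridge or OVS setup: the host service could not resolve `default-ovn-north` through the resolver at 10.96.0.10:53. As a first troubleshooting step, the sketch below extracts the failing name and resolver from such a log line and offers a helper to retry the lookup. The function names `parse_lookup_failure` and `can_resolve` are made up for illustration, and the regular expression assumes the error text has exactly the form shown in the log above.

```python
import re
import socket

# Matches the resolver error text as it appears in the fatal log line,
# e.g. "lookup default-ovn-north on 10.96.0.10:53: no such host".
LOOKUP_RE = re.compile(
    r"lookup (?P<name>\S+) on (?P<server>[\d.]+):(?P<port>\d+): (?P<reason>.+)$"
)


def parse_lookup_failure(line):
    """Return the failing name, resolver address, port and reason, or None."""
    match = LOOKUP_RE.search(line.strip())
    if match is None:
        return None
    return {
        "name": match["name"],
        "server": match["server"],
        "port": int(match["port"]),
        "reason": match["reason"],
    }


def can_resolve(name):
    """Try to resolve `name` with the system resolver of the machine running this.

    Note: this uses whatever /etc/resolv.conf (or equivalent) is in effect
    locally, which is not necessarily the resolver named in the log line.
    """
    try:
        socket.getaddrinfo(name, None)
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    fatal_line = (
        "[fatal 2024-05-20 02:46:59 hostman.(*SHostService).RunService(host_services.go:81)] "
        "Host instance init error: Setup OVN Chassis: normalize db host: "
        "dns lookup (default-ovn-north) failed: "
        "lookup default-ovn-north on 10.96.0.10:53: no such host"
    )
    print(parse_lookup_failure(fatal_line))
```

Calling `can_resolve("default-ovn-north")` on node10 and on a healthy compute node would show whether the failure is specific to the rebooted node. It would also be worth checking that a Service named `default-ovn-north` still exists in the `onecloud` namespace and that the cluster DNS pods are healthy, since a short name like this normally resolves only through the cluster DNS and its search domains.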
3. ifconfig output on the compute node:
[root@node10 ~]# ifconfig
bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST> mtu 1500
        ether 9c:74:1a:c1:89:46 txqueuelen 1000 (Ethernet)
        RX packets 5903 bytes 868402 (848.0 KiB)
        RX errors 0 dropped 6 overruns 0 frame 0
        TX packets 43 bytes 2870 (2.8 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

bond1: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST> mtu 1500
        ether 04:42:1a:cb:4b:6a txqueuelen 1000 (Ethernet)
        RX packets 20225 bytes 5402646 (5.1 MiB)
        RX errors 0 dropped 6 overruns 0 frame 0
        TX packets 11854 bytes 1268500 (1.2 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

br0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 172.16.1.234 netmask 255.255.255.0 broadcast 172.16.1.255
        inet6 fe80::642:1aff:fecb:4b6a prefixlen 64 scopeid 0x20<link>
        ether 04:42:1a:cb:4b:6a txqueuelen 1000 (Ethernet)
        RX packets 14997 bytes 4428205 (4.2 MiB)
        RX errors 0 dropped 249 overruns 0 frame 0
        TX packets 10986 bytes 1159598 (1.1 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

br1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 10.0.1.234 netmask 255.255.255.0 broadcast 10.0.1.255
        inet6 fe80::9e74:1aff:fec1:8946 prefixlen 64 scopeid 0x20<link>
        ether 9c:74:1a:c1:89:46 txqueuelen 1000 (Ethernet)
        RX packets 5361 bytes 724815 (707.8 KiB)
        RX errors 0 dropped 289 overruns 0 frame 0
        TX packets 19 bytes 1282 (1.2 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

eno1: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500
        ether 04:42:1a:cb:4b:6a txqueuelen 1000 (Ethernet)
        RX packets 2969 bytes 178698 (174.5 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
        device memory 0xd3620000-d363ffff

eno2: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500
        ether 04:42:1a:cb:4b:6a txqueuelen 1000 (Ethernet)
        RX packets 17265 bytes 5224750 (4.9 MiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 11868 bytes 1272200 (1.2 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
        device memory 0xd3600000-d361ffff

enp28s0f0: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500
        ether 9c:74:1a:c1:89:46 txqueuelen 1000 (Ethernet)
        RX packets 1332 bytes 330415 (322.6 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 22 bytes 1428 (1.3 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

enp28s0f1: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500
        ether 9c:74:1a:c1:89:46 txqueuelen 1000 (Ethernet)
        RX packets 4571 bytes 537987 (525.3 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 21 bytes 1442 (1.4 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

genev_sys_6081: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 65000
        inet6 fe80::ec61:f8ff:fe76:a380 prefixlen 64 scopeid 0x20<link>
        ether ee:61:f8:76:a3:80 txqueuelen 1000 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 13 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10<host>
        loop txqueuelen 1000 (Local Loopback)
        RX packets 306 bytes 18416 (17.9 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 306 bytes 18416 (17.9 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
4. Network settings in host.conf:
ovn_encap_ip: 10.0.1.234
networks:
- bond1/br0/172.16.1.234
- bond0/br1/10.0.1.234
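To rule out a mismatch between host.conf and the node's actual addresses, the `networks` entries can be compared with the bridge IPs reported by ifconfig. The sketch below assumes each entry has the form `interface/bridge/ip`, which is how the host log above reads them (for example `Inter:"bond1", Bridge:"br0", Ip:"172.16.1.234"`). The helper names are hypothetical, and the observed addresses are typed in by hand from the ifconfig output rather than collected automatically.

```python
def parse_network_entry(entry):
    """Split an 'interface/bridge/ip' entry from host.conf into its parts."""
    interface, bridge, ip = entry.strip().split("/")
    return {"interface": interface, "bridge": bridge, "ip": ip}


def find_mismatches(entries, bridge_ips):
    """Return (entry, observed_ip) pairs where the bridge IP differs from the config.

    `bridge_ips` maps a bridge name to the IPv4 address seen on the node;
    a bridge missing from the mapping is reported with observed_ip None.
    """
    mismatches = []
    for entry in entries:
        net = parse_network_entry(entry)
        observed = bridge_ips.get(net["bridge"])
        if observed != net["ip"]:
            mismatches.append((entry, observed))
    return mismatches


# Values copied from host.conf and from the ifconfig output in this report.
configured = ["bond1/br0/172.16.1.234", "bond0/br1/10.0.1.234"]
observed = {"br0": "172.16.1.234", "br1": "10.0.1.234"}

print(find_mismatches(configured, observed))  # prints []
```

With the values from this report the list is empty, so the configured networks agree with the addresses on br0 and br1, which is consistent with the "Confirm to configuration!!" lines in the host log.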
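No configuration was changed; the errors appeared right after this compute node was rebooted.
Please advise how to troubleshoot this and where the problem might lie. Thank you!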