All the DNS names are in host files. I can ssh between the nodes.
Yesterday, I just reset the offline nodes and things came back online.
Today, the same errors are back.
[2021-04-20 16:03:47.685059] E [name.c:266:af_inet_client_get_remote_sockaddr] 0-glusterfs: DNS resolution failed on host srv-3:srv-4
[2021-04-20 16:03:50.685442] E [name.c:266:af_inet_client_get_remote_sockaddr] 0-glusterfs: DNS resolution failed on host srv-3:srv-4
[2021-04-20 16:03:50.725539] W [fuse-bridge.c:1276:fuse_attr_cbk] 0-glusterfs-fuse: 6885431: LOOKUP() / => -1 (Transport endpoint is not connected)
[2021-04-20 16:03:50.753676] I [fuse-bridge.c:6083:fuse_thread_proc] 0-fuse: initiating unmount of /shared
The message "E [MSGID: 101075] [common-utils.c:505:gf_resolve_ip6] 0-resolver: getaddrinfo failed (family:2) (Name or service not known)" repeated 17 times between [2021-04-20 16:02:59.633395] and [2021-04-20 16:03:50.685439]
[2021-04-20 16:03:50.753827] W [glusterfsd.c:1596:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7e65) [0x7f14965bae65] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x5626eff99625] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x5626eff9948b] ) 0-: received signum (15), shutting down
[2021-04-20 16:03:50.753846] I [fuse-bridge.c:6871:fini] 0-fuse: Unmounting '/shared'.
[2021-04-20 16:03:50.753852] I [fuse-bridge.c:6876:fini] 0-fuse: Closing fuse connection to '/shared'.
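Since the log shows getaddrinfo failing for family:2 (AF_INET on Linux), here is a rough sketch for checking, on each node, whether the names resolve through the same system resolver path. The function name `resolve_ipv4` is made up for illustration, and the host list is just the names as they appear in the log (`srv-3`, `srv-4`, and the combined string `srv-3:srv-4`) and in the status output (`serv-2` to `serv-4`). Adjust it to whatever is actually in your hosts files.

```python
import socket


def resolve_ipv4(host):
    """Return the sorted IPv4 addresses for host, or [] if resolution fails.

    Uses getaddrinfo restricted to AF_INET, so /etc/hosts and DNS are
    consulted in the order configured on the machine.
    """
    try:
        infos = socket.getaddrinfo(
            host, None, family=socket.AF_INET, type=socket.SOCK_STREAM
        )
    except (socket.gaierror, UnicodeError):
        # gaierror covers "Name or service not known" and similar failures.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Names copied from the log and the volume status output (assumed list).
    candidates = ["srv-3", "srv-4", "srv-3:srv-4", "serv-2", "serv-3", "serv-4"]
    for name in candidates:
        addrs = resolve_ipv4(name)
        print(f"{name}: {', '.join(addrs) if addrs else 'NOT RESOLVED'}")
```

Running this on every node should show quickly whether one spelling of the names resolves and the other does not.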
`# gluster volume status
Status of volume: hpc-admin
Gluster process                        TCP Port  RDMA Port  Online  Pid
Brick serv-2:/DATA/hpc-admin/brick1    49152     0          Y       9181
Brick serv-3:/DATA/hpc-admin/brick1    49152     0          Y       10828
Brick serv-4:/DATA/hpc-admin/brick1    49152     0          Y       9264
Self-heal Daemon on localhost          N/A       N/A        Y       15218
Self-heal Daemon on serv-2             N/A       N/A        Y       18495
Self-heal Daemon on serv-3             N/A       N/A        Y       48312
Task Status of Volume hpc-admin
There are no active volume tasks
Status of volume: shared
Gluster process                        TCP Port  RDMA Port  Online  Pid
Brick serv-2:/DATA/shared/brick1       N/A       N/A        N       N/A
Brick serv-3:/DATA/shared/brick1       49153     0          Y       36391
Brick serv-4:/DATA/shared/brick1       N/A       N/A        N       N/A
Self-heal Daemon on localhost          N/A       N/A        Y       15218
Self-heal Daemon on serv-3             N/A       N/A        Y       48312
Self-heal Daemon on serv-2             N/A       N/A        Y       18495
Task Status of Volume shared
There are no active volume tasks
`
The two nodes whose bricks are offline log this:
DATA-shared-brick1[16996]: [2021-04-20 17:48:28.186345] C [MSGID: 113081] [posix-common.c:639:posix_init] 0-shared-posix: Extended attribute not supported, exiting.
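For the "Extended attribute not supported" message, a quick probe like the one below can tell whether the filesystem under a brick directory accepts extended attributes at all. This is only a sketch under some assumptions: it is Linux only, the name `supports_xattr` and the attribute `user.xattr_probe` are made up for illustration, and it uses the `user.*` namespace so it can run without root, whereas Gluster itself sets `trusted.*` attributes. The brick path in the usage part is the one from the status output.

```python
import errno
import os


def supports_xattr(directory, name="user.xattr_probe"):
    """Return True if an extended attribute can be set and read back on directory.

    Returns False when the filesystem or permissions reject the attribute;
    other errors (for example a missing directory) are raised to the caller.
    """
    try:
        os.setxattr(directory, name, b"working")
        ok = os.getxattr(directory, name) == b"working"
        os.removexattr(directory, name)  # clean up the probe attribute
        return ok
    except OSError as exc:
        if exc.errno in (errno.ENOTSUP, errno.EPERM, errno.EACCES):
            return False
        raise


if __name__ == "__main__":
    brick = "/DATA/shared/brick1"  # path taken from the volume status output
    try:
        print(f"{brick}: xattr supported = {supports_xattr(brick)}")
    except FileNotFoundError:
        print(f"{brick}: directory does not exist (is the filesystem mounted?)")
```

If this returns False or the directory is missing on the two offline nodes but works on serv-3, it would be worth comparing what is mounted at /DATA on each of them.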
mikeatform changed the title from "Working cluster with 2 volumes. spontaneously one volume can't resolve it's peer servers." to "wrong form." on Apr 20, 2021