VRRP stays in FAULT state after adding a same IP address as VIP on physical interface #2219

Closed
lsang6WIND opened this issue Oct 27, 2022 · 8 comments

lsang6WIND commented Oct 27, 2022

Describe the bug
Hello,

I am in a QEMU virtual machine and have compiled the newest keepalived version.
I started keepalived with the VIP 192.168.1.1/24 while the physical interface had no IP address yet.
As expected, VRRP enters FAULT state, since no IPv4 address is configured on the interface.

I then add 192.168.1.1/24 (the same address as the VIP) to the physical interface, but VRRP stays in FAULT state.

However, if I either add an IP address different from the VIP, or configure the interface with any IP address before starting keepalived, VRRP works correctly.

To Reproduce

  • No IP address on the physical interface
  • Start keepalived with the configuration file below; VRRP should enter FAULT state
  • Add the same IP address as the VIP to the physical interface: ip a a 192.168.1.1/24 dev ens4

Expected behavior
VRRP transitions from FAULT state to BACKUP, and then from BACKUP state to MASTER.

Keepalived version

Keepalived v2.2.7 (10/13,2022), git commit v2.2.7-111-gefef3596 

Distro (please complete the following information):

  • Name: Ubuntu
  • Version: 20.04
  • Architecture: x86_64

Details of any containerisation or hosted service (e.g. AWS)
Qemu virtual machine

Configuration file:

global_defs
{
  router_id router
  enable_script_security
  script_user root
  dynamic_interfaces
  vrrp_startup_delay 0
  disable_local_igmp
}


vrrp_instance vrrp {
  version 2
  state BACKUP
  interface ens4

  use_vmac vrrp

  virtual_ipaddress {
    192.168.1.1/24
  }

  track_file {
  }

  garp_master_delay 5

  virtual_router_id 12

  priority 100
  advert_int 1.0

  preempt_delay 0
  notify_deleted
}

System Log entries

Oct 27 14:21:41 ubuntu2004 Keepalived[8087]: Starting Keepalived v2.2.7 (10/13,2022), git commit v2.2.7-111-gefef3596
Oct 27 14:21:41 ubuntu2004 Keepalived[8087]: WARNING - keepalived was built for newer Linux 5.4.203, running on Linux>
Oct 27 14:21:41 ubuntu2004 Keepalived[8087]: Command line: 'keepalived' '-D'
Oct 27 14:21:41 ubuntu2004 Keepalived[8087]: Opening file '/etc/keepalived/keepalived.conf'.
Oct 27 14:21:41 ubuntu2004 Keepalived[8087]: Configuration file /etc/keepalived/keepalived.conf
Oct 27 14:21:41 ubuntu2004 Keepalived[8087]: (Line 7) WARNING - number '0' outside range [0.001000, 4294.967295]
Oct 27 14:21:41 ubuntu2004 Keepalived[8087]: (Line 7) vrrp_startup_delay '0' is invalid
Oct 27 14:21:41 ubuntu2004 Keepalived[8088]: NOTICE: setting config option max_auto_priority should result in better >
Oct 27 14:21:41 ubuntu2004 Keepalived[8088]: Starting VRRP child process, pid=8089
Oct 27 14:21:41 ubuntu2004 Keepalived_vrrp[8089]: Registering Kernel netlink reflector
Oct 27 14:21:41 ubuntu2004 Keepalived_vrrp[8089]: Registering Kernel netlink command channel
Oct 27 14:21:41 ubuntu2004 Keepalived_vrrp[8089]: (vrrp): Success creating VMAC interface vrrp
Oct 27 14:21:41 ubuntu2004 networkd-dispatcher[458]: WARNING:Unknown index 10 seen, reloading interface list
Oct 27 14:21:41 ubuntu2004 Keepalived_vrrp[8089]: NOTICE: setting sysctl net.ipv4.conf.all.rp_filter from 2 to 0
Oct 27 14:21:41 ubuntu2004 systemd-networkd[398]: vrrp: Link UP
Oct 27 14:21:41 ubuntu2004 systemd-networkd[398]: vrrp: Gained carrier
Oct 27 14:21:41 ubuntu2004 Keepalived_vrrp[8089]: (vrrp) entering FAULT state (no IPv4 address for interface)
Oct 27 14:21:41 ubuntu2004 Keepalived_vrrp[8089]: (vrrp) entering FAULT state
Oct 27 14:21:41 ubuntu2004 systemd-udevd[8090]: ethtool: autonegotiation is unset or enabled, the speed and duplex ar>
Oct 27 14:21:41 ubuntu2004 Keepalived_vrrp[8089]: Registering gratuitous ARP shared channel
Oct 27 14:21:41 ubuntu2004 systemd-udevd[8090]: Using default interface naming scheme 'v245'.
Oct 27 14:21:41 ubuntu2004 Keepalived[8088]: Startup complete

Additional context
It seems that keepalived does not pick up the IP assignment when the IP is the same as the VIP. I wonder whether this is intentional.

@lsang6WIND
Author

lsang6WIND commented Oct 28, 2022

In keepalived_netlink.c, ignore_address_if_ours_or_link_local() checks whether a VRRP instance owns the new IP address added on the interface. If it does, the address is ignored, which is why VRRP does not change state.
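
To make the effect of that check concrete, here is a minimal sketch of the idea. It is a simplified model and not the keepalived source: struct vip_set, address_is_own_vip() and address_add_clears_fault() are hypothetical names, addresses are compared as plain strings, and the link-local part of the real check is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a VRRP instance: only its VIP list. */
struct vip_set {
    const char *const *vips;   /* configured virtual IP addresses */
    size_t n_vips;
};

/* True when the newly added address is one of the instance's own VIPs. */
static bool address_is_own_vip(const struct vip_set *vrrp, const char *addr)
{
    assert(vrrp != NULL && addr != NULL);
    for (size_t i = 0; i < vrrp->n_vips; i++) {
        if (strcmp(vrrp->vips[i], addr) == 0)
            return true;
    }
    return false;
}

/* In this model an added address only clears the "no IPv4 address for
 * interface" fault when it is NOT ignored, i.e. when it is not a VIP. */
static bool address_add_clears_fault(const struct vip_set *vrrp, const char *addr)
{
    return !address_is_own_vip(vrrp, addr);
}
```

With a VIP list containing only 192.168.1.1, adding 192.168.1.1 leaves the modelled instance in FAULT, while adding 192.168.1.10 lets it leave FAULT, which matches what I observed.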

@pqarmitage
Collaborator

I suppose that leaves the question, why does ignore_address_if_ours_or_link_local() ignore a VIP when added?

If the VRRP instance had been in master state and subsequently transitioned to backup because a higher priority instance appeared, then since 192.168.1.1 is a VIP and the priority is not 255, the address would be deleted when the VRRP instance transitioned to backup, and it would never be able to become master again. This clearly is an unstable situation, and therefore cannot be allowed.

Another way of looking at it is that if 192.168.1.1 is configured on the interface, then that system would be the address owner (see RFC5798), and so the priority would have to be 255, whereas you have configured the priority to be 100.

Actually the scenario of a VIP missing when the priority is 255 is a big problem. Priority 255 means that that system is the address owner, and that that VRRP instance will be in master state. keepalived can add the VIP when the instance becomes master (which gets around the configuration problem of the keepalived configuration saying the system is the address owner, but the system doesn't have the address). But what if keepalived is stopped on that system? It is still the address owner and so the VIPs will still be configured on the system. However, a backup VRRP instance will now become master and add the VIPs, and then the addresses will be configured on both systems.

This is why I never use priority 255, it just isn't safe. It can work on a router, which is what VRRP is designed for, where if VRRP is configured it will always run, and to disable VRRP on the router which is the address owner would be a configuration error.

So, having a VIP configured on a system when a VRRP instance is not in master state is a configuration error, and not one that keepalived can handle (other than deleting the address if it sees it being added, which is likely to confuse/annoy the person who added the address).

@lsang6WIND
Author

OK, thanks for the answer.

@lsang6WIND
Author

lsang6WIND commented Nov 2, 2022

Hello,

With this scenario:
Actually the scenario of a VIP missing when the priority is 255 is a big problem.

I wonder if it is possible to make keepalived work as a simple router, either by deleting the line if (ignore_address_if_ours_or_link_local(..)) or by some enhancement that allows the transition when the VIP is added and the priority is set to 255. Would the current keepalived implementation support such modifications?

@pqarmitage
Collaborator

I can't see any way for a VRRP instance with priority 255 to properly handle missing VIPs. Consider a VRRP instance with priority 255 with two VIPs configured, and one of the VIPs is installed on the system but the other is not. Should the VRRP instance be in FAULT state, since one VIP is missing, or should it be in MASTER state advertising that it has the 2 VIPs whereas in fact it doesn't have one? In the former case, then a backup VRRP instance will become master, and the VIP that is installed on the priority 255 system will be duplicated; in the other case the VIP that is not installed will be missing.

My view is that if a VRRP instance is configured with priority 255 (i.e. telling keepalived that the VIPs are already configured on the system) but one of the VIPs is not configured on the system, then that is a configuration error. The safest thing for keepalived to do in that situation is to do nothing - it cannot determine what the correct thing to do is when there is a configuration error; hence the VRRP instance goes to fault state.
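
The policy described above can be sketched as follows. This is only an illustration of the stated view (any missing VIP on a priority 255 instance is treated as a configuration error), not a description of what the keepalived code actually does; the names owner_policy(), count_missing_vips() and is_installed() are hypothetical, and addresses are compared as plain strings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum owner_state { OWNER_FAULT, OWNER_MASTER };

/* True when vip appears in the list of addresses installed on the system. */
static bool is_installed(const char *vip, const char *const *installed, size_t n_installed)
{
    for (size_t i = 0; i < n_installed; i++) {
        if (strcmp(installed[i], vip) == 0)
            return true;
    }
    return false;
}

/* Number of configured VIPs that are not installed on the system. */
static size_t count_missing_vips(const char *const *vips, size_t n_vips,
                                 const char *const *installed, size_t n_installed)
{
    assert(vips != NULL || n_vips == 0);
    size_t missing = 0;
    for (size_t i = 0; i < n_vips; i++) {
        if (!is_installed(vips[i], installed, n_installed))
            missing++;
    }
    return missing;
}

/* Modelled policy for a priority 255 instance: MASTER only when every
 * VIP is present, otherwise FAULT (configuration error, do nothing). */
static enum owner_state owner_policy(const char *const *vips, size_t n_vips,
                                     const char *const *installed, size_t n_installed)
{
    return count_missing_vips(vips, n_vips, installed, n_installed) == 0
               ? OWNER_MASTER : OWNER_FAULT;
}
```

For the two-VIP example, a system with only one of the two VIPs installed gives one missing VIP and therefore FAULT in this model.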

@lsang6WIND
Author

lsang6WIND commented Nov 10, 2022

Hello,

I have tested the case that you described in the previous comment, using two machines, A and B.

  • On machine B, I started keepalived as backup (priority 100, use_vmac, VIP 192.168.1.1/24). B has an IP address configured on the interface used by keepalived.
  • On machine A, I started keepalived as master (state MASTER, priority 255, use_vmac, VIP 192.168.1.1/24). A has no IP address configured.

With this setup, A is the address owner but the VIP is not present on the system, so A transitions to FAULT state and B becomes master. If I add the VIP to the interface used by keepalived on A, it stays in FAULT state. This is the current behavior of keepalived.
But, as you said:

"since one VIP is missing, or should it be in MASTER state advertising that it has the 2 VIPs whereas in fact it doesn't have one"

(I deleted all IP addresses configured on A before proceeding.)
If I add an IP address different from the VIP, say 192.168.1.10/24, to A, keepalived can transition to MASTER state. Is this normal?

I did the same test with multiple VIPs (192.168.1.1/24, 192.168.1.2/24) configured on A, then added the same IP address (192.168.1.10/24) to A; keepalived can still transition to MASTER state.
Keepalived becomes MASTER even though the VIP is missing.

This does not work as you described.

@lsang6WIND
Author

lsang6WIND commented Nov 10, 2022

By the way, I have made a small modification, as described in comment 8.
Change

                           if (addr_is_equal2(ifa, addr, vaddr, ifp, vrrp)) 
                                   return true;

To

                           if (addr_is_equal2(ifa, addr, vaddr, ifp, vrrp)) {
                                   /* When the instance is the address owner, do not ignore this IP. */
                                   if (vrrp->base_priority == VRRP_PRIO_OWNER)
                                           continue;
                                   else
                                           return true;
                           }

This allows a VRRP instance that is the address owner (priority 255) to transition to MASTER state if the VIP is missing at startup.
I did some tests, and this works well as a simple router.
I know this is a minimal patch and that it does not resolve the case you described before.
If you have any advice, it is welcome. For example, with this patch the VRRP instance currently changes state whenever any IP address is added to the system, whereas I think the transition should happen only when the VIP is added.
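
To show what the change does in isolation, here is a small self-contained model of the check before and after the patch. It is a sketch, not the keepalived code: struct patch_instance, ignore_added_address() and MODEL_PRIO_OWNER are hypothetical names, addresses are compared as strings instead of with addr_is_equal2(), and the `patched` flag only exists to compare both behaviours side by side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PRIO_OWNER 255   /* address-owner priority, as discussed in this thread */

/* Hypothetical, simplified stand-in for a VRRP instance. */
struct patch_instance {
    int base_priority;
    const char *const *vips;
    size_t n_vips;
};

/* Returns true when the added address should be ignored.
 * patched == false: any matching VIP is ignored (original behaviour).
 * patched == true:  a matching VIP of an address owner is not ignored. */
static bool ignore_added_address(const struct patch_instance *instances, size_t n_instances,
                                 const char *addr, bool patched)
{
    assert(addr != NULL);
    for (size_t i = 0; i < n_instances; i++) {
        const struct patch_instance *vrrp = &instances[i];
        for (size_t j = 0; j < vrrp->n_vips; j++) {
            if (strcmp(vrrp->vips[j], addr) != 0)
                continue;               /* not this VIP */
            if (patched && vrrp->base_priority == MODEL_PRIO_OWNER)
                continue;               /* owner: keep processing the address */
            return true;                /* our VIP: ignore it */
        }
    }
    return false;
}
```

In this model a priority 100 instance still ignores its own VIP with or without the patch, while a priority 255 instance stops ignoring it once the patch is applied.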

@lsang6WIND
Author

lsang6WIND commented Nov 15, 2022

Hello,

I have opened a pull request: #2229
