-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: http(s) connection time out from VMs in VLAN netwok #98
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks.
btw, please unify the wrapped new error
pkg/network/vlan/vlan.go
Outdated
@@ -108,6 +108,16 @@ func (v *Vlan) AddLocalArea(la *LocalArea) error { | |||
if err := v.uplink.AddBridgeVlan(la.Vid); err != nil { | |||
return fmt.Errorf("add bridge vlanconfig %d failed, error: %w", la.Vid, err) | |||
} | |||
br, err := netlink.LinkByName(v.bridge.Name) | |||
if err != nil { | |||
return fmt.Errorf("bridge %s not found error", v.bridge.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wrap original err to new Errorf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fmt.Errorf("bridge %s not found error: %w", v.bridge.Name, err)
, you mean like this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
brlnk := iface.NewLink(br) | ||
if brlnk.IsBridgeVlanPVID(la.Vid) { | ||
if err := netlink.BridgeVlanAdd(brlnk, la.Vid, false, true, true, false); err != nil { | ||
return fmt.Errorf("add iface untagged vlan failed, error: %v, link: %s, vid: %d", err, br.Attrs().Name, la.Vid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the error fomat is suggested to %w
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please refer:
https://pkg.go.dev/fmt#Errorf
The VM attached to the VLAN network fails to http(s) with management URL. However, there's no problem with SSH/ping which connect to ports other then http(s) since http(s) needs extra route to the CNI interface. The http(s) egress from the uplink bridge interface should be untagged to be correctly routed. Link: harvester/harvester#4359 Signed-off-by: Chris Chiu <[email protected]>
d1a494f
to
a874291
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed, I suggest we completely disable the /net/bridge/bridge-nf-call-iptables
kernel tunable during the network controller initialization because
- Calico/Canal CNI's working model does not involve Linux bridges (also confirmed by the contributor of the project on the Slack channel that setting this tunable to
1
is unnecessary ref) - We already have the code that disables the tunable whenever users create new ClusterNetwork/Vlanconfig objects ref
Making packets leave the mgmt-br
bridge interface untagged could solve the issue, but it still leaves the mistake uncorrected. Packets flow to and from the Pods with hostPort
configured will be asymmetric. They should be forwarded outside of the Harvester cluster through the uplink and routed by an external gateway. This issue arises when we mix the usage of Canal and Bridge CNI with the Multus meta plugin. It's just that everything seems fine when no VLANs are involved.
Thank you!
I still have concern about simply disable Then I create a worker node in the same VLAN w/ management, there's no such problem. Since all traffic from this node to management url will be SNAT/DNAT from Maybe we need to list more use cases to find out the real solution. |
I feel it is a different issue since no bridge CNI is involved; it's just pure canal stuff. But I strongly agree with you that we need to test thoroughly if we want to turn off the kernel tunable. Thanks! |
Problem:
The VM attached to the VLAN network fails to http(s) with management URL. However, there's no problem with SSH/ping which connect to ports other then http(s) since http(s) needs extra route to the CNI interface.
Solution:
The http(s) egress from the uplink bridge interface should be untagged to be correctly routed. Since the routing is determined is based on L3, but the VLAN packet is L2.
Related Issue:
harvester/harvester#4359