Every few days, one of my servers issues a kernel log message:
TCP: request_sock_TCP: Possible SYN flooding on port 11211. Sending cookies. Check SNMP counters.
Mostly processing continues after this, but sometimes the entire server is unresponsive for minutes.
In this environment we have 10 Go programs using gomemcache hitting one memcached server, and each Go program has 60 goroutines that will call through this library. So I expected a maximum of 600 connections at a time.
I have seen the SYN flooding message at the default memcached connection backlog of 1024, and also after I raised it to 4096.
From inspecting logs, packet traces, etc., my impression is that some glitch in processing or the network causes timeout errors (at the default timeout of 100ms), which in turn cause gomemcache to dial new connections. With 60 goroutines each timing out after 100ms and redialing, that is up to 600 new connections dialed per second, per process.
If the dial attempts are not being discarded on the other end of the wire, then I think the pending connections can quickly exceed the backlog limit.
I wondered whether gomemcache should have a rate limiter on dial(). I would prefer gomemcache to fail quickly, rather than have to raise the timeout to slow it down.
Any other insight would be valued.
The only related issue I could see here is #108; interestingly we are both running the same system.
bboreham changed the title from "Possible SYN flooding" to "Add a rate-limiter to dial() function?" on Aug 14, 2020