DNS issues on NTP client #2440

Open
duduita opened this issue Jul 11, 2024 · 2 comments


duduita commented Jul 11, 2024

When ntpclient.c queries NTP servers, different NTP server domain names (e.g., 0.uk.pool.ntp.org, 1.uk.pool.ntp.org) may resolve to the same set of IP addresses because of DNS caching. The client can then keep querying the same non-responsive IP addresses and fail to obtain the correct time.

For example, here are the logs I added to ntpclient.c to understand why NTP was failing:

[   51.046000] [25] (Info)  '0.pool.ntp.org' resolved to: 216.238.113.58
[   51.046000] [25] (Info) ntpclient.c-447-gethostbyname for 0.pool.ntp.org OK
[   51.046000] [25] (Info) ntpclient.c-480-Sending a NTP packet
[   51.055000] [25] (Info) ntpclient.c-509-sendto ret: 68
[   51.056000] [25] (Info) ntpclient.c-515-Recv a NTP packet
[   56.055000] [25] (Info) ntpclient.c-521-recvfrom nbytes: -1
[   56.056000] [25] (Info)  '0.pool.ntp.org' resolved to: 216.238.113.58
[   56.056000] [25] (Info) ntpclient.c-447-gethostbyname for 0.pool.ntp.org OK
[   56.056000] [25] (Info) ntpclient.c-480-Sending a NTP packet
[   56.063000] [25] (Info) ntpclient.c-509-sendto ret: 68
[   56.063000] [25] (Info) ntpclient.c-515-Recv a NTP packet
[   61.065000] [25] (Info) ntpclient.c-521-recvfrom nbytes: -1
[   61.066000] [25] (Info)  '0.pool.ntp.org' resolved to: 216.238.113.58
[   61.066000] [25] (Info) ntpclient.c-447-gethostbyname for 0.pool.ntp.org OK
[   61.066000] [25] (Info) ntpclient.c-480-Sending a NTP packet
[   61.075000] [25] (Info) ntpclient.c-509-sendto ret: 68
[   61.075000] [25] (Info) ntpclient.c-515-Recv a NTP packet
[   66.075000] [25] (Info) ntpclient.c-521-recvfrom nbytes: -1
[   66.076000] [25] (Info)  '0.pool.ntp.org' resolved to: 216.238.113.58
[   66.076000] [25] (Info) ntpclient.c-447-gethostbyname for 0.pool.ntp.org OK
[   66.076000] [25] (Info) ntpclient.c-480-Sending a NTP packet
[   66.085000] [25] (Info) ntpclient.c-509-sendto ret: 68
[   66.085000] [25] (Info) ntpclient.c-515-Recv a NTP packet
[   71.085000] [25] (Info) ntpclient.c-521-recvfrom nbytes: -1
[   71.086000] [25] (Info)  '0.pool.ntp.org' resolved to: 216.238.113.58
[   71.086000] [25] (Info) ntpclient.c-447-gethostbyname for 0.pool.ntp.org OK
[   71.086000] [25] (Info) ntpclient.c-480-Sending a NTP packet
[   71.095000] [25] (Info) ntpclient.c-509-sendto ret: 68
[   71.095000] [25] (Info) ntpclient.c-515-Recv a NTP packet
[   76.095000] [25] (Info) ntpclient.c-521-recvfrom nbytes: -1
[   76.095000] [25] (Info) ntpclient.c-563-ERROR: recvfrom() failed: 11
[   76.095000] [25] (Info) ntpclient.c-589-The NTP client is terminating

One possible mitigation is to flush the DNS cache after cycling through all configured NTP servers, so that subsequent DNS resolutions can return new, responsive IP addresses and time synchronization is more likely to succeed. However, I cannot manipulate the DNS cache from user space unless I create an API for it.

Do you have a workaround or a hack that I can use to solve this NTP issue, or at least to force a new IP resolution for an NTP hostname after some failures?
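
One user-space direction that needs no access to the DNS cache is to offer every address returned for a hostname to the query code, instead of only the first one. The following is a minimal sketch under stated assumptions, not what ntpclient.c does today: `ntp_try_all_addrs`, `ntp_try_fn` and `count_and_fail` are hypothetical names, and whether the resolver hands back more than one address per cached name depends on its configuration, which is not verified here.

```c
#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical callback type: returns 0 if the server at `addr` answered,
 * non-zero otherwise (e.g. after a receive timeout).
 */

typedef int (*ntp_try_fn)(const struct sockaddr *addr, socklen_t addrlen,
                          void *arg);

/* Resolve `host` and offer each returned address to `try_query` in turn,
 * stopping at the first success. Returns 0 on success, -1 if resolution
 * failed or no address answered.
 */

int ntp_try_all_addrs(const char *host, const char *port,
                      ntp_try_fn try_query, void *arg)
{
  struct addrinfo hints;
  struct addrinfo *res = NULL;
  struct addrinfo *ai;
  int ret = -1;

  memset(&hints, 0, sizeof(hints));
  hints.ai_family   = AF_UNSPEC;   /* IPv4 or IPv6 */
  hints.ai_socktype = SOCK_DGRAM;  /* NTP runs over UDP */

  if (getaddrinfo(host, port, &hints, &res) != 0)
    {
      return -1;
    }

  for (ai = res; ai != NULL; ai = ai->ai_next)
    {
      if (try_query(ai->ai_addr, ai->ai_addrlen, arg) == 0)
        {
          ret = 0;
          break;
        }
    }

  freeaddrinfo(res);
  return ret;
}

/* Demo callback (assumption, for illustration only): counts the addresses
 * it is offered and always reports "no answer".
 */

int count_and_fail(const struct sockaddr *addr, socklen_t addrlen, void *arg)
{
  (void)addr;
  (void)addrlen;
  (*(int *)arg)++;
  return -1;
}
```

A real callback would do the sendto()/recvfrom() exchange with a receive timeout and return 0 only when a reply arrives. This only helps when the name resolves to several addresses; if the cache holds a single dead address, the lookup still has to be repeated after the entry expires.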

Contributor

acassis commented Jul 18, 2024

@duduita thank you for finding and reporting this issue!

@wengzhe did you see that?

Contributor

wengzhe commented Jul 22, 2024

Hi @duduita, since 12.1.0 a DNS cache entry becomes invalid once the TTL from the DNS server has expired, and the next lookup of that domain name sends a new query. In my local environment the TTL for 0.pool.ntp.org is always less than 100s (and normally ~20s). Maybe your DNS server gives you non-responsive IP addresses with a longer TTL, which causes this problem.

There is a hack that may force the domains to be resolved again: set CONFIG_NETDB_DNSCLIENT_LIFESEC to a shorter value, e.g. 5 seconds. After 5 seconds the cache entry becomes invalid, and the domain is resolved again on the next lookup.
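
To illustrate how the server TTL and the configured lifetime interact, here is a rough sketch of an expiry check in which an entry is dropped as soon as its age exceeds either limit. It is an assumption-based illustration, not the actual NuttX cache code: `struct dns_cache_entry`, `dns_entry_expired` and the `DNS_LIFESEC` macro (standing in for CONFIG_NETDB_DNSCLIENT_LIFESEC) are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for CONFIG_NETDB_DNSCLIENT_LIFESEC, in seconds. */

#define DNS_LIFESEC 5u

/* Hypothetical cache entry: only the fields needed for the expiry check. */

struct dns_cache_entry
{
  uint32_t ctime;  /* Time the answer was cached, in seconds */
  uint32_t ttl;    /* TTL reported by the DNS server, in seconds */
};

/* An entry is treated as expired once its age exceeds either the server
 * TTL or the configured lifetime, whichever is shorter.
 */

bool dns_entry_expired(const struct dns_cache_entry *entry, uint32_t now)
{
  uint32_t age = now - entry->ctime;

  return age > entry->ttl || age > DNS_LIFESEC;
}
```

Under these assumptions, an entry cached with a 20-second TTL is still served 3 seconds later but is resolved again once more than 5 seconds have passed, which is the effect the shorter lifetime is meant to have.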
