Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve NTP implementation #2004

Open
wants to merge 12 commits into
base: development-v6
Choose a base branch
from
Open

Improve NTP implementation #2004

wants to merge 12 commits into from

Conversation

DL6ER
Copy link
Member

@DL6ER DL6ER commented Jul 1, 2024

What does this implement/fix?

Start NTP server thread(s) only after the first successful NTP time sync. This ensures the time we offer is correct. As a by-product this also gives some time for alternative NTP servers (e.g., crony) to finish their startup so we don't take the port before they can do it. Also, disable DNSSEC time-windows validation until first sync succeeded to avoid DNSSEC-related issues during NTP sync.


Related issue or feature (if applicable): N/A

Pull request in docs with documentation (if applicable): N/A


By submitting this pull request, I confirm the following:

  1. I have read and understood the contributors guide, as well as this entire template. I understand which branch to base my commits and Pull Requests against.
  2. I have commented my proposed changes within the code.
  3. I am willing to help maintain this change if there are issues with it later.
  4. It is compatible with the EUPL 1.2 license
  5. I have squashed any insignificant commits. (git rebase)

Checklist:

  • The code change is tested and works locally.
  • I based my code and PRs against the repositories developmental branch.
  • I signed off all commits. Pi-hole enforces the DCO for all contributions
  • I signed all my commits. Pi-hole requires signatures to verify authorship
  • I have read the above and my PR is ready for review.

DL6ER added 3 commits July 1, 2024 16:19
…and remove obsolete variable thread_running[] (it is only ever set but never read)

Signed-off-by: DL6ER <[email protected]>
…ld be rejected outside those windows. This generates an interesting chicken-and-egg problem for machines which don't have a hardware real time clock. For these machines to determine the correct time typically requires use of NTP and therefore DNS, but validating DNS requires that the correct time is already known. Resolve this by setting dnssec-no-timecheck removing the time-window checks (but not other DNSSEC validation.) only until NTP sync finishes (or if we realize the user doesn't want it)

We do not use the overloaded SIGINT (as dnsmasq) but SIGUSR7 to avoid killing the process when in debug mode (this is a fundamental drawback of the dnsmasq implementation)

Signed-off-by: DL6ER <[email protected]>
@DL6ER DL6ER requested a review from a team July 1, 2024 15:31
@pralor-bot
Copy link

This pull request has been mentioned on Pi-hole Userspace. There might be relevant details there:

https://discourse.pi-hole.net/t/ntp-feature-request-for-pihole-ftl-v6/70991/2

@DL6ER DL6ER removed the request for review from a team July 1, 2024 17:32
@DL6ER DL6ER marked this pull request as draft July 1, 2024 17:32
@DL6ER
Copy link
Member Author

DL6ER commented Jul 1, 2024

TODO

  • Run NTP test later in the CI test suite to give it time to properly start after initial time synchronization
  • do not start NTP client if CAP_SET_TIME is not available
  • add dedicated boolean to ease disabling of NTP client
  • docs: Ignore SIGUSR7

DL6ER added 5 commits July 1, 2024 20:07
…Also move all the RTC properties inside ntp.sync because this is where they apply and where RTC sync can be disabled

Signed-off-by: DL6ER <[email protected]>
…t accurate time from elsewhere - it isn't intuitive that the server cannot be started without the client

Signed-off-by: DL6ER <[email protected]>
@DL6ER DL6ER changed the title Add NTP delay Imporve NTP implementation Jul 1, 2024
@DL6ER DL6ER changed the title Imporve NTP implementation Improve NTP implementation Jul 1, 2024
@DL6ER DL6ER marked this pull request as ready for review July 1, 2024 18:54
@DL6ER DL6ER requested a review from a team July 1, 2024 18:54
@PromoFaux
Copy link
Member

PromoFaux commented Jul 1, 2024

Without SYS_TIME on a container:

pihole  | 2024-07-01 20:32:06.427 BST [243M] WARNING: Insufficient permissions to set system time, NTP client not available

👍

Is there some value in including a hint that SYS_TIME is unavailable in this warning message?

Similarly

Maybe something here about SYS_NICE

pihole  | 2024-07-01 20:32:33.023 BST [244M] WARNING: Cannot set process priority to -10: Permission denied. Process priority remains at 0

Copy link
Member

@yubiuser yubiuser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does it try to open the database if it errored before?

nanopi@nanopi:~$ pihole-FTL ntp de.ntp.org
Using NTP server: de.ntp.org
Error NTP client: Cannot resolve NTP server address: Name has no usable address "de.ntp.org"
Cannot open database in read-write mode

@DL6ER
Copy link
Member Author

DL6ER commented Jul 2, 2024

Why does it try to open the database if it errored before?

Because it wants to store this in the message table. But it does make sense to never store in the message table when running interactively. Pushed now.

…uce chown() code duplication by using a single function for all chown()-activities

Signed-off-by: DL6ER <[email protected]>
@Gontier-Julien
Copy link

@DL6ER With this PR would this mean that it won't throw a warning anymore when there already is an ntp server on the server/system?
Also i don't know/think it already exist so correct me if i'm wrong as always, but there isn't an single option to disable the ntp client & server currently no? As i already have an ntp server that also sync the clock of the server it self there isn't much of a need for pi-hole to act either as a server or client then, so maybe as single option could be added to disable it?

@DL6ER
Copy link
Member Author

DL6ER commented Jul 3, 2024

With this PR would this mean that it won't throw a warning anymore when there already is an ntp server on the server/system?

Well, it'd warn to the log file that the port is already taken and, hence, Pi-hole won't start its own NTP server. But you can safely ignore this warning if you know you have a local NTP server. There will be no warning added to the Pi-hole diagnosis system.


an single option to disable the ntp client & server

There are already options to disable it, even when it is not a single one:

ntp.ipv4.active = false
ntp.ipv6.active = false
ntp.sync.active = false

will disable all NTP components inside Pi-hole. I don't think we really need one option.

@Gontier-Julien
Copy link

Well, it'd warn to the log file that the port is already taken and, hence, Pi-hole won't start its own NTP server. But you can safely ignore this warning if you know you have a local NTP server. There will be no warning added to the Pi-hole diagnosis system.

Well awesome then! ^^

There are already options to disable it, even when it is not a single one:

ntp.ipv4.active = false
ntp.ipv6.active = false
ntp.sync.active = false

will disable all NTP components inside Pi-hole. I don't think we really need one option.

That what i've done currently, but it fine by me too tbh, i was think so it simple for anyone, but i guess that if someone already have a ntp server then it should be easy

Alright then, thank ^^

Copy link
Member

@yubiuser yubiuser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What signal are you sending to dnsmasq to signal correct time? You created a custom SIGUSR7 which is interpreted as SIGUP

2024-07-04 21:06:30.699 CEST [670283/T670448] INFO:  -> DNS cache records: 4020
2024-07-04 21:06:30.699 CEST [670283/T670448] INFO:  -> Known forward destinations: 2
2024-07-04 21:06:30.700 CEST [670283/T670464] INFO: NTP server listening on 0.0.0.0:123 (IPv4)
2024-07-04 21:06:30.700 CEST [670283/T670465] INFO: NTP server listening on :::123 (IPv6)
2024-07-04 21:06:30.703 CEST [670283M] INFO: Received SIGHUP, flushing cache and re-reading config
2024-07-04 21:06:30.704 CEST [670283M] INFO: Blocking status is enabled

dnsmasq config says, one should use SIGINT


I noticed that queries that happen after dnsmasq start before starting DNSSEC validation are not logged to the query database. Basically everything between

Jul  4 21:18:55 dnsmasq[670676]: DNSSEC validation enabled
Jul  4 21:18:55 dnsmasq[670676]: DNSSEC signature timestamps not checked until receipt of SIGINT
....
Jul  4 21:19:03 dnsmasq[670676]: now checking DNSSEC signature timestamps

Is this unavoidable?


How can a DNSSEC validation be secure before the DNSSEC timestamp can be checked?

Jul  4 21:18:59 dnsmasq[670676]: reply rsa2048-sha256.ippacket.stream is DNSKEY keytag 46436, algo 8
Jul  4 21:18:59 dnsmasq[670676]: validation result is SECURE
Jul  4 21:18:59 dnsmasq[670676]: reply sigok.ippacket.stream is <CNAME>
Jul  4 21:18:59 dnsmasq[670676]: reply sigok.rsa2048-sha256.ippacket.stream is 195.201.14.36
....
Jul  4 21:19:03 dnsmasq[670676]: now checking DNSSEC signature timestamps


@DL6ER
Copy link
Member Author

DL6ER commented Jul 5, 2024

What signal are you sending to dnsmasq to signal correct time? You created a custom SIGUSR7 which is interpreted as SIGUP

[...]

dnsmasq config says, one should use SIGINT

Yes. The new signal was introduced because SIGINT has a different effect when starting dnsmasq in debug mode (pihole-FTL -d): it terminates the program.
This is also mentioned in man dnsmasq but it doesn't seem useful here. Hence , I introduced a new signal that does the exact same and reuses the same code except going the exit() path.

The logged message contains the SIGHUP a fixed string which is now outdated. It could already have be triggered with SIGINT before as well so it wasn't necessarily accurate anyway - removed with the last commit.


I noticed that queries that happen after dnsmasq start before starting DNSSEC validation are not logged to the query database.

[...]

Is this unavoidable?

Sorry, I see I have not mentioned this before. It is hard to avoid. The issue is that we can only add queries after the database importing from disk or we will not have the domain, client, and forward IDs as we have not imported these at this point.

edit Turns out it helps to step back and look at things later again - it is actually less hard, I just had to arrange some code to disentangle database initialization. We have now two parts, one running during database initialization (importing all the dependencies for new queries) and another one importing the actual queries which can be delayed until we know we have a proper time (or will never get one).


How can a DNSSEC validation be secure before the DNSSEC timestamp can be checked?

Because dnsmasq still performs all the checks, it will not just consider signature expired or signature not yet valid as errors. The cache is flushed once time is ready to ensure all domains are re-checked and we don't trust an SECURE which was not time-checked.

@yubiuser
Copy link
Member

yubiuser commented Jul 5, 2024

I fear it did not work. The query at 23:03:10 has not been logged to the database.

nanopi@nanopi:~$ cat /var/log/pihole/pihole.log | grep DNSSEC
Jul  5 23:03:08 dnsmasq[684108]: DNSSEC validation enabled
Jul  5 23:03:08 dnsmasq[684108]: DNSSEC signature timestamps not checked until receipt of SIGINT
...
Jul  5 23:03:15 dnsmasq[684108]: now checking DNSSEC signature timestamps

nanopi@nanopi:~$ cat /var/log/pihole/pihole.log | grep sigok
Jul  5 23:03:10 dnsmasq[684108]: query[A] sigok.ippacket.stream from 10.0.1.202
Jul  5 23:03:10 dnsmasq[684108]: forwarded sigok.ippacket.stream to 8.8.8.8
Jul  5 23:03:10 dnsmasq[684108]: reply sigok.ippacket.stream is <CNAME>
Jul  5 23:03:10 dnsmasq[684108]: reply sigok.rsa2048-sha256.ippacket.stream is 195.201.14.36

nanopi@nanopi:~$ pihole-FTL sqlite3 -h  /etc/pihole/pihole-FTL.db "Select * from queries where domain is 'sigok.ippacket.stream'"
id       timestamp         type  status  domain                 client     forward     additional_info  reply_type  reply_time            dnssec  list_id
-------  ----------------  ----  ------  ---------------------  ---------  ----------  ---------------  ----------  --------------------  ------  -------
2930152  1716923761.61009  1     2       sigok.ippacket.stream  10.0.1.24  8.8.8.8#53  (null)           3           0.220105886459351     1       (null) 
2930156  1716923761.41762  1     2       sigok.ippacket.stream  10.0.1.24  8.8.8.8#53  (null)           12          0.181222915649414     5       (null) 
2930240  1716923869.86778  1     2       sigok.ippacket.stream  10.0.1.24  8.8.8.8#53  (null)           12          0.207485675811768     5       (null) 
2930250  1716923870.08774  1     2       sigok.ippacket.stream  10.0.1.24  8.8.8.8#53  (null)           3           0.401464700698853     1       (null) 
3569511  1720119147.28513  1     2       sigok.ippacket.stream  10.0.1.59  8.8.8.8#53  (null)           3           0.428323030471802     1       (null) 
3569513  1720119148.74269  1     3       sigok.ippacket.stream  10.0.1.59  (null)      (null)           3           0.000569820404052734  1       (null) 
3570094  1720119705.92446  1     2       sigok.ippacket.stream  10.0.1.59  8.8.8.8#53  (null)           3           0.266446113586426     1       (null) 
3570098  1720119710.19207  1     3       sigok.ippacket.stream  10.0.1.59  (null)      (null)           3           0.000896692276000977  1       (null) 

Only part of the query has been captured

Screenshot at 2024-07-05 23-11-44

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants