Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DNS over UDP relay #13

Merged
merged 5 commits into from
Apr 4, 2017
Merged

DNS over UDP relay #13

merged 5 commits into from
Apr 4, 2017

Conversation

alalamav
Copy link
Contributor

  • Perform DNS resolution through Shadowsocks' UDP relay.
  • Deprecate DNS (SOCKS/TCP) resolver, remove dependencies.

@alalamav alalamav requested review from jab and trevj March 31, 2017 16:49
Copy link
Collaborator

@jab jab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My first encounter with this codebase! Great to get some time with it and to get to work with you!
🔥 @albertolalama 🔥

Took a first pass at a review and left some line-specific code review comments inline, but leaving some general comments here in a new contributor/devrel mindset (but feel free to address now or (whe)never as you see fit):

  • The README assumes more familiarity with what tun2socks is (and related background info) than I have. (It gives a cursory link into https://github.com/ambrop72/badvpn-googlecode-export but I didn't see much in the way of docs there, and it looks like that project hasn't been touched in several years.) Would be nice to provide more high-level, intro- and overview-type context at some point for new contributors. What should a new contributor know to be able to start closing as many open issues as quickly as possible?
  • Add a "How To Set Up A Development Environment" section to the README.
  • Add a LICENSE and CONTRIBUTING file.
  • Add a .editorconfig with our style settings.
  • Most (all?) documentation comments in the source code are just // line comments. It'd be great to take a quick docs pass and give all public modules, classes, and methods proper /* @javadoc */ comments, and even generate some html from them and toss it up on a github pages or readthedocs site, then add a docs badge to the readme pointing to it.
  • Add unit, functional, fuzz, and/or property-based tests.
  • Set up CI, add a build status badge to the readme. If deferring writing actual tests, we can still set up CI to at least make sure that running a linter and building the docs both succeed.
  • Set up code coverage via e.g. https://codecov.io, add a badge to README.

Is there anything else we should be doing along these lines? Anything from https://github.com/akullpp/awesome-java and https://github.com/aleksandar-todorovic/awesome-c that would make us more awesome here?

I'm not sure I'm up to speed on this code enough yet for my approval to mean much, but I see you've also requested review from @trevj, great. It's awesome to get to look at this with you though, and I'm super psyched to get more familiar with it and hopefully get to contribute more soon!

@@ -126,6 +126,8 @@ public synchronized void stop() {

private static final String VPN_INTERFACE_NETMASK = "255.255.255.0";
private static final int VPN_INTERFACE_MTU = 1500;
private static final String DNS_RESOLVER_IP = "8.8.8.8";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

worth adding TODO comments for any of the following?

  • support multiple dns resolvers (e.g. 8.8.4.4, 208.67.222.222, etc.)
  • get config-type values like this from actual config rather than hard-coding

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather add issues for these.

  • Multiple DNS servers is more of a feature. Is the second for fallback?
  • A change like this is likely to include some UI component if the user is to choose these servers.

Copy link
Collaborator

@jab jab Apr 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking was that allowing the client to intelligently select among several options would allow more resilience to poorer quality of service (or outright blocking) of any one of them (more baskets for our eggs, so to speak). Since it sounds like it might not be YAGNI, I opened #16 to track this.

@@ -1,288 +0,0 @@
package org.uproxy.tun2socks;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, we needed this UDP-over-TCP sadness because uProxy's SOCKS interface stopped supporting UDP way back when obfuscation was added, and now that we're switching to Shadowsocks, we can use its UDP support and drop this.

But UDP support in SOCKS proxies is rare, so if we ever switch away from Shadowsocks, we might need something like this again. Is that right?

If so, maybe worth tagging the revision of master before this gets zapped, and adding some docs somewhere about this, with a link to that tag?

Copy link
Contributor Author

@alalamav alalamav Apr 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will do that. There is also a branch that has an optimized version of the resolver.

* @section DESCRIPTION
*
*
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is missing link(s) to the relevant spec(s) for new contributors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean the SOCKS RFC? This is also a code issue that affects the whole repo - maybe worth creating an issue instead of resolving in this PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, SOCKS, and maybe DNS would make sense too? Added an item to the TODO list in #14.

@@ -226,7 +228,8 @@ static void init_arguments (const char* program_name);
// ==== PSIPHON ====

//==== UPROXY ====
BAddr dns_server_address; // IP address of local DNS resolver
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The references to Psiphon in this file are a smell. Are they kept so Psiphon can apply changesets from our fork more easily? If so brb crying

(I know you mentioned that tun2socks has been split off into a multitude of divergent forks, each maintained by different projects for their own needs, and no one is maintaining a canonical version. From what you were saying, it sounds like things could be refactored such that there is once again a maintained canonical version, which other projects can consume, contribute improvements back to, and apply any only-useful-to-them modifications via some kind of supported hook / plugin structure, is that right? And maybe we could contribute this to the community? Or sponsor a GSoC project for it? Worth opening a bug to scope this further, and linking to it from the README or a TODO comment?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We forked tun2socks from Psiphon, there are in the source for purely historic reasons.
Indeed it would be useful to have a single repo with all the changes, we are scoping the amount of work that would be required to achieve something like this.

I am inclined to keep this initiative separate from this repo and instead track it internally.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. So is it valuable to keep the //===== PSIPHON ====== comments in the code? They might add noise/distraction/one extra thing to look up for new contributors. If a new contributor shouldn't have to know what Psiphon is to start being productive, might be worth removing at some point (i.e. we can add this to #15). Hope I'm not being too aggressive with my new-contributor hat!

@@ -294,8 +297,13 @@ static void udp_fd_handler(UdpPcb* udp_pcb, int event) {
return;
}

struct socks_udp_header* header = (struct socks_udp_header*)udp_pcb->udp_recv_buffer;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "pcb" (from udp_pcb) stand for? I checked the UdpPcb typedef above and couldn't tell. It looks like it's a global storing contextual data for the current udp packet, is that right? Could this benefit from a more self-documenting name, and/or explanatory comments where the struct is typedef'd? And likewise for the members?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added comments for now. This code will be refactored and put it its own header file once we design a full UDP solution.

@@ -294,8 +297,13 @@ static void udp_fd_handler(UdpPcb* udp_pcb, int event) {
return;
}

struct socks_udp_header* header = (struct socks_udp_header*)udp_pcb->udp_recv_buffer;
size_t socks_udp_header_len = sizeof(struct socks_udp_header);
size_t udp_data_len = recv_bytes - socks_udp_header_len;
Copy link
Collaborator

@jab jab Apr 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All we know (from the check above) is that recv_bytes > 0. Without an additional check that recv_bytes > socks_udp_header_len, udp_data_len could end up negative or zero, right?

Also, looking above, it looks like udp_recv copies up to UDP_MAX_DATAGRAM_BYTES bytes from sockfd into udp_recv_buffer, and returns the number of bytes copied/received. Do we have to take any additional care to zero out the remainder of the buffer? Are there any better abstractions, practices, or tools we should be using with this code in general to make it more memory-safe? (Some OOP, static analyzer, fuzzer, etc. to prevent or catch memory-safety bugs?)

(Also, it looks like the names of other variables that store a count of bytes all end in _len, except for recv_bytes. Would something like udp_datagram_len be a better name for this than recv_bytes? (Btw I don't care much what we standardize on for count variables (n, num, count, len), but I do vote that we standardize on something as a team).)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Good catch.

if (!options.dns_resolver_address) {
BLog(BLOG_ERROR,
"transparentDNS requires a DNS resolver address");
return 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like in some places (below) where this was returning 0, this changeset replaces that with goto fail (which just does return 0), but not here. Would it be clearer to instead have constants like ERRCODE = 0 and SUCCESSCODE = 1, and replace all the existing return 0, return 1, and goto fail statements with return ERRCODE or return SUCCESSCODE?

(I also noticed this is the opposite of process / main function return codes, where 0 is success and nonzero is failure. Worth flipping?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, we can open a issue for this since it is not immediately related to the current PR and instead (likely) affects the whole project.

@@ -1477,22 +1500,35 @@ int process_device_udp_packet (uint8_t *data, int data_len)
BAddr_Print(&local_addr, local_addr_str);
char remote_addr_str[BADDR_MAX_PRINT_LEN];
BAddr_Print(&remote_addr, remote_addr_str);
BLog(BLOG_INFO, "UDP: %s -> %s. DNS: %d", local_addr_str, remote_addr_str, is_dns);
Copy link
Collaborator

@jab jab Apr 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expanding the folded code above, I see the switch (ip_version), for only two meaningful cases (4 and 6), both of which do similar things (parse header, parse udp, check integrity). I know this is less frowned upon in C, but is there an idiomatic way to refactor this into a less repetitive and procedural spaghetti style? (Btw, the original switch statement was found to have been misdesigned. Thankfully they're redesigned in Swift and other modern languages. If we're going to use them in C, all the more reason to set up a linter to detect and warn on possibly-unintentional fall-through.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should look at this in depth when we start work to support IPv6. Create an issue until then?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK! Added to TODO list in #15.

@alalamav
Copy link
Contributor Author

alalamav commented Apr 3, 2017

Thank you for the review @jab! You brought up important code health issues, as well as best development practices. Overall, I think we should create issues for these and instead of fixing them in this PR.

@alalamav alalamav mentioned this pull request Apr 4, 2017
13 tasks
Copy link
Collaborator

@jab jab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 ❗️ Awesome stuff @albertolalama. Over to you @trevj 😉

Copy link
Contributor

@trevj trevj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I can see there's a bunch of work to be done generally making this easier for new contributors so I stuck to just adding a few comments on the new code.

@@ -128,7 +129,8 @@ struct {
int set_signal;
// ==== PSIPHON ====
// ==== UPROXY ====
char *dns_server_address;
char *dns_resolver_address;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Elsewhere in this struct address becomes addr - rename to dns_resolver_addr and udp_relay_addr?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -128,7 +129,8 @@ struct {
int set_signal;
// ==== PSIPHON ====
// ==== UPROXY ====
char *dns_server_address;
char *dns_resolver_address;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely worth adding some comments here. Am I right to say:

  • dns_resolver_address: address of the DNS server to which all DNS requests will be forwarded
  • udp_relay_address: address to which all SOCKS UDP requests will be sent (NOTE: this implementation does not make UDP ASSOCIATE requests)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


init_arguments("uProxy tun2socks");

options.netif_ipaddr = (char*)vpnIpAddressStr;
options.netif_netmask = (char*)vpnNetMaskStr;
options.socks_server_addr = (char*)socksServerAddressStr;
options.dns_server_address = (char*)dnsServerAddressStr;
options.dns_resolver_address = (char*)dnsResolverAddressStr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about udp_relay_address?

Copy link
Contributor Author

@alalamav alalamav Apr 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you know, the UDP relay address is the same as the SOCKS server. I explicitly added an option for this in the JNI.

@alalamav alalamav merged commit 9e87217 into master Apr 4, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants