Resolve URL redirects where possible #62

Open
conorgil opened this issue May 16, 2018 · 8 comments
Labels
data Data sources, cleanliness, etc enhancement New feature or request

Comments

@conorgil
Owner

Identify existing URLs that redirect to some other domain. The data includes both a service name and a service URL. The URL should be the fully resolved domain where possible so that tools like 2FAN can take action on it. The service name (not the URL) can be displayed in the UI if necessary, so the fact that the URL might point somewhere "weird" looking isn't really a problem if that is the true destination of the redirects.

Relates to #42

@conorgil conorgil added enhancement New feature or request data Data sources, cleanliness, etc labels May 16, 2018
@designedbinary designedbinary added this to the June 2018 Release milestone May 22, 2018
@kenman345

You can run HTML Proofer, either on Travis CI or locally, against the repository that this extension uses for its data (twofactorauth/twofactorauth). Once it has run through and indicated which URLs redirect, you can suggest the changes to them. The issue is that some sites will redirect based on the location/region of the IP making the request.

@conorgil
Owner Author

@kenman345 thanks for the thoughts! I had not considered the scenario where the redirect is geo-based. That one will be a bit trickier to figure out.

Certainly, we can automate this using CI, but first we need to figure out the logic for determining what URL to choose as the "resolved" URL.

I took some notes locally that I should share here for transparency and backup:

  • GOAL: fully resolve all origin URLs so that the extension can match on the origin
    • fetch each URL while following redirects. Set the redirect max to something
      large, like 30
  • for each request (recurse)
    • if you get a 200, then you're done
    • if you get a permanent redirect, then follow it
    • if you get a temporary redirect, then
      • if the Location header contains the expected URL, then that might indicate
        that it is trying to redirect us to a login page, so stop following redirects
        and set the URL to the current value. (The Location header would contain a
        URL that has the expected URL as a redirect query param. This heuristic
        might not hold every time, so the results need to be manually reviewed)
      • otherwise, follow the redirect
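
The notes above could be sketched roughly as follows. This is only an illustration, not code from the project: `resolve`, `looks_like_login_redirect`, and the injected `fetch` callable (which takes a URL and returns a `(status, location)` pair without following redirects) are hypothetical names. It also assumes that "the expected URL" means the URL currently being requested, and that the login heuristic only looks at the query parameters of the Location header.

```python
from urllib.parse import parse_qsl, urljoin, urlsplit

PERMANENT = {301, 308}
TEMPORARY = {302, 303, 307}


def looks_like_login_redirect(location, expected):
    """True if `expected` appears inside a query parameter value of `location`."""
    # parse_qsl percent-decodes values, so ?next=https%3A%2F%2F... is handled.
    params = parse_qsl(urlsplit(location).query)
    return any(expected in value for _name, value in params)


def resolve(url, fetch, max_redirects=30):
    """Follow redirects from `url`; return (resolved_url, needs_review)."""
    current = url
    for _ in range(max_redirects):
        status, location = fetch(current)
        if status == 200:
            return current, False
        if status in PERMANENT and location:
            current = urljoin(current, location)
        elif status in TEMPORARY and location:
            target = urljoin(current, location)
            if looks_like_login_redirect(target, current):
                # Probably a bounce to a login page: keep the current URL,
                # but flag it, because the heuristic can misfire.
                return current, True
            current = target
        else:
            # An error status, or a redirect without a Location header.
            return current, True
    # Redirect limit reached (for example, a redirect loop).
    return current, True
```

Passing `fetch` in as a parameter keeps the decision logic separate from the HTTP client, so the same function could be driven by a real client with redirects disabled or by canned responses in a test.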

New idea for handling geo-based redirects is to run the above logic on a few servers around the world and compare results. If the resolved URLs match on all of them, then we can have great confidence that is the correct answer. If they vary based on geo, then ???? do something.
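
A minimal sketch of the comparison step, assuming the resolved URL for one service has already been collected from each location. The function name `compare_regions` and the region labels are made up for illustration, and returning `None` when the results disagree is only a placeholder for the undecided "do something" case.

```python
def compare_regions(resolved_by_region):
    """Given a mapping of region -> resolved URL, return (url, agreed)."""
    distinct = set(resolved_by_region.values())
    if len(distinct) == 1:
        # Every location resolved to the same URL.
        return distinct.pop(), True
    # Results vary by location (or there are none): leave for manual review.
    return None, False
```

For example, `compare_regions({"us-east": "https://example.com/", "eu-west": "https://example.com/"})` returns the shared URL together with `True`.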

@kenman345

I think you should only focus on the URL resolution that would happen at the DNS server level. What's the hostname? Work from there. Perhaps the hostname would even include the subdomain as a unique type, but otherwise, the URLs should be only what DNS needs, and the hosting of the site can redirect appropriately.

@conorgil
Owner Author

@kenman345 I don't understand your comment in the context of the 2FA Notifier web extension.

Currently, the web extension matches on the URL of the current tab. If the current site supports 2FA, then it pops a notification. If the URL does not match (which is what happens when the site redirects), then it does nothing.

Can you explain your idea in more detail? How does going down to the DNS level help here? Are you suggesting that the extension not use URL matching and instead do something at the DNS level?

The other thing to consider is the permissions that the extension requires. I only use the bare minimum permissions to make the extension function correctly. I believe that monitoring every request in/out of the browser requires more permission than it currently has.

Sorry I'm not following. Throw more words at me and I'll see if I get it.

@kenman345

I hadn't dug into everything, so disregard my previous concerns. Your resolved domains after redirects should match the FQDN in 2FA's data.json, or else the data should be corrected on 2FA's side. Unfortunately, 2FA has a long, slow merge and deploy cycle, so it may be best to figure out other solutions around the problem.

Also, think you can help me get this project running? Your readme leaves much to be desired in terms of getting it running locally. I would like to assist on the extension and perhaps fork it for a similar extension related to the AcceptBitcoin.Cash site, which is a fork of the 2FA site's structure. In our case, we update our listings quickly and run new deployments to the site once a week. Also, I have updated the data.json for our site to include a "generated" timestamp in the JSON, and since we have multiple pages, we have the data.json also reflect that.

@conorgil
Owner Author

@kenman345 I plan to use twofactorauth.org as a data feed, but do plan to make plenty of changes to the data for use in 2FA Notifier. Check out this HN thread for some of my comments on that topic.

I would LOVE to help you get the project running. Sorry the docs are not up to snuff yet. That is a huge pet peeve of mine and I should just write better docs, but haven't had the time yet. Check out CONTRIBUTING.md and let me know if that is enough to get up and running locally. No automated tests yet :( but, I created an issue for that because we really need them (#70).

I'd love to chat offline about what you're working on. Shoot me an email if you want to connect.

@kenman345

I messaged you on Twitter; I didn't know your email.

@conorgil
Owner Author

Idea: The output of the "resolve URLs" cleanup should be way more detailed so that we have enough information to run the process on a cron job and update data accordingly.

  • if the redirect is a temporary 302, then put that into the output so that 2FA Notifier can match on the domain. This URL will get updated each time the URLs are resolved, so if it changes, then the change will be reflected in the data used by 2FA Notifier and all will be well.
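
One possible shape for that more detailed output is sketched below. Nothing here reflects an existing format in the project: the `ResolvedEntry` record, its field names, and the `changed_entries` helper are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ResolvedEntry:
    service_name: str
    source_url: str     # URL as listed in the upstream data
    resolved_url: str   # URL after following redirects
    redirect_type: str  # "none", "permanent", or "temporary"
    checked_at: str     # ISO 8601 timestamp of this run


def changed_entries(previous, latest):
    """Return entries in `latest` whose resolved URL is new or differs from the last run."""
    old = {e.source_url: e.resolved_url for e in previous}
    return [e for e in latest if old.get(e.source_url) != e.resolved_url]


def to_json(entries):
    """Serialize entries so a scheduled job can store and diff them."""
    return json.dumps([asdict(e) for e in entries], indent=2)
```

Recording `redirect_type` alongside the resolved URL would let a scheduled run tell which entries came from temporary redirects and are therefore most likely to change between runs.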

3 participants