🌬️urlExpander is a Python package for expanding shortened links (urls).

AhmedBytesBits/urlExpander

 
 

urlExpander

urlExpander is a Python package for quickly and thoroughly expanding shortened URLs.

About

urlExpander is intended for social media researchers who want to analyze links.

Analytics and ad-based services make such analysis difficult. Aside from collecting in-depth user engagement data, these services obfuscate the destination of the shortened URLs.

urlExpander was created to address this challenge in a scalable and robust manner. It does so by providing utility functions to convert Tweets into link datasets, filter for known link-shortening services (like bit.ly), resolve shortened links, and parse the title and meta description from webpages.
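To make the filtering step concrete, here is a minimal sketch of how links from known shortening services might be picked out of a mixed list. It uses only the standard library; `KNOWN_SHORTENERS`, `host_of`, and `looks_shortened` are hypothetical names for illustration, not urlExpander's own API, and the domain set is deliberately tiny.

```python
from urllib.parse import urlparse

# Hypothetical, deliberately short list of shortener domains (illustration only).
KNOWN_SHORTENERS = {"bit.ly", "trib.al", "adf.ly"}


def host_of(url):
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def looks_shortened(url, shorteners=KNOWN_SHORTENERS):
    """True when the URL's host is in the set of known shorteners."""
    return host_of(url) in shorteners


links = [
    "https://bit.ly/2abcXYZ",
    "https://www.nytimes.com/section/politics",
    "https://trib.al/xXI5ruM",
]

# Keep only the links that still need to be resolved.
short_links = [u for u in links if looks_shortened(u)]
```

Filtering first means that only links from shortening services are sent on to the slower resolution step.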

This package differs from other approaches because it handles ad-based URLs (like adf.ly, lnx.lu, linkbucks.com, and adfoc.us) thanks to the Unshortenit library, and it resolves redirects to defunct websites (like blacktolive.com). Most importantly, urlExpander offers multithreaded URL expansion.

The multithreaded URL expansion was created to overcome the bottleneck of mass link expansion through parallelization, minimizing HTTP requests, caching results, and chunking the input into smaller pieces.
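The four ideas above can be sketched roughly as follows. This is not urlExpander's implementation, only an illustration under stated assumptions: `expand_all`, `chunks`, and `fake_resolve` are hypothetical names, the resolver is a stand-in that makes no network calls, and the JSON cache layout is invented for the example.

```python
import json
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def chunks(items, size):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def expand_all(urls, resolve, chunksize=2, n_workers=4, cache_file=None):
    """Resolve each distinct URL once, one chunk at a time, caching to JSON."""
    cache = {}
    if cache_file and os.path.exists(cache_file):
        with open(cache_file) as f:
            cache = json.load(f)
    # Fewer requests: drop duplicates (keeping order) and skip cached URLs.
    todo = [u for u in dict.fromkeys(urls) if u not in cache]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for chunk in chunks(todo, chunksize):
            # Parallelization: the URLs in a chunk are resolved concurrently.
            cache.update(zip(chunk, pool.map(resolve, chunk)))
            if cache_file:
                # Save after every chunk so an interrupted run can resume.
                with open(cache_file, "w") as f:
                    json.dump(cache, f)
    return [cache[u] for u in urls]


# Stand-in resolver (assumption): a real one would issue an HTTP request.
calls = []


def fake_resolve(url):
    calls.append(url)
    return url.replace("sho.rt", "example.com")


short_urls = ["https://sho.rt/a", "https://sho.rt/b", "https://sho.rt/a"]
cache_path = os.path.join(tempfile.mkdtemp(), "cache.json")
resolved = expand_all(short_urls, fake_resolve, cache_file=cache_path)
```

In this sketch the duplicated link is resolved only once, and a second run with the same cache file would call the resolver for no URL at all.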

Installation

pip install urlexpander

Quickstart

import urlexpander
urlexpander.expand('https://trib.al/xXI5ruM')

returns

'https://www.breitbart.com/video/2017/12/31/lindsey-graham-trump-just-cant-tweet-iran/'

The package shines when given a massive list of URLs to unshorten:

resolved_links = urlexpander.multithread_expand(list_of_short_urls, 
                                                chunksize=1280, 
                                                n_workers=64,
                                                cache_file='tmp.json')

Check out this Jupyter Notebook for a more in-depth quickstart!

Documentation

We'll generate a readthedocs shortly!

Acknowledgements

urlExpander was written by Leon Yin with contributions by Nicole Baram and Gregory Eady for the Social Media and Political Participation Lab at NYU.

Please cite urlExpander in your publications if it helps your research. Here is an example BibTeX entry:

@misc{leon_yin_2018_1345144,
  author       = {Leon Yin},
  title        = {SMAPPNYU/urlExpander: Initial release},
  month        = aug,
  year         = 2018,
  doi          = {10.5281/zenodo.1345144},
  url          = {https://doi.org/10.5281/zenodo.1345144}
}

Please also send us your work :)

Research Output

urlExpander is being used in several forthcoming publications from the SMaPP Lab (and perhaps from you?). We'll keep a running tally here.
