ACL trend survey

Requirements

  • Python 2.7

  • Install scrapy

      $ pip install scrapy
    

Run

Data crawling

  • Set the year and journal in crawler/crawler/settings.py. (Only ACL has been tried; other proceedings and journals are untested.)

      $ cd crawler
      $ scrapy crawl acl -o items.csv -t csv
      $ scrapy crawl acl -o items.json -t json
    
  • Be careful when running the crawl twice: the JSON file is appended to rather than overwritten. Delete or rename the old output file before re-running.
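
A minimal sketch of one way to avoid the append problem, assuming the output file is named items.json as in the commands above. The helper `remove_stale_output` is hypothetical and not part of this repository; it only deletes a leftover file so the next crawl starts from an empty one.

```python
import os


def remove_stale_output(path):
    """Delete a previous crawl output file, if one exists.

    Hypothetical helper (not part of this repo).
    Returns True if a file was removed, False if there was nothing to remove.
    """
    if os.path.exists(path):
        os.remove(path)
        return True
    return False


if __name__ == "__main__":
    # Assumed file names, taken from the crawl commands above.
    for name in ("items.json", "items.csv"):
        if remove_stale_output(name):
            print("removed old " + name)
```

Run it from the crawler directory before `scrapy crawl acl ...` if an older output file may still be there.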

Calculate frequent authors

$ python count.py
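
count.py itself is not reproduced here. As a rough illustration of what counting frequent authors can look like, here is a sketch that assumes items.json holds a JSON list of objects, each with an `authors` field containing a list of names. The field name and both function names are assumptions, not the actual schema or code of this repo.

```python
import json
from collections import Counter


def load_items(path):
    """Load crawled items from a JSON file (assumed to be a list of objects)."""
    with open(path) as f:
        return json.load(f)


def count_authors(items, field="authors"):
    """Count how often each author name appears across all items.

    `field` is an assumed key; adjust it to match the crawler's item definition.
    """
    counts = Counter()
    for item in items:
        for name in item.get(field, []):
            counts[name.strip()] += 1
    return counts


if __name__ == "__main__":
    counts = count_authors(load_items("items.json"))
    # Print the 20 most frequent authors, most frequent first.
    for name, n in counts.most_common(20):
        print("%s\t%d" % (name, n))
```

`Counter.most_common` returns names sorted by descending count, so the top of the output is the most frequent author.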

Author

License