This repository has been archived by the owner on Mar 14, 2024. It is now read-only.
forked from shoe/free_cite

Parse citations from plain text strings or HTML


academia-edu/free_cite

 
 

Excite

Provides a simple Ruby API for parsing citations from plain text strings or HTML.

Usage

  require 'excite'

  Excite.parse_string("Wilcox, Rhonda V. 1991. Shifting roles and synthetic women in Star trek: The next generation. Studies in Popular Culture 13 (June): 53-65.")

  Excite.parse_html("<span>Devine, PG, & Sherman, SJ</span><span>(1992)</span><strong>Intuitive versus rational judgment and the role of stereotyping in the human condition: Kirk or Spock?</strong><em>Psychological Inquiry</em><span>3(2), 153-159</span>")
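When many citations need parsing, it can help to wrap the call so that one bad input does not stop the batch. The following is a minimal sketch, not part of Excite: `parse_citations` is a hypothetical helper, and the stand-in lambda only echoes its input, so it says nothing about the structure Excite actually returns. With the gem installed, `Excite.method(:parse_string)` could be passed as the parser instead.

```ruby
# Hypothetical helper (not part of Excite): runs any callable parser over a
# list of citation strings and separates successes from failures.
def parse_citations(lines, parser)
  results = { parsed: [], failed: [] }
  lines.each do |line|
    text = line.strip
    next if text.empty? # skip blank lines

    begin
      results[:parsed] << { input: text, output: parser.call(text) }
    rescue StandardError => e
      # Record the failure and keep going with the remaining citations.
      results[:failed] << { input: text, error: e.message }
    end
  end
  results
end

# With the gem installed, the parser would be the entry point shown above:
#   require 'excite'
#   parser = Excite.method(:parse_string)
# The stand-in below is an assumption that keeps the sketch runnable without
# the gem; its return value does NOT reflect Excite's real output.
parser = ->(text) { { 'raw_string' => text } }

citations = [
  'Wilcox, Rhonda V. 1991. Shifting roles and synthetic women in Star trek: The next generation. Studies in Popular Culture 13 (June): 53-65.',
  ''
]

results = parse_citations(citations, parser)
puts "parsed: #{results[:parsed].size}, failed: #{results[:failed].size}"
```

Passing the parser in as an argument keeps the helper independent of the gem, so the same loop would also work with `Excite.method(:parse_html)` for HTML fragments.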

History and Credits

Derived from FreeCite, minus Rails and all UI elements. The most up-to-date fork of FreeCite of which I am aware is rsinger's. FreeCite in turn is inspired by ParsCit.

The main changes are:

  • No UI, just a gem;
  • New model for parsing HTML;
  • Tokenization and part-of-speech features from EngTagger.

Credit is due to the authors of all the linked projects, as well as to Laura Durkay, who marked up the HTML training data.

Install required packages

From source

wget http://crfpp.googlecode.com/files/CRF%2B%2B-0.57.tar.gz
tar xvzf CRF++-0.57.tar.gz
cd CRF++-0.57
./configure
make
sudo make install

On Ubuntu

sudo apt-add-repository 'deb http://cl.naist.jp/~eric-n/ubuntu-nlp oneiric all'
sudo apt-get update
sudo apt-get install libcrf++

On OS X with Homebrew

brew install crf++
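
After installing, a quick check that the CRF++ command-line tools are reachable can save debugging later. This is a rough sketch that assumes the install put the `crf_learn` and `crf_test` binaries on your `PATH`, as a source build of CRF++ normally does; a library-only package may not include them. `which` is a hypothetical helper defined here, not something Excite provides.

```ruby
# Hypothetical helper: returns the full path of the first executable called
# `name` found in the given PATH-style string, or nil if there is none.
def which(name, path = ENV.fetch('PATH', ''))
  path.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Assumption: these are the binaries installed by a CRF++ build.
%w[crf_learn crf_test].each do |tool|
  location = which(tool)
  puts location ? "#{tool}: #{location}" : "#{tool}: not found on PATH"
end
```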
