josepvalls/twit-ranker

 
 

twit-rank

CIS400 Final Project

swn3.txt is the SentiWordNet database file. It cannot be distributed publicly without permission from its creators.

Filter

Accessing the Word Frequency List

To access the word frequency list, unpickle the file wordfreq.pkl. The list is stored as a dictionary that maps each word (the key) to a WordFreq value. WordFreq is a namedtuple, declared as:

from collections import namedtuple
WordFreq = namedtuple('WordFreq', ['rank', 'pos', 'frequency', 'dispersion'])
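A minimal sketch of loading and querying the list might look like the following. It assumes wordfreq.pkl is in the current directory, that it was written by a script which defined WordFreq at module top level, and that the frequency field is numeric. load_word_frequencies and top_words are hypothetical helper names used only for illustration, not part of this repository.

```python
import pickle
from collections import namedtuple

# WordFreq must be defined under this exact name before unpickling,
# so that pickle can resolve the class referenced inside the file.
WordFreq = namedtuple('WordFreq', ['rank', 'pos', 'frequency', 'dispersion'])


def load_word_frequencies(path='wordfreq.pkl'):
    """Return the {word: WordFreq} dictionary stored in the pickle file.

    Hypothetical helper; the default path is an assumption.
    """
    with open(path, 'rb') as handle:
        return pickle.load(handle)


def top_words(word_freqs, n=10):
    """Return the n words with the highest frequency value.

    Hypothetical helper; assumes the frequency field is numeric.
    """
    ranked = sorted(word_freqs.items(),
                    key=lambda item: item[1].frequency,
                    reverse=True)
    return [word for word, _ in ranked[:n]]
```

With the file in place, `word_freqs = load_word_frequencies()` followed by `word_freqs['some_word'].rank` would read a single field of an entry, and `top_words(word_freqs, 5)` would list the five highest-frequency words. If the pickle was produced by an older Python version, loading it may need extra arguments to pickle.load.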

Wikipedia Article Frequency List

To access the frequency list of words in a Wikipedia article, go to ./filter/wiki_freq/ and instantiate the wiki_freq class with the title of the article. The dictionary of frequencies is then available as the attribute d of that object, i.e. objectName.d.
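As a rough illustration of working with such a dictionary, the sketch below converts raw counts into relative frequencies. It assumes the dictionary maps each word to a numeric count, which is not confirmed here. relative_frequencies is a hypothetical helper, and the import path and article title in the trailing comment are assumptions too.

```python
def relative_frequencies(counts):
    """Convert a {word: count} dictionary into {word: share of total}.

    Hypothetical helper; assumes every value is a non-negative number.
    """
    total = float(sum(counts.values()))
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


# Assumed usage with the class described above (not verified):
#   from wiki_freq import wiki_freq
#   article = wiki_freq('Some article title')
#   shares = relative_frequencies(article.d)
```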

Language Identifier

To use the language identifier, see the README in the guess_language folder.

Ranker

SentiWordNet

swn3.txt is the SentiWordNet database file. It cannot be distributed publicly without permission from its creators.

Languages

  • Perl 39.7%
  • Python 17.7%
  • Java 15.9%
  • C++ 15.3%
  • C 11.3%
  • M 0.1%