Retrieving Wikipedia articles

In this task, we focused on using nearest neighbors and clustering to retrieve documents that may interest a user, based on an analysis of the documents' text. We explored two document representations, word counts and TF-IDF, and built a Jupyter notebook for retrieving Wikipedia articles about famous people.

We then dug deeper into this application: we compared the results obtained with word counts and with TF-IDF, explored the retrieval results for various famous people, and familiarized ourselves with the code needed to build a retrieval system.
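To make the difference between the two representations concrete, here is a minimal sketch in plain Python. It is not the notebook's code: the three toy documents, the helper names (`word_counts`, `tf_idf`, `cosine_distance`, `nearest`), and the unsmoothed `log(N / df)` weighting are all assumptions chosen for illustration, and real libraries often use slightly different TF-IDF formulas.

```python
import math
from collections import Counter

def word_counts(text):
    """Bag-of-words representation: lowercase token -> raw count."""
    return Counter(text.lower().split())

def tf_idf(docs):
    """Weight each count by log(N / document frequency) (assumed formula)."""
    counts = [word_counts(d) for d in docs]
    n_docs = len(docs)
    # Number of documents in which each word appears at least once.
    doc_freq = Counter(word for c in counts for word in c)
    return [
        {w: tf * math.log(n_docs / doc_freq[w]) for w, tf in c.items()}
        for c in counts
    ]

def cosine_distance(a, b):
    """1 - cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

def nearest(query_idx, vectors):
    """Indices of all other documents, closest to the query first."""
    others = [i for i in range(len(vectors)) if i != query_idx]
    return sorted(
        others,
        key=lambda i: cosine_distance(vectors[query_idx], vectors[i]),
    )

# Toy corpus (hypothetical, not taken from the Wikipedia data).
docs = [
    "the president of the united states gave the speech",
    "the singer of the band played the song on the radio in the evening",
    "the president signed a law",
]
count_vectors = [word_counts(d) for d in docs]
tfidf_vectors = tf_idf(docs)

print(nearest(0, count_vectors))  # [1, 2]
print(nearest(0, tfidf_vectors))  # [2, 1]
```

With raw counts, document 1 looks closest to document 0 mainly because both repeat "the" many times. Under TF-IDF, "the" appears in every document and gets zero weight, so the shared word "president" makes document 2 the nearest neighbor instead.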

Data: people_wiki.sframe (if you are using pandas and scikit-learn instead, read people_wiki.csv)

Code: Retrieving Wikipedia articles.ipynb
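
For readers taking the pandas and scikit-learn route, the following is a rough sketch of a TF-IDF nearest-neighbor lookup. It is not the notebook's implementation. The column names `name` and `text`, the helper names `build_retriever` and `similar_people`, and the three-row stand-in table are assumptions; check the actual columns of people_wiki.csv before reusing it.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

def build_retriever(people, text_column="text"):
    """Fit a TF-IDF matrix and a brute-force cosine nearest-neighbor model."""
    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform(people[text_column])
    model = NearestNeighbors(metric="cosine", algorithm="brute")
    model.fit(matrix)
    return model, matrix

def similar_people(people, model, matrix, name, k=2, name_column="name"):
    """Return up to k people whose articles are closest to the named person's."""
    names = people[name_column].to_numpy()
    positions = np.flatnonzero(names == name)
    if len(positions) == 0:
        raise KeyError(f"no row with {name_column} == {name!r}")
    row = positions[0]
    # Ask for one extra neighbor because the query article matches itself.
    n = min(k + 1, matrix.shape[0])
    distances, indices = model.kneighbors(matrix[row], n_neighbors=n)
    keep = indices[0] != row
    result = pd.DataFrame({
        name_column: names[indices[0][keep]],
        "distance": distances[0][keep],
    })
    return result.head(k)

# Stand-in data with assumed columns; with the real file this would be
# something like: people = pd.read_csv("people_wiki.csv")
people = pd.DataFrame({
    "name": ["Person A", "Person B", "Person C"],
    "text": [
        "american politician and senator from illinois",
        "american politician and governor from texas",
        "english singer songwriter and guitarist",
    ],
})
model, matrix = build_retriever(people)
print(similar_people(people, model, matrix, "Person A", k=1))
```

Brute-force search with cosine distance is used here because it works directly on the sparse TF-IDF matrix; other metrics or index structures may be a better fit for the full dataset.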

chunwangpro/A-case-study-retrieving-and-measuring-similarity-on-Wikipedia-articles
