Implementation of a Vector Space Retrieval Model using TF-IDF and cosine similarity on the Cranfield document corpus
Updated Mar 29, 2020 - Jupyter Notebook
Performs tokenization, stemming, lemmatization, index creation, index compression and ranked retrieval of Cranfield documents
An information retrieval system with 3 models and 3 datasets from the ir_datasets library.
Implementation of the Salton and Buckley paper using two methods: TF-IDF and Best Weighted Probabilistic
An advanced version of the previously implemented search engine, which acts as an information retrieval system over the Cranfield collection of 1400 documents and also uses a stemming algorithm. Otherwise it is largely the same as the earlier SearchEngine project.
Python-based information retrieval system built on the probabilistic Binary Independence Model (BIM). Features include free-form text queries, relevance feedback, and pseudo-relevance feedback. Performance is evaluated using precision/recall, mean average precision, and R-precision. Uses the standard Cranfield dataset from aerodynamics.
🔍 A Lucene demo for searching the Cranfield collection.
Information Retrieval on Cranfield 1400 and NF-Corpus with Vector Space Model and Query Likelihood Model
Lucene search engine for the Cranfield collection
Assignment from my MAI module 'Information Retrieval and Web Search' where I index the Cranfield Collection
Search Engine for the Cranfield Collection
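Several of the projects above rank Cranfield documents with TF-IDF weights and cosine similarity. A minimal sketch of that idea follows, using only the Python standard library. It is not taken from any of the listed repositories: the function names (`tokenize`, `build_tfidf`, `cosine`, `rank`), the whitespace tokenizer, the plain `log(N / df)` IDF variant, and the three sample sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; real systems add stemming, stop words, etc.
    return text.lower().split()


def build_tfidf(docs):
    """Return one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Assumption: unsmoothed IDF, log(N / df).
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return (doc_index, score) pairs sorted by descending cosine similarity."""
    vectors, idf = build_tfidf(docs)
    # Query terms that never occur in the corpus get weight 0.
    query_vec = {
        term: tf * idf.get(term, 0.0)
        for term, tf in Counter(tokenize(query)).items()
    }
    scores = [(i, cosine(query_vec, vec)) for i, vec in enumerate(vectors)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample texts, not actual Cranfield documents.
sample_docs = [
    "boundary layer flow over a flat plate",
    "heat transfer in supersonic flow",
    "structural analysis of wing panels",
]
ranked = rank("boundary layer", sample_docs)
```

Here `ranked` lists the first sample text at the top, since it is the only one containing the query terms. A real Cranfield setup would parse the collection files and typically apply stemming or lemmatization before weighting.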