This directory holds two example data sets. The tutorial exercises may be completed using either data set.
- yso-nlf consists of the trilingual General Finnish Ontology YSO, a training data set constructed from metadata records from the Finna.fi discovery service, and about 2,000 English-language Master's and doctoral theses from the University of Jyväskylä.
- stw-zbw contains the STW thesaurus for economics, metadata used in the ZBW retrieval system EconBiz, and full texts of working papers in economics uploaded to EconStor.

In addition, hogwarts is a toy data set from the world of Harry Potter. It is used in an optional tutorial exercise.