Releases: BramVanroy/spacy_conll
Documentation spacy-stanfordnlp, custom tagset map
The documentation has been greatly expanded. The most important addition to the README is the mention and explanation of using spacy-stanfordnlp. spacy_conll can be used together with this spaCy wrapper around stanfordnlp. The benefit is that we can use Stanford models, with a spaCy interface. From a user perspective, this means better models, guaranteed Universal Dependencies tagsets, and an easy API through spaCy. (The cost is that Stanford NLP models are significantly slower than spaCy's models.) Small tests for spacy_stanfordnlp have been added.
A new feature is that you can now add a custom tagset map (conversion_maps). The idea is that you, as a user, have more control over the output tags. You can for instance specify that all deprel tags nsubj should be renamed to subj. This is useful if your model uses a different tagset than you want. See the advanced example in the README for more information.
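To illustrate the idea of a tagset map, here is a minimal, library-independent sketch in plain Python. It does not call spacy_conll itself. The nested dictionary shape (field name, then old tag to new tag), the helper apply_conversion_maps, and the sample rows are all assumptions made for illustration; the README's advanced example remains the reference for how conversion_maps is actually passed to the component.

```python
from typing import Dict, List

# Hypothetical shape for a tagset map: field name -> {old tag: new tag}.
ConversionMaps = Dict[str, Dict[str, str]]


def apply_conversion_maps(
    rows: List[Dict[str, str]], conversion_maps: ConversionMaps
) -> List[Dict[str, str]]:
    """Return copies of the token rows with mapped tags renamed."""
    converted = []
    for row in rows:
        new_row = dict(row)  # copy, so the input rows stay untouched
        for field, tag_map in conversion_maps.items():
            if field in new_row:
                # Tags without an entry in the map are kept unchanged.
                new_row[field] = tag_map.get(new_row[field], new_row[field])
        converted.append(new_row)
    return converted


# Illustrative token rows (made up for this sketch, not real parser output).
rows = [
    {"form": "She", "upos": "PRON", "deprel": "nsubj"},
    {"form": "sleeps", "upos": "VERB", "deprel": "ROOT"},
]

# Rename every deprel tag "nsubj" to "subj", as in the example above.
renamed = apply_conversion_maps(rows, {"deprel": {"nsubj": "subj"}})
```

Only tags listed in the map change; everything else passes through as the model produced it, which is why a small map is enough to adjust a handful of labels.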
This release closes:
- "The dependency relations aren't transformed to universal dependencies" (#4)
Add dependencies to setup.py
This small release adds the dependencies to setup.py, solving potential issues (e.g. #3).
Current dependencies are:
- packaging
- spacy
spaCy pipeline component, improved command line script with multiprocessing
This small repo has been overhauled so that users can integrate it directly in their spaCy scripts. You can now use it as a spaCy pipeline component. Three custom attributes have been added to Doc._. and to a Doc's sentences. You can find more information, as well as example usage, in the README.
The command line script has been improved as well, now using the pipeline component instead of Spacy2ConllParser. The latter has been deprecated (but is still accessible for now). Multiprocessing via the command line script is now possible, too.