Releases: BramVanroy/spacy_conll

Documentation spacy-stanfordnlp, custom tagset map

02 Feb 13:31
e3cd567

The documentation has been greatly expanded. The most important addition to the README explains how to use spacy_conll together with spacy-stanfordnlp, a spaCy wrapper around stanfordnlp. This lets you use Stanford models through a spaCy interface. From a user perspective, this means better models, guaranteed Universal Dependencies tagsets, and an easy API through spaCy. (The cost is that Stanford NLP models are significantly slower than spaCy's models.) Small tests for spacy-stanfordnlp have been added.

A new feature is the option to add a custom tagset map (conversion_maps), which gives you, as a user, more control over the output tags. For instance, you can specify that all nsubj deprel tags should be renamed to subj. This is useful if your model uses a different tagset than the one you want in the output. See the advanced example in the README for more information.
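To make the idea of a tagset map concrete, here is a minimal, library-independent sketch. It is not spacy_conll's implementation: the helper `apply_conversion_maps`, the row dictionaries, and the assumed shape of the map (field name mapped to an old-tag-to-new-tag dictionary) are illustrative assumptions. For the actual way to pass conversion_maps, refer to the README.

```python
from typing import Dict, List

# Assumed shape for illustration: field name -> {old tag: new tag}
ConversionMaps = Dict[str, Dict[str, str]]


def apply_conversion_maps(
    rows: List[Dict[str, str]], conversion_maps: ConversionMaps
) -> List[Dict[str, str]]:
    """Hypothetical helper: return copies of the token rows with tags renamed.

    Tags without an entry in the map are left unchanged.
    """
    converted = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not modified
        for field, mapping in conversion_maps.items():
            if field in new_row:
                old_tag = new_row[field]
                new_row[field] = mapping.get(old_tag, old_tag)
        converted.append(new_row)
    return converted


# Toy token rows standing in for parser output (made-up data)
rows = [
    {"form": "I", "upos": "PRON", "deprel": "nsubj"},
    {"form": "like", "upos": "VERB", "deprel": "ROOT"},
    {"form": "cookies", "upos": "NOUN", "deprel": "dobj"},
]

# Rename every nsubj dependency relation to subj
conversion_maps = {"deprel": {"nsubj": "subj"}}
converted = apply_conversion_maps(rows, conversion_maps)

for row in converted:
    print(row["form"], row["upos"], row["deprel"], sep="\t")
```

Only the fields named in the map are touched, and unmapped tags pass through as-is, so a map can stay as small as the handful of tags you want to rename.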

This release closes:

  • "The dependency relations aren't transformed to universal dependencies" (#4)

Add dependencies to setup.py

21 Jan 09:19

This small release adds the dependencies to setup.py, solving potential issues (e.g. #3).

Current dependencies are:

  • packaging
  • spacy

spaCy pipeline component, improved command line script with multiprocessing

15 Jan 14:22
01d717d

This small repo has been overhauled so that users can integrate it directly into their spaCy scripts: you can now use it as a spaCy pipeline component. Three custom attributes have been added to a Doc (under Doc._.) and to each of its sentences. The README has more information and example usage.

The command line script has been improved as well. It now uses the pipeline component instead of Spacy2ConllParser, which has been deprecated but is still accessible for now. Multiprocessing is now also possible via the command line script.