spaCy pipeline for French, focused on personal pronouns, fiction, and first-person point-of-view texts.

thjbdvlt/solipCysme

solipCysme

spaCy pipeline for French fiction and first-person point-of-view texts, with a focus on personal pronouns.

| Feature | Description |
| --- | --- |
| Language | French |
| Name | fr_solipcysme |
| Version | 3.7.0 |
| spaCy | >=3.7.6,<3.8.0 |
| Default Pipeline | presque_normalizer, tokentype, morphologizer, viceverser_lemmatizer, sentencizer, parser |
| Components | quelquhui_tokenizer, presque_normalizer, tokentype, morphologizer, viceverser_lemmatizer, sentencizer, parser |
| Vectors | 504297 keys, 504297 unique vectors (100 dimensions) |
| Sources | Corpus narraFEATS (morphologizer), corpus cabillaUD (parser), corpus attirail (vectors) |
| License | CC BY-NC-SA 4.0 |
| Author | thjbdvlt |

installation

# from github release
pip install https://github.com/thjbdvlt/solipCysme/releases/download/solipCysme-v3.7.0/fr_solipcysme-3.7.0-py3-none-any.whl

# from huggingface
pip install https://huggingface.co/thjbdvlt/fr_solipcysme/resolve/main/fr_solipcysme-any-py3-none-any.whl
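
Since the feature table pins spaCy to >=3.7.6,<3.8.0, it can help to check the installed spaCy version before loading the model. The following is a minimal sketch using only the standard library; `parse_version` and `is_supported` are hypothetical helpers written for this example and are not part of solipCysme or spaCy.

```python
from importlib import metadata

# Range taken from the "spaCy" row of the feature table: >=3.7.6,<3.8.0
MIN_VERSION = (3, 7, 6)
MAX_VERSION_EXCLUSIVE = (3, 8, 0)


def parse_version(text):
    """Turn '3.7.6' into (3, 7, 6); non-numeric suffixes are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        parts.append(int(digits or 0))
    # Pad short versions such as '3.7' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_supported(version_text):
    """Hypothetical helper: True if the version lies in the supported range."""
    return MIN_VERSION <= parse_version(version_text) < MAX_VERSION_EXCLUSIVE


if __name__ == "__main__":
    try:
        installed = metadata.version("spacy")
    except metadata.PackageNotFoundError:
        print("spaCy is not installed")
    else:
        status = "supported" if is_supported(installed) else "outside the supported range"
        print(installed, status)
```

pip performs the same check when it resolves dependencies, so this is only useful for diagnosing an environment that was modified after installation.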

usage

import spacy

nlp = spacy.load("fr_solipcysme")

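# Run the pipeline on a deliberately noisy sentence.
doc = nlp(
    "la MACHINE à (b)rouiller le temps s'est peuuut-etre déraillée..?"
)

for token in doc:
    print(
        token,
        token.norm_,  # normalized form
        token.pos_,  # part-of-speech tag
        token.morph,  # morphological features
        token.lemma_,  # lemma
        token.dep_,  # dependency relation
        # custom extension attributes provided by the pipeline
        token._.tokentype,
        token._.vv_pos,
        token._.vv_morph,
    )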
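
Because the pipeline focuses on personal pronouns, a common next step is to pull them out of a parsed text. The sketch below relies only on the standard spaCy token attributes `pos_` and `morph.get`. It assumes that the morphologizer tags pronouns as `PRON` and emits a Universal Dependencies-style `Person` feature; this README does not confirm the tag set, so check the output of the usage example first. `pronouns_by_person` is a hypothetical helper, not part of the package.

```python
def pronouns_by_person(doc, person=None):
    """Collect pronouns that carry a Person feature.

    `doc` is any iterable of spaCy-like tokens (for example a spaCy Doc).
    If `person` is given (1, 2 or 3), only pronouns of that person are kept.
    Assumption: pronouns are tagged PRON and carry a UD-style Person feature.
    """
    found = []
    for token in doc:
        if token.pos_ != "PRON":
            continue
        # morph.get returns a list of values, empty if the feature is absent.
        persons = token.morph.get("Person")
        if not persons:
            continue
        if person is not None and str(person) not in persons:
            continue
        found.append(token)
    return found
```

With `nlp` loaded as in the usage example, `[t.text for t in pronouns_by_person(nlp("je me souviens de toi"), person=1)]` would list the first-person pronouns, provided the assumptions above hold for this model.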