explosion/spaCy: Language Support Discussions
🌍 Language Support Discussions
Discuss the language data and training models for new languages
Pinned to Language Support
- 🌍 Adding models for new languages master thread
  Labels: [enhancement] Feature requests and improvements; [lang / all] Global language data; [new language] Adding support for new languages to spaCy.
Discussions
- 🌍 The models directory includes two types of pretrained models:
  Labels: [docs] Documentation and website
- 🌍 Spanish lemmatization suffixes
  Labels: [lang / es] Spanish language data and models; [feat / lemmatizer] Feature: Rule-based and lookup lemmatization
- 🌍 Amharic/Tigrinya Tokenizer help
  Labels: [feat / tokenizer] Feature: Tokenizer; [lang / am] Amharic language data and models
- 🌍 Amharic Word Segmentation
  Labels: [feat / sentencizer] Feature: Sentencizer (rule-based sentence segmenter); [lang / am] Amharic language data and models
- 🌍 Lemma of masculine nouns inconsistent in Norwegian
  Labels: [enhancement] Feature requests and improvements; [lang / nb] Norwegian (Bokmål) language data and models; [feat / lemmatizer] Feature: Rule-based and lookup lemmatization
- 🌍 Korean support without manual system-level operations
  Labels: [enhancement] Feature requests and improvements; [help wanted] Contributions welcome!; [lang / ko] Korean language data and models
- 🌍 Language Code for languages not listed in ISO 639-1
  Labels: [lang / all] Global language data
- 🌍 Help wanted: Example sentences for testing languages
  Labels: [meta] Meta topics, e.g. repo organisation and issue management; [lang / all] Global language data
- 🌍 noun_chunks function return a empty list
  Labels: [enhancement] Feature requests and improvements; [lang / zh] Chinese language data and models
- 🌍 Turkish lang support beta
  Labels: [lang / tr] Turkish language data and models
- 🌍 Norwegian stopword list needs trimming
  Labels: [enhancement] Feature requests and improvements; [lang / nb] Norwegian (Bokmål) language data and models
- 🌍 Remove questionable stopwords from default lists
  Labels: [feat / doc] Feature: Doc, Span and Token objects
- 🌍 Add an additional Chinese tokenizer
  Labels: [enhancement] Feature requests and improvements; [lang / zh] Chinese language data and models; [feat / tokenizer] Feature: Tokenizer
- 🌍 Update list of Tagalog stop words
  Labels: [enhancement] Feature requests and improvements; [feat / doc] Feature: Doc, Span and Token objects
- 🌍 Testing Polish models
  Labels: [lang / pl] Polish language data and models; [feat / lemmatizer] Feature: Rule-based and lookup lemmatization
- 🌍 Inconsistencies in the Korean/Japanese use of mecab
  Labels: [lang / ko] Korean language data and models; [third-party] Third-party packages and services; [lang / ja] Japanese language data and models
- 🌍 Adding Romanian NER
  Labels: [lang / ro] Romanian language data and models; [feat / ner] Feature: Named Entity Recognizer
- 🌍 Noun Phrase Chunking - Improvements in recall.
  Labels: [enhancement] Feature requests and improvements; [feat / doc] Feature: Doc, Span and Token objects
- 🌍 nb: lemmatization of copula AUX
  Labels: [feat / lemmatizer] Feature: Rule-based and lookup lemmatization
- 🌍 Experimental spaCy language model for Indonesian
  Labels: [lang / id] Indonesian language data and models
- 🌍 Exploiting available linguistic resources for the Italian language
  Labels: [lang / it] Italian language data and models
- 🌍 Training a Turkish model
  Labels: [lang / tr] Turkish language data and models
- 🌍 tokenizer_exceptions problem with Persian
  Labels: [lang / fa] Persian language data and models; [feat / tokenizer] Feature: Tokenizer