Vincent D. Warmerdam
┣━━ Open Source Packages
┃   ┣━━ bulk - simple bulk labelling interface
┃   ┣━━ embetter - embeddings ready for sklearn
┃   ┣━━ doubtlab - suite of tools to help find bad labels
┃   ┣━━ drawdata - draw datasets in jupyter
┃   ┣━━ scikit-lego - lego bricks for sklearn
┃   ┣━━ scikit-partial - partial_fit() pipelines for sklearn
┃   ┣━━ scikit-bloom - bloom transformers for sklearn
┃   ┣━━ fh-matplotlib - matplotlib for FastHTML
┃   ┣━━ fh-altair - altair for FastHTML
┃   ┣━━ human-learn - rule-based components for sklearn
┃   ┣━━ sentence-models - a different take on textcat
┃   ┣━━ mktestdocs - turn markdown files into pytest tests
┃   ┣━━ lazylines - lightweight utils for .jsonl wrangling
┃   ┣━━ cluestar - inspiration for your first text labels
┃   ┣━━ durations - pytest duration insights
┃   ┣━━ tuilwindcss - tailwindcss for textual tui apps
┃   ┣━━ memo - saves a whole log of time
┃   ┣━━ skedulord - makes cron a bit more fun
┃   ┣━━ icepickle - cool and safe storage for linear models
┃   ┗━━ evol - grammar for genetic heuristics
┣━━ Project Contributions
┃   ┣━━ fairlearn - contributed the CorrelationFilter
┃   ┣━━ polars - contributed the .pipe() method
┃   ┗━━ BERTopic - added lightweight sklearn pipeline support
┣━━ Online Projects
┃   ┣━━ calmcode.io - intermediate developer education
┃   ┣━━ koaning.io - personal blog
┃   ┗━━ dearme.email - reflection via a 30 day delay
┣━━ Popular Talks
┃   ┣━━ Natural Intelligence is All You Need
┃   ┣━━ Group-by statements that save the day
┃   ┣━━ Tools to Improve Training Data
┃   ┣━━ Optimal on Paper, Broken in Reality
┃   ┣━━ Playing by the Rules-Based-Systems
┃   ┣━━ How to Constrain Artificial Stupidity
┃   ┣━━ The Profession of Solving the Wrong Problem
┃   ┣━━ Winning with Simple, even Linear, Models
┃   ┗━━ Untitled12.ipynb
┣━━ Random Experiments
┃   ┣━━ scikit-prune - prune scikit learn pipelines
┃   ┣━━ gitlit - tracking github action times across open source
┃   ┣━━ sentimany - many sentiment models, one repo
┃   ┣━━ tokenwiser - sklearn token tricks
┃   ┣━━ clumper - functional API for lists of dicts
┃   ┗━━ whatlies - exploration tools for word embeddings
┗━━ Employer
    ┣━━ :probabl. - scikit-learn and friends
    ┃   ┣━━ scikit-churn - safety rails for churn work
    ┃   ┣━━ scikit-playtime - rethinking pipelines
    ┃   ┗━━ scikit-mdn - mixture density networks
    ┣━━ Explosion - developer tools for nlp
    ┃   ┣━━ prodigy-hf - Prodigy integration for the HuggingFace stack
    ┃   ┣━━ prodigy-pdf - Annotate PDFs via Prodigy
    ┃   ┣━━ prodigy-ann - ANN techniques to find relevant subsets
    ┃   ┣━━ prodigy-segment - Prodigy integration for Segment Anything
    ┃   ┣━━ prodigy-lunr - Search techniques to find relevant subsets
    ┃   ┣━━ prodigy-whisper - Transcribe audio with OpenAI's whisper models
    ┃   ┣━━ prodigy-tui - Prodigy from the terminal
    ┃   ┗━━ cluestar - inspiration for your first text labels
    ┗━━ Rasa - conversational software provider
        ┣━━ nlu examples - custom nlu components for Rasa
        ┣━━ taipo - data augmentation tools
        ┗━━ algo whiteboard - nlp education

Follow me on twitter @fishnets88
koaning
Solving problems involving data. Mostly NLP these days. AskMeAnything[tm].
- Amsterdam
- https://koaning.io
- @fishnets88
Pinned
- human-learn (Public): Natural Intelligence is still a pretty good idea.