🤘 awesome-semantic-segmentation
☁️ 🚀 📊 📈 Evaluating state of the art in AI
Python package for the evaluation of odometry and SLAM
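This description matches the evo package; assuming that, here is a hedged sketch of its Python API for absolute pose error (APE) on KITTI-format pose files. The file names are placeholders, and evo also ships CLI tools for the same job.

```python
# Hedged sketch: computing absolute pose error (APE) with evo's Python API.
# File names below are placeholders.
from evo.core import metrics
from evo.tools import file_interface

# KITTI-format pose files (one 3x4 pose matrix per line, no timestamps)
traj_ref = file_interface.read_kitti_poses_file("ground_truth.txt")
traj_est = file_interface.read_kitti_poses_file("estimate.txt")

# APE on the translation part, summarized as RMSE
ape = metrics.APE(metrics.PoseRelation.translation_part)
ape.process_data((traj_ref, traj_est))
print("APE RMSE:", ape.get_statistic(metrics.StatisticsType.rmse))
```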
End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
🪢 Open source LLM engineering platform: Observability, metrics, evals, prompt management, playground, datasets. Integrates with LlamaIndex, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
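This description matches Langfuse; assuming that, a minimal hedged sketch of tracing a function with its Python SDK's decorator API (module path per the v2 SDK; the traced function is a stand-in, and credentials are read from environment variables):

```python
# Hedged sketch: tracing a call with Langfuse's @observe decorator (v2 SDK).
# Expects LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY in the environment;
# the function body is a stand-in for a real LLM call.
from langfuse.decorators import observe

@observe()  # records this call as a trace in Langfuse
def answer(question: str) -> str:
    return "stub answer"  # replace with a call to your LLM of choice

answer("What is observability?")
```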
(IROS 2020, ECCVW 2020) Official Python Implementation for "3D Multi-Object Tracking: A Baseline and New Evaluation Metrics"
TCExam is a CBA (Computer-Based Assessment) system (e-exam, CBT / Computer-Based Testing) for universities, schools, and companies that enables educators and trainers to author, schedule, deliver, and report on surveys, quizzes, tests, and exams.
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) over 100+ datasets.
Avalanche: an End-to-End Library for Continual Learning based on PyTorch.
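As an illustration, a hedged sketch of a continual-learning loop in Avalanche; module paths follow recent releases (older versions use `avalanche.training.strategies`), and the benchmark, model, and hyperparameters are illustrative.

```python
# Hedged sketch of a continual-learning loop in Avalanche.
import torch
from avalanche.benchmarks.classic import SplitMNIST
from avalanche.models import SimpleMLP
from avalanche.training.supervised import Naive

benchmark = SplitMNIST(n_experiences=5)  # MNIST split into 5 tasks
model = SimpleMLP(num_classes=10)
strategy = Naive(
    model,
    torch.optim.SGD(model.parameters(), lr=1e-3),
    torch.nn.CrossEntropyLoss(),
    train_mb_size=32,
    train_epochs=1,
)

for experience in benchmark.train_stream:  # train on one experience at a time
    strategy.train(experience)
    strategy.eval(benchmark.test_stream)   # evaluate on the full test stream
```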
Building a modern functional compiler from first principles. (http://dev.stephendiehl.com/fun/)
FuzzBench - Fuzzer benchmarking as a service.
Test your prompts, agents, and RAGs. Use LLM evals to improve your app's quality and catch problems. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.
🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
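A minimal sketch of the 🤗 Evaluate API: "accuracy" is one of many metrics loadable from the Hugging Face Hub, and the label values below are illustrative.

```python
# Load a metric from the Hub and compute it on toy predictions.
import evaluate

accuracy = evaluate.load("accuracy")
result = accuracy.compute(
    predictions=[0, 1, 1, 0],
    references=[0, 1, 0, 0],
)
print(result)  # {'accuracy': 0.75}
```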
Evaluation code for various unsupervised automated metrics for Natural Language Generation.
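This description appears to be the nlg-eval package; assuming that, a hedged sketch of its scorer API (the flags shown disable the heavier embedding-based metrics, and the strings are illustrative):

```python
# Hedged sketch of the nlg-eval scorer API (requires a one-time
# `nlg-eval --setup` to download metric dependencies).
from nlgeval import NLGEval

scorer = NLGEval(no_skipthoughts=True, no_glove=True)  # word-overlap metrics only
scores = scorer.compute_individual_metrics(
    ref=["the cat sat on the mat"],  # one or more reference strings
    hyp="a cat is sitting on the mat",
)
print(scores)  # e.g. Bleu_1..4, METEOR, ROUGE_L, CIDEr
```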
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Recommender system library for the CLR (.NET)
SemanticKITTI API for visualizing dataset, processing data, and evaluating results.
LLMOps with Prompt Flow is an "LLMOps template and guidance" project that helps you build LLM-infused apps using Prompt Flow. It offers a range of features, including Centralized Code Hosting, Lifecycle Management, Variant and Hyperparameter Experimentation, A/B Deployment, and reporting for all runs and experiments.
A unified evaluation framework for large language models