Commit 62095da: Merge branch 'datasets'

sadda committed Oct 31, 2024, 2 parents 1423e32 + a9fb2e5
Showing 88 changed files with 4,088 additions and 9,607 deletions.
12 changes: 10 additions & 2 deletions README.md
@@ -35,12 +35,20 @@

The aim of the project is to provide a comprehensive overview of datasets for wildlife individual re-identification and an easy-to-use package for developers of machine learning methods. The core functionality includes:

- - overview of 36 publicly available wildlife re-identification datasets.
+ - overview of 41 publicly available wildlife re-identification datasets.
- utilities to mass download and convert them into a unified format and fix some wrong labels.
- default splits for several machine learning tasks, including the ability to create additional splits.
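The unified format and the splitting functionality described above can be sketched as follows. This is a hypothetical miniature, not the package's actual implementation: the column names follow the package convention of a metadata DataFrame with image paths and identity labels, but the rows and the split logic here are illustrative only.

```python
import pandas as pd

# Hypothetical miniature of the unified metadata format: each dataset is
# converted to a DataFrame with (at least) an image path and an identity label.
df = pd.DataFrame({
    "image_id": [0, 1, 2, 3, 4, 5],
    "identity": ["a", "a", "b", "b", "c", "c"],
    "path": [f"images/{i}.jpg" for i in range(6)],
})

# A simple identity-disjoint split for illustration: whole individuals go to
# either the training or the test side, so no identity appears in both.
train_ids = {"a", "b"}
train = df[df["identity"].isin(train_ids)]
test = df[~df["identity"].isin(train_ids)]
```

An identity-disjoint split like this models the open-set setting, where the individuals seen at test time were never seen during training.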

An introductory example is provided in a [Jupyter notebook](notebooks/introduction.ipynb). The package provides a natural synergy with [Wildlife tools](https://github.com/WildlifeDatasets/wildlife-tools), which provides our [MegaDescriptor](https://huggingface.co/BVRA/MegaDescriptor-L-384) model and tools for training neural networks.

## Changelog

- [08/10/2024] Added AmvrakikosTurtles, ReunionTurtles, ZakynthosTurtles (sea turtles), ELPephants (elephants) and Chicks4FreeID (chickens).
- [13/06/2024] Added WildlifeReID-10k (unification of multiple datasets).
- [09/05/2024] Added CatIndividualImages (cats), CowDataset (cows) and DogFaceNet (dogs).
- [28/02/2024] Added MPDD (dogs), PolarBearVidID (polar bears) and SeaStarReID2023 (sea stars).
- [04/01/2024] Received **Best paper award** at WACV 2024.

## Summary of datasets

An overview of the provided datasets is available in the [documentation](https://wildlifedatasets.github.io/wildlife-datasets/datasets/), while a more numerical summary is located in a [Jupyter notebook](notebooks/dataset_descriptions.ipynb). Due to its size, it may be necessary to view it via [nbviewer](https://nbviewer.org/github/WildlifeDatasets/wildlife-datasets/blob/main/notebooks/dataset_descriptions.ipynb).
@@ -88,7 +96,7 @@ dataset.df
The dataset also contains basic metadata, including the number of individuals, time span, licence, and publication year.

```
- dataset.metadata
+ dataset.summary
```
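To make the metadata sentence above concrete, here is a hedged sketch of the kind of mapping such a summary attribute might return. The keys and values below are illustrative assumptions, not the package's actual fields:

```python
# Hypothetical per-dataset summary metadata; the real package exposes a
# similar mapping, but these keys and values are made up for illustration.
summary = {
    "animals": {"macaques"},
    "n_individuals": 34,
    "licenses": "Other",
    "year": 2020,
}

# Reading fields defensively, since not every dataset fills in every key.
license = summary.get("licenses", "unknown")
year = summary.get("year")
```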

<picture>
4 changes: 2 additions & 2 deletions baselines/analyze_wildlife_reid_10k.ipynb
@@ -554,7 +554,7 @@
"source": [
"summary_datasets = {}\n",
"for name, df_red in df.groupby('dataset'):\n",
- "    metadata = eval(f'datasets.{name}.metadata')\n",
+ "    metadata = eval(f'datasets.{name}.summary')\n",
" if 'licenses' in metadata:\n",
" license = metadata['licenses']\n",
" else:\n",
@@ -1039,7 +1039,7 @@
],
"metadata": {
"kernelspec": {
-    "display_name": "Python 3 (ipykernel)",
+    "display_name": "venv",
"language": "python",
"name": "python3"
},
311 changes: 0 additions & 311 deletions baselines/prepare_wildlife_reid_10k.py

This file was deleted.

4 changes: 2 additions & 2 deletions baselines/utils.py
@@ -35,8 +35,8 @@ def rename_index(df):
rename = {}
for dataset_name in df.index:
try:
-            metadata = eval(f'datasets.{dataset_name}.metadata')
-            citation = " \cite{" + metadata['cite'] + "}"
+            summary = eval(f'datasets.{dataset_name}.summary')
+            citation = " \cite{" + summary['cite'] + "}"
except:
citation = ''
rename[dataset_name] = dataset_name + citation