DockerHub Metadata (size: 1.4GB)

Download the configurations.

Download the manifests.

1.46 million Docker image configuration and manifest files from DockerHub, fetched in June 2019. A manifest points to the layers of an image and to its configuration. A configuration carries all the metadata: architecture, OS, environment variables, entry point, default command, etc., including the layer creation history. That history makes it possible to reconstruct the output of docker history without pulling the images. Taken together, the provided information can be used to partially (no ADD, no stages) recover Dockerfiles for any image on DockerHub that has it.
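As a rough illustration of the history reconstruction, the sketch below prints one line per entry of the `history` array in a configuration. The field names `created`, `created_by` and `empty_layer` come from the image specification; the function name `reconstruct_history` and the sample configuration are made up for this example and are not taken from the dataset.

```python
def reconstruct_history(config: dict) -> list[str]:
    """Return one tab-separated line per history entry, oldest first."""
    lines = []
    for entry in config.get("history", []):
        created = entry.get("created", "")
        command = entry.get("created_by", "")
        # Entries flagged empty_layer changed only metadata (ENV, CMD, ...)
        # and did not produce a filesystem layer.
        kind = "metadata" if entry.get("empty_layer", False) else "layer"
        lines.append(f"{created}\t{kind}\t{command}")
    return lines


# Hypothetical configuration, shaped like the image specification describes.
sample_config = {
    "architecture": "amd64",
    "os": "linux",
    "history": [
        {
            "created": "2019-06-01T00:00:00Z",
            "created_by": "/bin/sh -c #(nop) ADD file:abc in / ",
        },
        {
            "created": "2019-06-01T00:00:01Z",
            "created_by": '/bin/sh -c #(nop)  CMD ["bash"]',
            "empty_layer": True,
        },
    ],
}

for line in reconstruct_history(sample_config):
    print(line)
```

The `created_by` strings are the closest thing to Dockerfile instructions that a configuration retains, which is why the recovery is only partial.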

The dataset consists of 2 files:

  1. configs.tar.xz - configuration JSON files, 16GB uncompressed.
  2. manifests.tar.xz - manifest JSON files, 8.5GB uncompressed.
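Because the uncompressed data is large, it may be more convenient to stream an archive than to unpack it. The following is a minimal sketch using Python's standard `tarfile` module; it assumes that every regular member whose name ends in `.json` is a JSON document, and the helper name `iter_json_members` is hypothetical.

```python
import json
import tarfile


def iter_json_members(archive_path):
    """Yield (member name, parsed JSON) pairs without extracting to disk."""
    # Mode "r|xz" reads the archive as a forward-only stream, so each
    # member has to be consumed before moving on to the next one.
    with tarfile.open(archive_path, mode="r|xz") as archive:
        for member in archive:
            if not member.isfile() or not member.name.endswith(".json"):
                continue
            handle = archive.extractfile(member)
            if handle is None:
                continue
            yield member.name, json.load(handle)
```

A call such as `for name, config in iter_json_members("configs.tar.xz"): ...` would then visit the configurations one at a time.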

Format

The directory structure is the same for configurations and manifests. The top level directory is the first two letters of the image name, the inner directories correspond to the name, including the /. :latest is stripped from the file names. Examples: the configuration for tensorflow/tensorflow:2.0.0b0 will be at te/tensorflow/tensorflow:2.0.0b0.json, and for mongo:latest at mo/mongo.json.
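The naming rule can be sketched as a small helper. The function name `metadata_path` is hypothetical, and the sketch assumes that a reference without a tag means `:latest` and that the image name itself contains no colon.

```python
def metadata_path(image: str) -> str:
    """Map an image reference to its relative path inside either archive."""
    name, sep, tag = image.partition(":")
    # ":latest" is stripped; an untagged reference is assumed to mean latest.
    if not sep or tag == "latest":
        filename = name
    else:
        filename = f"{name}:{tag}"
    # The top-level directory is the first two letters of the image name.
    return f"{name[:2]}/{filename}.json"


print(metadata_path("tensorflow/tensorflow:2.0.0b0"))
print(metadata_path("mongo:latest"))
```

For the two examples above, this yields te/tensorflow/tensorflow:2.0.0b0.json and mo/mongo.json.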

The manifest format is defined at https://docs.docker.com/registry/spec/manifest-v2-2

The configuration format is defined at https://github.com/moby/moby/blob/master/image/spec/v1.2.md
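To show how a manifest ties an image to its layers and configuration, here is a minimal sketch that summarizes a schema 2 manifest. The `config` and `layers` fields with their `digest` and `size` keys follow the manifest specification; the helper name `summarize_manifest` and the sample values are invented for illustration, and the sketch assumes that the manifest is a schema 2 document.

```python
def summarize_manifest(manifest: dict) -> dict:
    """Collect the config digest, layer digests and total layer size."""
    layers = manifest.get("layers", [])
    return {
        "config_digest": manifest.get("config", {}).get("digest"),
        "layer_digests": [layer["digest"] for layer in layers],
        # Sizes are the byte counts declared in the manifest.
        "total_layer_bytes": sum(layer.get("size", 0) for layer in layers),
    }


# Hypothetical manifest with shortened, made-up digests.
sample_manifest = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
    "config": {"size": 7023, "digest": "sha256:aaaa"},
    "layers": [
        {"size": 32654, "digest": "sha256:bbbb"},
        {"size": 16724, "digest": "sha256:cccc"},
    ],
}

print(summarize_manifest(sample_manifest))
```

The `config_digest` value is what links a manifest to the configuration blob that the registry stores for the same image.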

Origin

DockerHub API. We modified skopeo to fetch configurations and manifests at high speed (less than 3 hours for the whole of DockerHub); the modified source for cmd/skopeo/inspect.go is included in this repository. The image list fetcher is written in Python and is also included. How to reproduce:

pip3 install -r requirements.txt
python3 list_docker_images.py > images.txt
cp inspect.go /path/to/skopeo/cmd/skopeo/inspect.go
make -C /path/to/skopeo/ binary
cat images.txt | /path/to/skopeo/skopeo inspect

Limitations

License

Code: MIT. Compilation: Open Data Commons Open Database License (ODbL). Actual contents: DockerHub Terms of Service.