An application to find similar pictures based on the VGG16 and kNN

senior-sigan/KawaiiSearch

Kawaii Search (Images similarity)

👷 I'm working hard on a new version of Kawaii Search. It will be an app you can install on your computer to run similarity search over your own images. No internet or API required: it will be a fully self-hosted solution.

✉️ Feel free to open an issue or email me.

The blog post describing how it works is here.

This is a demo of using VGG16 and kNN to build a similar-image search.

Dataset

You can use any large image dataset. I used pictures from a fashion group on VK: vk.com/tokyofashion.

./data
-- photos.csv # is a csv file with pictures' info and url.
-- images     # is a directory with pictures

You can use src/get_images.py to fetch the info and URLs of all pictures in a given VK group. Set the VK group id in config.py.

You can use src/download_images.py to download the images listed in data/photos.csv.
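As a rough illustration of the download step, here is a minimal sketch using only the standard library. It is not the code in src/download_images.py: the `url` column name and the helper names `local_name` and `download_images` are assumptions made for this example, and the real photos.csv may be laid out differently.

```python
import csv
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def local_name(url):
    """Derive a local file name from the last path segment of a URL."""
    return Path(urlparse(url).path).name


def download_images(csv_path, images_dir, url_column="url"):
    """Download every URL found in `url_column`, skipping existing files.

    `url_column` is a hypothetical column name; adjust it to the real CSV.
    """
    images_dir = Path(images_dir)
    images_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            url = row[url_column]
            target = images_dir / local_name(url)
            if not target.exists():
                # Stream the response to disk instead of holding it in memory.
                with urllib.request.urlopen(url) as resp, open(target, "wb") as out:
                    shutil.copyfileobj(resp, out)
            saved.append(target)
    return saved
```

Calling `download_images("data/photos.csv", "data/images")` would then fill the images directory. Skipping files that already exist makes it safe to re-run after an interrupted download.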

Training

I use the pretrained VGG16 from Keras without its last layers, keeping only global max pooling on top. This gives 512 features per image, which I use in kNN with the cosine metric to compute similarity.
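To make the two ideas concrete, here is a small NumPy sketch of global max pooling followed by a cosine-metric nearest-neighbour lookup. It uses random arrays as a stand-in for real VGG16 feature maps, so it illustrates the technique rather than the project's own code. In Keras, the extractor itself presumably corresponds to something like `VGG16(include_top=False, pooling="max")`, but that is an assumption about this repository. The function names below are hypothetical.

```python
import numpy as np


def global_max_pool(feature_map):
    """Collapse an (H, W, C) feature map into a C-sized vector."""
    return feature_map.max(axis=(0, 1))


def cosine_knn(vectors, query, k=5):
    """Return indices and cosine similarities of the k rows closest to `query`."""
    vectors = np.asarray(vectors, dtype=np.float64)
    query = np.asarray(query, dtype=np.float64)
    eps = 1e-12  # guards against division by zero for all-zero vectors
    # After L2 normalization, a dot product equals cosine similarity.
    v_norm = vectors / (np.linalg.norm(vectors, axis=1, keepdims=True) + eps)
    q_norm = query / (np.linalg.norm(query) + eps)
    sims = v_norm @ q_norm
    top = np.argsort(-sims)[:k]  # highest similarity first
    return top, sims[top]


rng = np.random.default_rng(0)
# Stand-in for the last conv output of VGG16: a 7x7 grid with 512 channels.
fake_maps = rng.random((100, 7, 7, 512))
vectors = np.stack([global_max_pool(m) for m in fake_maps])

# Query with image 3 itself; it should come back as its own nearest neighbour.
indices, scores = cosine_knn(vectors, vectors[3], k=5)
```

A brute-force search like this is fine for tens of thousands of 512-sized vectors; a dedicated kNN index only becomes necessary for much larger collections.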

Before searching, you have to generate the 512-sized vector for each image, so run src/vectorize_image.py. On a Tesla K80 GPU in Google Cloud, it takes about 20 minutes for 50,000 images. The result is saved in submission/images_vec.npz together with submissions/images_order.csv.
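The pairing of a vector file with an order file can be sketched as follows: row i of the array belongs to the i-th file name in the CSV. This is only an illustration of that idea. The array key `vectors`, the one-name-per-row CSV layout, and the helper names are assumptions, and the files written by src/vectorize_image.py may be structured differently.

```python
import csv
import tempfile
from pathlib import Path

import numpy as np


def save_vectors(out_dir, filenames, vectors):
    """Write the vectors and the matching file-name order side by side."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # "vectors" is a hypothetical array key, not necessarily the real one.
    np.savez_compressed(out_dir / "images_vec.npz", vectors=vectors)
    with open(out_dir / "images_order.csv", "w", newline="") as f:
        writer = csv.writer(f)
        for name in filenames:
            writer.writerow([name])


def load_vectors(out_dir):
    """Load both files and check that they describe the same number of images."""
    out_dir = Path(out_dir)
    with np.load(out_dir / "images_vec.npz") as data:
        vectors = data["vectors"]
    with open(out_dir / "images_order.csv", newline="") as f:
        filenames = [row[0] for row in csv.reader(f) if row]
    if len(filenames) != len(vectors):
        raise ValueError("order file and vector file disagree on image count")
    return filenames, vectors


# Round-trip three dummy 512-sized vectors through a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    names = ["a.jpg", "b.jpg", "c.jpg"]
    vecs = np.arange(3 * 512, dtype=np.float32).reshape(3, 512)
    save_vectors(tmp, names, vecs)
    loaded_names, loaded_vecs = load_vectors(tmp)
```

Keeping the order in a separate file means a search result, which is just a row index, can be mapped back to an image file without reloading any pictures.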

Evaluating

See test.ipynb for an example of using this model.

Deploy

Modify app.service and copy it to /etc/systemd/system. Then run systemctl daemon-reload and systemctl enable app.service.

TODO

  • create a single main file to do all the steps
  • write a blog post about this
  • build a web service with
  • add feature to find similar photos by an image URL
  • create web service for telegram chat
  • download more images
