Separating hotel room images from AVA dataset #1

Open
mikkokotila opened this issue Aug 28, 2019 · 4 comments

@mikkokotila
Collaborator

mikkokotila commented Aug 28, 2019

Background

The AVA dataset contains 250,000 images, many of which have comprehensive metadata related to image aesthetics.

Objective

The task is to create a binary classifier that returns 1 when the input photo shows a hotel room. Researchers appear to have already performed this task successfully, and their work may be a good starting point.

This dataset of 1,000,000 hotel room images from 50,000 hotels may be useful as training data when combined with other photos.

Steps

  1. Perform a cursory literature review of similar tasks
  2. If literature is found on the specific task, or on a comparable task, clearly articulate the state of the art
  3. Identify several probable model architecture candidates based on the literature review
  4. Identify meaningful hyperparameters and their corresponding boundaries
  5. Perform a comprehensive hyperparameter search separately for each architecture candidate
  6. Analyze the results, and pick one model architecture to continue with
  7. Based on learnings from the initial hyperparameter searches, set new hyperparameter boundaries
  8. Perform a comprehensive hyperparameter search
  9. Analyze, conclude, and report the results
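Steps 4 to 8 describe a two-round search: a broad first pass, then a second pass over boundaries narrowed from the best first-round results. The following is a minimal sketch of that workflow in plain Python. It does not use the Talos API; the hyperparameter names, their values, and `placeholder_evaluate` (which stands in for training a model and returning a validation score) are all assumptions made for illustration.

```python
import itertools


def expand_grid(boundaries):
    """Turn {name: [values]} into a list of {name: value} dicts."""
    names = sorted(boundaries)
    combos = itertools.product(*(boundaries[n] for n in names))
    return [dict(zip(names, combo)) for combo in combos]


def run_search(boundaries, evaluate):
    """Evaluate every combination and return (score, params) pairs, best first."""
    results = [(evaluate(p), p) for p in expand_grid(boundaries)]
    return sorted(results, key=lambda r: r[0], reverse=True)


def narrow_boundaries(results, keep=0.25):
    """Keep only the values that appear in the top fraction of results."""
    top = results[: max(1, int(len(results) * keep))]
    narrowed = {}
    for _, params in top:
        for name, value in params.items():
            narrowed.setdefault(name, set()).add(value)
    return {name: sorted(values) for name, values in narrowed.items()}


def placeholder_evaluate(params):
    """Stand-in for training a model and returning validation accuracy.

    Hypothetical scoring: pretends lr=0.001 and dropout=0.25 are ideal.
    """
    score = 1.0
    score -= abs(params["lr"] - 0.001) * 50
    score -= abs(params["dropout"] - 0.25)
    score -= 0.01 if params["batch_size"] == 128 else 0.0
    return score


# Round 1: broad, hypothetical boundaries (steps 4 and 5).
round_one = {
    "lr": [0.0001, 0.001, 0.01],
    "dropout": [0.0, 0.25, 0.5],
    "batch_size": [32, 64, 128],
}
first_results = run_search(round_one, placeholder_evaluate)

# Round 2: boundaries narrowed from the best round-1 results (steps 7 and 8).
round_two = narrow_boundaries(first_results)
second_results = run_search(round_two, placeholder_evaluate)
best_score, best_params = second_results[0]
```

In a real run, the evaluation function would train one of the candidate architectures, and the first round would be repeated once per architecture before picking a single one to continue with.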

Tools

You are to use Keras or PyTorch as the differentiation API, and Talos for hyperparameter search. You are to present the findings in a code-complete Jupyter Notebook.

Timeline

The challenge is to be completed by end of day on the 6th of September.

@arditecht
Collaborator

Need further clarity on what exactly we need to classify as a part of this task.

  1. Do we have to distinguish an indoor hotel room from completely different images? If so, which rooms should count as positives, since the bedroom, bathroom, and lobby all look completely different?
  2. What do we consider a non-hotel-room image? Should we treat room balconies or large room-window views differently (in degree of difference) from images of, say, dogs or cats?

@mikkokotila
Collaborator Author

  1. The hotel room itself is the classification target. The lobby of the hotel is not the hotel room, and neither is the balcony.

  2. I'm assuming that the 1 million hotel room image dataset above already handles this for us, so we would not have to worry about it. I don't think it will contain photos of dogs or cats. You will have to mix it with random photos so you can use it as your training data. But you will probably end up using only a small part of it for training, and use the rest for robust validation.
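As a rough illustration of the mixing described above, the sketch below labels hotel room images 1 and random photos 0, shuffles them, and keeps only a small fraction for training while reserving the rest for validation. The file names, the `build_binary_splits` helper, and the 10% training fraction are assumptions chosen for the example, not values from this thread.

```python
import random


def build_binary_splits(hotel_paths, other_paths, train_fraction=0.1, seed=0):
    """Label hotel images 1 and other images 0, then split into train/validation."""
    rng = random.Random(seed)
    labelled = [(p, 1) for p in hotel_paths] + [(p, 0) for p in other_paths]
    rng.shuffle(labelled)
    n_train = int(len(labelled) * train_fraction)
    return labelled[:n_train], labelled[n_train:]


# Hypothetical file names, standing in for the real image paths.
hotel_paths = [f"hotels/room_{i}.jpg" for i in range(100)]
other_paths = [f"random/photo_{i}.jpg" for i in range(100)]

train, validation = build_binary_splits(hotel_paths, other_paths)
```

A fixed seed keeps the split reproducible across notebook runs, which matters when comparing hyperparameter searches against the same validation set.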

@priyankav
Collaborator

The hotels-50k dataset was originally proposed for identifying the name of the hotel given a hotel image (which can be a bedroom, bathroom, lobby, etc.). The dataset has all these images combined, and the metadata provided includes the hotel id, hotel chain id, latitude, longitude, etc.

@mikkokotila
Collaborator Author

Interesting. One way of looking at this, then, is that we first build a classifier that says whether an image is a room or not, and then use it to clean the hotels-50k dataset down to just rooms, which we also know are in hotels. Preparatory steps like this are very common in challenges where the end goal is very specific.
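The cleaning step could look roughly like the sketch below: a room classifier scores each hotels-50k image, and only images at or above a threshold are kept. The `keep_rooms` helper, the 0.5 threshold, the file names, and `placeholder_room_probability` (which stands in for a trained model) are all hypothetical.

```python
def keep_rooms(image_paths, room_probability, threshold=0.5):
    """Return only the paths the room classifier scores at or above threshold."""
    return [p for p in image_paths if room_probability(p) >= threshold]


def placeholder_room_probability(path):
    """Stand-in for a trained classifier; here it just inspects the file name."""
    return 0.9 if "bedroom" in path else 0.1


# Hypothetical hotels-50k style paths mixing several scene types.
hotel_images = [
    "hotel_1/bedroom_a.jpg",
    "hotel_1/lobby_a.jpg",
    "hotel_2/bathroom_a.jpg",
    "hotel_2/bedroom_b.jpg",
]
room_images = keep_rooms(hotel_images, placeholder_room_probability)
```

Passing the classifier in as a callable keeps the filter independent of whether the model is built in Keras or PyTorch.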


5 participants