Curate, select and create a working dataset for task 1: classifying hotel from non-hotel room images. #2

Open
arditecht opened this issue Sep 3, 2019 · 0 comments

@arditecht
Collaborator

The hotel rooms we are going to label as positives vary a lot among themselves, so the dataset needs finer-grained distinctions.
Example: a model that only tells flower or animal images apart from hotel images would not be reliable on the real-life task. We need a model that can tell a parking lot from a lobby, a studio apartment room from a hotel room, or a cheap computer-generated 3D room from a clean real photograph with fewer lighting gradients.

Therefore, our working dataset needs to cover sufficiently fine-grained distinctions for the trained model to be effective. Make sure there is enough variance within what we consider positives and within what we consider negatives, and that some positives and negatives look close to each other in the raw data (e.g. a parking lot and a big lobby).

Tasks:

  1. Select hotel room images with sufficient variation to form a general set of hotel room images.
  2. Select non-hotel room images in the same way. They must be thematically close to hotel rooms so the model learns the hard cases.
  3. Select a large number of completely random images (forests, flowers, animals and so on), all mixed into one set. These are easily distinguishable from anything hotel-related.
  4. Standardize all of the above sets to a fixed size, format and number of channels. The details will be decided once the first three tasks are complete, the model architecture is finalized, and some other logistical details have been taken into consideration.
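
Once the three sets from tasks 1 to 3 exist on disk, they could be indexed into one labeled manifest and split per set, so that both the near negatives and the random negatives show up in validation. The following is only a minimal sketch under assumptions: the folder names `hotel_rooms`, `near_negatives` and `random_negatives`, the accepted file extensions, the binary labels and the 20% validation fraction are all hypothetical and not decided in this issue.

```python
import random
from pathlib import Path

# Assumed image extensions; adjust once the formats are decided (task 4).
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}

# Hypothetical folder names, one per set from tasks 1-3, mapped to a binary label.
SETS = {
    "hotel_rooms": 1,       # task 1: positives
    "near_negatives": 0,    # task 2: thematically close non-hotel rooms
    "random_negatives": 0,  # task 3: forests, flowers, animals, ...
}


def build_manifest(root):
    """Return one row per image found under root/<set_name>/."""
    root = Path(root)
    rows = []
    for set_name, label in SETS.items():
        folder = root / set_name
        if not folder.is_dir():
            continue  # a set that has not been collected yet is skipped
        for path in sorted(folder.rglob("*")):
            if path.is_file() and path.suffix.lower() in IMAGE_EXTS:
                rows.append({"path": str(path), "set": set_name, "label": label})
    return rows


def stratified_split(rows, val_fraction=0.2, seed=0):
    """Split each set separately so every set is represented in both parts."""
    rng = random.Random(seed)
    by_set = {}
    for row in rows:
        by_set.setdefault(row["set"], []).append(row)
    train, val = [], []
    for set_name in sorted(by_set):
        items = by_set[set_name][:]
        rng.shuffle(items)
        n_val = int(round(len(items) * val_fraction))
        val.extend(items[:n_val])
        train.extend(items[n_val:])
    return train, val
```

Keeping the `set` column next to the label makes it possible to report accuracy on the near negatives separately from the random ones, which is where the fine-grained distinctions described above would show up.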

To be done by end of the day, 4th September

3 participants