
Embed a single image #81

Open
lamhoangtung opened this issue Jun 3, 2019 · 14 comments

Comments


lamhoangtung commented Jun 3, 2019

Hi, I'm trying to write a script to embed a single image based on your code. It looks something like this:

import json
import os
from importlib import import_module

import cv2
import tensorflow as tf
import numpy as np

sess = tf.Session()

# Read config
with open(os.path.join('<exp_root>', 'args.json'), 'r') as f:
    config = json.load(f)

# Input img
net_input_size = (
    config['net_input_height'], config['net_input_width'])
img = tf.placeholder(tf.float32, (None, net_input_size[0], net_input_size[1], 3))

# Create the model and an embedding head.
model = import_module('nets.' + config['model_name'])
head = import_module('heads.' + config['head_name'])

endpoints, _ = model.endpoints(img, is_training=False)
with tf.name_scope('head'):
    endpoints = head.head(endpoints, config['embedding_dim'], is_training=False)

# Initialize the network/load the checkpoint.
checkpoint = tf.train.latest_checkpoint(config['experiment_root'])
print('Restoring from checkpoint: {}'.format(checkpoint))
tf.train.Saver().restore(sess, checkpoint)


raw_img = cv2.imread('<img>')
raw_img = cv2.resize(raw_img, net_input_size)
raw_img = np.swapaxes(raw_img, 0, 1)
raw_img = np.expand_dims(raw_img, axis=0)

emb = sess.run(endpoints['emb'],  feed_dict={img: raw_img})[0]

But the results for the same image from my code and from your code are not the same.

Note that no augmentation is applied when I compute the embedding vector.

Am I missing anything here? Thank you for the help.
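One pitfall worth checking in the snippet above: np.swapaxes transposes the pixel content, not just the reported shape, and cv2.resize takes its size argument as (width, height). A tiny NumPy check of what the swap actually does:

```python
import numpy as np

# np.swapaxes(img, 0, 1) transposes the image content, not just its shape.
# Feeding a transposed image to the network changes the embedding entirely.
img = np.arange(6).reshape(2, 3)   # a tiny 2x3 "image"
swapped = np.swapaxes(img, 0, 1)   # shape becomes (3, 2)

print(swapped.shape)     # (3, 2)
print(swapped.tolist())  # [[0, 3], [1, 4], [2, 5]]
```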

@lamhoangtung (Author)

Quick update: I've just found out that you use tf.image.decode_jpeg and tf.image.resize_images instead of OpenCV. I switched to them; the output is different now, but still not the same as your code's.

Am I missing something like normalization? Here is what I've changed:

path = tf.placeholder(tf.string)
image_encoded = tf.read_file(path)
image_decoded = tf.image.decode_jpeg(image_encoded, channels=3)
image_resized = tf.image.resize_images(image_decoded, net_input_size)
img = tf.expand_dims(image_resized, axis=0)

Thanks ;)
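A quick way to rule normalization in or out (a generic sanity check, not code from the repo): compare the raw pixel statistics of the arrays each pipeline actually feeds to the network. If the means differ by roughly the pixel mean itself, a mean-subtraction step is missing; tiny differences instead point at resize/decode behavior.

```python
import numpy as np

# Generic sanity check: compare the arrays two pipelines feed the network.
# Matching stats rule out a missing normalization step.
def pixel_stats(arr):
    arr = np.asarray(arr, dtype=np.float64)
    return arr.mean(), arr.std(), np.abs(arr).max()

a = np.full((4, 4, 3), 128.0)   # stand-in for pipeline A's output
b = a + 0.5                     # stand-in for pipeline B's output
print(pixel_stats(a)[0], pixel_stats(b)[0])  # 128.0 128.5
```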

Pandoro (Member) commented Jun 3, 2019 via email

lamhoangtung (Author) commented Jun 3, 2019

Hi @Pandoro,
Thanks for the quick response. This is what I use to compute the embedding vector:

python3 embed.py \
    --experiment_root ... \
    --dataset ... \
    --filename ...

I extracted the vector from the .h5 file.

Anyway, how can I do TTA in my case? Is there any code in your repo I can reference?
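For reference, test-time augmentation (TTA) generally means embedding several augmented copies of the image (e.g. the original plus a horizontal flip) and averaging the resulting vectors. A minimal sketch of the averaging step (the function is illustrative, not from the repo):

```python
import numpy as np

# Illustrative TTA sketch: average the embeddings of several augmented
# copies of one image (e.g. original + horizontal flip).
def tta_average(embeddings):
    # embeddings: (n_augmentations, embedding_dim)
    return np.mean(np.asarray(embeddings, dtype=np.float64), axis=0)

embs = [[1.0, 2.0], [3.0, 4.0]]  # two augmented copies, 2-D embeddings
print(tta_average(embs))         # [2. 3.]
```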

Pandoro (Member) commented Jun 3, 2019 via email

@lamhoangtung lamhoangtung changed the title Embed a single images Embed a single image Jun 3, 2019
@lamhoangtung (Author)

Hi. I did an experiment with a CSV file containing only the image I want to embed, and found something really strange. There might actually be nothing wrong with your embed code or my inference code.

  • The .h5 output file that I previously used for comparison was created on a remote server with a GPU.
  • My inference code was run on my local machine, which only has a CPU. After recomputing everything on the CPU, I found a big difference between the embedding vectors computed on GPU vs. CPU. (My code and yours produce exactly the same results.)
  • Note that the difference is HUGE, as in completely different. I double-checked the model, code, and input images for the experiment.

Have you ever seen something like this? Am I wrong at some point?
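To make "a big difference" concrete, here is a small generic helper (not code from the repo) that reports the max absolute difference and cosine similarity between two embedding vectors:

```python
import numpy as np

# Generic comparison helper: cosine similarity near 1.0 and a small max
# absolute difference mean the two vectors effectively agree.
def compare(a, b):
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    cos = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return float(np.max(np.abs(a - b))), cos

max_diff, cos = compare([1.0, 0.0], [0.0, 1.0])
print(max_diff, cos)  # 1.0 0.0
```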

Pandoro (Member) commented Jun 3, 2019 via email

lamhoangtung (Author) commented Jun 3, 2019

@Pandoro Same TensorFlow 1.12.0 on both machines.

@lamhoangtung (Author)

Some updates on this.
I tried to redo everything, even training, and here are the results:

  • Embedding vector computed by my single-image inference code, on both GPU and CPU (same result):
[ 0.1475507   0.26669884 -0.10536072 -0.7495441  -0.05301389 -0.12123938
 -0.2105978   0.34713405 -0.06077751  0.38768452  0.46736327 -0.14455695
 -0.13443749  0.4708902  -0.53196555 -0.4674694   0.4387072  -0.01120797
  0.03252156  0.11937858  0.03637908 -0.23512752 -0.087494    0.40861905
  0.39684698 -0.25528368  0.53282946 -0.7992279  -0.04100448  0.607317
  0.37891495 -0.43027154 -0.09188752 -0.31797376  0.2922396   0.3039867
 -0.21458632 -0.40264758  0.01471368  0.14217973  0.29642326 -0.33412308
  0.61750454  0.02563823 -0.4100364  -0.4894322  -0.33408296 -0.30945992
 -0.03018434  0.06986241 -0.3707401  -0.1222352   0.19458997 -0.11415277
 -0.04913341 -0.0650656  -0.23189925 -0.3081076  -0.04566643  0.56977797
  0.1199189  -0.25228524 -0.10953259  0.5716973   0.07392599 -0.1805463
  0.03953229  0.12185388 -0.15962987 -0.21938688 -0.05884064  0.34342512
  0.26555967  0.21485685  0.3734443  -0.19710182 -0.4279406   0.23197423
 -0.27009133  0.30459598 -0.37105414  0.4993727   0.1789047   0.04352051
 -0.16855955 -0.6482116  -0.1902902  -0.02592199 -0.00989667  0.5478813
  0.3826628  -0.33704245  0.3876207  -0.39746612 -0.4097886   0.14956611
  0.03482605 -0.27635813  0.05575407 -0.26498005 -0.19787493 -0.22036389
  0.21582448  0.46559668 -0.41869876  0.12922227  0.0621463   0.01098646
  0.06490406  0.35996896  0.21602859 -0.34911785 -0.18451497  0.05639197
  0.04268607 -0.072242   -0.23873544 -0.09557254  0.03791614 -0.19931975
 -0.07070286  0.09722421  0.29151836 -0.02433551  0.2241952  -0.96187866
  0.13102485  0.00164846]
  • Embedding vector computed by your code on a 100-line .csv file containing only one sample duplicated 100 times: the same for all 100 output vectors, the same on GPU and CPU, and the same as above.
  • Embedding vector computed by your code on a 100k+-line .csv file whose first sample is the one used in the experiment above: the same on GPU and CPU, but not the same as above:
[ 4.46426451e-01  2.67341495e-01 -3.03951055e-01 -1.09888956e-01
  1.48094699e-01  1.09376453e-01  3.18785965e-01 -2.31513470e-01
  9.18060988e-02  9.47581697e-03 -3.14935297e-01 -5.06232917e-01
  2.13361338e-01  5.70732616e-02  5.59608713e-02 -2.04994321e-01
 -7.14561269e-02  4.35655147e-01  4.42430824e-01 -1.19181640e-01
 -9.79143828e-02  3.38607967e-01 -8.01632106e-02  8.19585398e-02
  3.10744733e-01 -5.10766864e-01  3.90632376e-02  3.73192802e-02
 -2.21006293e-02  1.50721356e-01  3.10757637e-01 -1.00263797e-01
 -3.67254391e-02  3.62346590e-01 -2.23815039e-01 -4.09024119e-01
 -7.41786659e-01 -2.77244627e-01 -6.83265150e-01 -3.71105620e-04
  3.62792283e-01 -3.34418714e-01  4.02492136e-01  2.93934852e-01
  5.06364256e-02  1.14161275e-01 -1.49569120e-02  2.07622617e-01
  9.04084072e-02  2.35464871e-01  1.60102062e-02 -1.07340008e-01
 -6.13746643e-01 -1.84301529e-02 -3.65158543e-02 -2.17433404e-02
  4.48067039e-01  3.31106067e-01  2.05742702e-01 -1.24085128e-01
  2.07252398e-01 -5.85925281e-01 -2.59883493e-01  2.63391703e-01
 -3.12482953e-01 -1.48463324e-01 -2.19984993e-01  3.31126675e-02
  1.76012367e-01  3.09261560e-01 -1.59823354e-02  1.53631851e-01
  1.53570157e-02 -2.29165092e-01  3.28389913e-01 -2.26212129e-01
 -3.93793285e-01 -1.54186189e-01 -4.85752940e-01  1.30166719e-02
 -5.14035374e-02 -1.77116096e-01  9.73375281e-05 -2.54578739e-02
  3.99445705e-02  4.45321977e-01  2.78115660e-01 -1.51245281e-01
 -3.03700745e-01 -3.81025001e-02  1.43309757e-01 -6.55035377e-01
  8.83019418e-02 -3.06550767e-02 -4.80769187e-01  4.71787043e-02
  5.49029335e-02 -1.17088296e-01  3.43144536e-01 -7.30120242e-02
 -3.58440757e-01 -1.66995618e-02 -3.06979388e-01  5.11138923e-02
  1.75048336e-01 -1.83060188e-02 -3.81746352e-01 -6.02350771e-01
 -3.84051464e-02  5.41097879e-01  2.33160406e-01  8.10048282e-02
 -4.97415751e-01 -3.47296298e-02 -8.40142891e-02  2.04959571e-01
  6.48377165e-02 -1.64840698e-01  1.98047027e-01  1.82637498e-01
 -9.53407511e-02  2.63416976e-01 -1.82583451e-01 -3.99179049e-02
  2.82630742e-01 -6.65262759e-01 -5.13938844e-01 -1.60764366e-01]
  • I searched through all 100k+ computed vectors to see whether any index got shuffled, but I couldn't find a match.
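The "search all 100k+ vectors" step above can be sketched like this (a generic nearest-neighbor check, not code from the repo):

```python
import numpy as np

# Find the stored row closest (Euclidean distance) to a query embedding;
# if even the closest distance is large, no shuffled index explains it.
def nearest_row(matrix, query):
    dists = np.linalg.norm(np.asarray(matrix) - np.asarray(query), axis=1)
    idx = int(np.argmin(dists))
    return idx, float(dists[idx])

mat = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
idx, dist = nearest_row(mat, [0.9, 1.1])
print(idx)  # 1
```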

Where could I be going wrong? Here is how I extract the vector from the .h5 file:

import h5py
import numpy as np
import pandas as pd

raw_embedding = h5py.File('....h5', 'r')
raw_label = pd.read_csv('...csv')

def load_data():
    features = raw_embedding['emb'].value
    labels = list(raw_label.iloc[:, 1])
    return (features, labels)

vecs, imgs = load_data()
print(vecs[0], imgs[0])

Thanks for your help @Pandoro

@lamhoangtung (Author)

Note: I tried a bunch of different images, so the problem is not related only to the first sample of the dataset.
=> Question: do you do any dataset-level normalization?

Pandoro (Member) commented Jun 4, 2019 via email

@lamhoangtung (Author)

So which one should I use? Which one is more accurate?
Should I create fake batches, or keep batch_size = 1 at inference time?

Pandoro (Member) commented Jun 4, 2019 via email

mazatov commented Nov 6, 2019

@lamhoangtung, were you able to figure this out? I'm trying to follow your steps to generate embeddings and compare them, but so far I'm running into some errors.

For some reason I cannot load the model this way (#85):

checkpoint = tf.train.latest_checkpoint(config['experiment_root'])

I tried loading the model this way (raw strings so the backslashes in the Windows paths aren't treated as escapes),

saver = tf.train.import_meta_graph(r'experiments\my_experiment\checkpoint-25000.meta')
saver.restore(sess, r'experiments\my_experiment\checkpoint-25000')

but that still gives me an error when I try to run:
emb = sess.run(endpoints['emb'], feed_dict={img: raw_img})[0]

FailedPreconditionError (see above for traceback): Attempting to use uninitialized value resnet_v1_50/block4/unit_3/bottleneck_v1/conv3/BatchNorm/gamma
	 [[node resnet_v1_50/block4/unit_3/bottleneck_v1/conv3/BatchNorm/gamma/read (defined at C:\Users\mazat\Documents\Python\trinet\nets\resnet_v1.py:118) ]]
	 [[node head/emb/BiasAdd (defined at C:\Users\mazat\Documents\Python\trinet\heads\fc1024.py:17) ]]

Thanks

mazatov commented Nov 7, 2019

@lamhoangtung I think I figured out the first problem.

  1. cv2 loads the image in BGR channel order, so you need to convert it to RGB.
  2. There seem to be some differences in how cv2 and TensorFlow decode JPEG images. See https://stackoverflow.com/questions/45516859/differences-between-cv2-image-processing-and-tf-image-processing

So, to get the cv2-loaded embeddings close to the embed.py values, I did the following:

raw_img = cv2.imread(os.path.join(config['image_root'],'query', '0001_c1s1_001051_00.jpg'))
raw_img = cv2.cvtColor(raw_img, cv2.COLOR_BGR2RGB)
raw_img = cv2.resize(raw_img, (net_input_size[1], net_input_size[0]))
raw_img = np.expand_dims(raw_img, axis=0)
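The cvtColor call above amounts to reversing the last (channel) axis; a NumPy slice shows the same thing:

```python
import numpy as np

# cv2.cvtColor(img, cv2.COLOR_BGR2RGB) is equivalent to reversing the
# channel axis of the array.
bgr = np.array([[[10, 20, 30]]], dtype=np.uint8)  # one pixel: B=10, G=20, R=30
rgb = bgr[..., ::-1]
print(rgb.tolist())  # [[[30, 20, 10]]]
```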

If you want exactly the same values, you can load the image with TF instead of cv2:

image_encoded = tf.read_file(os.path.join(config['image_root'],'query', '0001_c1s1_001051_00.jpg'))
image_decoded = tf.image.decode_jpeg(image_encoded, channels=3)
image_resized = tf.image.resize_images(image_decoded, net_input_size)
img = tf.expand_dims(image_resized, axis=0)

# Create the model and an embedding head.
model = import_module('nets.' + config['model_name'])
head = import_module('heads.' + config['head_name'])

endpoints, _ = model.endpoints(img, is_training=False)
with tf.name_scope('head'):
    endpoints = head.head(endpoints, config['embedding_dim'], is_training=False)

tf.train.Saver().restore(sess, os.path.join(config['experiment_root'], 'checkpoint-25000'))

emb = sess.run(endpoints['emb'])[0]

I got almost identical embeddings this way.
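"Almost identical" can be made precise with np.allclose, using a tolerance that absorbs the decode/resize differences discussed above:

```python
import numpy as np

# Treat two embeddings as matching if every component agrees within atol.
a = np.array([0.1475507, 0.2666988, -0.1053607])
b = a + 1e-6                               # simulated decode/resize jitter
print(bool(np.allclose(a, b, atol=1e-4)))  # True
```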
