
Recognition problem with multi-person images #4

Open

xinzi2018 opened this issue Mar 16, 2022 · 15 comments

Comments

@xinzi2018

xinzi2018 commented Mar 16, 2022

I've found that recognition on multi-person images is quite poor; see the result in the top-right corner of the image below.
image

So I tried the approach in the code below: first detect the body boxes with DetectronV2, then feed each cropped body into the network one by one, and finally fuse the results.

def ClothSegMultiGen(self,img_cv,size=-1,):

        img = Image.fromarray(cv2.cvtColor(img_cv, cv2.COLOR_BGR2RGB))

        w,h = img.size

        body_boxes, _, sub_bodys = self.face_analysis.DetectronV2BodyBox(img_cv)
        total_rate = np.zeros((4,img_cv.shape[0],img_cv.shape[1]))-float('inf')  # initialize a matrix filled with -inf (smallest possible values)
        if len(sub_bodys)!=0:
            output_img = np.zeros((img_cv.shape[0], img_cv.shape[1]))  # placeholder, overwritten by the argmax below
            for i in range(len(sub_bodys)):
                total_sub_rate = np.zeros((4,img_cv.shape[0],img_cv.shape[1]))-float('inf')  # initialize a matrix filled with -inf (smallest possible values)

                left, top, right, bottom = body_boxes[i][0],body_boxes[i][1],body_boxes[i][0]+body_boxes[i][2],body_boxes[i][1]+body_boxes[i][3]
                sub_img = img.crop((left, top, right, bottom))
                sub_img, sub_rate, sub_img_color   = self.ClothSegGen(sub_img,640)
                if i==0:

                    total_rate[:,top:bottom,left:right]= sub_rate
                else:
                    # np.argmax(sub_rate, axis=1)
                    total_sub_rate[:,top:bottom,left:right]= sub_rate
                    total_rate = maxTwoNumpy(total_sub_rate,total_rate)

            output_img = np.argmax(total_rate, axis=0)
            output_img_color = self.indexColor(output_img,w,h)
        else:
            output_img,_,output_img_color   = self.ClothSegGen(img,640)

def ClothSegGen(self,img_cv,size=-1,): # size is the target short-edge length
        img = Image.fromarray(cv2.cvtColor(img_cv, cv2.COLOR_BGR2RGB))
        w,h = img.size
        if size!=-1:
            if w>h:
                h_out = size
                w_out = w * h_out // h
            else:  
                w_out = size
                h_out = h * w_out // w

            img = img.resize((w_out,h_out))

        image_tensor = self.transform_rgb(img)
        image_tensor = torch.unsqueeze(image_tensor, 0)
        print('Input image size for the cloth-segmentation network:',image_tensor.shape)
        output_tensor = self.net(image_tensor.to(self.device))
        output_tensor = F.log_softmax(output_tensor[0], dim=1)

        output_tensor_ori = output_tensor.clone()
        output_tensor = torch.max(output_tensor_ori, dim=1, keepdim=True)[1]  # torch.max(...)[1] returns only the argmax indices


       
        output_tensor = torch.squeeze(output_tensor, dim=0)
        output_tensor = torch.squeeze(output_tensor, dim=0)
        output_arr = output_tensor.cpu().numpy()
        output_img = Image.fromarray(output_arr.astype("uint8"), mode="L")

        
        output_img_color = self.indexColor(output_img,w,h)



        # separately extract the probability data
        output_tensor0 = output_tensor_ori.clone() # note: these floats are not probability values in [0, 1]
        output_tensor0 = torch.squeeze(output_tensor0, dim=0) 
        # output_tensor0 = torch.squeeze(output_tensor0, dim=0)
        output_rate = output_tensor0.cpu().numpy() # 4*h*w
 

        return output_img,output_rate,output_img_color 

However, this fusion approach has a serious problem: the previous body box overwrites the next one (for example, the seam between the two left people in the lower image above). Could this be caused by log_softmax?
Previously, with probabilities from sigmoid, the same fusion scheme merged results correctly.
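For reference, the key difference is that F.log_softmax returns log-probabilities, which are all ≤ 0, while F.softmax (like sigmoid) returns values in (0, 1). A minimal pure-Python sketch of one pixel's class scores (the logits below are made up for illustration):

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of logits
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

logits = [2.0, -1.0, 0.5, 0.3]  # hypothetical per-class scores for one pixel
probs = softmax(logits)
log_probs = [math.log(p) for p in probs]

# softmax values lie in (0, 1) and sum to 1, so max-based fusion across
# overlapping crops behaves like fusing sigmoid outputs
print(all(0.0 < p < 1.0 for p in probs))   # True
print(abs(sum(probs) - 1.0) < 1e-9)        # True

# log_softmax values are all <= 0, so mixing them with a buffer padded
# with -inf (or zeros) distorts the argmax at crop boundaries
print(all(lp <= 0.0 for lp in log_probs))  # True
```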

@levindabhi
Owner

I am not sure what your actual question is. If you are asking about the wrong output in the marked region of the image below, then it is caused by these two lines: total_rate[:,top:bottom,left:right]= sub_rate and total_sub_rate[:,top:bottom,left:right]= sub_rate.
image

I guess it should be total_rate[:,top:bottom,left:right] += sub_rate and total_sub_rate[:,top:bottom,left:right] += sub_rate, with total_rate and total_sub_rate initialized to zero.
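The suggestion can be illustrated with a toy 1-D analogue (box coordinates and scores below are invented): with a zero-initialized buffer and += accumulation, overlapping boxes combine instead of the later box overwriting the earlier one.

```python
import numpy as np

# Toy analogue of the fusion buffer: 4 classes over a 10-pixel row.
num_classes, width = 4, 10
total_rate = np.zeros((num_classes, width))  # zero init, as suggested

# Two hypothetical overlapping "body boxes" with random per-class scores.
boxes = [(0, 6), (4, 10)]
rng = np.random.default_rng(0)
for left, right in boxes:
    sub_rate = rng.random((num_classes, right - left))
    # Accumulate instead of overwriting, so the first box's scores
    # survive inside the overlap [4, 6).
    total_rate[:, left:right] += sub_rate

labels = np.argmax(total_rate, axis=0)
print(labels.shape)  # (10,)
```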

@levindabhi
Owner

Also, your idea of using DetectronV2 is nice. I would be more than happy to add a multi-person cloth-segmentation feature to this repo through your pull request.

@xinzi2018
Author

xinzi2018 commented Apr 8, 2022

I've found the root of the problem!!!
1. At generation time, change output_tensor = F.log_softmax(output_tensor[0], dim=1) to output_tensor = F.softmax(output_tensor[0], dim=1).
2. The generated output_tensor has shape 1/4/H/W, and the white (high-probability) region of output_tensor[0,0,:,:] represents the background, whereas my original initialization was -float('inf') everywhere; this broke the final fusion.
The detailed handling code is as follows:

def ClothSegMultiGen(self,img_cv,size=-1,):

        img = Image.fromarray(cv2.cvtColor(img_cv, cv2.COLOR_BGR2RGB))
        w,h = img.size
        body_boxes, _, sub_bodys = self.detectbody.gen_bodybox(img_cv)
        print('Number of people detected:',len(sub_bodys))
        total_rate = torch.zeros(1,4,img_cv.shape[0],img_cv.shape[1]).float()  # fusion buffer, initialized to zero
        total_rate[:,0,:,:] =  1 # background channel starts at probability 1
        if len(sub_bodys)>1:
            for i in range(len(sub_bodys)):
                print('----------------------------------')
                total_sub_rate = torch.zeros(1,4,img_cv.shape[0],img_cv.shape[1]).float()   # per-box buffer, initialized to zero
                total_sub_rate[:,0,:,:] = 1   # background channel starts at probability 1
                left, top, right, bottom = int(body_boxes[i][0]),int(body_boxes[i][1]),int(body_boxes[i][0])+int(body_boxes[i][2]),int(body_boxes[i][1])+int(body_boxes[i][3])
                sub_img = img.crop((left, top, right, bottom))
                sub_w,sub_h = sub_img.size
                sub_img, sub_rate, sub_img_color   = self.ClothSegGen(sub_img)
                total_sub_rate[0,:,top:bottom,left:right]= sub_rate
                total_rate[:,1:,:,:] = torch.max(total_sub_rate[:,1:,:,:],total_rate[:,1:,:,:])
                total_rate[:,0,:,:] = torch.min(total_sub_rate[:,0,:,:],total_rate[:,0,:,:])  
                a = torch.max(total_rate[:,:,:,:], dim=1, keepdim=True)[1]  # torch.max(...)[1] returns only the argmax indices
             

                a = torch.squeeze(a, dim=0)
                a = torch.squeeze(a, dim=0)
                output_arr = a.cpu().numpy()
                output_img = Image.fromarray(output_arr.astype("uint8"), mode="L")

            
                output_img_color = self.indexColor(output_img,w,h)
               
        else:
            output_img,_,output_img_color   = self.ClothSegGen(img,640)


        return output_img,_,output_img_color 

The final result is shown below:
image

@davichen2017

(quoting @xinzi2018's fix above)
What environment did you use to reproduce this? Why do I get the following errors when I try?
TypeError: Caught TypeError in DataLoader worker process 0.
TypeError: 'float' object cannot be interpreted as an integer

Is it a dataset problem?

@xinzi2018
Author

I don't think I ran into either of these problems with this code, though I don't remember exactly.
For "TypeError: Caught TypeError in DataLoader worker process 0.": this is most likely caused by the num_workers setting of your DataLoader; setting it to 0 should make the error go away.
For "'float' object cannot be interpreted as an integer": where exactly does the error occur?

My environment is torch 1.8, torchvision 0.9, Python 3.6.5, CUDA 11.1.

@davichen2017

davichen2017 commented Apr 25, 2022

(quoting @xinzi2018's reply above)

I later cast the following values to int, and the error went away:
self.image_info[index]["orig_height"] = int(row["Height"])
self.image_info[index]["orig_width"] = int(row["Width"])

My environment is torch 1.11, torchvision 0.12, Python 3.8, CUDA 11.3.

@davichen2017

(quoting @xinzi2018's reply above)

After training, I keep running out of GPU memory during evaluation/prediction, even though my GPU has 6 GB of memory, which should be plenty.
Have you run into this?

@xinzi2018
Author

Yes, that can happen: when the input image is too large, the GPU runs out of memory.

@davichen2017

Yes, that can happen: when the input image is too large, the GPU runs out of memory.

I picked images at random from the test set, and every one of them ran out of memory.

@xinzi2018
Author

How large are the images?

@davichen2017

Resolution: 2832*4256, at 96 dpi.

@xinzi2018
Author

That size will indeed run out of memory. Even on my 24 GB GPU, a few individual images run out of memory during batch generation.

@davichen2017

Right. I just tried an 800*1200 image and it works.

@davichen2017

May I ask: the multi-person recognition function you wrote looks great. We haven't tested it yet; is the recognition accuracy high?

@karndeepsingh

(quoting @xinzi2018's fix above)

Hi @xinzi2018 @levindabhi @davichen2017 .
I want to extract the different classes from the detected image together with a confidence level. How can I do so with the inference code provided in the repository?

for image_name in images_list:
    img = Image.open(os.path.join(image_dir, image_name)).convert('RGB')
    img_size = img.size
    img = img.resize((768, 768), Image.BICUBIC)
    image_tensor = transform_rgb(img)
    image_tensor = torch.unsqueeze(image_tensor, 0)
    
    output_tensor = net(image_tensor.to(device))
    output_tensor = F.log_softmax(output_tensor[0], dim=1)
    output_tensor = torch.max(output_tensor, dim=1, keepdim=True)[1]
    output_tensor = torch.squeeze(output_tensor, dim=0)
    output_tensor = torch.squeeze(output_tensor, dim=0)
    output_arr = output_tensor.cpu().numpy()

    output_img = Image.fromarray(output_arr.astype('uint8'), mode='L')
    output_img = output_img.resize(img_size, Image.BICUBIC)
    # output_img.save(os.path.join(result_dir, image_name[:-4]+'_generated.png'))
    output_img.putpalette(palette)
    output_img.save(os.path.join(result_dir, image_name[:-4]+'_generated.png'))

Need your help.
Thanks
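One possible way to get per-class confidence, sketched here as a NumPy analogue of the network output rather than the repository's own API: apply a softmax over the class dimension (instead of log_softmax) and read off the winning class and its probability per pixel. The array shapes and class count below are made up for illustration.

```python
import numpy as np

def per_class_confidence(logits):
    """logits: (C, H, W) raw class scores; returns labels (H, W) and
    conf (H, W), the softmax probability of the winning class."""
    shifted = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    exps = np.exp(shifted)
    probs = exps / exps.sum(axis=0, keepdims=True)        # softmax over classes
    labels = probs.argmax(axis=0)                         # per-pixel class index
    conf = probs.max(axis=0)                              # per-pixel confidence
    return labels, conf

# Hypothetical 4-class output on a 3x3 patch.
rng = np.random.default_rng(1)
logits = rng.normal(size=(4, 3, 3))
labels, conf = per_class_confidence(logits)
print(labels.shape, conf.shape)  # (3, 3) (3, 3)

# Example: pixels confidently assigned to class 1 (threshold is arbitrary).
mask_class1 = (labels == 1) & (conf > 0.5)
```

In the PyTorch code above, the analogous change would be computing F.softmax(output_tensor[0], dim=1) and taking torch.max over dim=1 to get both the confidence values and the indices.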
