modes/track/ #7906
Replies: 132 comments 330 replies
-
Can I run two models simultaneously on one video? I want the two models to work at the same time and combine their results. Is that possible? Please let me know. Thanks in advance!
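One way to sketch this (the model file names below are only examples, and each model keeps its own tracker state, so IDs are not shared between them) is to run both models on every frame and pool the detections yourself:
import cv2
from ultralytics import YOLO

model_a = YOLO("yolov8n.pt")  # example weights; substitute your own
model_b = YOLO("yolov8s.pt")

cap = cv2.VideoCapture("path/to/video.mp4")
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    detections = []  # cumulative results from both models for this frame
    for model in (model_a, model_b):
        results = model.track(frame, persist=True, verbose=False)
        if results[0].boxes is not None:
            detections.extend(results[0].boxes.xyxy.cpu().tolist())
    # 'detections' now holds the pooled boxes from both models for this frame
cap.release()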
-
Hi, first of all, I have been loving working with YOLOv8. Great tool! However, I have been having difficulty with a certain task. I want to use model.track on videos that I have, with save_crop=True, but save the crops with a naming convention that lets me track each person's ID. Currently, save_crop just gives me the cropped images of the detected objects, but there is no way to know which frame of the video each crop came from, or which ID is attached to which cropped image. The visualization through cv2.imshow shows the IDs across the different frames, but I can't find a way to save them. The naming convention I am looking for is something like this: "frame_30_ID_1.jpg". My current code looks something like this:
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # load model
video_path = "path/to/video.mp4"
ret = True
while ret:
    ...
cap.release()
Any help would be greatly appreciated! Thanks!
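A rough sketch of one way to do this, skipping save_crop and writing the crops yourself; the output folder and file-name pattern below are just examples:
import os
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
cap = cv2.VideoCapture("path/to/video.mp4")
os.makedirs("crops", exist_ok=True)  # output folder for the crops (example name)
frame_idx = 0
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    results = model.track(frame, persist=True, verbose=False)
    boxes = results[0].boxes
    if boxes is not None and boxes.id is not None:
        for xyxy, track_id in zip(boxes.xyxy.int().tolist(), boxes.id.int().tolist()):
            x1, y1, x2, y2 = xyxy
            crop = frame[y1:y2, x1:x2]
            cv2.imwrite(f"crops/frame_{frame_idx}_ID_{track_id}.jpg", crop)  # e.g. frame_30_ID_1.jpg
    frame_idx += 1
cap.release()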
-
Hi @pderrenger. Can I run the model using my phone's camera? Could you please share code to invoke my mobile's camera to test the model? Thanks in advance.
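This isn't an official phone API, but a common workaround is to stream the phone's camera over the network with an app such as IP Webcam and pass the stream URL as the source; the address below is only a placeholder:
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
# Replace with the URL shown by your phone's camera-streaming app
results = model.track(source="http://192.168.1.100:8080/video", show=True)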
-
Hi, help me understand why I get this error when tracking with a segmentation model. My ultimate goal is to use a custom car-plate segmentation model for tracking. Thank you very much.
-
YOLOv8 is very practical overall. Can I implement tracking with two cameras? I would like a car tracked by camera A to keep the same track ID when it moves into camera B's view, but currently there is always an ID switch. Is it because of the model's accuracy?
def cam2():
    cap = cam

a = threading.Thread(target=cam1)
a.start()
-
Hey there,
-
Hi, I saw that I can use an OpenVINO IR format model just like any other PyTorch model and then run tracking as normal. I was wondering how I would load the IR '.xml' and '.bin' files as arguments into YOLO(), or whether I should load my model using the openvino library instead? Thanks.
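If the IR files were produced by Ultralytics' own export, you can normally point YOLO() at the exported directory (which contains the .xml and .bin) rather than at the individual files; a sketch assuming the default export folder name:
from ultralytics import YOLO

# Export once; this creates e.g. 'yolov8n_openvino_model/' holding the .xml and .bin
YOLO("yolov8n.pt").export(format="openvino")

# Load the exported IR directory and track as usual
ov_model = YOLO("yolov8n_openvino_model/")
results = ov_model.track(source="path/to/video.mp4", show=True)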
-
Can I use a YOLOv8 model to track and re-identify a person, with the same ID assigned to them, across multiple camera feeds?
-
How can we only track moving objects in the "Plotting Tracks Over Time" code?
from collections import defaultdict

import cv2

from ultralytics import YOLO

# Load the YOLOv8 model
model = YOLO('yolov8n.pt')

# Open the video file
video_path = "path/to/video.mp4"
cap = cv2.VideoCapture(video_path)

# Store the track history
track_history = defaultdict(lambda: [])

# Loop through the video frames
while cap.isOpened():
    ...

# Release the video capture object and close the display window
cap.release()
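One way to restrict that example to moving objects (a sketch, with an arbitrary pixel threshold) is to keep the track history as usual but only draw or count a track whose centroid has actually moved over its recent history:
import numpy as np

MIN_DISPLACEMENT = 10  # pixels; tune for your footage

def is_moving(points, min_disp=MIN_DISPLACEMENT):
    """Return True if the first and last stored points are far enough apart."""
    if len(points) < 2:
        return False
    start = np.array(points[0], dtype=float)
    end = np.array(points[-1], dtype=float)
    return np.linalg.norm(end - start) > min_disp

# Inside the frame loop, after appending to track_history[track_id]:
#     if is_moving(track_history[track_id]):
#         # draw the polyline / count this object
#         ...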
-
import cv2
from ultralytics import YOLO

model = YOLO('yolov8_custom_train.engine', task="detect")

# Path to the input video file
input_video_path = '/content/gdrive/MyDrive/yolov8-tensorrt/inference/output_video.mp4'
# Path to the output video file
output_video_path = 'outputtest_video.mp4'

# Define the coordinates of the polygon
polygon_points = [(670, 66), (1237, 550), (514, 1054), (161, 295)]

# Open the input video file
cap = cv2.VideoCapture(input_video_path)

# Get video properties
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))

# Define the codec and create VideoWriter object
fourcc = cv2.VideoWriter_fourcc(*'mp4v')

# Function for finding the centroid
def calculate_centroid(box):
    ...

# Function to check if two bounding boxes overlap
def check_overlap(box1, box2):
    ...

# Read until video is completed
while cap.isOpened():
    ...

# Release video objects
cap.release()
# Close all OpenCV windows
cv2.destroyAllWindows()

In this, I am tracking the label "person", but after 2 to 3 frames the IDs change. Is there any solution for this?
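If the hidden loop calls model.track() frame by frame, two things are worth checking (a guess, since the loop body isn't shown): pass persist=True so the tracker keeps its state between calls, and give the tracker a longer buffer so short occlusions don't spawn new IDs. The yaml below is a hypothetical copy of the stock bytetrack.yaml with only track_buffer raised, and classes=[0] assumes "person" is class 0 in your model:
# Per-frame tracking call; persist=True keeps IDs across successive calls
results = model.track(frame, persist=True, classes=[0], tracker="custom_bytetrack.yaml")

# custom_bytetrack.yaml (hypothetical file), based on the stock bytetrack.yaml:
# tracker_type: bytetrack
# track_high_thresh: 0.5
# track_low_thresh: 0.1
# new_track_thresh: 0.6
# track_buffer: 60      # keep lost tracks around longer before a new ID is assigned
# match_thresh: 0.8
# fuse_score: True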
-
What is the difference between these attributes of results[0].boxes:
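For reference, a quick way to inspect the common ones yourself (attribute names as in recent Ultralytics releases):
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
results = model.track("path/to/video.mp4", persist=True)
boxes = results[0].boxes
print(boxes.xyxy)   # boxes as (x1, y1, x2, y2) in pixels
print(boxes.xywh)   # boxes as (x_center, y_center, width, height) in pixels
print(boxes.xyxyn)  # xyxy normalized to 0-1 by image size
print(boxes.xywhn)  # xywh normalized to 0-1 by image size
print(boxes.conf)   # confidence score per box
print(boxes.cls)    # class index per box
print(boxes.id)     # track ID per box (None when not tracking)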
-
Is it possible to use our own weights as the model to track, or must we use yolov8n.pt?
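Custom weights work the same way as the pretrained checkpoints; a minimal sketch assuming your training run produced a best.pt:
from ultralytics import YOLO

model = YOLO("runs/detect/train/weights/best.pt")  # your own trained weights
results = model.track(source="path/to/video.mp4", show=True)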
-
So I am using YOLOv8 for my current project, and it's been a breeze so far. I do have a question about the tracking method provided by YOLOv8. When I am using the generic yolov8n model (or even a custom model with a few object classes), I know I can filter out things that don't interest me by their ID, as below:
But when I catch an object that I am interested in, can I, at that moment or at that frame, issue a track command to start tracking it? If it can be done, can you tell me how? A short example would be even better! Thanks in advance.
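A sketch of one way to do this, assuming the filtering is done through the classes argument: run plain predict until the object of interest appears, then switch to per-frame model.track(..., persist=True) from that point on. The class index 0 is only an example:
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
cap = cv2.VideoCapture("path/to/video.mp4")
tracking = False
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    if not tracking:
        # Cheap detection pass, filtered to the class of interest
        results = model.predict(frame, classes=[0], verbose=False)
        if len(results[0].boxes) > 0:
            tracking = True  # object of interest appeared; start tracking from here
    if tracking:
        results = model.track(frame, persist=True, classes=[0], verbose=False)
        # results[0].boxes.id now carries the track IDs
cap.release()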
-
Hi, I would like some detailed help and guidance on using custom tracker models with my custom YOLOv8 pose model. I am facing a re-identification problem when using bytetrack.yaml, so I think I should use StrongSORT or DeepSORT. I would like the Ultralytics team to help me select a tracker model (or use multiple tracker models) and to guide me properly on how to use them with my custom-trained YOLOv8 model.
-
import random
import cv2
from ultralytics import YOLO

# opening the file in read mode
my_file = open("utils/coco.txt", "r")
# reading the file
data = my_file.read()
# replacing end splitting the text | when newline ('\n') is seen.
class_list = data.split("\n")

# Generate random colors for class list
detection_colors = []

# load a pretrained YOLOv8n model
model = YOLO("weights/yolov8n.pt", "v8")

# Vals to resize video frames | small frame optimise the run
frame_wid = 640

def CarBehaviour(frame, color_threshold=1100):
    ...

def detect_and_draw(frame, model, class_list, detection_colors):
    ...

# Open video capture
cap = cv2.VideoCapture("/home/opencv_env/Vehicle-rear-lights-analyser-master/testing_data/road_2.mp4")
if not cap.isOpened():
    ...
while True:
    ...

# When everything done, release the capture
cap.release()
-
I have been trying to fine-tune my model to track and count fish fingerlings; keep in mind their small size. Any suggestions on how I can go about this?
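For small objects, a common starting point (no guarantee it is enough on its own) is to train and run inference at a larger image size and a lower confidence threshold; a sketch with example values, where fingerlings.yaml and tank_video.mp4 are placeholders for your own dataset config and footage:
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
# Train at a larger image size so the small fingerlings cover more pixels
model.train(data="fingerlings.yaml", imgsz=1280, epochs=100)

# Track at the same resolution, with a lower confidence threshold
results = model.track(source="tank_video.mp4", imgsz=1280, conf=0.2)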
-
The tracking functionality and its applications are great, but it would be better to have built-in support for using DeepSORT, StrongSORT, BoostTrack++, etc. directly; we would take care of the model files, so that wouldn't be an issue. Also, I don't know how to start integrating an official tracking approach like BoostTrack++ with YOLOv8, as there is no properly integrated repository, and I want to understand the integration process, since I believe it is better to know it than to keep looking and find no solution elsewhere.
-
How do I run inference in multiple threads using a YOLOv8 model? I have a multi-CPU server and I want to reduce the processing time. I want to use multiple CPUs for a single inference. Could you help me out?
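Two separate levers are worth distinguishing here (a sketch with made-up video paths): letting PyTorch use more CPU cores for a single forward pass, and running several sources in parallel with one model instance per thread, which is the thread-safe pattern the Ultralytics docs recommend:
import threading

import torch
from ultralytics import YOLO

# 1) Let a single inference use more CPU cores
torch.set_num_threads(8)  # cores used by one forward pass

# 2) Process several sources in parallel, one model instance per thread
def run_tracker(source):
    model = YOLO("yolov8n.pt")  # each thread gets its own model instance
    model.track(source=source, device="cpu", verbose=False)

sources = ["video1.mp4", "video2.mp4"]  # example paths
threads = [threading.Thread(target=run_tracker, args=(s,)) for s in sources]
for t in threads:
    t.start()
for t in threads:
    t.join()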
-
Hello, I have a problem that I have been thinking about for several days. I have this script that captures all the frames of a streaming camera. I run tracking inference on the video, and the frames come from a queue so they can be captured ahead of time, but the tracker is much slower at processing than the capture, so the queue grows until I run out of memory, or it reaches its maximum size and I lose frames. I would like to not lose frames and still process them all, but I can't find a way. Here is the implemented code, in case you can give me a hand.
#!/usr/bin/env python3
import cv2
import multiprocessing as mp
import logging
from collections import defaultdict, Counter
from ultralytics import YOLO
import numpy as np
import argparse
from pymongo import MongoClient
import clickhouse_connect
from ultralytics.utils.plotting import Annotator, Colors
from ultralytics.utils.ops import xyxy2ltwh
from dateutil.parser import isoparse
from datetime import datetime, timedelta
from bson import ObjectId
from utils import cam_to_adr, cam_to_direction, day_of_week_translation
import signal
import sys
# Logger configuration
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')
# Output video configuration
output_video_writer = None
# Global variables
track_history = defaultdict(lambda: [])
track_class_history = defaultdict(list)
last_seen_frame = {}
current_class = {}
object_count = defaultdict(int)
frames_until_disappear = 100
grace_period_seconds = 5
current_frame = mp.Value('i', 0) # Shared value between processes
first_detection_time = {}
# Database variables
mongo_client = None
clickhouse_client = None
# Input video variables
video_codec = None
video_framerate = 30
video_resolution = None
DEFAULT_FPS = 30
# Global event to stop processes
stop_event = mp.Event()
def signal_handler(sig, frame, processes):
    """Signal handler to stop the processes gracefully."""
    logging.info("Interrupt signal received, stopping processes...")
    stop_event.set()  # Stop all processes by setting the event.
    # Try to wait for processes to finish
    for process in processes:
        if process.is_alive():
            logging.info(f"Waiting for process {process.name} to finish...")
            process.join(timeout=5)  # Wait up to 5 seconds for each process
            # If the process is still alive after 5 seconds, terminate it forcefully
            if process.is_alive():
                logging.warning(f"Forcing termination of process {process.name}...")
                process.terminate()
                process.join()
    logging.info("All processes have been stopped successfully.")
    # Release video resources
    if output_video_writer and output_video_writer.isOpened():
        output_video_writer.release()
        logging.info(f"Video saved and closed successfully.")
    # Close MongoDB and ClickHouse connections if open
    if mongo_client:
        mongo_client.close()
    if clickhouse_client:
        clickhouse_client.close()
    cv2.destroyAllWindows()
    sys.exit(0)
def capture_frames(video_source, frame_queue, max_queue_size):
    """Function to capture frames and add them to the queue"""
    logging.info("Starting frame capture...")
    cap = cv2.VideoCapture(video_source)
    global video_framerate, video_resolution, video_codec
    video_codec = cap.get(cv2.CAP_PROP_FOURCC)
    video_framerate = cap.get(cv2.CAP_PROP_FPS) or DEFAULT_FPS
    frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    video_resolution = {"width": frame_width, "height": frame_height}
    while cap.isOpened() and not stop_event.is_set():
        ret, frame = cap.read()
        if not ret:
            logging.warning("End of video or error in capture.")
            break
        # Wait until there is space in the queue to add the next frame
        while frame_queue.qsize() >= max_queue_size and not stop_event.is_set():
            # Actively wait until there is space
            continue
        # Add the frame to the processing queue
        if not stop_event.is_set():
            frame_queue.put(frame)
            logging.debug(f"Frame captured and added to queue. Current queue size: {frame_queue.qsize()}")
    cap.release()
    logging.info("Frame capture finished.")
def inference_worker(frame, model):
    """Function to perform inference on a frame"""
    logging.debug("Starting inference on frame.")
    return model.track(frame, conf=0.3, iou=0.5, persist=True, show=False, stream=True, tracker="botsort.yaml")
def process_disappeared_tracks():
    """Check and process disappeared tracks."""
    current_time = datetime.now()
    disappeared_ids = []
    for track_id, (last_seen_frame_num, last_seen_time) in last_seen_frame.items():
        time_since_last_seen = (current_time - last_seen_time).total_seconds()
        frames_since_last_seen = current_frame.value - last_seen_frame_num
        if time_since_last_seen > grace_period_seconds and frames_since_last_seen > frames_until_disappear:
            disappeared_ids.append(track_id)
    for track_id in disappeared_ids:
        predominant_class = Counter(track_class_history[track_id]).most_common(1)[0][0]
        initial_class = current_class[track_id]
        if initial_class != predominant_class:
            object_count[initial_class] -= 1
            object_count[predominant_class] += 1
            logging.info(f"Track ID {track_id} disappeared. Changed class from {initial_class} to {predominant_class}. Count updated.")
        if not args.no_sync:
            save_to_database(track_id, predominant_class)
        del last_seen_frame[track_id]
        del track_class_history[track_id]
        del track_history[track_id]
        del current_class[track_id]
        del first_detection_time[track_id]
def save_to_database(track_id, predominant_class):
    """The code has no influence"""
def process_frames_sequentially(frame_queue, display_queue, model):
    """Function to process frames sequentially and delegate inference"""
    logging.info("Starting sequential frame processing...")
    global current_frame
    colors = Colors() if args.show else None
    while not stop_event.is_set() or not frame_queue.empty():
        try:
            frame = frame_queue.get()
            logging.debug("Frame retrieved from the queue for processing.")
        except mp.queues.Empty:
            continue  # If the queue is empty, wait
        with current_frame.get_lock():
            current_frame.value += 1
        logging.debug(f"Processing frame {current_frame.value}...")
        # Perform inference on the frame
        results = inference_worker(frame, model)
        annotator = Annotator(frame, line_width=3, font_size=16)
        detected_ids = set()
        for result in results:
            if result.boxes is not None and result.boxes.id is not None:
                boxes = result.boxes.xyxy.cpu()
                track_ids = result.boxes.id.int().cpu().tolist()
                classes = result.boxes.cls.int().cpu().tolist()
                scores = result.boxes.conf.cpu().tolist()
                for box, track_id, cls, score in zip(boxes, track_ids, classes, scores):
                    x1, y1, x2, y2 = map(int, box)
                    detected_ids.add(track_id)
                    last_seen_frame[track_id] = (current_frame.value, datetime.now())
                    if track_id not in current_class:
                        initial_class = cls
                        current_class[track_id] = initial_class
                        track_class_history[track_id].append(initial_class)
                        object_count[initial_class] += 1
                        first_detection_time[track_id] = datetime.now()
                        logging.info(f"New track assigned: ID {track_id}, class {initial_class}. Total objects: {object_count[initial_class]}")
                    track_history[track_id].append([x1, y1, x2, y2])
                    if len(track_history[track_id]) > 30:
                        track_history[track_id].pop(0)
                    track_class_history[track_id].append(cls)
                    color = colors(cls)
                    if len(track_history[track_id]) > 1:
                        points = np.array([[(int((b[0] + b[2]) / 2), int((b[1] + b[3]) / 2))] for b in track_history[track_id]]).astype(np.int32)
                        cv2.polylines(annotator.im, [points], isClosed=False, color=color, thickness=2)
                    class_name = model.names[cls] if model.names else f'cls {cls}'
                    label = f'id:{track_id} {class_name} {score:.2f}'
                    annotator.box_label([x1, y1, x2, y2], label, color=color)
        # Process disappeared tracks
        process_disappeared_tracks()
        # Display object count
        print_object_count(annotator)
        # Send the annotated frame to the display/save queue
        try:
            display_queue.put(annotator.result())
            logging.debug("Processed frame added to display queue.")
        except mp.queues.Full:
            logging.warning("Display queue full, frame could not be added.")
    logging.info("Finished frame processing.")
def print_object_count(annotator=None):
    """Print the current object count."""
    logging.info("\nCurrent object count:")
    y_offset = 30
    for cls, count in object_count.items():
        class_name = model.names[cls] if model.names else f'cls {cls}'
        logging.info(f'{class_name}: {count}')
        if annotator:
            annotator.text((10, y_offset), f'{class_name}: {count}', txt_color=(255, 255, 255))
            y_offset += 30
def display_and_save(display_queue):
    """Function that handles displaying and saving processed frames"""
    logging.info("Starting frame display and save process...")
    global output_video_writer
    while not stop_event.is_set() or not display_queue.empty():
        try:
            frame = display_queue.get(timeout=0.1)  # Get frame from display queue
            logging.debug("Frame retrieved from the queue for display/save.")
        except mp.queues.Empty:
            logging.debug("Display queue empty, waiting for more frames.")
            continue
        if args.show:
            cv2.imshow('Tracking', frame)
            logging.debug("Displaying frame in window.")
            # Capture if 'q' is pressed
            if cv2.waitKey(1) & 0xFF == ord('q'):
                logging.info("Key 'q' pressed. Stopping display process.")
                stop_event.set()  # Stop everything if 'q' is pressed
                break
        if output_video_writer is not None:
            output_video_writer.write(frame)
            logging.debug("Frame saved to video file.")
    if output_video_writer is not None and output_video_writer.isOpened():
        output_video_writer.release()
        logging.info("Video file saved and closed successfully.")
    cv2.destroyAllWindows()
    logging.info("Display and save process finished.")
# Command-line arguments and model, database configuration, etc.
parser = argparse.ArgumentParser(description="Script for video processing and object detection with YOLO.")
parser.add_argument('video_source', type=str, help='Path to video file or RTSP URL')
parser.add_argument('--output', type=str, default=None, help='Path to output video file (optional)')
parser.add_argument('--model', type=str, default=None, help='Path to YOLO model file')
parser.add_argument("--mongo-uri", type=str, default=None, help="MongoDB URI")
parser.add_argument("--mongo-db-name", type=str, default="vista-count-yolov", help="MongoDB database name")
parser.add_argument("--clickhouse-host", type=str, default=None, help="ClickHouse connection HOST")
parser.add_argument("--clickhouse-port", type=str, default=None, help="ClickHouse connection PORT")
parser.add_argument("--cam-id", type=str, default=None, help="Camera ID (optional)")
parser.add_argument("--no-sync", action="store_true", help="Disable database synchronization")
parser.add_argument('--show', action='store_true', default=False, help='Display results in a window')
args = parser.parse_args()
video_source = args.video_source
output_video_path = args.output
model_path = args.model
model = YOLO(model_path)
if args.mongo_uri and not args.no_sync:
    mongo_client = MongoClient(args.mongo_uri)
if args.clickhouse_host and args.clickhouse_port and not args.no_sync:
    clickhouse_client = clickhouse_connect.get_client(host=args.clickhouse_host, port=args.clickhouse_port)
# If an output path is provided, initialize VideoWriter
if output_video_path:
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    output_video_writer = cv2.VideoWriter(output_video_path, fourcc, DEFAULT_FPS, (video_resolution["width"], video_resolution["height"]))
# Multiprocessing usage
if __name__ == '__main__':
    DEFAULT_SIZE_QUEUE = 750
    # Define multiprocessing queues
    frame_queue = mp.Queue(maxsize=DEFAULT_SIZE_QUEUE)
    display_queue = mp.Queue(maxsize=DEFAULT_SIZE_QUEUE)
    # Create processes
    capture_process = mp.Process(target=capture_frames, args=(video_source, frame_queue, DEFAULT_SIZE_QUEUE), name="Capture")
    processing_process = mp.Process(target=process_frames_sequentially, args=(frame_queue, display_queue, model), name="Processing")
    display_process = mp.Process(target=display_and_save, args=(display_queue,), name="Display")
    processes = [capture_process, processing_process, display_process]
    # Assign signal handler to stop processes
    signal.signal(signal.SIGINT, lambda sig, frame: signal_handler(sig, frame, processes))
    signal.signal(signal.SIGTERM, lambda sig, frame: signal_handler(sig, frame, processes))
    # Start processes
    for process in processes:
        process.start()
    # Wait for processes to finish
    for process in processes:
        process.join()
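One way to keep memory bounded without the queue growing forever is to cap its size and decide explicitly what happens when it is full, for example dropping the oldest frame instead of the newest; a sketch of such a helper (not a drop-in patch for the script above):
import queue  # provides the queue.Full / queue.Empty exceptions raised by mp.Queue

def put_latest(frame_queue, frame):
    """Put a frame on a bounded queue, discarding the oldest frame if the queue is full."""
    try:
        frame_queue.put_nowait(frame)
    except queue.Full:
        try:
            frame_queue.get_nowait()  # drop the oldest frame to make room
        except queue.Empty:
            pass
        frame_queue.put_nowait(frame)

If genuinely no frame may be lost, the remaining options are making inference faster (GPU, a smaller model, smaller imgsz, skipping the annotation step) or writing the raw frames to disk and processing them offline rather than in real time.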
-
Good evening. I'm trying to configure the BoT-SORT file for the case where a drop of water on the camera makes it lose the vehicle's track and detect it again as a new one. I've been playing with the parameters for a while and I can't get it to re-associate the same track ID because of that drop of water. Thank you very much for the help.
# Ultralytics YOLO 🚀, AGPL-3.0 license
# Default YOLO tracker settings for BoT-SORT tracker https://github.com/NirAharon/BoT-SORT
tracker_type: botsort # tracker type, ['botsort', 'bytetrack']
track_high_thresh: 0.5 # threshold for the first association
track_low_thresh: 0.1 # threshold for the second association
new_track_thresh: 0.6 # threshold for init new track if the detection does not match any tracks
track_buffer: 100 # buffer to calculate the time when to remove tracks
match_thresh: 0.85 # threshold for matching tracks
fuse_score: True # Whether to fuse confidence scores with the iou distances before matching
# min_box_area: 10 # threshold for min box areas(for tracker evaluation, not used for now)
# BoT-SORT settings
gmc_method: sparseOptFlow # method of global motion compensation
# ReID model related thresh (not supported yet)
proximity_thresh: 0.5
appearance_thresh: 0.25
with_reid: True
-
Good morning. And here is the line where you have the FPS hardcoded: https://github.com/ultralytics/ultralytics/blob/main/ultralytics/trackers/track.py#L45
-
I've been running this code for a couple of weeks and everything was fine until the last update of the Ultralytics library. I have an object counter declared in the following way:
counter = solutions.ObjectCounter(
    view_img=False,
    reg_pts=region_points,
    names=model.names,
    draw_tracks=True,
    line_thickness=2,
)
Then, I define the type of tracker to be used with a custom model:
tracks = model.track(
    frame,
    persist=True,
    conf=0.6,  # Confidence threshold
    tracker="bytetrack.yaml",
    show=False,
    classes=classes_to_count,
    verbose=False,
)
And finally, I am accessing every frame using the 'start_counting' function, specifying the frame to be analyzed and the track type:
frame = counter.start_counting(frame, tracks)
When I try to execute the code now, I get the error:
Traceback (most recent call last):
  File "/.../counting_video7_transformer.py", line 171, in <module>
    frame = counter.start_counting(frame, tracks)
AttributeError: 'ObjectCounter' object has no attribute 'start_counting'
I've attempted to update the method from 'start_counting' to 'count' (according to your limited documentation above). However, I get this error:
Traceback (most recent call last):
  File "/.../counting_video7_transformer.py", line 171, in <module>
    frame = counter.count(frame, tracks)
TypeError: ObjectCounter.count() takes 2 positional arguments but 3 were given
Any hints on how this should be handled moving forward? Thanks
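In recent Ultralytics releases the solutions API was reworked so that ObjectCounter runs tracking internally and count() takes only the frame; a hedged sketch of the newer usage, with the caveat that the constructor argument names have changed between versions, so check them against your installed release:
import cv2
from ultralytics import solutions

counter = solutions.ObjectCounter(
    model="yolov8n.pt",        # or your custom weights
    region=region_points,      # counting region, as in your snippet
    classes=classes_to_count,  # classes to count, as in your snippet
    show=False,
)

cap = cv2.VideoCapture("path/to/video.mp4")
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    frame = counter.count(frame)  # tracking now happens inside the counter
cap.release()

In the very latest versions the counter may instead be called directly, e.g. results = counter(frame), with the annotated image on the returned results object, so the exact call can differ again between releases.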
-
Hello, thank you very much for your work on computer vision. I ran into a problem during development. For the same test video and the same model, there is a large performance gap in target recognition between predict mode and track mode: fewer target boxes are detected in track mode than in predict mode. I used the default ByteTrack tracker and also the BoT-SORT tracker, and I have adjusted the trackers' hyperparameters many times, but the problem has not been solved. May I ask why?
-
I'm trying to use the above example to track hockey players. I have tried many different configurations (ByteTrack and BoT-SORT) without great success, as well as trying Norfair. If you watch the video you can see ID 4 changes many times after occlusion with another player. I have uploaded my examples to https://github.com/clearpaint/yolo/
-
Hi, we found that the object IDs fluctuate between 19000 and 24000. Is there any solution or a controllable variable?
-
I have a problem with the tracker. I trained a YOLO11 segmentation model and tried the tracker with it, but the tracker gives different IDs to the same object. How can I fix that? Are there any tips to solve the problem? Note: I read all the documents on the site but did not find a solution to my problem. Thank you for your help!
-
Hi, I'm using YOLO11 with weights I trained myself. For some reason, the predict and track results are completely the same:
model = YOLO(weights)
After printing / plotting the annotations, they look exactly the same. Thank you in advance!
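For what it's worth, the boxes themselves are expected to look very similar between the two modes; the main visible difference track mode adds is the per-box ID, which you can check like this (the weights path is a placeholder):
from ultralytics import YOLO

model = YOLO("best.pt")  # your trained weights

pred = model.predict("path/to/video.mp4")
trk = model.track("path/to/video.mp4", persist=True)

print(pred[0].boxes.id)  # None: predict mode assigns no track IDs
print(trk[0].boxes.id)   # tensor of track IDs, one per box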
-
I have a model trained on about 1200 images (including augmented images). My project is to detect whether there are 0, 1 or 2 monkeys within each frame of the video. However, if for example there is only one monkey in the frame, the label will switch randomly between the two label names I have (for now let's call them "monkey1" and "monkey2"). This is a problem for me because I save all the cropped images for each label and play them back as a video for a close-up view, so my cropped images will not be consistent. I have used model.track(persist=True, agnostic_nms=True, iou=0.50) to try to reduce the likelihood of this happening, but it still happens more often than I would like. What options do I have to fix this or reduce the likelihood? I am also using YOLO11, FYI. Thank you so much for anything you come up with; I've been stuck for weeks.
-
I want to know whether the YOLO model can process images of any resolution as-is, or whether it only processes a fixed resolution (640x640). Do you resize the image internally before inference?
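For reference, inputs are resized (letterboxed) internally to the imgsz setting, 640 by default, before inference, and you can change it per call; a small sketch:
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
results = model.predict("image.jpg")                 # resized internally to imgsz=640 by default
results_hr = model.predict("image.jpg", imgsz=1280)  # run at a larger input resolution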
-
Can we use a tracker other than BoT-SORT and ByteTrack, for example the DeepSORT tracker?
-
modes/track/
Learn how to use Ultralytics YOLO for object tracking in video streams. Guides to use different trackers and customise tracker configurations.
https://docs.ultralytics.com/modes/track/