Low FPS Issue with Camera Detection #35

Open
congngc opened this issue Jun 25, 2024 · 16 comments

Comments

@congngc

congngc commented Jun 25, 2024

Hello,

I have implemented the example from this GitHub repository: https://github.com/ultralytics/yolo-flutter-app/tree/main/example. However, I am experiencing low frame rates with the camera detection feature, which ranges only from 12 to 20 FPS. Could you please advise on how I might improve the FPS to achieve better performance?

Thank you for your assistance.

Screen_Recording_20240625_092403.1.mp4
@pderrenger
Member

Hello,

Thank you for reaching out and providing details about the low FPS issue you're experiencing with the camera detection feature. To help improve the frame rate, here are a few suggestions:

  1. Model Quantization: Ensure that you are using a quantized model (FP16 or INT8) as these are optimized for performance on mobile devices. Quantization reduces the model size and computation requirements, leading to faster inference times. You can read more about this in our documentation.

  2. Delegate Selection: The performance can vary significantly depending on the delegate used for model inference. For instance, using the GPU delegate can provide a substantial performance boost on devices with powerful GPUs. Similarly, if your device supports it, leveraging the Hexagon DSP or NNAPI can also improve performance. You can find more details on delegates and their performance variability in our documentation.

  3. Device Specifications: The hardware capabilities of your device play a crucial role in performance. Ensure that your device has a capable processor and sufficient memory. Devices with Qualcomm Snapdragon processors, for example, can leverage the Hexagon DSP for better performance.

  4. Code Optimization: Make sure that your implementation is optimized. For example, reducing the input resolution can help increase the FPS, though it may affect detection accuracy. Here’s a Python/OpenCV snippet that illustrates resizing a frame:

    import cv2
    import numpy as np

    # Reduce input resolution; cv2.resize takes the target size as (width, height)
    def resize_frame(frame, width, height):
        return cv2.resize(frame, (width, height))

    # Example usage with a dummy 640x480 frame standing in for a camera frame
    frame = np.zeros((480, 640, 3), dtype=np.uint8)
    frame = resize_frame(frame, 320, 240)  # Adjust width and height as needed

  5. Latest Versions: Verify that you are using the latest versions of the Ultralytics packages and dependencies. Updates often include performance improvements and bug fixes.

If the issue persists, could you please provide a minimum reproducible example? This will help us better understand the problem and provide more specific guidance. You can find more information on creating a reproducible example here.

Thank you for your cooperation, and I look forward to assisting you further!
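To compare the effect of each suggestion above, it helps to measure FPS the same way every time. The following is a minimal sketch of a rolling FPS counter; the `FpsMeter` class and its `window` parameter are illustrative names, not part of the Ultralytics package.

```python
import time
from collections import deque


class FpsMeter:
    """Rolling FPS estimate over the last `window` frame timestamps."""

    def __init__(self, window=30):
        self.stamps = deque(maxlen=window)

    def tick(self, now=None):
        # Record one frame; `now` can be injected to make tests deterministic.
        self.stamps.append(time.perf_counter() if now is None else now)

    def fps(self):
        # At least two timestamps are needed to measure an interval.
        if len(self.stamps) < 2:
            return 0.0
        elapsed = self.stamps[-1] - self.stamps[0]
        return (len(self.stamps) - 1) / elapsed if elapsed > 0 else 0.0
```

Call `tick()` once per processed frame and read `fps()` periodically. Averaging over a window smooths out single slow frames, which makes before/after comparisons less noisy.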

@congngc
Author

congngc commented Jun 25, 2024

Hi @pderrenger ,

Thanks for your insights. I'd like to provide some details about my setup to further the discussion:

  1. Model Quantization: I am currently using the quantized INT8 model of YOLOv8n.

  2. Device Specifications: My device is a Samsung Galaxy S21. In your experience, do you think this device could handle the computational demands effectively?

  3. Code Optimization: I am looking for ways to optimize the UltralyticsYoloCameraController function. Any tips on specific aspects of the code that might benefit from fine-tuning?

  4. Latest Versions: I am using the latest version of Ultralytics software. Could there be any upcoming updates that might help in improving efficiency or accuracy?

Your expertise and suggestions would be greatly appreciated!

@pderrenger
Member

Hi @congngc,

Thank you for providing additional details about your setup. Let's dive into each point to help you optimize your camera detection performance:

  1. Model Quantization: Great to hear that you're using the INT8 quantized model. This should indeed help with performance.

  2. Device Specifications: The Samsung Galaxy S21 is a powerful device (it ships with either a Snapdragon 888 or an Exynos 2100, depending on region) and should handle the computational demands effectively. Leveraging the GPU or NNAPI delegates can further enhance performance. You can switch delegates in your code to see which one offers the best performance on your device.

  3. Code Optimization: Optimizing the UltralyticsYoloCameraController function can significantly impact performance. Here are a few tips:

    • Reduce Input Resolution: Lowering the resolution of the input frames can reduce the computational load. For example, resizing the frames to 320x240 or 640x480 can help.
    • Batch Processing: If feasible, process frames in batches rather than individually to take advantage of parallel processing capabilities.
    • Delegate Selection: Experiment with different delegates (CPU, GPU, NNAPI) to find the optimal one for your device. Here’s a Python snippet showing how a delegate is attached to a TensorFlow Lite interpreter:
    import tensorflow as tf

    # Example of attaching a delegate. Note: 'libedgetpu.so.1' is the Coral Edge TPU
    # delegate, not a GPU delegate. On Android, the GPU or NNAPI delegate is
    # configured through the native TensorFlow Lite API instead.
    delegate = tf.lite.experimental.load_delegate('libedgetpu.so.1')
    interpreter = tf.lite.Interpreter(model_path="model.tflite", experimental_delegates=[delegate])
    interpreter.allocate_tensors()

  4. Latest Versions: It's excellent that you're using the latest version of the Ultralytics software. The development team continuously works on updates that enhance performance and accuracy. Keep an eye on the Ultralytics GitHub repository for any new releases.

Additionally, if you encounter any specific issues or bugs, providing a minimum reproducible example can be incredibly helpful for us to diagnose and address the problem efficiently. You can find more information on creating a reproducible example here.

Feel free to reach out if you have any further questions or need more assistance. We're here to help! 😊
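When trying different delegates as suggested above, a small timing harness makes the comparison less subjective. This is a sketch only: `median_latency_ms`, `pick_fastest`, and the `delegate_a`/`delegate_b` workloads are hypothetical stand-ins, and in practice each callable would run one inference with the corresponding delegate.

```python
import statistics
import time


def median_latency_ms(run_once, warmup=3, iters=20):
    """Median wall-clock time of `run_once()` in milliseconds."""
    for _ in range(warmup):  # let caches and lazy initialisation settle
        run_once()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def pick_fastest(candidates, **kwargs):
    """Return (name, latency_ms) of the fastest entry in a name -> callable dict."""
    timings = {name: median_latency_ms(fn, **kwargs) for name, fn in candidates.items()}
    best = min(timings, key=timings.get)
    return best, timings[best]


# Stand-in workloads (assumptions); replace with one inference call per delegate.
candidates = {
    "delegate_a": lambda: sum(range(200_000)),
    "delegate_b": lambda: sum(range(1_000)),
}
```

Using the median rather than the mean keeps one-off stalls (garbage collection, thermal spikes) from skewing the result.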

@congngc
Author

congngc commented Jun 26, 2024

Hi @pderrenger,
I have already implemented all the suggestions provided, including using a quantized model, selecting the appropriate delegate, optimizing my device settings, and updating the software. Despite these adjustments, I'm still experiencing low FPS. Could there be other factors affecting the performance? Any further assistance would be greatly appreciated.

@pderrenger
Member

Hi @congngc,

Thank you for your detailed follow-up and for implementing the suggestions provided. It's great to see your proactive approach! Given that you've already optimized the model, delegate, device settings, and software, let's explore a few additional factors that might be affecting the performance:

  1. Background Processes: Ensure that there are no other intensive applications or background processes running on your device, as these can consume resources and impact performance.

  2. Thermal Throttling: Extended use of the camera and intensive processing can cause the device to heat up, leading to thermal throttling. This can reduce the performance of the CPU and GPU. Try to keep the device cool and monitor its temperature during use.

  3. Camera Frame Rate: Check the camera settings to ensure that it is set to the highest possible frame rate. Sometimes, the camera itself might be limiting the FPS.

  4. Model Complexity: Quantization reduces cost, but the model architecture still matters. YOLOv8n is already the smallest YOLOv8 variant, so the main remaining lever is reducing the input image size further.

  5. Code Profiling: Profile your code to identify any bottlenecks. Tools like Android Studio Profiler can help you pinpoint where the most time is being spent during inference and frame processing.

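  6. Thread Management: Ensure that the inference and camera processing are running on separate threads to avoid blocking the main UI thread. Here’s a basic Python example that uses a single worker thread and drops frames the worker cannot keep up with, rather than starting a new thread for every frame:

    import queue
    import threading

    frames = queue.Queue(maxsize=1)  # at most one pending frame

    def get_camera_frame():
        return None  # placeholder: fetch the next frame from your camera API

    def process_frame(frame):
        pass  # your frame processing code here

    def worker():
        while True:  # one long-lived worker instead of a thread per frame
            process_frame(frames.get())

    def camera_loop():
        threading.Thread(target=worker, daemon=True).start()
        while True:
            try:
                frames.put_nowait(get_camera_frame())
            except queue.Full:
                pass  # drop the frame if one is already waiting

    camera_loop()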

If the issue persists, providing a minimum reproducible example would be incredibly helpful for us to diagnose the problem more effectively. You can find guidance on creating one here.

Thank you for your patience and cooperation. We're committed to helping you achieve the best performance possible. If you have any further questions or need more assistance, feel free to reach out! 😊
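As a lightweight complement to the profiling advice above, per-stage timers can show which part of the pipeline dominates each frame. The sketch below is illustrative: the `stage` context manager, the `totals` dictionary, and the stage names are assumptions, and the `sum(range(...))` calls are stand-ins for real work.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

totals = defaultdict(float)  # stage name -> accumulated seconds


@contextmanager
def stage(name):
    # Accumulate the wall-clock time spent inside the `with` block.
    start = time.perf_counter()
    try:
        yield
    finally:
        totals[name] += time.perf_counter() - start


def slowest_stage():
    """Name of the stage with the largest accumulated time."""
    return max(totals, key=totals.get)


# Stand-in workloads for five frames; replace with the real pipeline steps.
for _ in range(5):
    with stage("preprocess"):
        sum(range(1_000))
    with stage("inference"):
        sum(range(200_000))
    with stage("draw_boxes"):
        sum(range(1_000))
```

If preprocessing or drawing turns out to cost as much as inference, changing the model or delegate alone is unlikely to raise FPS much.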

@fransay

fransay commented Jun 26, 2024

@congngc can you share the specs of the device you are testing on? I think one of the many ways we can increase FPS is to spawn new isolates/threads to handle camera inference, but this is a tricky process, since you want the boxes to appear on screen in real time, in sync with the predictions the inference engine makes. I tested on a fairly low-end device myself, an Honor X6a with 4GB RAM and an octa-core CPU (MediaTek Helio), and FPS is in the range of 1-3. I think it is an interesting challenge to look into; a slight improvement in algorithmic processes can win us some hardware magic. I will keep this issue updated on my work on FPS optimisation.

@mqasim41

@congngc can you share the specs of the device you are testing on? I think one of the many ways we can increase FPS is to spawn new isolates/threads to handle camera inference, but this is a tricky process, since you want the boxes to appear on screen in real time, in sync with the predictions the inference engine makes. I tested on a fairly low-end device myself, an Honor X6a with 4GB RAM and an octa-core CPU (MediaTek Helio), and FPS is in the range of 1-3. I think it is an interesting challenge to look into; a slight improvement in algorithmic processes can win us some hardware magic. I will keep this issue updated on my work on FPS optimisation.

Did you have any success in increasing FPS?

@sidewinderz0ne

I tried it on a Samsung S24 Ultra with a quantized YOLOv5su model at 320x320 imgsz and got about 20 FPS.

@pderrenger
Member

Hello @sidewinderz0ne,

Thank you for sharing your experience with the Samsung S24 Ultra and the quantized YOLOv5su model. Achieving 20 FPS with a 320x320 image size is a good benchmark, especially considering the computational demands of real-time object detection.

Given the context, here are a few additional suggestions that might help further optimize performance:

  1. Delegate Optimization: As mentioned earlier, experimenting with different delegates (CPU, GPU, NNAPI) can yield varying results. Given your device's capabilities, the GPU delegate might offer the best performance boost. Ensure that you are leveraging the most suitable delegate for your hardware.

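  2. Thread Management: As @fransay suggested, spawning new isolates or threads to handle camera inference can be beneficial. This approach can help offload the main UI thread and improve responsiveness. Here’s a basic Python example that uses a single worker thread and drops frames the worker cannot keep up with, rather than starting a new thread for every frame:

    import queue
    import threading

    frames = queue.Queue(maxsize=1)  # at most one pending frame

    def get_camera_frame():
        return None  # placeholder: fetch the next frame from your camera API

    def process_frame(frame):
        pass  # your frame processing code here

    def worker():
        while True:  # one long-lived worker instead of a thread per frame
            process_frame(frames.get())

    def camera_loop():
        threading.Thread(target=worker, daemon=True).start()
        while True:
            try:
                frames.put_nowait(get_camera_frame())
            except queue.Full:
                pass  # drop the frame if one is already waiting

    camera_loop()

  3. Profiling and Optimization: Utilize profiling tools like Android Studio Profiler to identify bottlenecks in your code. This can help pinpoint specific areas where optimization can have the most impact.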

  4. Model Complexity: While you are using a quantized model, experimenting with even lighter models or further reducing the input image size might help. Balancing between model complexity and input resolution can lead to better FPS without significantly compromising accuracy.

  5. Background Processes and Thermal Management: Ensure that no other intensive applications are running in the background, and monitor the device's temperature to avoid thermal throttling.

If you have already tried these suggestions and the issue persists, it might be helpful to provide a minimum reproducible example. This can help us diagnose the problem more effectively. You can find guidance on creating one here.

Thank you for your patience and cooperation. We're committed to helping you achieve the best performance possible. If you have any further questions or need more assistance, feel free to reach out! 😊

@congngc
Author

congngc commented Jul 26, 2024

can you share the specs of the device you are testing on? I think one of the many ways we can increase FPS is to spawn new isolates/threads to handle camera inference, but this is a tricky process, since you want the boxes to appear on screen in real time, in sync with the predictions the inference engine makes. I tested on a fairly low-end device myself, an Honor X6a with 4GB RAM and an octa-core CPU (MediaTek Helio), and FPS is in the range of 1-3. I think it is an interesting challenge to look into; a slight improvement in algorithmic processes can win us some hardware magic. I will keep this issue updated on my work on FPS optimisation.

This is my device
https://www.gsmarena.com/samsung_galaxy_s21_5g-10626.php

@pderrenger
Member

Hello @congngc,

Thank you for sharing the specifications of your device. The Samsung Galaxy S21 is indeed a powerful device, and it should be capable of handling real-time object detection efficiently. Here are a few additional suggestions to help optimize the FPS:

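  1. Thread Management: As mentioned in the comment you quoted, spawning new isolates or threads to handle camera inference can be beneficial. This approach can help offload the main UI thread and improve responsiveness. Here’s a basic Python example that uses a single worker thread and drops frames the worker cannot keep up with, rather than starting a new thread for every frame:

    import queue
    import threading

    frames = queue.Queue(maxsize=1)  # at most one pending frame

    def get_camera_frame():
        return None  # placeholder: fetch the next frame from your camera API

    def process_frame(frame):
        pass  # your frame processing code here

    def worker():
        while True:  # one long-lived worker instead of a thread per frame
            process_frame(frames.get())

    def camera_loop():
        threading.Thread(target=worker, daemon=True).start()
        while True:
            try:
                frames.put_nowait(get_camera_frame())
            except queue.Full:
                pass  # drop the frame if one is already waiting

    camera_loop()

  2. Profiling and Optimization: Utilize profiling tools like Android Studio Profiler to identify bottlenecks in your code. This can help pinpoint specific areas where optimization can have the most impact.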

  3. Model Complexity: While you are using a quantized model, experimenting with even lighter models or further reducing the input image size might help. Balancing between model complexity and input resolution can lead to better FPS without significantly compromising accuracy.

  4. Background Processes and Thermal Management: Ensure that no other intensive applications are running in the background, and monitor the device's temperature to avoid thermal throttling.

  5. Latest Versions: Verify that you are using the latest versions of the Ultralytics packages and dependencies. Updates often include performance improvements and bug fixes. If you haven't already, please check for the latest updates.

If the issue persists, providing a minimum reproducible example would be incredibly helpful for us to diagnose the problem more effectively. You can find guidance on creating one here.

Thank you for your patience and cooperation. We're committed to helping you achieve the best performance possible. If you have any further questions or need more assistance, feel free to reach out! 😊

@muhammad-qasim-cowlar

Reducing the imgsz parameter of the quantized models seriously improves FPS at the cost of some accuracy.

@pderrenger
Member

Hi @muhammad-qasim-cowlar,

Thank you for your insightful comment! You're absolutely right—reducing the imgsz parameter can significantly improve FPS, albeit with a trade-off in accuracy. This is a practical approach, especially when real-time performance is a priority.

For those looking to implement this, here's a quick example of how you can adjust the imgsz parameter in your code:

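import torch

# Example of setting the image size for inference
imgsz = 320  # Reduce this value to improve FPS, e.g., 320 or 256 (multiples of 32)

# Load the pretrained YOLOv5s model from PyTorch Hub
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

# Perform inference; the hub model takes the inference size as the `size` argument
img = 'path/to/image.jpg'  # placeholder: replace with your own image
results = model(img, size=imgsz)

# Note: for an exported TFLite model, the input size is typically fixed at export time.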

Additionally, if you haven't already, please ensure you are using the latest versions of the Ultralytics packages. Updates often include performance improvements and bug fixes that could further enhance your FPS.

If you encounter any issues or have further questions, feel free to share more details. We're here to help! 😊
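To give a rough sense of why a smaller imgsz helps, the arithmetic below compares input pixel counts against a 640x640 baseline. It is a back-of-the-envelope sketch: `pixel_ratio` is an illustrative helper, the 640 baseline is an assumption, and treating inference cost as roughly proportional to pixel count ignores fixed overheads such as camera capture and post-processing.

```python
def pixel_ratio(imgsz, baseline=640):
    """Fraction of input pixels at square size `imgsz` relative to `baseline`."""
    return (imgsz * imgsz) / (baseline * baseline)


# Print the relative pixel count for a few candidate sizes.
for size in (640, 480, 320):
    print(size, pixel_ratio(size))
```

Halving the side length from 640 to 320 leaves a quarter of the pixels, so the convolutional work drops by about 4x, although the measured FPS gain is usually smaller than that.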

@yangga
Contributor

yangga commented Oct 28, 2024

I opened PR #71 with a possible solution. I hope it will be helpful. 🙏

@pderrenger
Member

Thank you for your contribution! We'll review your PR and provide feedback soon.

@devit7

devit7 commented Dec 19, 2024

I opened PR #71 with a possible solution. I hope it will be helpful. 🙏

@yangga this is impactful. Nice job, thank you 👍
