PEACE ✌️: Prompt Engineering Automation for CLIPSeg Enhancement

CLIP Interrogator for Aerial Robotics

Built on top of CLIP Interrogator by @pharmapsychotic

Generate prompts suited to aerial robotics (real and simulation) for use with CLIPSeg.

Figure 1: System Architecture

About

Figure 2: Segmentation Difference

Comparison of the original CLIP and CLIPSeg prompt engineering with PEACE, using images from CARLA.

a) A photo of grass
b) A photo of grass in animation play morning autumn
c) A photo of grass
d) A photo of grass in cartoon

The CLIP Interrogator is a prompt engineering tool that combines OpenAI's CLIP and Salesforce's BLIP to optimize text prompts to match a given image. Use the resulting prompts with text-prompted segmentation models like CLIPSeg. This work is an extension of DOVESEI (https://arxiv.org/abs/2308.11471) that improves its prompt generation and engineering. The objective is to generate dynamic prompts that adapt to the observed images instead of relying on a single static prompt. The prompts are also engineered automatically so that they describe the observed images better.
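To make the difference between a static and a dynamic prompt concrete, here is a minimal sketch in plain Python. It is not the PEACE implementation: the function names, the `top_k` parameter, and the score values are hypothetical, and the scores only stand in for whatever image-text similarity the interrogator would compute for each candidate modifier. The prompt strings mirror the ones shown in Figure 2.

```python
from typing import Mapping


def static_prompt(target: str) -> str:
    """Fixed template that ignores the observed image."""
    return f"A photo of {target}"


def dynamic_prompt(target: str,
                   modifier_scores: Mapping[str, float],
                   top_k: int = 3) -> str:
    """Append the top_k highest-scoring modifiers to the static template.

    modifier_scores is a hypothetical stand-in for per-image similarity
    scores (higher means the modifier describes the image better).
    """
    ranked = sorted(modifier_scores.items(),
                    key=lambda item: item[1],
                    reverse=True)
    chosen = [name for name, _ in ranked[:top_k]]
    if not chosen:
        # No usable modifiers: fall back to the static prompt.
        return static_prompt(target)
    return f"{static_prompt(target)} in {' '.join(chosen)}"


# Made-up scores for one simulated frame (assumption, for illustration only).
scores = {"animation play": 0.31, "morning": 0.27, "autumn": 0.24, "night": 0.05}

print(static_prompt("grass"))
print(dynamic_prompt("grass", scores))
```

With these made-up scores, the second call yields "A photo of grass in animation play morning autumn", while a frame with different scores would yield a different prompt for the same target class.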

Details about DOVESEI: https://github.com/MISTLab/DOVESEI/blob/main/README.md

Publication

For more information about PEACE, refer to our paper: https://arxiv.org/abs/2310.00085