Commit

Merge pull request #93 from microsoft/pre-release
Release 0.2.1
vyokky authored Jun 24, 2024
2 parents 522a7da + b49705d commit f537ba6
Showing 80 changed files with 7,618 additions and 4,151 deletions.
3 changes: 1 addition & 2 deletions .gitignore
@@ -7,8 +7,7 @@
/deprecated/*
/test/*.ipynb
/logs/*
/ufo/modules/*
/ufo/agents/*
/customization/*
__pycache__/
**/__pycache__/
*.pyc
7 changes: 6 additions & 1 deletion README.md
@@ -33,6 +33,11 @@ Both agents leverage the multi-modal capabilities of GPT-Vision to comprehend th


## 📢 News
- 📅 2024-06-25: **New Release for v0.2.1!** We are excited to announce the release of version 0.2.1! This update includes several new features and improvements:
1. **HostAgent Refactor:** We've refactored the HostAgent to enhance its efficiency in managing AppAgents within UFO.
2. **Evaluation Agent:** Introducing an evaluation agent that assesses task completion and provides real-time feedback.
3. **Google Gemini Support:** UFO now supports Google Gemini as the inference engine. Refer to our detailed guide in [README.md](/model_worker/readme.md).
4. **Customized User Agents:** Users can now create customized agents by simply answering a few questions.
- 📅 2024-05-21: We have reached 5K stars!✨
- 📅 2024-05-08: **New Release for v0.1.1!** We've made some significant updates! Previously known as AppAgent and ActAgent, we've rebranded them to HostAgent and AppAgent to better align with their functionalities. Explore the latest enhancements:
1. **Learning from Human Demonstration:** UFO now supports learning from human demonstration! Utilize the [Windows Step Recorder](https://support.microsoft.com/en-us/windows/record-steps-to-reproduce-a-problem-46582a9b-620f-2e36-00c9-04e25d784e47) to record your steps and demonstrate them for UFO. Refer to our detailed guide in [README.md](/record_processor/README.md) for more information.
@@ -54,7 +59,7 @@ UFO sightings have garnered attention from various media outlets, including:
- [Microsoft's UFO abducts traditional user interfaces for a smarter Windows experience](https://the-decoder.com/microsofts-ufo-abducts-traditional-user-interfaces-for-a-smarter-windows-experience/)
- [🚀 UFO & GPT-4-V: Sit back and relax, mientras GPT lo hace todo🌌](https://www.linkedin.com/posts/gutierrezfrancois_ai-ufo-microsoft-activity-7176819900399652865-pLoo?utm_source=share&utm_medium=member_desktop)
- [The AI PC - The Future of Computers? - Microsoft UFO](https://www.youtube.com/watch?v=1k4LcffCq3E)
- [下一代Windows系统曝光:基于GPT-4V,Agent跨应用调度,代号UFO](https://www.qbitai.com/2024/02/121048.html)
- [下一代Windows系统曝光:基于GPT-4V,Agent跨应用调度,代号UFO](https://baijiahao.baidu.com/s?id=1790938358152188625&wfr=spider&for=pc)
- [下一代智能版 Windows 要来了?微软推出首个 Windows Agent,命名为 UFO!](https://blog.csdn.net/csdnnews/article/details/136161570)
- [Microsoft発のオープンソース版「UFO」登場! Windowsを自動操縦するAIエージェントを試す](https://internet.watch.impress.co.jp/docs/column/shimizu/1570581.html)
- ...
24 changes: 22 additions & 2 deletions model_worker/README.md
@@ -1,9 +1,29 @@
### NOTE
The lite version of the prompt is not fully optimized. To achieve better results, it is recommended that users adjust the prompt according to performance!!!

### If you use Gemini as the Agent

1. Create an account on [Google AI](https://ai.google.dev/) and get your API key.
2. Install the required package `google-generativeai`, or uncomment the Gemini entry in `requirements.txt` and install from it.
```bash
pip install -U google-generativeai==0.7.0
```
3. Add following configuration to `config.yaml`:
```json showLineNumbers
{
"API_TYPE": "Gemini" ,
"API_KEY": "YOUR_KEY",
"API_MODEL": "YOUR_MODEL"
}
```
NOTE: `API_MODEL` is the model name of the Gemini LLM API.
You can find the model name in the [Gemini LLM model list](https://ai.google.dev/gemini-api).
If you encounter the error `429 Resource has been exhausted (e.g. check quota).`, it may be because you have hit the rate limit of your Gemini API.
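
One common way to cope with a `429` rate-limit response is to retry the request with exponential backoff. The following is a minimal sketch of that idea, not UFO's actual handling: `RateLimitError` and `call_with_backoff` are hypothetical names introduced here, and `request` stands for whatever function sends the Gemini call. With the `google-generativeai` package the error is assumed to surface as `google.api_core.exceptions.ResourceExhausted`, which you would catch in place of the stand-in exception.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the exception raised on a 429 response."""


def call_with_backoff(request, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call `request()`; on RateLimitError, wait and retry with exponential backoff.

    `request` is any zero-argument callable (assumed to wrap the Gemini call).
    `sleep` is injectable so the waiting can be replaced in tests.
    """
    for attempt in range(max_retries):
        try:
            return request()
        except RateLimitError:
            # Give up after the last allowed attempt and surface the error.
            if attempt == max_retries - 1:
                raise
            # The delay doubles on each attempt; a little jitter spreads out
            # retries from concurrent clients.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            sleep(delay)
```

Backoff only helps with short bursts over the limit. If the quota itself is exhausted, the retries will fail as well and the final error is re-raised.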

### If you use QWEN as the Agent

1. QWen (Tongyi Qianwen) is a LLM developed by Alibaba. Go to [QWen](https://dashscope.aliyun.com/) and register an account and get the API key. More details can be found [here](https://help.aliyun.com/zh/dashscope/developer-reference/activate-dashscope-and-create-an-api-key?spm=a2c4g.11186623.0.0.7b5749d72j3SYU) (in Chinese).
2. Install the required packages dashscope or run the `setup.py` with `-qwen` options.
2. Install the required package `dashscope`, or uncomment the Qwen entry in `requirements.txt` and install from it.
```bash
pip install dashscope
```
@@ -23,7 +43,7 @@ You can find the model name in the [QWen LLM model list](https://help.aliyun.com
We provide a short example to show how to configure the ollama in the following, which might change if ollama makes updates.

```bash title="install ollama and serve LLMs in local" showLineNumbers
## Install ollama on Linux & WSL2 or run the `setup.py` with `-ollama` options
## Install ollama on Linux & WSL2
curl https://ollama.ai/install.sh | sh
## Run the serving
ollama serve
4 changes: 3 additions & 1 deletion requirements.txt
@@ -17,4 +17,6 @@ sentence-transformers==2.5.1
##For Qwen
#dashscope==1.15.0
##For removing stopwords
#nltk==3.8.1
#nltk==3.8.1
##For Gemini
#google-generativeai==0.7.0
2 changes: 1 addition & 1 deletion ufo/__main__.py
@@ -1,6 +1,6 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
from . import ufo
from ufo import ufo

if __name__ == "__main__":
# Execute the main script