Welcome to NavAIGuide (/næv eɪ aɪ ɡaɪd/), an extensible multi-modal agentic framework designed to fulfill plans and user queries by tapping into the mobile and desktop ecosystem of apps available. Here's how NavAIGuide stands out:
-
Visual Task Detection: Compatible with GPT-4V and many more vision models, NavAIGuide excels at identifying the next steps directly from page screenshots.
-
Advanced Code Selectors: Recognizing that visual elements and their positions sometimes fall short, NavAIGuide employs grounding techniques for both XML and HTML, allowing for precise matching with the most relevant selectors, tailored to the specific action required.
-
Action-Oriented Execution: At its core, NavAIGuide features an action-based framework using a JSON schema and reproducible outputs.
-
Resilient Error Handling: Understanding that errors are part of AI Agents, NavAIGuide features a built-in retry mechanism with exponential backoff, adeptly navigating through transient failures to ensure the Agent can move forward.
NavAIGuide Agents extend the core toolkit with advanced automation solutions:
- Preview of Appium iOS Agents: Explore how your AI Agents can gain access to the ecosystem of apps and functionalities on your iOS device.
- Preview of Appium Android Agents (Coming soon): Explore how your AI Agents can gain access to the ecosystem of apps and functionalities on your Android device.
- Playwright-based Web Agents (Coming soon): Learn how to build Web AI Agent Companions.
You can choose to either clone the repository or use npm, yarn, or pnpm to install NavAIGuide.
- For Core, see installation steps.
- For iOS, see installation steps.
Project NavAIGuide continues to face challenges in long-horizon planning and code inference accuracy. The current focus is on enhancing the stability of NavAIGuide agents.
We welcome contributions. Please follow the standard fork-and-pull request workflow for your contributions.
NavAIGuide is licensed under the MIT License.
For support, questions, or feature requests, open an issue in the GitHub repository.