
🤖 Gripmind: LLM-based robotics interfaces

An open-source project combining the Llama 3.2 Vision model, robotic control, and brain-computer interfaces to enable intuitive (and accessible!) human-robot interaction. Built on top of open-source projects including LeRobot, Llama, and EMOTIV's Cortex API.

🦙 Features 🦙

👁️ Vision-Based Spatial Understanding

  • Currently powered by Llama 3.2 90B Vision through Groq
  • Real-time environment analysis and spatial reasoning
  • Action sequence generation based on visual input (see the sketch after this list)
  • Planned edge deployment using smaller models (1B and 3B parameters)
    • Local inference for improved latency
    • Reduced hardware requirements
    • Offline operation capability
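
The points above boil down to a single multimodal chat call. Below is a minimal sketch of querying Llama 3.2 90B Vision through Groq for an action plan; the prompt wording, the helper name plan_actions, and the model id are illustrative assumptions, not the project's actual implementation.

import base64
import os

from groq import Groq  # pip install groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

def plan_actions(image_path: str, instruction: str) -> str:
    """Send a camera frame plus an instruction, get back a proposed action sequence."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()

    response = client.chat.completions.create(
        model="llama-3.2-90b-vision-preview",  # assumed Groq model id
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": f"List the arm movements needed to: {instruction}"},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content

print(plan_actions("workspace.jpg", "pick up the red cube"))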

🦾 Robotic Control

  • Compatible with Moss v1 robotic arm
  • Precise motor control through LeRobot integration (see the sketch after this list)
  • Support for complex manipulation tasks
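
For orientation, here is a sketch of the kind of thin arm wrapper implied above. The class and method names (MossArm, connect, move_to) are hypothetical placeholders, not LeRobot's actual API; in the real project the motor commands go through LeRobot's drivers for the Moss v1 arm.

class MossArm:
    """Hypothetical wrapper mapping high-level actions to joint targets."""

    def connect(self) -> None:
        # Open the bus to the arm's servos (handled by LeRobot in practice)
        ...

    def execute(self, action: dict) -> None:
        # An action could be a dict of joint names to target positions, e.g.
        # {"shoulder_pan": 0.2, "elbow_flex": -0.5, "gripper": 1.0}
        for joint, target in action.items():
            self.move_to(joint, target)

    def move_to(self, joint: str, target: float) -> None:
        # Send a single position command; a real implementation would clamp
        # and rate-limit targets before writing to the motor bus
        ...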

🤗 Open-source Contributions

  • 2 fully trained RL policies openly available on Hugging Face
  • 20GB+ of human-recorded data, openly shared with the community as a Hugging Face dataset

🧠 Brain-Computer Interface

  • Direct mind control of robotic arms using EMOTIV EEG headsets
  • Real-time neural signal processing
  • Built on the open Cortex API for BCI integration
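
The Cortex service exposes a JSON-RPC 2.0 API over a local secure WebSocket (wss://localhost:6868 by default). The sketch below only queries for connected headsets; the real flow also needs requestAccess, authorize, createSession, and subscribe calls, and the library choice and SSL handling shown here are assumptions rather than this project's code.

import asyncio
import json
import ssl

import websockets  # pip install websockets

CORTEX_URL = "wss://localhost:6868"

async def query_headsets() -> list:
    # Cortex ships a self-signed certificate, so skip verification for localhost
    ssl_ctx = ssl.create_default_context()
    ssl_ctx.check_hostname = False
    ssl_ctx.verify_mode = ssl.CERT_NONE

    async with websockets.connect(CORTEX_URL, ssl=ssl_ctx) as ws:
        # Standard JSON-RPC request: list headsets the Cortex service can see
        await ws.send(json.dumps({"jsonrpc": "2.0", "id": 1, "method": "queryHeadsets"}))
        reply = json.loads(await ws.recv())
        return reply.get("result", [])

print(asyncio.run(query_headsets()))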

🚀 Getting Started

Prerequisites

  • Python 3 and pip
  • A Groq API key (for the hosted Llama 3.2 90B Vision model)
  • A Moss v1 robotic arm (for the robotic control features)
  • An EMOTIV EEG headset and the Cortex software (for the BCI features)

Installation

  1. Clone the repository
git clone https://github.com/yourusername/mindgrip.git
cd mindgrip
  2. Install dependencies
pip install -r requirements.txt
  3. Set up environment variables in a global .env with:
export GROQ_API_KEY="your_key_here"  # Required for Llama 90B model
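
If the code reads this file with python-dotenv (an assumption; you can equally run `source .env` in your shell), the key becomes visible to the Groq client like so:

import os
from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # also accepts lines prefixed with "export"
assert os.getenv("GROQ_API_KEY"), "GROQ_API_KEY is not set"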

Hardware Setup

  1. Follow the Moss v1 assembly guide for robotic arm setup
  2. Connect your EMOTIV headset following the Cortex API documentation

💡 Usage

Basic Control Flow

from mindgrip.llama import LlamaPolicy
from mindgrip.cortex import CortexInterface
# Module path below is illustrative; use whichever robot wrapper the project exposes
from mindgrip.robot import RobotArm

# Initialize components
policy = LlamaPolicy()
bci = CortexInterface()
robot = RobotArm()

# Start control loop
while True:
    # Get BCI input (mental command from the EEG headset)
    command = bci.get_command()

    # Process with vision system to produce the next action
    action = policy.get_action(command)

    # Execute on robot (Moss v1 arm via LeRobot)
    robot.execute(action)
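
Note that each iteration of this loop makes a remote call to the 90B model through Groq, so the control rate is bounded by network and inference latency; the planned edge deployment of 1B/3B models (see the roadmap) targets exactly that bottleneck.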

🗺️ Roadmap - It's just the start!

  • Initial integration with Llama 3.2 90B Vision
  • Edge deployment with 1B parameter model
  • Edge deployment with 3B parameter model
  • Offline operation support
  • Improved latency through local inference

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

📄 License

This project is fully open source and licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Built with these amazing open-source projects:

  • LeRobot
  • Llama
  • EMOTIV Cortex API
