    Repositories list

    • Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
      Python
      Apache License 2.0
      3254.9k766 Updated Dec 20, 2024
    • apify-cli (Public)
      Apify command-line interface helps you create, develop, build and run Apify actors, and manage the Apify cloud platform.
      TypeScript
      19122364 Updated Dec 20, 2024
    • The Apify SDK for Python is the official library for creating Apify Actors in Python. It provides useful features like actor lifecycle management, local storage emulation, and actor event handling.
      Python
      Apache License 2.0
      10120133 Updated Dec 20, 2024
    • Apify API client for Python
      Python
      Apache License 2.0
      135193 Updated Dec 20, 2024
    • This project is the home of Apify's documentation.
      API Blueprint
      Apache License 2.0
      80297634 Updated Dec 20, 2024
    • Transfer data from Apify Actors to vector databases (Chroma, Milvus, Pinecone, PostgreSQL (PG-Vector), Qdrant, and Weaviate)
      Python
      Apache License 2.0
      4410 Updated Dec 20, 2024
    • Apify ESLint preset to be shared between projects
      JavaScript
      Apache License 2.0
      0211 Updated Dec 20, 2024
    • Utilities and constants shared across Apify projects.
      TypeScript
      Apache License 2.0
      111252 Updated Dec 20, 2024
    • crawlee (Public)
      Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
      TypeScript
      Apache License 2.0
      70716k12617 Updated Dec 20, 2024
    • Apify API client for JavaScript / Node.js.
      TypeScript
      Apache License 2.0
      2769176 Updated Dec 19, 2024
    • This whitepaper describes a new concept for building serverless microapps called Actors, which are easy to develop, share, integrate, and build upon. Actors are a reincarnation of the UNIX philosophy for programs running in the cloud.
      Apache License 2.0
      0566 Updated Dec 19, 2024
    • Apify SDK monorepo
      TypeScript
      Apache License 2.0
      39128119 Updated Dec 19, 2024
    • workflows (Public)
      Apify's reusable GitHub workflows
      Python
      4746 Updated Dec 19, 2024
    • An MCP Server for the RAG Web Browser Actor
      JavaScript
      Apache License 2.0
      21201 Updated Dec 18, 2024
    • This project is the 🏠 home of Apify actor template projects to help users quickly get started.
      Python
      182591 Updated Dec 18, 2024
    • Browser fingerprinting tools for anonymizing your scrapers. Developed by Apify.
      TypeScript
      Apache License 2.0
      1121.1k208 Updated Dec 16, 2024
    • Apify's fork of `docusaurus-plugin-typedoc-api`, customized for our Python documentation.
      TypeScript
      28000 Updated Dec 16, 2024
    • Base Docker images for Apify actors.
      Dockerfile
      Apache License 2.0
      237093 Updated Dec 16, 2024
    • .github (Public)
      Repository that defines organization-wide (or team-wide) GitHub Actions workflows
      0000 Updated Dec 13, 2024
    • A Homebrew tap for Apify tools
      Ruby
      1804 Updated Dec 12, 2024
    • RAG Web Browser is an Apify Actor to feed your LLM applications and RAG pipelines with up-to-date text content scraped from the web.
      TypeScript
      Apache License 2.0
      11130 Updated Dec 11, 2024
    • This tool integrates with AWS to monitor service usage costs and posts a summary of these costs to a Slack channel. The summary includes costs for various AWS services along with a chart that provides a visual breakdown of the costs over time.
      TypeScript
      MIT License
      0001 Updated Dec 10, 2024
    • The official integration for Apify and Haystack 2.0
      Python
      Apache License 2.0
      0200 Updated Dec 9, 2024
    • JavaScript
      0001 Updated Dec 7, 2024
    • A GitHub Action to push an Actor to the Apify platform
      Apache License 2.0
      01500 Updated Dec 6, 2024
    • Constants and utilities shared across Apify's Python libraries and projects.
      Python
      Apache License 2.0
      1010 Updated Dec 6, 2024
    • Node.js implementation of a proxy server (think Squid) with support for SSL, authentication and upstream proxy chaining.
      JavaScript
      Apache License 2.0
      146857711 Updated Dec 3, 2024
    • Apify integration for Zapier
      JavaScript
      Apache License 2.0
      1840 Updated Nov 29, 2024
    • The GitHub Action that makes sure each PR is correctly set up and has a milestone set.
      TypeScript
      Apache License 2.0
      1110 Updated Nov 29, 2024
    • Generic REST API for scraping websites. Drop-in replacement for ScrapingBee, ScrapingAnt, and ScraperAPI services. And it is open-source!
      TypeScript
      51800 Updated Nov 29, 2024