Scripts to capture daily GitHub repository and usage statistics for all repositories under an organization that uses GitHub Enterprise. Simply download the PowerShell script, edit the settings near the top of the file (file paths, API key, organization name, etc.), and run it. You can also schedule it as a Windows Task, or import its output into a database with an external ETL or ELT tool.
- Set GITHUB_USERNAME and GITHUB_API_KEY in the system environment variables.
- Download the PowerShell script (ExportGitHubUsageStatsForOrganization.ps1). In addition, install the PowerShellForGitHub module.
- Edit the settings at the top of the script, and run it.
- Grab the CSV or JSON files from the output directory. Files are replaced at each run, except for the "rolling" file which appends to previous days' data.
- Detailed setup and use instructions are available in the companion knowledge base article, Gathering GitHub Usage and Web Traffic Data.
- The statistics are written to both CSV and JSON files in the output directory, including:
- github-stats-{OrganizationName}.csv - today's snapshot in CSV format. File is replaced at each run. Recommended for loading into a database.
- github-stats-{OrganizationName}.json - today's snapshot in JSON format. File is replaced at each run. Recommended for loading into a database.
- github-stats-detailed-{OrganizationName}.json - today's snapshot in JSON format, with all details included. File is replaced at each run. It can be used for debugging and troubleshooting.
- github-stats-rolling-{OrganizationName}.csv - today's snapshot added to the same CSV, without deleting previous data. This file can be used to create reports directly in Excel, Tableau, PowerBI, etc. without the need for a database.
- All the counts not labeled "yesterday" are 14-day totals, not for an individual day.
- Note that all dates and times are in Coordinated Universal Time (UTC, +00:00). That's because GitHub uses UTC to mark a "day", the most granular time period available, so keeping everything in UTC makes reporting easier.
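The notes above on 14-day totals and UTC days matter when reporting from the rolling file. The following is a minimal Python sketch, not part of the project: the column names (`repository`, `snapshot_date`, `views_14_days`, `views_yesterday`) and the inline sample data are hypothetical stand-ins for the real contents of github-stats-rolling-{OrganizationName}.csv. It shows how to compute "today" and "yesterday" in UTC regardless of local time zone, and why 14-day totals should be read from the latest row rather than summed across days.

```python
import csv
import io
from datetime import datetime, timedelta, timezone


def utc_today_and_yesterday(now=None):
    """Return (today, yesterday) as dates in UTC, whatever the local time zone."""
    now = now or datetime.now(timezone.utc)
    today = now.astimezone(timezone.utc).date()
    return today, today - timedelta(days=1)


def latest_row_per_repo(rows, repo_col="repository", date_col="snapshot_date"):
    """Keep only the most recent snapshot row for each repository.

    Column names are hypothetical. ISO dates (YYYY-MM-DD) compare
    correctly as plain strings, so no date parsing is needed here.
    """
    latest = {}
    for row in rows:
        repo = row[repo_col]
        if repo not in latest or row[date_col] > latest[repo][date_col]:
            latest[repo] = row
    return latest


# Inline sample standing in for the rolling CSV (hypothetical columns and values).
SAMPLE = """repository,snapshot_date,views_14_days,views_yesterday
repo-a,2024-05-01,120,9
repo-a,2024-05-02,125,14
repo-b,2024-05-02,40,3
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# 14-day totals overlap from one day to the next, so take the latest row
# instead of summing them.
latest = latest_row_per_repo(rows)

# "Yesterday" counts cover a single day each, so they can be added up.
repo_a_daily_views = sum(
    int(r["views_yesterday"]) for r in rows if r["repository"] == "repo-a"
)

print(latest["repo-a"]["views_14_days"], repo_a_daily_views)
```

Passing the column names as parameters keeps the sketch usable once the real header names from the CSV are known.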
- The script(s) under the SQL folder can be used to create a table to host and accumulate the data. They include SQL comments for most columns, which can be used in a data dictionary.
- Currently, the only script(s) available are for Oracle databases. Some work may be required to use a different database engine.
- The PowerShell script does not currently save to the database directly. A data pipeline is needed to load the data into a database.
- Although outside the scope of this project, it is worth mentioning that the table created from CSV can be used as-is in visualization tools like Tableau or PowerBI. It can also be further normalized or transformed into a star schema for reporting.
- Example of visualizations for repo usage in Tableau (using v1.0 of the script):
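Because the PowerShell script does not write to a database itself, some separate load step is needed. As a rough illustration only, the Python sketch below appends a snapshot CSV to a table using the standard-library `sqlite3` module. SQLite is an assumption chosen so the example is self-contained; the project's own SQL scripts target Oracle. The table name `github_stats`, the sample columns, and the sample values are hypothetical, and every column is stored as text rather than using the typed definitions from the SQL folder.

```python
import csv
import io
import sqlite3


def load_snapshot(conn, csv_file, table="github_stats"):
    """Append every row of a snapshot CSV to `table`, creating it if needed.

    All columns are stored as TEXT; a real pipeline would use the typed
    table definition from the SQL folder instead.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
    rows = list(reader)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()
    return len(rows)


# Inline sample standing in for github-stats-{OrganizationName}.csv
# (hypothetical columns and values).
SAMPLE = (
    "repository,snapshot_date,views_14_days\n"
    "repo-a,2024-05-02,125\n"
    "repo-b,2024-05-02,40\n"
)

conn = sqlite3.connect(":memory:")
inserted = load_snapshot(conn, io.StringIO(SAMPLE))
count = conn.execute('SELECT COUNT(*) FROM "github_stats"').fetchone()[0]
print(inserted, count)
```

Running the load once per day against the daily snapshot file accumulates history in the table, much like the rolling CSV does on disk. With a real file, `open(path, newline="")` would replace the `io.StringIO` sample.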
- Companion knowledge base article: Gathering GitHub Usage and Web Traffic Data.
- GitHub API Documentation.
- Microsoft's PowerShell wrapper for the GitHub API.
The Mobile Technologies Core provides investigators across the University of Michigan the support and guidance needed to utilize mobile technologies and digital mental health measures in their studies. Experienced faculty and staff offer hands-on consultative services to researchers throughout the University – regardless of specialty or research focus.
To get in touch, contact the individual developers in the check-in history.
If you need assistance identifying a contact person, email the EFDC's Mobile Technologies Core at: [email protected].
- Eisenberg Family Depression Center (@DepressionCenter)
- Gabriel Mongefranco (@gabrielmongefranco)
- Special thanks to the U-M "HITS Academic Integrations" team and Joe Lipa for creating a data pipeline to load this script into a database.
- Microsoft's PowerShellForGitHub module for PowerShell.
- GitHub's API.
Copyright © 2024 The Regents of the University of Michigan
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/gpl-3.0-standalone.html.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. You should have received a copy of the license included in the section entitled "GNU Free Documentation License". If not, see https://www.gnu.org/licenses/fdl-1.3-standalone.html
If you find this repository, code or paper useful for your research, please cite it.