Initiative: Replace Insights with Aspects V1 #4

Closed
9 tasks done
jmakowski1123 opened this issue Jan 9, 2023 · 1 comment

jmakowski1123 commented Jan 9, 2023

Current Documentation

MVP requirements: https://openedx.atlassian.net/wiki/spaces/OEPM/pages/3988160641/Aspects+V1+Product+Requirements

Tech spec: https://openedx.atlassian.net/wiki/spaces/OEPM/pages/3999203338/Aspects+V1+Release+Technical+Approach

Kanban board: https://github.com/orgs/openedx/projects/5

Initiative Overview

This Initiative offers a solution to replace Insights and lays the foundation for a more robust and sustainable data & analytics system for the Open edX platform.

Problem

Insights is large, complicated, difficult and expensive to run, and fairly locked into Amazon. Furthermore, few Instances use it, creating a reality in which many independent data solutions exist, but without central management of the data pipeline.

Solution

This project will replace the data pipeline with a lightweight, flexible solution [What will we call this solution for end users?]. It will enable more streamlined and standardized data collection without sacrificing Providers' capacity to create custom reports for each client. In fact, this new set of tools will make it easier to do so.

Instead of being software that the Open edX community creates, Aspects will be a recommended set of third-party open source tools and the "known good" configurations that combine them into a powerful analytics platform. Initially, Tutor plugins will be created to allow "out of the box" deployment of a recommended Aspects configuration for Tutor-based deployments. Using open standards and versatile tools will allow site operators a high degree of customization and choice to suit their own use cases, which we hope they will contribute back to the Aspects project as instructions for alternate deployment methods.

Specifically, Aspects version 1 will support the following data flow:

  1. Open edX events (known as tracking log events) are transformed into xAPI statements. xAPI is an open standard that makes the data compatible with data from other LMSs and keeps it consistent and versioned as the platform evolves. By default, these statements will be anonymized to preserve learner privacy.
  2. xAPI statements are sent to a Learning Record Store (LRS), which will validate the statements and store them in an analytic database.
  3. Instructors and other users will access the data via a data exploration and visualization website, using the LMS for authentication and permissions, to view reports and download data for further exploration. Site operators will be able to create new reports and visualizations based on the specific needs of their installation.
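
As a rough illustration of steps 1 and 2, here is a minimal Python sketch of turning a tracking log event into an anonymized xAPI statement and running a basic structural check of the kind an LRS might apply before storing it. This is not the project's actual transformer: the `VERB_MAP` table, the `anonymize`, `to_xapi` and `validate` helpers, the HMAC-based anonymization scheme, the secret, and the `lms.example.com` URLs are all assumptions made for the example, and the sample event is a simplified tracking log entry.

```python
import hashlib
import hmac
import json

# Hypothetical mapping from tracking log event names to xAPI verbs.
# A real deployment would cover many more event types.
VERB_MAP = {
    "edx.course.enrollment.activated": (
        "http://adlnet.gov/expapi/verbs/registered",
        "registered",
    ),
}


def anonymize(username, secret):
    """Replace a username with a keyed hash (assumed scheme, for illustration)."""
    return hmac.new(secret, username.encode("utf-8"), hashlib.sha256).hexdigest()


def to_xapi(event, secret, home_page):
    """Build an xAPI-style statement (actor, verb, object) from one event."""
    verb_id, verb_name = VERB_MAP[event["name"]]
    return {
        "actor": {
            "objectType": "Agent",
            "account": {
                "homePage": home_page,
                # Only the anonymized identifier leaves the LMS.
                "name": anonymize(event["username"], secret),
            },
        },
        "verb": {"id": verb_id, "display": {"en": verb_name}},
        "object": {
            "objectType": "Activity",
            "id": f"{home_page}/course/{event['context']['course_id']}",
        },
        "timestamp": event["time"],
    }


def validate(statement):
    """Minimal structural check; a real LRS validates far more than this."""
    has_parts = all(key in statement for key in ("actor", "verb", "object"))
    return has_parts and "id" in statement["verb"]


# Simplified sample event; real tracking log entries carry more fields.
sample_event = {
    "name": "edx.course.enrollment.activated",
    "username": "alice",
    "time": "2023-01-09T12:00:00+00:00",
    "context": {"course_id": "course-v1:DemoX+Demo+2023"},
}

statement = to_xapi(sample_event, b"example-secret", "https://lms.example.com")
if validate(statement):
    print(json.dumps(statement, indent=2))
```

Using a keyed hash keeps the same learner mapped to the same identifier across statements, so reports can still count distinct learners, while the username itself never appears in the analytic database. Whether Aspects anonymizes this way is not stated above; it is one plausible approach.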

User Expectations

The default install will come with reports that meet the majority of requested data needs, and site operators will be able to add custom queries on top of that. Operators will be able to share queries and visualizations that they find useful, or contribute them back to the project to be included in future versions.

Users will have access to a data dashboard of reports where they will be able to visualize their data in different views or download it for offline processing. Site operators will have full permissions control (including the ability to grant permission to explore the entire data set and create new visualizations).

Expected Benefits

  • Access all of the basic information that came with edX Insights
    • Some advanced queries may be added after version 1, depending on resourcing
  • Leverage a wide variety of charts and graphs to visualize their data
  • Create custom visualizations and contribute them back to the community
  • Download data as CSV for further research
  • Ability to import old tracking log files to backfill an existing installation's data
  • Extreme deployment and configuration flexibility
  • Ease of deployment and low maintenance cost

Approach

  • We will recommend best-in-class open source solutions that can scale from low cost commodity hardware to medium-large scale deployments
    • Large and very large scale deployments often have unique needs that we are deliberately not trying to solve for in version 1, but are excited to help with as those needs arise
  • We will offer sensible default configurations and Tutor plugins to make adoption easy
  • We will maintain a set of instructions and recipes for different types of deployment and configuration to ease the burden of adoption for non-Tutor deployments

Milestones

Milestone 1: Discovery and Specs for Aspects V1

Milestone 2: Aspects Reference Implementation

@jmakowski1123 jmakowski1123 converted this from a draft issue Jan 9, 2023
@jmakowski1123 jmakowski1123 changed the title Initiative: Replace Insights with OARS, V1 Initiative: Replace Insights with OARS, V1 [do we call this OARS externally?] Jan 10, 2023
@jmakowski1123 jmakowski1123 moved this to Backlog in Data Working Group Jan 10, 2023
@jmakowski1123 jmakowski1123 moved this from Backlog to Epics - no status in Data Working Group Jan 10, 2023
@jmakowski1123 jmakowski1123 added the initiative Huge unit of work, consisting of multiple epics label Jan 10, 2023
@bmtcril bmtcril changed the title Initiative: Replace Insights with OARS, V1 [do we call this OARS externally?] Initiative: Replace Insights with OARS, V1 Jan 10, 2023
@pomegranited
Contributor

Huge thanks to @bmtcril and @jmakowski1123 for writing this up and developing the roadmap for this project!

[What will we call this solution for end users?]

I've been thinking about this, and one of the many benefits of recommending existing, open source, 3rd party software is that we don't have to provide all the documentation and support for these pieces. E.g., if people have questions for how to use Apache Superset, or how to write custom reports or add data sets, they can search Superset's docs and forums. But they'll only know to do that if we make it clear that the Analytics UI system they're using is "Apache Superset". Same for Clickhouse, Redis, the LRS...: finding the right support info means knowing which technology to search for. If we hide these names behind our own branding/packaging, then we obscure these names, and make it more difficult for people to learn what they need.

And since OARS is just the "reference system", the end users may end up using different pieces altogether. So I don't personally think we need a name for the "end user solution", as it will be something of their own construction which uses OARS as a guide.

@jmakowski1123 jmakowski1123 moved this to In Progress in Open edX Roadmap Jan 20, 2023
@bmtcril bmtcril changed the title Initiative: Replace Insights with OARS, V1 Initiative: Replace Insights with Aspects (formerly OARS), V1 Jun 26, 2023
@bmtcril bmtcril changed the title Initiative: Replace Insights with Aspects (formerly OARS), V1 Initiative: Replace Insights with Aspects V1 Feb 13, 2024
@jmakowski1123 jmakowski1123 removed the initiative Huge unit of work, consisting of multiple epics label Mar 28, 2024
@crathbun428 crathbun428 moved this from All Epics - no status to Done in Data Working Group Jun 26, 2024
@crathbun428 crathbun428 moved this from Being Developed to Shipped in Open edX Roadmap Jun 26, 2024