feat: simulate video learning events
Resolves #3
jo-elimu committed Sep 20, 2024
1 parent e629c8c commit 18b9a86
Showing 7 changed files with 146 additions and 52 deletions.
12 changes: 12 additions & 0 deletions .github/dependabot.yml
@@ -0,0 +1,12 @@
# To get started with Dependabot version updates, you'll need to specify which
# package ecosystems to update and where the package manifests are located.
# Please see the documentation for all configuration options:
# https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file

version: 2
updates:
  - package-ecosystem: "pip" # See documentation for possible values
    directory: "/" # Location of package manifests
    schedule:
      interval: "daily"
    open-pull-requests-limit: 2
45 changes: 45 additions & 0 deletions .github/workflows/simulate-events-daily.yml
@@ -0,0 +1,45 @@
name: Simulate events (daily)

on:
  schedule:
    - cron: 59 11 * * *

jobs:
  simulate_events:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python 3.11
        uses: actions/setup-python@v3
        with:
          python-version: "3.11"

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install flake8
          if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
      - name: Lint with flake8
        run: |
          # stop the build if there are Python syntax errors or undefined names
          flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
          # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
          flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
      - name: Simulate VideoLearningEvents
        run: |
          python simulate-video-learning-events.py
      - name: Git Config
        run: |
          git config user.name 'Nya Ξlimu'
          git config user.email '[email protected]'
      - name: Git Commit
        run: |
          git add **/*.csv
          git commit -m 'chore(ml): simulate events' --allow-empty
      - name: Git Push
        run: |
          git push
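
Note on the schedule: GitHub Actions evaluates `schedule` cron expressions in UTC, so `59 11 * * *` requests one run per day at 11:59 UTC (actual start times can be delayed when the service is busy). The sketch below shows how that expression maps to a concrete next trigger time. The helper name `next_daily_run` is hypothetical and is not part of this commit.

```python
from datetime import datetime, timedelta, timezone


def next_daily_run(now, hour=11, minute=59):
    """Return the next UTC datetime matching the cron expression `59 11 * * *`."""
    # Normalise to UTC first, because the schedule is evaluated in UTC.
    now_utc = now.astimezone(timezone.utc)
    candidate = now_utc.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # If today's slot has already passed (or is exactly now), use tomorrow's.
    if candidate <= now_utc:
        candidate += timedelta(days=1)
    return candidate


if __name__ == '__main__':
    print(next_daily_run(datetime.now(timezone.utc)))
```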
38 changes: 38 additions & 0 deletions .github/workflows/simulate-events.yml
@@ -0,0 +1,38 @@
name: Simulate events

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]

jobs:
  simulate_events:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.9", "3.10", "3.11"]
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v3
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install flake8
          if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
      - name: Lint with flake8
        run: |
          # stop the build if there are Python syntax errors or undefined names
          flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
          # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
          flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
      - name: Simulate VideoLearningEvents
        run: |
          python simulate-video-learning-events.py
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
.venv
56 changes: 4 additions & 52 deletions README.md
@@ -1,58 +1,10 @@
-# DataGeneratorSimulator
-It will generate usage and performance data from user.
+# ML: Event Simulator

-## Tutorial
-*[Building a Data-Driven Education System](http://www2.datainnovation.org/2016-data-driven-education.pdf)
-
-*[Enhancing Teaching and Learning Through Educational Data Mining and Learning Analytics](https://tech.ed.gov/wp-content/uploads/2014/03/edm-la-brief.pdf)
-
-
-# Focus two major areas:
-## Content Analytics
-
-Are teachers using the best possible content? This level of data analysis takes a deeper dive to better inform the design of new course content and understand its impact (or lack thereof) on students.
-
-## Learning Analytics
-
-Game-based learning and adaptive learning systems are growing in use. This technology is designed to build statistical models of student knowledge, tracking their progress in order to personalize the learning experience.
-
-## Requirement
-
-pip install python-testdata
-
-
-## Examples
-We integrate the awsome fake-factory package to generate data using FakeData,
-this allows us to usage data event:
-* id
-* deviceId
-* studentId
-* packageName
-* literacySkill
-* and much much more
-
-lets create a very simple factory that generates Users:
+## Simulate `VideoLearningEvent`s

 ```python
-import testdata
-import datetime
-
-class Usage(testdata.DictFactory):
-id = testdata.CountingFactory(10)
-deviceId = testdata.CountingFactory(10)
-studentId = testdata.CountingFactory(10)
-packageName = testdata.RandomSelection(['Literacy', 'Game', 'Speech'])
-literacySkill = testdata.RandomLengthStringFactory()
-numeracySkill = testdata.RandomLengthStringFactory()
-letter = testdata.RandomLengthStringFactory()
-number = testdata.RandomLengthStringFactory()
-word = testdata.RandomLengthStringFactory()
-
-for usage in Usage().generate(10): # let say we only want 10 users
-print usage
-#{'numeracySkill': 'PwBiXKjwddG', 'studentId': 17, 'word': 'XctugWiHPobIvHNGEbYlgyOUuuqCSKgoTFAhJSQzUUleDEkygyZOWBnGYiLBXbywpwxAsisToqDDWGPHQqbOOlmGVVa', 'packageName': 'Literacy', 'number': 'IrtJUAxnFVOQyvvqlpIsmkaWnRvADBzWBiCYUPvfSwvdHS', 'literacySkill': 'hSQSRXUevpdYMGAs', 'deviceId': 17, 'letter': 'JYuWfonIdptbdpFhBNhLIkLoyhuUgRUvdiUWBcfReeezORAtXhJvNuLZASFeRCAvxvPgOeTZ', 'id': 17}
-#{'numeracySkill': 'ozTpqAwdLstMzeijgJBGYMLantLSMESfYEBMQQxkjILBgNXohBjMbwqrhGsnjoSlcsCGOnTsdgMICQfB', 'studentId': 18, 'word': 'CnhxwMonHnMlEtxcpGowQymEeZtxvlUBDaKHEKRC', 'packageName': 'Literacy', 'number': 'xkerlJLhlyOgsTxHqMPffjPLOqbjgZqtggGzxPTkOleoZtEaDiYnpKxrouCcgRPjdtf', 'literacySkill': 'VlEeAOKKOIgweFTxBeNiOWmoztGPWSqhsIxTr', 'deviceId': 18, 'letter': 'NwJUuHLOkaJHsIvlSQeggfT', 'id': 18}
-#{'numeracySkill': 'uaUQunGtHwrFTuRlVrhwEUisIWlcrZXUZKIlILoPoCgnVWHwrrRaHhxQJVnECUtSvppzQDtpiqUSds', 'studentId': 19, 'word': 'vOTlRRgSXwgmXAthOYnQTTtPJyGxGbbMOj', 'packageName': 'Game', 'number': 'bDmhALNhnmazlonmBIjvwWzXgQfPQQekWJErEvJjWWHrufxuINyHuNiLPvFWynVwdNTaTGIgvvGCAqFRZ', 'literacySkill': 'BpfiZyRAzovNbEhtznPSaqsaZhRkFHlWNpmbzBXKCmBJPnuYiQyEToMaOkVJOVZKNCCAyGpZSpGhfseBMfGaFvltHaJyfcdota', 'deviceId': 19, 'letter': 'nvwanqC', 'id': 19}
+pip install -r requirements.txt
+python simulate-video-learning-events.py
 ```

 ---
1 change: 1 addition & 0 deletions requirements.txt
@@ -0,0 +1 @@
pandas==2.2.2
45 changes: 45 additions & 0 deletions simulate-video-learning-events.py
@@ -0,0 +1,45 @@
from os.path import basename
import pandas

language_codes = ['ENG', 'HIN', 'TGL']
print(basename(__file__), f'language_codes: {language_codes}')

android_ids = ['e387e38700000001', 'e387e38700000002', 'e387e38700000003']
print(basename(__file__), f'android_ids: {android_ids}')

# Should be the same version as the most recent release of the Analytics app:
# https://github.com/elimu-ai/analytics/releases
analytics_version_code = 3001018
print(basename(__file__), f'analytics_version_code: {analytics_version_code}')

def simulateVideoLearningEvent(android_id):
    """Simulate a VideoLearningEvent, e.g. a video being opened."""

    # TODO

    return {}

for language_code in language_codes:
    print(basename(__file__))
    print(basename(__file__), f'language_code: {language_code}')

    videos_csv_url = f'https://raw.githubusercontent.com/elimu-ai/webapp/main/src/main/resources/db/content_PROD/{language_code.lower()}/videos.csv'
    print(basename(__file__), f'videos_csv_url: {videos_csv_url}')
    videos_df = pandas.read_csv(videos_csv_url)
    print(basename(__file__), f'videos_df: \n{videos_df}')

    base_url = f'http://{language_code.lower()}.elimu.ai'
    print(basename(__file__), f'base_url: {base_url}')

    rest_url = f'{base_url}/rest/v2'
    print(basename(__file__), f'rest_url: {rest_url}')

    for android_id in android_ids:
        print(basename(__file__))
        print(basename(__file__), f'android_id: {android_id}')

        event = simulateVideoLearningEvent(android_id)
        print(basename(__file__), f'event: {event}')

    # Export to CSV
    # TODO
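
Note on the open TODOs: the script leaves both the event payload and the CSV export as `# TODO`. One way they might be filled in is sketched below. Everything specific in it is an assumption made for illustration and is not taken from this commit or from the Analytics app: the field names (`android_id`, `video_id`, `video_title`, `time`, `learning_event_type`), the `VIDEO_OPENED` value, the helper names `simulate_video_learning_event` and `export_events_to_csv`, and the output file name. To run offline it uses the standard-library `csv` module and a small in-memory list in place of the downloaded `videos_df`.

```python
import csv
import random
from datetime import datetime, timedelta, timezone

# Hypothetical stand-in for rows of videos.csv; the real column names may differ.
SAMPLE_VIDEOS = [
    {'id': 1, 'title': 'Video A'},
    {'id': 2, 'title': 'Video B'},
]

# Hypothetical column layout for the exported CSV.
FIELDNAMES = ['android_id', 'video_id', 'video_title', 'time', 'learning_event_type']


def simulate_video_learning_event(android_id, videos, rng, now):
    """Return one simulated event (a dict) for a randomly chosen video."""
    video = rng.choice(videos)
    # Place the event at a random second within the 24 hours before `now`.
    event_time = now - timedelta(seconds=rng.randrange(24 * 60 * 60))
    return {
        'android_id': android_id,
        'video_id': video['id'],
        'video_title': video['title'],
        'time': event_time.isoformat(),
        'learning_event_type': 'VIDEO_OPENED',  # assumed value
    }


def export_events_to_csv(events, csv_path):
    """Write the events to `csv_path`, one row per event, with a header row."""
    with open(csv_path, 'w', newline='', encoding='utf-8') as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(events)


if __name__ == '__main__':
    rng = random.Random(42)  # fixed seed so repeated runs pick the same videos
    now = datetime(2024, 9, 20, 12, 0, tzinfo=timezone.utc)
    android_ids = ['e387e38700000001', 'e387e38700000002', 'e387e38700000003']
    events = [simulate_video_learning_event(a, SAMPLE_VIDEOS, rng, now) for a in android_ids]
    export_events_to_csv(events, 'video-learning-events.csv')
```

Passing the random generator and the reference time in as arguments keeps the simulation reproducible, which matters if the generated CSV files are committed by the daily workflow and diffs should stay meaningful.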
