-
-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simulate video learning events #5
base: main
Are you sure you want to change the base?
Conversation
Warning Rate limit exceeded@jo-elimu has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 8 minutes and 39 seconds before requesting another review. How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. WalkthroughThe changes involve the introduction of a new Python script for simulating video learning events, alongside new configuration files for Dependabot and GitHub Actions workflows. The README file has been significantly refocused to emphasize video learning event simulation, with updated installation instructions. A new dependency for the Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant GitHub Actions
participant Python Script
participant Data Source
User->>GitHub Actions: Trigger event simulation
GitHub Actions->>Python Script: Execute simulation script
Python Script->>Data Source: Fetch video data
Data Source-->>Python Script: Return data
Python Script->>User: Display simulated events
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Outside diff range and nitpick comments (4)
.github/workflows/simulate-events.yml (1)
29-34
: Consider failing the build on all flake8 errors.The linting step is set up to fail the build on Python syntax errors and undefined names, which is good. However, it treats all other errors as warnings, which may allow potential issues to slip through.
Consider updating the second flake8 command to remove the
--exit-zero
flag, which will cause the build to fail on all flake8 errors. This will help catch more potential issues early in the development process.simulate-video-learning-events.py (3)
4-13
: LGTM, but consider automating the analytics version code update.The code correctly defines the required variables and uses print statements for debugging purposes.
However, the analytics version code is hardcoded and might need to be updated manually in the future. Consider automating this process to ensure that the version code is always in sync with the latest release of the Analytics app.
15-21
: Reminder: Complete the function implementation.The
simulateVideoLearningEvent
function has a TODO comment indicating that the implementation is pending. Please ensure that the function is properly implemented to simulate a video learning event as described in the docstring.Do you want me to generate the function implementation or open a GitHub issue to track this task?
37-45
: Reminder: Implement the CSV export functionality.The code has a TODO comment indicating that exporting the event to a CSV file is pending. Please ensure that the CSV export functionality is properly implemented to store the generated video learning events.
Do you want me to generate the code for exporting the event to a CSV file or open a GitHub issue to track this task?
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (25)
dashboard/assets/fonts/Pe-icon-7-stroke.eot
is excluded by!**/*.eot
dashboard/assets/fonts/Pe-icon-7-stroke.svg
is excluded by!**/*.svg
dashboard/assets/fonts/Pe-icon-7-stroke.ttf
is excluded by!**/*.ttf
dashboard/assets/fonts/Pe-icon-7-stroke.woff
is excluded by!**/*.woff
dashboard/assets/img/default-avatar.png
is excluded by!**/*.png
dashboard/assets/img/faces/face-0.jpg
is excluded by!**/*.jpg
dashboard/assets/img/faces/face-1.jpg
is excluded by!**/*.jpg
dashboard/assets/img/faces/face-2.jpg
is excluded by!**/*.jpg
dashboard/assets/img/faces/face-3.jpg
is excluded by!**/*.jpg
dashboard/assets/img/faces/face-4.jpg
is excluded by!**/*.jpg
dashboard/assets/img/faces/face-5.jpg
is excluded by!**/*.jpg
dashboard/assets/img/faces/face-6.jpg
is excluded by!**/*.jpg
dashboard/assets/img/faces/face-7.jpg
is excluded by!**/*.jpg
dashboard/assets/img/favicon.ico
is excluded by!**/*.ico
dashboard/assets/img/loading-bubbles.svg
is excluded by!**/*.svg
dashboard/assets/img/mask.png
is excluded by!**/*.png
dashboard/assets/img/new_logo.png
is excluded by!**/*.png
dashboard/assets/img/sidebar-1.jpg
is excluded by!**/*.jpg
dashboard/assets/img/sidebar-2.jpg
is excluded by!**/*.jpg
dashboard/assets/img/sidebar-3.jpg
is excluded by!**/*.jpg
dashboard/assets/img/sidebar-4.jpg
is excluded by!**/*.jpg
dashboard/assets/img/sidebar-5.jpg
is excluded by!**/*.jpg
dashboard/assets/img/tim_80x80.png
is excluded by!**/*.png
dashboard/assets/js/bootstrap.min.js
is excluded by!**/*.min.js
dashboard/assets/js/chartist.min.js
is excluded by!**/*.min.js
Files selected for processing (52)
- .github/dependabot.yml (1 hunks)
- .github/workflows/simulate-events-daily.yml (1 hunks)
- .github/workflows/simulate-events.yml (1 hunks)
- .gitignore (1 hunks)
- README.md (1 hunks)
- dashboard/LICENSE.md (0 hunks)
- dashboard/README.md (0 hunks)
- dashboard/assets/css/animate.min.css (0 hunks)
- dashboard/assets/css/demo.css (0 hunks)
- dashboard/assets/css/light-bootstrap-dashboard.css (0 hunks)
- dashboard/assets/css/pe-icon-7-stroke.css (0 hunks)
- dashboard/assets/js/bootstrap-checkbox-radio-switch.js (0 hunks)
- dashboard/assets/js/bootstrap-notify.js (0 hunks)
- dashboard/assets/js/bootstrap-select.js (0 hunks)
- dashboard/assets/js/demo.js (0 hunks)
- dashboard/assets/js/light-bootstrap-dashboard.js (0 hunks)
- dashboard/assets/sass/lbd/_alerts.scss (0 hunks)
- dashboard/assets/sass/lbd/_buttons.scss (0 hunks)
- dashboard/assets/sass/lbd/_cards.scss (0 hunks)
- dashboard/assets/sass/lbd/_chartist.scss (0 hunks)
- dashboard/assets/sass/lbd/_checkbox-radio-switch.scss (0 hunks)
- dashboard/assets/sass/lbd/_dropdown.scss (0 hunks)
- dashboard/assets/sass/lbd/_footers.scss (0 hunks)
- dashboard/assets/sass/lbd/_inputs.scss (0 hunks)
- dashboard/assets/sass/lbd/_misc.scss (0 hunks)
- dashboard/assets/sass/lbd/_mixins.scss (0 hunks)
- dashboard/assets/sass/lbd/_navbars.scss (0 hunks)
- dashboard/assets/sass/lbd/_responsive.scss (0 hunks)
- dashboard/assets/sass/lbd/_sidebar-and-main-panel.scss (0 hunks)
- dashboard/assets/sass/lbd/_tables.scss (0 hunks)
- dashboard/assets/sass/lbd/_typography.scss (0 hunks)
- dashboard/assets/sass/lbd/_variables.scss (0 hunks)
- dashboard/assets/sass/lbd/mixins/_buttons.scss (0 hunks)
- dashboard/assets/sass/lbd/mixins/_cards.scss (0 hunks)
- dashboard/assets/sass/lbd/mixins/_chartist.scss (0 hunks)
- dashboard/assets/sass/lbd/mixins/_icons.scss (0 hunks)
- dashboard/assets/sass/lbd/mixins/_inputs.scss (0 hunks)
- dashboard/assets/sass/lbd/mixins/_labels.scss (0 hunks)
- dashboard/assets/sass/lbd/mixins/_morphing-buttons.scss (0 hunks)
- dashboard/assets/sass/lbd/mixins/_navbars.scss (0 hunks)
- dashboard/assets/sass/lbd/mixins/_social-buttons.scss (0 hunks)
- dashboard/assets/sass/lbd/mixins/_tabs.scss (0 hunks)
- dashboard/assets/sass/lbd/mixins/_transparency.scss (0 hunks)
- dashboard/assets/sass/lbd/mixins/_vendor-prefixes.scss (0 hunks)
- dashboard/assets/sass/light-bootstrap-dashboard.scss (0 hunks)
- dashboard/dashboard.html (0 hunks)
- docs/README.md (0 hunks)
- model/classifer.py (0 hunks)
- model/score.py (0 hunks)
- requirements.txt (1 hunks)
- simulate-video-learning-events.py (1 hunks)
- test_data.py (0 hunks)
Files not reviewed due to no reviewable changes (45)
- dashboard/LICENSE.md
- dashboard/README.md
- dashboard/assets/css/animate.min.css
- dashboard/assets/css/demo.css
- dashboard/assets/css/light-bootstrap-dashboard.css
- dashboard/assets/css/pe-icon-7-stroke.css
- dashboard/assets/js/bootstrap-checkbox-radio-switch.js
- dashboard/assets/js/bootstrap-notify.js
- dashboard/assets/js/bootstrap-select.js
- dashboard/assets/js/demo.js
- dashboard/assets/js/light-bootstrap-dashboard.js
- dashboard/assets/sass/lbd/_alerts.scss
- dashboard/assets/sass/lbd/_buttons.scss
- dashboard/assets/sass/lbd/_cards.scss
- dashboard/assets/sass/lbd/_chartist.scss
- dashboard/assets/sass/lbd/_checkbox-radio-switch.scss
- dashboard/assets/sass/lbd/_dropdown.scss
- dashboard/assets/sass/lbd/_footers.scss
- dashboard/assets/sass/lbd/_inputs.scss
- dashboard/assets/sass/lbd/_misc.scss
- dashboard/assets/sass/lbd/_mixins.scss
- dashboard/assets/sass/lbd/_navbars.scss
- dashboard/assets/sass/lbd/_responsive.scss
- dashboard/assets/sass/lbd/_sidebar-and-main-panel.scss
- dashboard/assets/sass/lbd/_tables.scss
- dashboard/assets/sass/lbd/_typography.scss
- dashboard/assets/sass/lbd/_variables.scss
- dashboard/assets/sass/lbd/mixins/_buttons.scss
- dashboard/assets/sass/lbd/mixins/_cards.scss
- dashboard/assets/sass/lbd/mixins/_chartist.scss
- dashboard/assets/sass/lbd/mixins/_icons.scss
- dashboard/assets/sass/lbd/mixins/_inputs.scss
- dashboard/assets/sass/lbd/mixins/_labels.scss
- dashboard/assets/sass/lbd/mixins/_morphing-buttons.scss
- dashboard/assets/sass/lbd/mixins/_navbars.scss
- dashboard/assets/sass/lbd/mixins/_social-buttons.scss
- dashboard/assets/sass/lbd/mixins/_tabs.scss
- dashboard/assets/sass/lbd/mixins/_transparency.scss
- dashboard/assets/sass/lbd/mixins/_vendor-prefixes.scss
- dashboard/assets/sass/light-bootstrap-dashboard.scss
- dashboard/dashboard.html
- docs/README.md
- model/classifer.py
- model/score.py
- test_data.py
Files skipped from review due to trivial changes (1)
- .gitignore
Additional context used
actionlint
.github/workflows/simulate-events-daily.yml
40-40: shellcheck reported issue in this script: SC2035:info:1:9: Use ./glob or -- glob so names with dashes won't become options
(shellcheck)
Additional comments not posted (16)
requirements.txt (1)
1-1
: LGTM!The addition of the
pandas
library as a dependency aligns with the PR objectives of generating CSV files and simulating learning events. Specifying the exact version ensures consistent behavior across different environments..github/dependabot.yml (1)
1-12
: Dependabot configuration looks good!The Dependabot configuration file is set up correctly:
- It uses version 2 of the configuration format.
- It targets the "pip" package ecosystem, which is appropriate for a Python project.
- It looks for the
requirements.txt
file at the root of the repository.- It schedules daily checks for dependency updates.
- It limits the number of open pull requests to 2, which helps prevent maintainer overload.
This configuration will help keep the project's dependencies up to date automatically, saving time and effort for the maintainers. Nice work!
README.md (3)
1-1
: The title change is appropriate.The new title "ML: Event Simulator" accurately reflects the updated focus of the project on simulating events for machine learning purposes, specifically video learning events as mentioned in the PR objectives.
6-6
: Using a requirements file is a good practice.The change from installing a specific package to using a
requirements.txt
file is a better approach for managing dependencies. It allows for easier tracking and management of all the required packages and their versions in one place.
7-7
: The new example usage is clear and aligns with the PR objective.Replacing the previous example with the command to run the "simulate-video-learning-events.py" script provides a straightforward way to execute the video learning event simulation. The script name is descriptive and clearly conveys its purpose, which aligns with the PR objective.
.github/workflows/simulate-events.yml (4)
3-7
: LGTM!The workflow triggers are set up correctly to run the workflow on every commit pushed to the main branch and every pull request targeting the main branch.
12-14
: LGTM!The matrix strategy is set up correctly to run the workflow on multiple Python versions. This helps catch any version-specific issues.
23-27
: LGTM!The dependency installation step is set up correctly to upgrade pip, install flake8 for linting, and install dependencies from requirements.txt if the file exists.
36-38
: Verify the event simulation script.The workflow runs the
simulate-video-learning-events.py
script to simulate video learning events. However, the script file is not provided in the context, so it's not possible to analyze its contents.Please ensure that the
simulate-video-learning-events.py
script file exists in the repository and is properly implemented to simulate the events correctly. It would be helpful to provide the script file for review to ensure it aligns with the PR objectives..github/workflows/simulate-events-daily.yml (4)
1-6
: LGTM!The workflow name clearly conveys its purpose, and the cron expression is set up correctly to trigger the workflow daily at 11:59 UTC.
7-29
: LGTM!The job setup looks good:
- Running on the latest Ubuntu version ensures an up-to-date environment.
- Setting up Python 3.11 using the
actions/setup-python
action is a standard practice.- Installing dependencies and running linting are good practices to ensure code quality.
- The flake8 configurations seem reasonable.
31-33
: Verify the correctness and completeness of the simulation logic.Running the
simulate-video-learning-events.py
script aligns with the purpose of the workflow, and the script name clearly indicates its responsibility. However, please ensure that the script correctly and completely simulates the desired VideoLearningEvents.
35-38
: LGTM!Setting the Git user name and email is necessary for making commits, and the configuration commands are correct.
simulate-video-learning-events.py (3)
1-3
: LGTM!The imports are correct and serve their intended purpose.
22-29
: LGTM!The code correctly iterates over the language codes, fetches the corresponding video data from the specified URL, and reads the CSV file using the
pandas
library. Printing the DataFrame is useful for debugging purposes.
31-35
: LGTM!The code correctly constructs the base URL and the REST API URL based on the current language code. Printing the URLs is useful for debugging purposes.
- name: Git Commit | ||
run: | | ||
git add **/*.csv | ||
git commit -m 'chore(ml): simulate events' --allow-empty | ||
- name: Git Push | ||
run: | | ||
git push |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix the shellcheck issue in the git add
command.
To prevent file names with dashes from being interpreted as options, update the git add
command to use ./*glob*
or -- *glob*
as suggested by shellcheck.
Apply this diff to fix the issue:
-git add **/*.csv
+git add -- **/*.csv
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
- name: Git Commit | |
run: | | |
git add **/*.csv | |
git commit -m 'chore(ml): simulate events' --allow-empty | |
- name: Git Push | |
run: | | |
git push | |
- name: Git Commit | |
run: | | |
git add -- **/*.csv | |
git commit -m 'chore(ml): simulate events' --allow-empty | |
- name: Git Push | |
run: | | |
git push |
Tools
actionlint
40-40: shellcheck reported issue in this script: SC2035:info:1:9: Use ./glob or -- glob so names with dashes won't become options
(shellcheck)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@coderabbitai How can we include shellcheck in our GitHub workflow files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jo-elimu, you can include shellcheck in your GitHub workflow files by adding steps to install and run shellcheck on your shell scripts. Here's how you can modify your existing workflow in .github/workflows/simulate-events-daily.yml
:
name: Simulate events (daily)
on:
schedule:
- cron: 59 11 * * *
jobs:
simulate_events:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Set up Python 3.11
uses: actions/setup-python@v3
with:
python-version: "3.11"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install flake8
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
- name: Lint with flake8
run: |
# Stop the build if there are Python syntax errors or undefined names
flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
# Exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
- name: Install ShellCheck
run: |
sudo apt-get update
sudo apt-get install -y shellcheck
- name: Lint shell scripts with ShellCheck
run: |
shellcheck **/*.sh
shellcheck **/*.bash
shellcheck **/*.zsh
shellcheck **/*.ksh
- name: Simulate VideoLearningEvents
run: |
python simulate-video-learning-events.py
- name: Git Config
run: |
git config user.name 'Nya Ξlimu'
git config user.email '[email protected]'
- name: Git Commit
run: |
git add -- **/*.csv
git commit -m 'chore(ml): simulate events' --allow-empty
- name: Git Push
run: |
git push
This adds two new steps:
- Install ShellCheck: Installs ShellCheck on the runner using
apt-get
. - Lint shell scripts with ShellCheck: Runs ShellCheck on all your shell script files.
Alternatively, you can use a pre-made GitHub Action for ShellCheck. Here's how you can do it using ludeeus/action-shellcheck
:
- name: Lint shell scripts with ShellCheck
uses: ludeeus/[email protected]
with:
path: ./
This action will automatically run ShellCheck on your shell scripts in the repository.
Let me know if you need any further assistance!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Outside diff range and nitpick comments (4)
simulate-video-learning-events.py (4)
8-8
: Consider removing print statements for production code.The print statements are useful for debugging purposes but should be removed or commented out in production code to avoid unnecessary output and potential performance overhead.
Also applies to: 11-11, 16-16, 19-19
58-58
: Address the static analysis hint regarding the unused loop variable.The static analysis tool suggests renaming the unused loop variable
i
to_i
to indicate that it is intentionally unused.Apply this change:
-for i in range(random_number_of_events): +for _i in range(random_number_of_events):Tools
Ruff
58-58: Loop control variable
i
not used within loop bodyRename unused
i
to_i
(B007)
65-65
: Remove the print statement for thevideo_learning_events_df
DataFrame.Printing the entire DataFrame can be useful for debugging purposes but should be removed or commented out in production code to avoid unnecessary output and potential performance overhead, especially if the DataFrame is large.
71-71
: Address the static analysis hint regarding the f-string without placeholders.The static analysis tool suggests removing the extraneous
f
prefix from the string since it doesn't contain any placeholders.Apply this change:
-storybook_learning_events_dir = os.path.join(version_code_dir, f'storybook-learning-events') +storybook_learning_events_dir = os.path.join(version_code_dir, 'storybook-learning-events')Tools
Ruff
71-71: f-string without any placeholders
Remove extraneous
f
prefix(F541)
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (9)
lang-ENG/android-id-e387e38700000001/version-code-3001018/storybook-learning-events/e387e38700000001_3001018_storybook-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-ENG/android-id-e387e38700000002/version-code-3001018/storybook-learning-events/e387e38700000002_3001018_storybook-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-ENG/android-id-e387e38700000003/version-code-3001018/storybook-learning-events/e387e38700000003_3001018_storybook-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-HIN/android-id-e387e38700000001/version-code-3001018/storybook-learning-events/e387e38700000001_3001018_storybook-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-HIN/android-id-e387e38700000002/version-code-3001018/storybook-learning-events/e387e38700000002_3001018_storybook-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-HIN/android-id-e387e38700000003/version-code-3001018/storybook-learning-events/e387e38700000003_3001018_storybook-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-TGL/android-id-e387e38700000001/version-code-3001018/storybook-learning-events/e387e38700000001_3001018_storybook-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-TGL/android-id-e387e38700000002/version-code-3001018/storybook-learning-events/e387e38700000002_3001018_storybook-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-TGL/android-id-e387e38700000003/version-code-3001018/storybook-learning-events/e387e38700000003_3001018_storybook-learning-events_2024-09-21.csv
is excluded by!**/*.csv
Files selected for processing (1)
- simulate-video-learning-events.py (1 hunks)
Additional context used
Ruff
simulate-video-learning-events.py
58-58: Loop control variable
i
not used within loop bodyRename unused
i
to_i
(B007)
71-71: f-string without any placeholders
Remove extraneous
f
prefix(F541)
simulate-video-learning-events.py
Outdated
def simulateVideoLearningEvent(android_id): | ||
""" | ||
Simulate a VideoLearningEvent, e.g. a video being opened. | ||
|
||
Should match the CSV format in https://github.com/elimu-ai/analytics/blob/main/app/src/main/java/ai/elimu/analytics/task/ExportEventsToCsvWorker.java | ||
""" | ||
|
||
id = 0 | ||
time = 0 | ||
|
||
return { | ||
'id': id, | ||
'time': time | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Complete the implementation of simulateVideoLearningEvent
function.
The current implementation of the function is incomplete and returns hardcoded values. Please consider the following suggestions:
-
Use the
android_id
parameter to generate realistic values forid
andtime
fields. For example, you can use a combination ofandroid_id
and a random number to generate uniqueid
values. -
Generate a realistic
time
value based on the current timestamp or a simulated timestamp within a reasonable range. -
Consider adding more fields to the returned dictionary to simulate a more realistic video learning event, such as
video_id
,duration
,progress
, etc.
simulate-video-learning-events.py
Outdated
from datetime import datetime | ||
import os | ||
from os.path import basename | ||
import pandas | ||
import random | ||
|
||
language_codes = ['ENG', 'HIN', 'TGL'] | ||
print(basename(__file__), f'language_codes: {language_codes}') | ||
|
||
android_ids = ['e387e38700000001', 'e387e38700000002', 'e387e38700000003'] | ||
print(basename(__file__), f'android_ids: {android_ids}') | ||
|
||
# Should be the same version as the most recent release of the Analytics app: | ||
# https://github.com/elimu-ai/analytics/releases | ||
analytics_version_code = 3001018 | ||
print(basename(__file__), f'analytics_version_code: {analytics_version_code}') | ||
|
||
date = datetime.today().strftime('%Y-%m-%d') | ||
print(basename(__file__), f'date: {date}') | ||
|
||
def simulateVideoLearningEvent(android_id): | ||
""" | ||
Simulate a VideoLearningEvent, e.g. a video being opened. | ||
|
||
Should match the CSV format in https://github.com/elimu-ai/analytics/blob/main/app/src/main/java/ai/elimu/analytics/task/ExportEventsToCsvWorker.java | ||
""" | ||
|
||
id = 0 | ||
time = 0 | ||
|
||
return { | ||
'id': id, | ||
'time': time | ||
} | ||
|
||
for language_code in language_codes: | ||
print(basename(__file__)) | ||
print(basename(__file__), f'language_code: {language_code}') | ||
|
||
videos_csv_url = f'https://raw.githubusercontent.com/elimu-ai/webapp/main/src/main/resources/db/content_PROD/{language_code.lower()}/videos.csv' | ||
print(basename(__file__), f'videos_csv_url: {videos_csv_url}') | ||
videos_df = pandas.read_csv(videos_csv_url) | ||
print(basename(__file__), f'videos_df: \n{videos_df}') | ||
|
||
base_url = f'http://{language_code.lower()}.elimu.ai' | ||
print(basename(__file__), f'base_url: {base_url}') | ||
|
||
rest_url = f'{base_url}/rest/v2' | ||
print(basename(__file__), f'rest_url: {rest_url}') | ||
|
||
for android_id in android_ids: | ||
print(basename(__file__)) | ||
print(basename(__file__), f'android_id: {android_id}') | ||
|
||
video_learning_events = [] | ||
random_number_of_events = random.randrange(0, 20) | ||
print(basename(__file__), f'random_number_of_events: {random_number_of_events}') | ||
for i in range(random_number_of_events): | ||
print(basename(__file__)) | ||
event = simulateVideoLearningEvent(android_id) | ||
print(basename(__file__), f'event: {event}') | ||
video_learning_events.append(event) | ||
|
||
video_learning_events_df = pandas.DataFrame(video_learning_events) | ||
print(basename(__file__), f'video_learning_events_df: \n{video_learning_events_df}') | ||
|
||
# Export to CSV | ||
language_dir = f'lang-{language_code}' | ||
android_id_dir = os.path.join(language_dir, f'android-id-{android_id}') | ||
version_code_dir = os.path.join(android_id_dir, f'version-code-{analytics_version_code}') | ||
storybook_learning_events_dir = os.path.join(version_code_dir, f'storybook-learning-events') | ||
if not os.path.exists(storybook_learning_events_dir): | ||
os.makedirs(storybook_learning_events_dir) | ||
csv_path = os.path.join(storybook_learning_events_dir, f'{android_id}_{analytics_version_code}_storybook-learning-events_{date}.csv') | ||
print(basename(__file__), f'csv_path: \n{csv_path}') | ||
video_learning_events_df.to_csv(csv_path, index=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good start on the video learning event simulation script!
The script provides a basic framework for simulating video learning events and exporting them to CSV files. It demonstrates the use of pandas for data manipulation and follows a consistent directory structure and naming convention.
However, there are a few areas that need improvement:
-
Complete the implementation of the
simulateVideoLearningEvent
function to generate more realistic event data based on the providedandroid_id
and other relevant factors. -
Enhance the event generation strategy to reflect more realistic usage patterns and consider using historical data or analytics to inform the process.
-
Add error handling and fallback mechanisms to handle potential issues when fetching the videos CSV file from the GitHub URL.
-
Remove or comment out the print statements and debug output in production code to avoid unnecessary output and potential performance overhead.
-
Consider adding configuration options or command-line arguments to make the script more flexible and reusable, such as specifying the number of events to generate, the date range, or the output directory.
Overall, the script is a good starting point but needs some refinement and additional features to be more robust and realistic. Please let me know if you have any questions or need further assistance!
Tools
Ruff
58-58: Loop control variable
i
not used within loop bodyRename unused
i
to_i
(B007)
71-71: f-string without any placeholders
Remove extraneous
f
prefix(F541)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Outside diff range and nitpick comments (4)
simulate-video-learning-events.py (4)
7-11
: Consider making the language codes and Android IDs configurable.The language codes and Android IDs are currently hardcoded in the script. To improve flexibility and maintainability, consider the following suggestions:
- Move the hardcoded values to a configuration file (e.g., YAML or JSON) and read them from there.
- Make the language codes and Android IDs configurable via command-line arguments using a library like
argparse
.Additionally, remove or comment out the
13-21
: Consider making the package name and analytics version code configurable.The package name and analytics version code are currently hardcoded in the script. To improve flexibility and maintainability, consider the following suggestions:
- Move the hardcoded values to a configuration file (e.g., YAML or JSON) and read them from there.
- Make the package name and analytics version code configurable via command-line arguments using a library like
argparse
.Additionally, remove or comment out the
23-24
: Remove or comment out theThe
44-73
: Remove or comment out theThe
Tools
Ruff
66-66: Loop control variable
i
not used within loop bodyRename unused
i
to_i
(B007)
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (9)
lang-ENG/android-id-e387e38700000001/version-code-3001018/video-learning-events/e387e38700000001_3001018_video-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-ENG/android-id-e387e38700000002/version-code-3001018/video-learning-events/e387e38700000002_3001018_video-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-ENG/android-id-e387e38700000003/version-code-3001018/video-learning-events/e387e38700000003_3001018_video-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-HIN/android-id-e387e38700000001/version-code-3001018/video-learning-events/e387e38700000001_3001018_video-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-HIN/android-id-e387e38700000002/version-code-3001018/video-learning-events/e387e38700000002_3001018_video-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-HIN/android-id-e387e38700000003/version-code-3001018/video-learning-events/e387e38700000003_3001018_video-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-TGL/android-id-e387e38700000001/version-code-3001018/video-learning-events/e387e38700000001_3001018_video-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-TGL/android-id-e387e38700000002/version-code-3001018/video-learning-events/e387e38700000002_3001018_video-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-TGL/android-id-e387e38700000003/version-code-3001018/video-learning-events/e387e38700000003_3001018_video-learning-events_2024-09-21.csv
is excluded by!**/*.csv
Files selected for processing (1)
- simulate-video-learning-events.py (1 hunks)
Additional context used
Ruff
simulate-video-learning-events.py
66-66: Loop control variable
i
not used within loop bodyRename unused
i
to_i
(B007)
79-79: f-string without any placeholders
Remove extraneous
f
prefix(F541)
Additional comments not posted (2)
simulate-video-learning-events.py (2)
1-6
: LGTM!The imported libraries and modules are relevant and necessary for the script's functionality. The code segment looks good.
75-84
: LGTM!The code segment correctly exports the simulated video learning events to a CSV file. The directory structure and file naming convention are consistent and meaningful. The export functionality looks good.
Tools
Ruff
79-79: f-string without any placeholders
Remove extraneous
f
prefix(F541)
simulate-video-learning-events.py
Outdated
def simulateVideoLearningEvent(android_id): | ||
""" | ||
Simulate a VideoLearningEvent, e.g. a video being opened. | ||
|
||
Should match the CSV format in https://github.com/elimu-ai/analytics/blob/main/app/src/main/java/ai/elimu/analytics/task/ExportEventsToCsvWorker.java | ||
""" | ||
|
||
id = 0 | ||
timestamp = int(datetime.now().timestamp()) | ||
|
||
return { | ||
'id': id, | ||
'timestamp': timestamp, | ||
'android_id': android_id, | ||
'package_name': package_name, | ||
'video_id': 0 | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Complete the implementation of simulateVideoLearningEvent
function.
The current implementation of the function is incomplete and returns hardcoded values. Please consider the following suggestions:
-
Use the
android_id
parameter to generate realistic values forid
andtimestamp
fields. For example, you can use a combination ofandroid_id
and a random number to generate uniqueid
values. -
Generate a realistic
timestamp
value based on the current timestamp or a simulated timestamp within a reasonable range. -
Consider adding more fields to the returned dictionary to simulate a more realistic video learning event, such as
video_id
,duration
,progress
, etc.
random_number_of_events = random.randrange(0, 20) | ||
print(basename(__file__), f'random_number_of_events: {random_number_of_events}') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using a more realistic event generation strategy.
The current implementation generates a random number of events for each Android ID, which might not reflect realistic usage patterns. Please consider the following suggestions:
-
Use a more sophisticated algorithm or probability distribution to generate the number of events based on factors such as user engagement, time of day, device type, etc.
-
Consider using historical data or analytics to inform the event generation process and make it more realistic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Outside diff range and nitpick comments (2)
simulate-video-learning-events.py (2)
49-49
: Specify a meaningful 'learning_event_type'The
learning_event_type
field is currently an empty string. To simulate realistic event data, consider specifying a meaningful event type, such as'VIDEO_OPENED'
or'VIDEO_COMPLETED'
.Apply this diff to set a default learning event type:
'video_title': random_video.title, - 'learning_event_type': '' + 'learning_event_type': 'VIDEO_OPENED'
92-92
: Remove unnecessary 'f' prefix from the stringThe string
'video-learning-events'
does not contain any placeholders. Removing thef
prefix eliminates confusion.Apply this diff to fix the issue:
- video_learning_events_dir = os.path.join(version_code_dir, f'video-learning-events') + video_learning_events_dir = os.path.join(version_code_dir, 'video-learning-events')Tools
Ruff
92-92: f-string without any placeholders
Remove extraneous
f
prefix(F541)
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (6)
lang-ENG/android-id-e387e38700000001/version-code-3001018/video-learning-events/e387e38700000001_3001018_video-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-ENG/android-id-e387e38700000002/version-code-3001018/video-learning-events/e387e38700000002_3001018_video-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-ENG/android-id-e387e38700000003/version-code-3001018/video-learning-events/e387e38700000003_3001018_video-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-TGL/android-id-e387e38700000001/version-code-3001018/video-learning-events/e387e38700000001_3001018_video-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-TGL/android-id-e387e38700000002/version-code-3001018/video-learning-events/e387e38700000002_3001018_video-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-TGL/android-id-e387e38700000003/version-code-3001018/video-learning-events/e387e38700000003_3001018_video-learning-events_2024-09-21.csv
is excluded by!**/*.csv
Files selected for processing (1)
- simulate-video-learning-events.py (1 hunks)
Additional context used
Ruff
simulate-video-learning-events.py
79-79: Loop control variable
i
not used within loop bodyRename unused
i
to_i
(B007)
92-92: f-string without any placeholders
Remove extraneous
f
prefix(F541)
Should match the CSV format in https://github.com/elimu-ai/analytics/blob/main/app/src/main/java/ai/elimu/analytics/task/ExportEventsToCsvWorker.java | ||
""" | ||
|
||
id = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generate unique IDs for learning events
Currently, the id
field is set to 0
for all events, which could lead to duplicate IDs in your data. Consider generating a unique ID for each event using uuid.uuid4()
.
Apply this diff to generate unique IDs:
+import uuid # Add at the top of the file
...
def simulate_video_learning_event(android_id, videos_df: pandas.DataFrame):
- id = 0
+ id = str(uuid.uuid4())
Committable suggestion was skipped due to low confidence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
Outside diff range and nitpick comments (1)
simulate-video-learning-events.py (1)
7-24
: Remove or comment out print statements in production code.The print statements are used for debugging purposes and should be removed or commented out in production code to avoid unnecessary output and potential performance overhead.
Apply this diff to remove the print statements:
-print(basename(__file__), f'language_codes: {language_codes}') ... -print(basename(__file__), f'android_ids: {android_ids}') ... -print(basename(__file__), f'package_name: {package_name}') ... -print(basename(__file__), f'analytics_version_code: {analytics_version_code}') ... -print(basename(__file__), f'date_iso_8601: {date_iso_8601}')
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (6)
lang-ENG/android-id-e387e38700000001/version-code-3001018/video-learning-events/e387e38700000001_3001018_video-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-ENG/android-id-e387e38700000002/version-code-3001018/video-learning-events/e387e38700000002_3001018_video-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-ENG/android-id-e387e38700000003/version-code-3001018/video-learning-events/e387e38700000003_3001018_video-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-TGL/android-id-e387e38700000001/version-code-3001018/video-learning-events/e387e38700000001_3001018_video-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-TGL/android-id-e387e38700000002/version-code-3001018/video-learning-events/e387e38700000002_3001018_video-learning-events_2024-09-21.csv
is excluded by!**/*.csv
lang-TGL/android-id-e387e38700000003/version-code-3001018/video-learning-events/e387e38700000003_3001018_video-learning-events_2024-09-21.csv
is excluded by!**/*.csv
Files selected for processing (1)
- simulate-video-learning-events.py (1 hunks)
Additional context used
Ruff
simulate-video-learning-events.py
57-57: Local variable
second_learning_event_type
is assigned to but never usedRemove assignment to unused variable
second_learning_event_type
(F841)
100-100: Loop control variable
i
not used within loop bodyRename unused
i
to_i
(B007)
110-110: f-string without any placeholders
Remove extraneous
f
prefix(F541)
Additional comments not posted (1)
simulate-video-learning-events.py (1)
1-6
: LGTM!The imports are relevant and necessary for the script's functionality.
simulate-video-learning-events.py
Outdated
def simulate_video_learning_event(android_id, videos_df: pandas.DataFrame, video_learning_events): | ||
print(basename(__file__), 'simulate_video_learning_event') | ||
""" | ||
Simulate a VideoLearningEvent, e.g. a video being opened. | ||
|
||
Should match the CSV format in https://github.com/elimu-ai/analytics/blob/main/app/src/main/java/ai/elimu/analytics/task/ExportEventsToCsvWorker.java | ||
""" | ||
|
||
id = 0 | ||
timestamp_ms = int(datetime.now().timestamp() * 1_000) | ||
|
||
# Locate a random video in the DataFrame | ||
number_of_videos = len(videos_df.index) | ||
random_video_index = random.randrange(0, number_of_videos) | ||
random_video = videos_df.loc[random_video_index] | ||
|
||
# https://github.com/elimu-ai/model/blob/main/src/main/java/ai/elimu/model/v2/enums/analytics/LearningEventType.java | ||
learning_event_types = ['VIDEO_OPENED', 'VIDEO_CLOSED_BEFORE_COMPLETED', 'VIDEO_COMPLETED'] | ||
|
||
video_learning_events.append({ | ||
'id': id, | ||
'timestamp': timestamp_ms, | ||
'android_id': android_id, | ||
'package_name': package_name, | ||
'video_id': random_video.id, | ||
'video_title': random_video.title, | ||
'learning_event_type': learning_event_types[0] | ||
}) | ||
|
||
# A `VIDEO_OPENED_EVENT` should always be followed by a `VIDEO_CLOSED_BEFORE_COMPLETED` or | ||
# a `VIDEO_COMPLETED` event. | ||
second_learning_event_type = random.choice(learning_event_types[-2:]) | ||
|
||
# Increase timestamp to simulate passage of time between the `VIDEO_OPENED` event and the | ||
# second event. Increase by a random number between 1 second and 60 seconds. | ||
timestamp_ms += random.randrange(1_000, 1_000 * 60) | ||
|
||
video_learning_events.append({ | ||
'id': id, | ||
'timestamp': timestamp_ms, | ||
'android_id': android_id, | ||
'package_name': package_name, | ||
'video_id': random_video.id, | ||
'video_title': random_video.title, | ||
'learning_event_type': random.choice(learning_event_types[-2:]) | ||
}) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider enhancing the event generation strategy.
The current implementation of the simulate_video_learning_event
function generates random events, which might not reflect realistic usage patterns. Please consider the following suggestions:
-
Use a more sophisticated algorithm or probability distribution to generate the events based on factors such as user engagement, time of day, device type, etc.
-
Consider using historical data or analytics to inform the event generation process and make it more realistic.
Tools
Ruff
57-57: Local variable
second_learning_event_type
is assigned to but never usedRemove assignment to unused variable
second_learning_event_type
(F841)
videos_csv_url = f'https://raw.githubusercontent.com/elimu-ai/webapp/main/src/main/resources/db/content_PROD/{language_code.lower()}/videos.csv' | ||
print(basename(__file__), f'videos_csv_url: {videos_csv_url}') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider handling potential errors when fetching the videos CSV file.
Fetching the videos CSV file from a GitHub URL might cause issues if the file is moved or the URL changes. Please consider the following suggestions:
-
Add error handling to catch and handle any exceptions that might occur during the CSV file download, such as network errors or invalid URLs.
-
Consider providing a fallback mechanism or default data in case the CSV file cannot be fetched successfully.
|
||
# A `VIDEO_OPENED_EVENT` should always be followed by a `VIDEO_CLOSED_BEFORE_COMPLETED` or | ||
# a `VIDEO_COMPLETED` event. | ||
second_learning_event_type = random.choice(learning_event_types[-2:]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the unused second_learning_event_type
variable.
The second_learning_event_type
variable is assigned but never used in the function. Please remove it to improve code quality.
Apply this diff to remove the unused variable:
-second_learning_event_type = random.choice(learning_event_types[-2:])
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
second_learning_event_type = random.choice(learning_event_types[-2:]) |
Tools
Ruff
57-57: Local variable
second_learning_event_type
is assigned to but never usedRemove assignment to unused variable
second_learning_event_type
(F841)
video_learning_events = [] | ||
random_number_of_events = random.randrange(0, 20) | ||
print(basename(__file__), f'random_number_of_events: {random_number_of_events}') | ||
for i in range(random_number_of_events): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename unused loop variable i
to _
.
The loop variable i
is not used within the loop body. Renaming it to _
indicates that it is intentionally unused and improves code quality.
Apply this diff to rename the variable:
-for i in range(random_number_of_events):
+for _ in range(random_number_of_events):
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
for i in range(random_number_of_events): | |
for _ in range(random_number_of_events): |
Tools
Ruff
100-100: Loop control variable
i
not used within loop bodyRename unused
i
to_i
(B007)
language_dir = f'lang-{language_code}' | ||
android_id_dir = os.path.join(language_dir, f'android-id-{android_id}') | ||
version_code_dir = os.path.join(android_id_dir, f'version-code-{analytics_version_code}') | ||
video_learning_events_dir = os.path.join(version_code_dir, f'video-learning-events') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove extraneous f
prefix in the f-string.
The f-string at line 110 does not contain any placeholders, so the f
prefix is unnecessary. Please remove it to improve code quality.
Apply this diff to remove the extraneous f
prefix:
-video_learning_events_dir = os.path.join(version_code_dir, f'video-learning-events')
+video_learning_events_dir = os.path.join(version_code_dir, 'video-learning-events')
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
video_learning_events_dir = os.path.join(version_code_dir, f'video-learning-events') | |
video_learning_events_dir = os.path.join(version_code_dir, 'video-learning-events') |
Tools
Ruff
110-110: f-string without any placeholders
Remove extraneous
f
prefix(F541)
Resolves #3
Purpose
Add simulation of learning events
Technical Details
Once per day CSVs will be randomly generated, and then uploaded to the webapp's REST API.
Testing Instructions
See
README.md
Screenshots
Summary by CodeRabbit
New Features
Documentation
Chores
pandas
as a dependency for data manipulation and analysis..gitignore
to exclude the.venv
directory.