docs: adds How To do xAPI Transforms #114

Merged
merged 5 commits into from
Feb 8, 2024
107 changes: 86 additions & 21 deletions docs/concepts/xapi_concepts.rst
Expand Up @@ -4,47 +4,112 @@ xAPI
****

Introduction
###################
############

xAPI, also known as the Experience API or Tin Can API, is a specification for capturing
xAPI, also known as the `Experience API`_ or Tin Can API, is a specification for capturing
and tracking learning experiences in a wide range of contexts. It is an e-learning
standard that allows organizations to gather data about learner activities and interactions
across various platforms and devices. xAPI provides a flexible and comprehensive
framework for collecting, storing, and analyzing learning data, enabling a deeper
understanding of learners' experiences and performance.

Components of xAPI
###################
Together, these components form the foundation of the xAPI ecosystem, enabling
the tracking and analysis of learner activities and experiences beyond traditional
learning management systems. The flexibility and interoperability of xAPI make it
an ideal solution for organizations looking to gain deeper insights into their
learners' performance and behavior.

The xAPI specification consists of three key components:
xAPI Schema
###########

.. _actor_concept:
While the components of an :ref:`xapi_statement` are well specified, the actual detailed format or "schema" used for
each type of xAPI event is up to us to decide, as users of xAPI.

Actor
~~~~~
In xAPI, an actor represents an entity that interacts with the learning system.
It can be a learner, instructor, system administrator, or any other agent
involved in the learning process. Actors are identified using a unique identifier,
such as an email address or username.
See :ref:`xapi_transforms` for information on how the schemas are built.

.. _xapi_statement:

xAPI Statement
##############

Statement
~~~~~~~~~
A statement, also referred to as an "xAPI statement" or "verb-object statement,"
is the core building block of xAPI. It captures a specific learning activity
or experience in a structured format. A statement consists of three essential
elements: the actor who performed the action, the verb that describes the action,
and the object representing the target or learning experience.

An xAPI statement can be expressed briefly as:

**Actor** **Verbed** an **Object** (within **Context**).
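
As a minimal sketch of that shape, the statement below is written as a plain Python dictionary. The identifiers are illustrative placeholders, and the ``summarize`` helper is hypothetical, added only to show how the four parts map onto the sentence above.

```python
# Illustrative statement; all IDs and names are placeholder values.
statement = {
    "actor": {
        "objectType": "Agent",
        "account": {
            "homePage": "https://lms.url",
            "name": "32e08e30-f8ae-4ce2-94a8-c2bfe38a70cb",
        },
    },
    "verb": {
        "id": "http://adlnet.gov/expapi/verbs/progressed",
        "display": {"en": "progressed"},
    },
    "object": {
        "objectType": "Activity",
        "id": "https://lms.url/block/example-block",
    },
    "context": {
        "contextActivities": {
            "parent": [{"id": "https://lms.url/course/course-v1:edX+DemoX+Demo_Course"}]
        }
    },
}


def summarize(stmt):
    """Hypothetical helper: render a statement as 'actor verb object (within context)'."""
    actor = stmt["actor"]["account"]["name"]
    verb = stmt["verb"]["display"]["en"]
    obj = stmt["object"]["id"]
    summary = f"{actor} {verb} {obj}"
    parents = stmt.get("context", {}).get("contextActivities", {}).get("parent", [])
    if parents:
        summary += f" (within {parents[0]['id']})"
    return summary


print(summarize(statement))
```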

ID
~~

Each xAPI event emitted by Open edX will contain a `Universally Unique Identifier`_ (UUID)
which is generated from the event actor, timestamp, verb, and child event ID (if present).
This ID is used to de-duplicate events, for example when re-processing old tracking logs.
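
To illustrate why a derived ID supports de-duplication, the sketch below builds a name-based (version 5) UUID from the same kinds of inputs. The ``statement_id`` helper, the namespace, and the way fields are joined are assumptions for illustration only; the actual generation logic lives in event-routing-backends and may differ.

```python
import uuid


def statement_id(actor, timestamp, verb, child_event_id=None):
    """Hypothetical helper: derive a stable UUID from event fields."""
    parts = [actor, timestamp, verb]
    if child_event_id is not None:
        parts.append(child_event_id)
    # uuid5 is deterministic: the same namespace and name always give the same UUID.
    return uuid.uuid5(uuid.NAMESPACE_URL, "-".join(parts))


actor = "32e08e30-f8ae-4ce2-94a8-c2bfe38a70cb"
verb = "http://adlnet.gov/expapi/verbs/progressed"

first = statement_id(actor, "2024-02-01T12:00:00+00:00", verb)
replayed = statement_id(actor, "2024-02-01T12:00:00+00:00", verb)
later = statement_id(actor, "2024-02-01T12:05:00+00:00", verb)

# Re-processing the same event yields the same ID, so a store can skip it.
seen = set()
for candidate in (first, replayed, later):
    if candidate in seen:
        print("duplicate skipped:", candidate)
    else:
        seen.add(candidate)
```

Because the ID depends only on the event's own fields, replaying an old tracking log produces IDs that are already known, while a genuinely new event produces a new one.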

.. _actor_concept:

Actor
~~~~~
In xAPI, an actor represents an entity that interacts with the learning system.
It can be a learner, instructor, system administrator, or any other agent
involved in the learning process.

Actors in Aspects are uniquely identified using a platform-generated ``external_id`` which

Contributor: Might want to link in this doc: changing_actor_identifier somewhere here as it has some important specifics.

helps keep event data anonymous. Aspects can link this ``external_id`` back to a username or
email address where individual identities are needed.

Verb
~~~~
Verbs in xAPI are URIs paired with short, translated, human-friendly labels. Verbs

Contributor: nit: Couple of typos in this sentence

Contributor Author (@pomegranited, Feb 1, 2024)

should be in the past tense, denoting that the action has already been performed.

There are many verbs used in Aspects events, for example: completed, progressed,
registered, unregistered, passed, failed, voted, asked, reported, attempted,
earned, etc. Each verb has its own specific purpose and target Object.

Object
~~~~~~
xAPI objects consist of a unique identifier and a "definition" stanza, which describes the
object of the actor's verb.

Most objects in Aspects xAPI events are of type "Course", "Discussion", or "Activity". For
example, learners may enrol or unenrol from a Course, post in a Discussion, or complete an
assignment Activity.

Context
~~~~~~~

xAPI events generally occur within some learning context, and so this stanza provides a
place to record important meta information about the event. Context stanzas may contain
"extension" fields which allow arbitrary data to be attached to an event.

Aspects uses context to provide enclosing course data where the object is an activity or
discussion, or "context activities" like a problem block which wraps several questions.
Aspects also uses an "extension" to store the version of the xAPI transforms that were
applied to the event.

Learning Record Store (LRS)
~~~~~~~~~~~~~~~~~~~~~~~~~~~
###########################

The Learning Record Store serves as the repository for storing and retrieving xAPI
statements. It is a database or storage system that receives and securely stores
the statements generated by learning activities. The LRS enables the collection
and organization of learning data from various sources, allowing for robust
analysis and reporting of learner experiences.
analysis and reporting of learner experiences. An LRS may also be used to recommend
content to learners based on previous performance or interest.

Together, these components form the foundation of the xAPI ecosystem, enabling
the tracking and analysis of learner activities and experiences beyond traditional
learning management systems. The flexibility and interoperability of xAPI make it
an ideal solution for organizations looking to gain deeper insights into their
learners' performance and behavior.
References
##########

* `OEP-26`_: xAPI Real-time events
* `Experience API`_: specification for xAPI
* `xAPI Statements 101`_: guide to xAPI statements and their components


.. _OEP-26: https://open-edx-proposals.readthedocs.io/en/latest/architectural-decisions/oep-0026/xapi-realtime-events.html
.. _Experience API: https://xapi.com/specification/
.. _xAPI Statements 101: https://xapi.com/statements-101/
.. _Universally Unique Identifier: https://en.wikipedia.org/wiki/Universally_unique_identifier
2 changes: 1 addition & 1 deletion docs/how-tos/dbt_extensions.rst
Expand Up @@ -62,7 +62,7 @@ Update the following Tutor variables to use your package instead of the Aspects
Once your package is configured in Tutor, you can run dbt commands directly on your deployment; run ``tutor [dev|local] do dbt --help`` for details.

References
**********
##########

* `Building dbt packages`_: dbt's guide to building packages
* `Writing data tests`_: dbt's guide to writing package tests
3 changes: 2 additions & 1 deletion docs/how-tos/index.rst
Expand Up @@ -15,5 +15,6 @@ How-Tos
Superset custom roles <superset_roles>
Clickhouse extra SQL <clickhouse_sql>
Connect to external Clickhouse database <remote_clickhouse>
Extending DBT <dbt_extensions>
Extending dbt <dbt_extensions>
Run Aspects in a ClickHouse cluster <clickhouse_cluster>
xAPI Transforms <xapi_transforms>
230 changes: 230 additions & 0 deletions docs/how-tos/xapi_transforms.rst
@@ -0,0 +1,230 @@
.. _xapi_transforms:

xAPI Transforms
***************

Aspects converts raw Open edX tracking event JSON into :ref:`xapi-concepts` for storage and analysis. This conversion process is called "transformation".

Events emitted by ``openedx`` packages are transformed by `event-routing-backends`_ (ERB), a Django plugin which Aspects installs on Open edX.

Transformers for events emitted by non-openedx packages should be stored close to the code that produces the events, and registered using decorators provided by `event-routing-backends`_. We
will use OpenCraft's `completion aggregator`_ as the example for this tutorial, building on events emitted by `pr#173`_.

xAPI Schema
###########

To decide on an event's xAPI schema, consider any similar events already being transformed, and what event data will be useful for analysis or visualization in Aspects.

The schema for a new event must uniquely describe that event. However, it's also important to be as consistent as possible with existing event schemas so that the event can be processed and
used in Aspects in a similar way to other events.

As a reminder, an xAPI statement can be expressed as:

**Actor** **Verbed** an **Object** (within **Context**).

Actor
~~~~~

For most events, the default Actor transform is enough:

.. code-block:: json

    {
        "objectType": "Agent",
        "account": {
            "homePage": "https://lms.url",
            "name": "32e08e30-f8ae-4ce2-94a8-c2bfe38a70cb"
        }
    }


Here, the actor's `external ID`_ (of type=xapi) is used as the ``name`` field. This external ID can be matched against PII data to access the actor's name, email, and other profile details.

Verb
~~~~

The verb is the primary differentiator between different xAPI events in Open edX. Select a verb that describes the event as concisely and accurately as possible, so that future, similar
events can still be discerned.

Where possible, reuse verbs from one of the registered `xAPI profiles`_. See `ERB's verb list`_ for verbs currently in use in Aspects.

Contributor: We should probably discourage people from reusing verbs that are already in use in ERB whenever possible, as we do a lot of processing based on the verb and it will potentially lead to confusion and maybe performance issues to overload terms.

Contributor: Talking about it, we should also update our filters to not only filter events based on the verb but also the object type and even the context activities. Mainly to allow developers to use the same verb and don't trigger any error on the downstream tables.

For the completion-aggregator and with internal libraries @andrey-canon had a lot of issues with the completion events because he was using the same verb

Contributor Author:

> Talking about it, we should also update our filters to not only filter events based on the verb but also the object type and even the context activities.

That's a good point.

For example, the completion aggregator will emit events when progress has been made on a
unit/section/subsection/course, so we could use the verb `progressed`_.

.. code-block:: json

    {
        "id": "http://adlnet.gov/expapi/verbs/progressed",
        "display": {
            "en": "progressed"
        }
    }

Object
~~~~~~

Most events in Open edX are Activities, which look like this:

.. code-block:: json

    {
        "id": "https://lms.url/block/block-v1:edX+DemoX+Demo_Course+type@video+block@0b9e39477cf34507a7a48f74be381fdd",
        "definition": {
            "type": "block",
            "name": {
                "en": "Welcome!"
            }
        }
    }

* ``id`` should uniquely identify the activity
* ``type`` should describe the type of activity, e.g. "unit" or "course"
* ``name`` should provide human-friendly display name(s) for the activity
* ``extensions`` can be added to provide any extra data important to the activity

Context
~~~~~~~

Most events in Open edX happen on an element within a course, like a block or a discussion forum, and so the "context activity" for the event is the course.

Aspects also uses "extensions" to record extra information, like the transformer code version and the actor's session ID (if found in the event). These "extensions" can be used to
communicate any high-level information that is important for the event record.

For example:

.. code-block:: json

    {
        "contextActivities": {
            "parent": [
                {
                    "id": "https://lms.url/course/course-v1:edX+DemoX+Demo_Course",
                    "objectType": "Activity",
                    "definition": {
                        "type": "course",
                        "name": {
                            "en-US": "Demonstration Course"
                        }
                    }
                }
            ]
        },
        "extensions": {
            "https://w3id.org/xapi/openedx/extension/transformer-version": "7.2.0",
            "https://w3id.org/xapi/openedx/extensions/session-id": "993110e9c27848a545da74a74114158d"
        }
    }
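
A context stanza of this shape could be assembled as in the sketch below. The ``build_context`` helper and its parameters are hypothetical and use plain dictionaries; they are not part of event-routing-backends, whose default context implementation already covers the common case. The extension keys are copied from the example above.

```python
TRANSFORMER_VERSION_KEY = "https://w3id.org/xapi/openedx/extension/transformer-version"
SESSION_ID_KEY = "https://w3id.org/xapi/openedx/extensions/session-id"


def build_context(course_iri, course_name, transformer_version, session_id=None):
    """Hypothetical helper: build a context stanza with the course as parent."""
    extensions = {TRANSFORMER_VERSION_KEY: transformer_version}
    # The session ID is only recorded when the original event carried one.
    if session_id:
        extensions[SESSION_ID_KEY] = session_id
    return {
        "contextActivities": {
            "parent": [
                {
                    "id": course_iri,
                    "objectType": "Activity",
                    "definition": {
                        "type": "course",
                        "name": {"en-US": course_name},
                    },
                }
            ]
        },
        "extensions": extensions,
    }


context = build_context(
    "https://lms.url/course/course-v1:edX+DemoX+Demo_Course",
    "Demonstration Course",
    "7.2.0",
)
print(sorted(context["extensions"]))
```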


Result
~~~~~~

Some Open edX events use a "result" stanza that communicates information about the effect that this event had. For example, "problem check" events record whether the problem was answered
correctly, and what score the actor received.

For these completion "progressed" events, we would want to store:

.. code-block:: json

    {
        "completion": false,
        "score": {
            "scaled": 0.45
        }
    }
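
The mapping from a completion fraction to this stanza can be sketched as below. The ``progress_result`` helper is hypothetical; it assumes the fraction may arrive as a string or a number, and that a value of exactly 1.0 means the unit, section, subsection, or course is complete.

```python
def progress_result(percent):
    """Hypothetical helper: build a result stanza from a completion fraction."""
    value = float(percent)  # accepts "0.45" as well as 0.45
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"percent must be between 0 and 1, got {value}")
    return {
        "completion": value == 1.0,
        "score": {"scaled": value},
    }


print(progress_result("0.45"))
print(progress_result(1))
```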


xAPI Transformer Registry
#########################

Once the xAPI event schema is settled, the implementation should be pretty straightforward using
`event-routing-backends`_ and `TinCan`_.

#. Create a new transformer class that extends `XApiTransformer`_.
#. Implement the ``get_verb`` method, returning your chosen verb URI and its short name.
#. Implement any other custom components by overriding their ``get`` method.

   For example, to customize the context activities for your event, override ``get_context_activities``.

   Use the built-in transformer method ``get_data`` to parse and return data from the original tracking event.
#. Register your transformer class using the registry decorator.

   Use the raw tracking event's ``type`` as the parameter to ensure this class is used to transform that type of event.


.. code-block:: python

    from tincan import Activity, ActivityDefinition, LanguageMap, Result, Verb
    from event_routing_backends.processors.xapi.registry import XApiTransformersRegistry
    from event_routing_backends.processors.xapi.transformer import XApiTransformer


    class ProgressTransformerBase(XApiTransformer):
        """
        Transformer for completion-aggregated "progress" events.

        Uses the default implementations for `get_actor` and `get_context`.

        Expects at least these fields to be present in the original tracking event:

        {
            "data": {
                "block_id": "block-v1:...",  # block usage key
                "percent": "0.123",  # percent completed, > 0, < 1.0
            }
        }
        """
        # Set by each registered subclass to the xAPI activity type IRI.
        object_type = None
        # Ask the base transformer to include the output of get_result().
        additional_fields = ('result', )

        def get_verb(self) -> Verb:
            return Verb(
                id="http://adlnet.gov/expapi/verbs/progressed",
                display=LanguageMap({"en": "progressed"}),
            )

        def get_object(self) -> Activity:
            return Activity(
                id=self.get_object_iri("xblock", self.get_data("data.block_id")),
                definition=ActivityDefinition(
                    type=self.object_type,
                )
            )

        def get_result(self) -> Result:
            # The event may carry percent as a string, so convert before comparing.
            percent = float(self.get_data("data.percent") or 0)
            return Result(
                completion=percent == 1.0,
                score={
                    "scaled": percent,
                },
            )


    # Register subclasses for each individual object type

    @XApiTransformersRegistry.register("edx.completion_aggregator.progress.chapter")
    @XApiTransformersRegistry.register("edx.completion_aggregator.progress.sequential")
    @XApiTransformersRegistry.register("edx.completion_aggregator.progress.vertical")
    class ModuleProgressTransformer(ProgressTransformerBase):
        object_type = "http://adlnet.gov/expapi/activities/module"


    @XApiTransformersRegistry.register("edx.completion_aggregator.progress.course")
    class CourseProgressTransformer(ProgressTransformerBase):
        object_type = "http://adlnet.gov/expapi/activities/course"
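
To show how decorator-based registration routes an event to a transformer class, here is a self-contained sketch of the pattern. ``SketchRegistry`` and the ``Sketch*`` classes are simplified stand-ins written for this example, not the event-routing-backends implementation, and looking the class up by the event's ``name`` field is an assumption.

```python
class SketchRegistry:
    """Simplified stand-in for a transformer registry; not ERB's actual code."""

    mapping = {}

    @classmethod
    def register(cls, event_name):
        """Return a class decorator that maps event_name to the decorated class."""
        def decorator(transformer_cls):
            cls.mapping[event_name] = transformer_cls
            return transformer_cls  # returning the class allows stacked decorators
        return decorator

    @classmethod
    def get_transformer(cls, event):
        """Instantiate the transformer registered for this event."""
        return cls.mapping[event["name"]](event)


class SketchProgressBase:
    object_type = None

    def __init__(self, event):
        self.event = event

    def transform(self):
        return {
            "object_type": self.object_type,
            "block_id": self.event["data"]["block_id"],
        }


@SketchRegistry.register("edx.completion_aggregator.progress.chapter")
@SketchRegistry.register("edx.completion_aggregator.progress.sequential")
class SketchModuleProgress(SketchProgressBase):
    object_type = "http://adlnet.gov/expapi/activities/module"


@SketchRegistry.register("edx.completion_aggregator.progress.course")
class SketchCourseProgress(SketchProgressBase):
    object_type = "http://adlnet.gov/expapi/activities/course"


event = {
    "name": "edx.completion_aggregator.progress.course",
    "data": {"block_id": "block-v1:edX+DemoX+Demo_Course+type@course+block@course"},
}
transformer = SketchRegistry.get_transformer(event)
print(transformer.transform()["object_type"])
```

Stacking several ``register`` decorators on one class lets a single transformer handle multiple event types, which is why the base class leaves ``object_type`` for each subclass to set.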


References
##########

* `event-routing-backends`_: Django plugin that receives tracking events and transforms them into xAPI
* `completion aggregator`_: OpenCraft's plugin which accumulates block completion up to the enclosing unit/section/subsection/course.
* `xAPI profiles`_: registry of xAPI schemas


.. _completion aggregator: https://github.com/open-craft/openedx-completion-aggregator
.. _event-routing-backends: https://github.com/openedx/event-routing-backends
.. _ERB's verb list: https://github.com/openedx/event-routing-backends/blob/master/event_routing_backends/processors/xapi/constants.py
.. _external ID: https://github.com/openedx/edx-platform/blob/master/openedx/core/djangoapps/external_user_ids/docs/decisions/0001-externalid.rst
.. _pr#173: https://github.com/open-craft/openedx-completion-aggregator/pull/173
.. _progressed: http://adlnet.gov/expapi/verbs/progressed
.. _TinCan: https://github.com/RusticiSoftware/TinCanPython
.. _xAPI profiles: https://profiles.adlnet.gov/
.. _XApiTransformer: https://github.com/nelc/event-routing-backends/blob/master/event_routing_backends/processors/xapi/transformer.py#L27