We present PhysWikiQuiz, a Physics Exam Question Generation and Test System. The system generates physics questions from Wikidata items, given a Formula Concept Name or QID (e.g., 'speed' or 'Q11376') as user input. Each question contains comprehensive details of the physical quantities involved and randomly generated numerical values. In a subsequent step, a student can input an answer. The system then checks the correctness of the user input in terms of quantity value and unit separately. In the last stage, the system generates an explanation of the solution, including its calculation path.
PhysWikiQuiz is a web-based system, implemented with Flask, a micro web-framework written in Python. The required metadata for formulae is retrieved from Wikidata by means of SPARQL queries or Pywikibot. The system is hosted by Wikimedia at https://physwikiquiz.wmflabs.org.
Examinations are an essential part of every student's academic life. A significant portion of any physics examination is based on formula-based numerical examples. This system accelerates exam preparation by generating large numbers of novel questions from the open and continuously evolving semantic knowledge-base Wikidata. The question generation gives students study material to train for exams and lets teachers, tutors, or professors prepare exam sheets automatically, which significantly reduces their time and effort. Moreover, generating different questions and values for each student separately makes cheating more difficult.
Demo of example formula concept "speed":
Demo of example formula concept "acceleration":
You can quickly check the system hosted at https://physwikiquiz.wmflabs.org.
You can also find a video demonstration of the PhysWikiQuiz system and its evaluation here (you will be redirected to YouTube).
PhysWikiQuiz employs the open access semantic knowledge-base Wikidata to retrieve Wikimedia community-curated physics formulae with identifier properties and units using their concept name as input. A given formula is then rearranged, i.e., solved for each occurring identifier by a Computer Algebra System (CAS) to create several question families. For each rearrangement, random identifier values are generated (in a specified range). Finally, the system compares the student's answer input to a CAS computed solution for both value and unit separately and generates an explanation text.
The following diagram illustrates the fundamental workflow of the system.
In module 1, formula and identifier data are retrieved from Wikidata. In module 2, the formula is rearranged using the Python CAS SymPy. In module 3, random values are generated for the formula identifiers. In module 4, the question text is generated from the available information. In module 5, the student's answer is compared to the system's solution. Finally, module 6 generates an explanation text for the student. If a step or module cannot be executed successfully, the user is notified, e.g., 'No Wikidata item with formula found'.
Let us learn the workflow of the system with the example formula concept "acceleration".
The following figure shows an example expression tree for "acceleration". Below each of the identifier symbols a, v, and t, information about its properties (Wikidata item name and QID, unit dimension) is displayed.
We now describe the module tasks using the example.
Module 1 retrieves formula and identifier information from Wikidata properties.
In our example case:
- 'defining formula' (P2534): 'a=\frac{dv}{dt}'
- 'in defining formula' (P7235): 'a'
- 'has part' (P527) or 'calculated from' (P4934):
  - 'velocity' (Q11465)
  - 'duration' (Q2199864)
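The retrieval step can be sketched with a plain SPARQL query against the public Wikidata Query Service. This is a minimal illustration using only the Python standard library, not the project's actual retrieval code (which uses SPARQLWrapper or Pywikibot). The function names, the `User-Agent` string, and the QID validation are assumptions made for this example.

```python
import json
import urllib.parse
import urllib.request
from typing import List

# Public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_formula_query(qid: str) -> str:
    """Return a SPARQL query for the 'defining formula' (P2534) of an item."""
    if not (qid.startswith("Q") and qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata QID: {qid!r}")
    return f"SELECT ?formula WHERE {{ wd:{qid} wdt:P2534 ?formula . }}"


def fetch_defining_formulae(qid: str) -> List[str]:
    """Run the query against the endpoint (hypothetical helper, needs network)."""
    params = urllib.parse.urlencode(
        {"query": build_formula_query(qid), "format": "json"}
    )
    request = urllib.request.Request(
        f"{WDQS_ENDPOINT}?{params}",
        headers={"User-Agent": "PhysWikiQuiz-readme-example/0.1"},  # assumed value
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        bindings = json.load(response)["results"]["bindings"]
    return [binding["formula"]["value"] for binding in bindings]
```

Calling `fetch_defining_formulae("Q11376")` would return the formula strings stored for that item, if any. An empty list corresponds to the 'No Wikidata item with formula found' case.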
Module 2 generates various possible rearrangements of the retrieved formula.
In our example case: a = v/t, v = a t, and t = v/a.
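A minimal SymPy sketch of the rearrangement step follows. It assumes the derivative in the retrieved formula is treated as the plain quotient a = v/t, as in the explanation text of the example. The variable names are chosen for illustration only.

```python
import sympy as sp

# Identifiers of the example formula; positivity avoids sign case distinctions.
a, v, t = sp.symbols("a v t", positive=True)
formula = sp.Eq(a, v / t)

# Solve the same equation for each identifier in turn.
rearrangements = {str(s): sp.solve(formula, s)[0] for s in (a, v, t)}

for name, rhs in rearrangements.items():
    print(f"{name} = {rhs}")
```

Each entry of `rearrangements` then serves as the basis of one question family, with the solved-for identifier as the quantity the student has to compute.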
Module 3 generates random numerical values for the unknown identifier variables and performs the required mathematical operations in order to calculate the numerical value of the desired identifier.
In our example case: For the available formula rearrangements a = v/t, v = a t, and t = v/a, the system evaluates the respective right-hand side by performing the required multiplication or division. The result is later compared to the user input to check its correctness.
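The value generation can be sketched as follows, assuming integer values drawn uniformly from 1 to 10 and exact rational arithmetic for the result. The helper names and the fixed seed are hypothetical and only serve to keep the example reproducible.

```python
import random
from fractions import Fraction


def draw_values(symbols, rng, low=1, high=10):
    """Draw one random integer in [low, high] for each given identifier."""
    return {symbol: rng.randint(low, high) for symbol in symbols}


def acceleration_from(values):
    """Evaluate the rearrangement a = v/t exactly, without rounding."""
    return Fraction(values["v"], values["t"])


rng = random.Random(42)  # seeded only so that repeated runs agree
values = draw_values(["v", "t"], rng)
print(values, acceleration_from(values))
```

Keeping the result as a fraction means a value such as 4/5 can later be compared to the student's input without floating-point tolerance issues.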
Module 4 generates a well-structured question in natural language by using the available names, symbols, and values for the occurring formula identifiers.
In our example case: "What is the acceleration a, given velocity v = 4 m s^-1, duration t = 5 s?"
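A question text of this shape can be assembled from names, symbols, values, and units with a simple template. The function below is a hypothetical sketch of that idea, not the system's actual generator.

```python
def build_question(target, givens):
    """Build a question string.

    target: (name, symbol) of the identifier to solve for.
    givens: list of (name, symbol, value, unit) for the known identifiers.
    """
    target_name, target_symbol = target
    given_text = ", ".join(
        f"{name} {symbol} = {value} {unit}" for name, symbol, value, unit in givens
    )
    return f"What is the {target_name} {target_symbol}, given {given_text}?"


question = build_question(
    ("acceleration", "a"),
    [("velocity", "v", 4, "m s^-1"), ("duration", "t", 5, "s")],
)
print(question)
```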
Module 5 checks the solution value and unit entered by the user and displays a correctness assessment.
In our example case: If the user entered "4/5 m s^-2", both value and unit are correct. If it was "4 m s^-2", only the unit is correct. If it was "4/5 m", only the value is correct.
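The separate value and unit check can be illustrated with a small sketch. It assumes the answer is entered as a number followed by a unit string, and it compares units as whitespace-normalized text. The real system compares against a CAS-computed solution, so this is only an approximation of the behavior described above.

```python
from fractions import Fraction


def check_answer(answer, expected_value, expected_unit):
    """Return (value_ok, unit_ok) for an answer such as '4/5 m s^-2'."""
    parts = answer.split(maxsplit=1)
    if not parts:
        return (False, False)
    try:
        value_ok = Fraction(parts[0]) == expected_value
    except (ValueError, ZeroDivisionError):
        value_ok = False  # not a parsable number
    unit = " ".join(parts[1].split()) if len(parts) > 1 else ""
    return (value_ok, unit == expected_unit)


expected = Fraction(4, 5)
print(check_answer("4/5 m s^-2", expected, "m s^-2"))  # value and unit correct
print(check_answer("4 m s^-2", expected, "m s^-2"))    # only unit correct
print(check_answer("4/5 m", expected, "m s^-2"))       # only value correct
```

Because `Fraction` also parses decimal strings, an input of "0.8 m s^-2" is accepted as the same value as "4/5 m s^-2" in this sketch.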
Module 6 generates an explanation text with the formula (including source) and a calculation path for the student's understanding.
In our example case: "Solution from www.wikidata.org/wiki/Q11376 formula a = v/t with 4/5 m s^-2 = 4 m s^-1 / 5 s ."
In the following, you can see the final stage of the system, after finishing all tasks:
In principle, PhysWikiQuiz does not depend on formula rearrangements, and the workflow would be complete without them. However, they make additional question variations available. In the case of our example "speed", SymPy rearrangements allow the other variables "distance" and "duration" to be queried as well, providing additional concept questions.
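The number of questions that PhysWikiQuiz can generate is N_generated = N_identifiers * R_values ^ (N_identifiers - 1), where N_identifiers is the number of identifiers in the Formula Concept and R_values = 10 is the range for random identifier value generation (here from 1 to 10). Each of the N_identifiers identifiers can be the one asked for, and each of the remaining N_identifiers - 1 identifiers takes one of R_values values.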
This leads to the following table:
N_identifiers | N_generated |
---|---|
1 | 1 |
2 | 20 |
3 | 300 |
4 | 4000 |
5 | 50000 |
6 | 600000 |
7 | 7000000 |
8 | 80000000 |
9 | 900000000 |
10 | 10000000000 |
It is evident that for large formulae, PhysWikiQuiz can generate a tremendous number of question variations. But even for a small formula with 2 identifiers there are already 20 possibilities. On average, the formulae in the test set contain 3 identifiers, which leads to 300 potential questions per Formula Concept. Without rearrangements, the number of possibilities needs to be divided by the number of formula identifiers.
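The counts in the table follow directly from the formula and can be reproduced with a few lines of Python. The function name and the `rearrange` flag are introduced here for illustration.

```python
def n_generated(n_identifiers, r_values=10, rearrange=True):
    """Number of distinct questions for a formula with n_identifiers identifiers.

    Each of the n_identifiers - 1 given identifiers takes one of r_values
    values. With rearrangements, any identifier can be the unknown.
    """
    per_rearrangement = r_values ** (n_identifiers - 1)
    return n_identifiers * per_rearrangement if rearrange else per_rearrangement


for n in range(1, 11):
    print(n, n_generated(n))

# Without rearrangements, a formula with 3 identifiers yields 300 / 3 questions.
print(n_generated(3, rearrange=False))
```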
We evaluated the PhysWikiQuiz system in detail at each individual stage of its workflow. We carried out unit tests for the different modules and an integration test to assess the overall performance on a Formula Concept benchmark dataset.
The open-access platform MathMLben stores and displays a benchmark of semantically annotated mathematical formulae. They were extracted from Wikipedia, the arXiv, and the Digital Library of Mathematical Functions (DLMF) and augmented by Wikidata markup. The benchmark can be used to evaluate a variety of MathIR tasks, such as the automatic conversion between different CAS or Mathematical Question Answering (MathQA). In our PhysWikiQuiz evaluation, we employ a selection of 66 formulae (GoldID 310-375) from the MathMLben gold standard. The Formula Concepts were extracted from Wikipedia articles using the formula and identifier annotation recommendation system AnnoMathTeX.
The following table shows the PhysWikiQuiz system performance on the selected Formula Concept benchmark (MathMLben GoldID 310-375). For each concept, e.g., 'speed', the results of the different modules are displayed. At the bottom, the total percentage of the available functionality is provided.
All result tables and evaluation logs can be found here.
A deployed version of the system is available online, hosted by Wikimedia at https://physwikiquiz.wmflabs.org. For a local installation, the system mainly depends on the following Python packages (for a full list, see requirements.txt).
Flask version 0.12.2 is used as the web framework, serving as the interface between the frontend and the backend.
pip3 install Flask
Requests version 2.26.0 is an HTTP library designed to make HTTP requests simpler and more human-friendly.
pip3 install requests
Pywikibot version 5.6.0 is used to extract the formula concept data from Wikidata: https://tools.wmflabs.org/pywikibot
pip3 install pywikibot
SPARQLWrapper version 1.8.2 is a simple Python wrapper around a SPARQL service to remotely execute queries. It helps to create the query invocation and convert the result into a more manageable format.
pip3 install sparqlwrapper
The Computer Algebra System (CAS) SymPy version 1.7.1 is used in the calculation module to compute result values from a retrieved formula and inputs for the variables.
apt-get install python3-sympy
LaTeX2Sympy version 1.6.2 is used to convert variants of LaTeX formula strings to a SymPy equivalent form.
- ANTLR is used to generate the parser:
sudo apt-get install antlr4
- Download latex2sympy from https://github.com/augustt198/latex2sympy
git clone https://github.com/ag-gipp/PhysWikiQuiz.git
Example query using Wikidata item name ('speed'):
https://physwikiquiz.wmflabs.org/api/v1?name=speed
Example query using Wikidata item QID ('Q124164'):
https://physwikiquiz.wmflabs.org/api/v1?qid=Q124164
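The two query forms above can be built programmatically. The helper below is a small sketch using only the standard library. That exactly one of `name` or `qid` is passed is an assumption of this example, and the response format is not specified here, so the sketch stops at constructing the URL.

```python
from urllib.parse import urlencode

API_BASE = "https://physwikiquiz.wmflabs.org/api/v1"


def api_url(name=None, qid=None):
    """Build an API query URL from either an item name or a QID."""
    if (name is None) == (qid is None):
        raise ValueError("pass exactly one of name or qid")
    params = {"name": name} if name is not None else {"qid": qid}
    return f"{API_BASE}?{urlencode(params)}"


print(api_url(name="speed"))
print(api_url(qid="Q124164"))
```

Using `urlencode` also takes care of concept names that contain spaces or other characters that need escaping in a query string.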
- Wikidata: A SPARQL query to the Wikidata Query Services API retrieves lists or properties of Wikidata items
- VMEXT: LaTeX to SymPy formula conversion is done via the 'LaCASt' translator
This project is licensed under the Apache License 2.0.
We thank the Wikimedia Foundation for hosting our web-based system.