General Setup

  • Simple setup:
    • On macOS, you can run the setup script to install all requirements (and skip the following section):
    • chmod +x setup_mac.sh
    • ./setup_mac.sh
  • Requirements (a quick sanity check of the finished setup appears after this list):
    • Clone the external submodules: git submodule update --init --recursive
    • Set the Python version to 3.10: pyenv global 3.10
    • Install the Python requirements using pip:
      • python -m venv venv
      • source venv/bin/activate
      • pip install -r requirements.txt
    • If on Mac, download and install VideoSnap (a macOS command-line tool for recording video and audio from any attached capture device):
      • wget https://github.com/matthutchinson/videosnap/releases/download/v0.0.9/videosnap-0.0.9.pkg
      • sudo installer -pkg videosnap-0.0.9.pkg -target /
  • Contents:
    • The audio and video capture module is located in the capture directory
    • AV synchronisation detection using Synchformer is located in the av_sync_detection directory
    • Stutter detection using MaxVQA and Essentia is located in the stutter_detection directory
    • Video quality assessment using Google UVQ is located in the video_quality_assessment directory
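
To confirm the environment before moving on, you can run a quick sanity check (the -l flag for listing attached capture devices is taken from VideoSnap's own documentation; adjust if your installed version differs):

python --version    # expect Python 3.10.x from the pyenv/venv steps above
videosnap -l        # macOS only: list the capture devices VideoSnap can see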

AV Capture System

  • Run setup mode to check input audio/video sources: python capture/capture.py --setup-mode
  • Run the capture pipeline to generate AV files: python capture/capture.py -a AUDIO_SOURCE -v VIDEO_SOURCE
  • This captures audio and video in 10-second segments and saves them to the local directory output/capture/
  • Halt capture by interrupting execution with CTRL+C

General CLI

usage: capture.py [-h] [-m] [-na] [-nv] [-s] [-a AUDIO] [-v VIDEO] [-o OUTPUT_PATH]

Capture audio and video streams from a camera/microphone and split into segments for processing.

options:
  -h, --help            show this help message and exit
  -m, --setup-mode      display video to be captured in setup mode with no capture/processing
  -na, --no-audio       do not include audio in captured segments
  -nv, --no-video       do not include video in captured segments
  -s, --split-av-out    output audio and video in separate files (WAV and MP4)
  -a AUDIO, --audio AUDIO
                        index of input audio device
  -v VIDEO, --video VIDEO
                        index of input video device
  -o OUTPUT_PATH, --output-path OUTPUT_PATH
                        directory to output captured video segments to
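
For example, once setup mode has reported the device indices, a typical capture run might look like the following (the indices 0 and 1 are placeholders; substitute the indices listed for your own devices):

python capture/capture.py -a 0 -v 1 -s -o output/capture/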

AV Synchronisation Detection

Complete Detection System

  • The complete build of the AV sync detection system uses Synchformer to predict AV offsets, as this was found to be the most accurate model during experimentation.
  • Detection can be run over a single video file or a directory of files.
  • You can also enable streaming mode, which continuously checks a directory for files and processes them as they are added. Used in conjunction with the capture system, this performs AV sync detection in real time (a minimal sketch of this polling pattern appears after this list).
  • Run inference on static files at PATH: python AVSyncDetection.py PATH --plot
  • Run in streaming mode on captured video segments: python AVSyncDetection.py ../output/capture/segments/ -sp
  • If running on an Apple Silicon Mac: python AVSyncDetection.py PATH -p --device mps
  • If running on a CUDA-capable GPU: python AVSyncDetection.py PATH -p --device cuda
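
The following is a minimal sketch of the polling pattern behind streaming mode, not the repository's implementation; the process callback is a hypothetical stand-in for whatever runs the sync model on each segment:

import time
from pathlib import Path

def watch_segments(directory, process, poll_interval=2.0):
    """Poll `directory` and hand each newly added AV segment to `process` once."""
    seen = set()
    while True:
        for segment in sorted(Path(directory).glob("*.mp4")):
            if segment not in seen:
                seen.add(segment)
                process(segment)  # e.g. score this segment with the sync model
        time.sleep(poll_interval)

# Hypothetical usage: print each new segment that would be scored.
# watch_segments("../output/capture/segments/", lambda p: print(f"new segment: {p}"))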

General CLI

usage: AVSyncDetection.py [-h] [-p] [-s] [-i] [-d DEVICE] [-t TRUE_OFFSET] directory

Run Synchformer AV sync offset detection model over local AV segments.

positional arguments:
  directory

options:
  -h, --help            show this help message and exit
  -p, --plot            plot sync predictions as generated by model
  -s, --streaming       real-time detection of streamed input by continuously locating & processing video segments
  -i, --time-indexed-files
                        label output predictions with available timestamps of input video segments
  -d DEVICE, --device DEVICE
                        hardware device to run model on
  -t TRUE_OFFSET, --true-offset TRUE_OFFSET
                        known true av offset of the input video
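
For example, to score a clip against a known ground-truth offset on Apple Silicon (the 0.2 value is illustrative and assumes the offset is given in seconds):

python AVSyncDetection.py PATH -p --device mps --true-offset 0.2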

Stutter Detection

Installing

Installing Video Stutter Module

  1. Install the ExplainableVQA dependencies:
git submodule update --init --recursive
pip install -r ExplainableVQA/requirements.txt
  2. Install open_clip:

On Mac:

sed -i "" "92s/return x\[0\]/return x/" ExplainableVQA/open_clip/src/open_clip/modified_resnet.py
pip install -e ExplainableVQA/open_clip

On Linux:

sed -i '92s/return x\[0\]/return x/' ExplainableVQA/open_clip/src/open_clip/modified_resnet.py
pip install -e ExplainableVQA/open_clip
  3. Install DOVER:

On Mac, first run this before continuing: sed -i "" "4s/decord/eva-decord/" ExplainableVQA/DOVER/requirements.txt

pip install -e ExplainableVQA/DOVER
mkdir ExplainableVQA/DOVER/pretrained_weights
wget https://github.com/VQAssessment/DOVER/releases/download/v0.1.0/DOVER.pth -P ExplainableVQA/DOVER/pretrained_weights/
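
As a quick check that the installation completed, confirm the pretrained weights landed in the directory used by the wget command above:

ls -lh ExplainableVQA/DOVER/pretrained_weights/DOVER.pth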

Running

  • Run inference on a directory or a video/audio file at PATH: python StutterDetection.py PATH
  • This outputs a plot of the "motion fluency" over the course of the video (low fluency may indicate stuttering events) and/or a plot of the audio stutter times detected in the waveform.

General CLI

usage: StutterDetection.py [-h] [-na] [-nv] [-c] [-t] [-i] [-f FRAMES] [-e EPOCHS]
                           [-d DEVICE]
                           directory

Run audio and video stutter detection algorithms over local AV segments.

positional arguments:
  directory

options:
  -h, --help            show this help message and exit
  -na, --no-audio       Do not perform stutter detection on the audio track
  -nv, --no-video       Do not perform stutter detection on the video track
  -c, --clean-video     Testing on clean stutter-free videos (for experimentation)
  -t, --true-timestamps
                        Plot known stutter times on the output graph, specified in
                        'true-stutter-timestamps.json'
  -i, --time-indexed-files
                        Label batch of detections over video segments with their
                        time range (from filename)
  -f FRAMES, --frames FRAMES
                        Number of frames to downsample video to
  -e EPOCHS, --epochs EPOCHS
                        Number of times to repeat inference per video
  -d DEVICE, --device DEVICE
                        Specify processing hardware
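
For example, to run both detectors over time-indexed capture segments on an NVIDIA GPU (the frame count of 64 is illustrative; omit -f to keep the default downsampling):

python StutterDetection.py ../output/capture/segments/ -i -f 64 -d cuda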