360° Camera Scene Understanding for the Visually Impaired (VI)
Despite the popular misconception that deep learning has solved the scene recognition problem, the quality of machine-generated captions for real-world scenes tends to be quite poor. In the context of live captioning for visually impaired users, this problem is exacerbated by the limited field of view of conventional smartphone cameras and camera-equipped eyewear. The result is a narrow view of the world that often misses details important to understanding the user's environment.
This project will make use of a head-mounted 360° camera and a combination of human (crowdsourced) labeling and deep learning to train systems that provide more relevant and accurate scene descriptions, in particular for navigation in indoor environments and guidance for intersection crossing, improving upon a system recently developed by our lab for this purpose. Results will be compared to those obtained using camera input from a smartphone held in a neck-worn lanyard. Source: http://srl.mcgill.ca/projects/
The goal of this project is to help VI people cross an intersection through the use of crowdsourcing and a 360° camera. The visual data acquired from this camera will be compared to that from a smartphone attached to a lanyard. Using machine learning and computer vision algorithms, the collected data will be used to predict the correct crossing direction and to determine the distance needed to cross the intersection.
The motivation for this project is to help the VI cross intersections safely. When crossing an intersection, veering outside the white lines can cause accidents and result in injury or death. Professor Jeremy Cooperstock's lab has developed a method that uses a conventional smartphone camera to detect the white lines of a crossing. However, the camera's limited field of view tends to miss relevant information needed to understand the surrounding environment. The wider field of view of a 360° camera can capture this missing information and help determine a safer route for the VI.
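The lab's actual detection pipeline is not part of this repository; as a rough illustration only, the minimal OpenCV sketch below shows one common way to isolate white crosswalk stripes in a single frame. The file names and threshold values are illustrative assumptions, not values from this project.

```python
import cv2
import numpy as np

# Load one video frame (illustrative file name).
frame = cv2.imread("frame.jpg")
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Keep only bright, near-white pixels, where crosswalk paint usually lies.
_, white_mask = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY)

# Edge detection followed by a probabilistic Hough transform to recover
# the long straight segments that form the stripes.
edges = cv2.Canny(white_mask, 50, 150)
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 50,
                        minLineLength=80, maxLineGap=20)

# Draw the detected segments back onto the frame for inspection.
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        cv2.line(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
cv2.imwrite("frame_lines.jpg", frame)
```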
Our primary objective is to provide feedback that lets VI people walk across an intersection successfully using their smartphone. If we achieve this, we would like to expand the project scope to additional scenarios, including traffic-light state detection, obstacle detection, and guidance for doorways and stairwells.
An industrial partner will lend the 360° camera hardware. Any intellectual property developed is owned by Adeeb Ibne Amjad, Stefano Commodari, Hakim Amarouche and James Tang.
Dr. Jeremy Cooperstock, Centre for Intelligent Machines, McGill University, jer at cim.mcgill.ca
The original dataset is published for non-commercial purposes. You may use this dataset for research, but you must credit the Shared Reality Lab at the McGill Centre for Intelligent Machines (CIM), http://srl.mcgill.ca/, in your publication. Data collection by Adeeb Ibne Amjad, Stefano Commodari, Hakim Amarouche and James Tang.
File link | Size |
---|---|
https://drive.google.com/open?id=1yaG_2NSW768V_njTIiE01Sc9i23iQW9B | ~ 962.9 MB |
Uncompressed video footage of intersection crossings, with each intersection contained in one folder. The video files PRIM0001.mp4, PRIM0002.mp4, PRIM0003.mp4 and PRIM0004.mp4 in each folder are the same crossing captured from the four lenses installed on the glasses (a minimal loading sketch follows the file tables below).
File link | Size |
---|---|
https://drive.google.com/open?id=1nWGju1AHbJoYvNZnnofRb8Y7jbVBeBCE | ~ 15.00 GB |
File type | Description |
---|---|
.THM | Video file thumbnail |
.MP4 | Original video file of intersection crossing captured at 30 FPS |
.BIN | Lens configuration binary file |
.CFG | Property file containing the configuration properties of the lens |
Video footage compressed into ZIP files.
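As a convenience, here is a minimal sketch of how the uncompressed footage could be iterated once downloaded and extracted locally. The root path `intersections/` is an assumption; the PRIM000N.mp4 naming follows the description above.

```python
import os
import cv2

root = "intersections/"  # assumed local extraction path
for crossing in sorted(os.listdir(root)):
    folder = os.path.join(root, crossing)
    if not os.path.isdir(folder):
        continue
    # Each folder holds the same crossing captured by the four lenses.
    for lens in range(1, 5):
        path = os.path.join(folder, f"PRIM{lens:04d}.mp4")
        cap = cv2.VideoCapture(path)
        ok, frame = cap.read()  # read the first frame from this lens
        if ok:
            print(crossing, lens, frame.shape)
        cap.release()
```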
Folder Name | Description |
---|---|
ICA | Experimental code using Independent Component Analysis |
camera-data | Images captured using a conventional phone camera |
data_processing | Data preprocessing pipeline written in Python |
orbi_processed | Stitched images from frames captured by the Orbi lenses |
training | Code for training the CNN model and the saved model |
model_conversion | Code used to convert the saved model for use on mobile devices (see the sketch after this table) |
graphs | Graph resources for presentation |
report_resource | Resources for report |
orbi_sample | Sample images from the Orbi camera for demonstration purposes |
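As a hedged illustration of the model_conversion step, the sketch below converts a trained model to TensorFlow Lite for mobile use, assuming the CNN was exported as a TensorFlow SavedModel. The paths are assumptions; see model_conversion for the code actually used in this project.

```python
import tensorflow as tf

# Assumed export location of the trained CNN (SavedModel format).
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model/")
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # optional weight quantization

tflite_model = converter.convert()
with open("model.tflite", "wb") as f:
    f.write(tflite_model)  # assumed output file name for the mobile app
```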
Name | Email |
---|---|
Stefano Commodari | stefano.commodari at mail.mcgill.ca |
Adeeb Ibne Amjad | adeeb.amjad at mail.mcgill.ca |
Hakim Amarouche | hakim.amarouche at mail.mcgill.ca |
James Tang | guokai.tang at mail.mcgill.ca |