Honours Project

Research Question: How effective can gesture recognition be at aiding accessibility in video games?

Objectives:
  • To research suitable methods within machine learning to recognise hand gestures.
  • To evaluate how effective gesture control can be for accessibility in games.
  • To build a system that allows control of video games through gesture recognition.


To better understand how an Artificial Neural Network functions, the first 3-4 months of my honours project were dedicated to creating one from scratch in C++. The network was capable of learning to identify handwritten digits from the MNIST database. However, when training on the dataset I created for this project, I could not find hyperparameters that let the network make meaningful progress: it would descend towards the global minimum, but only slowly.

With the deadline quickly approaching, I opted to switch to TensorFlow and implement a Convolutional Neural Network. Below is a diagram I created of the network architecture.

Architecture of CNN
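As a rough illustration, here is a minimal TensorFlow/Keras sketch of a comparable CNN. The layer counts, filter sizes, input resolution, and number of gesture classes are assumptions for the example, not the exact values from the diagram above.

    # Minimal sketch of a comparable CNN; layer sizes are illustrative assumptions.
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(64, 64, 1)),        # monochrome hand image (size assumed)
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(5, activation="softmax"),  # one output per gesture (count assumed)
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])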

Network Input

I used OpenCV as the API for capturing the network's input. The first 60 frames of the program are used to accumulate the average pixel values of the user's background, which allows background separation to be performed later. Below are the five stages an image passes through before being used as input to the network.
Input Process Steps
A frame is captured in BGR every 25 ms. The frame is first converted to greyscale, then blurred with a Gaussian kernel to remove hard edges. Next, the user's hand is separated from the frame by comparing the current frame to the accumulated background average. Finally, any pixel above a certain value is set to white and any below to black, giving a clear monochrome image of the user's hand.
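A minimal sketch of this pipeline using OpenCV is shown below. The blur kernel size, threshold value, and accumulation weight are illustrative assumptions, not the project's exact parameters.

    # Sketch of the capture pipeline: background accumulation, then
    # greyscale -> blur -> background subtraction -> threshold.
    import cv2

    ACCUM_FRAMES = 60      # frames used to build the background average
    accumulated = None     # running average of the background

    def update_background(gray):
        """Fold a greyscale frame into the running background average."""
        global accumulated
        if accumulated is None:
            accumulated = gray.astype("float")
        else:
            cv2.accumulateWeighted(gray, accumulated, 0.5)

    def preprocess(frame):
        """Produce a monochrome hand mask from a BGR frame."""
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        gray = cv2.GaussianBlur(gray, (7, 7), 0)
        diff = cv2.absdiff(cv2.convertScaleAbs(accumulated), gray)
        _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
        return mask

In use, update_background would be called on each of the first 60 frames, and preprocess on every frame thereafter before the result is fed to the network.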

Network Output

Using the Python library pynput, the model's prediction is looked up in a dictionary that returns the key to be pressed. A key press event is sent to Windows whenever the prediction changes, or whenever the user removes their hand from or inserts it into the region of interest. Each prediction is paired with a key to press; these pairings can be changed to allow any combination of virtual keys, or even whole sentences, to be sent. When the user removes or inserts their hand, a dedicated virtual key is sent that triggers entering and exiting cover in the game.
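A minimal sketch of this output mapping using pynput is shown below. The gesture-to-key bindings and the on_prediction helper are hypothetical placeholders; the actual bindings are configurable.

    # Sketch of the prediction-to-key mapping; bindings are placeholders.
    from pynput.keyboard import Controller

    keyboard = Controller()

    KEY_MAP = {0: "w", 1: "a", 2: "s", 3: "d"}   # prediction index -> key
    last_prediction = None

    def on_prediction(prediction):
        """Send a key press only when the predicted gesture changes."""
        global last_prediction
        if prediction != last_prediction:
            key = KEY_MAP[prediction]
            keyboard.press(key)
            keyboard.release(key)
            last_prediction = prediction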

Other Projects

  • Lochsite (roots being cut in game)
  • WFFC (terrain being flattened)
  • Audio Project
  • AR Tower Defense (win screen of the AR Vita game)
  • Grapple (grappling to a grapple point)