From brain waves to robot movements with deep learning: an introduction.

Visualizing and decoding brain activity with neural networks.

Norman Di Palo
Towards Data Science


You can find all the code of this article in this online Colaboratory Notebook, and you can run it directly in your browser. Here’s the GitHub repo.


The nervous system is an incredibly complex structure. Across your whole body, more than a hundred thousand kilometers of nerves wire every part of it to your spinal cord and brain. This “grid” transmits the electrical impulses that control every movement, and every one of those commands starts in your brain, an even more amazing structure of neurons that communicate through electrical activation signals. Understanding and interpreting the brain’s electrical patterns is one of the biggest moonshots of neuroscientists and neurobiologists, but it has proven to be an extremely challenging task.

One non-invasive way to record brain activity is electroencephalography (EEG), a technique that records the brain’s voltage fluctuations using electrodes placed on the scalp of a patient. Usually, around 30 of these electrodes are placed all around the scalp, giving a global picture of brain-wave activity. However, the relationship between brain activity and EEG signals is complex and poorly understood outside of specific laboratory tests. A great challenge is therefore learning how to “decode”, in some sense, these EEG scans, which could allow us to control robotic prosthetic limbs and other devices through non-invasive brain-computer interfaces (BCIs).

Example of brain waves recorded with EEG. CC BY-SA 2.0, https://commons.wikimedia.org/w/index.php?curid=845554

Since these are strongly data-driven problems, the recent breakthroughs of deep learning in related pattern-recognition tasks have created a new way of analyzing these electrical signals: with neural networks. In this post we will see an introduction to this topic: we will read EEG data provided by a Kaggle competition that aims at detecting which EEG patterns correspond to particular arm and hand gestures, such as grabbing or lifting an object. We will then design a neural network to perform this kind of classification, after pre-processing the data in different ways. I will also show some visualizations of brain activity, to give a general idea of the data we are working with. The final goal of this area of research is to develop affordable and useful prosthetic devices that help amputees regain the ability to perform basic tasks, by controlling a prosthesis with their brain. Similar techniques can also be applied to read the electrical activations of muscles, decoding the kind of movement a person is trying to perform from the muscles that are activated.


Introduction to the data

You can download the data freely here if you have a Kaggle account. As you will see, the data consists simply of several .csv files (a minimal loading sketch follows the list). These files contain, respectively:

  • EEG data, used as inputs to a model, recorded with 32 electrodes placed on the scalp of the patient and sampled at 500 Hz.
  • Frame-wise labels of the movement that the human tester is trying to achieve, among six possible ones.
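
Here is a minimal loading sketch with pandas. The file names follow the competition’s subject/series naming convention, but treat the exact paths as an assumption to check against your own download:

import pandas as pd

# Hypothetical paths: the competition ships one data/events .csv pair per
# subject and series (exact file names are an assumption, check your download).
data = pd.read_csv("train/subj1_series1_data.csv")
events = pd.read_csv("train/subj1_series1_events.csv")

# Drop the id column to keep only the 32 electrode channels / 6 labels.
X = data.drop("id", axis=1).values    # shape: (n_frames, 32)
y = events.drop("id", axis=1).values  # shape: (n_frames, 6)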

The data was collected by recording the EEG of different human testers while they performed simple actions, like grabbing and lifting an object. The dataset is thus divided into different episodes, and also into different subjects. As we will see later when measuring prediction accuracy, brain waves can be quite personal: a model can predict intentions with great accuracy on unseen episodes of the same person, but it can struggle to do the same with new testers if the training data wasn’t varied enough.

The goal is therefore to create a neural network that takes the EEG readings as input and outputs a probability distribution over the six possible actions the tester is trying to achieve. Since “no action” is not among those classes, we can either add it as a seventh class or treat each output as an independent value between 0 and 1 and use a threshold to decide whether that action is detected. If every output falls below the threshold, we consider it no action.
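
As a minimal sketch of the second option, assuming probs holds the network’s six sigmoid outputs for one frame and 0.5 is an arbitrary threshold to be tuned on validation data:

import numpy as np

probs = np.array([0.05, 0.10, 0.80, 0.20, 0.10, 0.30])  # example sigmoid outputs
threshold = 0.5  # arbitrary value, to be tuned on validation data

if (probs > threshold).any():
    action = int(np.argmax(probs))  # index of the most confident action
else:
    action = None                   # everything below threshold: no action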

Position of electrodes, source: https://www.kaggle.com/c/grasp-and-lift-eeg-detection/data

I made an animated data visualization of the activity of those electrodes. Since the sampling frequency is quite high (500 Hz), I used a simple 3-step low-pass filter to smooth the data, and I created an animation with the first 100 frames, which is about 1/5 of a second.

Activations of the 32 electrodes in the first 1/5 of a second.
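
A minimal sketch of how such an animation can be produced with matplotlib; here each electrode is drawn as a simple bar rather than at its scalp position, and X_smooth is assumed to be the low-pass-filtered (n_frames, 32) array:

import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

fig, ax = plt.subplots()
bars = ax.bar(range(32), X_smooth[0])
ax.set_ylim(X_smooth[:100].min(), X_smooth[:100].max())
ax.set_xlabel("electrode")

def update(t):
    # Redraw each electrode's bar at its activation for frame t.
    for bar, height in zip(bars, X_smooth[t]):
        bar.set_height(height)
    return bars

# 100 frames, slowed down for visibility (real time would be 2 ms per frame).
anim = FuncAnimation(fig, update, frames=100, interval=20)
anim.save("eeg_animation.gif", writer="pillow")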

We can also visualize the temporal data as a 2D heatmap, where the vertical axis is time (starting from the top and going down) and the horizontal axis indicates the 32 electrodes.

EEG temporal heatmap (time starts from the top and goes down).

This is also very useful because, as we will see, it will allow us to work with spatio-temporal convolutions.
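
A heatmap like the one above takes only a few lines; this sketch assumes X is the (n_frames, 32) array of electrode readings from the loading sketch:

import matplotlib.pyplot as plt

# Rows are time steps, columns are the 32 electrodes.
plt.imshow(X[:100], aspect="auto", cmap="viridis")
plt.xlabel("electrode")
plt.ylabel("time (frames, top to bottom)")
plt.colorbar(label="voltage")
plt.show()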

Data Preprocessing

This raw data should be preprocessed to facilitate the learning phase. For instance, the very high sampling frequency of the EEG, in contrast with the relatively low rate of change of the performed action, can cause problems: the data changes very quickly while the action actually stays the same, so the fast fluctuations can almost be considered noise. Also, a temporal model would receive a lot of quickly changing data while the classification output never changes.

The first possible step is to filter the data with a low-pass filter. Even a simple running average works: it mitigates the high-frequency changes in the data while preserving the low-frequency structure, which is more useful since the movements we will classify have a very low frequency of change (at most 1 Hz). After that, we can subsample the data, i.e. keep only one data point every 10, 100, etc. This also helps reduce the time dimensionality and lowers the correlation between nearby data points, making the data more time-sparse, in a sense.
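
A minimal sketch of both steps, using a running average as the low-pass filter; the window size and subsampling factor are arbitrary choices to tune:

import numpy as np

def preprocess(X, window=10, subsample=10):
    # Running-average low-pass filter followed by subsampling.
    # X: (n_frames, n_electrodes) raw EEG array.
    kernel = np.ones(window) / window
    # Filter each electrode channel independently.
    smoothed = np.apply_along_axis(
        lambda ch: np.convolve(ch, kernel, mode="same"), 0, X)
    # Keep one frame every `subsample`.
    return smoothed[::subsample]

X_prep = preprocess(X)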

Numerous other preprocessing techniques could be adopted, but for the sake of simplicity of this introduction we can stop here and start designing our neural network.

Neural Network Design and Experiments

When dealing with temporal data, one of the first architectures that comes to mind is the recurrent neural network (RNN). These networks have a dynamical structure, and thus an internal state that allows them to encode temporal data, computing their output based on past inputs as well. I designed an LSTM network in Keras and fed it the training data with its sequential temporal structure. The results were good, but in this particular example I’m more interested in showing how a convolutional neural network, usually applied to images, can work very well on temporal data.
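
The exact LSTM I used is in the notebook; a minimal sketch in the same spirit (layer sizes illustrative, loss choice mine) looks like this, where each input sample is a window of time_steps // subsample frames from the 32 electrodes:

from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
# Each input sample is a window of subsampled frames from the 32 electrodes.
model.add(LSTM(128, input_shape=(time_steps // subsample, 32)))
# One sigmoid output per action; binary crossentropy treats the six actions
# as independent detections (my choice here, not necessarily the notebook's).
model.add(Dense(6, activation="sigmoid"))
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])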

As previously described, we are actually dealing with spatio-temporal data in a sense: the vertical axis of the heatmap shown above represents the temporal evolution, while the horizontal axis shows the various electrodes, and electrodes that are close on that axis are, almost always, also close spatially on the human scalp. This means that we can actually extract useful features with convolutions: a 2D kernel encodes patterns in both time and space. Imagine a 3x3 convolutional kernel: on the matrix depicted in the heatmap, it extracts features by taking a weighted sum over three different time steps (the 3 kernel rows) and also over three different electrodes (the 3 kernel columns). Thus, a CNN with many kernels can learn how the activations of the electrodes evolve over a finite temporal window in relation to the intended movement.

I implemented a simple CNN in Keras to check its performance on this dataset; you can find the complete code in the Colaboratory Notebook and GitHub repo linked above.

import keras
from keras.models import Sequential
from keras.layers import Dense, BatchNormalization, Conv2D, Flatten
from keras.optimizers import Adam

# time_steps and subsample are defined earlier in the notebook: each input
# sample is a (time_steps // subsample, 32, 1) "image" whose rows are
# subsampled time steps and whose columns are the 32 electrodes.
model = Sequential()
model.add(Conv2D(filters=64, kernel_size=(7, 7), padding="same",
                 activation="relu",
                 input_shape=(time_steps // subsample, 32, 1)))
model.add(BatchNormalization())
model.add(Conv2D(filters=64, kernel_size=(5, 5), padding="same",
                 activation="relu"))
model.add(BatchNormalization())
model.add(Conv2D(filters=64, kernel_size=(3, 3), padding="same",
                 activation="relu"))
model.add(BatchNormalization())
model.add(Flatten())
model.add(Dense(32, activation="relu"))
model.add(BatchNormalization())
# One sigmoid output per action, thresholded as described above.
# (MaxPooling2D and Dropout layers were also experimented with; they are
# commented out in the notebook.)
model.add(Dense(6, activation="sigmoid"))

adam = Adam(lr=0.001)
model.compile(optimizer=adam, loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()

To check the performance of our model, as suggested in the Kaggle competition, we use the AUC score. If you are unfamiliar with AUC, I suggest this clear and intuitive explanation. As you can check for yourself in the online notebook, we can reach an AUC score of around 0.85 with a quick training phase.
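
For reference, a sketch of how such a score can be computed with scikit-learn, averaging the per-action AUC; X_val and y_val are assumed held-out inputs and (n_samples, 6) binary labels:

from sklearn.metrics import roc_auc_score

# Sigmoid outputs for the validation set, one column per action.
val_probs = model.predict(X_val)
print("mean AUC over the 6 actions:",
      roc_auc_score(y_val, val_probs, average="macro"))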

Numerous improvements can be achieved by trying different neural network architectures, different preprocessing techniques, and so on, but this introductory proof of concept already shows the remarkable capacity of neural networks to learn from this kind of data.

Conclusion

In this post we saw an introduction to brain electrical signals with EEG, a non-invasive and relatively simple way to record useful signals from a user’s scalp. We saw some intuitive visualizations of the data and how to extract features, like movement intentions, from them using neural networks. I believe that this field (robotic prostheses, brain-computer interfaces) will receive a profound boost from deep learning and, more broadly, from the various data science techniques, platforms, and competitions that are growing year after year.

The impact of such technologies could be tremendous. Low-cost prostheses that can be controlled in a natural way could dramatically improve the lives of millions of people.

I suggest you check out the Symbionic Project, a recently started effort by talented people who are trying to hack together a low-cost, intelligent arm prosthesis that can be controlled with muscle activations, with the aim of truly democratizing access to such devices.

Follow me on Twitter for updates on my work and more: https://twitter.com/normandipalo
