Biosignal Data Augmentation Based on Generative Adversarial Networks

In this paper, we propose a synthetic generation method for time-series data based on generative adversarial networks (GANs) and apply it to data augmentation for biosignal classification. GANs are a recently proposed framework for learning a generative model, in which two neural networks, one generating synthetic data and the other discriminating between synthetic and real data, are trained while competing with each other. In the proposed method, each network in the GAN is built on a recurrent neural network with long short-term memory units, thereby allowing the GAN framework to be adapted to time-series data generation. In the experiments, we confirmed that the proposed method can generate synthetic biosignals using electrocardiogram and electroencephalogram datasets. We also showed the effectiveness of the proposed method for data augmentation in the biosignal classification problem.
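
The LSTM-based generator/discriminator pairing described in this abstract can be sketched as follows. This is a minimal sketch assuming PyTorch; the class names, layer sizes, and noise dimensionality are illustrative assumptions, not the authors' reported configuration.

    # Minimal sketch (PyTorch) of an LSTM-based GAN for time-series generation.
    # Layer sizes and names are illustrative assumptions, not the paper's settings.
    import torch
    import torch.nn as nn

    class LSTMGenerator(nn.Module):
        """Maps a sequence of noise vectors to a synthetic biosignal sequence."""
        def __init__(self, noise_dim=32, hidden_dim=64, signal_dim=1):
            super().__init__()
            self.lstm = nn.LSTM(noise_dim, hidden_dim, batch_first=True)
            self.out = nn.Linear(hidden_dim, signal_dim)

        def forward(self, z):               # z: (batch, seq_len, noise_dim)
            h, _ = self.lstm(z)
            return self.out(h)              # (batch, seq_len, signal_dim)

    class LSTMDiscriminator(nn.Module):
        """Scores a sequence as real or synthetic from the last hidden state."""
        def __init__(self, signal_dim=1, hidden_dim=64):
            super().__init__()
            self.lstm = nn.LSTM(signal_dim, hidden_dim, batch_first=True)
            self.out = nn.Linear(hidden_dim, 1)

        def forward(self, x):               # x: (batch, seq_len, signal_dim)
            h, _ = self.lstm(x)
            return self.out(h[:, -1, :])    # one logit per sequence

    # Usage: sample noise per time step and generate a batch of synthetic signals.
    G = LSTMGenerator()
    z = torch.randn(8, 128, 32)             # 8 signals, 128 time steps
    fake_signals = G(z)                      # (8, 128, 1)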


Real-Valued (Medical) Time Series Generation With Recurrent Conditional GANs

Generative Adversarial Networks (GANs) have shown remarkable success as a framework for training models to produce realistic-looking data. In this work, we propose a Recurrent GAN (RGAN) and Recurrent Conditional GAN (RCGAN) to produce realistic real-valued multi-dimensional time series, with an emphasis on their application to medical data. RGANs make use of recurrent neural networks in the generator and the discriminator. In the case of RCGANs, both of these RNNs are conditioned on auxiliary information. We demonstrate our models on a set of toy datasets, where we show visually and quantitatively (using sample likelihood and maximum mean discrepancy) that they can successfully generate realistic time series. We also describe novel evaluation methods for GANs, in which we generate a synthetic labelled training dataset and evaluate on a real test set the performance of a model trained on the synthetic data, and vice versa. We illustrate with these metrics that RCGANs can generate time-series data useful for supervised training, with only minor degradation in performance on real test data. This is demonstrated on digit classification from 'serialised' MNIST and by training an early warning system on a medical dataset of 17,000 patients from an intensive care unit. We further discuss and analyse the privacy concerns that may arise when using RCGANs to generate realistic synthetic medical time-series data.
Datasets: sinusoidal sequences, MNIST data, Gaussian process samples, Philips eICU data
Metrics: maximum mean discrepancy, TSTR classification score, TRTS classification score
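
The TSTR and TRTS scores listed above follow a simple recipe: train a downstream classifier on synthetic (or real) data and test it on real (or synthetic) data. A minimal sketch of that recipe, assuming scikit-learn and a logistic-regression classifier as an arbitrary stand-in for the downstream model; the data shapes are placeholders.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score

    def train_test_score(train_X, train_y, test_X, test_y):
        """Train a downstream classifier on one dataset and score it on another.
        Each series of shape (seq_len, dim) is flattened into a feature vector."""
        clf = LogisticRegression(max_iter=1000)
        clf.fit(train_X.reshape(len(train_X), -1), train_y)
        preds = clf.predict(test_X.reshape(len(test_X), -1))
        return accuracy_score(test_y, preds)

    # TSTR: train on synthetic, test on real.  TRTS: train on real, test on synthetic.
    rng = np.random.default_rng(0)
    syn_X, syn_y = rng.normal(size=(50, 30, 2)), rng.integers(0, 2, 50)
    real_X, real_y = rng.normal(size=(50, 30, 2)), rng.integers(0, 2, 50)
    tstr = train_test_score(syn_X, syn_y, real_X, real_y)
    trts = train_test_score(real_X, real_y, syn_X, syn_y)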


Feature-Driven Time Series Generation

Time series data are a ubiquitous and important data source in many domains. Most companies and organizations rely on such data for critical tasks like decision-making, planning, and analytics in general. Usually, these tasks focus on actual data representing organizational and business processes. In order to assess the robustness of current systems and methods, it is also desirable to work with time-series scenarios that exhibit specific time-series features. This work presents a generally applicable and easy-to-use method for the generation of feature-driven time series data. Our approach extracts descriptive features from a data set and allows the construction of modified versions of the data by adjusting these features.
Dataset: M3 Competition data
Evaluation: time series visualization
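
The feature-driven idea above can be illustrated with a small sketch: extract a few descriptive features from an observed series and rebuild a new series with those features scaled. The feature set used here (linear trend and residual noise level) is an illustrative assumption and far smaller than a real feature catalogue; NumPy is assumed.

    import numpy as np

    def extract_features(x):
        """Return a small set of descriptive features of a 1-D series."""
        t = np.arange(len(x))
        slope, intercept = np.polyfit(t, x, 1)          # linear trend
        residual = x - (slope * t + intercept)
        return {"slope": slope, "intercept": intercept, "noise_std": residual.std()}

    def generate_from_features(features, length, slope_scale=1.0, noise_scale=1.0, seed=0):
        """Construct a new series whose trend/noise match the (scaled) features."""
        rng = np.random.default_rng(seed)
        t = np.arange(length)
        trend = features["slope"] * slope_scale * t + features["intercept"]
        noise = rng.normal(0.0, features["noise_std"] * noise_scale, size=length)
        return trend + noise

    # Example: construct a variant of an observed series with twice the trend strength.
    observed = 0.05 * np.arange(200) + np.random.default_rng(1).normal(0, 0.5, 200)
    feats = extract_features(observed)
    stronger_trend = generate_from_features(feats, 200, slope_scale=2.0)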


C-RNN-GAN: Continuous recurrent neural networks with adversarial training

Generative adversarial networks have been proposed as a way of efficiently training deep generative neural networks. We propose a generative adversarial model that works on continuous sequential data and apply it by training it on a collection of classical music. We conclude that it generates music that sounds better and better as the model is trained, report statistics on the generated music, and let the reader judge the quality by downloading the generated songs.
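
A single adversarial training step on continuous sequences, as used in this line of work, can be sketched as follows. This assumes PyTorch and recurrent generator/discriminator modules like those sketched earlier in this section; the loss formulation (standard binary cross-entropy) and optimiser handling are illustrative, not C-RNN-GAN's exact training details.

    import torch
    import torch.nn.functional as F

    def adversarial_step(G, D, real_batch, opt_g, opt_d, noise_dim=32):
        """One GAN update on a batch of real sequences of shape (batch, seq_len, dim)."""
        batch, seq_len, _ = real_batch.shape
        z = torch.randn(batch, seq_len, noise_dim)

        # Discriminator update: push real sequences towards 1, generated ones towards 0.
        fake = G(z).detach()
        d_loss = F.binary_cross_entropy_with_logits(D(real_batch), torch.ones(batch, 1)) + \
                 F.binary_cross_entropy_with_logits(D(fake), torch.zeros(batch, 1))
        opt_d.zero_grad()
        d_loss.backward()
        opt_d.step()

        # Generator update: try to make the discriminator output 1 for generated data.
        g_loss = F.binary_cross_entropy_with_logits(D(G(z)), torch.ones(batch, 1))
        opt_g.zero_grad()
        g_loss.backward()
        opt_g.step()
        return d_loss.item(), g_loss.item()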


Bidirectional Recurrent Neural Networks As Generative Models

Bidirectional recurrent neural networks (RNNs) are trained to predict both in the positive and negative time directions simultaneously. They have not been used commonly in unsupervised tasks, because a probabilistic interpretation of the model has been difficult. Recently, two different frameworks, GSN and NADE, have provided a connection between reconstruction and probabilistic modeling, which makes such an interpretation possible. As far as we know, neither GSN nor NADE has been studied in the context of time series before. As an example of an unsupervised task, we study the problem of filling in gaps in high-dimensional time series with complex dynamics. Although unidirectional RNNs have recently been trained successfully to model such time series, inference in the negative time direction is non-trivial. We propose two probabilistic interpretations of bidirectional RNNs that can be used to reconstruct missing gaps efficiently. Our experiments on text data show that both proposed methods are much more accurate than unidirectional reconstructions, although slightly less accurate than a computationally complex bidirectional Bayesian inference on the unidirectional RNN. We also provide results on music data, for which the Bayesian inference is computationally infeasible, demonstrating the scalability of the proposed methods.
Dataset: English text from Wikipedia
Metric: mean log-likelihood
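
The gap-filling task described above can be illustrated with a simple reconstruction setup: feed a bidirectional LSTM the series with the gap zeroed out plus an observation mask, and train it to reconstruct the missing values. This is a minimal PyTorch sketch under those assumptions; it is not the GSN- or NADE-based probabilistic interpretations analysed in the paper.

    import torch
    import torch.nn as nn

    class BiRNNImputer(nn.Module):
        def __init__(self, signal_dim=1, hidden_dim=64):
            super().__init__()
            # Input = observed values (gaps set to 0) concatenated with a 0/1 mask.
            self.lstm = nn.LSTM(signal_dim * 2, hidden_dim, batch_first=True,
                                bidirectional=True)
            self.out = nn.Linear(2 * hidden_dim, signal_dim)

        def forward(self, x, mask):
            h, _ = self.lstm(torch.cat([x * mask, mask], dim=-1))
            return self.out(h)              # reconstruction for every time step

    # Training target: mean squared error on the masked (missing) positions only.
    model = BiRNNImputer()
    x = torch.randn(4, 100, 1)                           # 4 series, 100 steps
    mask = torch.ones_like(x); mask[:, 40:60, :] = 0.0   # a 20-step gap
    recon = model(x, mask)
    loss = (((recon - x) * (1 - mask)) ** 2).sum() / (1 - mask).sum()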