Evaluation metrics

We include the following test metrics to evaluate model performance:

Sig-W1 metric [1]: a generic metric on the distributions induced by time series.
Metrics on marginal distribution [1]: measure how well the generative model fits the marginal distribution of the real data.
Metrics on dependency [1]: measure how well the generative model fits the correlation and autocorrelation of the real data.
Discriminative score [2]: train a classifier to distinguish samples from the true distribution from samples from the synthetic distribution. The smaller the discriminative score, the better the generator.
Predictive score [2]: train a sequence-to-sequence model on generated data to predict the latter part of a time series given the first part, then evaluate it on the true data. A smaller loss means the generated data is more useful for prediction, so smaller is better.
Distance-based metrics [6]: these metrics measure the distance between real and generated samples. They can be used to assess the diversity and fidelity of the generated data, and they can also reveal a potential mode collapse.
Permutation tests [7]: run a permutation test with a signature-based MMD statistic and report the power of the test.
t-SNE plot: a visualization method that embeds high-dimensional data into two dimensions using t-distributed Stochastic Neighbor Embedding.
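As a rough illustration of the marginal and dependency metrics listed above, the sketch below compares pooled-value histograms and average autocorrelations of real and generated series. It is a minimal standard-library sketch, not the exact definitions from [1]; the function names, the binning scheme, and the choice of mean absolute gap are assumptions made for illustration.

```python
from statistics import fmean


def autocorrelation(series, lag):
    """Sample autocorrelation of one univariate series at the given lag."""
    n = len(series)
    mean = fmean(series)
    var = sum((x - mean) ** 2 for x in series)
    if var == 0 or lag >= n:
        return 0.0
    cov = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return cov / var


def acf_difference(real, fake, max_lag):
    """Mean absolute gap between the average ACFs of two sets of series."""
    gaps = []
    for lag in range(1, max_lag + 1):
        real_acf = fmean(autocorrelation(s, lag) for s in real)
        fake_acf = fmean(autocorrelation(s, lag) for s in fake)
        gaps.append(abs(real_acf - fake_acf))
    return fmean(gaps)


def histogram(values, bins, low, high):
    """Normalized histogram on [low, high]; bin frequencies sum to 1."""
    counts = [0] * bins
    width = (high - low) / bins
    for v in values:
        # Clamp so that v == high falls into the last bin.
        idx = min(int((v - low) / width), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def marginal_difference(real, fake, bins=10):
    """Mean absolute gap between the histograms of all pooled values."""
    real_vals = [x for s in real for x in s]
    fake_vals = [x for s in fake for x in s]
    low = min(real_vals + fake_vals)
    high = max(real_vals + fake_vals)
    if high == low:
        return 0.0
    h_real = histogram(real_vals, bins, low, high)
    h_fake = histogram(fake_vals, bins, low, high)
    return fmean(abs(a - b) for a, b in zip(h_real, h_fake))
```

Both functions take lists of univariate series (lists of floats) and return 0.0 when the two sets are identical; larger values indicate a worse fit. Multivariate data would additionally need a cross-correlation term, which is omitted here.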
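The permutation test in the list above can be sketched generically as follows. This is only an illustration under stated assumptions: the statistic `mean_gap` is a simple stand-in and is not the signature-based MMD of [7], and the names `permutation_p_value` and `rejection_rate` are hypothetical. Any two-sample statistic, including a signature MMD implementation, could be passed through the `statistic` argument.

```python
import random
from statistics import fmean


def mean_gap(xs, ys):
    """Placeholder statistic: absolute difference of sample means."""
    return abs(fmean(xs) - fmean(ys))


def permutation_p_value(xs, ys, statistic=mean_gap, n_perm=500, seed=0):
    """P-value of the observed statistic under random relabelling."""
    rng = random.Random(seed)
    observed = statistic(xs, ys)
    pooled = list(xs) + list(ys)
    n = len(xs)
    hits = 0
    for _ in range(n_perm):
        # Shuffle the pooled samples and split them into two pseudo-groups.
        rng.shuffle(pooled)
        if statistic(pooled[:n], pooled[n:]) >= observed:
            hits += 1
    # The +1 terms count the observed labelling itself, so p is never 0.
    return (hits + 1) / (n_perm + 1)


def rejection_rate(batches, alpha=0.05, **kwargs):
    """Fraction of (xs, ys) batches in which the null hypothesis is rejected."""
    rejections = [permutation_p_value(xs, ys, **kwargs) <= alpha for xs, ys in batches]
    return sum(rejections) / len(rejections)
```

When the batches pair real samples with generated samples, `rejection_rate` gives an empirical estimate of the power of the test at level `alpha`: a value near `alpha` suggests the test cannot tell the two distributions apart, while a value near 1 suggests the generated data is easy to distinguish from the real data.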

[1] Ni, H., Szpruch, L., Wiese, M., Liao, S. and Xiao, B., 2021. Sig-Wasserstein GANs for Time Series Generation.
[2] Yoon, J., Jarrett, D. and Van der Schaar, M., 2019. Time-series generative adversarial networks. Advances in neural information processing systems, 32.
[3] Esteban, C., Hyland, S.L. and Rätsch, G., 2017. Real-valued (medical) time series generation with recurrent conditional GANs. arXiv preprint arXiv:1706.02633.
[4] Desai, A., Freeman, C., Wang, Z.H. and Beaver, I., 2021. TimeVAE: A Variational Auto-Encoder For Multivariate Time Series Generation. arXiv preprint arXiv:2111.08095.
[5] Zhang, S., Guo, B., Dong, A., He, J., Xu, Z. and Chen, S.X., 2017. Cautionary tales on air-quality improvement in Beijing. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences.
[6] Anonymous authors, 2022. SOK: The Great GAN Bake Off, An Extensive Systematic Evaluation Of Generative Adversarial Network Architectures For Time Series.
[7] Chevyrev, I. and Oberhauser, H., 2022. Signature moments to characterize laws of stochastic processes.
