[1] Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mané, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viégas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., Zheng, X., 2015. TensorFlow: Large-scale machine learning on heterogeneous systems. URL: http://download.tensorflow.org/paper/whitepaper2015.pdf.
[3] Bevilacqua, M., Gaetan, C., Mateu, J., Porcu, E., 2012. Estimating space and space-time covariance functions for large data sets: A weighted composite likelihood approach. J. Amer. Statist. Assoc. 107, 268–280. http://dx.doi.org/10.1080/01621459.2011.646928.
[4] Blei, D.M., Kucukelbir, A., McAuliffe, J.D., 2017. Variational inference: A review for statisticians. J. Amer. Statist. Assoc. 112, 859–877. http://dx.doi.org/10.1080/01621459.2017.1285773, arXiv:1601.00670.
[5] Burt, D.R., Rasmussen, C.E., Van Der Wilk, M., 2019. Rates of convergence for sparse variational Gaussian process regression. In: 36th International Conference on Machine Learning. Long Beach, California, p. 10.
[6] Quiñonero-Candela, J., Rasmussen, C.E., Herbrich, R., 2005. A unifying view of sparse approximate Gaussian process regression. J. Mach. Learn. Res. 6, 1935–1959, URL: http://jmlr.org/papers/volume6/quinonero-candela05a/quinonero-candela05a.pdf.
[7] Chilès, J.-P., Delfiner, P., 1999. Geostatistics: Modeling Spatial Uncertainty. Wiley, New York, New York, USA, p. 695.
[8] Cohen, S., Mbuvha, R., Marwala, T., Deisenroth, M.P., 2020. Healing products of Gaussian process experts. In: Proceedings of the 37th International Conference on Machine Learning.
[9] Crameri, F., Shephard, G.E., Heron, P.J., 2020. The misuse of colour in science communication. Nature Commun. 11, 1–10. http://dx.doi.org/10.1038/s41467-020-19160-7.
[10] Cressie, N.A.C., 1991. Statistics for Spatial Data. John Wiley & Sons, New York, p. 887.
[11] Damianou, A.C., Lawrence, N.D., 2013. Deep Gaussian processes. In: International Conference on Artificial Intelligence and Statistics, vol. 31, pp. 207–215. arXiv:1211.0358.
[12] Datta, A., Banerjee, S., Finley, A.O., Gelfand, A.E., 2016. Hierarchical nearest-neighbor Gaussian process models for large geostatistical datasets. J. Amer. Statist. Assoc. 111, 800–812. http://dx.doi.org/10.1080/01621459.2015.1044091, arXiv:1406.7343.
[13] Desassis, N., Renard, D., 2013. Automatic variogram modeling by iterative least squares: Univariate and multivariate cases. Math. Geosci. 45, 453–470. http://dx.doi.org/10.1007/s11004-012-9434-1.
[14] Diggle, P., Ribeiro, P.J., 2007. Model-Based Geostatistics, first ed. Springer-Verlag, New York, p. 232. http://dx.doi.org/10.1007/978-0-387-48536-2.
[15] Dunlop, M.M., Girolami, M.A., Stuart, A.M., Teckentrup, A.L., 2018. How deep are deep Gaussian processes? J. Mach. Learn. Res. 19, 1–46.
[16] Gonçalves, Í.G., Echer, E., Frigo, E., 2020. Sunspot cycle prediction using warped Gaussian process regression. Adv. Space Res. 65, 677–683. http://dx.doi.org/10.1016/j.asr.2019.11.011, URL: https://www.sciencedirect.com/science/article/pii/S0273117719308026.
[17] Goovaerts, P., 1997. Geostatistics for Natural Resources Evaluation. Oxford University Press, New York, New York, USA, p. 483.
[18] Hegde, P., Heinonen, M., Kaski, S., 2018. Variational zero-inflated Gaussian processes with sparse kernels. In: Conference on Uncertainty in Artificial Intelligence. Monterey, California, USA.
[19] Hensman, J., Matthews, A.G., Ghahramani, Z., 2015. Scalable variational Gaussian process classification. J. Mach. Learn. Res. 38, 351–360, arXiv:1411.2005.
[20] Hensman, J., Fusi, N., Lawrence, N., 2013. Gaussian processes for big data. In: Uncertainty in Artificial Intelligence 29. pp. 282–290. URL: http://auai.org/uai2013/prints/papers/244.pdf, arXiv:1309.6835.
[21] Katzfuss, M., Guinness, J., 2021. A general framework for Vecchia approximations of Gaussian processes. Statist. Sci. 36, 124–141. http://dx.doi.org/10.1214/19-sts755, arXiv:1708.06302.
[22] Katzfuss, M., Guinness, J., Gong, W., Zilber, D., 2020. Vecchia approximations of Gaussian-process predictions. J. Agric. Biol. Environ. Stat. 25, 383–414. http://dx.doi.org/10.1007/s13253-020-00401-7, arXiv:1805.03309.
[23] Kingma, D.P., Ba, J.L., 2015. Adam: A method for stochastic optimization. In: 3rd International Conference on Learning Representations, ICLR 2015. arXiv:1412.6980.
[24] Li, Z., Zhang, X., Clarke, K.C., Liu, G., Zhu, R., 2018. An automatic variogram modeling method with high reliability fitness and estimates. Comput. Geosci. 120, 48–59. http://dx.doi.org/10.1016/j.cageo.2018.07.011.
[25] Moreno-Muñoz, P., Artés-Rodríguez, A., Álvarez, M.A., 2018. Heterogeneous multioutput Gaussian process prediction. In: Advances in Neural Information Processing Systems. Montreal, pp. 6711–6720. arXiv:1805.07633.
[26] Nguyen, T.V., Bonilla, E.V., 2013. Efficient variational inference for Gaussian process regression networks. J. Mach. Learn. Res. 31, 472–480.
[27] Opper, M., Archambeau, C., 2009. The variational Gaussian approximation revisited. Neural Comput. 21, 786–792. http://dx.doi.org/10.1162/neco.2008.08-07-592.
[28] Padoan, S.A., Bevilacqua, M., 2015. Analysis of random fields using CompRandFld. J. Stat. Softw. 63, 1–27. http://dx.doi.org/10.18637/jss.v063.i09.
[29] Pebesma, E.J., 2004. Multivariable geostatistics in S: The gstat package. Comput. Geosci. 30, 683–691. http://dx.doi.org/10.1016/j.cageo.2004.03.012.
[30] Rasmussen, C.E., Williams, C.K.I., 2006. Gaussian Processes for Machine Learning. MIT Press, Cambridge, Massachusetts, p. 266.
[32] Rullière, D., Durrande, N., Bachoc, F., Chevalier, C., 2018. Nested Kriging predictions for datasets with a large number of observations. Stat. Comput. http://dx.doi.org/10.1007/s11222-017-9766-2, arXiv:1607.05432.
[33] Salimbeni, H., Deisenroth, M., 2017. Doubly stochastic variational inference for deep Gaussian processes. In: 31st Conference on Neural Information Processing Systems, NIPS 2017. Long Beach, CA, USA. arXiv:1705.08933.
[34] Snelson, E., Rasmussen, C.E., Ghahramani, Z., 2004. Warped Gaussian processes. Adv. Neural Inf. Proc. Syst. 16, 337–344.
[35] Sollich, P., 2002. Bayesian methods for support vector machines: Evidence and predictive class probabilities. Mach. Learn. 46, 21–52.
[36] Titsias, M., 2009. Variational learning of inducing variables in sparse Gaussian processes. In: AISTATS, vol. 5, pp. 567–574, URL: http://eprints.pascal-network.org/archive/00006353/.
[37] Trapp, M., Peharz, R., Pernkopf, F., Rasmussen, C.E., 2019. Deep structured mixtures of Gaussian processes. arXiv preprint. arXiv:1910.04536.
[39] Vapnik, V., 1995. The Nature of Statistical Learning Theory. Springer Verlag, New York, p. 189.
[40] Williams, C.K.I., Seeger, M., 2000. Using the Nyström method to speed up kernel machines. In: NIPS’00: Proceedings of the 13th International Conference on Neural Information Processing Systems. MIT Press, pp. 661–667.
[41] Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P., 2016. Stochastic variational deep kernel learning. In: 30th Conference on Neural Information Processing Systems, NIPS 2016. Barcelona, Spain. URL: http://arxiv.org/abs/1611.00336, arXiv:1611.00336.