Abstract
Visualization of large-scale data inherently requires dimensionality reduction to 1D, 2D, or 3D space. Autoassociative neural networks with a bottleneck layer are a commonly used nonlinear dimensionality reduction technique. However, many real-world data sets are incomplete, i.e. some values may be missing. Common ways of dealing with missing data include deleting every case with missing values from the data set or replacing the missing values with the mean or a “normal” value of the corresponding variable. Such methods are adequate when only a few values are missing, but when a substantial portion of the data is missing they may significantly bias the results of modeling. To overcome this difficulty, we propose a modified learning procedure for the autoassociative neural network that takes missing values into account directly. The outputs of the trained network may then be used to substitute the missing values in the original data set.
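The paper's exact learning procedure is not reproduced in this abstract, so the following is only a minimal sketch of one plausible variant of the idea: a bottleneck autoassociative network trained with a masked reconstruction error, so that missing entries contribute nothing to the loss, with the trained network's outputs then used to fill the gaps. The function name, hyperparameters, and the choice to seed missing inputs with column means are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def train_masked_autoencoder(X, mask, n_hidden=2, lr=0.01, epochs=2000, seed=0):
    """Sketch (not the paper's method): bottleneck autoassociator on incomplete data.

    X    : (n, d) data matrix; missing entries may hold any placeholder.
    mask : (n, d) boolean array, True where a value is observed.

    Missing inputs are filled with per-column observed means before the
    forward pass (an assumption of this sketch), and the squared
    reconstruction error is accumulated over observed entries only, so
    missing values never drive the fit.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape

    # Fill missing inputs with column means of the observed values.
    col_means = np.where(mask, X, 0.0).sum(0) / np.maximum(mask.sum(0), 1)
    X_in = np.where(mask, X, col_means)

    # Bottleneck network: d -> n_hidden -> d.
    W1 = rng.normal(0, 0.1, (d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.1, (n_hidden, d)); b2 = np.zeros(d)

    for _ in range(epochs):
        H = np.tanh(X_in @ W1 + b1)              # bottleneck codes (the low-D view)
        X_hat = H @ W2 + b2                      # reconstruction
        # Masked error gradient: missing entries contribute exactly zero.
        G = np.where(mask, X_hat - X_in, 0.0) / mask.sum()
        gW2 = H.T @ G
        gb2 = G.sum(0)
        GH = (G @ W2.T) * (1.0 - H**2)           # backprop through tanh
        gW1 = X_in.T @ GH
        gb1 = GH.sum(0)
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2

    # Impute: keep observed values, take network outputs elsewhere.
    H = np.tanh(X_in @ W1 + b1)
    X_hat = H @ W2 + b2
    return np.where(mask, X, X_hat), H
```

With `n_hidden` set to 1, 2, or 3, the bottleneck activations `H` give the low-dimensional coordinates used for visualization, while the first returned array is the original data with missing entries substituted by the network's reconstructions.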
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Popov, S. (2006). Nonlinear Visualization of Incomplete Data Sets. In: Grigoriev, D., Harrison, J., Hirsch, E.A. (eds) Computer Science – Theory and Applications. CSR 2006. Lecture Notes in Computer Science, vol 3967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11753728_53
Print ISBN: 978-3-540-34166-6
Online ISBN: 978-3-540-34168-0