Abstract
The observed features of a given phenomenon are not all equally informative: some may be noisy, others correlated or irrelevant. The purpose of feature selection is to identify a subset of features pertinent to a given task. This is a complex process, but an important issue in many fields. In neural networks, feature selection has been studied for the last ten years, using both conventional and original methods. This paper is a review of neural network approaches to feature selection. We first briefly introduce baseline statistical methods used in regression and classification. We then describe families of methods that have been developed specifically for neural networks. Representative methods are finally compared on different test problems.
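As an illustration of the baseline statistical filters the review starts from (this sketch is not taken from the paper; the synthetic data and the correlation-based scoring are illustrative assumptions), one can rank features by a simple relevance score and keep the top-scoring subset:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression task: only the first two of five features are informative.
n = 500
X = rng.normal(size=(n, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=n)

# Baseline filter: score each feature by its absolute Pearson correlation
# with the target, then keep the k best-ranked features.
scores = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
ranking = np.argsort(scores)[::-1]   # best feature first
selected = sorted(ranking[:2])       # indices of the two retained features

print(selected)
```

Such univariate filters are cheap but blind to feature interactions, which is precisely what motivates the neural-network-specific selection methods surveyed in the paper.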
Change history
13 February 2021
A Correction to this paper has been published: https://doi.org/10.1007/s41237-020-00127-3
Additional information
The original online version of this article was revised due to the retrospective open access order.
About this article
Cite this article
Leray, P., Gallinari, P. Feature Selection With Neural Networks. Behaviormetrika 26, 145–166 (1999). https://doi.org/10.2333/bhmk.26.145