Spectral Feature Selection for Data Mining introduces a novel feature selection technique that establishes a general platform for studying existing feature selection algorithms and developing new algorithms for emerging problems in real-world applications. This technique represents a unified framework for supervised, unsupervised, and semisupervised feature selection.
The book explores the latest research achievements, sheds light on new research directions, and stimulates readers to make the next creative breakthroughs. It presents the intrinsic ideas behind spectral feature selection, its theoretical foundations, its connections to other algorithms, and its use in handling both large-scale data sets and small sample problems. The authors also cover feature selection and feature extraction, including basic concepts, popular existing algorithms, and applications.
A timely introduction to spectral feature selection, this book illustrates the potential of this powerful dimensionality reduction technique in high-dimensional data processing. Readers learn how to use spectral feature selection to solve challenging problems in real-life applications and discover how general feature selection and extraction are connected to spectral feature selection.
Zheng Zhao is a research statistician at the SAS Institute, Inc. His recent research focuses on designing and developing novel analytic approaches for handling large-scale data of extremely high dimensionality. Dr. Zhao is the author of PROC HPREDUCE, a SAS High Performance Analytics procedure for large-scale parallel variable selection. He was co-chair of the 2010 PAKDD Workshop on Feature Selection in Data Mining. He earned a Ph.D. in computer science and engineering from Arizona State University.
Huan Liu is a professor of computer science and engineering at Arizona State University. Dr. Liu serves on journal editorial boards and conference program committees and is a founding organizer of the International Conference Series on Social Computing, Behavioral-Cultural Modeling, and Prediction. He earned a Ph.D. in computer science from the University of Southern California. With a focus on data mining, machine learning, social computing, and artificial intelligence, his research investigates problems in real-world applications with high-dimensional data of disparate forms, such as social media, group interaction and modeling, data preprocessing, and text/web mining.
Data of High Dimensionality and Challenges
    Dimensionality Reduction Techniques
    Feature Selection for Data Mining
    Spectral Feature Selection
    Organization of the Book
Univariate Formulations for Spectral Feature Selection
    Modeling Target Concept via Similarity Matrix
    The Laplacian Matrix of a Graph
    Evaluating Features on the Graph
    An Extension for Feature Ranking Functions
    Spectral Feature Selection via Ranking
    Robustness Analysis for SPEC
    Discussions
Multivariate Formulations
    The Similarity Preserving Nature of SPEC
    A Sparse Multi-Output Regression Formulation
    Solving the L2,1-Regularized Regression Problem
    Efficient Multivariate Spectral Feature Selection
    A Formulation Based on Matrix Comparison
    Feature Selection with Proposed Formulations
Connections to Existing Algorithms
    Connections to Existing Feature Selection Algorithms
    Connections to Other Learning Models
    An Experimental Study of the Algorithms
    Discussions
Large-Scale Spectral Feature Selection
    Data Partitioning for Parallel Processing
    MPI for Distributed Parallel Computing
    Parallel Spectral Feature Selection
    Computing the Similarity Matrix in Parallel
    Parallelization of the Univariate Formulations
    Parallel MRSF
    Parallel MCSF
    Discussions
Multi-Source Spectral Feature Selection
    Categorization of Different Types of Knowledge
    A Framework Based on Combining Similarity Matrices
    A Framework Based on Rank Aggregation
    Experimental Results
    Discussions
References
Index