Scaling up Machine Learning: Parallel and Distributed Approaches

By: John Langford (editor), Ron Bekkerman (editor), Mikhail Bilenko (editor)

Paperback

Special Order: item not currently available. We'll try to order it for you.


This book presents an integrated collection of representative approaches for scaling up machine learning and data mining methods on parallel and distributed computing platforms. Demand for parallelizing learning algorithms is highly task-specific: in some settings it is driven by the enormous dataset sizes, in others by model complexity or by real-time performance requirements. Making task-appropriate algorithm and platform choices for large-scale machine learning requires understanding the benefits, trade-offs and constraints of the available options. Solutions presented in the book cover a range of parallelization platforms from FPGAs and GPUs to multi-core systems and commodity clusters, concurrent programming frameworks including CUDA, MPI, MapReduce and DryadLINQ, and learning settings (supervised, unsupervised, semi-supervised and online learning). Extensive coverage of parallelization of boosted trees, SVMs, spectral clustering, belief propagation and other popular learning algorithms, and deep dives into several applications, make the book equally useful for researchers, students and practitioners.

About Author

Ron Bekkerman is a computer engineer and scientist whose experience spans disciplines from video processing to business intelligence. Currently a senior research scientist at LinkedIn, he previously worked for a number of major companies including Hewlett-Packard and Motorola. Bekkerman's research interests lie primarily in the area of large-scale unsupervised learning. He is the corresponding author of several publications in top-tier venues, such as ICML, KDD, SIGIR, WWW, IJCAI, CVPR, EMNLP and JMLR.

Mikhail Bilenko is a researcher in the Machine Learning and Intelligence group at Microsoft Research. His research interests center on machine learning and data mining tasks that arise in the context of large behavioral and textual datasets. Bilenko's recent work has focused on learning algorithms that leverage user behavior to improve online advertising. His papers have been published at KDD, ICML, SIGIR, and WWW among other venues, and he has received best paper awards from SIGIR and KDD.

John Langford is a computer scientist working as a senior researcher at Yahoo! Research. Previously, he was affiliated with the Toyota Technological Institute and IBM T. J. Watson Research Center. Langford's work has been published at conferences and in journals including ICML, COLT, NIPS, UAI, KDD, JMLR and MLJ. He received the Pat Goldberg Memorial Best Paper Award, as well as best paper awards from ACM EC and WSDM. He is also the author of the popular machine learning weblog,


1. Scaling up machine learning: introduction (Ron Bekkerman, Mikhail Bilenko and John Langford)

Part I. Frameworks for Scaling Up Machine Learning

2. MapReduce and its application to massively parallel learning of decision tree ensembles (Biswanath Panda, Joshua S. Herbach, Sugato Basu and Roberto J. Bayardo)
3. Large-scale machine learning using DryadLINQ (Mihai Budiu, Dennis Fetterly, Michael Isard, Frank McSherry and Yuan Yu)
4. IBM parallel machine learning toolbox (Edwin Pednault, Elad Yom-Tov and Amol Ghoting)
5. Uniformly fine-grained data parallel computing for machine learning algorithms (Meichun Hsu, Ren Wu and Bin Zhang)

Part II. Supervised and Unsupervised Learning Algorithms

6. PSVM: parallel support vector machines with incomplete Cholesky factorization (Edward Chang, Hongjie Bai, Kaihua Zhu, Hao Wang, Jian Li and Zhihuan Qiu)
7. Massive SVM parallelization using hardware accelerators (Igor Durdanovic, Eric Cosatto, Hans Peter Graf, Srihari Cadambi, Venkata Jakkula, Srimat Chakradhar and Abhinandan Majumdar)
8. Large-scale learning to rank using boosted decision trees (Krysta M. Svore and Christopher J. C. Burges)
9. The transform regression algorithm (Ramesh Natarajan and Edwin Pednault)
10. Parallel belief propagation in factor graphs (Joseph Gonzalez, Yucheng Low and Carlos Guestrin)
11. Distributed Gibbs sampling for latent variable models (Arthur Asuncion, Padhraic Smyth, Max Welling, David Newman, Ian Porteous and Scott Triglia)
12. Large-scale spectral clustering with MapReduce and MPI (Wen-Yen Chen, Yangqiu Song, Hongjie Bai, Chih-Jen Lin and Edward Y. Chang)
13. Parallelizing information-theoretic clustering methods (Ron Bekkerman and Martin Scholz)

Part III. Alternative Learning Settings

14. Parallel online learning (Daniel Hsu, Nikos Karampatziakis, John Langford and Alex J. Smola)
15. Parallel graph-based semi-supervised learning (Jeff Bilmes and Amarnag Subramanya)
16. Distributed transfer learning via cooperative matrix factorization (Evan Xiang, Nathan Liu and Qiang Yang)
17. Parallel large-scale feature selection (Jeremy Kubica, Sameer Singh and Daria Sorokina)

Part IV. Applications

18. Large-scale learning for vision with GPUs (Adam Coates, Rajat Raina and Andrew Y. Ng)
19. Large-scale FPGA-based convolutional networks (Clement Farabet, Yann LeCun, Koray Kavukcuoglu, Berin Martini, Polina Akselrod, Selcuk Talay and Eugenio Culurciello)
20. Mining tree structured data on multicore systems (Shirish Tatikonda and Srinivasan Parthasarathy)
21. Scalable parallelization of automatic speech recognition (Jike Chong, Ekaterina Gonina, Kisun You and Kurt Keutzer)

Product Details

  • ISBN13: 9781108461740
  • Format: Paperback
  • Number Of Pages: 491
  • ID: 9781108461740
  • Weight: 1000
  • ISBN10: 1108461743

Delivery Information

  • Saver Delivery: Yes
  • 1st Class Delivery: Yes
  • Courier Delivery: Yes
  • Store Delivery: Yes

Prices are for internet purchases only. Prices and availability in WHSmith stores may vary significantly.