Machine learning is one of the fastest-growing areas of computer science, with far-reaching applications. The aim of this textbook is to introduce machine learning, and the algorithmic paradigms it offers, in a principled way. The book provides a theoretical account of the fundamentals underlying machine learning and the mathematical derivations that transform these principles into practical algorithms. Following a presentation of the basics, the book covers a wide array of central topics unaddressed by previous textbooks. These include a discussion of the computational complexity of learning and the concepts of convexity and stability; important algorithmic paradigms including stochastic gradient descent, neural networks, and structured output learning; and emerging theoretical concepts such as the PAC-Bayes approach and compression-based bounds. Designed for advanced undergraduates or beginning graduate students, the text makes the fundamentals and algorithms of machine learning accessible to students and non-expert readers in statistics, computer science, mathematics, and engineering.
Shai Shalev-Shwartz is an Associate Professor at the School of Computer Science and Engineering at the Hebrew University of Jerusalem, Israel. Shai Ben-David is a Professor in the School of Computer Science at the University of Waterloo, Canada.
1. Introduction
Part I. Foundations: 2. A gentle start; 3. A formal learning model; 4. Learning via uniform convergence; 5. The bias-complexity trade-off; 6. The VC-dimension; 7. Non-uniform learnability; 8. The runtime of learning
Part II. From Theory to Algorithms: 9. Linear predictors; 10. Boosting; 11. Model selection and validation; 12. Convex learning problems; 13. Regularization and stability; 14. Stochastic gradient descent; 15. Support vector machines; 16. Kernel methods; 17. Multiclass, ranking, and complex prediction problems; 18. Decision trees; 19. Nearest neighbor; 20. Neural networks
Part III. Additional Learning Models: 21. Online learning; 22. Clustering; 23. Dimensionality reduction; 24. Generative models; 25. Feature selection and generation
Part IV. Advanced Theory: 26. Rademacher complexities; 27. Covering numbers; 28. Proof of the fundamental theorem of learning theory; 29. Multiclass learnability; 30. Compression bounds; 31. PAC-Bayes
Appendix A. Technical lemmas; Appendix B. Measure concentration; Appendix C. Linear algebra.