Data Mining: Practical Machine Learning Tools and Techniques, Fifth Edition
By James Foulds, Ph.D., Ian H. Witten, Eibe Frank, Mark A. Hall and Christopher J. Pal
Data Mining: Practical Machine Learning Tools and Techniques, Fifth Edition, offers a thorough grounding in machine learning concepts, along with practical advice on applying these tools and techniques in real-world data mining situations. This highly anticipated new edition of the most acclaimed work on data mining and machine learning teaches readers everything they need to know to get going, from preparing inputs, interpreting outputs, and evaluating results to the algorithmic methods at the heart of successful data mining approaches.
Extensive updates reflect the technical changes and modernizations that have taken place in the field since the last edition, including more recent deep learning content on topics such as generative AI (GANs, VAEs, diffusion models), large language models (transformers, BERT and GPT models), and adversarial examples, as well as a comprehensive treatment of ethical and responsible artificial intelligence topics. Authors Ian H. Witten, Eibe Frank, Mark A. Hall, and Christopher J. Pal, along with new author James R. Foulds, include today’s techniques coupled with the methods at the leading edge of contemporary research.
Key Features
- Provides a thorough grounding in machine learning concepts, as well as practical advice on applying the tools and techniques to data mining projects
- Presents concrete tips and techniques for performance improvement that work by transforming the input or output in machine learning methods
- Features in-depth information on deep learning and probabilistic models
- Covers performance improvement techniques, including input preprocessing and combining output from different methods
- Provides an appendix introducing the WEKA machine learning workbench and links to algorithm implementations in the software
- Includes all-new exercises for each chapter
About the author
By James Foulds, Ph.D., Department of Information Systems, University of Maryland Baltimore County, Baltimore, MD, USA; Ian H. Witten, Computer Science Department, University of Waikato, New Zealand; Eibe Frank, Computer Science Department, University of Waikato, New Zealand; Mark A. Hall, Computer Science Department, University of Waikato, New Zealand and Christopher J. Pal, Department of Computer Engineering and Software Engineering, Polytechnique Montréal, Quebec, Canada
PART I: INTRODUCTION TO DATA MINING
1. What’s it all about?
2. Input: concepts, instances, attributes
3. Output: knowledge representation
4. Algorithms: the basic methods
5. Credibility: evaluating what’s been learned
6. Preparation: data preprocessing and exploratory data analysis
7. Ethics: what are the impacts of what's been learned?
PART II: MORE ADVANCED MACHINE LEARNING SCHEMES
8. Ensemble learning
9. Extending instance-based and linear models
10. Deep learning: fundamentals
11. Advanced deep learning methods
12. Beyond supervised and unsupervised learning
13. Probabilistic methods: fundamentals
14. Advanced probabilistic methods
15. Moving on: applications and their consequences
Appendix
A. Theoretical foundations
B. The WEKA workbench
C. Implementation details of trees and rules
D. Technical details of deep learning
Reviews
Data Mining by Ian Witten is a masterfully written and highly accessible introduction to the world of machine learning and data analysis. The book stands out for its clear explanations, logical flow, and hands-on approach that makes complex concepts approachable for students and newcomers alike. Witten’s ability to bridge theory with practical application through the Java-based WEKA software is one of its greatest strengths. What makes this book particularly valuable is how it demystifies core data mining concepts — from decision trees and clustering to association rules and model evaluation — without overwhelming readers with heavy mathematics. The inclusion of real-world examples and WEKA exercises enables readers to experiment, explore, and truly understand how algorithms behave with real data. Whether used as a textbook in a data science or machine learning course, or as a self-study guide, this book remains one of the most student-friendly and pedagogically sound resources available. It not only teaches the techniques but also instills a deeper intuition for data-driven discovery. In short, Data Mining by Witten is a timeless classic — clear, practical, and empowering — an essential read for anyone beginning their journey into data science and machine learning.