The Data Bonanza: Improving Knowledge Discovery in Science, Engineering, and Business (Wiley Series on Parallel and Distributed Computing)

By: Jano van Hemert (editor), David Snelling (editor), Peter Brezany (editor), Malcolm Atkinson (editor), Mark Parsons (editor), Oscar Corcho (editor), Rob Baxter (editor), Michelle Galea (editor)

Hardback

Usually despatched within 2 weeks

Description

This book presents the latest opportunities and challenges emerging in knowledge discovery, helping readers develop the technical skills to design and develop data-intensive methods and processes. Offering an introduction to current R&D efforts worldwide, the book includes examples and case studies with strategies for addressing a wide variety of data-intensive challenges. It also discusses the DISPEL language, its development, enactment, and applications, as well as data-intensive beacons of success, focusing on methods in astronomy, interactive interpretation of environmental data, and data-driven research in the humanities. A must-have resource for researchers in industry, governmental organizations, and academia.

About Author

MALCOLM ATKINSON, PhD, is Professor of e-Science in the School of Informatics at the University of Edinburgh in Scotland. He is also Data-Intensive Research Group leader, Director of the e-Science Institute, IT architect for the ADMIRE and VERCE EU projects, and UK e-Science Envoy. Professor Atkinson has led research projects for several decades and has served on many advisory bodies.

Contents

CONTRIBUTORS
FOREWORD
PREFACE
THE EDITORS

PART I STRATEGIES FOR SUCCESS IN THE DIGITAL-DATA REVOLUTION

1. The Digital-Data Challenge (Malcolm Atkinson and Mark Parsons)
  1.1 The Digital Revolution
  1.2 Changing How We Think and Behave
  1.3 Moving Adroitly in this Fast-Changing Field
  1.4 Digital-Data Challenges Exist Everywhere
  1.5 Changing How We Work
  1.6 Divide and Conquer Offers the Solution
  1.7 Engineering Data-to-Knowledge Highways

2. The Digital-Data Revolution (Malcolm Atkinson)
  2.1 Data, Information, and Knowledge
  2.2 Increasing Volumes and Diversity of Data
  2.3 Changing the Ways We Work with Data

3. The Data-Intensive Survival Guide (Malcolm Atkinson)
  3.1 Introduction: Challenges and Strategy
  3.2 Three Categories of Expert
  3.3 The Data-Intensive Architecture
  3.4 An Operational Data-Intensive System
  3.5 Introducing DISPEL
  3.6 A Simple DISPEL Example
  3.7 Supporting Data-Intensive Experts
  3.8 DISPEL in the Context of Contemporary Systems
  3.9 Datascopes
  3.10 Ramps for Incremental Engagement
  3.11 Readers Guide to the Rest of This Book

4. Data-Intensive Thinking with DISPEL (Malcolm Atkinson)
  4.1 Processing Elements
  4.2 Connections
  4.3 Data Streams and Structure
  4.4 Functions
  4.5 The Three-Level Type System
  4.6 Registry, Libraries, and Descriptions
  4.7 Achieving Data-Intensive Performance
  4.8 Reliability and Control
  4.9 The Data-to-Knowledge Highway

PART II DATA-INTENSIVE KNOWLEDGE DISCOVERY

5. Data-Intensive Analysis (Oscar Corcho and Jano van Hemert)
  5.1 Knowledge Discovery in Telco Inc.
  5.2 Understanding Customers to Prevent Churn
  5.3 Preventing Churn Across Multiple Companies
  5.4 Understanding Customers by Combining Heterogeneous Public and Private Data
  5.5 Conclusions

6. Problem Solving in Data-Intensive Knowledge Discovery (Oscar Corcho and Jano van Hemert)
  6.1 The Conventional Life Cycle of Knowledge Discovery
  6.2 Knowledge Discovery Over Heterogeneous Data Sources
  6.3 Knowledge Discovery from Private and Public, Structured and Nonstructured Data
  6.4 Conclusions

7. Data-Intensive Components and Usage Patterns (Oscar Corcho)
  7.1 Data Source Access and Transformation Components
  7.2 Data Integration Components
  7.3 Data Preparation and Processing Components
  7.4 Data-Mining Components
  7.5 Visualization and Knowledge Delivery Components

8. Sharing and Reuse in Knowledge Discovery (Oscar Corcho)
  8.1 Strategies for Sharing and Reuse
  8.2 Data Analysis Ontologies for Data Analysis Experts
  8.3 Generic Ontologies for Metadata Generation
  8.4 Domain Ontologies for Domain Experts
  8.5 Conclusions

PART III DATA-INTENSIVE ENGINEERING

9. Platforms for Data-Intensive Analysis (David Snelling)
  9.1 The Hourglass Reprise
  9.2 The Motivation for a Platform
  9.3 Realization

10. Definition of the DISPEL Language (Paul Martin and Gagarine Yaikhom)
  10.1 A Simple Example
  10.2 Processing Elements
  10.3 Data Streams
  10.4 Type System
  10.5 Registration
  10.6 Packaging
  10.7 Workflow Submission
  10.8 Examples of DISPEL
  10.9 Summary

11. DISPEL Development (Adrian Mouat and David Snelling)
  11.1 The Development Landscape
  11.2 Data-Intensive Workbenches
  11.3 Data-Intensive Component Libraries
  11.4 Summary

12. DISPEL Enactment (Chee Sun Liew, Amrey Krause, and David Snelling)
  12.1 Overview of DISPEL Enactment
  12.2 DISPEL Language Processing
  12.3 DISPEL Optimization
  12.4 DISPEL Deployment
  12.5 DISPEL Execution and Control

PART IV DATA-INTENSIVE APPLICATION EXPERIENCE

13. The Application Foundations of DISPEL (Rob Baxter)
  13.1 Characteristics of Data-Intensive Applications
  13.2 Evaluating Application Performance
  13.3 Reviewing the Data-Intensive Strategy

14. Analytical Platform for Customer Relationship Management (Maciej Jarka and Mark Parsons)
  14.1 Data Analysis in the Telecoms Business
  14.2 Analytical Customer Relationship Management
  14.3 Scenario 1: Churn Prediction
  14.4 Scenario 2: Cross Selling
  14.5 Exploiting the Models and Rules
  14.6 Summary: Lessons Learned

15. Environmental Risk Management (Ladislav Hluchy, Ondrej Habala, Viet Tran, and Branislav Simo)
  15.1 Environmental Modeling
  15.2 Cascading Simulation Models
  15.3 Environmental Data Sources and Their Management
  15.4 Scenario 1: ORAVA
  15.5 Scenario 2: RADAR
  15.6 Scenario 3: SVP
  15.7 New Technologies for Environmental Data Mining
  15.8 Summary: Lessons Learned

16. Analyzing Gene Expression Imaging Data in Developmental Biology (Liangxiu Han, Jano van Hemert, Ian Overton, Paolo Besana, and Richard Baldock)
  16.1 Understanding Biological Function
  16.2 Gene Image Annotation
  16.3 Automated Annotation of Gene Expression Images
  16.4 Exploitation and Future Work
  16.5 Summary

17. Data-Intensive Seismology: Research Horizons (Michelle Galea, Andreas Rietbrock, Alessandro Spinuso, and Luca Trani)
  17.1 Introduction
  17.2 Seismic Ambient Noise Processing
  17.3 Solution Implementation
  17.4 Evaluation
  17.5 Further Work
  17.6 Conclusions

PART V DATA-INTENSIVE BEACONS OF SUCCESS

18. Data-Intensive Methods in Astronomy (Thomas D. Kitching, Robert G. Mann, Laura E. Valkonen, Mark S. Holliman, Alastair Hume, and Keith T. Noddle)
  18.1 Introduction
  18.2 The Virtual Observatory
  18.3 Data-Intensive Photometric Classification of Quasars
  18.4 Probing the Dark Universe with Weak Gravitational Lensing
  18.5 Future Research Issues
  18.6 Conclusions

19. The World at One's Fingertips: Interactive Interpretation of Environmental Data (Jon Blower, Keith Haines, and Alastair Gemmell)
  19.1 Introduction
  19.2 The Current State of the Art
  19.3 The Technical Landscape
  19.4 Interactive Visualization
  19.5 From Visualization to Intercomparison
  19.6 Future Development: The Environmental Cloud
  19.7 Conclusions

20. Data-Driven Research in the Humanities: the DARIAH Research Infrastructure (Andreas Aschenbrenner, Tobias Blanke, Christiane Fritze, and Wolfgang Pempe)
  20.1 Introduction
  20.2 The Tradition of Digital Humanities
  20.3 Humanities Research Data
  20.4 Use Case
  20.5 Conclusion and Future Development

21. Analysis of Large and Complex Engineering and Transport Data (Jim Austin)
  21.1 Introduction
  21.2 Applications and Challenges
  21.3 The Methods Used
  21.4 Future Developments
  21.5 Conclusions
  References

22. Estimating Species Distributions Across Space, Through Time, and with Features of the Environment (Steve Kelling, Daniel Fink, Wesley Hochachka, Ken Rosenberg, Robert Cook, Theodoros Damoulas, Claudio Silva, and William Michener)
  22.1 Introduction
  22.2 Data Discovery, Access, and Synthesis
  22.3 Model Development
  22.4 Managing Computational Requirements
  22.5 Exploring and Visualizing Model Results
  22.6 Analysis Results
  22.7 Conclusion

PART VI THE DATA-INTENSIVE FUTURE

23. Data-Intensive Trends (Malcolm Atkinson and Paolo Besana)
  23.1 Reprise
  23.2 Data-Intensive Applications

24. Data-Rich Futures (Malcolm Atkinson)
  24.1 Future Data Infrastructure
  24.2 Future Data Economy
  24.3 Future Data Society and Professionalism
  References

Appendix A: Glossary (Michelle Galea and Malcolm Atkinson)
Appendix B: DISPEL Reference Manual (Paul Martin)
Appendix C: Component Definitions (Malcolm Atkinson and Chee Sun Liew)

INDEX

Product Details

  • ISBN13: 9781118398647
  • Format: Hardback
  • Number Of Pages: 576
  • ID: 9781118398647
  • weight: 1098
  • ISBN10: 1118398645

Delivery Information

  • Saver Delivery: Yes
  • 1st Class Delivery: Yes
  • Courier Delivery: Yes
  • Store Delivery: Yes

Prices are for internet purchases only. Prices and availability in WHSmith Stores may vary significantly
