Building and Maintaining a Data Warehouse

Building and Maintaining a Data Warehouse

By: Fon Silvers (author)Hardback

1 - 2 weeks availability

Description

As it is with building a house, most of the work necessary to build a data warehouse is neither visible nor obvious when looking at the completed product. While it may be easy to plan for a data warehouse that incorporates all the right concepts, taking the steps needed to create a warehouse that is as functional and user-friendly as it is theoretically sound, is not especially easy. That's the challenge that Building and Maintaininga Data Warehouse answers. Based on a foundation of industry-accepted principles, this work provides an easy-to-follow approach that is cohesive and holistic. By offering the perspective of a successful data warehouse, as well as that of a failed one, this workdetails those factors that must be accomplished and those that are best avoided. Organized to logically progress from more general to specific information, this valuable guide: * Presents areas of a data warehouse individually and in sequence, showing how each piece becomes a working part of the whole * Examines the concepts and principles that are at the foundation of every successful data warehouse * Explains how to recognize and attend to problematic gaps in an established data warehouse * Provides the big picture perspective that planners and executives require Those considering the planning and creation of a data warehouse, as well as those who've already built one will profit greatly from the insights garnered by the author during his years of creating and gathering information on state-of-the-art data warehouses that are accessible, convenient, and reliable.

Create a review

Contents

The Big Picture: An Introduction to Data Warehousing Decision Support Systems; Dimensional and Third Normal Form Data Models; Storing the Data; Data Availability; Monitoring Data Quality. Data Warehouse Philosophy Enterprise Data; Subject Orientation; Data Integration; Form; Function; Grain; Nonvolatility; Time Variant; One Version of the Truth; Long-Term Investment. Source System Analysis Source System Analysis Principles; System of Record; Entity Data Arithmetic Data: Absolute, Relative, Numeric Data That Isn't Arithmetic; Alphanumeric Data; Granularity; Latency; Transaction Data; Snapshot Data; Source System Analysis Methods; Data Profile; Data Flow Diagram; Data State Diagram; System of Record; Business Rules. Relational Database Management System (RDBMS) Relational Set Theory; RDBMS Product Offerings; Residual Costs; Licensing; Support and Maintenance; Extensibility; Connective Capacity. Database Design Data Modeling Methodology; Conceptual and Logical Data Models; Logical (Primary) Key; Attribute; Primary Key/Foreign Key Relation; Cardinality; Super Types and Subtypes; Physical Data Model; Dimensional Data Model; Third Normal Form Data Model; Recursive Data Model; Physical Data Model Summary; Data Architecture; Enterprise Data Warehouse; Data Mart; Operational Data Store; Summaries and Aggregates. Data Acquisition and Integration Source System and Target System Analysis; Direct and Indirect Requirements; Language; Data Profile; Data State; Data Mapping; Business Rules; Architecture; Extract, Transform, and Load (ETL); Extract, Load, and Transform (ELT); ETL Design and Process Principles. Eleven Principles; Staging Principles Conclusion; ETL Functions; Extract Data from a Contiguous Dataset and from a Data Flow; Row-Level and Dataset-Level Transformation. Surrogate Key Generation: Intradataset, Data Warehouse-Level Transformation, Intra-Data Warehouse, Changed Data Capture, ETL Key, Universe to Universe and Candidate to Universe, Load Data from a Stable and Contiguous Dataset, Load Data from a Data Flow. Transaction Summary; Dimension Aggregation. Common Problems: Source Data Anomalies, Incomplete, Redundant, and Misstated Source Data; Business Rule Changes, Obsolete, Redefined, and Unrecorded Data. Business Intelligence (BI) Reporting Success Factors; Performance; User Interface; Presentation of the Data Architecture; Alignment with the Data Model; Ability to Answer Questions; Mobility; Flexibility; Availability; Customer Success Factors. Processes: Proactive, Reactive, Predefined, Ad Hoc; Data, Information, and Analytic Needs; BI Reporting Application and Architecture. BI Reporting Methods: Predefined and Interactive Reports, Online Analytical Process (OLAP) Reports, MOLAP, ROLAP, HOLAP; Drilling; Push versus Pull; Printed on Paper; Report Archives; Web-Based BI Reporting; Operational BI Reporting: From an ODS, From an Operational System (Real-Time), EDI, Partnerships, and Data Sharing. BI Reporting: Customer Relationship Management (CRM), Business Metrics Measure the Enterprise, Decisions and Decision Making Closer to the Action; Reporting around the Event; BI Search; Sarbanes-Oxley and BI Reporting; Data Mining; Statistics Concepts; Random Error; Statistical Significance. Variables: Dependent and Independent. Hypothesis; Data Mining Tools and Activities; Data Cleansing; Data Inspection; Compound, Lag, Numeric, and Categorical Variables; Hypothesis; Data Mining Algorithms; Neural Network; Decision Tree; CHAID; Nearest Neighbor; Rule Induction; Genetic Algorithm; Rule Validation and Testing; Overfitting. Data Quality Deming's Definition of Quality; Data Quality Service Level Agreement (SLA); Deming's Statistical Process Control; Process Measurement; Methods and Strategies; Data Stewardship; Post-Load Audit and Report Errant Data. Plug in a Default Value and Report Errant Data; Reject a Record and Report the Errant Record; Reject a Dataset and Report the Errant Dataset. Recycle the Data: In Place and Report Errant Data, Recycle Wheel and Report Errant Data, Data Quality Repository; Data Quality Fact Table: Dimensional Data Model, Third Normal Form Data Model; Data Quality Reporting. Metadata Types of Metadata; Static and Dynamic Metadata; Metadata Service Level Agreement (SLA); Metadata Repository; Central Metadata Repository: Dimensional Data Model; Third Normal Form; Distributed Metadata Repository; Real-Time Metadata; Data Quality as Metadata; Make or Buy a Metadata Repository. Data Warehouse Customers Strategic Decision Makers; Tactical Decision Makers; Knowledge Workers; Operational Applications; External Partners; Electronic Data Interchange (EDI) Partners; Data Warehouse Plan. Future of Data Warehousing: An Epilogue Scalability and Performance; Real-Time Data Warehousing; Increased Corporate Presence; Back to the Basics; Data Quality. Short TOC The Big Picture: An Introduction to Data Warehousing Data Warehouse Philosophy Source System Analysis Relational Database Management System (RDBMS) Database Design Data Acquisition and Integration Business Intelligence Reporting Data Quality Metadata Data Warehouse Customers Future of Data Warehousing: An Epilogue Bibliography Index Toc to post to abstract Preface Acknowledgments The Author Introduction The Big Picture: An Introduction to Data Warehousing Introduction Decision Support Systems Dimensional and Third Normal Form Data Models Storing the Data Data Availability Monitoring Data Quality Data Warehouse Philosophy Introduction Enterprise Data Subject Orientation Data Integration Form Function Grain Nonvolatility Time Variant One Version of the Truth Long-Term Investment References Source System Analysis Introduction Source System Analysis Principles System of Record Entity Data Arithmetic Data Absolute Arithmetic Data Relative Arithmetic Data Numeric Data That Isn't Arithmetic Alphanumeric Data Granularity Latency Transaction Data Snapshot Data Source System Analysis Methods Data Profile Data Flow Diagram Data State Diagram System of Record Business Rules Closing Remarks References Relational Database Management System (RDBMS) Introduction Relational Set Theory RDBMS Product Offerings Residual Costs Licensing Support and Maintenance Extensibility Connective Capacity Closing Remarks References Database Design Introduction Data Modeling Methodology Conceptual Data Model Logical Data Model Logical (Primary) Key Attribute Primary Key/Foreign Key Relation Cardinality Super Types and Subtypes Putting It All Together Physical Data Model Dimensional Data Model Third Normal Form Data Model Recursive Data Model Physical Data Model Summary Data Architecture Enterprise Data Warehouse Data Mart Operational Data Store Summaries and Aggregates Closing Remarks References Data Acquisition and Integration Introduction Source System Analysis Target System Analysis Direct Requirements Indirect Requirements Direct and Indirect Requirements Language Data Profile Data State Data Mapping Business Rules Architecture Extract, Transform, and Load (ETL) Extract, Load, and Transform (ELT) ETL Design Principles ETL Process Principles Principle 01: One Thing at a Time Principle 02: Know When to Begin Principle 03: Know When to End Principle 04: Large to Medium to Small Principle 05: Stage Data Integrity Principle 06: Know What You Have Process Principles Conclusion ETL Staging Principles Principle 07: Name the Data Principle 08: Own the Data Principle 09: Build the Data Principle 10: Type the Data Principle 11: Land the Data Staging Principles Conclusion ETL Functions Extract Data from a Contiguous Dataset Extract Data from a Data Flow Row-Level Transformation Dataset-Level Transformation Surrogate Key Generation: Intradataset Data Warehouse-Level Transformation Surrogate Key Generation: Intra-Data Warehouse Look-Up Changed Data Capture ETL Key Universe to Universe and Candidate to Universe Load Data from a Stable and Contiguous Dataset Load Data from a Data Flow Transaction Summary Dimension Aggregation Common Problems Source Data Anomalies Incomplete Source Data Redundant Source Data Misstated Source Data Business Rule Changes Obsolete Data Redefined Data Unrecorded Data Closing Remarks References Business Intelligence Reporting Introduction BI Reporting Success Factors Performance User Interface Presentation of the Data Architecture Alignment with the Data Model Ability to Answer Questions Mobility Flexibility Availability BI Customer Success Factors Proactive Processes Reactive Processes Predefined Processes Ad Hoc Processes Data Needs. Information Needs Analytic Needs BI Reporting Application Architecture BI Reporting Methods. Predefined Reports Interactive Reports Online Analytical Process (OLAP) Reports MOLAP ROLAP HOLAP Drilling Push versus Pull Push Pull Printed on Paper Report Archives Web-Based BI Reporting Operational BI Reporting: From an ODS Operational BI Reporting: From an Operational System (Real-Time) Operational BI Reporting: EDI, Partnerships, and Data Sharing. BI Reporting: Thus Far. Customer Relationship Management (CRM) Business Metrics Measure the Enterprise Decisions and Decision Making Closer to the Action BI Reporting: Coming Soon Reporting around the Event BI Search Sarbanes-Oxley and BI Reporting Data Mining Statistics Concepts Random Error Statistical Significance Variables: Dependent and Independent Hypothesis Data Mining Tools Data Mining Activities Data Cleansing Data Inspection Compound Variables Lag Variables Numeric Variables Categorical Variables Hypothesis Data Mining Algorithms Neural Network Decision Tree CHAID Nearest Neighbor Rule Induction Genetic Algorithm Rule Validation and Testing Overfitting Closing Remarks References Data Quality Introduction Deming's Definition of Quality Data Quality Service Level Agreement (SLA) Deming's Statistical Process Control Process Measurement Methods and Strategies Data Stewardship Post-Load Audit and Report Errant Data Plug in a Default Value and Report Errant Data Reject a Record and Report the Errant Record Reject a Dataset and Report the Errant Dataset Recycle the Data: In Place and Report Errant Data Recycle the Data: Recycle Wheel and Report Errant Data Data Quality Repository Data Quality Fact Table: Dimensional Data Model Data Quality Fact Table: Third Normal Form Data Model Data Quality Reporting Follow Through Closing Remarks References Metadata Introduction Types of Metadata Static Metadata Dynamic Metadata Metadata Service Level Agreement (SLA) Metadata Repository Central Metadata Repository: Dimensional Data Model Central Metadata Repository: Third Normal Form Distributed Metadata Repository Real-Time Metadata Data Quality as Metadata Make or Buy a Metadata Repository Closing Remarks References Data Warehouse Customers Introduction Strategic Decision Makers Tactical Decision Makers Knowledge Workers Operational Applications External Partners Electronic Data Interchange (EDI) Partners Data Warehouse Plan Strategic Decision Makers Tactical Decision Makers Knowledge Workers Operational Applications External Partners Electronic Data Interchange (EDI) Partners Closing Remarks Future of Data Warehousing: An Epilogue Introduction Scalability and Performance Real-Time Data Warehousing Increased Corporate Presence Back to the Basics Data Quality Bibliography Index

Product Details

  • publication date: 10/04/2008
  • ISBN13: 9781420064629
  • Format: Hardback
  • Number Of Pages: 328
  • ID: 9781420064629
  • weight: 612
  • ISBN10: 1420064622

Delivery Information

  • Saver Delivery: Yes
  • 1st Class Delivery: Yes
  • Courier Delivery: Yes
  • Store Delivery: Yes

Prices are for internet purchases only. Prices and availability in WHSmith Stores may vary significantly

Close