Mastering Data Warehouse Aggregates: Solutions for Star Schema Performance

ISBN 9780471777090

Author: Christopher Adamson

Publisher: Wiley

Availability: 3-6 weeks

Price: 318.15 zł

Please contact us by email to confirm the price before placing an order.

Additional information
Book description

ISBN13:	9780471777090
ISBN10:	0471777099
Author:	Christopher Adamson
Binding:	Paperback
Publication date:	2006-07-14
Number of pages:	384
Dimensions:	235x192
Subjects:	UH

The first book to offer in-depth coverage of star schema aggregate tables
Dubbed by Ralph Kimball as the most effective technique for maximizing star schema performance, dimensional aggregates are a powerful and efficient tool that can accelerate data warehouse queries more dramatically than any other technology. After you ensure that a database is properly designed, configured, and tuned, any measures you take to address data warehouse performance should begin with aggregates.
Yet, many businesses ignore aggregates, instead turning to specialized, proprietary hardware and software products to solve performance problems. This book fills the knowledge gap that has led businesses on this expensive and risky path.
Data warehouse expert Chris Adamson shows how a well-planned set of aggregates can have an extraordinary effect on the overall throughput of your data warehouse. Regardless of your role or current level of star schema expertise, the best practices in this book will help you achieve astounding performance increases, while avoiding common pitfalls.
From star schema basics through advanced aggregation techniques, this book covers the impact of aggregate tables on the entire data warehouse lifecycle. After establishing a few fundamentals, including the star schema approach to data warehouse design, chapters are dedicated to major phases of the data warehouse lifecycle. Topics include:
Fundamental principles of aggregate schema design
How to use aggregates in a production environment, with or without an aggregate navigator
Integration of aggregate processing into the ETL process
Standard tasks and deliverables for incorporating aggregates into data warehouse development projects
How to organize and execute a project that adds aggregate capability to an existing star schema
The impact of advanced schema design techniques such as bridge tables, heterogeneous stars, or snapshot models on aggregation
Special considerations when implementing dimensional aggregates using Oracle's materialized views or IBM's materialized query tables
How aggregates can add value in other areas, including database security and the archive strategy
Wiley Technology Publishing Timely. Practical. Reliable.
Visit our Web site at www.wiley.com/compbooks/

Table of contents:
Foreword.
Acknowledgments.
Introduction.
Chapter 1 Fundamentals of Aggregates.
Star Schema Basics.
Operational Systems and the Data Warehouse.
Operational Systems.
Data Warehouse Systems.
Facts and Dimensions.
The Star Schema.
Dimension Tables and Surrogate Keys.
Fact Tables and Grain.
Using the Star Schema.
Multiple Stars and Conformance.
Data Warehouse Architecture.
Invisible Aggregates.
Improving Performance.
The Base Schema and the Aggregate Schema.
The Aggregate Navigator.
Principles of Aggregation.
Providing the Same Results.
The Same Facts and Dimension Attributes as the Base Schema.
Other Types of Summarization.
Pre-Joined Aggregates.
Derived Tables.
Tables with New Facts.
Summary.
Chapter 2 Choosing Aggregates.
What Is a Potential Aggregate?
Aggregate Fact Tables: A Question of Grain.
Aggregate Dimensions Must Conform.
Pre-Joined Aggregates Have Grain Too.
Enumerating Potential Aggregates.
Identifying Potentially Useful Aggregates.
Drawing on Initial Design.
Design Decisions.
Listening to Users.
Where Subject Areas Meet.
The Conformance Bus.
Aggregates for Drilling Across.
Query Patterns of an Existing System.
Analyzing Reports for Potential Aggregates.
Choosing Which Reports to Analyze.
Assessing the Value of Potential Aggregates.
Number of Aggregates.
Presence of an Aggregate Navigator.
Space Consumed by Aggregate Tables.
How Many Rows Are Summarized.
Examining the Number of Rows Summarized.
The Cardinality Trap and Sparsity.
Who Will Benefit from the Aggregate.
Summary.
Chapter 3 Designing Aggregates.
The Base Schema.
Identification of Grain.
When Grain Is Forgotten.
Grain and Aggregates.
Conformance Bus.
Rollup Dimensions.
Aggregation Points.
Natural Keys.
Source Mapping.
Slow Change Processing.
Hierarchies.
Housekeeping Columns.
Design Principles for the Aggregate Schema.
A Separate Star for Each Aggregation.
Single Schema and the Level Field.
Drawbacks to the Single Schema Approach.
Advantages of Separate Tables.
Pre-Joined Aggregates.
Naming Conventions.
Naming the Attributes.
Naming Aggregate Tables.
Aggregate Dimension Design.
Attributes of Aggregate Dimensions.
Sourcing Aggregate Dimensions.
Shared Dimensions.
Aggregate Fact Table Design.
Aggregate Facts: Names and Data Types.
No New Facts, Including Counts.
Degenerate Dimensions.
Audit Dimension.
Sourcing Aggregate Fact Tables.
Pre-Joined Aggregate Design.
Documenting the Aggregate Schema.
Identify Schema Families.
Identify Dimensional Conformance.
Documenting Aggregate Dimension Tables.
Documenting Aggregate Fact Tables.
Pre-Joined Aggregates.
Materialized Views and Materialized Query Tables.
Summary.
Chapter 4 Using Aggregates.
Which Tables to Use?
The Schema Design.
Relative Size.
Aggregate Portfolio and Availability.
Requirements for the Aggregate Navigator.
Why an Aggregate Navigator?
Two Views and Query Rewrite.
Dynamic Availability.
Multiple Front Ends.
Multiple Back Ends.
Evaluating Aggregate Navigators.
Front-End Aggregate Navigators.
Approach.
Pros and Cons.
Back-End Aggregate Navigation.
Approach.
Pros and Cons.
Performance Add-On Technologies and OLAP.
Approach.
Pros and Cons.
Specific Solutions.
Living with Materialized Views.
Using Materialized Views.
Materialized Views as Pre-Joined Aggregates.
Materialized Views as Aggregate Fact Tables (Without Aggregate Dimensions).
Materialized Views and Aggregate Dimension Tables.
Additional Considerations.
Living with Materialized Query Tables.
Using Materialized Query Tables.
Materialized Query Tables as Pre-Joined Aggregates.
Materialized Query Tables as Aggregate Fact Tables (Without Aggregate Dimensions).
Materialized Query Tables and Aggregate Dimension Tables.
Additional Considerations.
Working Without an Aggregate Navigator.
Human Decisions.
Maintaining the Aggregate Portfolio.
Impact on the ETL Process.
Summary.
Chapter 5 ETL Part 1: Incorporating Aggregates.
The Load Process.
The Importance of the Load.
Tools of the Load.
Incremental Loads and Changed Data Identification.
The Top-Level Process.
Loading the Base Star Schema.
Loading Dimension Tables.
Attributes of the Dimension Table.
Requirements for the Dimension Load Process.
Extracting and Preparing the Record.
Process New Records.
Process Type 1 Changes.
Process Type 2 Changes.
Loading Fact Tables.
Requirements for the Fact Table Load Process.
Acquire Data and Assemble Facts.
Identification of Surrogate Keys.
Putting It All Together.
Loading the Aggregate Schema.
Loading Aggregates Separately from Base Schema Tables.
Invalid Aggregates.
Load Frequency.
Taking Aggregates Off-Line.
Off-Line Load Processes.
Materialized Views and Materialized Query Tables.
Drop and Rebuild Versus Incremental Load.
Drop and Rebuild.
Incremental Loading of Aggregates.
Real-Time Loads.
Real-Time Load of the Base Schema.
Real-Time Load and Aggregate Tables.
Partitioning the Schema.
Summary.
Chapter 6 ETL Part 2: Loading Aggregates.
The Source Data for Aggregate Tables.
Changed Dat