In the Beginning, life was simple
But
Our information needs
Kept growing. (The Spider web)
Purpose
To explore and discuss the purpose and principles of data warehousing.

So What Is a Data Warehouse?
Definition: A data warehouse is the data repository of an enterprise. It is generally used for research and decision support.
By comparison: an OLTP (on-line transaction processor) or operational system is used to deal with the everyday running of one aspect of an enterprise.
OLTP systems are usually designed independently of each other and it is difficult for them to share information.
Why Do We Need Data Warehouses?
Consolidation of information resources
Improved query performance
Separate research and decision support functions from the operational systems
Foundation for data mining, data visualization, advanced reporting and OLAP tools
What Is a Data Warehouse Used for?
Knowledge discovery
Making consolidated reports
Finding relationships and correlations
Data mining
Examples
Banks identifying credit risks
Insurance companies searching for fraud
Medical research

Goals
Structure
Size
Performance optimization
Technologies used
Comparison Chart of Database Types
Design Differences
Star Schema
Supporting a Complete Solution
Data Warehouses, Data Marts, and Operational Data Stores
Data Warehouse -The queryable source of data in the enterprise. It is comprised of the union of all of its constituent data marts.
Data Mart -A logical subset of the complete data warehouse. Often viewed as a restriction of the data warehouse to a single business process or to a group of related business processes targeted toward a particular business group.
Operational Data Store (ODS) -A point of integration for operational systems that developed independent of each other. Since an ODS supports day to day operations, it needs to be continually updated.

Building a Data Warehouse
Analysis
Design
Import data
Install front-end tools
Test and deploy
Stage 1: Analysis
Identify:
Target Questions
Data needs
Timeliness of data
Granularity
Create an enterprise-level data dictionary
Dimensional analysis
Identify facts and dimensions
Stage 2: Design
Star schema
Data Transformation
Aggregates
Pre-calculated Values
HW/SW Architecture
Dimensional Modeling

Fact Table -The primary table in a dimensional model that is meant to contain measurements of the business.
Dimension Table -One of a set of companion tables to a fact table. Most dimension tables contain many textual attributes that are the basis for constraining and grouping within data warehouse queries.
Stage 3: Import Data
Identify data sources
Extract the needed data from existing systems to a data staging area
Transform and Clean the data
Resolve data type conflicts
Resolve naming and key conflicts
Remove, correct, or flag bad data
Conform Dimensions
Load the data into the warehouse
Importing Data Into the Warehouse
Stage 4: Install Front-end Tools
Reporting tools
Data mining tools
GIS
Etc.
Stage 5: Test and Deploy
Usability tests
Software installation
User training
Performance tweaking based on usage
Special Concerns
Time and expense
Managing the complexity
Update procedures and maintenance
Changes to source systems over time
Changes to data needs over time

Goals of the STORET Central Warehouse
Improved performance and faster data retrieval
Ability to produce larger reports
Ability to provide more data query options
Streamlined application navigation
Old Web Application Flow
Central Warehouse Application Flow
Web Application Demo
STORET Central Warehouse -Potential Future Enhancements
More query functionality
Additional report types
Web Services
Additional source systems?
Data Warehouse Components
Data Warehouse Components -Detailed

Categories: News