Sunday, February 26, 2023

Basics of Data Warehousing Concepts

 What is Data Warehousing?

A Data Warehousing (DW) is process for collecting and managing data from heterogenous sources to provide meaningful business insights. It acts has a Core of BI System on which is built for reporting and data analysis.

Data warehouse system is also known by the following name:

  • Decision Support System (DSS)
  • Executive Information System
  • Management Information System
  • Business Intelligence Solution
  • Analytic Application
  • Data Warehouse

Types of Data Warehouse

Three main types of Data Warehouses (DWH) are:

1. Enterprise Data Warehouse (EDW):

Enterprise Data Warehouse (EDW) is a centralized warehouse. It provides decision support service across the enterprise. It offers a unified approach for organizing and representing data. It also provide the ability to classify data according to the subject and give access according to those divisions.

2. Operational Data Store:

Operational Data Store, which is also called ODS, are nothing but data store required when neither Data warehouse nor OLTP systems support organizations reporting needs. In ODS, Data warehouse is refreshed in real time. Hence, it is widely preferred for routine activities like storing records of the Employees.

3. Data Mart:

A data mart is a subset of the data warehouse. It specially designed for a particular line of business, such as sales, finance, sales or finance. In an independent data mart, data can collect directly from sources.

Who needs Data warehouse?

DWH (Data warehouse) is needed for all types of users like:

  • Decision makers such as Data Analyst, Data Scientist who rely on mass amount of data
  • Users who use customized, complex processes to obtain information from multiple data sources.
  • It is also used by the people who want simple technology to access the data such as Data Engineer.
  • If the user wants fast performance on a huge amount of data which is a necessity for reports, grids or charts, then Data warehouse proves useful.
  • Data warehouse is a first step If you want to discover ‘hidden patterns’ of data-flows and groupings.


What Is a Data Warehouse Used For?

Here, are most common sectors where Data warehouse is used:

Airline:

In the Airline system, it is used for operation purpose like crew assignment, analyses of route profitability, frequent flyer program promotions, etc.

Banking:

It is widely used in the banking sector to manage the resources available on desk effectively. Few banks also used for the market research, performance analysis of the product and operations.

Healthcare:

Healthcare sector also used Data warehouse to strategize and predict outcomes, generate patient’s treatment reports, share data with tie-in insurance companies, medical aid services, etc.

Public sector:

In the public sector, data warehouse is used for intelligence gathering. It helps government agencies to maintain and analyze tax records, health policy records, for every individual.

Investment and Insurance sector:

In this sector, the warehouses are primarily used to analyze data patterns, customer trends, and to track market movements.

Retain chain:

In retail chains, Data warehouse is widely used for distribution and marketing. It also helps to track items, customer buying pattern, promotions and also used for determining pricing policy.

Telecommunication:

A data warehouse is used in this sector for product promotions, sales decisions and to make distribution decisions.

Hospitality Industry:

This Industry utilizes warehouse services to design as well as estimate their advertising and promotion campaigns where they want to target clients based on their feedback and travel patterns.


Why We Need Data Warehouse? 

PRO's of DWH:

  • Data warehouse allows business users to quickly access critical data from some sources all in one place.
  • Data warehouse provides consistent information on various cross-functional activities. It is also supporting ad-hoc reporting and query.
  • Data Warehouse helps to integrate many sources of data to reduce stress on the production system.
  • Data warehouse helps to reduce total turnaround time for analysis and reporting.
  • Restructuring and Integration make it easier for the user to use for reporting and analysis.
  • Data warehouse allows users to access critical data from the number of sources in a single place. Therefore, it saves user’s time of retrieving data from multiple sources.
  • Data warehouse stores a large amount of historical data. This helps users to analyze different time periods and trends to make future predictions.


CON's of DWH:

  • Not an ideal option for unstructured data.
  • Creation and Implementation of Data Warehouse is surely time confusing affair.
  • Data Warehouse can be outdated relatively quickly.
  • Difficult to make changes in data types and ranges, data source schema, indexes, and queries.
  • The data warehouse may seem easy, but actually, it is too complex for the average users.
  • Despite best efforts at project management, data warehousing project scope will always increase.
  • Sometime warehouse users will develop different business rules.
  • Organizations need to spend lots of their resources for training and Implementation purpose.

Characteristics of Data warehouse

Data Warehouse Concepts have following characteristics:


  • Subject-Oriented
  • Integrated
  • Time-variant
  • Non-volatile


Data Warehouse Architecture:


The Data Warehouse is based on an RDBMS server which is a central information repository that is surrounded by mainly 5 Components.






Source System: It mainly includes - Flat Files (FF), CSV, TSV, RDBMS, XML files, Messaging Queue, JSON files, COBOL files and so on.


ETL Tools: The ETL tools used to Extract, Transform, Load the data to Target systems.
The Various ETL tools are Power Center Informatica, IBM DATASTAGE, ABINITIO, Oracle and so on.

DWH Database: The Central repository database using which meaningful business insights are pulled. DHW DB are Teradata, Oracle, MS-SQL and so on.


Reporting Tools: The Various reporting tools are SAP BO, Power BI, Tableau and so on.


Query Tools: The primary objects of data warehousing are to provide information to businesses to make strategic decisions. Query tools allow users to interact with the data warehouse system.

These tools fall into four different categories:

  1. Query and reporting tools
  2. Application Development tools
  3. Data mining tools
  4. OLAP tools






 






No comments:

Post a Comment

Introduction to Database Management System

 A Database Management System (DBMS) is a collection of programs that enables user to create and maintain a database. The DBMS is a general-...