My e-Notes about DataScience, Machine Learning, Python, Data Analytics, DataStage, DWH and ETL Concepts


Friday, 18 March 2016

Data Warehouse Glossary #1

Ad Hoc Query:

A database search that is designed to extract specific information from a database.  It is ad hoc if it is designed at the point of execution as opposed to being a “canned” report.  Most ad hoc query software uses the structured query language (SQL).


The process of summarizing or combining data.


A component of a data dictionary that describes and organizes the various aspects of a database such as its folders, dimensions, measures, prompts, functions, queries and other database objects.  It is used to create queries, reports, analyses and cubes.

Cross Tab:

A type of multi-dimensional report that displays values or measures in cells created by the intersection of two or more dimensions in a table format.


A data visualization method and workflow management tool that brings together useful information on a series of screens and/or web pages.  Some of the information that may be contained on a dashboard includes reports, web links, calendar, news, tasks, e-mail, etc.  When incorporated into a DSS or EIS key performance indicators may be represented as graphics that are linked to various hyperlinks, graphs, tables and other reports.  The dashboard draws its information from multiple sources applications, office products, databases, Internet, etc.


A multi-dimensional matrix of data that has multiple dimensions (independent variables) and measures (dependent variables) that are created by an Online Analytical Processing System (OLAP).  Each dimension may be organized into a hierarchy with multiple levels.  The intersection of two or more dimensional categories is referred to as a cell.

Data-based Knowledge:

Factual information used in the decision making process that is derived from data marts or warehouses using business intelligence tools.  Data warehousing organizes information into a format so that it represents an organizations knowledge with respect to a particular subject area, e.g. finance or clinical outcomes.

Data Cleansing:

The process of cleaning or removing errors, redundancies and inconsistencies in the data that is being imported into a data mart or data warehouse.  It is part of the quality assurance process.

Data Mart:

A database that is similar in structure to a data warehouse, but is typically smaller and is focused on a more limited area.  Multiple, integrated data marts are sometimes referred to as an Integrated Data Warehouse.  Data marts may be used in place of a larger data warehouse or in conjunction with it.  They are typically less expensive to develop and faster to deploy and are therefore becoming more popular with smaller organizations.

Data Migration:

The transfer of data from one platform to another.  This may include conversion from one language, file structure and/or operating environment to another.

Data Mining:

The process of researching data marts and data warehouses to detect specific patterns in the data sets.  Data mining may be performed on databases and multi-dimensional data cubes with ad hoc query tools and OLAP software.  The queries and reports are typically designed to answer specific questions to uncover trends or hidden relationships in the data.

Data Scrubbing:

See Data Cleansing

Data Transformation:

The modification of transaction data extracted from one or more data sources before it is loaded into the data mart or warehouse.  The modifications may include data cleansing, translation of data into a common format so that is can be aggregated and compared, summarizing the data, etc.

Data Warehouse:

An integrated, non-volatile database of historical information that is designed around specific content areas and is used to answer questions regarding an organizations operations and environment.

Database Management System:

The software that is used to create data warehouses and data marts.  For the purposes of data warehousing, they typically include relational database management systems and multi-dimensional database management systems.  Both types of database management systems create the database structures, store and retrieve the data and include various administrative functions.

Decision Support System (DSS):

A set of queries, reports, rule-based analyses, tables and charts that are designed to aid management with their decision-making responsibilities.  These functions are typically “wrapped around” a data mart or data warehouse.  The DSS tends to employ more detailed level data than an EIS.

Like the below page to get update

No comments:

Post a comment


The postings on this site are my own and don't necessarily represent IBM's or other companies positions, strategies or opinions. All content provided on this blog is for informational purposes and knowledge sharing only.
The owner of this blog makes no representations as to the accuracy or completeness of any information on this site or found by following any link on this site. The owner will not be liable for any errors or omissions in this information nor for the availability of this information. The owner will not be liable for any losses, injuries, or damages from the display or use of his information.