Data warehousing holds great promise. Technology
planners rightly assert that it supports better customer relationships
and decision making. But such goals can be achieved only where the
application is appropriate and is executed skillfully.
Companies are feeling the pressure to cut expenses
and increase profits. In today's business environment, the advantage
goes to those companies that can respond to rapidly changing demands.
Facilitating the timely access of corporate information to the employees
will allow companies the competitive edge.
Today, the most successful way to provide valuable
information to users is by modeling operational data into a format that
is easy for end users to access and analyze. This approach is referred
to as "Data Warehousing".
A data warehouse is a central repository for all, or
significant parts of the data that an enterprise of various computer
systems collection. Typically, a data warehouse is housed on an
enterprise's server, but it can reside with a storage service provider
also. Data from various online transaction processing applications and
other sources are selectively extracted and organized on the data
ELEMENTS OF DATA WAREHOUSING
The major elements of data warehousing are data
acquisition, data modeling and schema, metadata and data management.
Data acquisition involves identifying, capturing, and
transforming data in operational systems so the data can be loaded into
a data warehouse. This transformation uses application tools that allow
data warehouse administrators to apply business rules for integrating
data from source systems.
Data modeling is the analysis of data objects used in
a business or other context (e.g. a product, a product price, or a
product sale) and expressions of the relationships between each of these
A schema, in computer programming, is the
organization or structure for a database. The activity of data modeling
leads to a schema.
Metadata is a definition or description of data,
commonly known as "data about data." It simplifies the process
of managing all the processes involved in creating and sustaining a data
Data management includes the access and storage
mechanisms that support the data warehouse. This is usually a
relational, multidimensional, or other specialized database designed to
facilitate complex queries.
While data warehousing is a simple, logical concept,
technically it is very complex and challenging. A data warehouse cannot
be purchased as a completed application; it must be built and assembled
differently to meet each organization's unique requirements.
JUSTIFYING THE DATA WAREHOUSE
Implementing a data warehousing solution can provide
organizations a competitive advantage. It enables them to leverage
information residing within their databases to respond quickly to
changes in the marketplace and make better business decisions.
ENHANCED CUSTOMER SERVICE:
The competitive nature of current business environments forces
enhancements in providing improved customer service near the top of
every corporation's list of goals.
Improvements in this area translate directly into
additional business and cost savings.
BETTER BUSINESS INTELLIGENCE:
Information related to organization's financial conditions, customers'
vital data and product information etc. already exists within the
organization. A data warehouse gives this ability to pull together
information from multiple operational systems and combine that with
information that may be acquired from outside sources.
SUPPORT BUSINESS PROCESS RE-ENGINEERING:
It provides business users unlimited analysis of their business
information and often provides insights into the work processes. This
will yield breakthrough ideas for the re-engineering of those business
A RETURN ON INVESTMENT:
In the beginning, like any other technologies, the business benefits
that a data warehouse can bring are intangible; resulting unable to
convince executive to get financial approval for data warehousing
A study by International Data Corporation revealed
that data warehousing generated an average three-year return on
investment (ROI) of 401%. Over 90% of the organizations included in the
analysis reported return on investment returns over 40%.
Data warehousing is a technology that is being
embraced by all major industries which include DFIs, insurance,
manufacturing, retail, healthcare, telecommunications etc. Companies are
setting up data warehouses for finance, budgeting, sales &
marketing, customer information systems, product/inventory analysis,
human resources etc. The trend is moving away from centralized
mainframe-oriented data warehouses toward departmental server-based
solutions. In some applications single subject-oriented data warehouses
(data marts) are being implemented that can provide immediate value to
targeted business users.
Trends also indicate the acceptance of data
warehouse/data mart solutions that are integrated with application
package solutions. A number of strategic relationships are in place
between vendors to provide partial or complete data warehouse
environments for popular manufacturing, human resources and financial
BUILDING A DATA WAREHOUSE
Steps needed to be identified and integrated to
create a data warehouse solution are:
IDENTIFYING SOURCE DATA:
The first step in building a data warehouse is to identify the sources,
format and location of data required to meet the needs of the business
users. Most organizations have data in multiple locations and formats.
The challenge for the data warehouse is not only to collect the data but
to make the data from many different sources consistent and usable.
TRANSFORMATION AND CLEANSING:
Data that is extracted from existing systems or acquired from outside
sources must be examined for consistency and quality before it is loaded
into the data warehouse.
REQUIREMENT FOR DATA WAREHOUSE SERVER
Data warehouses of more than a terabyte (1000 GB) are becoming more and
more common. So, it is essential that the database server not only meet
users' performance expectations during analysis but must be capable of
handling complex analysis against huge amounts of data.
database server and hardware must be capable of storing entire data
warehouse, including both relational and multidimensional data.
Normally, data warehouse implementation starts out as a departmental
solution and evolves over time into an enterprise-wide warehouse. It is
important that the server should be scalable to handle the requirements
of an ever growing amount of data in the data warehouse.
The database server must be a proven as a cost
effective platform. All the basic elements, hardware, operating system,
database and communications sources should work together to provide an
integrated, problem-free data warehouse solution.
In order to achieve optimum performance level of
storing and analyzing fairly large amount of data and to meet users'
expectations, the database engine must exploit the hardware power to it
maximum level. To achieve such high performance, parallel input/output
(I/O) processing would be suggested; which means that a single user can
submit a query operation against the database and multiple I/O processor
tasks are initiated internally to process the request. This capability
can result up to a 700% improvement in I/O-bound query performance.
This results in significant savings when comparing
costs of ownership for several data warehousing platforms.
WHERE TO START?
The very core of a successful data warehouse is to
have a clear understanding of a business problem and the user
requirements for information analysis. It is strongly recommended that
the team assembled to create a data warehouse must be comprised of IT
professionals and business executives. Projects must have a clearly
defined scope for managing economic and operational limitations. The
process will be highly time-bound as IT experts and end users must work
together towards a reasonable aggregation level for data in the
The first data warehouse should cover a single
subject area and be delivered at a relatively low cost. To minimize the
risk, the targeted platform should have developed some IT
infrastructure. The level of complexity involved in successfully
designing and implementing a data warehouse must not be underestimated.
Time must be spent to acquire and develop additional skills for data
warehousing developers and users.
Data Warehousing is a complex field, with many
vendors striving for market awareness.
The complexity of the technology the interactions
between the various tools and the high price points for the products
require companies to perform careful technology evaluation before
embarking on a warehousing project. However, the potential for enormous
returns on investment and competitive advantage make data warehousing
difficult to ignore.
The author is Head of Department, Computer Science
& IT, Beaconhouse-Informatics.