1.
Organizations need access to transactional systems to generate, analyze, and consult information. However, a company can run into problems with response times, and information spread across heterogeneous systems can make reports and results inflexible and complex.
Historical Justification
Today, information technologies have automated processes that are typically repetitive or administrative in nature, using what we call operational information systems. Operational applications are those that meet the operational needs of the company. In such systems, the most important concerns are data updating and response time. Once the most pressing operational needs are satisfied, a new set of requirements on the company's systems emerges, which we will call informational needs. Informational needs are those that aim to obtain the information required for strategic and tactical decision making. These requirements rely largely on the analysis of a huge amount of data, in which a highly detailed business value is as important as its aggregated total. The analysis of all the variables and environmental data is also crucial. These requirements are not, a priori, difficult to meet because the information is already in the operational systems. Any activity performed by the company is thoroughly reflected in its databases (Oracle, 2016).
The reality, however, is different, since meeting informational needs with operational systems runs into many problems. First, massive information queries (to obtain a ratio, a grouped value, or a set of requested values) can degrade the level of service of other systems, because such queries are usually quite expensive in resources. In addition, needs go unmet because of the limited flexibility to navigate the information and because of its inconsistency, which stems from the lack of a global vision (each particular view of the data is stored in the operational system that manages it).
In this situation, the first step toward meeting the informational requirements of organizations was to duplicate the operational environment in what is commonly called an Information Center, where the information is refreshed less frequently than in the operational environments and the users' service-level requirements are more relaxed. This strategy solves the resource planning problem: applications that require a high level of service use the operational environment, while those that require massive information queries work in the Information Center. The benefit of this separate environment is that it does not interfere with operational applications.
However, the information keeps the same structure as in the operational applications, so queries must access many places to obtain the desired data set, and the response time to requests for information is high. Additionally, because the information comes from different systems with different views and different objectives, it is often not possible to obtain the desired information easily, and the result lacks the necessary reliability.
For the user, these problems mean that the requested information is not available in time, and that more effort goes into obtaining the information than into analyzing it, which is where the user adds value.
The concept of the data warehouse emerged as a solution to the global informational needs of the company. The term was coined by Bill Inmon. However, if the data warehouse were only a data store, the problems would remain the same as in the Information Centers.
The main advantage of the data warehouse lies in its fundamental concept, the structure of the information: homogeneous and reliable information is stored in a structure designed for querying and processing, in an environment separate from the operational systems. As defined by Bill Inmon, the data warehouse is characterized by:
■ Integration: The data stored in the data warehouse must be integrated into a consistent structure; the existing inconsistencies between the various operational systems must be eliminated. The information is structured in different levels of detail to suit different user needs.
■ Theme: Only the necessary data for the process of generating business knowledge are integrated from the operational environment. The data is organized by subjects for easy access and understanding by end users. For example, all customer data can be consolidated into a single table of the data warehouse. In this way, requests for customer information will be easier to answer because all information resides in the same place.
■ History: Time is an implicit part of the information contained in a data warehouse. In operational systems, data always reflect the current state of business activity. In contrast, the information stored in the data warehouse serves, among other things, for trend analysis. Therefore, the data warehouse is loaded with the different values a variable takes over time to allow comparisons.
■ Non-volatile: The information stored in a data warehouse exists to be read, not modified. The information is therefore permanent: an update of the data warehouse adds the latest values of the variables it contains without altering what already existed.
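The History and Non-volatile characteristics can be illustrated with a minimal sketch in Python using the standard sqlite3 module. The table name customer_balance_history, its columns, and the sample values are hypothetical and chosen only for illustration; the point is that each load appends dated rows and never updates existing ones, so trends remain comparable over time.

```python
import sqlite3


def load_snapshot(conn, snapshot_date, balances):
    """Append one dated snapshot; existing rows are never updated or deleted."""
    conn.executemany(
        "INSERT INTO customer_balance_history (customer_id, snapshot_date, balance) "
        "VALUES (?, ?, ?)",
        [(customer_id, snapshot_date, balance) for customer_id, balance in balances.items()],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer_balance_history (
        customer_id   INTEGER NOT NULL,
        snapshot_date TEXT    NOT NULL,  -- ISO date: the time component of every row
        balance       REAL    NOT NULL,
        PRIMARY KEY (customer_id, snapshot_date)
    )
    """
)

# Two successive loads: the second adds new values without touching the first.
load_snapshot(conn, "2016-04-30", {1: 100.0, 2: 250.0})
load_snapshot(conn, "2016-05-31", {1: 140.0, 2: 200.0})

# Trend analysis: the values one variable took over time for customer 1.
trend = conn.execute(
    "SELECT snapshot_date, balance FROM customer_balance_history "
    "WHERE customer_id = ? ORDER BY snapshot_date",
    (1,),
).fetchall()
print(trend)
```

An operational system would typically overwrite the balance in place; here both values survive, which is what makes the comparison possible.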
Organizational decisions are based on analysis that is multidimensional in nature, yet organizations often try to support it with technology that is not oriented to this kind of analysis. Multidimensional analysis starts from a view of information as business dimensions. These dimensions are best understood through an example: in a records management system, the number of records could be analyzed through hierarchies along the dimensions of geographic area, file type, and time resolution.
Ten years ago it was possible to say how many products were sold in a specific state of the country. Today managers can say not only that, but also how many products were sold in a state, a county, a city, or a local supermarket on a specific day.
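The idea of answering the same question at several levels of a hierarchy can be sketched in a few lines of Python. The sales facts, the level names, and the roll_up helper below are all hypothetical; this is a simplified illustration of aggregation along a geographic and time hierarchy, not a description of any particular OLAP product.

```python
from collections import defaultdict

# Hypothetical sales facts: (state, county, city, store, day, units_sold)
SALES = [
    ("Texas", "Travis", "Austin", "Store A", "2016-05-01", 10),
    ("Texas", "Travis", "Austin", "Store B", "2016-05-01", 5),
    ("Texas", "Travis", "Austin", "Store A", "2016-05-02", 7),
    ("Texas", "Harris", "Houston", "Store C", "2016-05-01", 12),
]

# Hierarchy levels, from the coarsest to the most detailed.
LEVELS = ("state", "county", "city", "store", "day")


def roll_up(facts, level):
    """Total units sold, grouped by every dimension down to `level`."""
    depth = LEVELS.index(level) + 1
    totals = defaultdict(int)
    for *dims, units in facts:
        totals[tuple(dims[:depth])] += units
    return dict(totals)


by_state = roll_up(SALES, "state")  # the question that could already be answered
by_city = roll_up(SALES, "city")    # a more detailed view of the same facts
by_day = roll_up(SALES, "day")      # full detail: one store on one day
print(by_state)
print(by_city)
```

The same facts answer each question; only the grouping depth changes.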
Another feature of the data warehouse is that it contains data about the data, a concept known as metadata. Metadata records the origin of the information, its refresh frequency, its reliability, and the calculation method for the data in the warehouse. Metadata simplifies and automates the collection of information from operational and informational systems.
The objectives to be met by metadata are:
• To support the end user, helping them access the data warehouse in their own business language and indicating what information is available and what it means. Metadata helps to build queries, reports, and analyses through navigation tools.
• To support the technicians responsible for the data warehouse in the areas of auditing, historical information management, data warehouse administration, development of information extraction programs, and interface specification.
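As a rough illustration of these two objectives, the sketch below models a tiny metadata catalog in Python. The ColumnMetadata fields mirror the attributes mentioned above (origin, refresh frequency, reliability, calculation method); the column names, catalog entries, and helper functions are hypothetical and are not taken from any specific metadata tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMetadata:
    business_name: str       # the term end users know
    description: str         # what the information means
    source_system: str       # origin of the information
    refresh_frequency: str   # how often it is reloaded
    reliability: str         # qualitative confidence in the data
    calculation: str         # how the value is derived


# Hypothetical catalog keyed by warehouse column.
CATALOG = {
    "fact_sales.net_amount": ColumnMetadata(
        business_name="Net sales",
        description="Sales amount after discounts",
        source_system="order entry system",
        refresh_frequency="daily",
        reliability="high",
        calculation="gross_amount - discount",
    ),
    "dim_customer.segment": ColumnMetadata(
        business_name="Customer segment",
        description="Marketing classification of the customer",
        source_system="CRM",
        refresh_frequency="monthly",
        reliability="medium",
        calculation="assigned by marketing rules",
    ),
}


def find_by_business_term(catalog, term):
    """End-user support: locate warehouse columns using business language."""
    term = term.lower()
    return sorted(
        column for column, meta in catalog.items()
        if term in meta.business_name.lower()
    )


def lineage(catalog, column):
    """Technical support: report origin and refresh frequency for auditing."""
    meta = catalog[column]
    return f"{column} <- {meta.source_system} ({meta.refresh_frequency})"


print(find_by_business_term(CATALOG, "sales"))
print(lineage(CATALOG, "dim_customer.segment"))
```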
Building and using a data warehouse involves the following processes:
Extraction: The process consists of obtaining information from different internal and external sources.
Preparation: The process consists of the filtering, cleansing, standardization, and consolidation of information.
Load: The process consists of the organization and updating of data and metadata in the database.
Use: The process consists of the extraction and analysis of information at different levels of aggregation.
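The four processes above can be sketched end to end in Python with the standard sqlite3 module. Everything concrete here is an assumption made for illustration: the sample rows, the COUNTRY_CODES mapping, the fact_sales table, and the function names. A real pipeline would read from actual source systems and apply far richer cleansing rules.

```python
import sqlite3


def extract():
    """Extraction: gather raw rows from (hypothetical) internal and external sources."""
    internal = [
        {"customer": " Alice ", "country": "us", "amount": "10.50"},
        {"customer": "Bob", "country": "US", "amount": "n/a"},
    ]
    external = [{"customer": "alice", "country": "USA", "amount": "4.50"}]
    return internal + external


# Hypothetical standardization table for country values.
COUNTRY_CODES = {"us": "US", "usa": "US"}


def prepare(rows):
    """Preparation: filter invalid rows, cleanse names, standardize codes."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # filtering: drop rows whose amount cannot be parsed
        cleaned.append({
            "customer": row["customer"].strip().title(),
            "country": COUNTRY_CODES.get(row["country"].lower(), row["country"].upper()),
            "amount": amount,
        })
    return cleaned


def load(conn, rows):
    """Load: organize the prepared rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_sales (customer TEXT, country TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (:customer, :country, :amount)", rows
    )
    conn.commit()


def use(conn):
    """Use: analyze the loaded data at an aggregated level."""
    return conn.execute(
        "SELECT customer, country, SUM(amount) FROM fact_sales "
        "GROUP BY customer, country ORDER BY customer"
    ).fetchall()


conn = sqlite3.connect(":memory:")
load(conn, prepare(extract()))
report = use(conn)
print(report)
```

Note how preparation merges " Alice " and "alice" into one customer and drops the row with an unparseable amount, so the aggregate in the final step is computed over consistent data.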
The differences between a data warehouse and a relational database are summarized in Table 1.
One of the keys to success in building a data warehouse is to develop it gradually, selecting one user department as a pilot and progressively extending the data warehouse to other users. It is therefore important to choose as the initial or pilot user a department with few users, where the need for such a system is very high and where results can be obtained and measured in the short term.
Benefits of the relational database and data warehouse for the organization
■ They provide a tool for decision making in any functional area, based on integrated and comprehensive business information.
■ They facilitate the application of statistical analysis and modeling techniques to find hidden relationships among the data in the warehouse, obtaining added value for the business from that information.
■ They provide the ability to learn from past data and predict future situations in different scenarios.
■ They simplify the implementation of integrated customer relationship management systems within the company.
2.-3.
Database schema explanation:
The proposed database schema models a supply chain. The supply chain runs from the supply of raw materials to the delivery of the organization's final products. To be effective, the database must be adapted to the information that the company generates. The database has seven tables: categories, products, orders, suppliers, employees, customers, and shippers (Image 1).
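Since the schema image is not reproduced here, the following is only a sketch of how the seven tables might be declared, using Python's sqlite3 module. The table names come from the text; every column name and every foreign key is an assumption made for illustration. In particular, linking each order directly to a single product is a simplification, since the text lists no order line table.

```python
import sqlite3

# Table names are from the text; all columns and keys below are hypothetical.
SCHEMA = """
CREATE TABLE categories (
    category_id   INTEGER PRIMARY KEY,
    category_name TEXT NOT NULL
);
CREATE TABLE suppliers (
    supplier_id  INTEGER PRIMARY KEY,
    company_name TEXT NOT NULL
);
CREATE TABLE products (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT NOT NULL,
    category_id  INTEGER NOT NULL REFERENCES categories(category_id),
    supplier_id  INTEGER NOT NULL REFERENCES suppliers(supplier_id)
);
CREATE TABLE customers (
    customer_id  INTEGER PRIMARY KEY,
    company_name TEXT NOT NULL
);
CREATE TABLE employees (
    employee_id INTEGER PRIMARY KEY,
    full_name   TEXT NOT NULL
);
CREATE TABLE shippers (
    shipper_id   INTEGER PRIMARY KEY,
    company_name TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    order_date  TEXT    NOT NULL,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    employee_id INTEGER NOT NULL REFERENCES employees(employee_id),
    shipper_id  INTEGER NOT NULL REFERENCES shippers(shipper_id),
    product_id  INTEGER NOT NULL REFERENCES products(product_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# List the tables that were created, in alphabetical order.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(tables)
```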
Image 1: Database schema
4.
The process starts with a customer's requirement for one of the company's products (Image 2). The customer generates a production order according to the category of the product. Production requires human resources (employees) and raw materials (supplies). Once the product is ready, it is delivered to the customer through a shipping provider (shippers).
Image 2: Entity-Relationship diagram
5. Data Flow Diagram
The data flow diagram reflects the flow of information in the organization's database. The process starts with the generation of an order by the customer and ends with the delivery of the products from the company to the customer (Image 3).
Image 3: Data Flow Diagram
6. Data flow of the data warehouse
Image 4: Data warehouse data flow
7. Project Plan
Reference List
Oracle. (2016). Data warehouse. Retrieved 14 May 2016, from Oracle: https://www.oracle.com/database/data-warehouse/index.html
Tech Target. (2015). Data warehouse concepts: Data flow and sending data to source systems. Retrieved 14 May 2016, from Tech Target: http://searchdatamanagement.techtarget.com/answer/Data-warehouse-concepts-Data-flow-and-sending-data-to-source-systems
Tech Target. (2015). Data warehouse definition. Retrieved 14 May 2016, from Tech Target: http://searchsqlserver.techtarget.com/definition/data-warehouse
Visual Paradigm. (2015). What is a data flow diagram (DFD)? Retrieved 14 May 2016, from Visual Paradigm: https://www.visual-paradigm.com/tutorials/data-flow-diagram-dfd.jsp