Businesses use the data at their disposal to gain a strategic competitive edge. This calls for database systems that transform raw data into meaningful information. Many organizations rely on relational databases despite the enormous volumes and varied sources of data to be analyzed. Acting as CIO of Celtic Analytics Ltd, I intend to design a database that collects data through the following mechanisms:
- Web Analytics
- Operational analysis.
Data warehousing is the process of storing a company’s entire data in one large repository that provides seamless integration and guides decision making. In an operational database, the data needed for day-to-day business operations change continuously, and tables are continually refreshed by removing old data. In a warehouse, by contrast, old data remain available, and their accumulation leads to volumes in the terabyte range. This distinct feature has enabled many organizations to use the massive storage capacity of warehouses to realize remarkable benefits in the way they handle their data.
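The contrast between an operational table that is purged and a warehouse table that keeps history can be sketched as follows. This is a minimal illustration using Python's standard sqlite3 module; the table names, columns and sample orders are assumptions made for the example and are not part of the Celtic Analytics design.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE operational_orders (order_id INTEGER PRIMARY KEY, order_date TEXT, amount REAL);
CREATE TABLE warehouse_orders   (order_id INTEGER PRIMARY KEY, order_date TEXT, amount REAL);
""")

def record_order(order_id, order_date, amount):
    """Write a new order to both the operational table and the warehouse table."""
    for table in ("operational_orders", "warehouse_orders"):
        conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?)", (order_id, order_date, amount))

def purge_operational(before_date):
    """Remove old rows from the operational table only; the warehouse is left untouched."""
    conn.execute("DELETE FROM operational_orders WHERE order_date < ?", (before_date,))

def row_count(table):
    """Return the number of rows currently held in the given table."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

# Hypothetical sample data: one old order and one recent order.
record_order(1, "2023-01-10", 50.0)
record_order(2, "2024-02-01", 75.0)
purge_operational("2024-01-01")
```

After the purge, the operational table holds only the recent order while the warehouse table still holds both, which is the accumulation of history described above.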
A schema representing the sample database for a data analytics company is shown below.
Fig. 1
Analyzing data for business decision making is simplified by the decision support tools associated with the warehouse. Data warehousing is therefore generally regarded as superior to, and more useful than, executive information systems and decision support systems.
The process of designing a database does not differ greatly from the standard software development process. The stages include:
- Problem definition.
- Analysis and requirements.
- Data modeling.
- Design and prototyping.
- Development and documentation.
- Test, review and operation.
- Design best practices.
In the design process, the design team, comprising programmers, database administrators and information security experts, should critically evaluate every phase to produce a standard warehouse that satisfies the organization's objectives.
The best-practice considerations include the following:
The data warehouse is structured into functional groups or specialty areas such as employee details, departments or project characteristics. Using an entity relationship data model, abstract schemas are designed and denoted with various entities. Likewise, in an object-oriented data model, objects are denoted with different classes, with a provision that segregates the functional primary data from the processes involved in the creation and modification of such data. The functional areas are primarily mapped out during the problem definition stage and should be independent of other processes. This is one crucial stage that the design team should not overlook if it is to produce a functional database.
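As a minimal sketch of how such functional groups might map to relational tables, the following snippet uses Python's standard sqlite3 module to create three entities, one each for departments, employees and projects. The table names, column names and sample rows are illustrative assumptions, not the schema shown in the figures.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the relationships declared below

# Hypothetical entities, one per functional group named in the text.
conn.executescript("""
CREATE TABLE department (
    dept_id    INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE
);
CREATE TABLE employee (
    emp_id     INTEGER PRIMARY KEY,
    full_name  TEXT NOT NULL,
    dept_id    INTEGER NOT NULL REFERENCES department(dept_id)
);
CREATE TABLE project (
    project_id INTEGER PRIMARY KEY,
    title      TEXT NOT NULL,
    dept_id    INTEGER NOT NULL REFERENCES department(dept_id)
);
""")

# Sample rows (assumed values) showing an employee attached to a department.
conn.execute("INSERT INTO department (dept_id, name) VALUES (1, 'Web Analytics')")
conn.execute("INSERT INTO employee (emp_id, full_name, dept_id) VALUES (1, 'A. Example', 1)")

def table_names(connection):
    """Return the user-defined table names in alphabetical order."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]
```

Each entity lives in its own table, and the foreign keys record the relationships between them, so one functional area can evolve without restructuring the others.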
Celtic Analytics desires a database that can be easily integrated with other systems. The design team should therefore standardize the representation of common data so that data from different sources can be mapped onto one another.
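One way to picture this standardization is a pair of small mapping functions that bring records from different sources into a single common layout. The sketch below is only an illustration: the field names, date formats and sample records are assumptions, not details taken from the company's systems.

```python
from datetime import datetime

# Hypothetical raw records from two sources that describe the same customer
# but use different field names and date formats.
web_record = {"visitor": "u123", "visited_on": "03/15/2024", "page": "/pricing"}
ops_record = {"customer_id": "u123", "date": "2024-03-15", "system": "billing"}

def standardize_web(record):
    """Map a web-analytics record onto the common layout."""
    return {
        "customer_id": record["visitor"],
        "event_date": datetime.strptime(record["visited_on"], "%m/%d/%Y").date().isoformat(),
        "source": "web",
    }

def standardize_ops(record):
    """Map an operational record onto the common layout."""
    return {
        "customer_id": record["customer_id"],
        "event_date": datetime.strptime(record["date"], "%Y-%m-%d").date().isoformat(),
        "source": "operational",
    }

# Both records now share the same keys and the same ISO date format.
common_records = [standardize_web(web_record), standardize_ops(ops_record)]
```

Once every source is expressed in the same layout, records can be joined on shared keys such as the customer identifier and the event date.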
A data warehouse that supports both distributed and centralized data is beneficial for a growing business such as Celtic Analytics. Data distributed over the network should be readily supported by the system.
The primary objective of any design is usability, and data warehousing is no exception. The user interface should be designed to deliver good user interaction as well as effective system usage. SQL is the preferred interface for a data warehouse system because it is a standardized query language and supports multidimensional views of relational data. In addition, it provides options for retrieving, analyzing and formatting data.
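A short sketch of SQL being used to retrieve, analyze and format data is given below. It runs a query through Python's standard sqlite3 module against a hypothetical page_visit table; the table, its columns and the sample figures are assumptions made for the example.

```python
import sqlite3

# In-memory database with a hypothetical table of monthly page visits.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_visit (page TEXT, visit_month TEXT, visits INTEGER)")
conn.executemany(
    "INSERT INTO page_visit VALUES (?, ?, ?)",
    [
        ("/home", "2024-01", 120),
        ("/home", "2024-02", 150),
        ("/pricing", "2024-01", 40),
        ("/pricing", "2024-02", 60),
    ],
)

def visits_per_page(connection):
    """Retrieve the rows, total the visits per page, and format each result line."""
    rows = connection.execute(
        """
        SELECT page, SUM(visits) AS total_visits
        FROM page_visit
        GROUP BY page
        ORDER BY total_visits DESC
        """
    )
    return [f"{page}: {total} visits" for page, total in rows]
```

Here the retrieval and analysis (grouping and summing) are expressed in SQL, while the final formatting step is left to the calling code.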
Another important characteristic of a well-organized warehouse is the ability to rewrite history. The design team should concentrate on designing and implementing what-if analysis so that data can be rolled back successfully. To achieve this, designers should consider introducing data at an appropriate level of granularity so that administrators can update historical data.
Lastly, the warehouse should be designed for flexibility. Considering the rapid growth of the company, data schemas should always have the capacity to accommodate new incoming data. Celtic Analytics continually seeks faster data retrieval mechanisms, so the granularity and the sort order of the data should be carefully thought out.
ENTITY RELATIONSHIP
Developing an entity relationship diagram involves several steps. To start with, we define the major business objects that need to be automated; these form the entities. Attributes correspond to the fields stored in the database tables. In the case of Celtic Analytics, we are going to implement a data warehouse that resembles a relational database system.
The process will involve translating ideas into a detailed plan that allows the development team to put together a relational data warehouse schema in line with the key functionality of the warehouse.
Fig. 2 ENTITY RELATIONSHIP DIAGRAM
Data is transferred from various databases to the newly created warehouse, where cleaning, verification and validation are conducted before it is finally stored in designated marts. The data come from sources such as manufacturers, customers, suppliers, the management and other third parties. For instance, the employee table links to particular data marts.
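The cleaning, validation and loading steps described above can be sketched as a small pipeline. This is an illustration under stated assumptions only: the record fields, the list of known sources and the validation rules are hypothetical, and the marts are modelled as plain Python lists rather than real database tables.

```python
# Hypothetical source labels; each one gets its own data mart.
KNOWN_SOURCES = ("customer", "supplier", "manufacturer")

def clean(record):
    """Trim whitespace and normalise the source label to lower case."""
    return {
        "source": str(record.get("source") or "").strip().lower(),
        "name": str(record.get("name") or "").strip(),
        "amount": record.get("amount"),
    }

def is_valid(record):
    """Accept only records with a known source, a name and a non-negative numeric amount."""
    return (
        record["source"] in KNOWN_SOURCES
        and bool(record["name"])
        and isinstance(record["amount"], (int, float))
        and record["amount"] >= 0
    )

def load_into_marts(raw_records):
    """Clean each record, validate it, and store it in the mart for its source."""
    marts = {source: [] for source in KNOWN_SOURCES}
    rejected = []
    for raw in raw_records:
        record = clean(raw)
        if is_valid(record):
            marts[record["source"]].append(record)
        else:
            rejected.append(raw)  # kept aside for inspection rather than silently dropped
    return marts, rejected

# Assumed sample input: two usable records and two that fail validation.
raw_records = [
    {"source": " Customer ", "name": "Acme", "amount": 100},
    {"source": "supplier", "name": "", "amount": 50},
    {"source": "unknown", "name": "X", "amount": 5},
    {"source": "Supplier", "name": " Bolt Co ", "amount": 75.5},
]
marts, rejected = load_into_marts(raw_records)
```

Keeping the rejected records in a separate list makes it possible to review and correct them at the source instead of losing them during the load.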
We can conclude that the data warehouse in an organization such as Celtic Analytics is equipped with tools that keep data organized and scalable at any given moment, so that it can cope with dynamic operational data. Operational data systems try to keep their tables as small as possible by scrubbing out old data. The warehouse, however, accommodates historical data, which swell over time to reach trillions of bytes. The massive storage capacity of a warehouse can easily accommodate these amounts of data.
The data flow diagram below illustrates the flow of data from clients into the company and back. Data to be analyzed are fed into the company as raw data from the web as well as operational data. Web data include site visits, web log files, and PHP and JavaScript page tags. Operational data include data from financial systems, business-related metrics and other sources. After rigorous analysis using our tools, the results, in the form of web report analysis, marketing report analysis and e-commerce report analysis among other deliverables, are delivered to the concerned parties.
Fig. 3 DATA FLOW DIAGRAM
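To make the web side of this flow concrete, the sketch below turns a few raw log lines into a simple page-view report. The log format shown (timestamp, client IP, path, HTTP status) is a simplified assumption for illustration and does not reflect the format of any particular web server or of the company's actual logs.

```python
from collections import Counter

# Hypothetical, simplified log lines: "<ISO timestamp> <client IP> <path> <HTTP status>".
LOG_LINES = [
    "2024-03-15T10:00:00 203.0.113.5 /home 200",
    "2024-03-15T10:00:05 203.0.113.5 /pricing 200",
    "2024-03-15T10:01:00 198.51.100.7 /home 200",
    "2024-03-15T10:02:00 198.51.100.7 /missing 404",
]

def page_view_report(lines):
    """Count successful page views per path, most visited first."""
    views = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip malformed lines
        _timestamp, _client_ip, path, status = parts
        if status == "200":  # only successful requests count as page views
            views[path] += 1
    return views.most_common()

report = page_view_report(LOG_LINES)
```

The resulting list of paths and counts is the kind of raw material from which a web report analysis could be produced.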