Describe the kinds of “big data” collected by the organizations described in this case.
The organizations described in the “Big Data, Big Rewards” case collect different kinds of big data. The British Library collects data about the billions of searches performed by visitors to its website. The library also preserves historically significant websites that no longer exist, for example, the websites of past politicians. State and federal law enforcement agencies collect big data on crimes and criminals, such as suspects’ photos and past offenses. They also hold data on criminal organizations, illegal activities, and crime complaints. Vestas collects big data about the weather and about the locations where it wants to set up turbines. It has data on approximately 178 parameters, such as humidity, wind direction, temperature, and wind velocity, as well as historical data from all 66 countries. Hertz collects big data from web surveys, e-mails, text messages, and website traffic patterns, as well as data gathered on site at all of its outlets.
The business intelligence technologies described in the case include IBM BigSheets, a technology built atop the Hadoop framework. It improves the efficiency and speed with which bulk data is processed. BigSheets extracts, annotates, and visually analyzes large volumes of unstructured data and delivers the results in a web browser. IBM data warehouses are another business intelligence technology described in the case. They allow organizations to capture, store, manage, process, and retrieve all sorts of data. The IBM InfoSphere BigInsights software is a third business intelligence technology described in the case. It runs on IBM’s System x iDataPlex server, which provides high performance. InfoSphere BigInsights processes big data for analysis and visualization and is powered by Apache Hadoop.
Why did the companies described in this case need to maintain and analyze big data? What business benefits did they obtain?
The British Library needed to capture, maintain, and analyze big data in order to extract useful information from it. The library benefits by keeping the information that its visitors need up to date. It also acts as a custodian of historical information, which includes maintaining websites that no longer exist. State and federal law enforcement agencies need to keep big data in order to combat crime effectively. The benefits include access to critical information that is useful in detecting and predicting criminal activity. Vestas keeps big data so that it can identify the best places to put up wind turbines for generating power. Maintaining big data makes optimal turbine placement possible, which allows the company and its customers to realize a return on investment much faster. Hertz maintains big data so that it can analyze customer feedback. The benefits include improved company performance and customer satisfaction.
Identify three decisions that were improved by using “big data.”
The first decision improved by using big data is the decision law enforcement officers make about who is a suspect and who is not. Big data provides relevant information such as face recognition results, addresses, and past offenses. The second decision is the choice of the optimal location to set up turbines. Big data allowed Vestas to reduce the resolution of its wind data grids by nearly 90%, so that each potential site can be assessed in finer detail. The third decision is how to improve customer satisfaction at Hertz. Big data helped identify where the problems were and provided useful information on how to resolve delays in customer feedback.
What kinds of organizations are most likely to need “big data” management and analytical tools? Why?
The organizations most likely to need big data management and analytical tools are those that deal with large amounts of data and whose performance depends on decisions made from information extracted from past and present data. They need to capture and store huge volumes of data and process it into useful information for decision making. They must also have that information readily accessible whenever it is needed.
Define a database and a database management system and describe how it solves the problems of a traditional file environment.
A database is a centralized collection of data organized to serve many applications while controlling redundant data. A database management system (DBMS) is software that allows an organization to centralize data, manage it efficiently, and provide access to it through application programs. The DBMS acts as an interface between the application programs and the physical data files. A DBMS solves the problem of data redundancy that is common in a traditional file environment. It does this by storing the different types of data in a central database rather than in multiple locations within the same organization. The DBMS also reduces data inconsistency: because the database is central, different users who access information about the same item see the same values.
The first capability of a DBMS is data definition, which specifies the structure of the content of the database. It is used to create database tables and to define the characteristics of the fields in each table. The second capability is the data dictionary, which is a manual or automated file that stores definitions of data elements and their characteristics. For example, it shows the name, description, type, size, authorization, and business functions of each element. Finally, the DBMS has a data manipulation language. It allows users to access and manipulate the information in the database, and it is used to add, change, delete, and retrieve data. An example is Structured Query Language (SQL).
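The three capabilities can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for a full DBMS, and the customer table, its fields, and the sample rows are hypothetical values chosen only for illustration. SQLite has no separate data dictionary file, so the sketch reads the stored field definitions through `PRAGMA table_info` as a rough equivalent.

```python
import sqlite3

def demo_dbms_capabilities():
    """Show data definition, a data-dictionary lookup, and data manipulation."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    cur = conn.cursor()

    # Data definition: create a table and describe its fields.
    # The table and field names are hypothetical.
    cur.execute(
        "CREATE TABLE customer ("
        "customer_id INTEGER PRIMARY KEY, "
        "name TEXT NOT NULL, "
        "city TEXT)"
    )

    # Data dictionary (approximation): ask the DBMS for the stored field
    # definitions. Each row is (cid, name, type, notnull, default, pk).
    fields = [(row[1], row[2]) for row in cur.execute("PRAGMA table_info(customer)")]

    # Data manipulation: add, change, delete, and retrieve data with SQL.
    cur.executemany(
        "INSERT INTO customer (customer_id, name, city) VALUES (?, ?, ?)",
        [(1, "Ada", "London"), (2, "Grace", "New York")],
    )
    cur.execute("UPDATE customer SET city = ? WHERE customer_id = ?", ("Paris", 1))
    cur.execute("DELETE FROM customer WHERE customer_id = ?", (2,))
    rows = cur.execute(
        "SELECT customer_id, name, city FROM customer ORDER BY customer_id"
    ).fetchall()

    conn.close()
    return fields, rows

fields, rows = demo_dbms_capabilities()
print(fields)
print(rows)
```

A production DBMS such as Oracle or DB2 exposes a much richer data dictionary, but the division of labor between defining, describing, and manipulating data is the same.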
Define a relational DBMS and explain how it organizes data.
A relational DBMS (RDBMS) is software that allows an organization to create, update, and administer a database built on the relational model. The RDBMS organizes data as two-dimensional tables that are referred to as relations. Each table contains data about one entity and its attributes. Examples of RDBMSs are Oracle, Microsoft SQL Server, and DB2. A table consists of a grid of columns, or fields, and rows of data. Each element of data about an entity is stored as a separate field that represents one attribute of that entity. The rows are records that contain the actual information about individual instances of the entity. Each table has a key field that supports retrieving, updating, and sorting the records. The primary key is a field that uniquely identifies each row, so no two records share the same key value.
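The role of the primary key can be shown with a short sketch, again using Python's built-in sqlite3 module as a stand-in for a commercial RDBMS. The supplier table and every value in it are hypothetical. The sketch assumes nothing beyond the description above: one table, fields as columns, records as rows, and a key that must be unique.

```python
import sqlite3

def primary_key_demo():
    """Store records in a relation and show that the primary key is unique."""
    conn = sqlite3.connect(":memory:")

    # One relation (table): each column is a field, each row is a record.
    conn.execute(
        "CREATE TABLE supplier ("
        "supplier_number INTEGER PRIMARY KEY, "
        "supplier_name TEXT, "
        "supplier_city TEXT)"
    )
    conn.execute("INSERT INTO supplier VALUES (8259, 'CBM Inc.', 'Dayton')")
    conn.execute("INSERT INTO supplier VALUES (8261, 'B. R. Molds', 'Cleveland')")

    # A second record with an existing key value is rejected by the RDBMS.
    try:
        conn.execute("INSERT INTO supplier VALUES (8259, 'Duplicate Co.', 'Akron')")
        duplicate_rejected = False
    except sqlite3.IntegrityError:
        duplicate_rejected = True

    # The key field supports retrieval of exactly one record.
    record = conn.execute(
        "SELECT supplier_name, supplier_city FROM supplier WHERE supplier_number = ?",
        (8259,),
    ).fetchone()

    conn.close()
    return duplicate_rejected, record

duplicate_rejected, record = primary_key_demo()
print(duplicate_rejected, record)
```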
List and describe the components of a contemporary business intelligence infrastructure.
The first component is the data warehouse, a database that stores both current and historical data that decision makers might require. The data warehouse makes the data available to anyone who needs it but does not allow them to alter it. It also provides standardized tools for querying, analyzing, and reporting. The second component is the data mart, a highly focused subset of a data warehouse that is designed for a specific group of users. The third component is Hadoop, an open-source software framework that supports distributed parallel processing of large volumes of data across inexpensive computers. Hadoop handles structured, semi-structured, and unstructured data. The fourth component is in-memory computing, which relies on a computer’s RAM. In-memory processing shortens query response times by holding huge sets of data in main memory rather than on disk. The fifth component is analytic platforms, which are optimized for analyzing large data sets. Examples include IBM Netezza and Oracle Exadata.
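The idea of distributed parallel processing can be sketched on a single machine. The example below does not use Hadoop or any Hadoop API. It is only an illustration, in plain Python, of the map-and-reduce pattern that Hadoop's MapReduce engine applies across many computers: each chunk of data is processed independently, and the partial results are then combined. The chunks of text are hypothetical.

```python
from collections import Counter
from functools import reduce

def map_chunk(chunk):
    """Map step: count the words in one chunk, as one worker node would."""
    return Counter(word.lower() for line in chunk for word in line.split())

def reduce_counts(left, right):
    """Reduce step: merge two partial word counts into one."""
    return left + right

def word_count(chunks):
    # On a cluster the map calls would run in parallel on different machines;
    # here they simply run one after another.
    partial_counts = [map_chunk(chunk) for chunk in chunks]
    return reduce(reduce_counts, partial_counts, Counter())

# Hypothetical data, already split into two chunks.
chunks = [
    ["wind speed high", "wind direction north"],
    ["humidity high", "wind speed low"],
]
totals = word_count(chunks)
print(totals.most_common(3))
```

Because each chunk is counted without reference to the others, adding more inexpensive computers lets more chunks be processed at the same time.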
Define data mining, describing how it differs from OLAP and the types of information it provides.
Data mining is a discovery-driven activity aimed at finding hidden patterns and relationships in databases and inferring rules that can be used to predict future behavior. OLAP supports multidimensional data analysis, in which users view the same data along different dimensions, whereas data mining uncovers patterns and rules that guide decision making and predict the effects of decisions. Data mining provides several types of information. The first is associations, which are occurrences linked to a single event. The second is sequences, which are events linked over time. The third is classification, which recognizes the patterns that describe the group to which an item belongs by examining existing items that have already been classified. The fourth is clustering, which works like classification when no groups have yet been defined.
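An association can be made concrete with a small calculation. The sketch below measures how strongly two items are linked in a set of purchases using support and confidence, two standard measures for association rules that the answer above does not name. The shopping baskets are hypothetical data invented for the example.

```python
def rule_stats(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    total = len(transactions)
    with_antecedent = [t for t in transactions if antecedent in t]
    with_both = [t for t in with_antecedent if consequent in t]

    # Support: share of all transactions that contain both items.
    support = len(with_both) / total
    # Confidence: share of antecedent transactions that also contain the consequent.
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence

# Hypothetical purchases, one set of items per transaction.
baskets = [
    {"chips", "cola"},
    {"chips", "cola", "salsa"},
    {"chips", "bread"},
    {"bread", "milk"},
    {"cola", "milk"},
]
support, confidence = rule_stats(baskets, "chips", "cola")
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

In this invented data, two of the three baskets that contain chips also contain cola. That is the kind of link between occurrences that an association describes.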