The History of Data Models
The history of database development is the history of systems for managing data in a computer's external memory. The first electronic computers had two types of external storage devices: magnetic tapes and magnetic drums. Magnetic tapes had a fairly large capacity, but their main drawback was that the required information could not be accessed easily: to read information stored in the middle or at the end of a tape, all the preceding sections had to be read first. Magnetic drums allowed random access, but the amount of information they could store was limited (Aes.org, 2016). At that time, no data management systems existed. If an application needed to store data in external memory, it determined the location of the data on a magnetic tape or drum by itself. Exchanging information between RAM and external memory, as well as naming and structuring the data, was also handled by the application itself.
The history of data models begins with the appearance of magnetic disks and already spans several decades. In 1968, the first commercial database was created: IBM developed the IMS system (Www-03.ibm.com, 2016). In 1969, the first database standard, the CODASYL data model, was published by the Conference on Data Systems Languages (CODASYL). This standard defined a number of concepts in the theory of database systems that are still fundamental to network data models (Remote-dba.net, 2016). In 1970, E. F. Codd proposed the relational data model and based its operations on relational algebra.
The following stages can be distinguished in the history of data models:
1. Files and file systems.
2. Mainframe databases. The first database management systems (DBMS).
3. The era of personal computers. Desktop databases.
4. Distributed databases.
An important step in the development of information systems was the creation of centralized file management systems (FMS). These systems allow users to create, edit, copy and move files. At present, such a system is part of any operating system. A file management system performs the following functions:
· Distribution of external memory,
· Mapping of file names to the corresponding addresses in the external memory,
· Providing access to data.
However, an FMS knows nothing about the internal structure of any particular file. Working with the records in a file is left to the application program that uses the file. In addition, access control in an FMS is decentralized: the actions that a particular user may perform on a specific file are encoded and stored together with that file. It was also impossible for multiple users to work with the same file simultaneously in an FMS.
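The three functions listed above can be illustrated with a minimal sketch. This is a toy model, not how any real operating system implements its file system: the class `ToyFileSystem`, its methods and the block size are all hypothetical names chosen for illustration.

```python
class ToyFileSystem:
    """Toy model of a file management system; all names here are hypothetical."""

    def __init__(self, n_blocks, block_size=4):
        self.block_size = block_size
        self.blocks = [b""] * n_blocks     # simulated external memory
        self.free = list(range(n_blocks))  # block addresses not yet allocated
        self.directory = {}                # file name -> list of block addresses

    def write(self, name, data):
        # Function 1: distribute external memory among files.
        needed = max(1, -(-len(data) // self.block_size))  # ceiling division
        old = self.directory.get(name, [])
        if needed > len(self.free) + len(old):
            raise OSError("not enough free blocks")
        self.free.extend(old)  # blocks of an overwritten file become reusable
        addrs = [self.free.pop(0) for _ in range(needed)]
        for i, addr in enumerate(addrs):
            start = i * self.block_size
            self.blocks[addr] = data[start:start + self.block_size]
        # Function 2: map the file name to addresses in external memory.
        self.directory[name] = addrs

    def read(self, name):
        # Function 3: provide access to the data, as raw bytes only.
        return b"".join(self.blocks[addr] for addr in self.directory[name])


fs = ToyFileSystem(n_blocks=8)
fs.write("a.txt", b"hello world")
fs.write("b.txt", b"12345678")
print(fs.directory)        # which blocks each file occupies
print(fs.read("a.txt"))    # the bytes come back unchanged
```

Note that `read` returns an uninterpreted byte string: splitting it into records is entirely up to the calling application, which is the limitation described in the paragraph above.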
The first databases on mainframes (such as the IBM 360/370, UCS, and various Hewlett Packard models) appeared in the 1970s. Databases were stored in the external memory of the central computer (EnterpriseTech, 2014). The users of a database were tasks that ran mostly in batch mode. Interactive access was provided through console terminals, which had no computing resources of their own (CPU, RAM or external memory) and served only as input-output devices for the central computer. Database access programs were written in conventional programming languages and ran like ordinary numerical programs. Systems of this period worked with a centralized database in a shared (distributed) access mode. Resource allocation was handled by the operating system. In addition, these systems supported data administration. At this time, serious work was also carried out on the foundation and formalization of relational data models.
The following types of data models will be considered in this paper:
· Hierarchical
· Relational
· Network
Hierarchical Data Model
The hierarchical database model is the earliest representation of complex data structures. The information in a hierarchical database is organized as a tree, in the form of "parent-child" relations. Each record has at most one "parent" record and may have several subordinate records. The relationships between records are implemented by pointers from one record to another. The main disadvantage of the hierarchical structure is its inability to represent "many-to-many" relationships, as well as situations in which a record has multiple predecessors.
Graphically, this structure can be represented as a tree of objects at different levels. The top level holds a single object, the second level holds the second-level objects, and so on. Objects are related to one another: each object can include several lower-level objects. Such objects stand in a parent (the object closer to the root) to child (the lower-level object) relationship. A parent object may have no children or several children, whereas a child object must have exactly one parent. Objects that share a common parent are called twins.
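The tree structure described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the class `Node`, its methods and the sample records are hypothetical and do not correspond to any particular hierarchical DBMS.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One record in a hierarchy; the names are illustrative only."""
    name: str
    # The parent pointer is left out of repr/compare to avoid infinite recursion.
    parent: Optional[Node] = field(default=None, repr=False, compare=False)
    children: list[Node] = field(default_factory=list)

    def add_child(self, name: str) -> Node:
        # A child is created under exactly one parent, so it can never have two.
        child = Node(name, parent=self)
        self.children.append(child)
        return child

    def twins(self) -> list[Node]:
        # Records that share this record's parent (excluding the record itself).
        if self.parent is None:
            return []
        return [c for c in self.parent.children if c is not self]

    def path(self) -> list[str]:
        # Follow parent pointers up to the root, then reverse to root-first order.
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names[::-1]


root = Node("Company")
sales = root.add_child("Sales")
research = root.add_child("Research")
alice = sales.add_child("Alice")

print(alice.path())                       # path from the root down to the record
print([t.name for t in sales.twins()])    # records sharing the same parent
```

Because `add_child` is the only way to attach a record, a record such as `alice` cannot belong to both `sales` and `research`, which is exactly the "many-to-many" limitation of the hierarchical model.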
The main disadvantages of the hierarchical model are inefficiency, slow access to segments at the lower levels of the data hierarchy, and a clear orientation toward certain types of queries. The model is also cumbersome for processing information with fairly complex logical connections and is difficult for an ordinary user to understand. Hierarchical databases quickly passed their peak of popularity, which they owed to their early appearance on the market; their shortcomings then made them uncompetitive. Currently, hierarchical models are only of historical interest (Quickbase.intuit.com, 2016).
Relational Data Model
The relational model is focused on the organization of data in the form of two-dimensional tables (Ecomputernotes.com, 2016). Each relational table is a two-dimensional array and has the following properties:
· Each cell of the table holds a single data element
· Each column has a unique name
· The table contains no identical rows
· All the columns in the table are homogeneous, i.e. all the elements in a column are of the same type
· The order of rows and columns can be arbitrary
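As a rough illustration of these properties, the sketch below checks a small in-memory table against three of them. The function `check_relation` and the sample data are hypothetical; a real relational DBMS enforces such rules through its schema and constraints rather than through a check like this one.

```python
def check_relation(columns, rows):
    """Return a list of violated properties for a table.

    `columns` is a sequence of column names and `rows` a list of tuples.
    """
    problems = []
    if len(set(columns)) != len(columns):
        problems.append("column names are not unique")
    if any(len(row) != len(columns) for row in rows):
        problems.append("a row has the wrong number of elements")
        return problems  # the remaining checks assume well-formed rows
    if len(set(rows)) != len(rows):
        problems.append("table contains identical rows")
    for i, name in enumerate(columns):
        # A homogeneous column holds values of exactly one type.
        if len({type(row[i]) for row in rows}) > 1:
            problems.append(f"column {name!r} mixes types")
    return problems


columns = ("id", "name", "age")
good = [(1, "Ann", 30), (2, "Bob", 25)]
bad = [(1, "Ann", 30), (1, "Ann", 30), (2, "Bob", "25")]

print(check_relation(columns, good))  # no violations
print(check_relation(columns, bad))   # duplicate row and a mixed-type column
```

The last property, that row order is arbitrary, shows up here in the use of `set(rows)`: two tables with the same rows in a different order describe the same relation.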
Relational database management systems designed for operational (transaction) data processing are less effective in analytical processing applications than multidimensional databases. This is due to fairly severe restrictions imposed by current implementations of the SQL language. One such real-life limitation is the assumption that the data in a relational database is unordered (or, more accurately, arranged arbitrarily), so any ordering requires additional sorting time on each access to the database.
In analytical systems, data is loaded and retrieved in large batches, and once loaded it remains unchanged for a long period of time. In this setting it is more effective to store data in partially denormalized tables, where performance is improved by storing not only detailed data but also pre-computed aggregates. For navigation and selection, specialized addressing and indexing methods can be used that rely on the assumption that the data in the database is rarely changed. This way of organizing data is sometimes called precomputation. It differs from the normalized relational approach, which computes the various kinds of totals (aggregation) dynamically and establishes links between the details in different tables through join operations.
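The difference between dynamic aggregation and precomputation can be sketched as follows. The data, the function names and the choice of a per-region total are assumptions made purely for illustration.

```python
from collections import defaultdict

# Detailed data: (region, month, amount). The values are invented for the example.
sales = [
    ("north", "2016-01", 120.0),
    ("north", "2016-02", 80.0),
    ("south", "2016-01", 200.0),
    ("south", "2016-02", 50.0),
]


def total_dynamic(region):
    # Normalized approach: scan the detailed rows on every query.
    return sum(amount for r, _, amount in sales if r == region)


def build_totals(rows):
    # Precomputation: aggregate once, right after the batch is loaded.
    totals = defaultdict(float)
    for region, _, amount in rows:
        totals[region] += amount
    return dict(totals)


precomputed = build_totals(sales)

print(total_dynamic("north"))   # computed by scanning all rows
print(precomputed["north"])     # answered by a single lookup
```

The precomputed table stays valid only as long as `sales` does not change, which is why the approach suits analytical systems where data is loaded in batches and then left untouched.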
In addition to the low efficiency mentioned earlier, a shortcoming of traditional relational databases is that the primary (and often the only) mechanism for quickly finding and retrieving individual rows from a table (or from tables related through foreign keys) is usually some variant of an index based on B-trees. This solution is effective only when small groups of records are processed and the data in the database is modified intensively.
Network Data Model
A network database management system supports the network organization of data: any record, called a parent record, may contain data relating to a set of other records, called child records (Ecomputernotes.com, 2016).
A typical representative is the Integrated Database Management System (IDMS), which appeared in the 1970s (Dictionary.com, 2016). The network approach to organizing data is an extension of the hierarchical approach. A network database consists of a set of records and a set of relationships between the records. No special restrictions are imposed on how relationships between records are formed. In hierarchical structures, a child record must have exactly one parent, whereas in a network structure a record may have any number of parents.
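A minimal sketch of this idea, assuming a simple in-memory representation, stores the records and the relationships between them separately. The class `NetworkDB`, its methods and the sample keys are hypothetical and are not taken from IDMS or the CODASYL specification.

```python
from collections import defaultdict


class NetworkDB:
    """Toy network database: a set of records plus a set of relationships."""

    def __init__(self):
        self.records = {}                  # key -> record data
        self.children = defaultdict(set)   # parent key -> keys of child records
        self.parents = defaultdict(set)    # child key -> keys of parent records

    def add_record(self, key, data):
        self.records[key] = data

    def link(self, parent, child):
        # Any two existing records may be linked; there is no one-parent rule.
        if parent not in self.records or child not in self.records:
            raise KeyError("both records must exist before they are linked")
        self.children[parent].add(child)
        self.parents[child].add(parent)

    def parents_of(self, key):
        return sorted(self.parents.get(key, set()))

    def children_of(self, key):
        return sorted(self.children.get(key, set()))


db = NetworkDB()
db.add_record("supplier:acme", {"city": "Oslo"})
db.add_record("project:bridge", {"budget": 1000})
db.add_record("part:bolt", {"weight": 5})

# The same part belongs to both a supplier and a project.
db.link("supplier:acme", "part:bolt")
db.link("project:bridge", "part:bolt")

print(db.parents_of("part:bolt"))   # two parents, impossible in a strict hierarchy
```

Restricting `link` so that each child accepts at most one parent would turn this structure back into a hierarchy, which is why hierarchical models are a special case of the network model.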
An advantage of the network data model is that it can be implemented efficiently in terms of memory consumption and speed. Compared with the hierarchical model, the network model offers much greater freedom, since arbitrary connections may be formed. Hierarchical models can easily be implemented within a network database. Network DBMSs support complex relationships between data types, which makes them useful in various applications. Thus, the main advantages of network databases include the following:
· Processing of large volumes of information (the possibility of building on the basis of the database "data warehouse");
· Support of analytical data;
· Effective implementation of the data in terms of memory consumption and responsiveness.
The users of a network database are limited to the relationships defined by the developers of their database applications. Like hierarchical databases, network databases require database applications to be developed by experienced programmers and system analysts.
Another disadvantage of the network data model is the high complexity and rigidity of the database schema built on it, as well as the difficulty a regular user has in understanding and processing the information in the database. In addition, integrity control over the connections in network databases is weakened, because arbitrary relationships between records are permitted.
References
Aes.org. (2016). The History of Magnetic Recording. Retrieved 28 February 2016, from http://www.aes.org/aeshc/docs/recording.technology.history/magnetic4.html
Dictionary.com. (2016). The definition of integrated database management system. Retrieved 28 February 2016, from http://dictionary.reference.com/browse/integrated-database-management-system
Ecomputernotes.com. (2016). Network Model. Retrieved 28 February 2016, from http://ecomputernotes.com/fundamental/what-is-a-database/network-model
Ecomputernotes.com. (2016). Relational Model. Retrieved 28 February 2016, from http://ecomputernotes.com/fundamental/what-is-a-database/relational-model
EnterpriseTech. (2014). IBM System/360: The Original Enterprise Tech. Retrieved 28 February 2016, from http://www.enterprisetech.com/2014/04/08/ibm-system360-original-enterprise-tech/
Quickbase.intuit.com. (2016). A Timeline of Database History | Intuit QuickBase. Retrieved 28 February 2016, from http://quickbase.intuit.com/articles/timeline-of-database-history
Remote-dba.net. (2016). The CODASYL Network Model - CODASYL and the Object Database Management Group (ODMG). Retrieved 28 February 2016, from http://www.remote-dba.net/t_object_codasyl_network.htm
Www-03.ibm.com. (2016). IBM100 - Information Management System. Retrieved 28 February 2016, from http://www-03.ibm.com/ibm/history/ibm100/us/en/icons/ibmims/