An entity relationship diagram (ERD), also called an entity relationship model, is a data modeling method that represents the entities in a system and the relationships between them. When analysts create an ERD, it gives them a deeper understanding of the business model and links that model to the database. Conceptually, the ERD is a model of the records used to represent the entities. The major elements of an ERD are entities, attributes, and relationships. An entity is a thing or concept that has distinct and independent characteristics, for example a car. An attribute is a characteristic that an entity has. For example, a car has a make, a color, and a registration number. A relationship is a logical association that links two entities in a database in order to maintain referential integrity (Kambayashi, Winiwarter, & Arikawa, 2002). Relationships allow a relational database to split data across different tables while still linking the related data items.
Creating an ERD is an iterative procedure. First, identify the entities that are relevant to your business model. Then identify the significant attributes of each entity. Next, identify the primary keys among those attributes; they will uniquely identify each instance of an entity in the database. Then model the relationships that connect related entities, labeling them with numeric notation or verbs. After that, ask technical and business stakeholders to review the model, and repeat until the model fully represents your domain. When all that is done, draw the entity relationship diagram, using rectangles to represent entities and lines to represent relationships. Finally, review the diagram from the user's point of view (Allen, Chatwin, & Creary, 2004).
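The procedure above can be sketched in code before any diagram is drawn. The following is a minimal illustration, not a standard tool: the class names Entity, Relationship, and ERModel and the sample Department and Employee entities are assumptions made for this example. It checks that every primary key is one of the entity's attributes and that relationships connect only entities already in the model.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    attributes: list[str]
    primary_key: str  # attribute that uniquely identifies each instance


@dataclass
class Relationship:
    source: str               # entity on the "one" side
    target: str               # entity on the "many" side
    label: str                # verb describing the link
    cardinality: str = "1:N"  # numeric notation for the link


@dataclass
class ERModel:
    entities: dict[str, Entity] = field(default_factory=dict)
    relationships: list[Relationship] = field(default_factory=list)

    def add_entity(self, entity: Entity) -> None:
        # The primary key must be one of the entity's own attributes.
        if entity.primary_key not in entity.attributes:
            raise ValueError(f"{entity.name}: primary key is not an attribute")
        self.entities[entity.name] = entity

    def add_relationship(self, rel: Relationship) -> None:
        # A relationship may only connect entities already in the model.
        for name in (rel.source, rel.target):
            if name not in self.entities:
                raise ValueError(f"unknown entity: {name}")
        self.relationships.append(rel)

    def describe(self) -> list[str]:
        # One readable line per relationship, for review with stakeholders.
        return [
            f"{r.source} {r.label} {r.target} ({r.cardinality})"
            for r in self.relationships
        ]


# Hypothetical sample entities used only for illustration.
model = ERModel()
model.add_entity(Entity("Department", ["department_id", "name"], "department_id"))
model.add_entity(
    Entity("Employee", ["employee_id", "name", "department_id"], "employee_id")
)
model.add_relationship(Relationship("Department", "Employee", "employs"))
print(model.describe())
```

The text output of describe() gives stakeholders something to review in each iteration before the diagram itself is drawn.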
Several risks arise when developing an ERD if modeling steps are skipped. A theoretical background, such as basic knowledge of relational databases, is important for understanding what a proper ERD requires. One common mistake is giving entities or attributes invalid names, in particular names that conflict with database keywords. For example, an attribute should not be named primary, because PRIMARY is a reserved keyword in SQL. Naming choices in an ERD not only identify the purpose of an object but also allow future users and programmers to recognize that purpose. Another mistake is failing to consider a possible increase in the volume of data. If you are modeling a project based on an older system, you can use that system to check previous data volumes. For example, in developing the human resource database you may find that job activities in the government are increasing; instead of an ERD with only a job entity, consider adding a department entity to the model. Indexing, in the context of an ERD, means identifying the primary keys and foreign keys among the attributes of each entity. Poor indexing leads to a poor ERD and consequently to a poor database for the organization. A further risk is the collation of the words stored in the database. Collation is an important concept that is widely ignored or unknown. It determines how words are compared and ordered, so a poor collation choice affects sorting and the database functions built on the ERD (Shoval, 2007).
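Two of these risks, reserved words and collation, can be demonstrated with a short sketch. It uses SQLite through Python's standard sqlite3 module as a stand-in for whichever database the organization adopts; the table names and sample surnames are hypothetical, and other database systems have their own keyword lists and collations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# An unquoted reserved word such as PRIMARY is rejected as a column name.
try:
    conn.execute("CREATE TABLE bad_names (primary TEXT)")
    reserved_word_rejected = False
except sqlite3.OperationalError:
    reserved_word_rejected = True

# Hypothetical sample data with mixed capitalization.
conn.execute("CREATE TABLE employee (last_name TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?)",
    [("clark",), ("Brown",), ("adams",)],
)

# The default BINARY collation compares byte values, so uppercase sorts first.
binary_order = [
    row[0]
    for row in conn.execute("SELECT last_name FROM employee ORDER BY last_name")
]

# NOCASE ignores ASCII case and gives the order most users expect.
nocase_order = [
    row[0]
    for row in conn.execute(
        "SELECT last_name FROM employee ORDER BY last_name COLLATE NOCASE"
    )
]

print(reserved_word_rejected)
print(binary_order)
print(nocase_order)
```

The same three names come back in a different order under each collation, which is why the collation should be decided while the model is designed rather than after the database is in use.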
The five major entities that may be required to develop the repository are employees (Employees), training history (Training_History), departments (Department), performance review (Performance_Review), and employee skills (Employee_Skills). The employee entity will store employee data, for example id, name, gender, date joined, date left, address, and employee details. Training history will hold data such as employee id, course, and date of course. The performance review will contain employee id, date of review, and the comments by the employee and by the reviewer. Department will hold data such as department id, name, address, and details of the department. The employee skills entity, which is very important in determining whether an employee can be hired, will contain employee id, skill details, skill required, and skill level code. These five entities can be accompanied by other entities, namely location, region, and reference course. These entities will help in developing a government human resource database that will be universal. The location entity will hold data such as location id, location name, address, and details. The region entity will contain region id, region name, and description. The reference course entity will be a placeholder for required courses and will hold course id, name, and description. All the entities will be well interrelated, since each contains foreign keys that reference other tables. This is important in maintaining referential integrity.
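A minimal sketch of how the five main entities and their foreign keys might be declared is given below. It is written against SQLite via Python's sqlite3 module. The column names, data types, composite keys, and sample rows are assumptions for illustration, as is the department_id column in Employees, which the attribute list above does not mention.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on

conn.executescript("""
CREATE TABLE Department (
    department_id INTEGER PRIMARY KEY,
    department_name TEXT NOT NULL
);
CREATE TABLE Employees (
    employee_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    -- assumption: each employee belongs to exactly one department
    department_id INTEGER NOT NULL REFERENCES Department(department_id)
);
CREATE TABLE Training_History (
    employee_id INTEGER NOT NULL REFERENCES Employees(employee_id),
    course TEXT NOT NULL,
    course_date TEXT NOT NULL,
    PRIMARY KEY (employee_id, course, course_date)
);
CREATE TABLE Performance_Review (
    review_id INTEGER PRIMARY KEY,
    employee_id INTEGER NOT NULL REFERENCES Employees(employee_id),
    review_date TEXT NOT NULL,
    employee_comments TEXT,
    reviewer_comments TEXT
);
CREATE TABLE Employee_Skills (
    employee_id INTEGER NOT NULL REFERENCES Employees(employee_id),
    skill_details TEXT NOT NULL,
    skill_level_code TEXT NOT NULL,
    PRIMARY KEY (employee_id, skill_details)
);
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO Department VALUES (1, 'Finance')")
conn.execute("INSERT INTO Employees VALUES (1, 'A. Example', 1)")
conn.execute("INSERT INTO Training_History VALUES (1, 'Data Privacy', '2023-01-15')")

# A training record for a non-existent employee violates referential integrity.
try:
    conn.execute("INSERT INTO Training_History VALUES (99, 'Data Privacy', '2023-01-15')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```

The rejected insert shows what referential integrity means in practice: the database refuses a training record that points to an employee who does not exist.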
A data warehouse focuses on changes over time, a property known as being time variant. A data warehouse is a collection of data that supports the management decision-making procedure, and it acts as a repository of historical data. For example, our client, the government, can retrieve data from three, six, twelve, or more months ago from a data warehouse. According to Wang, Zhou, and Le (2014), policy enforcement and training management can be tracked over time by being stored in the database. This is only possible with specific database components. We require a policy database to collect the software settings and policy information, policy brokers to manage requests from Websense, policy servers to identify and track locations, and a master database sized to hold millions of records. Data marts are very important in big organizations, since each is designed to cater for a particular line of business (Kambayashi, Winiwarter, & Arikawa, 2002).
Normalization is the process of organizing the columns (attributes) and tables (relations) of a database to reduce or completely avoid data redundancy and to maintain data integrity. The following is the normalization of the human resource database.
Employee table
1NF - Employee id, first name, middle name, last name, gender, date of birth, date joined, date registered, date hired, date left, employee address, phone
2NF - Employee id, first name, last name, gender, date of birth, date joined, date registered, date hired, date left, employee address, phone
3NF - Employee id, first name, last name, gender, date of birth, date joined, date left, employee address
Performance Review table
1NF - Review id, employee id, first review date, final review date, comments by employee, comments by reviewer, reviewer name
2NF - Review id, employee id, review date, comments by employee, comments by reviewer, reviewer name
3NF - Review id, employee id, review date, comments by employee, comments by reviewer
Training History table
1NF - Employee id, name of course, course id, date of course, course description
2NF - Employee id, course id, date of course, course description
3NF - Employee id, course id, date of course
Department table
1NF - Department id, location, department name, department address, number of employees, details, number of departments
2NF - Department id, location, department name, department address, details, number of departments
3NF - Department id, location, department name, department address, details
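The effect of normalizing the Training History table can be sketched in a few lines of plain Python. The sample rows, course ids, and course names are hypothetical. The sketch moves the course name, which depends only on the course id, into a separate course table and then checks that joining the two tables reproduces the original rows.

```python
# Hypothetical flat rows: (employee_id, course_id, course_name, course_date)
flat_rows = [
    (1, "C10", "Data Privacy", "2023-01-15"),
    (2, "C10", "Data Privacy", "2023-02-01"),
    (1, "C20", "Leadership", "2023-03-10"),
]

# The course name depends only on course_id, so it moves to its own table
# and is stored once per course instead of once per training record.
courses = {course_id: name for _, course_id, name, _ in flat_rows}

# Training history keeps only the identifying attributes and the date.
training_history = [
    (employee_id, course_id, course_date)
    for employee_id, course_id, _, course_date in flat_rows
]

# Joining the two tables on course_id reproduces the original rows,
# so no information was lost by the decomposition.
rejoined = [
    (employee_id, course_id, courses[course_id], course_date)
    for employee_id, course_id, course_date in training_history
]

print(len(courses))
print(rejoined == flat_rows)
```

Storing each course name once means a renamed course is updated in one place, which is the redundancy reduction that normalization aims for.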
Dependency Diagrams
Multivalued dependencies
References
Allen, C., Chatwin, S., & Creary, C. (2004). Introduction to relational databases and SQL programming (1st ed.). Burr Ridge, IL: McGraw Hill Technology Education.
Kambayashi, Y., Winiwarter, W., & Arikawa, M. (2002). Data warehousing and knowledge discovery (1st ed.). Berlin: Springer.
Shoval, P. (2007). Functional and object oriented analysis and design (1st ed.). Hershey, PA: Idea Group Pub.
Wang, M., Zhou, J., & Le, J. (2014). A data reusing strategy in column-store data warehouse. Chinese Journal of Computers, 36(8), 1626-1635. http://dx.doi.org/10.3724/sp.j.1016.2013.01626