1. INTRODUCTION
Data classification is the primary means by which data is protected based on its need for secrecy, sensitivity, or confidentiality. Treating all data the same when designing and implementing a security system is ineffective, because some data items need more protection than others. Classification determines how much effort, money, and resources are allocated to protect the data and control access to it. The main objective of a data classification scheme is to formalize and stratify the process of securing data based on assigned labels of importance and sensitivity. Classification also guides the choice of security mechanisms for storing, processing, and transferring data.
The criteria by which data is classified vary with the institution performing the classification. The two most common kinds of scheme are government/military classification and commercial business/private sector classification.
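The idea of stratified labels can be illustrated with a small sketch. The level names below are commonly cited examples for each kind of scheme and are an assumption here, since this text does not list them; the helper `may_read` is hypothetical and only shows how an ordered label can drive an access decision.

```python
from enum import IntEnum


class GovernmentLevel(IntEnum):
    """Illustrative government/military labels, lowest to highest (assumed)."""
    UNCLASSIFIED = 0
    SENSITIVE_BUT_UNCLASSIFIED = 1
    CONFIDENTIAL = 2
    SECRET = 3
    TOP_SECRET = 4


class CommercialLevel(IntEnum):
    """Illustrative commercial/private sector labels, lowest to highest (assumed)."""
    PUBLIC = 0
    SENSITIVE = 1
    PRIVATE = 2
    CONFIDENTIAL = 3


def may_read(clearance, label):
    """Return True if a subject's clearance is at or above the data's label."""
    # Labels from different schemes are not comparable, so reject mixed input.
    if type(clearance) is not type(label):
        raise TypeError("clearance and label must come from the same scheme")
    return clearance >= label


print(may_read(GovernmentLevel.SECRET, GovernmentLevel.CONFIDENTIAL))
print(may_read(CommercialLevel.PUBLIC, CommercialLevel.PRIVATE))
```

Using an ordered enumeration keeps the comparison explicit: more sensitive data carries a higher label, and more effort is spent guarding it.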
2. DESIGN OR METHODOLOGY
Researchers in machine learning, pattern recognition, and statistics have proposed many classification methods. Most of these algorithms are memory resident and typically assume a small data size.
Classification has numerous applications including fraud detection, performance prediction, manufacturing and medical diagnosis.
Bayesian classification is based on Bayes' theorem of posterior probability. In its naive form it assumes class conditional independence, that is, the effect of an attribute value on a given class is independent of the values of the other attributes.
Another method used in classification is prediction. Under this method, data is classified according to predictions of its future importance and role within the company or its databases.
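A minimal sketch of a naive Bayesian classifier for categorical attributes may make the independence assumption concrete. For each class C, the score is P(C) multiplied by the product of P(x_i | C) over the attribute values x_i, which is valid only because the attributes are treated as independent given the class. The function names, the add-one (Laplace) smoothing, and the toy records are assumptions made for illustration and do not come from this text.

```python
import math
from collections import Counter, defaultdict


def train(rows, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (class, attribute index) -> value counts
    values_seen = defaultdict(set)       # attribute index -> distinct values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(label, i)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen


def predict(model, row):
    """Return the class with the highest posterior score for one record."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        # Work in log space: log P(C) + sum of log P(x_i | C).
        score = math.log(count / total)
        for i, value in enumerate(row):
            seen = value_counts[(label, i)][value]
            # Add-one smoothing avoids a zero probability for unseen values.
            score += math.log((seen + 1) / (count + len(values_seen[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical records: (contains personal data?, department)
rows = [("yes", "finance"), ("yes", "hr"), ("no", "marketing"), ("no", "finance")]
labels = ["confidential", "confidential", "public", "public"]
model = train(rows, labels)
print(predict(model, ("yes", "marketing")))
```

Summing log probabilities instead of multiplying raw probabilities avoids numerical underflow when records have many attributes.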
3. EVALUATION CRITERIA
To evaluate data classification, an institution should first identify the information that needs to be classified. This is mostly done by gathering data through questionnaires, interviews, and surveys. The exercise gives a high-level view of the company's information: who owns the data, who is responsible for maintaining it, and what type of resource is used to store it. The data can then be classified in groups or item by item.
Information is also protected through the institution's security policies and its informal data-handling practices. These should promote confidentiality by ensuring that individuals who want to access information are who they claim to be, for example by presenting an identification card or a password. Access should be granted based on job function or business need, and the data owner should authorize that business need. An institution should encrypt sensitive and personal data so that it cannot be viewed inappropriately or altered without detection. Data should also be safeguarded by monitoring it properly and preventing easy access to it.
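The access rules described here can be sketched in a few lines. This is an illustration under stated assumptions rather than a complete access control system: the `Resource` class, its fields, and the `can_access` helper are hypothetical names, and authentication is reduced to a boolean that a real system would derive from an identification card or password check.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """A classified data resource with an owner and approved job functions."""
    name: str
    owner: str
    authorized_roles: set = field(default_factory=set)

    def authorize(self, requester, role):
        """Record a business need for a role; only the owner may approve it."""
        if requester != self.owner:
            raise PermissionError("only the data owner may authorize a business need")
        self.authorized_roles.add(role)


def can_access(resource, user_role, authenticated):
    """Grant access only to authenticated users whose job function is approved."""
    return authenticated and user_role in resource.authorized_roles


payroll = Resource(name="payroll records", owner="hr_director")
payroll.authorize("hr_director", "payroll_clerk")

print(can_access(payroll, "payroll_clerk", authenticated=True))
print(can_access(payroll, "payroll_clerk", authenticated=False))
print(can_access(payroll, "sales_agent", authenticated=True))
```

Keeping the owner's approval separate from the access check mirrors the two steps in the text: the owner authorizes a business need once, and each later request is tested against identity and job function.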
Evaluation also requires a way of identifying information classes, because critical and sensitive data mean different things to different people. Agreed classes help both those protecting the information and those classifying it. Before classifying data, the institution should ensure that protection measures are mapped to its goals, so that the protection is appropriate for the information. The classification process itself should be fast, which reduces the cost of the work, and accurate, meaning that it correctly predicts the class of new and previously unseen data. It should still make correct predictions when given data with missing words or values, and it should continue to work when given large amounts of data.
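Two of these criteria, accuracy and speed, are easy to measure on labelled data that the classifier has not seen before. The sketch below assumes a hypothetical `evaluate` helper and a toy rule-based classifier; neither comes from this text. One record deliberately has a missing value to show how such gaps affect the result.

```python
import time


def evaluate(classify, records, true_labels):
    """Return (accuracy, elapsed seconds) for a classifier on held-out records."""
    start = time.perf_counter()
    predicted = [classify(record) for record in records]
    elapsed = time.perf_counter() - start
    correct = sum(p == t for p, t in zip(predicted, true_labels))
    return correct / len(true_labels), elapsed


def toy_classifier(record):
    """Hypothetical rule: records known to hold personal data are confidential."""
    # dict.get returns None for a missing value, so the record falls to "public".
    return "confidential" if record.get("contains_personal_data") else "public"


records = [
    {"contains_personal_data": True},
    {"contains_personal_data": False},
    {},  # missing value
    {"contains_personal_data": True},
]
true_labels = ["confidential", "public", "confidential", "confidential"]

accuracy, seconds = evaluate(toy_classifier, records, true_labels)
print(f"accuracy={accuracy:.2f}, time={seconds:.6f}s")
```

Here the record with the missing value is misclassified, so the toy rule scores 3 out of 4. A more robust classifier would be expected to handle such records better, and repeating the measurement on larger data sets gives a rough view of scalability.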
4. CONCLUSION
In conclusion, data classification is a technique used to safeguard information and control access to it. Companies also use it to improve how they market their products and reach their target market, and several banks have used it to categorize applicants when granting loans. Efficient data classification can in turn improve a company's manufacturing know-how and thus its productivity. Because many organizations have adopted the technique, it is important that companies safeguard the classified information, since it may be confidential.