Data mining refers to the computational process of discovering patterns in large sets of data. It employs methods drawn from artificial intelligence, statistical analysis, machine learning, and database systems. The data mining process aims at extracting information from a set of data and transforming it into a clearly understandable form that is useful for further analysis. Apart from the raw analysis step, the process involves database and data management, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of the discovered structures, visualization, and online updating (Frand, 2013).
On the other hand, data analytics is a process that involves close analysis of the data acquired through data mining. It transforms the raw data into reliable information that can support major decisions. Data analytics differs from data mining in its scope, purpose, and focus. Data miners go through massive data sets with the help of sophisticated software to uncover undiscovered patterns and identify unseen connections. Data analysis, in contrast, aims at inference, which is the process of reaching a conclusion based only on what the researcher already knows. Data analysis has two main components: exploratory data analysis (EDA), which involves discovering new features in the data, and confirmatory data analysis (CDA), which tests whether existing hypotheses are true or false.
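The difference between the exploratory and confirmatory components can be illustrated with a small sketch. The two groups of values below are hypothetical and chosen only for illustration, and the exact permutation test is just one possible confirmatory technique, not a method named by the sources cited here.

```python
from itertools import combinations
from statistics import mean, median

# Hypothetical measurements for two groups (illustrative values only).
group_a = [5, 6, 7]
group_b = [1, 2, 3]

# Exploratory step (EDA): summarize the data without a hypothesis in mind.
summary = {
    "a": {"mean": mean(group_a), "median": median(group_a)},
    "b": {"mean": mean(group_b), "median": median(group_b)},
}

def permutation_p_value(a, b):
    """Confirmatory step (CDA): one-sided exact permutation test.

    Tests the hypothesis that mean(a) > mean(b) by checking how many
    ways of splitting the pooled data give a difference at least as large.
    """
    pooled = a + b
    observed = mean(a) - mean(b)
    hits = total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = [pooled[i] for i in idx]
        rest = [pooled[i] for i in range(len(pooled)) if i not in idx]
        total += 1
        if mean(chosen) - mean(rest) >= observed:
            hits += 1
    return hits / total

p_value = permutation_p_value(group_a, group_b)
print(summary)
print(p_value)
```

The summary is descriptive only, while the p-value addresses a hypothesis stated in advance, which is the distinction the paragraph above draws.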
The two processes are extremely useful in fighting terrorism. It is clear that terrorism is a major global security threat. Given the complexity of terrorists' plans and attacks, dedicated processes are needed to counter the rate at which terrorist crime escalates.
Data mining is highly relevant because it helps identify any data connected to an attack, and proper analysis of that data makes it easier to identify the source of a given attack. There are different forms of terrorism in the world. Each uses a unique strategy to attack its chosen targets, and these strategies may be hard to track and bring down. Therefore, a proper understanding of data mining and analytics is relevant in dealing with such cases as they come up (Bearson & Thearling, 2010).
In most cases, terrorists exploit sophisticated technology to reach their attack targets. To narrow down the suspects of a crime, the activities, links, associations, and relations of the criminals can be uncovered through data mining and analytic processes.
The first and most useful data mining process is the link analysis technique, also called data dependency modeling. The process is highly significant in detecting valid and relevant patterns. It works on the premise that events are linked to one another and are therefore interdependent (Bearson & Thearling, 2010).
When links are visualized, they take the form of a graph, and link analysis applies various graph-theoretic techniques to it. The process therefore comes down to the analysis of graphs. The analysis considers the anomalies and inconsistencies in the system and helps identify the network relationships, as well as the contacts, hidden in the data (Brown, 2012).
Link analysis is the first level at which networks of people, places, vehicles, telephone calls, emails, and other tangible entities are discovered, linked, assembled, examined, detected, and analyzed. With a sound link analysis strategy, the law enforcement agency will have to mine data connected to certain suspected terrorists. There must be a proper indication of how activities such as telephone calls link to certain places and occurrences at a certain time (Schmitt, Sanger & Savage, 2013).
Through the graphs produced in link analysis, it is easy to locate the connections among all the activities. Individual activities can be analyzed by forming links between them. The links may take the form of phone conversations, places visited, emails sent and received, and bank transactions. Through the link analysis algorithm, links between suspected individuals are built so that the suspects can be identified.
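The graph view of link analysis described above can be sketched in a few lines. The records, the entity labels, and the helper names below are hypothetical and serve only as an illustration. The sketch builds an undirected graph from pairwise interaction records, counts each entity's links, and groups entities that are connected to one another.

```python
from collections import defaultdict, deque

# Hypothetical interaction records: (entity_a, entity_b, kind_of_link).
records = [
    ("P1", "P2", "phone"),
    ("P2", "P3", "email"),
    ("P3", "P1", "bank"),
    ("P4", "P5", "phone"),
]

def build_graph(records):
    """Return an undirected adjacency map built from pairwise records."""
    graph = defaultdict(set)
    for a, b, _kind in records:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def connected_components(graph):
    """Group entities that can reach one another (breadth-first search)."""
    seen, components = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        queue, component = deque([start]), set()
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(sorted(component))
    return components

graph = build_graph(records)
# Degree: how many distinct entities each entity is linked to.
degree = {node: len(neighbours) for node, neighbours in graph.items()}
components = connected_components(graph)
print(components)
print(degree)
```

Each connected component is a candidate network, and entities with an unusually high degree are natural starting points for closer examination. Real link analysis tools use far richer data and measures than this sketch.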
Since the link analysis process delivers a report to the security team, the necessary follow-up steps must take place. In most cases, a link analysis report lists the names of suspects in a certain terrorist attack or plan. The suspects are subject to further investigation by the body in charge of terrorist attacks. The team employs its abilities and techniques to understand certain forms of terrorist attack and connect them with the data or information obtained from the link analysis team (Harress, 2012).
If the findings of the process are clear, the security team undertakes the role of trailing the suspects at the places linked to their activities. They use all the necessary and available techniques to get hold of the main suspects and to establish, from the connections, facts sufficient to conclude that the individuals are linked to the attacks.
The next data mining technique relevant in dealing with terrorism is clustering. It is the process of discovering groups as well as structures in the data that are related in various ways. The process does not necessarily apply known structures in the data.
Similar objects are placed in one cluster, while dissimilar objects are placed in different clusters according to their characteristics. Therefore, clustering is a reliable method of automatically assigning an individual to a certain group. Depending on the characteristics identified in the grouping, certain data occurrences can be put under detailed surveillance.
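A minimal sketch of this grouping idea is given below, using k-means as one common clustering algorithm. The choice of k-means, the two-dimensional points, and the naive initialization are assumptions made for illustration and are not taken from the sources cited here.

```python
import math

def kmeans(points, k, iterations=10):
    """Plain k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    centroids = [points[i] for i in range(k)]  # naive init: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids

# Hypothetical observations described by two numeric characteristics.
points = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.1), (0.9, 1.1), (9.2, 8.9)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

No group labels are supplied in advance. The two groups emerge only from how close the observations are to one another, which is what distinguishes clustering from classification.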
Clustering is a common strategy in the modern world, although people do not always realize its significance. People always come together depending on certain qualities, attributes, and characteristics. It is always easy to find people from the same religion, country, race, or tribe clustering together. Therefore, the main aim of data clustering is to allow people to build simpler and more understandable models of the world, which can be acted on more easily.
In fighting crime, clustering involves techniques and algorithms that depend on a real-life model in which individuals with certain traits cluster together. In terrorism, the activities of the terrorists can be clustered together. For example, incidents of cyber terrorism may be clustered together with all the elements that relate to the act. Clustering rests on the assumption that terrorists with certain specialties will cluster together.
However, the different groups must maintain extensive connections among themselves in order to execute their attacks effectively. In this way, they seek to make their operations as effective as possible.
If the security body has proper knowledge of the clusters related to certain terrorist attacks, it will be much easier to curb their activities, since the clusters and their operational areas can be identified. Therefore, any time an attack is committed, the law enforcement agencies can turn to the related clusters and examine them for clues. If the analysis matches a cluster, it becomes much easier to determine the approach used in the attack.
However, it can be confusing for the security agency when two attacks occur at the same time. Terrorists use the multiple attack strategy to cause confusion among security bodies and in the regions of attack. Nevertheless, with a proper clustering strategy it will be easier to connect certain attacks to a terrorist group. It becomes much easier to track the groups if their attack patterns are clear to the security agencies.
Classification is another data mining method useful in fighting terrorism. The process has some similarities with clustering. It is a method for identifying which sub-population a new observation belongs to, where the sub-populations consist of observations whose category membership is already known. For example, an email from a terrorist group may be classified as spam. The traits of the email are a representation of the activities of the terrorist group. Therefore, it may be possible to trail the group through the email address that it uses.
The terrorists must be analyzed through a set of quantifiable properties known as explanatory variables. The process applies a classifier, which is a mathematical function, implemented by a classification algorithm, that maps input data to a category. The classification method of dealing with crime is substantial in the sense that it supports effective action against terrorists.
Terrorists in the same class are likely to cause the same kind of damage and use the same attack strategies. Therefore, it will be easier to find them and to identify all the activities in which they may be involved.
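The notion of a classifier as a function from explanatory variables to a category can be sketched with a nearest-centroid rule applied to the email example. The two features, the training values, and the labels below are hypothetical, and nearest-centroid is only one simple classifier among many.

```python
import math
from collections import defaultdict

def train_centroids(samples):
    """samples: list of (feature_vector, label). Returns label -> mean vector."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }

def classify(centroids, features):
    """Map a new observation to the label whose centroid is nearest."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))

# Hypothetical explanatory variables per email:
# (number of links in the message, fraction of capital letters).
training = [
    ((8, 0.60), "spam"),
    ((6, 0.50), "spam"),
    ((1, 0.05), "ham"),
    ((0, 0.10), "ham"),
]
centroids = train_centroids(training)
print(classify(centroids, (7, 0.55)))
print(classify(centroids, (1, 0.10)))
```

Unlike clustering, the categories here are known in advance from labeled observations, and the classifier only decides which of them a new observation falls into. In practice the features would be scaled so that no single variable dominates the distance.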
Another data mining method useful in fighting terrorism is regression. It is a data mining technique that models the relationship between variables in the data. It seeks the function that fits the data with the minimum error, so the result does not depend on manipulation by an individual. The method is highly useful in research and analysis.
The method helps in digging out a set of information related to the attack and using it to propose a reasonable solution to the problem. The regression process may take some time, as it fits a mathematical function with the least error to explain an issue that comes up in society.
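Fitting a function with the least error can be illustrated with ordinary least squares for a straight line. The data points below are hypothetical and were chosen to lie exactly on a line so that the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept.

    Chooses the slope and intercept that minimize the sum of squared
    differences between the observed and the predicted y values.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical independent (xs) and dependent (ys) variables.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

Here xs plays the role of the independent variable and ys that of the dependent variable. With noisy real data the fitted line would not pass through every point, and the remaining error would indicate how well the relationship explains the observations.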
The accuracy of the model makes it highly reliable among individuals and groups. The method is free from some of the mistakes that may be evident in other mathematical models. For a terrorism threat, the method is useful in keeping track of the activities of a group of people as well as defining the relationships among the suspects. A proper analysis of the probable connections of a certain terrorist group to an attack shows the value of a sound regression model in solving the problem (Montalbano, 2011).
There must be a proper outline of the activities and how they are linked to the terrorism activities of certain people. The connection must remain concrete and viable to allow proper analysis with the regression model. The model works with dependent and independent variables. The independent variables are the standing factors of the terrorist groups that cannot be changed by anyone, such as committing attacks and targeting innocent people. However, different terrorist attacks call for different attack strategies, which may be the dependent variables.
In addition, the automatic summarization data mining method may play a critical role in controlling terrorism. It is the process of using a computer program to reduce a text document to a summary that retains the most important points of the original document. As time elapses, the problem of data overload continues to affect any research. There is a lot of information available, and if it is not used in time it may become irrelevant. Therefore, it is highly advisable to have such information summarized and used for the right purposes.
Summarization in curbing terrorism may take two forms. The first is extraction, a method in which an investigating team selects a subset of the existing descriptions of the terrorists' activities or processes in the original text and forms the summary from them. The second, the abstractive method, builds an internal semantic representation and then applies natural language generation techniques to produce a summary that is close to what a human being might write.
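The extraction approach can be sketched with a simple word-frequency heuristic: sentences whose words occur most often in the document are treated as the most important and are copied into the summary unchanged. The stop-word list, the scoring rule, and the sample text are assumptions made for illustration.

```python
import re
from collections import Counter

# Small, hypothetical stop-word list; real systems use much larger ones.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it", "on", "was"}

def tokenize(text):
    """Lower-case words with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

def summarize(text, max_sentences=1):
    """Extractive summary: keep the sentences with the highest
    average word frequency, in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(tokenize(text))

    def score(sentence):
        tokens = tokenize(sentence)
        return sum(freq[w] for w in tokens) / max(len(tokens), 1)

    ranked = sorted(sentences, key=score, reverse=True)[:max_sentences]
    return [s for s in sentences if s in ranked]

text = (
    "Analysts reviewed the network. "
    "The network report links the network accounts. "
    "Lunch was late."
)
print(summarize(text, max_sentences=1))
```

Because the selected sentences are taken verbatim from the source, this is extraction rather than abstraction. An abstractive system would instead generate new sentences, which requires a language generation model well beyond this sketch.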
The abstractive method is extremely significant in dealing with the rising levels of terrorism. It offers a way of using the little information available about a group of people to define the traits of the group and to arrive at an analysis of the most efficient strategies for fighting the group. With a proper understanding of the processes and activities of the terrorist groups, it would be much easier to deal with the problems that they cause (Rouse, 2013).
Advanced technology, such as the internet surveillance system used by the government of the United States to monitor the flow of terrorism messages, is also paramount. The program is known as PRISM. It follows the flow of email to certain accounts and accesses the data in the email for use in legal proceedings against the terrorists. The program identifies certain email addresses through clustering and classification to support targeted research and findings (Frand, 2013). PRISM works well because it is able to locate the origin of the messages and can track the individuals who are involved in planning or executing the respective attacks.
After the data mining process, proper analysis of the data may be useful in establishing the reliability of the mined data. In most cases, this process involves further identification of data, or a search for overlooked data, to maximize the useful output. On issues regarding terrorism, such analysis would be highly valuable since it would aid in reaching sound decisions on security issues. The analysis may use several technological tools to deal with the issues as they arise.
In that case, software such as SPSS becomes highly useful in executing the analysis. SPSS is one of the most efficient and reliable software packages for data analysis. In the analysis of security issues resulting from terrorism, the software would be useful in identifying the elements that relate a certain attack to other attacks carried out by the same terrorist group. The software gives accurate analysis, which can support conclusions about issues that are likely to arise in society. The software has become common among terrorism analysts (Li, 2013).
In addition, the National Counterterrorism Center (NCTC) developed software known as Datasphere. It was developed in 2010 with the aim of assisting the intelligence agencies in retrieving and analyzing intelligence information. The software mainly relies on data from the national intelligence headquarters through the director.
Datasphere employs analysis tools on existing data about known suspected terrorists and their associates. It aids in detecting the patterns in the data that connect the individuals to events and actions (webdatamining.net, n.d.).
The software identifies a set of people who fit the parameters associated with a threat-intelligence communication, based on the data mining report. Intelligence agencies can use the information in their investigations of terrorist attacks that are likely to occur (Lee & Stolfo, 2011).
Also, the Intelligence Advanced Research Projects Activity (IARPA) has been established, and it is useful in exploring innovations in data mining for collecting and analyzing intelligence information. The program is highly reliable and makes relevant use of the available data to improve the intelligence processes (Lee & Stolfo, 2011).
For extensive analysis of, and solutions to, the terrorist attacks being witnessed at the current time, the Knowledge Discovery and Dissemination (KDD) program is highly relevant. The program is useful in disseminating information from large, complex, and varied data sets so that it may be integrated with the other data sets that are already in use (Okonkwo & Enem, 2011).
The project also works with analysis tools that are useful across data sets once they have been properly aligned based on the intelligence report. In most cases, the IARPA and KDD programs work together. Each requires the contribution of the other to sustain the necessary intelligence analysis processes (Okonkwo & Enem, 2011).
Lastly, the Automated Low-level Analysis and Description of Diverse Intelligence is a common intelligence analysis tool. It is reliable in several ways. Above all, it works as a video-query program that aims at replacing a manual process already in use. The program allows intelligence analysts to search large video data sets and to locate clips of a certain type of event quickly and reliably (Lee & Stolfo, 2011).
The above methods of intelligence analysis are highly reliable and would be of extensive use in reaching concrete decisions regarding terrorism in society. The programs are designed in such a way that they can track processes, procedures, and the individuals executing them in the trail for terrorists.
However, it would be erroneous to conclude that the data mining and analytic processes are fully reliable or sufficient to bring about the required change in the terrorism threat. The data mining and analytic process will serve the purpose of proper tracking of terrorists only once some of the challenges that surround it are overcome.
There are numerous challenges surrounding the process, and they must be dealt with accordingly. At no point should the challenges be ignored, since they would render the intelligence processes ineffective.
Although the technology in data mining and analysis is modern, it requires extensive expertise to handle. This defines the first challenge of data mining and analytics, which is personnel. There are few well-trained individuals to handle the risks of the data mining and analytic processes. Expertise and knowledge in the field are essential to the success of the programs or software that the intelligence services use (Oracle, 2014).
Data mining requires skilled technical and analytical specialists who can structure the analysis as well as interpret the output that is created. Consequently, the challenges of data mining are primarily data-related or personnel-related rather than technology-related (Hinman, 2013).
The other limitation of data mining is that the required data is often not at hand. In most cases, the individual and corporate data that may be required to track suspected terrorists, as well as crime perpetrators, are not available. The patches of data that are available are not sufficient to deal with the level of terrorism (Hinman, 2013).
Also, it is wrong to assume that terrorists do not understand the technology used in tracking them. Their knowledge of the methods is a major challenge, since they can use it to counter the intelligence services.
Lastly, some data sets are immense and may require dedicated research and analysis. Such data may demand more resources, and a lot of time may be spent in the process. On the other hand, the analysis may lack thoroughness owing to the complexity of the data (Fan, Liu & Han, 2010).
In conclusion, the process of dealing with terrorism through a data mining and analytic approach is advisable. The process requires extensive effort and creativity in trying to access massive amounts of information regarding certain processes. Data mining goes into the details of acquiring sensitive information concerning a group of people and using it to reach those people. The methods chosen must provide relevant information useful in reaching the group. However, data mining requires proper analysis tools to ensure the reliability of the data obtained. The data analysis techniques try to gather more data as well as use the data obtained in the data mining process. However, the processes are full of challenges, and further investment is required to meet them.
References
Bearson, A., & Thearling, K. (2010). An Overview of Data Mining Techniques. Retrieved October 9, 2014, from http://www.thearling.com/text/dmtechniques/dmtechniques.htm
Brown, M. (2012, December 11). Data mining techniques. Retrieved October 9, 2014, from http://www.ibm.com/developerworks/library/ba-data-mining-techniques/
Fan, J., Liu, H., & Han, F. (2010). Challenges of Big Data Analysis. arXiv:1308.1479. Retrieved October 9, 2014, from http://arxiv.org/abs/1308.1479
Frand, J. (2013). Data Mining: What is Data Mining? Retrieved October 7, 2014, from http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm
Harress, C. (2012, February 18). Obama Says Cyberterrorism Is Country's Biggest Threat, U.S. Government Assembles "Cyber Warriors". International Business Times. Retrieved October 6, 2014, from http://www.ibtimes.com/obama-says-cyberterrorism-countrys-biggest-threat-us-government-assembles-cyber-warriors-1556337
Hinman, H. (2013, July 23). 9 Data Mining Challenges From Data Scientists Like You. Retrieved October 9, 2014, from http://1.salford-systems.com/blog/bid/305673/9-Data-Mining-Challenges-From-Data-Scientists-Like-You
Lee, W., & Stolfo, S. (2011). Data Mining Approaches for Intrusion Detection. Retrieved October 9, 2014, from http://www.cs.columbia.edu/~wenke/papers/usenix/usenix.html
Li, J. (2013, April 12). Quantitative Data Analysis Techniques for Data-Driven Marketing. iAcquire. Retrieved October 9, 2014, from http://www.iacquire.com/blog/quantitative-data-analysis-techniques-for-data-driven-marketing-2
Montalbano, E. (2011, May 12). Government Developing Data Mining Tools To Fight Terrorism. InformationWeek. Retrieved October 9, 2014, from http://www.informationweek.com/applications/government-developing-data-mining-tools-to-fight-terrorism/d/d-id/1097714?
Okonkwo, R. O., & Enem, F. (2011, March 21). Combating Crime and Terrorism Using Data Mining Techniques. Information Technology for People-Centred Development (ITePED 2011). Retrieved October 9, 2014, from http://www.ncs.org.ng/wp-content/uploads/2011//ITePED2011-Paper10.pdf
Oracle. (2014). What Is Data Mining? Retrieved October 8, 2014, from http://docs.oracle.com/cd/B28359_01/datamine.111/b28129/process.htm#DMCON002
Responsible Conduct in Data Management. (n.d.). Data Analysis. Retrieved October 9, 2014, from http://ori.hhs.gov/education/products/n_illinois_u/datamanagement/datopic.html
Rouse, M. (2013). Data analytics (DA). Retrieved October 8, 2014, from http://searchdatamanagement.techtarget.com/definition/data-analytics
Schmitt, E., Sanger, D., & Savage, C. (2013, June 7). Administration Says Mining of Data Is Crucial to Fight Terror. The New York Times. Retrieved October 9, 2014, from http://www.nytimes.com/2013/06/08/us/mining-of-data-is-called-crucial-to-fight-terror.html?pagewanted=all&_r=0
webdatamining.net. (n.d.). Web Data Mining. Predictive Analytics and Data Mining. Retrieved October 9, 2014, from http://www.web-datamining.net/analytics/