Comparing and Contrasting Computational Linguistics and Natural Language Processing
Computational linguistics is a relatively new and fast-developing science. In the future, the results of research in this field may change many aspects of life and the economy, such as business, marketing, and professional services. Computational linguistics draws on many disciplines, including psychology, computer science, mathematics, and statistics.
Natural language processing (NLP) is a field of artificial intelligence, computer science, and linguistics dealing with the interactions between computers and human (natural) languages, and it is closely related to the field of human-computer interaction. Many challenges in NLP involve natural language understanding, that is, enabling computers to derive meaning from human input; others involve natural language generation. In practical terms, NLP is the ability of a computer program to understand human language as it is spoken. Human speech is not precise: it may be ambiguous, and its linguistic structure can depend on many complex variables, including regional dialects, slang, and social context.
Elizabeth D. Liddy (Syracuse University) suggests the following definition: “Natural Language Processing is a theoretically motivated range of computational techniques for analyzing and representing naturally occurring texts at one or more levels of linguistic analysis, for the purpose of achieving human-like language processing for a range of tasks and applications. Goal. The main goal of NLP, as mentioned above, is to achieve human-like language processing.
The use of the word ‘processing’ is deliberate and should not be replaced with ‘understanding’. Although the field was originally called Natural Language Understanding (NLU) in the early days of artificial intelligence, it is now agreed that while the goal of NLP is true NLU, that goal has not yet been accomplished. A complete NLU system should be able to:
1. Paraphrase an input text;
2. Translate the text into another language;
3. Answer questions about the contents of the text;
4. Draw inferences from the text.
Origins. Like most modern disciplines, NLP is a mixed field, and different groups within it still place different emphases, shaped by the disciplines from which their members come. The main contributors to the discipline and practice of NLP are:
- Linguistics - focuses on formal, structural models of language and the discovery of language universals; in fact, the field of NLP was originally called Computational Linguistics;
- Computer Science - concerned with developing internal representations of data and efficient processing of these structures;
- Cognitive Psychology - looks at language usage as a window into human cognitive processes and has the goal of modeling the use of language in a psychologically plausible manner.
The most appropriate way to present what happens inside a Natural Language Processing system is the levels-of-language approach. This is also called the synchronic model of language, and it is distinguished from the earlier sequential model, which hypothesizes that the levels of human language processing follow one another in a strictly sequential manner. Research in psycholinguistics suggests that natural language processing is much more dynamic, as the levels can interact in a variety of orders. Introspection shows that we frequently use information gained from what is normally thought of as a higher level of processing to assist in a lower level of analysis.
Phonology. This level deals with the interpretation of speech sounds within and across words. Three types of rules are used in phonological analysis: 1) phonetic rules, for sounds within words; 2) phonemic rules, for variations of pronunciation when words are spoken together; and 3) prosodic rules, for fluctuation in stress and intonation across a sentence.
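To make the phonological level concrete, here is a minimal sketch in Python using the NLTK toolkit discussed later in this paper. It looks up phoneme-level pronunciations in the CMU Pronouncing Dictionary; the sketch assumes the cmudict corpus has been downloaded, and the example word is chosen purely for illustration.

```python
# A minimal sketch: phoneme-level lookups via NLTK's interface to the
# CMU Pronouncing Dictionary (assumes nltk.download("cmudict") was run).
from nltk.corpus import cmudict

pronunciations = cmudict.dict()  # maps a word to one or more phoneme lists

# "tomato" has two listed pronunciations, reflecting variation in speech;
# the digits on the vowels (0, 1, 2) encode stress, a prosodic property.
for phonemes in pronunciations["tomato"]:
    print(phonemes)
```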
Morphology. This level is concerned with the componential structure of words, which are composed of morphemes, the smallest units of meaning.
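As an illustration of morphological analysis, the following minimal sketch uses NLTK's stemmer and lemmatizer; the word list is invented, and the wordnet corpus is assumed to be downloaded.

```python
# A minimal sketch of morphological processing with NLTK: a stemmer strips
# affixes heuristically, while a lemmatizer maps a word to its base form
# (assumes nltk.download("wordnet") was run).
from nltk.stem import PorterStemmer, WordNetLemmatizer

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

for word in ["running", "studies", "tried"]:
    # e.g. "studies" stems to "studi" but lemmatizes to "study"
    print(word, "->", stemmer.stem(word), "/", lemmatizer.lemmatize(word, pos="v"))
```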
Lexical. At this level, humans, as well as NLP systems, interpret the meaning of individual words. Several types of processing contribute to word-level understanding, the first being the assignment of a single part-of-speech tag to each word. In this processing, words that can function as more than one part of speech are assigned the most probable part-of-speech tag based on the context in which they occur. Additionally, at the lexical level, words that have only one possible meaning or sense can be replaced by a semantic representation of that meaning.
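A minimal sketch of this word-level processing, using NLTK's part-of-speech tagger (the sentence is a standard illustration, and the tokenizer and tagger resources are assumed to be downloaded):

```python
# A minimal sketch of lexical-level processing: assigning each word a
# part-of-speech tag based on context (assumes the "punkt" and
# "averaged_perceptron_tagger" NLTK resources are downloaded).
import nltk

tokens = nltk.word_tokenize("They refuse to permit us to obtain the refuse permit")
print(nltk.pos_tag(tokens))
# "refuse" and "permit" each appear twice, tagged once as a verb and
# once as a noun, depending on the context in which they occur.
```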
Syntactic. This level focuses on analyzing the words in a sentence in order to uncover the grammatical structure of the sentence. This requires both a grammar and a parser. The output of this level of processing is a representation of the sentence that reveals the structural dependency relationships between the words. Various grammars can be utilized, and the choice of grammar will affect the choice of parser.
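For illustration, the following minimal sketch parses a sentence with NLTK using a toy grammar written just for this example; a realistic grammar would be far larger.

```python
# A minimal sketch of syntactic analysis: a toy context-free grammar plus
# a chart parser recover the structural relationships between words.
import nltk

grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det N
VP -> V NP
Det -> 'the'
N -> 'dog' | 'cat'
V -> 'chased'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("the dog chased the cat".split()):
    tree.pretty_print()  # displays the sentence's phrase structure
```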
Semantic. Most people think that meaning is determined at this level; however, as the definitions of the levels above show, all the levels contribute to meaning. Semantic processing determines the possible meanings of a sentence by focusing on the interactions among word-level meanings within the sentence.
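One small but representative semantic-level task is word sense disambiguation. The sketch below uses NLTK's implementation of the Lesk algorithm to choose a WordNet sense for "bank" from its sentence context; the sentence is invented, and the wordnet corpus is assumed to be downloaded.

```python
# A minimal sketch of word sense disambiguation: the Lesk algorithm picks
# a WordNet sense for "bank" by comparing the surrounding words against
# each sense's dictionary definition (assumes nltk.download("wordnet")).
from nltk.wsd import lesk

sentence = "I went to the bank to deposit my money".split()
sense = lesk(sentence, "bank", pos="n")
if sense is not None:
    print(sense.name(), "-", sense.definition())
```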
Discourse. While syntax and semantics deal with sentence-length units, the discourse level deals with units of text longer than a sentence. That is, it does not interpret a text as merely a concatenation of sentences, each of which can be interpreted separately. Rather, discourse focuses on the properties of the text as a whole that convey meaning through the connections between its component sentences.
Pragmatic. This level is concerned with the purposeful use of language in situations, and it utilizes context over and above the contents of the text for understanding. The goal is to explain how extra meaning is read into texts without actually being encoded in them. This requires broad world knowledge, including the understanding of plans, intentions, and goals” (Liddy).
Approaches to Natural Language Processing
Approaches to natural language processing fall into “four categories: statistical, symbolic, connectionist, and hybrid. Statistical and symbolic approaches have coexisted since the early days of the field. Connectionist NLP emerged in the 1960s. Symbolic approaches dominated the field for a long time, but statistical approaches regained popularity in the 1980s thanks to the availability of computational resources and the need to deal with broad, real-world contexts. Connectionist approaches recovered from earlier criticism by demonstrating the usefulness of neural networks in language processing.
Technological progress has been aided by the recent production of large and complex datasets, known as big data. For example, with an increasing quantity of digitalized human-translated text, the success of a machine translator can now be assessed by its accuracy in reproducing observed translations. Data from United Nations documents, which are translated by humans into six languages, make it possible for Google Translate to monitor and improve the performance of its different machine translation algorithms.
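The passage above does not name a specific metric, but one widely used way to score a machine translation against an observed human translation is the BLEU metric, available in NLTK; the sentences below are invented for illustration, and the choice of metric is an assumption of this sketch.

```python
# A minimal sketch of assessing a machine translation by its accuracy in
# reproducing an observed human translation, using the BLEU metric as
# implemented in NLTK (the metric choice is an assumption of this sketch).
from nltk.translate.bleu_score import sentence_bleu

reference = "the delegates approved the resolution".split()   # human translation
candidate = "the delegates accepted the resolution".split()   # machine output

# Bigram BLEU is used because the example sentences are very short;
# higher scores indicate closer agreement with the human translation.
print(sentence_bleu([reference], candidate, weights=(0.5, 0.5)))
```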
Machine learning algorithms can detect similarities between new and old data, supporting the computerization of tasks for which big data has become available. As a result, computerization is no longer confined to routine tasks that can be written as rule-based software; it is expanding to non-routine tasks wherever big data is available. Owing to the availability of big data, large quantities of non-routine cognitive tasks are becoming computerizable. That is, in addition to the general improvement in technology driven by big data, algorithms for big data are rapidly entering domains that depend on storing or accessing information. The use of big data is made possible by one of the chief comparative advantages of computers over human labor: scalability. Little evidence is needed to demonstrate that, in performing laborious computation, networks of machines outperform human labor; computers can thus better handle the large calculations required for large datasets. Machine learning algorithms running on computers are now, in many cases, better able to detect patterns in big data than people” (Frey and Osborne).
The Python Programming Language
NLTK (the Natural Language Toolkit) is a package of libraries and programs for symbolic and statistical natural language processing, written in the Python programming language. It includes graphical demonstrations and sample data, and it comes with extensive documentation, including a book that explains the main concepts behind the language processing tasks the package supports.
“Python and the Natural Language Toolkit (NLTK) make it possible for any programmer to become acquainted with NLP tasks easily, without spending too much time on gathering resources.
The Python programming language is a dynamically-typed, object-oriented interpreted language. Its primary advantage is the ease with which it allows a programmer to rapidly prototype a project, while its mature and powerful set of standard libraries makes it a great fit for large-scale, production-level software engineering projects as well. Python has a shallow learning curve and excellent online learning resources” (Madnani).
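As a first taste of what the toolkit offers, here is a minimal sketch that tokenizes a text and counts word frequencies; the text is invented, and the punkt tokenizer models are assumed to be downloaded.

```python
# A minimal sketch of getting started with NLTK: tokenize a text, keep
# alphabetic tokens in lowercase, and count word frequencies
# (assumes nltk.download("punkt") was run).
import nltk

text = "Python and NLTK make it easy to experiment with NLP. NLTK is written in Python."
tokens = [t.lower() for t in nltk.word_tokenize(text) if t.isalpha()]

freq = nltk.FreqDist(tokens)
for word, count in freq.most_common(3):
    print(word, count)
```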
Promise of Natural Language Processing Technology
As mentioned above, Natural Language Processing technology could change our lives significantly by simplifying many processes. For example, if a computer program can understand spoken language, customers can receive consultations online without interacting with human workers. It is hard to say yet whether the new language technology will bring only benefits to society, but, as the research discussed below suggests, the positive aspects are likely to prevail. The technology promises to be very useful in various businesses, in the marketing industry, in education and language learning, and in many other areas. It will also contribute greatly to the development of linguistics as a science.
Marketing. Scientists also intend to apply the results of Natural Language Processing for marketing purposes: “organizations typically use sentiment analysis-based systems, or even resort to manual analysis, to derive meaning from the digital ‘chatter’ of their customers. Motivated by the need for a more precise way to qualitatively mine brand-oriented, consumer-generated text, this research experimentally tests the viability of an NLP-based analytics approach to extracting information from large amounts of unstructured text. The results indicate that for detecting problems in social media data, NLP outperforms sentiment analysis. Surprisingly, the experiment shows that sentiment analysis is no better than manual analysis of social media data in contributing to organizational decision-making, and may even prove disadvantageous to such work” (Larson and Watson).
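For readers unfamiliar with the baseline the study compares against, the following is a deliberately naive sketch of word-list sentiment analysis; the word lists and scoring rule are invented for illustration and are far simpler than the systems evaluated in the cited research.

```python
# A naive sketch of word-list sentiment analysis: score a message by
# counting positive words minus negative words. The lists below are
# invented placeholders, not the lexicons used in the cited study.
POSITIVE = {"great", "love", "excellent", "good"}
NEGATIVE = {"bad", "broken", "terrible", "poor"}

def sentiment_score(text: str) -> int:
    """Return (positive word count) - (negative word count) for a message."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

# "love" and "broken" cancel out, so this problem report scores a neutral 0,
# illustrating how such analysis can miss an actionable product issue.
print(sentiment_score("love the product but the handle is broken"))
```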
Natural language processing technology can also be useful for learning foreign languages: “CALL (Computer-Assisted Language Learning) tools can offer only rather limited types of exercises, most of them on base forms, together with limited possibilities for providing feedback, because the exercises are static and the answer keys have to be prestored. ICALL (Intelligent CALL) makes natural language processing (NLP) a tool for language learning.
Many languages share common features and challenges, such as:
- complex morphology;
- extensive linguistic variation and weak norms;
- a need for distance teaching;
- a shortage of teaching materials;
- an insufficient text corpus.
With an NLP-based system it is possible to:
- create large numbers of grammar tasks in accordance with the learners' needs;
- place grammar tasks in a meaningful context;
- provide automatic diagnosis and feedback to the learner based on what he or she has produced;
- improve digital dictionaries for learners;
- handle linguistic variation.
ICALL tools can be partially built on existing NLP resources. For many languages, analyzers are not available, but ICALL tools can be established as the first NLP tools for a language, perhaps limited to the morphology and vocabulary addressed in the learning materials. Creating such basic analyzers may also be a starting point for building analyzers for the whole language, which are required for making spell-checking programs” (Antonsen).
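To suggest how such a system might generate grammar tasks automatically, here is a hypothetical sketch; the sentence bank, target words, and cloze format are all invented for illustration.

```python
# A hypothetical sketch of one ICALL idea mentioned above: generating
# fill-in-the-blank grammar tasks. The sentence bank and target words
# are invented; a real system would draw on a corpus and a morphological
# analyzer for the language being taught.
import random

SENTENCE_BANK = [
    ("She has two small dogs.", "dogs"),           # target: plural noun
    ("He walked to school yesterday.", "walked"),  # target: past-tense verb
]

def make_cloze_task(sentence: str, target: str) -> str:
    """Blank out the target word to create a grammar exercise."""
    return sentence.replace(target, "_____")

sentence, target = random.choice(SENTENCE_BANK)
print(make_cloze_task(sentence, target), f"(answer: {target})")
```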
Limitations of Natural Language Processing Technology
“First, labor-saving technologies can only be adopted when access to cheap labor is scarce or the price of capital is relatively high. We do not account for future wage levels, capital prices, or labor shortages. These factors will affect the timeline of our predictions; labor is the scarce factor, implying that wage levels will rise relative to capital prices, making computerization increasingly profitable.
Second, regulatory concerns and political activism may slow down the process of computerization. The states of Nevada and California, for instance, are currently changing legislation to allow for driverless cars. Similar steps will be needed in other states, and in relation to other technologies.
Third, making predictions about technological progress is notoriously difficult. We therefore focus on near-term technological developments and make no attempt to estimate how much time will be needed to overcome the various engineering bottlenecks to computerization” (Frey and Osborne).
The Impact of Language Technology on Society and Jobs
Technological development in general, and language technology in particular, has a very significant influence on society. New technologies constantly change and automate various processes in all spheres of our life, and every three to five years we feel that the world has progressed to the next level.
Language technology is a rather young but actively developing field that is expected to bring changes to people's lives very soon. Like any technology, it may have positive and negative sides. The creation of online translation tools, for example, has brought much benefit: they are of significant help to people who work with foreign languages professionally, and at the same time they let anyone translate sentences and texts from a language he or she does not know at all. As everybody has noticed, however, the quality of translations produced by online tools is rather poor in terms of grammar, syntax, and so on. In some cases we cannot even understand what the translated phrase means, especially if the target language and the language of the original text belong to different families. But what would happen if online tools produced perfect translations? That would be convenient, but it would also eliminate the role of human translators and thus leave them without jobs.
The secular decline in the real price of computing has created strong economic incentives for companies to substitute computer capital for human labor. Still, the tasks that computers can perform depend on the ability of a programmer to write a set of rules or procedures that appropriately direct the technology in every possible contingency. Computers will therefore be relatively productive compared to human labor when a problem can be specified, in the sense that the criteria for success are quantifiable and can be evaluated. The extent of job computerization will be determined by technological advances that allow engineering problems to be specified precisely, which sets the boundaries for the scope of computerization.
According to the authors' estimate, 47% of total US employment falls into the high-risk category, meaning that the associated jobs are potentially automatable over some unspecified number of years, perhaps a decade or two.
A less dramatic change, but one with a far larger influence on employment, is taking place in clerical work and professional services. Technologies like the Web, big data, artificial intelligence, and improved analytics are automating many routine tasks. Numerous white-collar jobs, such as those in the post office and in customer service, have disappeared (MIT Technology Review).
The demand for specialists in foreign languages may also decrease as a result of the process described above. Many companies now have tools that provide information in several languages: in customer service, the client simply chooses his or her native language and immediately receives the page in that language. With the further development of translation tools, it may eventually become possible to produce error-free translations without human assistance. For the time being, however, we cannot yet speak of replacing human labor in this sphere, as the tools are not efficient enough.
“Even if digital technologies are holding down job creation, history suggests that this is most likely a temporary, albeit painful, shock; as workers adjust their skills and entrepreneurs create opportunities based on the new technologies, the number of jobs will rebound.
What makes it difficult to determine the net effect on jobs is that automation is typically used to make workers more efficient, not necessarily to replace them. Rising productivity means that businesses can do the same work with fewer employees, but it can also enable businesses to expand production with their current workforce, and even to enter new markets” (MIT Technology Review).
Conclusion
As this paper shows, natural language processing technology has powerful potential, and the development of language technology will bring a great deal to people and businesses. However, it may also have negative effects on society, such as a shortage of jobs. In the scientists' view, even if this is painful for people, it will most likely push them to adapt and learn new professions, while entrepreneurs will create job opportunities in new fields. Such a transition will lead to significant change and may even create a so-called ‘new economy’ with new technologies and jobs.
Works Cited
Antonsen, Lene. “The Impact of Language Technology on Society.” en.uit.no, 2013. Web. 2 Dec. 2014.
Frey, Carl Benedikt, and Michael A. Osborne. “The Future of Employment: How Susceptible Are Jobs to Computerisation?” Oxford Martin School (2013): 1-45. Web. 1 Dec. 2014.
Larson, Keri, and Richard T. Watson. “The Impact of Natural Language Processing-Based Textual Analysis of Social Media Interactions on Decision Making.” Proceedings of the 21st European Conference on Information Systems (2013): 1-12. Web. 1 Dec. 2014.
Liddy, Elizabeth D. “Natural Language Processing.” Encyclopedia of Library and Information Science (2001): n. pag. Web. 1 Dec. 2014.
Madnani, Nitin. “Getting Started on Natural Language Processing with Python.” ACM Crossroads (2013): 1-15. Web. 2 Dec. 2014.
MIT Technology Review. “How Technology Is Destroying Jobs.” MIT Technology Review, 2013. Web. 1 Dec. 2014.