Natural Language Processing
The value to our society of being able to communicate with computers in everyday "natural" language cannot be overstated. Imagine asking your computer "Does this candidate have a good record on the environment?" or "When is the next televised National League baseball game?" Or being able to tell your PC "Please format my homework the way my English professor likes it." Commercial products can already do some of these things, and AI scientists expect many more in the next decade. One goal of AI work in natural language is to enable communication between people and computers without resorting to memorization of complex commands and procedures. Automatic translation---enabling scientists, business people and just plain folks to interact easily with people around the world---is another goal. Both are just part of the broad field of AI and natural language, along with the cognitive science aspect of using computers to study how humans understand language.

What is Computational Linguistics? Hans Uszkoreit, CL Department, University of the Saarland, Germany. 2000. A short, non-technical overview of this exciting field.

Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. By Daniel Jurafsky and James H. Martin. Prentice-Hall, 2000. Both the Preface and Chapter 1 are available online, as are the resources for all of the chapters.

Natural Language. A summary by Patrick Doyle. Very informative, though there are some spots that are quite technical.

NLP Tutorials. From Dave Inman, School of Computing, South Bank University, London. The topics covered include: Can computers understand language?; What kinds of ambiguity exist and why does ambiguity hinder NLP?; and A simple Prolog parser to analyse the structure of language.

Natural Language Processing. Lecture Notes from Associate Professor John Batali's course in Artificial Intelligence Modeling at the University of California at San Diego's Department of Cognitive Science.

Glossary of Linguistic Terms. Compiled by Dr. Peter Coxhead of The University of Birmingham School of Computer Science for his students.
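The ambiguity question raised in the NLP Tutorials entry above is easy to demonstrate concretely. The tutorials use a Prolog parser; the sketch below makes the same point in Python with a toy CKY chart that counts the parses of the classic sentence "I saw the man with the telescope." The grammar is an invented teaching example, not taken from the tutorials.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form: the classic PP-attachment case.
BINARY = {
    ("NP", "VP"): ["S"],
    ("V", "NP"): ["VP"],
    ("VP", "PP"): ["VP"],   # "saw ... with the telescope"
    ("NP", "PP"): ["NP"],   # "the man with the telescope"
    ("Det", "N"): ["NP"],
    ("P", "NP"): ["PP"],
}
LEXICON = {
    "I": ["NP"], "saw": ["V"], "the": ["Det"],
    "man": ["N"], "telescope": ["N"], "with": ["P"],
}

def count_parses(words):
    """CKY-style chart that counts distinct parse trees rooted in S."""
    n = len(words)
    # chart[(i, j)][A] = number of ways symbol A derives words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, w in enumerate(words):
        for tag in LEXICON[w]:
            chart[(i, i + 1)][tag] += 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for b, nb in chart[(i, k)].items():
                    for c, nc in chart[(k, j)].items():
                        for a in BINARY.get((b, c), []):
                            chart[(i, j)][a] += nb * nc
    return chart[(0, n)]["S"]

print(count_parses("I saw the man with the telescope".split()))  # 2
```

The two parses correspond to the two readings: the telescope is either the instrument of the seeing or a property of the man. This is exactly the sort of structural ambiguity that makes NLP hard.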
The Futurist - The Intelligent Internet. The Promise of Smart Computers and E-Commerce. By William E. Halal. Government Computer News Daily News (June 23, 2004). "Scientific advances are making it possible for people to talk to smart computers, while more enterprises are exploiting the commercial potential of the Internet. ... [F]orecasts conducted under the TechCast Project at George Washington University indicate that 20 commercial aspects of Internet use should reach 30% 'take-off' adoption levels during the second half of this decade to rejuvenate the economy. Meanwhile, the project's technology scanning finds that advances in speech recognition, artificial intelligence, powerful computers, virtual environments, and flat wall monitors are producing a 'conversational' human-machine interface. These powerful trends will drive the next generation of information technology into the mainstream by about 2010. ... The following are a few of the advances in speech recognition, artificial intelligence, powerful chips, virtual environments, and flat-screen wall monitors that are likely to produce this intelligent interface. ... IBM has a Super Human Speech Recognition Program to greatly improve accuracy, and in the next decade Microsoft's program is expected to reduce the error rate of speech recognition, matching human capabilities. ... MIT is planning to demonstrate their Project Oxygen, which features a voice-machine interface. ... Amtrak, Wells Fargo, Land's End, and many other organizations are replacing keypad-menu call centers with speech-recognition systems because they improve customer service and recover investment in a year or two. ... General Motors' OnStar driver-assistance system relies primarily on voice commands, with live staff for backup; the number of subscribers has grown from 200,000 to 2 million and is expected to increase by 1 million per year. The Lexus DVD Navigation System responds to over 100 commands and guides the driver with voice and visual directions."

Experts Use AI to Help GIs Learn Arabic. By Eric Mankin. USC News (June 21, 2004). "To teach soldiers basic Arabic quickly, USC computer scientists are developing a system that merges artificial intelligence with computer game techniques. The Rapid Tactical Language Training System, created by the USC Viterbi School of Engineering's Center for Research in Technology for Education (CARTE) and partners, tests soldier students with videogame missions in animated virtual environments where, to pass, the students must successfully phrase questions and understand answers in Arabic." Read the story and then watch the video!

Natural Language Processing: She Needs Something Old & Something New (maybe something borrowed and something blue, too.) Karen Sparck Jones, University of Cambridge, UK. Her 1994 Presidential Address to the Assn. for Computational Linguistics (ACL). "I want to assess where we are now, in computational linguistics and natural language processing, compared with where we started, and to put my view of what we need to do next. ... Computational linguistics, or natural language processing (NLP), is nearly as old as serious computing. Work began more than forty years ago, and one can see it going through successive phases...."

Natural Language Processing FAQ. Maintained by Dragomir R. Radev, Dept. of Computer Science, Columbia University.

An Overview of Empirical Natural Language Processing. By Eric Brill and Raymond J. Mooney (1997). AI Magazine 18 (4): 13-24. "In recent years, there has been a resurgence in research on empirical methods in natural language processing. These methods employ learning techniques to automatically extract linguistic knowledge from natural language corpora rather than require the system developer to manually encode the requisite knowledge. The current special issue reviews recent research in empirical methods in speech recognition, syntactic parsing, semantic processing, information extraction, and machine translation. This article presents an introduction to the series of specialized articles on these topics and attempts to describe and explain the growing interest in using learning methods to aid the development of natural language processing systems."
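A minimal illustration of the empirical approach Brill and Mooney describe: instead of hand-coding rules, a naive Bayes classifier learns to disambiguate a word sense from a handful of labeled sentences. The tiny corpus, the senses, and the test phrases below are invented for the example.

```python
import math
from collections import Counter, defaultdict

# Invented training corpus: each sentence is labeled with the sense
# of the ambiguous word "bank" (financial institution vs. riverside).
CORPUS = [
    ("the bank approved my loan application", "finance"),
    ("deposit the check at the bank", "finance"),
    ("interest rates at the bank rose", "finance"),
    ("we fished from the river bank", "river"),
    ("the bank of the stream was muddy", "river"),
]

def train(data):
    """Count word frequencies per sense, plus sense priors."""
    word_counts = defaultdict(Counter)
    sense_counts = Counter()
    for text, sense in data:
        sense_counts[sense] += 1
        word_counts[sense].update(text.split())
    return word_counts, sense_counts

def classify(text, word_counts, sense_counts):
    """Pick the sense maximizing log P(sense) + sum log P(word|sense),
    with add-one (Laplace) smoothing for unseen words."""
    vocab = {w for c in word_counts.values() for w in c}
    total = sum(sense_counts.values())
    best, best_score = None, float("-inf")
    for sense in sense_counts:
        score = math.log(sense_counts[sense] / total)
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in text.split():
            score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best, best_score = sense, score
    return best

wc, sc = train(CORPUS)
print(classify("the loan from the bank", wc, sc))  # finance
print(classify("the muddy river bank", wc, sc))    # river
```

Five sentences are obviously far too few for real use; the point is only that the "linguistic knowledge" (which words signal which sense) is extracted from the corpus rather than encoded by hand.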
"Computational Linguistics is the only publication devoted exclusively to the design and analysis of natural language processing systems. From this unique quarterly, university and industry linguists, computational linguists, artificial intelligence (AI) investigators, cognitive scientists, speech specialists, and philosophers get information about computational aspects of research on language, linguistics, and the psychology of language processing and performance. Published by The MIT Press for: The Association for Computational Linguistics." Abstracts are available online.

Natural Language Understanding. By Avron Barr (1980). AI Magazine 1 (1): 5-10. "This is an excerpt from the Handbook of Artificial Intelligence, a compendium of hundreds of articles about AI ideas, techniques, and programs being prepared at Stanford University by AI researchers and students from across the country." Don't miss the fascinating section: Early History.

Empirical Methods in Information Extraction. By Claire Cardie (1997). AI Magazine 18 (4): 65-79. "This article surveys the use of empirical, machine-learning methods for a particular natural language-understanding task: information extraction. The author presents a generic architecture for information-extraction systems and then surveys the learning algorithms that have been developed to address the problems of accuracy, portability, and knowledge acquisition for each component of the architecture."

Duo-Mining - Combining Data and Text Mining.

A Short Introduction to Text-to-Speech Synthesis.

At I.B.M., That Google Thing Is So Yesterday. By James Fallows. The New York Times (December 26, 2004; reg. req'd.). "Suddenly, the computer world is interesting again. ... The most attractive offerings are free, and they are concentrated in the newly sexy field of 'search.' ... [T]oday's subject is the virtually unpublicized search strategy of another industry heavyweight: I.B.M. ... I.B.M. says that its tools will make possible a further search approach, that of 'discovery systems' that will extract the underlying meaning from stored material no matter how it is structured (databases, e-mail files, audio recordings, pictures or video files) or even what language it is in. The specific means for doing so involve steps that will raise suspicions among many computer veterans. These include 'natural language processing,' computerized translation of foreign languages and other efforts that have broken the hearts of artificial-intelligence researchers through the years. But the combination of ever-faster computers and ever-evolving programming allowed the systems I saw to succeed at tasks that have beaten their predecessors. ... Jennifer Chu-Carroll of I.B.M. demonstrated a system called Piquant, which analyzed the semantic structure of a passage and therefore exposed 'knowledge' that wasn't explicitly there. After scanning a news article about Canadian politics, the system responded correctly to the question, 'Who is Canada's prime minister?' even though those exact words didn't appear in the article."
Dialogues with Colorful Personalities of Early AI. By Guven Guzeldere and Stefano Franchi (1995). From Constructions of the Mind: Artificial Intelligence and the Humanities, a special issue of the Stanford Humanities Review, Volume 4, Issue 2. "Of all the legacies of the era of the sixties, three colorful, not to say garrulous, 'personalities' that emerged from the early days of artificial intelligence research are worth mentioning: ELIZA, the Rogerian psychotherapist; PARRY, the paranoid; and (as part of a younger generation) RACTER, the 'artificially insane' raconteur. All three of these 'characters' are natural language processing systems that can 'converse' with human beings (or with one another) in English."
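The conversational trick behind ELIZA and its descendants is largely keyword-and-template pattern matching. The following is a minimal sketch in that spirit; Weizenbaum's actual script was far larger and used ranked keywords with pronoun transformations, so these few rules are merely stand-ins.

```python
import re

# A few illustrative ELIZA-style rules: (pattern, response template).
RULES = [
    (r"i am (.*)", "Why do you say you are {0}?"),
    (r"i feel (.*)", "Do you often feel {0}?"),
    (r"my (.*)", "Tell me more about your {0}."),
    (r"(.*)\?", "Why do you ask that?"),
]
DEFAULT = "Please go on."

def respond(utterance):
    """Apply the first rule whose pattern matches, echoing the captured
    fragment back -- the core of ELIZA's illusion of understanding."""
    text = utterance.lower().strip(".! ")
    for pattern, template in RULES:
        m = re.fullmatch(pattern, text)
        if m:
            return template.format(*m.groups())
    return DEFAULT

print(respond("I am unhappy."))  # Why do you say you are unhappy?
print(respond("My mother ignores me."))
```

The second call answers "Tell me more about your mother ignores me." -- the kind of naive echo that betrays how little such systems actually "understand," which is precisely the point of the Dialogues piece above.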
LifeCode: A Deployed Application for Automated Medical Coding. By Daniel T. Heinze, Mark Morsch, Ronald Sheffer, Michelle Jimmink, Mark Jennings, William Morris, and Amy Morsch. AI Magazine 22 (2): 76-88 (Summer 2001). This paper is based on the authors' presentation at the Twelfth Innovative Applications of Artificial Intelligence Conference (IAAI-2000). "LifeCode is a natural language processing (NLP) and expert system that extracts demographic and clinical information from free-text clinical records."

Natural Language Processing and Knowledge Representation: Language for Knowledge and Knowledge for Language. Edited by Lucja M. Iwanska and Stuart C. Shapiro. AAAI Press. The following excerpt is from the Preface, which is available online: "The research direction of natural language-based knowledge representation and reasoning systems constitutes a tremendous change in how we view the role of natural language in an intelligent computer system. The traditional view, widely held within the artificial intelligence and computational linguistics communities, considers natural language as an interface or front end to a system such as an expert system or knowledge base. In this view, inferencing and other interesting information and knowledge processing tasks are not part of natural language processing. By contrast, the computational models of natural language presented in this book view natural language as a knowledge representation and reasoning system with its own unique, computationally attractive representational and inferential machinery. This new perspective sheds some light on the actual, still largely unknown, relationship between natural language and the human mind. Taken to an extreme, such approaches speculate that the structure of the human mind is close to natural language. In other words, natural language is essentially the language of human thought."
"I'm sorry Dave, I'm afraid I can't do that": Linguistics, Statistics, and Natural Language Processing circa 2001. By Lillian Lee, Cornell Natural Language Processing Group. To appear in the National Academies' Study on Fundamentals of Computer Science. "A brief, general-audience overview of the history of natural language processing, focusing on data-driven approaches."

A Performance Evaluation of Text-Analysis Technologies. By Wendy Lehnert and Beth Sundheim (1991). AI Magazine 12 (3): 81-94. "A performance evaluation of 15 text-analysis systems conducted to assess the state of the art for detailed information extraction from unconstrained continuous text. ... Based on multiple strategies for computing each metric, the competing systems were evaluated for recall, precision, and overgeneration. The results support the claim that systems incorporating natural language-processing techniques are more effective than systems based on stochastic techniques alone."

Natural Language Lecture Slides & Accompanying Transcripts from Professors Tomás Lozano-Pérez & Leslie Kaelbling's Spring 2003 course: Artificial Intelligence. Available from MIT OpenCourseWare.
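The recall, precision, and overgeneration metrics behind the Lehnert-Sundheim evaluation can be sketched in a few lines. This is a simplified rendering: the official MUC scoring distinguished more answer categories (partial matches, for instance), which are omitted here, and the slot fills below are hypothetical.

```python
def extraction_scores(predicted, gold):
    """Simplified MUC-style scores for an information-extraction run.
    precision     = correct / predicted
    recall        = correct / gold
    overgeneration = spurious / predicted  (a rough rendering of the
    MUC notion, without partial-credit categories)."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)      # fills matching the answer key
    spurious = len(predicted - gold)     # fills the key does not contain
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    overgeneration = spurious / len(predicted) if predicted else 0.0
    return precision, recall, overgeneration

# Hypothetical system output vs. answer key for one template slot:
pred = {"Acme Corp", "John Smith", "Tuesday"}
gold = {"Acme Corp", "John Smith", "Jane Doe"}
p, r, o = extraction_scores(pred, gold)
print(f"precision={p:.2f} recall={r:.2f} overgeneration={o:.2f}")
# precision=0.67 recall=0.67 overgeneration=0.33
```

Precision and recall pull against each other: a system that emits every candidate fill scores high recall but low precision and high overgeneration, which is why the evaluation reported all three.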
Natural Language Understanding and Semantics. Section 1.2.4 of Chapter One (available online) of George F. Luger's textbook, Artificial Intelligence: Structures and Strategies for Complex Problem Solving, 5th Edition (Addison-Wesley, 2005). "One of the long-standing goals of artificial intelligence is the creation of programs that are capable of understanding and generating human language. Not only does the ability to use and understand natural language seem to be a fundamental aspect of human intelligence, but also its successful automation would have an incredible impact on the usability and effectiveness of computers themselves. ... Understanding natural language involves much more than parsing sentences into their individual parts of speech and looking those words up in a dictionary. Real understanding depends on extensive background knowledge about the domain of discourse and the idioms used in that domain as well as an ability to apply general contextual knowledge to resolve the omissions and ambiguities that are a normal part of human speech."

Computers That Speak Your Language - Voice recognition that finally holds up its end of a conversation is revolutionizing customer service. Now the goal is to make natural language the way to find any type of information, anywhere. By Wade Roush. Technology Review (June 2003). "Building a truly interactive customer service system like Nuance's requires solutions to each of the major challenges in natural-language processing: accurately transforming human speech into machine-readable text; analyzing the text's vocabulary and structure to extract meaning; generating a sensible response; and replying in a human-sounding voice." And be sure to see the illustration in the article: Inside a Conversational Computer.

Chatbot bids to fool humans - A computer program designed to talk like a human is preparing for its biggest test in its bid to be truly "intelligent". By Jo Twist. BBC (September 22, 2003). "Jabberwacky lives on a computer hard drive, tells jokes, uses slang, sometimes swears and can be quite a confrontational conversationalist. What sets this chatty AI (artificial intelligence) chatbot apart from others is the more it natters, the more it learns. The bot is the only UK finalist in this year's Loebner Prize and is hoping to chat its way to a gold medal for its creator, Rollo Carpenter. The Loebner Prize is the annual competition to find the computer with the most convincing conversational skills and started in 1990. Jabberwacky will join eight other international finalists in October, when they pit their wits against flesh-and-blood judges to see if they can pass as one of them. It is the ultimate Turing Test, which was designed by mathematician Alan Turing to see whether computers 'think' and have 'intelligence'."
The Association for Computational Linguistics (ACL) is the "international scientific and professional society for people working on problems involving natural language and computation."

ACL NLP/CL Universe. Choose "Browse" to see menus of what is offered (introductory materials, research groups, conferences, bibliographies, etc.) or choose "Search" for a keyword search engine. ["The NLP/CL Universe is a Web catalog/search engine that is devoted to Natural Language Processing and Computational Linguistics Web sites. It exists since March 18, 1995." Maintained by Dragomir R. Radev for ACL.]

AI on the Web: Natural Language Processing. A resource companion to Stuart Russell and Peter Norvig's "Artificial Intelligence: A Modern Approach," with links to reference material, people, research groups, books, companies and much more.

Natural Language Group. Information Sciences Institute, University of Southern California. Be sure to see What It's All About: "Natural Language Processing (or Human Language Technology, or Computational Linguistics) is about the treatment of human languages by computer, dating since the early 1950s. NLP has experienced unprecedented growth over the past few years. ..."

Natural Language Learning at UT Austin. "Natural language processing systems are difficult to build, and machine learning methods can help automate their construction significantly. Our research in learning for natural language mainly involves applying inductive logic programming and other relational learning techniques to constructing database interfaces and information extraction systems from supervised examples. However, we have also conducted research in learning for syntactic parsing, machine translation, word-sense disambiguation, and morphology (past tense generation)." Check out the 3 demos of learning natural-language interfaces: Geoquery; RestaurantQuery; and JobQuery.
Natural Language Processing Course Listing, part of the 2004 NLP Course Survey conducted by ACL (Association for Computational Linguistics).

The Natural Language Processing Dictionary (NLP Dictionary). Compiled by Bill Wilson, Associate Professor in the Artificial Intelligence Group, School of Computer Science and Engineering, University of NSW. "You should use The NLP Dictionary to clarify or revise concepts that you have already met. The NLP Dictionary is not a suitable way to begin to learn about NLP."

Natural Language Processing Group. Department of Artificial Intelligence, University of Edinburgh.

Natural Language Processing Group, Microsoft Research. "The goal of the [Microsoft] Natural Language Processing (NLP) group is to design and build a computer system that will analyze, understand, and generate languages that humans use naturally, so that eventually you can address your computer as though you were addressing another person. This goal is not easy to reach. ... The challenges we face stem from the highly ambiguous nature of natural language."

Natural Language Processing Resource Sites. From Mary D. Taffet. "Please note: This webpage was created primarily for the use of the students in the Natural Language Processing course taught by Elizabeth Liddy at Syracuse University's School of Information Studies. Students of other NLP or Computational Linguistics courses are more than welcome to make use of this page as well. The primary purpose of this page is to point to: 1. NLP-related demos that are available online ... 2. Resources relevant to the various levels of language processing 3. Other useful links for NLP students, relating to any aspect of Natural Language Processing that might be encountered in an academic course, from the lowest levels of language processing to the highest levels."
"The Natural Language Software Registry (NLSR) [fourth edition] is a concise summary of the capabilities and sources of a large amount of natural language processing (NLP) software available to the NLP community. It comprises academic, commercial and proprietary software with specifications and terms on which it can be acquired clearly indicated." From the Language Technology Lab of the German Research Centre for Artificial Intelligence (DFKI GmbH).

START. "The START Natural Language System is a software system designed to answer questions that are posed to it in natural language. START parses incoming questions, matches the queries created from the parse trees against its knowledge base and presents the appropriate information segments to the user. In this way, START provides untrained users with speedy access to knowledge that in many cases would take an expert some time to find."

Stanford NLP Group. "A distinguishing feature of the Stanford NLP Group is our effective combination of sophisticated and deep linguistic modeling and data analysis with innovative probabilistic and machine learning approaches to NLP."
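START's parse-and-match design can be caricatured in a few lines: reduce the question to content terms, match them against stored entries, and return the best hit. Real START builds parse trees and matches structured queries against its knowledge base; the keyword-overlap matcher, knowledge base, and stopword list below are invented for the sketch.

```python
# Tiny hand-built knowledge base: (key terms, stored answer text).
# The facts are illustrative only.
KB = [
    ({"capital", "france"}, "The capital of France is Paris."),
    ({"invented", "telephone"},
     "The telephone was invented by Alexander Graham Bell."),
    ({"speed", "light"},
     "The speed of light is about 299,792 km per second."),
]
STOPWORDS = {"what", "is", "the", "of", "who", "a", "an"}

def answer(question):
    """Return the KB entry whose key terms best overlap the question."""
    terms = {w.strip("?.,").lower() for w in question.split()} - STOPWORDS
    best, best_overlap = None, 0
    for keys, text in KB:
        overlap = len(keys & terms)
        if overlap > best_overlap:
            best, best_overlap = text, overlap
    return best or "I don't know."

print(answer("What is the capital of France?"))
# The capital of France is Paris.
```

Keyword overlap is of course far weaker than parsing: it cannot distinguish "Who did Bell call first?" from "Who called Bell first?", which is exactly why START invests in syntactic analysis.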
Aikins, Janice, Rodney Brooks, William Clancey, et al. 1981. Natural Language Processing Systems. In The Handbook of Artificial Intelligence, Vol. I, ed. Avron Barr and Edward A. Feigenbaum, 283-321. Stanford/Los Altos, CA: HeurisTech Press/William Kaufmann, Inc.

Allen, J. F. 1994. Natural Language Understanding. Redwood City, CA: Benjamin/Cummings. A new edition of a classic work.

Bobrow, Daniel. 1968. Natural Language Input for a Computer Problem Solving System. In Semantic Information Processing, ed. Marvin Minsky, 133-215. Cambridge, MA: MIT Press.

Charniak, E. 1993. Statistical Language Learning. Cambridge, MA: MIT Press.

Cohen, P., J. Morgan, and M. Pollack. 1990. Intentions in Communication. Cambridge, MA: MIT Press.

Grosz, Barbara J., Martha E. Pollack, and Candace L. Sidner. 1989. Discourse. In Foundations of Cognitive Science, ed. M. Posner, 437-468. Cambridge, MA: MIT Press.

Grosz, Barbara J., Karen Sparck Jones, and Bonnie L. Webber, editors. 1986. Readings in Natural Language Processing. San Mateo, CA: Morgan Kaufmann.

Mahesh, Kavi, and Sergei Nirenburg. 1997. Knowledge-Based Systems for Natural Language. In The Computer Science and Engineering Handbook, ed. Allen B. Tucker, Jr., 637-653. Boca Raton, FL: CRC Press, Inc.

McKeown, K., and W. Swartout. 1987. Language Generation and Explanation. In Annual Review of Computer Science, Vol. 2. Palo Alto, CA: Annual Reviews.

Patterson, Dan W. 1990. Natural Language Processing. In Introduction to Artificial Intelligence and Expert Systems by Dan W. Patterson, 227-270. Englewood Cliffs, NJ: Prentice Hall.

Schank, Roger C. 1975. The Structure of Episodes in Memory. In Computation and Intelligence: Collected Readings, ed. George F. Luger, 236-259. Menlo Park/Cambridge, MA: AAAI Press/The MIT Press, 1995.

Weizenbaum, J. 1966. ELIZA--A Computer Program for the Study of Natural Language Communication Between Man and Machine. Communications of the ACM, 9 (1): 36-45. A pioneering work.

Winograd, T. 1972. Understanding Natural Language. New York: Academic Press. A pioneering work.