Linguists, computer scientists use TACC supercomputers to improve natural language processing
It’s not hard to tell the difference between the “charge” of a battery and criminal “charges.” But for computers, distinguishing between the various meanings of a word is difficult.
For more than 50 years, linguists and computer scientists have tried to get computers to understand human language by programming semantics as software. Driven initially by efforts to translate Russian scientific texts during the Cold War (and more recently by the value of information retrieval and data analysis tools), these efforts have met with mixed success. IBM’s Jeopardy-winning Watson system and Google Translate are high-profile, successful applications of language technologies, but the humorous answers and mistranslations they sometimes produce show how difficult the problem remains.
Our ability to easily distinguish between multiple word meanings is rooted in a lifetime of experience. Using the context in which a word is used, an intrinsic understanding of syntax and logic, and a sense of the speaker’s intention, we intuit what another person is telling us.
“In the past, people have tried to hand-code all of this knowledge,” explained Katrin Erk, a professor of linguistics at The University of Texas at Austin focusing on lexical semantics. “I think it’s fair to say that this hasn’t been successful. There are just too many little things that humans know.”
Other efforts have tried to use dictionary meanings to train computers to better understand language, but these attempts have also faced obstacles. Dictionaries have their own sense distinctions, which are crystal clear to the dictionary-maker but murky to the dictionary reader. Moreover, no two dictionaries provide the same set of meanings — frustrating, right?
Watching annotators struggle to make sense of conflicting definitions led Erk to try a different tactic. Instead of hard-coding human logic or deciphering dictionaries, why not mine a vast body of texts (which are a reflection of human knowledge) and use the implicit connections between the words to create a weighted map of relationships — a dictionary without a dictionary?
“An intuition for me was that you could visualize the different meanings of a word as points in space,” she said. “You could think of them as sometimes far apart, like a battery charge and criminal charges, and sometimes close together, like criminal charges and accusations (‘the newspaper published charges…’). The meaning of a word in a particular context is a point in this space. Then we don’t have to say how many senses a word has. Instead we say: ‘This use of the word is close to this usage in another sentence, but far away from the third use.’”
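The idea of word usages as points in space can be illustrated with a small sketch. This is not Erk's model: the vectors, their four dimensions, and the choice of cosine similarity as the closeness measure are all assumptions made for illustration. Each usage is represented by hypothetical counts of nearby context words, and usages with similar contexts end up close together.

```python
import math

def cosine_similarity(u, v):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical context counts for three usages of "charge(s)".
# Assumed dimensions: (court, police, volt, phone). The numbers are made up.
criminal = [5, 4, 0, 0]      # "criminal charges"
accusation = [3, 2, 0, 1]    # "the newspaper published charges"
battery = [0, 0, 6, 4]       # "battery charge"

near = cosine_similarity(criminal, accusation)
far = cosine_similarity(criminal, battery)
print(f"criminal vs. accusation: {near:.2f}")
print(f"criminal vs. battery:    {far:.2f}")
```

With these made-up numbers, the criminal and accusation usages score far higher than the criminal and battery usages, and no fixed list of senses is ever needed.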
Creating a model that can accurately reproduce this intuitive ability to distinguish word meanings requires a lot of text and a lot of analytical horsepower.
“The lower end for this kind of research is a text collection of 100 million words,” she explained. “If you can give me a few billion words, I’d be much happier. But how can we process all of that information? That’s where supercomputers and Hadoop come in.”
Applying Computational Horsepower
Erk initially conducted her research on desktop computers, but around 2009, she began using the parallel computing systems at the Texas Advanced Computing Center (TACC). Access to a special Hadoop-optimized subsystem on TACC’s Longhorn supercomputer allowed Erk and her collaborators to expand the scope of their research. Hadoop is a software framework well suited to text analysis and to mining unstructured data, and it can take advantage of large computer clusters. Computational models that take weeks to run on a desktop computer can run in hours on Longhorn. This opened up new possibilities.
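Hadoop implements the MapReduce pattern, in which work is split into a map step that runs independently on each piece of input and a reduce step that combines the results. The sketch below imitates that pattern in plain Python on a single machine. It does not use Hadoop itself, and the function names and sample documents are hypothetical; it only shows why the approach suits text, since each document can be mapped on a different node.

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}

# Made-up input: on a cluster, each document could be mapped on a different node.
documents = [
    "the battery needs a charge",
    "the court dropped the charge",
]
mapped = chain.from_iterable(map_phase(doc) for doc in documents)
word_counts = reduce_phase(shuffle(mapped))
print(word_counts)
```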
“In a simple case we count how often a word occurs in close proximity to other words. If you’re doing this with one billion words, do you have a couple of days to wait to do the computation? It’s no fun,” Erk said. “With Hadoop on Longhorn, we could get the kind of data that we need to do language processing much faster. That enabled us to use larger amounts of data and develop better models.”
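The simple case Erk describes, counting how often a word occurs near other words, might look like the following sketch. The window size of two words, the function name, and the sample sentence are assumptions chosen for illustration; a real study would run the same kind of count over hundreds of millions of words.

```python
from collections import Counter, defaultdict

def cooccurrence_counts(tokens, window=2):
    """Count, for each word, the words appearing within `window` positions of it."""
    counts = defaultdict(Counter)
    for i, word in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # skip the word itself
                counts[word][tokens[j]] += 1
    return counts

# Made-up sentence; splitting on whitespace stands in for real tokenization.
text = "the police filed charges after the battery lost its charge"
tokens = text.split()
counts = cooccurrence_counts(tokens, window=2)

print(dict(counts["charges"]))
print(dict(counts["battery"]))
```

Each word's row of counts is a vector describing the company that word keeps, which is the raw material for comparing usages.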
Treating words in a relational, non-fixed way corresponds to emerging psychological notions of how the mind deals with language and concepts in general, according to Erk. Instead of rigid definitions, concepts have “fuzzy boundaries” where the meaning, value and limits of the idea can vary considerably according to the context or conditions. Erk takes this idea of language and recreates a model of it from hundreds of thousands of documents.