Linguists, computer scientists use TACC supercomputers to improve natural language processing
It’s not hard to tell the difference between the “charge” of a battery and criminal “charges.” But for computers, distinguishing between the various meanings of a word is difficult.
For more than 50 years, linguists and computer scientists have tried to get computers to understand human language by programming semantics as software. Driven initially by efforts to translate Russian scientific texts during the Cold War (and more recently by the value of information retrieval and data analysis tools), these efforts have met with mixed success. IBM’s Jeopardy-winning Watson system and Google Translate are high profile, successful applications of language technologies, but the humorous answers and mistranslations they sometimes produce are evidence of the continuing difficulty of the problem.
Our ability to easily distinguish between multiple word meanings is rooted in a lifetime of experience. Using the context in which a word is used, an intrinsic understanding of syntax and logic, and a sense of the speaker’s intention, we intuit what another person is telling us.
“In the past, people have tried to hand-code all of this knowledge,” explained Katrin Erk, a professor of linguistics at The University of Texas at Austin focusing on lexical semantics. “I think it’s fair to say that this hasn’t been successful. There are just too many little things that humans know.”
Other efforts have tried to use dictionary meanings to train computers to better understand language, but these attempts have also faced obstacles. Dictionaries have their own sense distinctions, which are crystal clear to the dictionary-maker but murky to the dictionary reader. Moreover, no two dictionaries provide the same set of meanings — frustrating, right?
Watching annotators struggle to make sense of conflicting definitions led Erk to try a different tactic. Instead of hard-coding human logic or deciphering dictionaries, why not mine a vast body of texts (which are a reflection of human knowledge) and use the implicit connections between the words to create a weighted map of relationships — a dictionary without a dictionary?
“An intuition for me was that you could visualize the different meanings of a word as points in space,” she said. “You could think of them as sometimes far apart, like a battery charge and criminal charges, and sometimes close together, like criminal charges and accusations (“the newspaper published charges…”). The meaning of a word in a particular context is a point in this space. Then we don’t have to say how many senses a word has. Instead we say: ‘This use of the word is close to this usage in another sentence, but far away from the third use.'”
To create a model that can accurately recreate the intuitive ability to distinguish word meaning requires a lot of text and a lot of analytical horsepower.
“The lower end for this kind of a research is a text collection of 100 million words,” she explained. “If you can give me a few billion words, I’d be much happier. But how can we process all of that information? That’s where supercomputers and Hadoop come in.”
Applying Computational Horsepower
Erk initially conducted her research on desktop computers, but around 2009, she began using the parallel computing systems at the Texas Advanced Computing Center (TACC). Access to a special Hadoop-optimized subsystem on TACC’s Longhorn supercomputer allowed Erk and her collaborators to expand the scope of their research. Hadoop is a software architecture well suited to text analysis and the data mining of unstructured data that can also take advantage of large computer clusters. Computational models that take weeks to run on a desktop computer can run in hours on Longhorn. This opened up new possibilities.
“In a simple case we count how often a word occurs in close proximity to other words. If you’re doing this with one billion words, do you have a couple of days to wait to do the computation? It’s no fun,” Erk said. “With Hadoop on Longhorn, we could get the kind of data that we need to do language processing much faster. That enabled us to use larger amounts of data and develop better models.”
Treating words in a relational, non-fixed way corresponds to emerging psychological notions of how the mind deals with language and concepts in general, according to Erk. Instead of rigid definitions, concepts have “fuzzy boundaries” where the meaning, value and limits of the idea can vary considerably according to the context or conditions. Erk takes this idea of language and recreates a model of it from hundreds of thousands of documents.
The Latest Bing News on:
When Will My Computer Understand Me?
- What Happened When I Cloned My Own Voiceon May 9, 2024 at 5:00 am
It had already been used to clone President Joe Biden’s voice to create a fake robocall discouraging people from voting in the New Hampshire primary. I signed up and fed it a few hours of me speaking ...
- Godbey: Some things I cannot understandon May 7, 2024 at 1:01 pm
I’ll admit that I spend too much time thinking about things that others just ignore. However, my brain doesn’t work that way. Case in point, I stopped at the store today to restock on beef jerky and ...
- My View: A sporadic but enjoyable writing career covers many genreson May 7, 2024 at 3:00 am
Pals in my writing group assure me that I am a professional because over that 60-year period I have probably earned at least $125 for my output.
- My Husband Cheated On Me, But Here's How We Saved Our Marriageon May 6, 2024 at 4:59 pm
After a romantic anniversary dinner celebrating 14 years of marriage, I was at my computer while my husband was in the ... Now, I'm occasionally slow, but I'm not stupid. "I understand flowers for a ...
- AI: AI Chronicles Help Us To Understand Each Otheron May 6, 2024 at 6:32 am
The future of work, he contends, is that AI should not replace people, but rather, assist them. He shows an organizational chart that Ai chronicles people achieving corporate tasks and the role of the ...
- IT’S GEEK TO ME: Addressing iPhone AutoFillon May 5, 2024 at 10:00 am
The Odessa American is the leading source of local news, information, entertainment and sports for the Permian Basin.
- A science-based afterlife: Can quantum entanglement save my failing body? - opinionon May 4, 2024 at 3:13 am
Will quantum biology finally provide a fact-based explanation reconciling the supernatural with the physical world?
- "Not My JAMB Result": School's Best Arts Student Rejects UTME Score, Says Board Made Mistakeon May 4, 2024 at 12:40 am
A Nigerian man who was the head boy in his school has recounted how he surprisingly scored a low aggregate of 163 in his UTME. Social media users reacted massively.
- When My Mother Was Healthy, I Was Worthless To Heron May 2, 2024 at 6:30 am
However, between the time when she started into old age and when she became dependent on others, she was a toxic terror. She spewed malicious opinions about almost everyone, including her family, got ...
- Fact Check: Do Any Of TikTok’s Most Viral Computer Tips Actually Help?on April 29, 2024 at 12:44 pm
It's time to find out if TikTok's most viral computer hacks actually work, from apps to performance speed and Wi-Fi.
The Latest Google Headlines on:
When Will My Computer Understand Me?
[google_news title=”” keyword=” When Will My Computer Understand Me?” num_posts=”10″ blurb_length=”0″ show_thumb=”left”]
The Latest Bing News on:
Computers understand human language
- Verbal e-volution: Language technologies are still shaping global cultureon May 8, 2024 at 11:00 pm
All leaps in language processing have proven revolutionary, be it Gutenberg’s printing press or the language models of AI. And language tech has mostly favoured English, adding to its global dominance ...
- Sperm Whale Communication Is Remarkably Similar to Human Language, Research Suggestson May 7, 2024 at 12:30 pm
The latest study is a big step forward in decoding whale linguistics—and machine learning is making it possible.
- Basic building blocks of sperm whale language have been uncovered, scientists sayon May 7, 2024 at 9:14 am
Scientists studying sperm whales have identified the basic elements of their communication system, potentially paving the way for better protection of the mammals.
- Duolingo’s CEO on Language Learning, AI and the End of CAPTCHAson May 2, 2024 at 11:43 am
Breakthroughs in generative AI have created enormous opportunities for humans to learn from computers. We can use them to explain the news, understand historical concepts, fix our coding errors, and ...
- BASIC's 60th anniversary reminds us of the language that democratized programming, as AI threatens to automate codingon May 2, 2024 at 9:41 am
It's hard to overstate how revolutionary BASIC was in the early 1960s computing landscape. At that time, computers were highly specialized black boxes confined to corporate, ...
- Luis von Ahn Explains How Computers and Humans Learn From Each Otheron May 2, 2024 at 12:59 am
Breakthroughs in generative AI have created enormous opportunities for humans to learn from computers. We can use them to explain the news, understand historical concepts, fix our coding errors, and ...
- Forget the AI doom and hype, let's make computers usefulon April 25, 2024 at 12:26 am
Machine learning has its place, just not in ways that suits today's hypesters Systems Approach Full disclosure: I have a history with AI, having flirted with it in the 1980s (remember expert systems?) ...
- More than machines: Computer scientist prepares robots to improve human liveson April 23, 2024 at 5:00 pm
“To do that, we really have to understand their language and culture ... To many computers, those two answers would be identical. They contain the exact same letters in the exact same order. But a ...
- Natural Language Processingon October 5, 2023 at 9:27 am
To investigate the properties of written human language and to model the cognitive mechanisms underlying the understanding and production of ... intelligent processing of written human language by ...
- What Companies Are Fueling The Progress In Natural Language Processing? Moving This Branch Of AI Past Translators And Speech-To-Texton February 6, 2023 at 6:01 pm
Natural language processing is a computer process enabling machines to understand and respond to text or voice inputs. The goal is for the machine to respond with text or voice as a human would.
The Latest Google Headlines on:
Computers understand human language
[google_news title=”” keyword=”computers understand human language” num_posts=”10″ blurb_length=”0″ show_thumb=”left”]