Oscar Wilde once said that sarcasm is the lowest form of wit but the highest form of intelligence. Perhaps that is because of how difficult it is to use and understand. Sarcasm is notoriously difficult to convey through text, and even in person it can easily be misunderstood. The subtle changes in tone that convey sarcasm often confuse computer algorithms as well, limiting virtual assistants and content analysis tools.
Xiyuan Gao, Shekhar Nayak, and Matt Coler of the Speech Technology Lab at the University of Groningen, Campus Fryslân, have developed a multimodal algorithm for sarcasm detection that improves accuracy by examining multiple aspects of audio recordings. Gao will present the team's work on Thursday, May 16, as part of a joint meeting of the Acoustical Society of America and the Canadian Acoustical Association, taking place May 13-17 at the Shaw Centre in downtown Ottawa, Ontario, Canada.
Traditional sarcasm detection algorithms often rely on a single parameter to produce results, which is the main reason why they often fall short. Instead, Gao, Nayak, and Coler used two complementary approaches: sentiment analysis using text and emotion recognition using audio to get a more complete picture.
“We extracted acoustic parameters such as pitch, speaking rate, and energy from the voice, and then used automatic speech recognition to convert the voice into text to perform sentiment analysis,” Gao said. “Next, we assigned emojis to each speech segment, reflecting their emotional content. By incorporating these multimodal cues into machine learning algorithms, our approach draws on the combined strengths of auditory, textual, and emoji information for a comprehensive analysis.”
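The pipeline Gao describes can be illustrated with a rough Python sketch. It is not the team's implementation, and every name in it is hypothetical. The zero-crossing pitch estimate, the tiny word list, and the rule that picks an emoji from the text sentiment are all simplifying assumptions standing in for real pitch tracking, a trained sentiment model, and the researchers' emoji assignment. The sketch shows only how acoustic, textual, and emoji cues might be fused into one feature vector for a classifier.

```python
import math

# Hypothetical toy lexicon; a real system would use a trained sentiment model.
SENTIMENT_LEXICON = {"great": 1.0, "love": 1.0, "wonderful": 1.0,
                     "terrible": -1.0, "hate": -1.0, "awful": -1.0}

# Hypothetical emoji label set; illustrative only.
EMOJI_LABELS = ["🙂", "😐", "🙁"]


def synth_tone(freq_hz, duration_s, sample_rate, amplitude=0.5):
    """Synthetic sine wave standing in for a recorded speech segment."""
    n = int(duration_s * sample_rate)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate + 0.1)
            for i in range(n)]


def rms_energy(samples):
    """Root-mean-square amplitude of the waveform."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def zero_crossing_pitch(samples, sample_rate):
    """Crude pitch estimate in Hz: a periodic tone crosses zero twice per cycle."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return crossings / (2.0 * len(samples) / sample_rate)


def speaking_rate(transcript, duration_s):
    """Words per second, given a transcript (e.g. from speech recognition)."""
    return len(transcript.split()) / duration_s


def text_sentiment(transcript):
    """Mean lexicon score of matched words, in [-1, 1]; 0.0 if none match."""
    words = [w.strip(".,!?").lower() for w in transcript.split()]
    scores = [SENTIMENT_LEXICON[w] for w in words if w in SENTIMENT_LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


def emoji_for(sentiment):
    """Assumed rule: pick an emoji from the sentiment score alone."""
    if sentiment > 0.25:
        return "🙂"
    if sentiment < -0.25:
        return "🙁"
    return "😐"


def fuse_features(samples, sample_rate, transcript):
    """Concatenate acoustic, textual, and emoji cues into one feature vector."""
    duration_s = len(samples) / sample_rate
    sentiment = text_sentiment(transcript)
    emoji = emoji_for(sentiment)
    acoustic = [zero_crossing_pitch(samples, sample_rate),
                speaking_rate(transcript, duration_s),
                rms_energy(samples)]
    emoji_one_hot = [1.0 if emoji == label else 0.0 for label in EMOJI_LABELS]
    return acoustic + [sentiment] + emoji_one_hot


if __name__ == "__main__":
    segment = synth_tone(200.0, 1.0, 8000)
    print(fuse_features(segment, 8000, "Oh great, I love waiting."))
```

The resulting vector (pitch, speaking rate, energy, sentiment, and a one-hot emoji code) could then be passed to any standard classifier trained on labeled sarcastic and non-sarcastic segments.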
The team is optimistic about the algorithm's performance, but is already looking at ways to improve it further.
“There are different facial expressions and gestures that people use to emphasize sarcastic elements in their speech,” Gao said. “We need to do a better job of incorporating these into our projects. We also want to include more languages and adopt newly developing sarcasm recognition techniques.”
This approach can be used for more than just identifying dry wit. The researchers emphasize that the technology can be applied in a wide variety of fields.
“The development of sarcasm recognition technology can benefit other research areas using sentiment analysis and emotion recognition,” Gao said. “Traditionally, sentiment analysis focuses primarily on text and has been developed for applications such as online hate speech detection and customer opinion mining. Speech-based emotion recognition can be applied to AI-assisted healthcare. Sarcasm recognition technology that applies a multimodal approach offers insight for these research areas.”