Well, La-Di-Dah: Algorithm Detects Online Sarcasm

By Robert Quigley, May 17th, 2010, 2:44 pm


Well isn’t this just delightful: A team of super-clever computer scientists at The Hebrew University’s Institute of Computer Science in Jerusalem will be presenting their findings on an algorithm that detects online sarcasm at next week’s International AAAI Conference on Weblogs and Social Media (ICWSM) in Washington, D.C.

They call their algorithm SASI, short for “semi-supervised sarcasm identification algorithm.” Essentially, it takes a handful of sentences that humans have tagged as sarcastic (hence the “semi-supervised” part) and uses machine learning to guess which other sentences are sarcastic.
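To make the seed-and-guess idea concrete, here is a toy sketch in Python. It is not SASI and not the researchers' method: it simply labels a new sentence with the tag of the most similar human-tagged seed, using plain word overlap. The seed sentences, the `tokens`, `jaccard`, and `guess_label` helpers, and the similarity measure are all assumptions made up for illustration.

```python
import re

def tokens(sentence):
    """Lowercase word tokens, with '!' and '?' kept as tokens of their own."""
    return set(re.findall(r"[a-z']+|[!?]", sentence.lower()))

def jaccard(a, b):
    """Overlap between two token sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def guess_label(sentence, seeds):
    """Return the label of the most similar human-tagged seed sentence."""
    toks = tokens(sentence)
    _, best_label = max(seeds, key=lambda seed: jaccard(toks, tokens(seed[0])))
    return best_label

# Hypothetical hand-tagged seeds: (sentence, is_sarcastic).
SEEDS = [
    ("Oh great, another battery that dies in an hour!", True),
    ("Wow, what a truly wonderful waste of money!", True),
    ("The battery lasts about ten hours on a full charge.", False),
    ("It arrived on time and works as described.", False),
]

print(guess_label("Oh great, another cable that dies in a week!", SEEDS))
print(guess_label("It works and arrived on a full charge.", SEEDS))
```

A real semi-supervised system would go further and use the large pool of untagged sentences to refine its guesses; this sketch only shows the starting point, where a few human labels are stretched over unseen text.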

The algorithm has been tested out on Tweets and Amazon product reviews, and it’s done pretty darn good (no sarcasm intended!): SASI achieved a precision of 77% and recall of 83.1% “on an evaluation set containing newly discovered sarcastic sentences, where each sentence was annotated by three human readers.”
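For readers who don't speak machine learning: precision is the share of sentences flagged as sarcastic that really were sarcastic, and recall is the share of truly sarcastic sentences that got flagged. A minimal Python sketch of the arithmetic, using made-up labels rather than the study's data:

```python
def precision_recall(predicted, actual):
    """Compute (precision, recall) from parallel lists of booleans."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)       # flagged, and sarcastic
    false_pos = sum(1 for p, a in pairs if p and not a)  # flagged, but sincere
    false_neg = sum(1 for p, a in pairs if not p and a)  # sarcastic, but missed
    flagged = true_pos + false_pos
    sarcastic = true_pos + false_neg
    precision = true_pos / flagged if flagged else 0.0
    recall = true_pos / sarcastic if sarcastic else 0.0
    return precision, recall

# Hypothetical example: six sentences, what the classifier said vs. the human tags.
predicted = [True, True, True, True, False, False]
actual = [True, True, True, False, True, False]
print(precision_recall(predicted, actual))  # (0.75, 0.75)
```

In this invented example, three of the four flagged sentences were truly sarcastic (precision 0.75), and three of the four sarcastic sentences were caught (recall 0.75).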

Though there were no hard-and-fast methods for ferreting out sarcasm, the researchers did find a few moderately useful rules of thumb, including excessive capital letters and exclamation marks. These were less reliable indicators, however, than a large disparity between a review’s sentiment and the score it gave the item:

A number of sentences that were classified as sarcastic present excessive use of capital letters, i.e.: “Well you know what happened. ALMOST NOTHING HAPPENED!!!” (on a book), and “THIS ISN’T BAD CUSTOMER SERVICE IT’S ZERO CUSTOMER SERVICE”. These examples fit with the theoretical framework of sarcasm and irony (see the Related work section) as sarcasm, at its best, emerges from a subtle context, hence cues are needed to make it easier to the hearer to comprehend, especially with written text not accompanied by audio (‘…’ for pause or a wink, ‘!’ and caps for exaggeration, pretence and echoing). Surprisingly, though, the weight of these cues is limited and they fail to achieve neither high precision nor high recall.
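The surface cues mentioned in that passage are easy to measure, which is part of why they are tempting. Below is a rough Python sketch of counting them. The `surface_cues` and `looks_shouty` helpers and their thresholds are invented for illustration and are not taken from the papers; as the researchers note, cues like these carry limited weight on their own.

```python
def surface_cues(sentence):
    """Measure the share of capital letters and the number of exclamation marks."""
    letters = [c for c in sentence if c.isalpha()]
    caps = sum(1 for c in letters if c.isupper())
    return {
        "caps_ratio": caps / len(letters) if letters else 0.0,
        "exclamations": sentence.count("!"),
    }

def looks_shouty(sentence, caps_threshold=0.5, min_exclamations=3):
    """Flag a sentence when either cue crosses its (arbitrary) threshold."""
    cues = surface_cues(sentence)
    return (cues["caps_ratio"] >= caps_threshold
            or cues["exclamations"] >= min_exclamations)

# The two examples quoted from the study, plus a plain sentence for contrast.
examples = [
    "Well you know what happened. ALMOST NOTHING HAPPENED!!!",
    "THIS ISN'T BAD CUSTOMER SERVICE IT'S ZERO CUSTOMER SERVICE",
    "It arrived on time and works as described.",
]
for text in examples:
    print(looks_shouty(text), surface_cues(text))
```

A check like this would flag both quoted examples, but it would also flag every sincere all-caps rave and miss deadpan sarcasm entirely, which fits the finding that such cues reach neither high precision nor high recall.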

According to the study, the three most sarcastically reviewed items on Amazon are Shure and Sony noise-canceling earphones, Dan Brown’s The Da Vinci Code, and Amazon’s own Kindle e-reader. We’re guessing they didn’t come across the Amazon reviews for a gallon of Tuscan Whole Milk or a certain popular lupine t-shirt.

The researchers hypothesize that “one of the main reasons for using sarcasm in online communities and social networks is ‘enlightening’ the mass that are ‘treading the wrong path.'”

Check out their papers on sarcasm in Amazon reviews and broader sarcasm detection on Twitter and Amazon. (Warning: both PDFs.)

(h/t Slashdot)


Author

Robert Quigley
