Guess What, If You Build an AI Project That Steals Data From Thousands of Books, Authors Are Going To Be Mad!

This should not be a surprise!

Just when you think the fight against AI is hopeless, hundreds of authors come together to take down a massive AI-powered platform. Benji Smith, the creator of the word-processing program Shaxpir, has come under fire for his latest project: Prosecraft. While Shaxpir itself is a relatively harmless tool, Prosecraft is a different beast entirely.

The idea behind it was quite simple. Are you a writer, and have you ever hopelessly compared your writing to that of the authors you admire? If so, Prosecraft would give you a rundown of your favorite novels’ word counts, the “vividness” of the language used, how much passive voice an author uses, the total number of adverbs in the text, and a very basic overview of the emotional story arc, based on the number of words with positive or negative connotations in any given chapter.
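To give a sense of how basic metrics like these can be, here is a minimal Python sketch of three of them: word count, adverb count, and a per-chapter sentiment tally. It is not Prosecraft's actual code, which is not described in any detail here. The function names, the tiny word lists, and the "ends in -ly" adverb rule are all assumptions made up for illustration, and "vividness" and passive-voice detection are left out entirely.

```python
import re

# Hypothetical, tiny word lists for illustration only; not Prosecraft's data.
POSITIVE = {"joy", "love", "bright", "hope"}
NEGATIVE = {"grief", "fear", "dark", "loss"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def chapter_metrics(text):
    """Return word count, a crude adverb count, and a sentiment score."""
    words = tokenize(text)
    # Assumption: treat any word longer than three letters ending in "ly"
    # as an adverb. This miscounts words like "family" or "fast".
    adverbs = sum(1 for w in words if w.endswith("ly") and len(w) > 3)
    positive = sum(1 for w in words if w in POSITIVE)
    negative = sum(1 for w in words if w in NEGATIVE)
    return {
        "words": len(words),
        "adverbs": adverbs,
        "sentiment": positive - negative,
    }


def story_arc(chapters):
    """Sentiment score per chapter, in order: a very rough 'emotional arc'."""
    return [chapter_metrics(c)["sentiment"] for c in chapters]
```

Fed two one-sentence "chapters," "She walked slowly into the bright room, full of hope." and "Grief and fear filled the dark house.", `story_arc` returns `[2, -3]`. Note that even this toy version needs the full text of a book to run, which is exactly where the trouble starts.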

Smith claimed that these metrics would be incredibly useful for authors, and sure, to a certain extent, writers become better at writing by reading, and analyzing writing styles is part of that journey. And yet.

The biggest problem with Prosecraft is that it has gathered all this (debatably useful) information by scraping books off the internet without express permission to do so. By Smith’s own admission, over 25,000 novels by thousands of different authors were used to build Prosecraft’s “linguistic literary database.” Naturally, this sparked outrage on Twitter/“X,” as authors discovered one or more of their novels were being used to run this tool.

https://twitter.com/AllyMalinenko/status/1688307202656755712?s=20
https://twitter.com/DianaUrban/status/1687923109033021442?s=20

The backlash became severe enough that Smith was, reluctantly, willing to give authors an out. He asked those writers who wished for their work to be taken off the site to email him. But that was an egregious mistake. Why should authors have to request their work be taken down when he never asked them for permission to use their art this way in the first place?

And it wasn’t just mid-list and debut authors on the site. The works of Stephen King, Nora Roberts, Neil Gaiman, Angie Thomas, Terry Pratchett, John le Carré, and more were all part of this database, too.

https://twitter.com/Marika_Writes_/status/1688619569173118977?s=20

Predictably and understandably, the backlash grew worse, with The Authors Guild even getting involved, until finally Smith relented and published a statement saying he would take down the Prosecraft website in its entirety. But the statement itself reads as fairly self-congratulatory.

In his statement, Smith once again tries to explain how useful Prosecraft could be for authors, why it was important to study these metrics, and that when he first launched the site in 2017, it was—he claims—met with near-universal praise. He spends about 1,000 words touting his own product before even getting to anything resembling an apology.

Even then, at the end of his lengthy statement, instead of genuinely acknowledging his mistakes, Smith blames the current Prosecraft backlash on the rise of AI. And in a painfully roundabout way, he’s right. Authors, artists, and actors are currently fighting to retain their livelihoods, and generative AI programs are at the heart of that battle. But Prosecraft is undeniably part of that problem, even if Smith is unwilling to admit it.

Prosecraft may not technically have been a generative AI program, the kind that creates derivative “new” works by stealing from text, images, and other artistic creations that can be found on the internet. But scraping the work of artists and authors off the internet without permission is one of the biggest issues with all AI programs, whether generative or not. Smith never offered compensation, and while he claims Prosecraft wasn’t created to generate income and never did, writers who use Shaxpir could subscribe to Shaxpir 4: Pro for $7.99 a month, a subscription which then included access to Prosecraft’s analytical tools.

Authors have the right to decide how their work is used. If any writers permitted Smith to upload their work into Prosecraft’s database, more power to them! That was their choice. But the rise of AI has made it much too easy for artists to be unknowingly exploited this way, and it needs to be stopped. Prosecraft being taken down is a step in the right direction, but the battle has only just begun.

(featured image: Jessica Ruscello on Unsplash)

Author
El Kuiper
El (she/her) is The Mary Sue's U.K. and weekend editor and has been working as a freelance entertainment journalist for over two years, ever since she completed her Ph.D. in Creative Writing. El's primary focus is television and movie coverage for The Mary Sue, including British TV (she's seen every episode of Midsomer Murders ever made) and franchises like Marvel and Pokémon. As much as she enjoys analyzing other people's stories, her biggest dream is to one day publish an original fantasy novel of her own.
