OpenAI released Sora, its long-awaited video generator, earlier this week. In the days since, users have noticed that Sora’s gaming-related “creations” bear a striking resemblance to real video game IPs, including Super Mario Bros, Call of Duty, Teenage Mutant Ninja Turtles (TMNT), and more. Some Sora content even closely mimics well-known Twitch streamers. The similarities, which have already caught the attention of several copyright attorneys, could add yet another wrinkle to the debate over generative AI, which some argue is just a glorified plagiarism machine.
When OpenAI teased Sora back in February, one of the company’s interns showed that the video-generating tool could essentially recreate Minecraft, albeit with a more natural-looking sky. The company had reportedly trained Sora on Minecraft videos, so this wasn’t too much of a surprise. But Kyle Wiggers, a reporter for TechCrunch, started to wonder if other games had been included in Sora’s training dataset. After OpenAI opened Sora to the public, Wiggers asked it to generate clones of popular video games—something it achieved with startling efficacy.
In one sample, Sora creates what looks undeniably like Super Mario Bros, but with punchier graphics. In another, it appears to combine Call of Duty and Counter-Strike in a mock playthrough of a first-person shooter. Yet another clip shows four bright green bipedal turtles, complete with bandanas, fighting in front of a twilight cityscape. (This is perhaps the jankiest of the bunch; the clip’s distorted text and “Presss Start” title page are classic AI giveaways.)
Wiggers got Sora to create this Super Mario Bros rip-off.
Credit: OpenAI/TechCrunch
Sora isn’t just for generating two-dimensional video assets, though; it can also spit out videos of real people and animals. When Wiggers asked Sora to create a fake Twitch stream involving Auronplay (a popular streamer), it did so fairly well, with the Twitch UI somehow the least convincing part of the end product. Sora’s version of Auronplay, which includes the streamer’s forearm tattoo, looks enough like the real guy to trick anyone who isn’t deeply familiar with his content. The generator also pulled off a decent copy of Pokimane, another Twitch streamer.
OpenAI executives and spokespeople have largely evaded questions about Sora’s training data. The company’s website claims Sora was trained on “a mix of publicly available data, proprietary data accessed through partnerships, and custom datasets developed in-house,” but it seems highly unlikely that OpenAI partnered with Nintendo, Activision, or Paramount Global to license any of the games above. That means OpenAI’s legal team could soon be very busy.
“Training a generative AI model generally involves copying the training data,” Joshua Weigensberg, an IP attorney at the law firm Pryor Cashman, told Wiggers. “If that data is video playthroughs of games, it’s overwhelmingly likely that copyrighted materials are being included in the training set.”
Choi Byung-ho, a professor at Korea University’s AI Research Institute, seems fairly confident that OpenAI trained its new tool on actual game playthroughs (some of which very well could have come from Twitch). “Sora likely learned from limited content to provide high-quality videos,” he told Chosunbiz, a Korean news outlet. “In this case, game content becomes a useful learning resource.”
Choi went on to explain that although OpenAI is likely “well aware of the copyright infringement controversies, they prefer to negotiate after lawsuits arise rather than consulting with copyright-holding companies beforehand.”
Time will tell whether OpenAI will need to defend itself in court for its uncanny lookalikes, and if it does, whether it will come out on top. Until then, one thing is for certain: We wouldn’t want to get on Nintendo’s bad side.
source: https://www.extremetech.com/gaming/openai-appears-to-have-trained-sora-on-game-content


