So there’s this YouTuber I follow who makes these videos about the Dark Souls games — no talking, generally, just insane gameplay set to captions and fun video game music. The latest video was titled something like “Man vs. Wild in Elden Ring,” which, sure, I’ll click on that. The first thing I noticed: the video had voice acting. The second was that the voice was Bear Grylls’s.
Underneath the video, they’d linked what they used to make the video — ElevenLabs for the voice, and GPT-4 for the script. Over on the subreddit they linked an image of the entire ChatGPT thread they’d used to create the video. Two comments from the YouTuber stood out to me.
First:
And then:
I bring all this up because it feels like the next frontier of ~content creation~[1]. (Though I confess I didn’t actually watch the whole video. Too unnerving.) To me, the Man vs. Wild video is a turning point in the whole AI discussion. Not just because at the time of this writing, 587,000 people have watched it, but also because the consumer-grade tools are getting good enough that making a YouTube video like this, with a generated voice and script, isn’t an insane amount of work. What’s a little weirder to me, though, is how much some of the commenters liked it[2].
I don’t think AI is quite the existential threat to writers a lot of Twitter Blue subscribers think it is, but I do think that the forces of capital are going to use the technology as a cudgel to further marginalize people who make things. AI is one of the provisions that the Writers Guild of America is currently striking over; the writers want regulations on its use, and the producers don’t. Because it’s cheaper to hire a writer to fix a generated script than it is to hire a whole room of writers to create a good one.
Though, to go back to the video again: while I was initially surprised people enjoyed a mostly AI-generated video, it seems like what people were actually responding to — what they actually liked — was the human element, the parts that the AI didn’t generate. The jokes, easter eggs, and editing, in other words.
Anyway. It’s fascinating in the same way that watching a disaster unfold is; you don’t know how bad it’s going to get, but you can’t look away. I think we’re going to see a lot more of this kind of thing — this kind of collaboration between a person and an AI, where the AI is helping execute a fixed creative vision.
I didn’t like the video because it felt derivative almost immediately: part of what’s interesting to me about dropping a Man vs. Wild conceit into Elden Ring is just seeing how someone would execute that format in a video game. Like, how do you replicate that format (a British guy in the wild trying to survive, usually by drinking his own piss) in a video game that’s about as far as you can get from reality TV? It’s much less interesting to me to use the voice of the guy himself to simulate the show than it is to come up with literally any other solution[3].
I guess what I mean is: AI is very good at simulating things. But I’m not sure it’s great at solving problems, because that requires ingenuity. I want people to make the cultural products I consume because their labor — their ingenuity — is what gives them an individuality and a point of view. Human labor is what makes something into art. I guess I’m uncomfortable with the idea of AI replacing any of that.
Love,
Bijan
[1] Which is the reason I didn’t name the creator: I think this use of AI is gonna be something we see a lot. In any case their name is ymfah, and you can find their channel here.
[2] For the record, a bunch didn’t.
[3] And there have been other solutions, because there are 2 other Man vs. Wild ymfah videos, lol. Which he voiced!