Content Generation
There is a clear progression in the likely impact of Gen AI across media types: text generation is the most advanced (ChatGPT is a high-profile example), followed by audio and then video. In part this reflects the complexity of the tasks involved, but also the ability to generate content that consumers will accept. Text created by Gen AI is arguably the hardest to distinguish from text written by humans, whilst Gen AI video can be the easiest to spot. Gen AI is already being used to generate captions, Instagram posts and podcast text descriptions. It can also generate metadata that is appended to content and advertising, and add translations, transcriptions and subtitles.
In some regards, this echoes the sequential impact of the internet on different types of media. The web was initially text-based due to dial-up limitations, and the advent of MP3s and streaming then hit the music industry long before the video industry. However, it was stressed that the time lag here is unlikely to be very protracted (“months, rather than years”), with Gen AI already being used in children’s television. One example cited was a whole series of Paw Patrol, said to have been created by AI from plots and scripts through to animation. Advertising is also already being created by AI.
Matt Deegan of Folder Media shared a specific application of AI already being used by some community radio stations in the UK. Here an AI application is used to scan news feeds for stories relevant to that community and collate them into scripted news breaks. These are then read by a voice that has been ‘cloned’ from broadcast talent.
Matt Deegan, Folder Media
In this specific case, it can be debated whether AI is ‘replacing’ human activity: smaller media owners might argue that they could not afford their own news teams and readers, so the technology is arguably enabling rather than substituting. This echoes a point discussed in our group that the greatest impact of Gen AI may be less on premium, high-quality content and more as a tool to enhance user-generated content such as YouTube videos and podcasts. Voice cloning can also be used in editing to insert new words without bringing voice talent back into the studio. Clearly, talent contracts will need to address, and either authorize or limit, the use of voices and images for cloning. This was one of the issues that provoked the 2023 actors’ strike in the US.
There is also the debate about whether Gen AI should be used to generate content and, if so, how far that should go. Media owners are at heart creative organizations, and there is existential angst about how far Gen AI and machine learning could and should take them. Will its role be limited to children’s and family entertainment? Is that being condescending to children and families?
Margo Swadley, BBC Studios
Perhaps some of the skepticism about whether Gen AI could ever create (as opposed to augment) premium video content lies in how Gen AI works: it is trained by absorbing and learning from content that already exists, so it can reflect its input data, but can it take a truly ‘creative step’ to move beyond this?