Life in a Glass House
Working with new technologies is one of the most thrilling parts of our roles at dandelion + burdock, and we jump at the chance to work on projects that let us learn new techniques and explore new methods of communication.
We were recently given the opportunity to collaborate with the fantastic team at Kaleida on a film that required us to create digital avatars of real influencers. The avatars would then interact with their real-life counterparts through a combination of Unreal Engine, Deep Fake technology, AI Voice generation, a holobox, and ChatGPT.
Our brief was to take photogrammetry scans of three people through the Metahuman pipeline to create animation-ready characters, which would then be given facial animation using the Unreal Live Link Face app. We were also tasked with creating a set of clips that would be used to train the Deep Fake AI to ultimately replace the digital avatars' faces for maximum realism. The result would be a digital double that could interact with its real-world doppelganger in real time, with AI-generated responses occurring on the fly.
This highly ambitious project blended technology and creativity in a completely new way. The resulting film explored our relationship with the double lives we lead as members of an ever more online community. We all have digital versions of ourselves, cultivated through hours and years of life online in games, websites and social media.
What does it mean to come face-to-face with your digital self?
The Challenge
The challenges of creating realistic digital humans are not new. Hollywood has been refining this process for decades, but it is a process historically reserved for blockbuster films and video games. Furthermore, creating a compelling character that can convey the full range of emotions expressed via subtle facial movements has typically been the preoccupation of large motion capture studios.
One of the key requirements for the project was to create a collection of facial motion capture sequences inside Unreal Engine that would be used to train a Deep Fake AI, allowing the digital doubles to hold real-time conversations with their real-world counterparts. If the training data we created wasn't detailed enough, the digital doubles wouldn't be able to emote and communicate in a way that felt compelling for the audience.
Luckily, Epic Games created Metahuman, a series of production-ready tools and processes that allow us to create digital avatars with highly expressive facial controls using nothing more than a PC and an iPhone.
The use of real-time technology in conjunction with AI responses generated on the fly, allowing people to interact with highly realistic digital avatars, is a world first, and it points towards a future where we can communicate with AI in an enticingly tangible way.
But to do this we would first need our Metahumans.
Human to Metahuman
There are multiple routes to creating a Metahuman. You can use the cloud-based Metahuman Creator app, which functions much like a character creator typically found in many modern game engines. You can start from scratch using footage captured with any model of iPhone that has a depth sensor and the Live Link app. But the most accurate way to recreate your likeness is to use a photogrammetry scan, and luckily for us, we kickstarted our process with extremely detailed scans of our performers captured by Clear Angle Studios.
The photogrammetry scan provided us with a highly accurate base mesh that can be imported directly into Metahuman, forming the genesis of the Metahuman Identity. From here, Epic Games provides a robust and easy-to-use toolkit that allows us to guide the Metahuman system into understanding the key facial features on the mesh.
Here is where the magic begins. Once you have helped Metahuman understand where important features such as the eyes, nose and mouth are located, it will create a new mesh that conforms to the profile of the high-resolution scan, but with clean, animation-ready topology, directly within the Unreal environment.
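For anyone curious about automating the housekeeping around this step, the scan mesh can be brought into the project with Unreal's Python editor scripting before the identity work begins. Below is a minimal sketch, assuming the Python Editor Script Plugin is enabled; the file and content paths are hypothetical placeholders, and the tracking and conform steps themselves still happen through the Metahuman Identity asset in the editor.

```python
# Minimal sketch: import a photogrammetry head scan into the project so it
# can be used as the starting mesh for a Metahuman Identity.
# Assumes the Python Editor Script Plugin is enabled; paths are placeholders.
import unreal

def import_scan_mesh(fbx_path="D:/scans/performer_a_head.fbx",
                     destination="/Game/Scans/PerformerA"):
    task = unreal.AssetImportTask()
    task.filename = fbx_path           # exported scan mesh from the photogrammetry package
    task.destination_path = destination
    task.automated = True              # suppress the interactive import dialog
    task.save = True
    unreal.AssetToolsHelpers.get_asset_tools().import_asset_tasks([task])

import_scan_mesh()
```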
While this happens, a version of your new Metahuman is also added to the cloud-based Metahuman Creator app, where you can start applying the finer details to your model, such as skin texture, eye details, makeup, hair and grooms, and more.
The power (and to some degree the limitation) of this process comes from the fact that Metahuman essentially replaces your model with one generated using its own template mesh. The advantage is that you are given access to a completely animation-ready human character with a full body and facial rig ready to go. This even integrates perfectly with the Unreal Engine Control Rig environment, allowing you to animate your characters in real time in the engine.
The drawback, however, is that by default you are limited to the customisation options provided by the Metahuman Creator tools and assets. There is a range of options for each element of the character's identity, including skin details (freckles etc.), eye types, and facial and head hair, but the ways these can be adjusted are often limited.
We encountered a number of areas, hair options for example, where we needed more specific control, which meant resorting to external tools to create our own groom assets to apply to our characters. This was not a problem, but Metahuman feels most powerful when you work within its parameters, and an even more detailed customisation toolkit would potentially remove any need to leave the pipeline.
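As an illustration, here is a minimal sketch of bringing an externally authored groom into the project, assuming the groom was exported as an Alembic (.abc) file and the Groom plugins are enabled. The paths are hypothetical placeholders, and binding the groom to the character's head mesh is still done in the editor via a Groom Binding asset.

```python
# Minimal sketch: import a custom groom authored in an external tool so it
# can replace the stock Metahuman hair. Assumes the Groom plugins and the
# Python Editor Script Plugin are enabled; paths are placeholders.
import unreal

def import_custom_groom(abc_path="D:/grooms/performer_a_hair.abc",
                        destination="/Game/Grooms/PerformerA"):
    task = unreal.AssetImportTask()
    task.filename = abc_path
    task.destination_path = destination
    # Leaving 'automated' off shows the groom import dialog, so import units
    # and attribute mappings can be checked by hand before committing.
    task.automated = False
    task.save = True
    unreal.AssetToolsHelpers.get_asset_tools().import_asset_tasks([task])

import_custom_groom()
```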
That being said, it was incredible to discover this process, and the ease with which we were able to recreate and digitise the performers with fully rigged facial functionality is truly game-changing.
Metahuman was released a number of years ago, but until now I don't feel we had really grasped what it could mean for small studios: the ability to produce truly life-like digital doubles within Unreal Engine.
Creating a Metahuman Identity was just the first part of the process; the next challenge would be to give our characters movement.
Digital Puppetry
While AI would ultimately be used to replace the faces of our characters, the expressions would still have to be driven by the underlying animation. The more detailed our facial motion capture could be, the better the end result would be.
Again, Epic Games has a solution on hand to help. Metahuman can now integrate with newer iPhone models to create a character by simply filming yourself from a number of angles and making a few key expressions. The key advantage of this process was a Metahuman trained specifically on your face, which allowed more detailed facial animation to be captured via the iPhone. These captures ranged from overly exaggerated expressions to simply reading long passages of text.
Life beyond the glass house
The motion clips could then be used to drive any other Metahuman character, allowing us to act as facial puppeteers for the digital avatars that would be used to create the AI training content.
The ultimate aim was to create clips with enough range of movement to give the AI face replacement technology everything it needed to create the final deep fake.
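One practical way to sanity-check that range, assuming each recorded take comes with the CSV of blendshape curves that Live Link Face saves alongside its video, is to scan the takes and report how much of each channel's 0 to 1 range the clips actually cover. The sketch below uses only the Python standard library; the folder layout and exact CSV columns are assumptions.

```python
# Minimal sketch: aggregate the min/max of every numeric column across a set
# of Live Link Face take CSVs to spot expression channels with little range.
# Assumes a timecode column followed by one column per ARKit blendshape; the
# folder layout and 0.5 threshold are arbitrary placeholders.
import csv
import glob

def blendshape_coverage(take_glob="takes/**/*.csv"):
    lows, highs = {}, {}
    for path in glob.glob(take_glob, recursive=True):
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                for name, value in row.items():
                    try:
                        v = float(value)
                    except (TypeError, ValueError):
                        continue  # skip timecode / non-numeric columns
                    lows[name] = min(v, lows.get(name, v))
                    highs[name] = max(v, highs.get(name, v))
    return {name: (lows[name], highs[name]) for name in lows}

if __name__ == "__main__":
    for name, (lo, hi) in sorted(blendshape_coverage().items()):
        flag = "  <-- low range" if hi - lo < 0.5 else ""
        print(f"{name:28s} {lo:5.2f} .. {hi:5.2f}{flag}")
```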
We were working to tight timelines, so refining the performances with additional hand animation wasn't an option; we needed to rely on what could be captured directly into Unreal Engine.
The ease and speed of capture, combined with the level of expressive detail we were able to achieve with the Metahuman Live Link process, easily surpassed our expectations. Other pipelines we have used tended to result in stilted animations lacking the subtle movements that are crucial to a truly compelling facial performance, but here we were able to create highly expressive animations in very little time with no additional tweaking.
The ability to bring life to digital characters with such accessible tools makes me excited to see where the industry will go next.
For more details about the final project check out our project entry here: Lenovo: Meet your digital self