Animation is a vital ingredient in building a realistic, immersive, and modern videogame world. As computing power and graphical quality increase, so too does the complexity involved in bringing characters, animals, objects, and effects to life with animation. Teams at Ubisoft La Forge and the Ubisoft China AI & Data Lab are always looking to the future for ways to make games more realistic, more immersive, and simpler to build. Recently, these teams presented two projects of interest internally at Ubisoft, with one also featured at the Symposium on Computer Animation.
ZooBuilder is a project from the Ubisoft China AI & Data Lab that uses deep learning AI to generate 3D information from 2D videos. This allows animations to be built from reference materials, meaning animators need to spend much less time building them from scratch, and do not need to rely on motion capture or the painstaking work of reproducing movements by sight alone. La Torch is a tool designed to help developers create more realistic smoke and fire: effects that react to physics and movement and incorporate flow dynamics.
Each year, the Symposium on Computer Animation presents the latest developments and breakthroughs in digital animation. This year’s event – SCA 2020 – featured talks from a host of speakers from professional and academic backgrounds. The showcase segment, organized by Daniel Holden (research & development scientist, Ubisoft Montreal), included Ubisoft’s own Shahin Rabbani (physics programmer, Ubisoft Montreal) and Abassin Sourou Fangbemi (associate data scientist, Ubisoft Chengdu), who gave some insights into the tools they have developed and what they could mean for the future of animation. We sat down with Rabbani, Fangbemi, and Holden to talk about their roles at Ubisoft and their work on ZooBuilder and La Torch.
What are your current roles and backgrounds?
Daniel Holden: I work as a research scientist at Ubisoft La Forge, working mainly on machine learning and animation. My job is essentially to lead the research and development of new animation technology, with the aim of helping Ubisoft create better animation systems more easily. More generally, I’m interested in how new tech can be used to build realistic experiences and expand the scope and depth of the worlds we build, without requiring an endless amount of work. Before joining Ubisoft La Forge, I studied for my Ph.D. at The University of Edinburgh, and before that I worked as a technical artist and graphics programmer for a couple of small game studios and indie developers.
Shahin Rabbani: I am on the research and development team as a physics programmer at La Forge. My job is to locate and integrate the latest tech and tools in the fields of physics and animation, and adapt it all to complement and enhance the workflows of various teams within Ubisoft. We tend to place an emphasis on novelty in our work, and also publish and share our innovations with the academic community. As for my background, I completed my Master’s in electrical engineering, and I have a Ph.D. in computer science. I have 10 years of experience working on robotics and character animation, but since 2018, I have been working on real-time fluid simulation, which is the topic of my SCA presentation; and La Torch, the tool we are working on.
Abassin Sourou Fangbemi: I have a background in computer vision, which is the idea of teaching an AI to interpret visual data such as images or videos. This is also what my Ph.D. is focused on: action recognition in videos. Currently, I am a data scientist with the Ubisoft China AI & Data Lab, and I am forging a similar path with my work here. ZooBuilder is a small part of the work I do, which focuses on 2D and 3D pose estimation by an AI, but I also have various projects in other aspects of AI and machine learning.
What roles do La Forge and Ubisoft China AI & Data Lab play within Ubisoft?
ASF: The purpose of the China Data Lab is to empower Ubisoft teams with big data, tools, and machine learning innovations to help them make better games in a more efficient way. The scope of this includes game production and development, but also business operations and systems. In other words, we support game production teams by developing AI-based solutions that can speed up their workflow, automate some of their tasks, or help them overcome challenges in their creation process. We also bring the power of AI to business teams, such as customer service and community management.
SR: La Forge aims to bridge the gap between academia and industry, which are often disconnected worlds. La Forge works closely with professors and students from different academic institutions to push research that will define the tech of the future, always staying ahead of the curve. At La Forge, we’re encouraged to work on a specific type of project that both solves real-world problems in production, and makes for high-quality research papers.
What are the challenges animators face that can be helped by tools like the ones you have built?
ASF: Motion capture with animals can be quite difficult for a few reasons. Animals are not always easy to work with, and it isn’t possible to get certain creatures into a capture studio due to safety concerns, both for the animals and for people. It may be possible to bring pets or domesticated animals into a motion-capture room if they are well trained, but our games include a wide range of species such as mountain lions, elephants, deer, wolverines, and more. Traditionally, our animators manually craft keyframe animations frame-by-frame, which is extremely time-consuming. For a single animal with complex animations and sophisticated behaviors, such as interactions with players or attacks in the wild, we estimate it can take up to 500 man-days. ZooBuilder’s purpose is not to remove the human artistic element from the process, but to supplement it. Animators can spend more time refining and perfecting movements instead of focusing on building each one from the ground up.
SR: Fluid simulation, especially in 3D, is notoriously hard to run in real time. The underlying equations of flow dynamics demand many steps to calculate the interactions between the large number of particles involved in flow. For even the crudest form of smoke or fire, you might need a few milliseconds to compute the next frame, which is typically much too high a cost for the limited computational budget in game production. The current common workaround is to “bake,” or pre-make, animation clips in physics-simulator software and place those in the scene, which is often a tedious and very iterative task. Moreover, the most interesting part of the physics is completely lost, and the animation does not react in real time. It also means that changes to a scene have to be accounted for manually, whereas with simulated physics, the fire and smoke react to those changes automatically.
What is the current process for an animator to go from a still model to a moving, lifelike creature? How long does that usually take?
ASF: The process usually starts with collecting images and video references of the animal being worked on. Animators analyze the poses and motion, then use the 3D model and a “rig,” a skeleton that defines where the animal’s joints are, to create the actual animation. During this process, they go back and forth between the reference materials and the animation they are creating in tools such as Maya, 3DS Max, MotionBuilder, or Blender. It is possible to interpolate poses between consecutive frames to speed up the process, but they still have to do a lot of manual and time-consuming work. The time varies depending on the complexity and amount of animations needed, but as I mentioned, just a single animal can take 500 man-days to fully produce.
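The pose interpolation mentioned above can be illustrated with a minimal sketch. This is not Ubisoft's tooling, just a hypothetical example of blending two keyframe poses (stored as per-joint positions) to generate the in-between frames automatically:

```python
import numpy as np

def interpolate_pose(pose_a, pose_b, t):
    """Linearly blend two keyframe poses.

    pose_a, pose_b: (num_joints, 3) arrays of joint positions.
    t: blend factor in [0, 1] (0 = pose_a, 1 = pose_b).
    """
    return (1.0 - t) * pose_a + t * pose_b

# Two hypothetical keyframes for a three-joint chain, ten frames apart.
key_0 = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 2.0, 0.0]])
key_10 = np.array([[1.0, 0.0, 0.0], [1.0, 1.5, 0.0], [1.0, 2.5, 0.0]])

# Fill in the nine in-between frames automatically.
inbetweens = [interpolate_pose(key_0, key_10, f / 10.0) for f in range(1, 10)]
```

Real animation software interpolates joint rotations (typically with quaternion slerp) rather than raw positions, but the principle is the same: the animator sets the keyframes, and the software fills the gaps.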
How do you teach a computer this same process with only 2D videos as a reference?
ASF: Deep learning algorithms learn features and movements from long hours of training on massive datasets. In our case, our pipeline involves three neural networks – an object-detection network locates the animal on each frame of the video, then a second network generates the 2D coordinates of the animal skeleton. Finally, a third network converts the 2D coordinates of the skeleton to 3D.
We train the second network with data generated from existing keyframe animations. This data includes images of the animal and the 2D coordinates of its skeleton. During the training process, the neural network learns features from the images and tries to guess the coordinates of the skeleton. By comparing its guesses against the ground-truth coordinates in the training data, the network learns the characteristics of the images and “recognizes” the pose of the cougar. We repeat this process many times, and then, to convert the 2D coordinates into 3D, we train the third network with a similar approach. Currently, ZooBuilder only supports the cougar example from my SCA presentation, but our objective in the long run is to support more animals, especially quadrupeds. The same technology can be applied to different species of animals with different skeletons, such as birds or fish, as long as we have existing keyframe animations to generate the synthetic training data.
What about La Torch? How is it different from the more traditional process of creating fire and smoke effects?
SR: As opposed to the baking technique I mentioned, Torch 2.5D runs directly inside games in real time. This means it can interact with objects and respond to 3D forces like wind and motion. Our system has two elements that pave the way for practical application in console games: the first is a novel technique for speeding up the computation and equation-solving, achieving up to four times the speed. The second element creates a 3D illusion of the fire and smoke with a 2D simulation under the hood, which keeps computational costs significantly lower while taking advantage of several rendering techniques that provide a realistic 3D appearance.
The idea is to maximize simulation speed while ensuring a minimum visual quality that is consistent with the visuals of the rest of the scene. Ideally, we would like to completely cut the iterative process of baking the simulation in external software and do it all on the fly, directly in the game. The artist can tune or script the physics parameters to control the effect, and every instance of fire can be controlled independently from the others. Torch also provides the baking option for cases where it’s needed, saving computation budget and giving options based on scene requirements.
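To make the real-time trade-off concrete, here is a minimal sketch of one semi-Lagrangian advection step, a standard building block of grid-based smoke solvers. This is an illustration of the general technique, not Torch's implementation, and it uses nearest-neighbour sampling for brevity where production solvers use bilinear interpolation:

```python
import numpy as np

def advect(field, vel_x, vel_y, dt):
    """One semi-Lagrangian advection step on a 2D grid.

    For each cell, trace backwards along the velocity field and sample
    the field value at the source location.
    """
    h, w = field.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Backtrace: where did the material in this cell come from?
    src_x = np.clip(np.round(xs - dt * vel_x), 0, w - 1).astype(int)
    src_y = np.clip(np.round(ys - dt * vel_y), 0, h - 1).astype(int)
    return field[src_y, src_x]

# A blob of smoke density on a small grid, carried by a uniform wind.
density = np.zeros((8, 8))
density[4, 2] = 1.0
wind_x = np.full((8, 8), 1.0)  # one cell per step to the right
wind_y = np.zeros((8, 8))
density = advect(density, wind_x, wind_y, dt=1.0)  # blob moves to column 3
```

Even this toy step touches every grid cell every frame; a full solver adds pressure projection, diffusion, and buoyancy on top, which is why running it in 3D at game frame rates is so costly and why a 2D simulation dressed up to look 3D is such an attractive compromise.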
Is it usual for fire in games to feature physics, and be so reactive to things like movement or flow dynamics?
SR: Fire and smoke are great examples of an immersive gameplay experience, as the physics is always rich in detail and unpredictable, but in a predictable way. Part of the realism of smoke and fire depends on how they react to their surrounding environment. As long as we use baked animation clips of smoke and fire instead of actually computing their state in real time, there is little we can do to simulate the reactive qualities that make them feel so real: colliding with objects, or changing shape and direction with the wind. So the answer is no, it is not usual at all, but it is highly desired as a next step towards building more lifelike worlds.
What is the goal of events like SCA?
DH: SCA is one of the top academic conferences in animation, simulation, and computer graphics, with researchers from all around the world presenting their latest work in these fields. The role of the showcases track this year is to try and give people working in the field, but not necessarily in academia, an easy way to share their latest ideas and results. Too often, some of the most interesting tips, tricks, and ideas get overlooked because they may not have a strong academic, theoretical component to them. The showcases track is a way to share and preserve this knowledge, and to let people from any background present their latest work to the community in a simple and easy way.
Why are events like SCA important? What does sharing these breakthroughs do for the community?
DH: For those of us working on the tech side of games, I think conferences like SCA are always a reminder of why we are so passionate about this field in the first place. Sometimes it can be easy to forget how essential it is to experiment and try new things, and often, seeing new research can inspire us both to push ourselves and to be proud of what we have already created.
ASF: Though the work we are doing with ZooBuilder is quite exciting and promising, we still have a long way to go. Being able to attend events such as SCA gives us the chance not only to present our work, but also to connect with other experts from the academic world and other companies such as Disney, Pixar or Nvidia. It is a great opportunity for us to receive constructive feedback from the professional community which we can use to improve our work. Speaking of sharing, we are always eager to learn from our peers and very open to collaborations in this area.
SR: Sharing is a great way to circulate information that can benefit everyone. The academic community will benefit by being exposed to new ideas that will hopefully inspire follow-up works in the field, and we benefit by receiving constructive feedback to improve our work, as well as creating enthusiasm in the domain so that people – experts or interns – will come and work with us to build the next generation of tools.
What does the animation of the future look like with access to tools like these?
DH: In the past, I think animation has often been used simply as an indicator to show what is happening on the gameplay side. Historically, it has also been a limitation for game design, as characters are only capable of performing actions they already have animations for, and only in ways the animation system supports. With recent research, it is starting to become possible to build very realistic, flexible, scalable animation systems from scratch with relative ease, and I hope this is going to eventually lead to much richer, more immersive, and unique game experiences.
ASF: Artificial Intelligence cannot fully replace animators, no matter what advances and developments we make. There is a strong creative and artistic intent behind animations, and only humans have this capacity. What AI can do, though, is assist in the more tedious work by generating raw animations much faster. It gives animators more time to focus on high-value tasks, refining and adding more aesthetics to their work. I see the animation of the future as a cake, with the ingredients and base work provided by AI – but the recipe, its execution, and the cherry on the top would all be the work of the chef, the animator.
SR: We have seen great progress in key domains like realistic rendering and seamless motion transitions in character animation. However, physics in general, despite being a huge contributing factor to the richness of the game experience, has been slow to catch up. I believe we are getting closer to real-time simulations that are a good fit for our production computational budget. With effects like fluids, we will be able to include a great deal of interesting and visually pleasing effects that will ultimately elevate the virtual world experience for players. On top of that, once real-time simulation is established, it will most likely affect the kinds of gameplay which can be designed, which will open up a new world to all sorts of new types and genres of games.