Dimensional Imaging: The Remedy Connection / 14th of February 2015
Remedy’s Quantum Break is set to be both a flagship title for Microsoft’s Xbox One and a potential breakthrough in narrative-driven gaming, and not simply for the ground-breaking way in which it converges interactive and non-interactive storytelling elements. A major contributor to this, apart from the developer’s own Northlight game engine, is the performance capture technology chosen to bridge the gulf between videogame character and actor’s performance. Whilst motion capture has been in use within the industry for a considerable amount of time, it has never been this advanced, and the combination of the two technologies has given birth to a new generation of digital actors capable of expressing even the slightest nuance of emotion. Yet the system that enables this wasn’t created in America or Japan, but in Scotland, at a small company in Glasgow…
Dimensional Imaging’s breakthrough DI4D PRO technology is a system that employs either six or nine cameras capturing footage at resolutions of 1600 x 1200 or 2048 x 1536 at up to 60fps. The cameras are divided into groups of three - two monochrome, one colour - each referred to as a pod. Remarkably, this technology does not require any special makeup or even markers, a common sight in motion capture, to achieve the stunning results that set this system apart from the pretenders. So, how does it do this? We wanted to find out for ourselves, and so we took our questions directly to the company behind this innovative technology.
HRG: Can you explain how your 4D performance capture technology works?
DI: Facial capture traditionally splits into either facial performance capture or facial 3D scanning. Our 4D surface capture solution (DI4D PRO) effectively combines both by capturing dense 3D ‘scan’ data at 50 or 60 frames per second. Our DI4D PRO System captures sequences of high resolution, colour 3D facial performance data. We use only standard video lighting and do not require any markers, make-up or structured light projection. Our software then processes the captured video data to produce a simple point cache file (and optional per frame texture and detail maps) that contains the motion of all of the vertices in the customer’s facial mesh.
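For readers wondering what a "point cache" amounts to in practice, here is a minimal Python sketch. It is an illustration only: the interview does not describe DI's actual file format, so the `PointCache` class, its fields and its helper methods are all hypothetical names. The idea it shows is simply that a point cache stores one position per mesh vertex per frame, with the same vertex order in every frame.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class PointCache:
    """Hypothetical in-memory point cache (not DI's file format).

    frames[f][v] is the 3D position of vertex v at frame f.
    """
    fps: float
    frames: List[List[Vec3]]

    def vertex_count(self) -> int:
        # Number of vertices in the mesh, taken from the first frame.
        return len(self.frames[0]) if self.frames else 0

    def duration_seconds(self) -> float:
        # Length of the captured sequence at the stated frame rate.
        return len(self.frames) / self.fps

    def has_fixed_topology(self) -> bool:
        # Every frame must describe the same set of vertices.
        n = self.vertex_count()
        return all(len(frame) == n for frame in self.frames)

    def trajectory(self, vertex_index: int) -> List[Vec3]:
        # The path of one vertex through the whole sequence.
        return [frame[vertex_index] for frame in self.frames]


# A toy two-vertex "mesh" over three frames at 60 frames per second.
cache = PointCache(
    fps=60.0,
    frames=[
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        [(0.0, 0.1, 0.0), (1.0, 0.0, 0.1)],
        [(0.0, 0.2, 0.0), (1.0, 0.0, 0.2)],
    ],
)
print(cache.vertex_count(), cache.duration_seconds(), cache.trajectory(1))
```

Because the vertex order never changes, a studio can read the trajectory of any vertex straight out of the cache, which is what makes the data suitable for 'solving' onto a character rig later.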
HRG: How are your results achieved without the need for markers or makeup?
DI: Our Systems use a highly efficient passive stereo photogrammetry solution. This allows us to generate high fidelity 3D sequence data, using only a relatively small number of cameras and without requiring excessive processing power. The result is a practical yet very high fidelity solution that is able to capture much more of the subtlety and nuance of facial performance than alternative approaches, without the need for any markers or make-up.
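To give a feel for how two cameras can recover depth with no markers or projected light, the sketch below uses the standard textbook relation for an idealised, rectified stereo pair: depth equals focal length times baseline divided by disparity. This is not DI's pipeline, which the interview does not detail, and the function name and all the numbers are assumptions chosen purely for illustration.

```python
def depth_from_disparity(focal_length_px: float,
                         baseline_m: float,
                         disparity_px: float) -> float:
    """Depth in metres of a point seen by an idealised rectified stereo pair.

    focal_length_px: focal length expressed in pixels (assumed value).
    baseline_m:      distance between the two camera centres in metres.
    disparity_px:    horizontal shift of the same point between the two images.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline_m / disparity_px


# Illustrative numbers only: a 2000 px focal length and a 10 cm baseline.
near = depth_from_disparity(2000.0, 0.10, 250.0)
# The same point matched one pixel differently lands slightly further away.
far = depth_from_disparity(2000.0, 0.10, 249.0)
print(near, far, far - near)
```

The second call hints at why matching at the pixel level matters: a one-pixel difference in where a patch of skin is matched between the two images shifts the recovered depth by a few millimetres in this toy set-up.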
Colin Urquhart (CEO): There are only so many markers you can put on a person’s face. This limits the fidelity of traditional optical motion capture. Rather than track a sparse set of marker points, we obtain dense per-pixel optical flow data and use this to obtain a fixed topology mesh that deforms. That’s what makes our system unique and useful for entertainment.
HRG: How do your techniques improve upon already established performance capture systems?
DI: Most facial performance capture systems either capture a single video stream which is interpreted to drive a rigged character model, or capture the 3D trajectories of only a sparse set of discrete facial markers. By contrast, our 4D system acquires a “3D scan” per frame using passive stereo photogrammetry and then a dense mesh is tracked through the sequence using optical flow. Every vertex in the tracked mesh then effectively becomes a motion capture marker. The result is much higher fidelity data, capturing much more of the subtlety and nuance of facial performance and expression.
Colin Urquhart (CEO): The fundamental difference between our system and others that capture 3D data at video rates is that our tracking works at the pixel level using the natural skin texture rather than tracking features or markers on the face.
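The tracking idea described above, where every mesh vertex is carried along by dense optical flow, can be sketched in a few lines of Python. This is a heavily simplified 2D illustration under stated assumptions: the flow fields here are synthetic, the lookup is nearest-pixel, and the function names (`sample_flow`, `track_vertices`, `uniform_flow`) are hypothetical. The real system tracks a 3D mesh from real footage, and the interview does not give its implementation.

```python
from typing import List, Tuple

Point = Tuple[float, float]
# flow[y][x] is the (dx, dy) motion in pixels of the image at pixel (x, y).
FlowField = List[List[Tuple[float, float]]]


def sample_flow(flow: FlowField, p: Point) -> Tuple[float, float]:
    """Nearest-pixel flow lookup, clamped to the image bounds."""
    height, width = len(flow), len(flow[0])
    x = min(max(int(round(p[0])), 0), width - 1)
    y = min(max(int(round(p[1])), 0), height - 1)
    return flow[y][x]


def track_vertices(vertices: List[Point], flows: List[FlowField]) -> List[List[Point]]:
    """Carry each vertex through the sequence, one flow field per frame step.

    Index i refers to the same vertex in every returned frame, so the mesh
    keeps a fixed topology while its shape deforms.
    """
    frames = [list(vertices)]
    for flow in flows:
        moved = []
        for p in frames[-1]:
            dx, dy = sample_flow(flow, p)
            moved.append((p[0] + dx, p[1] + dy))
        frames.append(moved)
    return frames


def uniform_flow(width: int, height: int, dx: float, dy: float) -> FlowField:
    """Synthetic flow field in which every pixel moves by the same amount."""
    return [[(dx, dy) for _ in range(width)] for _ in range(height)]


# Two vertices tracked across two frame steps of made-up motion.
start = [(1.0, 1.0), (3.0, 2.0)]
flows = [uniform_flow(8, 8, 1.0, 0.0), uniform_flow(8, 8, 0.0, 2.0)]
tracked = track_vertices(start, flows)
print(tracked[-1])
```

With a marker-based system the list of tracked points would be limited to however many markers fit on a face. Here the list can be as long as the mesh has vertices, since each one simply reads its motion from the flow field underneath it.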
DI: In addition to our ‘sit down’ system (DI4D PRO), we have also developed a head mounted version of this system - ‘DI4D HMC’. The HMC is able to capture full performance (by simultaneously capturing the facial performance with traditional optical body motion capture) from multiple actors, whilst also delivering the high fidelity of capture we are known for.
HRG: How will this technology benefit narrative driven experiences such as Remedy Entertainment’s Quantum Break?
DI: From the actor’s point of view, they simply have to give their facial performance in front of an array of synchronised video cameras and standard video lighting, without requiring markers or special make-up - the result is super high definition facial motion capture from real life talent. Once our software has processed the captured video data to produce a simple point cache file (and optional per frame texture and detail maps), Remedy then ‘solve’ this point cache data onto their character rigs.
Sam Lake (Remedy Creative Director): Quantum Break is a hugely ambitious project that combines action and narrative components in a unique way to bring the characters to life. The only way to achieve the high quality of performance was to create highly realistic digital doubles of talented actors. By using Dimensional Imaging’s DI4D facial performance capture solution, combined with Remedy’s Northlight storytelling technology, we can ensure that every nuance of the actors’ performances are captured on screen.
Colin Urquhart (CEO): With the development of games like Quantum Break, we are now starting to see a level of graphic quality and realism rendered in real-time that could previously only be achieved with off-line rendering. As a result we are truly entering an era of convergence, where the quality of both the assets and performances required in game will be much more on a par with those required for television and movies. This is where the benefits of our technology come into their own.
HRG: What uses does the system present outside of entertainment industries?
DI: Both our 3D and 4D systems are used across the research sector, within facial surgery, psychology, and orthodontics.
In some ways, the requirements of the two markets are quite similar in that both require high fidelity data captured in a very subject friendly environment. The main difference is in the duration and intensity of projects. Medical research projects tend to develop quite slowly and then require modest intensity of capture and processing over a long period of time, whereas entertainment projects tend to develop (and change) rapidly, with a brief period of very high intensity work with tight deadlines. We are very pleased that we have been able to develop flexible solutions that can be applied easily to both scenarios.
HRG: How do the different recording resolutions and camera set-ups (six vs nine) compare?
DI: Adding more cameras increases the coverage of capture. For example, our HMC system uses just two cameras and only captures the front of the subject’s face. The nine camera system can capture from ear to ear and also allows for a little bit of subject movement. The nine camera system can also capture better fidelity data in areas such as the side of the nose that are not so visible with a two camera system.
Now, whilst the majority of this might sound a tad academic to most gamers, this technology could have big implications for the quality of cinematic, narrative experiences, with Remedy Entertainment’s upcoming Xbox One exclusive, Quantum Break, being just one. As gamers look for reasons to upgrade to a PS4 or Xbox One, and with companies such as EA touting the potential of the mobile market as a viable alternative to the console platforms, the coming of age of the narrative experience could perhaps not have come at a better time. Each successive console generation seems to usher in a new age of one particular aesthetic aspect, be it texture resolution or lighting, but this time we may very well see a considerable leap forward in terms of animation, as characters look and move in dramatically more lifelike ways. When showcasing Forza 5 at E3 2013, Turn 10 proclaimed that AI was dead as it plugged its Drivatar system. Perhaps now we are witnessing the end of the video game character in much the same sense, as pixels and polygons finally step aside to allow the digital actor to take to the stage.
Special thanks to everyone at Dimensional Imaging for taking the time out to chat with us.