SIGGRAPH 2018: deep learning and deep fakes

Dr. Andrew Glassner is a writer-director and a consultant in interactive fiction and computer graphics. Kicking off this year’s SIGGRAPH courses program, he ran 2018’s incredibly impressive Deep Learning: A Crash Course. Deep learning is the hot topic at this year’s SIGGRAPH.

The giant ballroom was packed for the whole afternoon on Sunday, a testament to how much interest there is in this subject, and his talk is one of the best SIGGRAPH courses we have attended in recent years. As Glassner mentioned, he has spent a career discovering patterns from data, and he is a brilliant speaker and educator.

Glassner’s course aimed to show how this subset of AI, called deep learning, works and what it offers computer graphics. Glassner started working in 3D computer graphics in 1978, and has carried out research at such key centres as the NYIT Computer Graphics Lab, Case Western Reserve University, the IBM TJ Watson Research Lab, Xerox PARC, and Microsoft Research. The New York Times wrote that, “Andrew Glassner is one of the most respected talents in the world of computer graphics research.”

Another course, in the new style of SIGGRAPH workshop, that was also on Sunday was Truth in Images, Video and Graphics, organised by Irfan Essa. He is a Professor in the School of Interactive Computing of the College of Computing at Georgia Tech, and he also works at Google Research. This workshop/course focused on the ethical and societal issues that deep learning has raised. Deep learning has accelerated the arms race between making and detecting synthetic imagery, or ‘fake news’.

In many respects these two courses, while very different, were two looks at the same topic. The Truth course was a new special type of SIGGRAPH intensive, by-invitation course. It was tiny compared to the vast size of Glassner’s lecture, yet its attendees were senior experts in deep learning and in forensic imaging. The Truth workshop format was different from the Deep Learning Crash Course. It had a range of speakers and experts such as Matthias Nießner, from the Visual Computing Group at TUM in Munich, and Hao Li from USC ICT, but also forensic imaging experts such as Hany Farid, a professor of computer science at Dartmouth College, and people involved in the recent DARPA Media Forensics program.

As an aside, if you’re at SIGGRAPH this week and are interested in learning more about this area, be sure to check out the papers session on Thursday from 2:00pm – 3:30pm in Ballroom C. One of the presentations will be Deep Video Portraits, which “enables full control over a target actor by transferring head pose, facial expressions, and eye motion with a high level of photorealism.” Check out the YouTube video, below.

Deep Learning: A Crash Course

Glassner’s course on deep learning was designed to deliver a semester course on the theory of deep learning, but in an afternoon (and without the maths). The course was a great primer for many of the technical papers we saw mentioning deep learning at the Papers Fast Forward later that night. Glassner pitched the content to a SIGGRAPH beginner in the area of deep learning, but the course was very densely packed with information and exceedingly well structured.

Over the afternoon Glassner explained terms, gave examples, and built up a strong conceptual understanding of why this area of research has been so successful in image processing and image classification problems such as face recognition, noise reduction and ray tracing, and why it is being actively researched by almost every lab and equipment manufacturer. This point was underlined by Monday’s NVIDIA real time ray tracing hardware launch, which uses deep learning extensively; Glassner pointed to NVIDIA’s published research in deep learning several times.

Deep learning is a class of machine learning (ML) algorithms that have proven enormously powerful, especially on classification problems. It is one of the most successful approaches to machine learning and maps very well to GPU and hardware acceleration. It is already widely used for tasks such as reading handwritten addresses on envelopes or details on bank checks. But the focus of the course was moving from these simple textbook examples to the hugely impactful work being shown here at SIGGRAPH in image synthesis, facial recognition, tracking and fluid simulation, as well as the key work in noise reduction and ray tracing. Starting with the simple handwriting data set MNIST, which is the ‘Hello World’ of ML, he built up to NVIDIA’s brilliant noise reduction work and why it is possible to produce the fast ray traced renders that are so impressive.
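
To make that MNIST starting point concrete, here is a minimal sketch of the kind of classifier the ‘Hello World’ exercise involves, written in Python with Keras. This is our own illustration, not code from the course; the layer sizes, optimizer and epoch count are arbitrary choices.

```python
# A minimal MNIST classifier sketch (not from Glassner's course).
# Layer sizes, optimizer and epoch count are illustrative choices.
import tensorflow as tf

# Load the MNIST handwriting dataset: 28x28 greyscale digit images.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixels to [0, 1]

# A small fully connected network: flatten the image, one hidden layer,
# then a 10-way softmax over the digit classes 0-9.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Supervised learning: minimise cross-entropy between predictions and labels.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)  # typically ~97-98% accuracy on held-out digits
```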

Glassner outlined most of the key terms and explained the concepts, starting with unsupervised learning, reinforcement learning and supervised learning. It was on the last of these that he spent most of his time. He also covered some funny examples of defeating machine learning, highlighting how often this field is just experimentation rather than proven theory; one such example of defeating a classifier is sketched below. Part of the very nature of machine learning is the black box character of its solutions, which can appear almost magical and yet have insightful, innovative ‘simple’ ideas as their kernels.
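
The defeating-machine-learning examples are in the spirit of adversarial perturbations such as the fast gradient sign method (FGSM). The sketch below is our own illustration of that idea, not code from the course: it nudges an input image up the loss gradient so a trained classifier (such as the MNIST model above) misreads it, while a human sees essentially no change.

```python
# Sketch of an adversarial perturbation (fast gradient sign method).
# Assumes `model`, `x_test` and `y_test` from the MNIST sketch above.
import tensorflow as tf

def fgsm_perturb(model, image, label, epsilon=0.1):
    """Return an adversarial copy of `image` that tends to fool `model`."""
    image = tf.convert_to_tensor(image[None, ...], dtype=tf.float32)
    label = tf.convert_to_tensor([label])
    with tf.GradientTape() as tape:
        tape.watch(image)
        prediction = model(image)
        loss = tf.keras.losses.sparse_categorical_crossentropy(label, prediction)
    # Step *up* the loss gradient: imperceptible to us, confusing to the model.
    gradient = tape.gradient(loss, image)
    adversarial = image + epsilon * tf.sign(gradient)
    return tf.clip_by_value(adversarial, 0.0, 1.0)[0]

# Example: perturb the first test digit so the model is likely to misread it.
adv = fgsm_perturb(model, x_test[0], y_test[0])
```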

Truth in Images: Arms Race

The problem with the democratisation of news is that it has meant everyone has access to the news, but everyone also thinks that they are a journalist. Feeding into this narrative has been the issue of ‘fake news’. The Truth workshop sought to discuss both the technical issues around what is possible and how to technically detect fakes. This latter work covered much of the same landscape that Glassner covered, but at the high end of current deep learning innovation.

The most obvious example from the last year has been deepfakes, and while the complexity of actually using that software to produce believable results is missed by most of the popular press’s coverage, the issue of detecting high quality propaganda or false imagery is very timely and important.

Unfortunately, even where there is little benefit to people in presenting actually faked news, it will happen. And while the panel of speakers did not focus much of the discussion on the issue of ‘profiting’ from having people view, and often believe, edited imagery, they did frame the discussion as an arms race between the tools for making believable fakes and those for detecting them. What made the day so interesting is that the experts in the room covered both the leading research into what is possible and how to spot it.

Many of the leading researchers who attended are involved in legal expert testimony and national security issues. The weight of experienced opinion and in-depth discussion is what made this SIGGRAPH workshop so valuable. As with many of the courses and workshops at SIGGRAPH 2018, the organising committees do not get enough praise for the calibre of speakers they arrange. In this case, it was hard to imagine a more informed and experienced group addressing such a complex problem.

(Photo: Hao Li)

Some of the solutions discussed and demonstrated included advanced image analysis, but also the idea of maintaining provenance to keep trust. The question was raised as to why cameras themselves are not encoding more information that would provide proof of untouched imagery, something that is completely possible.

This ‘blockchain for photos’ would not solve 100% CGI fakes, but it could allow for detailed examination of photos to prove their accuracy. This is key not only in proving a fake, but in addressing the growing problem that people believe “anything can be faked” and are thus losing trust in actual real images and valid reporting. A sketch of the capture-time signing idea follows.
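
As a rough sketch of how capture-time provenance could work, a camera might sign a hash of the image data the moment it is taken, so that any later edit breaks the proof. The Python below is purely our illustration of the concept (using the third-party cryptography package), not a specific scheme proposed at the workshop; in a real design the private key would live in tamper-resistant camera hardware.

```python
# Illustrative capture-time image signing for provenance (our sketch,
# not a scheme from the workshop).
import hashlib
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# In a real camera this key would be burned into tamper-resistant hardware.
camera_key = Ed25519PrivateKey.generate()
camera_public_key = camera_key.public_key()

def sign_capture(image_bytes: bytes) -> bytes:
    """Sign a hash of the raw image data at capture time."""
    return camera_key.sign(hashlib.sha256(image_bytes).digest())

def verify_capture(image_bytes: bytes, signature: bytes) -> bool:
    """Check whether this file still matches what the camera signed."""
    try:
        camera_public_key.verify(signature, hashlib.sha256(image_bytes).digest())
        return True
    except InvalidSignature:
        return False  # edited, recompressed, or not from this camera

photo = b"...raw sensor data..."
sig = sign_capture(photo)
print(verify_capture(photo, sig))          # True: untouched image verifies
print(verify_capture(photo + b"x", sig))   # False: any edit breaks the proof
```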

The veracity of material, the lack of trust, and allowing someone to say “it was faked” when it was not, reflect broader issues of trust in society and in the institutions of a democracy. The weaponising of social media undermines the targets of the attacks and provides those who would be held accountable with the excuse that “it was all faked”.

In addition to advanced algorithms for detecting editing, the speakers also pointed to initiatives such as the web site TruePic, which seeks to validate that imagery comes from a secure camera source, providing simple, straightforward authentication for photo verification. But sadly people are lazy and few will use such tools explicitly, hence much of the day was spent focusing on forensic algorithms for identifying faked material, especially material made using GANs (generative adversarial networks). As one speaker commented, “if you believe the ‘GAN story’ (of delivering incredible learned results) then the generator will always win; anything you can detect, you can train to improve and add in to the fake.” Thus the arms race is to use deep learning tools to detect, not improve, faked imagery. The training loop sketched below shows why.
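
The ‘generator will always win’ remark falls straight out of how a GAN is trained: whatever detection signal the discriminator learns is immediately handed to the generator as its training objective. The toy Python loop below is our own illustration of that dynamic on 1-D data, not a real image forensics system.

```python
# Toy GAN loop illustrating the detect/fake arms race (our illustration,
# on 1-D data; not an image forensics system).
import tensorflow as tf

def real_batch(n=64):
    # Toy "real" data: samples from the distribution the generator must fake.
    return tf.random.normal([n, 1], mean=4.0, stddev=0.5)

generator = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(1,)),
    tf.keras.layers.Dense(1),
])
# The "detector": learns to tell real samples from generated ones.
discriminator = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(1,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

bce = tf.keras.losses.BinaryCrossentropy()
g_opt = tf.keras.optimizers.Adam(1e-3)
d_opt = tf.keras.optimizers.Adam(1e-3)

for step in range(2000):
    noise = tf.random.normal([64, 1])
    # 1) Train the detector to flag fakes and pass real samples.
    with tf.GradientTape() as tape:
        d_loss = (bce(tf.ones([64, 1]), discriminator(real_batch()))
                  + bce(tf.zeros([64, 1]), discriminator(generator(noise))))
    d_grads = tape.gradient(d_loss, discriminator.trainable_variables)
    d_opt.apply_gradients(zip(d_grads, discriminator.trainable_variables))
    # 2) Train the generator on exactly that detection signal:
    #    produce samples the detector scores as "real".
    with tf.GradientTape() as tape:
        g_loss = bce(tf.ones([64, 1]), discriminator(generator(noise)))
    g_grads = tape.gradient(g_loss, generator.trainable_variables)
    g_opt.apply_gradients(zip(g_grads, generator.trainable_variables))
```

Step 2 is the arms race in miniature: any cue the detector learns to flag becomes exactly what the generator learns to eliminate.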

Of course, one does not need deep learning, or even Photoshop image processing tools, to produce fake news. Taking images out of context, showing footage from one war zone as if it were from another, or simply staging a photograph to promote an agenda are all real world, low tech problems. But it is also true that SIGGRAPH needs to provide these high level discussions that address some of the implications of computer graphics and image processing, in addition to the great technical focus on how to do it.