Will AI power video chats of the future? That’s what Nvidia implied this week with the unveiling of Maxine, a platform that provides developers with a suite of GPU-accelerated AI conferencing software. Maxine brings AI effects including gaze correction, super-resolution, noise cancellation, face relighting, and more to end users, while in the process reducing how much bandwidth videoconferencing consumes. Quality-preserving compression is a welcome innovation at a time when videoconferencing is contributing to record bandwidth usage. But Maxine’s other, more cosmetic features raise uncomfortable questions about AI’s negative — and possibly prejudicial — impact.
A quick recap: Maxine employs AI models called generative adversarial networks (GANs) to modify faces in video feeds. Top-performing GANs can create realistic portraits of people who don’t exist, for instance, or snapshots of fictional apartment buildings. In Maxine’s case, they can enhance the lighting in a video feed and recomposite frames in real time.
Bias in computer vision algorithms is pervasive, with Zoom’s virtual backgrounds and Twitter’s automatic photo-cropping tool disfavoring people with darker skin. Nvidia hasn’t detailed the datasets or AI model training techniques it used to develop Maxine, but it’s possible that the platform might not, for instance, manipulate Black faces as effectively as light-skinned faces. We’ve reached out to Nvidia for comment.
Beyond the bias issue, there’s the fact that facial enhancement algorithms aren’t always good for mental health. Studies by Boston Medical Center and others show that filters and photo editing can take a toll on people’s self-esteem and trigger disorders like body dysmorphia. In response, Google earlier this month said it would turn off by default its smartphones’ “beauty” filters that smooth out pimples, freckles, wrinkles, and other skin imperfections. “When you’re not aware that a camera or photo app has applied a filter, the photos can negatively impact mental wellbeing,” the company said in a statement. “These default filters can quietly set a beauty standard that some people compare themselves against.”
That’s not to mention how Maxine might be used to get around deepfake detection. Several of the platform’s features analyze the facial points of people on a call and then algorithmically reanimate the faces in the video on the other side, which could interfere with the ability of a system to identify whether a recording has been edited. Nvidia will presumably build in safeguards to prevent this (currently, Maxine is available to developers only in early access), but the potential for abuse is a question the company hasn’t so far addressed.
None of this is to suggest that Maxine is malicious by design. Gaze correction, face relighting, upscaling, and compression seem useful. But the issues Maxine raises point to a lack of consideration for the harms its technology might cause, a tech industry misstep so common it’s become a cliche. The best-case scenario is that Nvidia takes steps (if it hasn’t already) to minimize the ill effects that might arise. The fact that the company didn’t reserve airtime to spell out these steps at Maxine’s unveiling, however, doesn’t instill confidence.
Thanks for reading,
AI Staff Writer