Background Modifier, AI Governance, and small dataset GANs
Background Features in Google Meet
Google launches background features in its online video conference platform that work directly in your browser.
As the working-from-home culture quickly spreads around the world, demand grows for added functionality in conferencing applications. To keep the focus on the meeting itself and prevent distractions from the setting or background objects, several applications have implemented background modification tools. Unfortunately, these functionalities often require extensive computing power and the installation of dedicated software.
Google AI researchers have recently developed background features for Google Meet, the real-time meeting application in Google Workspace. The new set of features allows users to (1) slightly blur, (2) heavily blur, or (3) completely replace their background.
The interesting aspect here is that everything runs in high quality directly in your browser! Researchers used MediaPipe, Google's open-source live ML platform, and WebAssembly, a low-level web code format, to achieve high performance with extremely fast inference speeds. For more information about making this solution run on a wide range of devices with low power consumption, read the article.
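The article does not publish the browser pipeline's code, but the final step of a background feature is easy to sketch: given a soft segmentation mask from a person-segmentation model (such as the one MediaPipe provides), each frame is alpha-blended over a blurred or replacement background. The function name and toy values below are illustrative assumptions, not Google's implementation.

```python
import numpy as np

def composite_background(frame, mask, background):
    """Alpha-blend the foreground (person) over a replacement or
    blurred background using a soft segmentation mask in [0, 1]."""
    alpha = mask[..., None]            # broadcast mask over RGB channels
    return alpha * frame + (1.0 - alpha) * background

# Toy 2x2 RGB frame: the mask keeps the left column as foreground.
frame = np.full((2, 2, 3), 200.0)      # "person" pixels
background = np.zeros((2, 2, 3))       # replacement background
mask = np.array([[1.0, 0.0],
                 [1.0, 0.0]])
out = composite_background(frame, mask, background)
```

For the blur features, `background` would simply be a blurred copy of the frame itself; for background replacement, it is a separate image.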
Lastly, Google provides a Model Card for the segmentation model used in these features. It lays out many interesting aspects of the model, such as its limitations and ethical considerations. Furthermore, it provides the evaluation metrics used to assess model performance and detailed fairness results across geographic regions, skin tones, and gender. This Model Card format, based on a paper by Google, is relevant for shedding light on intended use cases and promoting transparent model reporting.
Why it matters
The complexity of a Machine Learning model is not necessarily related to its real-world utility. Here, Google researchers show that a fairly simple, lightweight segmentation model optimized for web performance can have a tremendous impact on an everyday application. Regardless of their complexity, successful solutions are transparent, accessible, and suited to common real-life use cases.
AI governance around ethics, fairness, transparency, and explainability is paramount when putting Machine Learning solutions into production.
When they leave the research setting, AI models can introduce unique problems. Training data often doesn't reflect real-life data. Whether the cause is errors, duplication, or bias, a model trained on flawed data doesn't perform well. Even worse, it can produce discriminatory or unfair results.
Additionally, models go stale over time. The inference quality of a model is known to drift as the input stream becomes increasingly different from the data the model was trained on.
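A minimal way to watch for this kind of drift is to compare a summary statistic of the incoming inputs against the training distribution. The sketch below uses a simple standardized mean difference on one feature; the function name, threshold, and synthetic data are assumptions for illustration, not a production monitoring setup.

```python
import numpy as np

def mean_shift_score(train_feats, live_feats):
    """Standardized difference between the training-time mean and the
    live input mean; large values suggest the input stream has drifted."""
    mu, sigma = train_feats.mean(), train_feats.std()
    return abs(live_feats.mean() - mu) / (sigma + 1e-8)

rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, 10_000)    # data the model was trained on
stable = rng.normal(0.0, 1.0, 1_000)    # live inputs, same distribution
drifted = rng.normal(1.5, 1.0, 1_000)   # live inputs after a shift

stable_score = mean_shift_score(train, stable)
drift_score = mean_shift_score(train, drifted)
```

Real monitoring tools track many features at once and use proper statistical tests, but the principle is the same: alert when live inputs stop resembling the training data.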
ML tools for production (controversially called 'ML Ops') are on the rise. Tools such as allegro ai and MLflow from Databricks advertise end-to-end ML operations management, from experiment tracking to deployment in production. With or without these tools, companies today need to define process-management frameworks that take all external factors into account. Read more in this Forbes article.
Adding on to the fundamental requirements formulated by the EU for trustworthy AI, BMW Group has written its code of ethics for AI. It states seven basic principles covering the use of AI within the company, which are displayed in the image below.
The code of ethics is a great start and shows BMW's hands-on approach to tackling AI governance. However, concepts such as ethics, fairness, explainability, and transparency are still topics of debate in the AI industry. They are ever-changing, which is what makes AI governance so challenging. BMW affirms that this list will be refined and adapted continuously.
Why it matters
It is fundamental for any company using AI in its products or services to define internal AI governance. AI's widespread use and the increasing diversity of its use cases demonstrate the need for companies to manage their processes and take responsibility for their products' outcomes, especially as AI is democratized and increasingly leveraged in small and medium enterprises.
GANs with small datasets
Dynamic data augmentation allows GANs to produce good results with less data.
Generative Adversarial Networks often need huge amounts of data to produce good results. You might think this is a non-issue given the seemingly unlimited supply of images online. However, it remains challenging to collect a large dataset for an application with specific constraints, which can include subject type, image quality, location, privacy, and more. Unfortunately, when trained on small datasets, GANs tend to replicate the training data or output extremely noisy results.
Researchers from Nvidia have developed a discriminator augmentation mechanism that stabilizes training when less data is available. The technique, called Adaptive Discriminator Augmentation, dynamically augments the training data with transformations such as image scaling, rotation, and color adjustments. The key is to apply these common augmentation techniques in the right proportion to prevent the discriminator from overfitting.
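The adaptive part can be illustrated with a short sketch. Following the paper's heuristic, an overfitting indicator r_t (here, the mean sign of the discriminator's outputs on real images) is compared with a target value, and the augmentation probability p is nudged up or down accordingly. The function name, step size, and toy logits below are illustrative assumptions, not Nvidia's implementation.

```python
import numpy as np

def update_augment_prob(p, d_real_logits, target=0.6, step=0.01):
    """Nudge the augmentation probability p: if the discriminator is too
    confident on real images (overfitting signal r_t above target),
    augment more; otherwise, augment less."""
    r_t = np.mean(np.sign(d_real_logits))   # overfitting heuristic
    p += step if r_t > target else -step
    return float(np.clip(p, 0.0, 1.0))

# The discriminator is confident on every real image -> r_t = 1.0,
# above the target, so p rises from 0.10 to roughly 0.11.
p = update_augment_prob(0.10, np.array([2.1, 1.7, 3.0, 0.9]))

# Mixed outputs -> r_t = 0.0, below the target, so p falls back down.
p = update_augment_prob(p, np.array([-1.2, -0.4, 0.8, 1.5]))
```

Because p adapts throughout training, the same mechanism works across dataset sizes without manual tuning of the augmentation strength.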
Training a StyleGAN model on the Flickr-Faces-HQ (FFHQ) dataset, the researchers significantly improved the evaluation metrics compared to their baseline StyleGAN model. In fact, their version even beat the baseline model trained on a dataset 5 times larger! These results are shown in the figure below, taken from the paper.
Why it matters
It takes tens of thousands of images to train a GAN successfully, and gathering that data is an extremely resource-intensive task. Reducing the number of required images by an order of magnitude already notably reduces the effort. Indeed, Adaptive Discriminator Augmentation makes GANs more accessible and increases the feasibility of high-stakes Machine Learning tasks.