Posted on 
December 7, 2020

Protein Folding Breakthrough, TLDR in Science, & Robot Bias

AlphaFold solves the protein folding problem

DeepMind's AlphaFold reaches 90+ percent accuracy at protein structure prediction competition

Context

Proteins are among the most important molecules for sustaining life. Practically all biological functions, from transporting oxygen through our blood to giving leaves their bold colors, depend on proteins.

A protein's structure can be described at different levels (primary, secondary, and tertiary), each with its own degree of complexity and abstraction, as depicted below.

Source: MSOE Center for BioMolecular Modeling

There are different ways of determining protein structure, and each of these methods yields information about the protein in one of the different structural languages. For instance, while mass spectrometry can yield primary structure, only Nuclear Magnetic Resonance (NMR), X-ray crystallography, and cryo-electron microscopy (which are immensely time- and resource-intensive) are able to yield tertiary structure.

What's new

Last week, DeepMind's AlphaFold competed in the biennial Critical Assessment of protein Structure Prediction (CASP). The challenge asks participants to predict the tertiary structure of a protein from its primary structure alone. The metric used for evaluation is the Global Distance Test (GDT): in short, a score from 0 to 100 indicating how close the predicted structure is to the experimentally determined ground truth.
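To give an intuition for the metric, the commonly used GDT_TS variant averages, over distance cutoffs of 1, 2, 4, and 8 Å, the percentage of residues whose predicted position falls within that cutoff of the true structure. The sketch below illustrates this and deliberately ignores the superposition search CASP performs before measuring distances:

```python
def gdt_ts(distances):
    """Simplified GDT_TS: average over four distance cutoffs (in angstroms)
    of the fraction of residues within that cutoff, scaled to 0-100.
    `distances` are per-residue deviations between prediction and ground truth,
    assumed to be measured after optimal superposition."""
    cutoffs = [1.0, 2.0, 4.0, 8.0]
    n = len(distances)
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)

# A prediction where most residues deviate by only a fraction of an
# angstrom scores high; large deviations drag the score toward 0.
print(gdt_ts([0.5, 0.8, 1.5, 3.0]))  # 81.25
```

A near-perfect prediction (all residues within 1 Å) scores 100, which is why the 90 GDT threshold discussed below is treated as effectively solving the problem.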

In the past seven editions of CASP, the winners' scores never grew past 75 GDT, and stayed below 50 GDT before the 2018 edition. This year, however, AlphaFold's state-of-the-art AI model achieved a median score of 92.4 GDT, surpassing the 90 GDT threshold that is widely considered a 'solution' to the protein folding problem.

Source: DeepMind

Their solution implements new deep learning techniques that treat a folded protein as a spatial graph. Using an attention-based neural network, evolutionarily related sequences, and multiple sequence alignments, the system develops strong predictions of the protein's underlying physical structure.

Why it matters

For 50 years, biologists have been looking for a method to determine tertiary structure from primary structure alone. This matters because a protein's tertiary structure is closely linked to its function: knowing the tertiary structure unlocks a greater understanding of what the protein does and how it works.

What's next

The DeepMind team states that they're "optimistic about the impact AlphaFold can have on biological research and the wider world, and excited to collaborate with others to learn more about its potential in the years ahead. Alongside working on a peer-reviewed paper, we’re exploring how best to provide broader access to the system in a scalable way."

'Too Long; Didn't Read' comes to scientific literature

A new state-of-the-art summarization model is being used to distill the information of AI research papers into a single sentence

Context

In recent years, many different summarization models have been released. Their common goal: reduce reading time without compromising understanding. You can easily find online bots such as summarizebot, summarization APIs such as DeepAI's, and articles explaining the key technical concepts behind these models. What's the catch? These models tend not to generalize well: applied to text unlike the data they were trained on, they perform significantly worse.

What's new

Researchers from the Allen Institute "introduce TLDR generation for scientific papers, a new automatic summarization task with high source compression, requiring expert background knowledge and complex language understanding." This quote is itself a summary of the paper's abstract, generated by the method the paper describes.

The researchers compiled the SciTLDR dataset and trained their model using a multitask learning strategy on top of pretrained BART. By analyzing only a paper's abstract, introduction, and conclusion (for computational reasons), the method can summarize 5,000-word articles in just 20 words.
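The input-construction step and the "high source compression" the paper refers to can be sketched as follows. The section names and the toy paper below are illustrative, not the authors' exact preprocessing:

```python
def build_tldr_input(paper):
    """Keep only the abstract, introduction, and conclusion, mirroring
    the choice to drop the paper body for computational reasons."""
    parts = [paper.get(k, "") for k in ("abstract", "introduction", "conclusion")]
    return " ".join(p for p in parts if p)

def compression_ratio(source, summary):
    """High source compression: source words per summary word."""
    return len(source.split()) / max(1, len(summary.split()))

paper = {
    "abstract": "We introduce TLDR generation for scientific papers.",
    "introduction": "Reading scientific literature is time-consuming.",
    "conclusion": "Our model produces one-sentence summaries.",
    "body": "...thousands of words omitted...",
}
model_input = build_tldr_input(paper)  # body is excluded

# A 5,000-word article compressed into 20 words:
print(compression_ratio("word " * 5000, "word " * 20))  # 250.0
```

The actual summarizer is a trained BART model, of course; the point here is only how aggressive the compression target is compared to typical abstract-length summaries.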

Source: Semantic Scholar

The AI solution has been deployed as a beta version on Semantic Scholar. Displaying the TLDR of each article directly on the search results page makes it easier to quickly locate relevant papers. The feature is already available for nearly 10 million computer science papers, and counting!

Why it matters

Staying up to date with scientific literature is an essential part of a researcher's workflow. However, parsing through a long list of papers from different sources by reading abstracts is extremely time-consuming.

TLDRs can help researchers make quick and informed decisions about which papers are relevant to them. TLDRs also provide paper summaries for explaining the content in other contexts, such as sharing a paper on social media platforms.

What's next

A 20-word summary gives you a good idea of a paper's direction. However, in complex domains such as computer science, a couple of dozen words is often not enough to distill the information. A possibility for the future might be dynamic N-sentence summarizers.

Making Robots less biased than humans

Researchers in Robotics have committed to actively ensuring fairness in AI-driven solutions

Context

Almost all police robots in use today are straightforward remote-control devices. However, more sophisticated robots are being developed in labs around the world. Increasingly, they use Artificial Intelligence to integrate many more complex and diverse features.

Many researchers find this problematic. In fact, several AI algorithms for facial recognition, predicting people's actions, or nonlethal projectile launching have led to controversy in the past few years. The reason is clear: many of these algorithms are biased against people of color and other minorities. Researchers from Google have argued that the police shouldn't use this type of software. Moreover, some private citizens are now using facial recognition against the police, as mentioned in a previous digest.


What's new

Earlier this year, hundreds of AI and robotics researchers committed to actively changing some practices in their field of work. An open letter from the organization Black in Computing states that "the technologies we help create to benefit society are also disrupting Black communities through the proliferation of racial profiling." A second statement, "No Justice, No Robots", calls on its signers to refuse to work with or for law enforcement.

Researchers in robotics are trained to solve difficult technical problems, not to weigh the societal questions raised by the systems they build. Nevertheless, they have committed themselves to actions aimed at making the creation and use of AI in robotics more just.

Source: Wes Frazer for The New York Times

Why it matters

The adoption of AI systems is growing exponentially. Today there are AI systems built into self-driving cars specifically for detecting pedestrians. A study by Benjamin Wilson and colleagues at Georgia Tech found that eight such systems were significantly worse at detecting people with darker skin tones than those with lighter ones.

As Dr. Jason Borenstein, a public policy researcher at Georgia Tech, puts it: "it is disconcerting that robot peacekeepers, including police and military robots, will, at some point, be given increased freedom to decide whether to take a human life, especially if problems related to bias have not been resolved."

What's next

The root cause of this issue, as Dr. Odest Chadwicke Jenkins of the University of Michigan (one of the main organizers of the open letter mentioned above) states, "is representation in the room — in the research lab, in the classroom, and the development team, the executive board."

In parallel, technical work is underway to mitigate potentially unfair outcomes of AI systems. For instance, Google has developed Model Cards, a standard for documenting AI models, as mentioned in a previous digest that discussed background features in Google Meet. In a Model Card, bias is tested across different geographies, skin tones, and genders. This approach clearly identifies the metrics used and the results found, adding transparency and accountability to machine learning.
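The core of such bias testing is reporting a model's metric broken down by a sensitive attribute rather than as a single aggregate. A minimal sketch, with hypothetical records and groups:

```python
from collections import defaultdict

def per_group_accuracy(records):
    """Accuracy of a classifier broken down by a sensitive attribute.
    Each record is a (group, predicted_label, true_label) triple."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, pred, truth in records:
        total[group] += 1
        correct[group] += int(pred == truth)
    return {g: correct[g] / total[g] for g in total}

# Hypothetical pedestrian-detection outcomes (1 = detected, label 1 = present).
records = [
    ("lighter", 1, 1), ("lighter", 1, 1), ("lighter", 0, 1), ("lighter", 1, 1),
    ("darker", 0, 1), ("darker", 1, 1), ("darker", 0, 1), ("darker", 1, 1),
]
print(per_group_accuracy(records))  # {'lighter': 0.75, 'darker': 0.5}
```

An aggregate accuracy of 62.5% would hide the gap that the per-group breakdown exposes; that gap is exactly what a Model Card is meant to surface.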

Additionally, the market for synthetic datasets is growing rapidly. This methodology, covered in more detail in a previous digest, makes it possible to balance datasets that could otherwise produce unfair outcomes.


Tagged:
Ethics in AI
R&D
Arnaud Dhaene
AI Solution Specialist