
AlphaFold, a revolutionary artificial intelligence (AI) network, predicts the structures of about 200 million proteins from 1 million species

Researchers have used AlphaFold, a revolutionary artificial intelligence (AI) network, to predict the structures of about 200 million proteins from 1 million species, covering nearly every known protein on the planet. The data dump will be freely available in a database created by DeepMind, Google’s London-based artificial intelligence company that developed AlphaFold, and the European Molecular Biology Laboratory’s European Bioinformatics Institute (EMBL-EBI), an intergovernmental organization near Cambridge, UK.

What’s next for AlphaFold and the AI protein folding revolution

“You can basically think of it as covering the entire protein universe,” DeepMind CEO Demis Hassabis said at a press conference. “We are at the beginning of a new era of digital biology.”

The 3D shape, or structure, of a protein determines its function in cells. Most drugs are designed using structural information, and accurate maps are often the first step to discoveries about how proteins work. DeepMind developed the AlphaFold network using an artificial intelligence technique called deep learning. A year ago, the AlphaFold database was launched with 350,000 structure predictions covering nearly every protein made by humans, mice and 19 other widely studied organisms. The catalog has since grown to approximately 1 million records.

“We’re getting ready to release this huge treasure trove,” says Christine Orengo, a computational biologist at University College London who used the AlphaFold database to identify new protein families. “Having all the data predicted for us is just fantastic.”

High-quality structures

The release of AlphaFold last year caused a stir in the life-sciences community, which was eager to use the tool. The network produces highly accurate predictions of the 3D shape, or structure, of proteins. It also provides information about the accuracy of its predictions, so scientists know which ones to rely on. Scientists have traditionally used time-consuming and expensive experimental methods such as X-ray crystallography and cryo-electron microscopy to solve protein structures. According to EMBL-EBI, about 35% of the more than 214 million predictions are considered highly accurate, meaning they are as good as experimentally determined structures. Another 45% are considered confident enough to be relied on for many applications.
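To put those shares into absolute numbers, here is a minimal Python sketch of the arithmetic. It assumes a round total of 214 million predictions (the article says only "more than 214 million"), so the results are rough lower bounds. The tier names and the `tier_counts` helper are illustrative labels, not EMBL-EBI terminology.

```python
# Rough arithmetic on the EMBL-EBI figures quoted above.
# Assumption: exactly 214 million predictions (the article says "more than").
TOTAL_PREDICTIONS = 214_000_000
HIGH_PERCENT = 35        # "highly accurate" share, in percent
CONFIDENT_PERCENT = 45   # "confident enough for many applications" share


def tier_counts(total, high_percent, confident_percent):
    """Return approximate prediction counts per tier (hypothetical tier names)."""
    high = total * high_percent // 100
    confident = total * confident_percent // 100
    remainder = total - high - confident  # everything outside the two tiers
    return {
        "highly_accurate": high,
        "confident": confident,
        "remainder": remainder,
    }


counts = tier_counts(TOTAL_PREDICTIONS, HIGH_PERCENT, CONFIDENT_PERCENT)
for name, n in counts.items():
    print(f"{name}: about {n / 1e6:.1f} million")
```

Under that assumption, roughly 75 million predictions fall in the highly accurate tier and about 96 million in the confident tier. That leaves around 43 million, or 20%, outside both.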

Many AlphaFold structures are good enough to replace experimental structures for some applications. In other cases, scientists use AlphaFold predictions to verify and understand experimental data. Mispredictions are often obvious, and some of them are due to intrinsic disorder in the protein itself, meaning it does not have a defined shape, at least without the presence of other molecules.

The 200 million predictions published today are based on sequences in another database, called UniProt. It is likely that scientists already had an idea of the shape of some of these proteins because they are covered in databases of experimental structures or because they resemble other proteins in such repositories, says Eduard Porta Pardo, a computational biologist at the Josep Carreras Leukemia Research Institute (IJC) in Barcelona. But such records tend to be skewed toward human, mouse, and other mammalian proteins, Porta says, so the AlphaFold dump is likely to add significant knowledge because it draws from a much more diverse range of organisms. “It will be an amazing resource. And I’ll probably download it as soon as it comes out,” says Porta.

Since the AlphaFold software has been available for a year, scientists have already had the capacity to predict the structure of any protein they wish. But many say having the predictions available in a single database will save researchers time, money and faff. “It’s another barrier to entry that you remove,” says Porta. “I’ve used a lot of AlphaFold models. I have never run AlphaFold myself.”

Jan Kosinski, a structural modeler at EMBL Hamburg in Germany, who has been running the AlphaFold network for the past year, can’t wait for the database to expand. His team spent 3 weeks predicting the proteome (the collection of all the proteins of an organism) of a pathogen. “Now we can download all the models,” he said at the briefing.

Twenty-three terabytes

Having almost every known protein in the database will also enable new kinds of studies. Orengo’s team used the AlphaFold database to identify new kinds of protein families, and now they will do so on a much larger scale. Her lab will also use the expanded database to understand the evolution of proteins with useful properties, such as the ability to consume plastic, or worrisome ones, such as those that can drive cancer. Identifying distant relatives of these proteins in the database may reveal the basis for their properties.

Martin Steinegger, a computational biologist at Seoul National University who helped develop the cloud version of AlphaFold, is excited to see the database expand. But he says researchers will likely still have to run the network themselves for some tasks. People are increasingly using AlphaFold to determine how proteins interact, and such predictions are not in the database. Nor does the database include microbial proteins identified by sequencing genetic material from soil, ocean water and other “metagenomic” sources.

Some sophisticated applications of the expanded AlphaFold database may also depend on downloading its entire 23 terabytes of content, which will not be feasible for many teams, Steinegger says. Cloud storage can also be expensive. Steinegger co-developed a software tool called FoldSeek that can quickly find structurally similar proteins and should be able to greatly compress the AlphaFold data.

Although nearly every known protein is included, the AlphaFold database will need updating as new organisms are discovered. AlphaFold predictions may also improve as new structural information becomes available. Hassabis says DeepMind is committed to supporting the database for the long haul, and he could see updates appearing every year. He hopes the availability of the AlphaFold database will have a lasting impact on the life sciences. “It’s going to take a pretty big change in mindset.”
