
Basecamp Research Launches BaseFold: A Breakthrough in 3D Protein Structure Prediction of Large, Complex Protein Structures


BaseFold leverages Basecamp Research's purpose-built foundational dataset to significantly increase prediction accuracy of large, complex protein structures and small molecule interactions; it is up to six times more accurate than AlphaFold2 and offers up to a three-fold improvement in small molecule docking

More reliable 3D structure predictions for larger and more complex proteins are poised to greatly accelerate AI-based drug discovery efforts

LONDON, March 12, 2024 /PRNewswire/ -- Basecamp Research, a world leader in artificial intelligence (AI)-based design of proteins and other biological systems, today announced the launch of BaseFold, its new deep learning model that predicts 3D structures of large, complex proteins more accurately than other AI-powered tools, including the industry gold standard, AlphaFold2. The supporting data were recently posted as a preprint on bioRxiv.

BaseFold was created by augmenting the AlphaFold2 model, which predicts the 3D structure of a protein based on its amino acid sequence, with BaseGraph. BaseGraph is Basecamp Research's purpose-built foundational dataset for biological AI, collected via access and benefit-sharing partnerships with over 25 biodiversity-rich countries. The published accuracy improvements are just a starting point, as BaseFold is continuously improving week over week as Basecamp Research scales its global network of biodiversity partnerships. Furthermore, Basecamp Research will be working with NVIDIA to optimise and productionise BaseFold for NVIDIA BioNeMo, a generative AI platform for drug discovery.

The scientific benchmark for determining protein structure is still time-consuming experimental methods such as X-ray crystallography. However, AlphaFold2's development in 2020 provided a breakthrough for the use of AI across biotechnology, giving scientists confidence in AI-based structural predictions. A wide array of structure prediction models has since followed AlphaFold2, most notably ColabFold, ESMFold, OpenFold and RoseTTAFold.

However, the performance of these models is highly dependent on their training data; all are trained on public protein databases that are widely seen as unfit for biotech's AI era. These public training datasets are small, unreliable and heavily biased toward proteins from laboratory model organisms. The sequence data captured in these public databases is estimated to represent less than 0.000001% of life on Earth. These data limitations mean that existing AI tools work well for predicting the structures of smaller, simpler proteins that are well-represented in public datasets but often struggle beyond that, creating major problems for those using AI to develop complex new medicines.

AlphaFold2 draws heavily from the public MGnify database, known for having issues with incomplete sequences, which can impact the quality of structures predicted for larger proteins. Basecamp Research's BaseFold tackles the next big computational challenge, which is to achieve crystallography-level accuracy for larger, more complex proteins, especially those underrepresented in existing protein sequence databases.

To do this, BaseFold extracts orders of magnitude more meaningful evolutionary information from over 6 billion relationships in BaseGraph. BaseGraph is replete with extensive genomic context and comprehensive metadata, and training algorithms on it has been shown to yield significant advances in the performance of a wide range of biological AI models, including AlphaFold2 as presented here.

In this preprint, Basecamp Research scientists evaluated BaseFold's performance in predicting the structure of various proteins selected from the CASP15 (Critical Assessment of Structure Prediction) competition and CAMEO (Continuous Automated Model EvaluatiOn) community project.

Publication Result Highlights

"We have redesigned and rebuilt the entire data acquisition process, making us the first team ever to collect and annotate biodiversity data with the same quality as human clinical genetic data ? all purpose-built for the AI era," said Dr. Phil Lorenz, CTO of Basecamp Research. "BaseGraph, the most diverse and comprehensive dataset of its kind, is the core driver of our advances in AI. The results of this publication prove that more diverse, representative genomics data allows for step-change algorithm improvements without the need for extensive lab-in-the-loop infrastructure. Our database is growing every week, and as a result, BaseFold is improving every week, too."

"AlphaFold is one of the most useful AI tools in drug discovery, and for good reason. It enables researchers to better predict how medicines may interact with proteins in the body, shaving off years of work. However, AlphaFold still has significant room for improvement ? particularly when being used to predict large, complex and underrepresented proteins, which are often the most critical for the development of new therapeutics. Even just a few percentage points of error can have major implications in accurately predicting protein-molecule interactions," said Dr. Glen Gowers, co-founder of Basecamp Research.

"We know that when it comes to AI, the best data produces the best outcomes, and it's rewarding to know that the new, purpose-built foundational dataset that we have built is already having widespread implications for drug development and human health," Dr. Gowers added. "We're not stopping here, though ? we are continuing to scale our biodiversity partnerships and apply this data advantage across more and more biological AI models."

The full preprint can be found here: https://www.biorxiv.org/content/10.1101/2024.03.06.583325v1

About Basecamp Research

Basecamp Research is a market leader in mapping biodiversity for AI-based design of biological systems. We match and refine novel proteins for our partners' exact industrial, therapeutic or diagnostic applications using BaseGraph™, a new generation of AI design that is powered by the first-ever high-resolution map of global genetic biodiversity.

Understanding the full genetic, evolutionary, and environmental context of each protein allows Basecamp Research to design tailored proteins for specific applications without the need for expensive and time-consuming directed evolution campaigns. We're a team of explorers, scientists and policy experts driven by our ambition to protect and learn from nature's diversity, whilst delivering life-changing breakthroughs to those who need them most. 

For more information, visit www.basecamp-research.com.

For media and other inquiries, please contact [email protected], 07867 488769

Photo - https://mma.prnewswire.com/media/2357306/Basecamp.jpg
Logo - https://mma.prnewswire.com/media/2357382/Basecamp_Research_Logo.jpg

SOURCE Basecamp Research

