AI models
We develop frontier AI models trained on large-scale biological datasets to understand and model life from molecules to cells and tissues.
A world model of protein biology
We’re making it possible for the scientific community to explore, understand, discover, and design proteins through an open-source engine for biological discovery and engineering.

ESMC
A protein language model that has learned the rules of protein biology from training on billions of protein sequences.

ESMFold2
A state-of-the-art model for protein structure prediction and design that defines a new frontier for speed and accuracy.

ESM Atlas
The largest atlas of protein structure and function with 6.8 billion proteins covering the breadth of life’s biodiversity.
Data
Our mission at Biohub is to cure and prevent disease, and we’re building the technologies and datasets that will allow the entire scientific community to reach that goal faster. We generate large-scale biological data that spans model systems and organisms, experimental and observational methods, and diverse cellular states, and we make these data openly available to help scientists accelerate discoveries. Visit our ecosystem of datasets and tools at biohub.ai.
Biohub launches the Virtual Biology Initiative
The landmark initiative will galvanize a global effort of leading institutions and consortia to create the technologies and multi-modal datasets needed to build predictive models of the human cell to accelerate the cure and prevention of all disease.

CELL×GENE
An interactive data explorer for single-cell datasets that uses modern web development techniques to enable fast visualization and exploration of at least 1 million cells.

CryoET Data Portal
A cloud-based, open-source portal aimed at driving the development of automated annotations of cryoET datasets and shortening data processing time from months or years to weeks.