GEN, STAT cite CZI datasets and models as advancements toward AI-based virtual cell models

A look inside CZ CELLxGENE: each color shows a different tissue type from more than 800,000 human organoid cells, helping power AI models like TranscriptFormer.

As the scientific community progresses toward AI-based virtual cell models, openly sharing datasets and models is essential for the field to advance. Two articles in GEN and STAT cover the announcement of a publicly available Perturb-seq dataset from AI drug developer Xaira Therapeutics, co-founded by Nobel laureate and CZI grantee Dr. David Baker. The articles also credit CZI as a leader in openly sharing observational data through its CZ CELLxGENE platform. Along with other publicly available datasets, CZ CELLxGENE data was used to train TranscriptFormer, a cross-species generative AI model built by CZI that further expands the field’s capacity to understand and simulate biology. The articles also mention CZI’s Billion Cells Project, an effort to generate an unprecedented one-billion-cell dataset to fuel rapid progress in AI model development for biology.

With CZI, Xaira and others opening up large-scale, high-quality datasets and tools, the research community is closer than ever to understanding how cells behave and how to treat disease at the cellular level.

###

About the Chan Zuckerberg Initiative
The Chan Zuckerberg Initiative was founded in 2015 to help solve some of society’s toughest challenges — from eradicating disease and improving education, to addressing the needs of our local communities. Our mission is to build a better future for everyone. For more information, please visit chanzuckerberg.com.
