Meta AI’s Galactica: A 120 Billion Parameter Language Model For Science

Jim Clyde Monge
4 min read · Dec 2, 2022
Galactica AI: an open-source large language model for science

Meta (Facebook 2.0) has recently released a preview of its brand-new AI model called Galactica. The AI can generate science-related articles, complete with citations. It can also carry out mathematical calculations and provide explanations.

What Is Galactica?

Galactica is a large language model (LLM) for science trained on over 48 million research papers, textbooks, and other sources of scientific knowledge.

Here’s a table that provides a detailed breakdown of the dataset sources.

Table: Galactica total dataset size and breakdown of dataset sources. Source: Galactica research paper.

Having been trained on this much literature, the model can suggest citations and help you find papers related to the one you are reading.

Here’s an example:

A paper that introduced a neural network architecture for recognizing digits
Figure: Citation example from Galactica, showing a suggested citation for this topic.
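For readers who want to try this kind of citation prompt programmatically, here is a minimal sketch of how the prompt text might be built. It assumes the `[START_REF]` token that the Galactica paper describes for opening a citation; the helper name `citation_prompt` is hypothetical and not part of any Galactica library, and the sketch stops short of calling a model.

```python
# Hypothetical helper for illustration only; not part of any Galactica library.
# Assumption: the model was trained with "[START_REF]" as the token that
# opens a citation, as described in the Galactica paper.
START_REF = "[START_REF]"


def citation_prompt(description: str) -> str:
    """Append the citation-opening token to a plain-text description.

    The idea is that the model then continues the text with a reference.
    """
    return f"{description.strip()} {START_REF}"


prompt = citation_prompt(
    "A paper that introduced a neural network architecture for recognizing digits"
)
print(prompt)
```

The resulting string would then be passed to whatever text-generation interface you use to run the model; that step is left out here because it depends on your setup.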

The model also works with scientific terminology, mathematical and chemical formulas, and source code.