Meta AI’s Galactica: A 120 Billion Parameter Language Model For Science

Jim Clyde Monge
4 min read · Dec 2, 2022
Galactica AI. Open source large language model for science

Meta (Facebook 2.0) has recently released a preview of its brand-new AI model called Galactica. The AI can generate science-related articles, complete with citations. It can also carry out mathematical calculations and provide explanations.

What Is Galactica?

Galactica is a large language model (LLM) for science trained on over 48 million research papers, textbooks, and other sources of scientific knowledge.

Here’s a table that provides a detailed breakdown of the dataset sources.

[Table image: breakdown of Galactica's total dataset size by source. Source: Galactica research paper]

With this much information, the AI can suggest citations and help find papers related to the one you are reading.

Here’s an example:

A paper that introduced a neural network architecture for recognizing digits
[Image: citation example from Galactica for this prompt]

The model also works with scientific terms, mathematical and chemical formulas, and source code.

For example, you can ask the AI to explain a mathematical formula in plain English.

[Image: mathematical explanation example from Galactica]

This is incredible.

I can imagine how useful this tool would be for students and researchers.

What Are LLMs?

Large language models (LLMs) are AI programs that can read, summarize, and translate text. They also predict which words are likely to come next in a sentence, which lets them generate sentences that sound natural.
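To make the "guess the next word" idea concrete, here is a toy sketch in Python. It is not how Galactica works internally: real LLMs use large neural networks trained on huge amounts of text, while this example only counts which word most often follows another in a tiny sample sentence. The function names (train_bigram, predict_next) and the sample corpus are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical sample text, chosen only for this illustration.
corpus = "the model reads the paper and the model writes the summary"
model = train_bigram(corpus)

print(predict_next(model, "the"))      # most common word after "the"
print(predict_next(model, "summary"))  # nothing ever follows "summary" here
```

In this sample, "model" follows "the" more often than any other word, so that is the prediction. An LLM does something similar in spirit, but it weighs the whole preceding context rather than just the last word.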

Galactica Is Open Source

Unlike other LLMs that are accessible only via paid APIs, Galactica is open-source.

We believe models should be open source and so we open source the model…

