---
license: apache-2.0
base_model: allenai/OLMo-3-7B-Instruct
base_model_relation: finetune
tags:
- astronomy
- astrophysics
- physics
- science
- research
- cognitive-architectures
- education
- lora
- OLMo
- conversational
- conversational-ai
- chat
- collaborative-ai
- vanta-research
- text-generation
- LLM
- STEM
- olmo3
- olmo3-7b-instruct
- ai-research
- ai-alignment-research
- ai-alignment
- ai-behavior
- ai-behavior-research
language:
- en
library_name: transformers
pipeline_tag: text-generation
---

<div align="center">

![vanta_trimmed](https://cdn-uploads.huggingface.co/production/uploads/686c460ba3fc457ad14ab6f8/hcGtMtCIizEZG_OuCvfac.png)
  
  <h1>VANTA Research</h1>
    
  <p><strong>Independent AI research lab building safe, resilient language models optimized for human-AI collaboration</strong></p>
  
  <p>
    <a href="https://vantaresearch.xyz"><img src="https://img.shields.io/badge/Website-vantaresearch.xyz-black" alt="Website"/></a>
    <a href="https://unmodeledtyler.com/work-with-vanta-research"><img src="https://img.shields.io/badge/Join%20Us-Research%20Affiliate-black" alt="Join Us"/></a>
    <a href="https://merch.vantaresearch.xyz"><img src="https://img.shields.io/badge/Merch-merch.vantaresearch.xyz-sage" alt="Merch"/></a>
    <a href="https://x.com/vanta_research"><img src="https://img.shields.io/badge/@vanta_research-1DA1F2?logo=x" alt="X"/></a>
    <a href="https://github.com/vanta-research"><img src="https://img.shields.io/badge/GitHub-vanta--research-181717?logo=github" alt="GitHub"/></a>
  </p>
</div>

---

# Atom-Astronomy-7B

Atom-Astronomy-7B is a large language model fine-tuned for astronomy and astrophysics research. Built on OLMo-3-7B-Instruct, it pairs graduate-level astronomical knowledge with efficient inference: on the 10-question benchmark reported in this card, its average response time was about 2.2x shorter than that of Qwen3-8B.

## Model Details

- **Base Model**: allenai/OLMo-3-7B-Instruct
- **Architecture**: Transformer-based decoder (7B parameters)
- **Training Method**: Low-Rank Adaptation (LoRA) with r=16, alpha=32
- **Training Data**: 23,513 astronomy, identity, and collaboration-focused examples across 15 specialized datasets
- **Training Duration**: 2 epochs, 29.3 hours on a consumer GPU
- **License**: Apache 2.0
- **Developed by**: VANTA Research
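
As a rough illustration of what `r=16, alpha=32` means, the sketch below computes the standard LoRA scaling factor (`alpha / r`) and the adapter size for a single weight matrix. The 4096 x 4096 matrix shape is an assumption chosen for illustration and is not taken from the training setup. The set of adapted layers is not listed in this card, so the numbers apply to one hypothetical projection only.

```python
# LoRA hyperparameters from the Model Details list above.
R, ALPHA = 16, 32


def lora_stats(d_out, d_in, r=R, alpha=ALPHA):
    """Summarize a LoRA adapter for one d_out x d_in weight matrix.

    LoRA keeps the original weight W frozen and learns a low-rank update
    (alpha / r) * B @ A, where A is r x d_in and B is d_out x r.
    """
    full_params = d_out * d_in
    adapter_params = r * (d_in + d_out)
    return {
        "scaling": alpha / r,
        "full_params": full_params,
        "adapter_params": adapter_params,
        "trainable_fraction": adapter_params / full_params,
    }


# ASSUMPTION: a square 4096 x 4096 projection, used only as an example shape.
stats = lora_stats(4096, 4096)
print(f"scaling factor: {stats['scaling']}")
print(f"adapter params: {stats['adapter_params']:,} of {stats['full_params']:,}")
print(f"trainable fraction: {stats['trainable_fraction']:.2%}")
```

Under these assumptions the adapter trains well under 1% of the parameters in that matrix, which is consistent with training on a single consumer GPU.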

## Key Features

### Domain Expertise
- Comprehensive coverage of observational astronomy, stellar physics, cosmology, and high-energy astrophysics
- Native LaTeX equation support for mathematical expressions
- Advanced understanding of graduate-level concepts including general relativity, quantum field theory in curved spacetime, and advanced stellar evolution

### Performance Advantages
- **2.23x faster** than Qwen3-8B on the 10-question graduate-level astrophysics benchmark reported below
- **1.67x faster** than base OLMo-3-7B-Instruct on the same benchmark
- **2.60x more concise** than Qwen3-8B (2,032 vs. 5,277 words per response on average) while maintaining technical rigor
- Equations appear in 100% of benchmark responses
- Average response time: 75 seconds per graduate-level problem

### Technical Quality
- Maintains mathematical precision with proper notation and units
- Provides detailed derivations when appropriate
- Balances theoretical depth with practical interpretation
- Consistent use of astronomical nomenclature and conventions

## Training Data

The model was trained on a carefully curated dataset comprising:

1. **Astronomy Fundamentals** 
   - Observational techniques and instrumentation
   - Coordinate systems and celestial mechanics
   - Photometry and spectroscopy

2. **Stellar Physics** 
   - Stellar structure and evolution
   - Nucleosynthesis and energy generation
   - Compact objects and endpoints

3. **Cosmology** 
   - Large-scale structure formation
   - Dark matter and dark energy
   - CMB physics and early universe

4. **High-Energy Astrophysics** 
   - Black hole physics and accretion
   - Relativistic jets and gamma-ray bursts
   - Neutron stars and pulsars

5. **Galactic and Extragalactic Astronomy** 
   - Galaxy formation and evolution
   - Active galactic nuclei
   - Interstellar medium

6. **Computational and Observational Methods** 
   - Data analysis techniques
   - Numerical methods in astrophysics
   - Telescope systems and surveys

7. **Specialized Topics** 
   - Exoplanets and planetary systems
   - Astrobiology considerations
   - Multi-messenger astronomy
   - Gravitational wave astronomy

## Benchmark Performance

### Hard Graduate-Level Astrophysics Evaluation

The benchmark consists of 10 questions covering advanced topics, including:
- Eddington luminosity and super-Eddington accretion
- Tolman-Oppenheimer-Volkoff equation derivations
- Cosmological inflation and CMB physics
- Relativistic beaming in gamma-ray bursts
- Stellar nucleosynthesis (pp-chain and CNO cycle)
- Cosmological recombination and Saha equation
- Black hole orbital dynamics and ISCO calculations
- Penrose process and Blandford-Znajek mechanism
- Type Ia supernovae as standard candles
- Hawking radiation and black hole thermodynamics

**Results:**

| Model | Avg Response Time | Total Time | Avg Words | Equation Usage | Calculation Rate |
|-------|------------------|------------|-----------|----------------|------------------|
| **Atom-Astronomy-7B** | **75.2s** | **12.5 min** | **2,032** | **100%** | **100%** |
| OLMo-3-7B-Instruct | 125.2s | 20.9 min | 3,396 | 100% | 100% |
| Qwen3-8B | 168.0s | 28.0 min | 5,277 | 100% | 100% |

**Key Findings:**
- 2.23x faster than Qwen3-8B
- 1.67x faster than base OLMo-3-7B
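- Maintains 100% equation usage and calculation rates (these columns measure whether equations and calculations appear, not whether they are correct)
- Delivers concise, focused responses without sacrificing depth
- Total benchmark time is about 40% lower than base OLMo-3-7B-Instruct (12.5 vs. 20.9 min) and about 55% lower than Qwen3-8B (12.5 vs. 28.0 min)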
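
The ratios in the findings can be recomputed from the results table. The sketch below does that arithmetic using only the rounded averages shown in the table, so the last digit may differ slightly from figures derived from unrounded timings. The function and dictionary names are illustrative.

```python
# Average response time (seconds) and average words, copied from the results table.
RESULTS = {
    "Atom-Astronomy-7B": {"avg_time_s": 75.2, "avg_words": 2032},
    "OLMo-3-7B-Instruct": {"avg_time_s": 125.2, "avg_words": 3396},
    "Qwen3-8B": {"avg_time_s": 168.0, "avg_words": 5277},
}


def compare(baseline, model="Atom-Astronomy-7B"):
    """Return speed and length ratios of `model` relative to `baseline`."""
    b, m = RESULTS[baseline], RESULTS[model]
    return {
        # How many times longer the baseline takes per question.
        "speedup": b["avg_time_s"] / m["avg_time_s"],
        # How many times more words the baseline writes per answer.
        "conciseness": b["avg_words"] / m["avg_words"],
        # Fraction of the baseline's time that is saved.
        "time_reduction": 1 - m["avg_time_s"] / b["avg_time_s"],
    }


for baseline in ("OLMo-3-7B-Instruct", "Qwen3-8B"):
    c = compare(baseline)
    print(
        f"vs {baseline}: {c['speedup']:.2f}x faster, "
        f"{c['conciseness']:.2f}x more concise, "
        f"{c['time_reduction']:.0%} less time"
    )
```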

### AstroBench Professional MCQ Evaluation

**Status**: Evaluation in progress

This model is currently undergoing comprehensive evaluation on the AstroBench_MCQ_v1_Public dataset, a professional-grade multiple-choice question benchmark derived from the Annual Review of Astronomy and Astrophysics. The dataset contains 3,846 expert-level questions covering the full breadth of modern astronomy research.

**Preliminary Observations:**
- 90% answer extraction rate (18/20 in initial test)
- 5.43s average response time per question
- Maintains technical reasoning quality with proper elimination of incorrect options
- Shows appropriate caution by not forcing answers when uncertain
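
For readers unfamiliar with the term, "answer extraction rate" is the share of responses from which a final option letter can be parsed. The sketch below shows one way such a rate could be computed. The regular expression and function names are hypothetical and are not the evaluation harness used for the numbers above.

```python
import re

# Hypothetical pattern: matches phrasings such as "Answer: B" or "the answer is (C)".
# Only the keyword part is case-insensitive, so a lowercase article ("is a ...")
# is not mistaken for option A.
ANSWER_RE = re.compile(r"(?i:answer\s*(?:is|:)?)\s*\(?([A-D])\)?\b")


def extract_answer(response):
    """Return the last stated option letter, or None if no answer is given."""
    matches = ANSWER_RE.findall(response)
    return matches[-1] if matches else None


def extraction_rate(responses):
    """Fraction of responses from which an option letter could be parsed."""
    extracted = [extract_answer(r) for r in responses]
    return sum(a is not None for a in extracted) / len(responses)


sample = [
    "Eliminating A and D, the answer is (C).",
    "Answer: B",
    "I cannot determine this from the options given.",
]
print([extract_answer(r) for r in sample])
print(f"extraction rate: {extraction_rate(sample):.0%}")
```

A response with no parseable letter counts against the extraction rate rather than as a wrong answer, which is how a model that declines to guess would show up in this metric.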

Full results will be published upon completion of the comprehensive evaluation. The model card will be updated with detailed accuracy metrics and comparative analysis.

## Intended Use

### Primary Applications
- Graduate-level astronomy education and tutoring
- Research literature comprehension and summarization
- Rapid calculation verification and derivation assistance
- Conceptual explanation of complex astrophysical phenomena
- Preparation of technical documentation and proposals

### Recommended Use Cases
- Researchers requiring quick answers to technical astronomy questions
- Educators developing curriculum materials and problem sets
- Students studying advanced astrophysics coursework
- Scientific writers needing accurate technical content
- Data analysts working with astronomical datasets

### Out of Scope
- Real-time observational data processing (use specialized pipelines)
- Production-level numerical simulations (use dedicated simulation codes)
- Medical or legal advice
- Financial or investment guidance

## Usage

### Basic Inference

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "vanta-research/atom-astronomy-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)

prompt = """Explain the Tolman-Oppenheimer-Volkoff equation and how it differs from 
standard hydrostatic equilibrium. What does this tell us about neutron star structure?"""

messages = [
    {"role": "system", "content": "You are Atom, a helpful AI assistant specialized in astronomy and astrophysics."},
    {"role": "user", "content": prompt}
]

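# add_generation_prompt=True appends the assistant turn marker so the model answers
# rather than continuing the user message; return_dict=True also returns the attention mask.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# temperature and top_p only take effect when sampling is enabled.
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.7, top_p=0.9)

# Decode only the newly generated tokens, not the echoed prompt.
prompt_length = inputs["input_ids"].shape[-1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)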
```

### Quantized Inference (GGUF)

For efficient local deployment, quantized GGUF versions are available:

```bash
# Using Ollama (requires a Modelfile whose FROM line points to the downloaded GGUF file)
ollama create atom-astronomy:7b -f Modelfile

# Query the model
ollama run atom-astronomy:7b "Calculate the Schwarzschild radius for a 10 solar mass black hole"
```
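
Since model outputs may need checking against reference values, the example query above is easy to verify by hand. The sketch below evaluates the Schwarzschild radius $r_s = 2GM/c^2$ for a 10 solar mass black hole. The constants are standard published values, with the solar mass approximate; the function name is illustrative.

```python
G = 6.67430e-11      # gravitational constant, m^3 kg^-1 s^-2 (CODATA 2018)
C = 299_792_458.0    # speed of light, m/s (exact by definition)
M_SUN = 1.98847e30   # solar mass, kg (approximate)


def schwarzschild_radius_m(mass_solar):
    """Schwarzschild radius in meters for a mass given in solar masses."""
    return 2.0 * G * mass_solar * M_SUN / C**2


radius_km = schwarzschild_radius_m(10) / 1e3
print(f"Schwarzschild radius for 10 solar masses: {radius_km:.1f} km")
```

The result is roughly 29.5 km, or about 2.95 km per solar mass, which gives a quick reference point for judging the model's answer.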

## Limitations

### Known Constraints
- Primarily trained on English-language astronomy content
- Knowledge cutoff based on training data (not continuously updated)
- May occasionally produce overly concise responses for pedagogical contexts
- Limited training on observational data reduction techniques
- Astronomical constants and measurements may require verification against latest standards

### Model Behavior
- Optimized for technical accuracy over verbosity
- Assumes reader familiarity with undergraduate physics
- May not provide extensive motivational context compared to base model
- Better suited for expert users than complete beginners

## Bias and Safety Considerations

### Training Data Bias
- Dataset reflects historical emphasis on optical/radio astronomy
- May underrepresent emerging fields like multi-messenger astronomy
- Training data primarily from Western academic institutions
- Limited coverage of cultural astronomy and historical perspectives

### Safety Measures
- Maintains Apache 2.0 open-source license
- No training on personal or proprietary data
- Inherits safety alignments from base OLMo-3 model
- Recommended for use within appropriate scientific contexts

## Model Card Authors

VANTA Research

## Citation

If you use Atom-Astronomy-7B in your research, please cite:

```bibtex
@misc{atom-astronomy-7b,
  title={Atom-Astronomy-7B: A Specialized Language Model for Astronomy and Astrophysics},
  author={VANTA Research},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/vanta-research/atom-astronomy-7b}}
}
```

Please also cite the base model:

```bibtex
@article{olmo3,
  title={OLMo 3: Open Language Model},
  author={Allen Institute for AI},
  year={2024},
}
```

## Acknowledgments

This model builds upon the excellent work of the Allen Institute for AI in developing the OLMo series of open language models. We thank the astronomy and astrophysics community for developing the open-source educational materials and research papers that informed our training data curation.

## Contact

For questions, issues, or collaboration inquiries, please contact:
- Email: hello@vantaresearch.xyz