Physics of Language Models: Part 4.2
Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers
Paper • 2512.17351
facebook/PhysicsLM4.2__LlamaCanon-8B-Nemo-1T-lr0.003
facebook/PhysicsLM4.2__LlamaCanon-1B-Nemo-1T-lr0.002
facebook/PhysicsLM4.2__LlamaCanon-1B-Nemo-1T-lr0.003
facebook/PhysicsLM4.2__LlamaCanon-1B-Nemo-2T-lr0.003
facebook/PhysicsLM4.2__LlamaCanon-1B-Nemo-2T-lr0.005
facebook/PhysicsLM4.2__LlamaCanon-3B-Nemo-1T-lr0.002
facebook/PhysicsLM4.2__LlamaCanon-3B-Nemo-1T-lr0.003
facebook/PhysicsLM4.2__LlamaCanon-8B-Nemo-1T-lr0.002
facebook/PhysicsLM4.2__Llama-1B-Nemo-1T-lr0.002
facebook/PhysicsLM4.2__Llama-1B-Nemo-1T-lr0.003
facebook/PhysicsLM4.2__Llama-1B-Nemo-2T-lr0.005
facebook/PhysicsLM4.2__Llama-3B-Nemo-1T-lr0.002
facebook/PhysicsLM4.2__Llama-3B-Nemo-1T-lr0.003
facebook/PhysicsLM4.2__Llama-8B-Nemo-1T-lr0.002
facebook/PhysicsLM4.2__Llama-8B-Nemo-1T-lr0.003
facebook/PhysicsLM4.2__Llama-1B-Nemo-2T-lr0.003