# Gated PixelCNN from Scratch — CIFAR-10

A complete from-scratch implementation of the **Gated PixelCNN** autoregressive generative model, trained on CIFAR-10.
Built using Hugging Face's ml-intern.

## 🚀 Quick Start

1. Open `Gated_PixelCNN_CIFAR10.ipynb` in Google Colab
2. Select **Runtime → Change runtime type → T4 GPU**
3. Run all cells. Training takes ~2 hours for 50 epochs on a T4.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Mubashir-2000/gated-pixelcnn-cifar10/blob/main/Gated_PixelCNN_CIFAR10.ipynb)

## Architecture

Implemented entirely from scratch using only `torch.nn` primitives:

- **Masked Convolutions** with correct Mask A/B and RGB sub-pixel channel ordering
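- **Vertical + Horizontal Stack** architecture that eliminates the receptive-field blind spot of the original PixelCNN
- **Gated Activations**: `y = tanh(W_f * x) ⊙ σ(W_g * x)`, where `*` is a convolution and `⊙` an element-wise product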
- **256-way categorical cross-entropy** loss per pixel per channel
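
As a rough illustration of the masked-convolution and gated-activation bullets, the sketch below shows one way to build them in PyTorch. It is not the notebook's code: the names `MaskedConv2d` and `GatedMaskedConv`, and the choice to assign channels to R/G/B groups by `index % 3`, are assumptions made for this example. It also leaves out the vertical/horizontal split, so stacking these layers alone would still have the blind spot.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskedConv2d(nn.Conv2d):
    """Conv2d whose kernel is zeroed wherever it would see "future" inputs."""

    def __init__(self, mask_type, in_channels, out_channels, kernel_size):
        if mask_type not in ("A", "B"):
            raise ValueError("mask_type must be 'A' or 'B'")
        if kernel_size % 2 == 0:
            raise ValueError("kernel_size must be odd")
        super().__init__(in_channels, out_channels, kernel_size,
                         padding=kernel_size // 2)
        c = kernel_size // 2
        mask = torch.zeros(out_channels, in_channels, kernel_size, kernel_size)
        mask[:, :, :c, :] = 1.0   # every row above the current pixel
        mask[:, :, c, :c] = 1.0   # same row, strictly to the left

        # Assumed grouping: channel i belongs to colour group i % 3 (R, G, B).
        out_group = torch.arange(out_channels) % 3
        in_group = torch.arange(in_channels) % 3
        if mask_type == "A":      # centre: only strictly earlier colours
            visible = in_group[None, :] < out_group[:, None]
        else:                     # mask B: earlier colours and the same colour
            visible = in_group[None, :] <= out_group[:, None]
        mask[:, :, c, c] = visible.float()
        self.register_buffer("mask", mask)

    def forward(self, x):
        # Mask on every call so masked weights never contribute.
        return F.conv2d(x, self.weight * self.mask, self.bias,
                        self.stride, self.padding)


class GatedMaskedConv(nn.Module):
    """y = tanh(W_f * x) ⊙ σ(W_g * x) with two mask-B convolutions."""

    def __init__(self, channels, kernel_size=3):
        super().__init__()
        self.conv_f = MaskedConv2d("B", channels, channels, kernel_size)
        self.conv_g = MaskedConv2d("B", channels, channels, kernel_size)

    def forward(self, x):
        return torch.tanh(self.conv_f(x)) * torch.sigmoid(self.conv_g(x))
```

Using two separate convolutions for `W_f` and `W_g` keeps feature and gate channel `i` in the same colour group, which a naive split of one double-width convolution would not guarantee under the `index % 3` grouping.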

| Component | Specification |
|---|---|
| Model | Gated PixelCNN |
| Layers | 15 gated layers |
| Filters | 128 per stack |
| Kernels | 7×7 input, 3×3 body |
| Parameters | ~5.4M |
| Loss | Cross-Entropy (256 classes) |
| Metrics | Bits per dimension (BPD), Fréchet Inception Distance (FID) |
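
To make the loss row concrete, here is a sketch of a 256-way cross-entropy taken over every pixel and channel. It assumes the network emits logits of shape `(batch, 256, 3, H, W)` and that images arrive as floats in `[0, 1]`; the function name and tensor layout are illustrative assumptions, and the notebook may arrange them differently.

```python
import torch
import torch.nn.functional as F


def pixel_cross_entropy(logits, images):
    """Mean negative log-likelihood in nats per sub-pixel.

    logits: (B, 256, C, H, W) unnormalised scores for intensities 0..255
    images: (B, C, H, W) floats in [0, 1]
    """
    targets = (images * 255.0).round().long()  # recover integer class labels
    # cross_entropy treats dim 1 as the class axis and averages over the rest.
    return F.cross_entropy(logits, targets)


# All-zero logits give a uniform prediction, so the loss is ln(256) ≈ 5.545 nats.
logits = torch.zeros(4, 256, 3, 32, 32)
images = torch.rand(4, 3, 32, 32)
loss = pixel_cross_entropy(logits, images)
```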

## Evaluation

| Metric | Our Model (50 epochs) | Paper [2] (converged) |
|---|---|---|
| BPD (test) | ~3.5–4.0 | 3.03 |
| FID | Reported in notebook | — |
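
Bits per dimension is the average negative log-likelihood per sub-pixel expressed in base 2, so a cross-entropy measured in nats (PyTorch's default) is divided by ln 2. A minimal sketch of the conversion, with a hypothetical function name:

```python
import math


def nats_to_bpd(nll_nats_per_dim):
    """Convert a mean per-sub-pixel NLL from nats to bits per dimension."""
    return nll_nats_per_dim / math.log(2)


# A uniform guess over 256 intensities costs 8 BPD, the same as raw 8-bit storage.
uniform_bpd = nats_to_bpd(math.log(256))

# The paper's 3.03 BPD corresponds to a mean cross-entropy of about 2.10 nats.
paper_nats = 3.03 * math.log(2)
```

Any BPD below 8 means the model compresses CIFAR-10 pixels better than storing them directly.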

## T4 GPU Optimization

- Total VRAM usage: ~2 GB (model + activations)
- Batch size: 32
- Training time: ~2 hours for 50 epochs
- Precision: fp32 (the model is small enough that fp16 gives minimal gains)
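
A back-of-the-envelope check on the memory figure, using only the ~5.4M parameter count from the architecture table. The optimizer is not stated here, so the two extra Adam-style buffers per parameter are an assumption:

```python
PARAMS = 5.4e6      # approximate parameter count from the architecture table
BYTES_FP32 = 4      # bytes per fp32 value

weights_mb = PARAMS * BYTES_FP32 / 1e6
# Weights + gradients + two optimizer moment buffers (Adam-style, assumed).
training_state_mb = 4 * weights_mb
```

Under these assumptions the parameter-related state is about 86 MB, so most of the ~2 GB goes to activations and framework overhead rather than weights.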

## References

1. van den Oord et al., "Pixel Recurrent Neural Networks," ICML 2016. arXiv:1601.06759
2. van den Oord et al., "Conditional Image Generation with PixelCNN Decoders," NeurIPS 2016. arXiv:1606.05328
3. Salimans et al., "PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications," ICLR 2017. arXiv:1701.05517