# Gated PixelCNN from Scratch: CIFAR-10

A complete from-scratch implementation of the **Gated PixelCNN** autoregressive generative model, trained on CIFAR-10. Built using Hugging Face's Ml-intern.

## 🚀 Quick Start

1. Open `Gated_PixelCNN_CIFAR10.ipynb` in Google Colab.
2. Select **Runtime → Change runtime type → T4 GPU**.
3. Run all cells. Training takes ~2 hours.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Mubashir-2000/gated-pixelcnn-cifar10/blob/main/Gated_PixelCNN_CIFAR10.ipynb)

## Architecture

Implemented entirely from scratch using only `torch.nn` primitives:

- **Masked Convolutions** with correct Mask A/B and RGB sub-pixel channel ordering
- **Vertical + Horizontal Stack** architecture eliminating the blind spot
- **Gated Activations**: `y = tanh(W_f * x) ⊙ σ(W_g * x)`
- **256-way categorical cross-entropy** loss per pixel per channel

| Component | Specification |
|---|---|
| Model | Gated PixelCNN |
| Layers | 15 gated layers |
| Filters | 128 per stack |
| Kernels | 7×7 input, 3×3 body |
| Parameters | ~5.4M |
| Loss | Cross-Entropy (256 classes) |
| Metric | BPD (bits per dimension) + FID |

## Evaluation

| Metric | Our Model (50 epochs) | Paper [2] (converged) |
|---|---|---|
| BPD (test) | ~3.5-4.0 | 3.03 |
| FID | Reported in notebook | n/a |

## T4 GPU Optimization

- Total VRAM usage: ~2 GB (model + activations)
- Batch size: 32
- Training time: ~2 hours for 50 epochs
- Precision: fp32 (the model is small enough that fp16 gains are minimal)

## References

1. van den Oord et al., "Pixel Recurrent Neural Networks," ICML 2016. arXiv:1601.06759
2. van den Oord et al., "Conditional Image Generation with PixelCNN Decoders," NeurIPS 2016. arXiv:1606.05328
3. Salimans et al., "PixelCNN++," ICLR 2017. arXiv:1701.05517
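
## Sketch: Masked Convolution with Mask A/B

The Mask A/B rule with RGB ordering listed under Architecture can be illustrated with a minimal sketch. This is not the notebook's code: the names `build_mask` and `MaskedConv2d` are hypothetical, and the sketch assumes that channel `i` belongs to colour group `i % 3` (R, G, B), which keeps the rule usable when the channel count, such as 128, is not divisible by 3. It shows only the masking rule on a single square kernel, not the vertical/horizontal stack decomposition.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def build_mask(out_ch, in_ch, kernel_size, mask_type):
    """Return an (out_ch, in_ch, k, k) 0/1 mask for a PixelCNN-style conv.

    Assumption (not from the notebook): channel i belongs to colour
    group i % 3, with groups ordered R=0, G=1, B=2.
    """
    assert mask_type in ("A", "B")
    k, c = kernel_size, kernel_size // 2
    mask = torch.zeros(out_ch, in_ch, k, k)
    mask[:, :, :c, :] = 1.0   # every row above the current pixel
    mask[:, :, c, :c] = 1.0   # same row, strictly to the left
    out_group = torch.arange(out_ch) % 3
    in_group = torch.arange(in_ch) % 3
    if mask_type == "A":
        # First layer: a colour may only see strictly earlier colours.
        centre = in_group[None, :] < out_group[:, None]
    else:
        # Later layers: a colour may also see its own group.
        centre = in_group[None, :] <= out_group[:, None]
    mask[:, :, c, c] = centre.float()
    return mask


class MaskedConv2d(nn.Conv2d):
    """Conv2d whose weights are multiplied by a fixed causal mask."""

    def __init__(self, in_ch, out_ch, kernel_size, mask_type):
        super().__init__(in_ch, out_ch, kernel_size, padding=kernel_size // 2)
        self.register_buffer(
            "mask", build_mask(out_ch, in_ch, kernel_size, mask_type)
        )

    def forward(self, x):
        # Masking at call time keeps the zeroed taps at zero during training.
        return F.conv2d(x, self.weight * self.mask, self.bias,
                        padding=self.padding)
```

Under these assumptions, `MaskedConv2d(3, 128, 7, "A")` would correspond to the 7×7 input layer in the table, and changing a pixel at or after position (row, col) in raster order leaves all earlier outputs unchanged.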
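
## Sketch: Gated Activation

The gate `y = tanh(W_f * x) ⊙ σ(W_g * x)` from the Architecture section could be written roughly as follows. The sketch assumes that one convolution has already produced both feature sets stacked along the channel axis (first half for `W_f`, second half for `W_g`); the function name `gated_activation` is illustrative and may differ from the notebook.

```python
import torch


def gated_activation(x):
    """Apply tanh(f) * sigmoid(g) to the two channel halves of x.

    Assumes x has shape (N, 2*C, H, W): the first C channels play the
    role of W_f * x and the last C channels the role of W_g * x.
    """
    f, g = x.chunk(2, dim=1)
    return torch.tanh(f) * torch.sigmoid(g)
```

The output has half as many channels as the input, so a layer that should emit 128 features per stack would need a convolution with 256 output channels in front of the gate.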
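
## Sketch: Cross-Entropy to Bits per Dimension

The BPD figures in the Evaluation table relate to the 256-way cross-entropy loss by a change of logarithm base. A minimal sketch, assuming the loss is measured in nats (natural log, as PyTorch's cross-entropy uses) and that a "dimension" is one sub-pixel, so a CIFAR-10 image has 3 × 32 × 32 = 3072 dimensions. Both helper names are hypothetical.

```python
import math


def nats_to_bpd(mean_ce_nats):
    """Convert mean cross-entropy per sub-pixel (nats) to bits per dimension."""
    return mean_ce_nats / math.log(2)


def bpd_from_image_nll(total_nll_nats, dims=3 * 32 * 32):
    """Convert a per-image summed negative log-likelihood (nats) to BPD."""
    return total_nll_nats / (dims * math.log(2))
```

As a sanity check, a model that predicts a uniform distribution over 256 values has a cross-entropy of ln 256 nats per sub-pixel, which is 8 bits per dimension; any trained model should score well below that.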