{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 00 — Smoke Test\n", "\n", "**Purpose:** verify that everything in our environment works before we touch any real data.\n", "\n", "Run every cell top-to-bottom. If all green checkmarks ✅ appear at the end, Phase 1 is complete.\n", "\n", "**What this notebook does:**\n", "1. Verifies a GPU is available\n", "2. Mounts Google Drive and creates folder structure\n", "3. Clones the project repo from GitHub\n", "4. Installs pinned dependencies\n", "5. Verifies imports and library versions\n", "6. Loads Wav2Vec 2.0 from HuggingFace\n", "7. Runs one forward pass on random audio\n", "8. Connects to Weights & Biases\n", "9. Prints environment summary for documentation\n", "\n", "**Estimated time:** 5-10 minutes the first time (mostly pip install)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 1 — GPU check\n", "\n", "Before anything else: are we on a GPU? If this cell shows \"No GPU\", go to **Runtime → Change runtime type → Hardware accelerator → GPU** and re-run.\n", "\n", "**What you want to see:** Tesla T4, V100, or A100. With Colab Pro you'll usually get T4 (most common) or sometimes V100. A100 is rare on Pro." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import subprocess\n", "\n", "try:\n", "    result = subprocess.run(['nvidia-smi'], capture_output=True, text=True)\n", "except FileNotFoundError:\n", "    # On a CPU-only runtime the nvidia-smi binary may be missing entirely,\n", "    # which raises FileNotFoundError instead of returning a non-zero exit code.\n", "    result = None\n", "\n", "if result is not None and result.returncode == 0:\n", "    print(result.stdout)\n", "else:\n", "    print('❌ No GPU detected. Go to Runtime → Change runtime type → GPU')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 2 — Mount Google Drive\n", "\n", "Colab gives you a temporary disk that wipes between sessions. We need persistent storage for the ~25 GB of ASVspoof data and our model checkpoints. Drive is the answer.\n", "\n", "**You'll be prompted** to authorize access. Click the link, choose your Google account, copy the verification code, and paste it back. This happens once per session." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from google.colab import drive\n", "drive.mount('/content/drive')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now create the folder structure on Drive. We do this once; it'll persist across sessions." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "DRIVE_ROOT = '/content/drive/MyDrive/deepfake_audio'\n", "\n", "subdirs = [\n", " 'data/raw/asvspoof_2019',\n", " 'data/raw/asvspoof_2021',\n", " 'data/raw/wavefake',\n", " 'data/processed',\n", " 'checkpoints',\n", " 'logs',\n", "]\n", "\n", "for sub in subdirs:\n", " path = os.path.join(DRIVE_ROOT, sub)\n", " os.makedirs(path, exist_ok=True)\n", " print(f' ✅ {path}')\n", "\n", "print('\\n✅ Drive folder structure created.')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 3 — Clone the GitHub repo\n", "\n", "Pull our project code into Colab's local disk. We work out of `/content/deepfake-audio-detection/` for code, and read/write data via `/content/drive/MyDrive/deepfake_audio/`.\n", "\n", "**⚠️ Replace `YOUR_USERNAME` below with your actual GitHub username before running.**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# --- EDIT THIS LINE ---\n", "REPO_URL = 'https://github.com/YOUR_USERNAME/deepfake-audio-detection.git'\n", "# ----------------------\n", "\n", "%cd /content\n", "!if [ ! -d 'deepfake-audio-detection' ]; then git clone {REPO_URL}; else echo 'Repo already cloned'; fi\n", "%cd /content/deepfake-audio-detection\n", "!ls -la" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 4 — Install dependencies\n", "\n", "We pin exact versions in `requirements.txt`. This protects us from \"it worked yesterday\" disasters when libraries silently update.\n", "\n", "**Note:** after install, Colab may suggest restarting the runtime. 
Restart **once**, re-run the Step 2 cells (they are safe to repeat) so that `DRIVE_ROOT` is defined again, since a restart clears all Python variables, then continue from Step 5. Don't restart again later, because that wipes the loaded model." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install -q -r requirements.txt\n", "print('\\n✅ Dependencies installed. If Colab shows a restart prompt, restart NOW, re-run Step 2, then resume from Step 5.')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 5 — Verify imports & versions\n", "\n", "Confirm the libraries actually loaded with the versions we expect." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import sys\n", "import torch\n", "import torchaudio\n", "import transformers\n", "import numpy as np\n", "import pandas as pd\n", "import librosa\n", "import sklearn\n", "\n", "print(f'Python: {sys.version.split()[0]}')\n", "print(f'PyTorch: {torch.__version__}')\n", "print(f'torchaudio: {torchaudio.__version__}')\n", "print(f'transformers: {transformers.__version__}')\n", "print(f'numpy: {np.__version__}')\n", "print(f'pandas: {pd.__version__}')\n", "print(f'librosa: {librosa.__version__}')\n", "print(f'sklearn: {sklearn.__version__}')\n", "\n", "print(f'\\nCUDA available: {torch.cuda.is_available()}')\n", "if torch.cuda.is_available():\n", "    print(f'CUDA version: {torch.version.cuda}')\n", "    print(f'GPU name: {torch.cuda.get_device_name(0)}')\n", "    props = torch.cuda.get_device_properties(0)\n", "    print(f'GPU memory: {props.total_memory / 1e9:.2f} GB')\n", "    print(f'Compute capability: {props.major}.{props.minor}')\n", "\n", "print('\\n✅ All imports succeeded.')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 6 — Load Wav2Vec 2.0\n", "\n", "This is the core test. We load the pretrained Wav2Vec 2.0 Base model from HuggingFace. 
The first run downloads ~360 MB; subsequent runs use a cached copy.\n", "\n", "**What's happening under the hood:** HuggingFace fetches the model weights and configuration, instantiates a PyTorch model, and we move it to the GPU." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from transformers import Wav2Vec2Model, Wav2Vec2FeatureExtractor\n", "\n", "MODEL_NAME = 'facebook/wav2vec2-base'\n", "\n", "print(f'Loading {MODEL_NAME}...')\n", "feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(MODEL_NAME)\n", "model = Wav2Vec2Model.from_pretrained(MODEL_NAME)\n", "\n", "device = 'cuda' if torch.cuda.is_available() else 'cpu'\n", "model = model.to(device)\n", "model.eval()\n", "\n", "n_params = sum(p.numel() for p in model.parameters())\n", "print(f'\\n✅ Model loaded on {device}.')\n", "print(f' Total parameters: {n_params:,} (~{n_params/1e6:.1f}M)')\n", "print(f' Number of transformer layers: {model.config.num_hidden_layers}')\n", "print(f' Hidden size: {model.config.hidden_size}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 7 — Forward pass on random audio\n", "\n", "Generate 4 seconds of random audio (just noise — we're testing plumbing, not real prediction). Push it through the model. 
Check that the output shape makes sense.\n", "\n", "**What you want to see:**\n", "- Input shape: `(1, 64000)` — 1 batch × 4 seconds × 16,000 samples/sec\n", "- Output shape: `(1, 199, 768)` — 1 batch × 199 time frames × 768 hidden dimensions\n", "- Each output frame covers a 400-sample window and the feature encoder advances 320 samples (20 ms) per frame, so we expect floor((64000 - 400) / 320) + 1 = 199 frames" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "SAMPLE_RATE = 16000\n", "DURATION_SEC = 4.0\n", "\n", "# Random audio — Gaussian noise, just for shape testing\n", "torch.manual_seed(42)\n", "fake_audio = torch.randn(1, int(SAMPLE_RATE * DURATION_SEC)).to(device)\n", "print(f'Input shape: {tuple(fake_audio.shape)}')\n", "\n", "with torch.no_grad():\n", "    output = model(fake_audio)\n", "\n", "hidden_states = output.last_hidden_state\n", "print(f'Output shape: {tuple(hidden_states.shape)}')\n", "print(f' ↳ batch size: {hidden_states.shape[0]}')\n", "print(f' ↳ time frames: {hidden_states.shape[1]} (one frame ≈ 20ms of audio)')\n", "print(f' ↳ feature dim: {hidden_states.shape[2]} (Wav2Vec 2.0 Base uses 768)')\n", "\n", "# Mean-pool over time to get one vector per clip — this is what our classifier head will use\n", "pooled = hidden_states.mean(dim=1)\n", "print(f'\\nMean-pooled shape: {tuple(pooled.shape)}')\n", "print(f' ↳ This is what the classification head (added in Phase 3) will see.')\n", "\n", "print('\\n✅ Forward pass successful. The model is talking to the GPU correctly.')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 8 — Connect to Weights & Biases\n", "\n", "Wandb is our experiment tracking service. Without it, every Colab disconnect erases your training history. With it, every loss curve, every metric, every config — saved to the cloud and viewable in a browser.\n", "\n", "**Before running this cell:**\n", "1. Sign up at https://wandb.ai (use GitHub for one-click auth)\n", "2. Go to https://wandb.ai/authorize and copy your API key\n", "3. 
When prompted by the cell below, paste the key\n", "\n", "After this works once, wandb caches your credentials in `/root/.netrc` for the session. You won't be prompted again unless you start fresh." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import wandb\n", "\n", "wandb.login() # will prompt for your API key on first run\n", "\n", "# Start a tiny test run to confirm everything connects end-to-end\n", "run = wandb.init(\n", " project='deepfake-audio-detection',\n", " name='smoke-test',\n", " job_type='setup-test',\n", " config={\n", " 'phase': 'phase-1-setup',\n", " 'model': MODEL_NAME,\n", " 'gpu': torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'cpu',\n", " 'pytorch_version': torch.__version__,\n", " }\n", ")\n", "\n", "# Log a fake metric just to verify the pipe works\n", "wandb.log({'smoke_test_metric': 1.0, 'environment_ok': True})\n", "\n", "print(f'\\n✅ Wandb run started.')\n", "print(f' View it at: {run.url}')\n", "\n", "wandb.finish()\n", "print('\\n✅ Wandb test run finished. Visit the URL above to confirm it logged correctly.')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 9 — Environment summary\n", "\n", "Print the full environment fingerprint. **Copy this output into `docs/environment.md`** and commit it. Future-you (and your coauthor) will thank you." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import platform\n", "from datetime import datetime\n", "\n", "print('='*70)\n", "print(f' ENVIRONMENT FINGERPRINT — {datetime.now().strftime(\"%Y-%m-%d %H:%M:%S\")}')\n", "print('='*70)\n", "print(f'Platform: Google Colab Pro')\n", "print(f'OS: {platform.platform()}')\n", "print(f'Python: {sys.version.split()[0]}')\n", "print(f'PyTorch: {torch.__version__}')\n", "print(f'torchaudio: {torchaudio.__version__}')\n", "print(f'transformers: {transformers.__version__}')\n", "if torch.cuda.is_available():\n", " print(f'GPU: {torch.cuda.get_device_name(0)}')\n", " print(f'CUDA: {torch.version.cuda}')\n", " print(f'GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB')\n", "print(f'Drive root: {DRIVE_ROOT}')\n", "print(f'Wandb project: deepfake-audio-detection')\n", "print('='*70)\n", "\n", "print('\\n🎉 Phase 1 smoke test complete. You are ready for Phase 2 (data acquisition).')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Phase 1 Checklist\n", "\n", "Before moving to Phase 2, confirm all of these are done:\n", "\n", "- [ ] All cells above ran without errors\n", "- [ ] GPU detected (T4, V100, or A100)\n", "- [ ] Drive mounted, `deepfake_audio/` folder structure created\n", "- [ ] GitHub repo cloned into Colab\n", "- [ ] Wav2Vec 2.0 loaded and forward pass succeeded\n", "- [ ] Wandb run visible at the URL printed in Step 8\n", "- [ ] Environment fingerprint copied into `docs/environment.md`\n", "- [ ] Repo committed and pushed to GitHub with all Phase 1 files\n", "\n", "**If anything failed:** scroll up, find the failing cell, read the error message, fix the issue, re-run. Don't proceed until every cell shows ✅." 
] } ], "metadata": { "accelerator": "GPU", "colab": { "gpuType": "T4", "provenance": [] }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "name": "python", "version": "3.11.x" } }, "nbformat": 4, "nbformat_minor": 4 }