How to use FlameF0X/Stable-Lime-v1.0 with Diffusers:

```
pip install -U diffusers transformers accelerate
```

```python
import torch
from diffusers import DiffusionPipeline

# switch to "mps" for apple devices
pipe = DiffusionPipeline.from_pretrained("FlameF0X/Stable-Lime-v1.0", dtype=torch.bfloat16, device_map="cuda")

# The model is unconditional, so no text prompt is passed; sampling starts from pure noise.
image = pipe().images[0]
```
Stable-Lime-v1.0
Stable-Lime-v1.0 is an unconditional diffusion model based on the Denoising Diffusion Probabilistic Models (DDPM) architecture. It has been trained specifically to generate images representing the "essence of Lime."
Model Details
- Model Type: Unconditional Image Generation (Diffusion)
- Architecture: UNet2DModel with DDPMScheduler
- Framework: PyTorch & Hugging Face Diffusers
- Resolution: $64 \times 64$ pixels
- Channels: 3 (RGB)
- License: MIT (Assumed based on open-source usage)
Intended Use
This model is designed for:
- Generating $64 \times 64$ images of limes (or lime-like textures).
- Educational purposes regarding the implementation of DDPM loops.
- Low-resolution, "retro" aesthetic generation.
Out of Scope:
- Text-to-Image generation (this model does not accept text prompts).
- High-resolution photorealism (limited by the 64px architecture).
Training Data
The model was trained on a proprietary dataset located at dataset_lime/processed.
- Preprocessing: Images were resized to $64 \times 64$ and normalized to the range $[-1, 1]$.
- Augmentation: Random horizontal flips were applied during training to improve generalization.
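The two steps above can be sketched in dependency-free Python. This is an illustration, not the training script: the helper names `normalize_pixel`, `hflip`, and `augment` are hypothetical, and the flip probability of 0.5 is an assumption, since the card does not state it.

```python
import random

def normalize_pixel(value: int) -> float:
    """Map an 8-bit pixel value in [0, 255] to the range [-1, 1]."""
    return value / 127.5 - 1.0

def hflip(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [list(reversed(row)) for row in image]

def augment(image, p=0.5, rng=random):
    """Flip horizontally with probability p (p=0.5 is an assumed value)."""
    return hflip(image) if rng.random() < p else image

# Tiny 2x3 single-channel "image" used only for illustration.
image = [[0, 128, 255],
         [255, 128, 0]]
normalized = [[normalize_pixel(v) for v in row] for row in image]
flipped = hflip(normalized)
```

In a tensor pipeline the same mapping is usually obtained by scaling to $[0, 1]$ and then normalizing with mean 0.5 and standard deviation 0.5, which is algebraically identical to `value / 127.5 - 1`.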
Training Procedure
Hyperparameters
The model was trained using the following configuration ("The Lime Settings"):
| Parameter | Value | Description |
|---|---|---|
| Batch Size | 16 | Small batch size suitable for consumer GPUs. |
| Learning Rate | $1 \times 10^{-4}$ | Optimizer step size (AdamW). |
| Epochs | 5 | Note: This is a very short training duration. |
| Timesteps | 1000 | Number of diffusion noise steps. |
| Image Size | 64 | Output resolution. |
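For readers reproducing the setup, the table could be captured in a small config object, as in the sketch below. The class name `TrainingConfig` and the helper `total_optimizer_steps` are hypothetical, and the dataset size used in the example is invented, since the card does not report how many images were used.

```python
from dataclasses import dataclass
import math

@dataclass(frozen=True)
class TrainingConfig:
    batch_size: int = 16
    learning_rate: float = 1e-4      # AdamW step size
    num_epochs: int = 5
    num_train_timesteps: int = 1000  # diffusion noise steps
    image_size: int = 64             # output resolution in pixels

def total_optimizer_steps(config: TrainingConfig, dataset_size: int) -> int:
    """Optimizer steps over the whole run, counting a final partial batch."""
    steps_per_epoch = math.ceil(dataset_size / config.batch_size)
    return steps_per_epoch * config.num_epochs

config = TrainingConfig()
# dataset_size=1000 is hypothetical; the card does not state the number of images.
steps = total_optimizer_steps(config, dataset_size=1000)
```

With these assumed numbers the run would take only a few hundred optimizer steps, which puts the short training duration noted in the table in concrete terms.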
Architecture Specification
The U-Net architecture utilizes a deep structure with attention mechanisms in the lower bottleneck layers:
- Block Output Channels: (128, 128, 256, 256, 512, 512)
- Downsampling: 4x DownBlock2D, 1x AttnDownBlock2D, 1x DownBlock2D
- Upsampling: Mirror of downsampling blocks.
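Written out as keyword arguments, the specification above might look like the following sketch. The parameter names match those of the `UNet2DModel` constructor in Diffusers, but the block order is read off the list as given, the helper `mirror_blocks` is hypothetical, and values the card does not state (such as layers per block) are left out rather than guessed. The dictionary is built without importing Diffusers so the snippet runs on its own.

```python
def mirror_blocks(down_block_types):
    """Reverse the encoder order and swap each Down block for its Up counterpart."""
    return tuple(name.replace("Down", "Up") for name in reversed(down_block_types))

# Four plain blocks, one attention block, then one plain block (order as listed).
down_block_types = ("DownBlock2D",) * 4 + ("AttnDownBlock2D", "DownBlock2D")

unet_config = {
    "sample_size": 64,   # 64 x 64 output resolution
    "in_channels": 3,    # RGB in
    "out_channels": 3,   # RGB noise prediction out
    "block_out_channels": (128, 128, 256, 256, 512, 512),
    "down_block_types": down_block_types,
    "up_block_types": mirror_blocks(down_block_types),
}
```

With Diffusers installed, `UNet2DModel(**unet_config)` would build a network of this shape; whether it matches the released weights exactly depends on the unstated settings.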
Loss Function
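The model optimizes the Mean Squared Error (MSE) between the actual noise added and the predicted noise:

$$L = \mathbb{E}_{x_0, \epsilon, t}\left[ \lVert \epsilon - \epsilon_\theta(x_t, t) \rVert^2 \right]$$

Where $\epsilon$ is the Gaussian noise, $x_t$ is the noised image at timestep $t$, and $\epsilon_\theta$ is the model's prediction at timestep $t$.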
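To make the objective concrete, here is a minimal sketch of one loss evaluation in plain Python. It assumes the linear beta schedule from $10^{-4}$ to $0.02$ that `DDPMScheduler` uses by default; the card does not say which schedule was used. All function names are illustrative, and an all-zeros prediction stands in for the U-Net.

```python
import math
import random

def linear_betas(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (assumed schedule, see lead-in)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def alphas_cumprod(betas):
    """Running product of (1 - beta_t), i.e. alpha-bar at each timestep."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out

def add_noise(x0, noise, t, alpha_bars):
    """Forward process: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps."""
    a = math.sqrt(alpha_bars[t])
    s = math.sqrt(1.0 - alpha_bars[t])
    return [a * x + s * e for x, e in zip(x0, noise)]

def mse(prediction, target):
    """Mean squared error over all elements."""
    return sum((p - q) ** 2 for p, q in zip(prediction, target)) / len(target)

rng = random.Random(0)
alpha_bars = alphas_cumprod(linear_betas())

x0 = [rng.uniform(-1.0, 1.0) for _ in range(64 * 64 * 3)]  # fake image in [-1, 1]
noise = [rng.gauss(0.0, 1.0) for _ in x0]                  # epsilon
t = rng.randrange(1000)                                    # random timestep
x_t = add_noise(x0, noise, t, alpha_bars)                  # input the U-Net would see

predicted_noise = [0.0] * len(noise)  # stand-in for epsilon_theta(x_t, t)
loss = mse(predicted_noise, noise)
```

In real training, `predicted_noise` would come from the U-Net evaluated on `x_t` and `t`, and the loss would be averaged over a batch before the AdamW update.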
Limitations & Biases
- Undertraining Risk: With only 5 Epochs, the model may not have fully converged. Generated images might appear blurry or retain significant noise (static) rather than clear lime features.
- Resolution: The output is strictly $64 \times 64$, resulting in pixelated, low-fidelity images.
- Dataset Bias: The model's output is entirely dependent on the variety found in dataset_lime. If the dataset contained only green limes, it will not generate yellow limes (lemons).
