Stable-Lime-v1.0

Stable-Lime-v1.0 is an unconditional diffusion model built on the Denoising Diffusion Probabilistic Models (DDPM) framework. It was trained specifically to generate images representing the "essence of Lime."

Model Details

Model Type: Unconditional Image Generation (Diffusion)
Architecture: UNet2DModel with DDPMScheduler
Framework: PyTorch & Hugging Face Diffusers
Resolution: $64 \times 64$ pixels
Channels: 3 (RGB)
License: MIT (Assumed based on open-source usage)

Intended Use

This model is designed for:

Generating $64 \times 64$ images of limes (or lime-like textures).
Educational use: studying how DDPM training and sampling loops are implemented.
Low-resolution, "retro" aesthetic generation.

Out of Scope:

Text-to-Image generation (this model does not accept text prompts).
High-resolution photorealism (limited by the 64px architecture).
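A minimal sampling sketch for the intended use above. It assumes the weights are published in Diffusers pipeline format under the repo id FlameF0X/Stable-Lime-v1.0 (taken from the model page, not verified here); the function name `generate_limes` and its defaults are illustrative only.

```python
def generate_limes(repo_id="FlameF0X/Stable-Lime-v1.0", num_images=4,
                   num_inference_steps=1000, seed=0):
    """Sample images unconditionally; no text prompt is involved."""
    # Heavy dependencies are imported lazily, when the function is called.
    import torch
    from diffusers import DDPMPipeline

    # Assumption: the repo stores a full pipeline (UNet weights + scheduler config).
    pipeline = DDPMPipeline.from_pretrained(repo_id)
    generator = torch.Generator().manual_seed(seed)  # reproducible starting noise
    output = pipeline(
        batch_size=num_images,
        num_inference_steps=num_inference_steps,
        generator=generator,
    )
    return output.images  # list of PIL images
```

Calling `generate_limes(num_images=4)` would return a list of PIL images, each of which can be written to disk with `image.save(...)`. Lowering `num_inference_steps` speeds up sampling, typically at some cost in quality.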

Training Data

The model was trained on a proprietary dataset located at dataset_lime/processed.

Preprocessing: Images were resized to $64 \times 64$ and normalized to the range $[-1, 1]$.
Augmentation: Random horizontal flips were applied during training to improve generalization.
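The preprocessing above can be sketched as follows. The two pixel helpers show the arithmetic behind the $[-1, 1]$ range; `build_train_transform` is a hypothetical torchvision version of the same steps, since the card does not say which library was actually used.

```python
def normalize_pixel(value):
    """Map an 8-bit channel value in [0, 255] to the model range [-1, 1]."""
    return value / 127.5 - 1.0


def denormalize_pixel(value):
    """Map a model-range value in [-1, 1] back to an 8-bit integer."""
    clipped = max(-1.0, min(1.0, value))
    return round((clipped + 1.0) * 127.5)


def build_train_transform(image_size=64):
    """Hypothetical torchvision equivalent of the described preprocessing."""
    from torchvision import transforms  # imported lazily; optional dependency

    return transforms.Compose([
        transforms.Resize((image_size, image_size)),
        transforms.RandomHorizontalFlip(),  # flips with probability 0.5 by default
        transforms.ToTensor(),              # uint8 [0, 255] -> float [0, 1]
        transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]),  # [0, 1] -> [-1, 1]
    ])
```

The inverse mapping matters at inference time: sampled tensors live in $[-1, 1]$ and must be rescaled before they are saved as ordinary images.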

Training Procedure

Hyperparameters

The model was trained using the following configuration ("The Lime Settings"):

Parameter	Value	Description
Batch Size	16	Small batch size suitable for consumer GPUs.
Learning Rate	$1 \times 10^{-4}$	Optimizer step size (AdamW).
Epochs	5	Note: This is a very short training duration.
Timesteps	1000	Number of diffusion noise steps.
Image Size	64	Output resolution.
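For readers who want to reproduce a run with these values, the table can be captured in a small config object and fed to a standard noise-prediction loop. This is a sketch, not the author's training script: `LimeSettings` and `train` are hypothetical names, and the loop assumes a Diffusers `UNet2DModel`-style model plus a dataloader that yields image batches already normalized to $[-1, 1]$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LimeSettings:
    """Values copied from the hyperparameter table."""
    batch_size: int = 16
    learning_rate: float = 1e-4
    num_epochs: int = 5
    num_train_timesteps: int = 1000
    image_size: int = 64


def train(model, dataloader, settings=LimeSettings()):
    """Noise-prediction training loop (assumed structure, not the original script)."""
    import torch
    import torch.nn.functional as F
    from diffusers import DDPMScheduler

    scheduler = DDPMScheduler(num_train_timesteps=settings.num_train_timesteps)
    optimizer = torch.optim.AdamW(model.parameters(), lr=settings.learning_rate)

    for _ in range(settings.num_epochs):
        for clean in dataloader:
            noise = torch.randn_like(clean)
            # One random timestep per image in the batch.
            timesteps = torch.randint(
                0, settings.num_train_timesteps, (clean.shape[0],), device=clean.device
            )
            noisy = scheduler.add_noise(clean, noise, timesteps)
            loss = F.mse_loss(model(noisy, timesteps).sample, noise)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```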

Architecture Specification

The U-Net has six resolution levels, with self-attention applied only in the fifth down block (and its mirrored up block), close to the bottleneck:

Block Output Channels: (128, 128, 256, 256, 512, 512)
Downsampling: 4x DownBlock2D, 1x AttnDownBlock2D, 1x DownBlock2D
Upsampling: Mirror of the downsampling blocks (1x UpBlock2D, 1x AttnUpBlock2D, 4x UpBlock2D).
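The specification above could be expressed as a Diffusers `UNet2DModel` configuration along these lines. Only the channel widths, block types, resolution, and channel count come from this card; every other constructor argument is left at the library default, which is an assumption about the original setup.

```python
BLOCK_OUT_CHANNELS = (128, 128, 256, 256, 512, 512)

# 4x DownBlock2D, 1x AttnDownBlock2D, 1x DownBlock2D
DOWN_BLOCK_TYPES = ("DownBlock2D",) * 4 + ("AttnDownBlock2D", "DownBlock2D")

# The up path mirrors the down path: reverse the order and swap Down -> Up.
UP_BLOCK_TYPES = tuple(
    name.replace("Down", "Up") for name in reversed(DOWN_BLOCK_TYPES)
)

UNET_KWARGS = dict(
    sample_size=64,   # 64 x 64 output resolution
    in_channels=3,    # RGB in
    out_channels=3,   # predicted noise has the same shape as the input
    block_out_channels=BLOCK_OUT_CHANNELS,
    down_block_types=DOWN_BLOCK_TYPES,
    up_block_types=UP_BLOCK_TYPES,
)


def build_unet():
    """Instantiate the network; other arguments stay at library defaults (assumed)."""
    from diffusers import UNet2DModel  # imported lazily; heavy dependency

    return UNet2DModel(**UNET_KWARGS)
```

Deriving the up blocks from the down blocks keeps the two lists the same length, which the constructor requires along with one channel width per block.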

Loss Function

The model optimizes the Mean Squared Error (MSE) between the actual noise added and the predicted noise:

$L = \text{MSE}(\epsilon, \epsilon_\theta(x_t, t))$

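where $\epsilon \sim \mathcal{N}(0, I)$ is the Gaussian noise added to the clean image, $x_t$ is the resulting noisy image at timestep $t$, and $\epsilon_\theta(x_t, t)$ is the model's prediction of that noise.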
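To make the objective concrete, here is a dependency-free numeric sketch of one loss evaluation on a toy 8-value "image". It uses the standard DDPM forward process $x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1 - \bar\alpha_t}\,\epsilon$ and assumes a linear beta schedule from $10^{-4}$ to $0.02$ (the `DDPMScheduler` defaults); the card does not state which schedule was actually used.

```python
import math
import random


def linear_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, eps, alpha_bar_t):
    """Forward process: x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    a, b = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    return [a * x + b * e for x, e in zip(x0, eps)]


def mse(target, prediction):
    """Mean squared error between the true noise and a predicted noise."""
    return sum((t - p) ** 2 for t, p in zip(target, prediction)) / len(target)


rng = random.Random(0)
alpha_bars = linear_alpha_bars()
x0 = [rng.uniform(-1.0, 1.0) for _ in range(8)]  # toy "image" in [-1, 1]
eps = [rng.gauss(0.0, 1.0) for _ in range(8)]    # Gaussian noise
t = rng.randrange(len(alpha_bars))               # random timestep
x_t = add_noise(x0, eps, alpha_bars[t])          # what the network would see

perfect_loss = mse(eps, eps)                     # ideal predictor recovers eps exactly
zero_guess_loss = mse(eps, [0.0] * len(eps))     # predictor that outputs no noise
```

In real training, the prediction comes from the network evaluated on `x_t` and `t`, and the loss is averaged over a batch of images and random timesteps.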

Limitations & Biases

Undertraining Risk: With only 5 epochs of training, the model may not have converged. Generated images may appear blurry or retain visible noise (static) rather than showing clear lime features.
Resolution: The output is strictly $64 \times 64$, resulting in pixelated, low-fidelity images.
Dataset Bias: The model's output depends entirely on the variety present in dataset_lime. If the dataset contains only green limes, the model will not generate yellow citrus such as lemons.

Model repository: FlameF0X/Stable-Lime-v1.0