upload via upload_folder 2025-08-03T14:14:32.982945+00:00

Files changed (7)

README.md ADDED

+---
+env_name: LunarLander-v3
+tags:
+- LunarLander-v3
+- double-dqn
+- reinforcement-learning
+- custom-implementation
+- deep-q-learning
+- pytorch
+model-index:
+- name: DoubleDQN-1d-LunarLander-v3
+  results:
+  - task:
+      type: reinforcement-learning
+      name: reinforcement-learning
+    dataset:
+      name: LunarLander-v3
+      type: LunarLander-v3
+    metrics:
+    - type: mean_reward
+      value: 271.13 +/- 32.77
+      name: mean_reward
+      verified: false
+---
+# **Double-DQN** Agent playing **LunarLander-v3**
+This is a trained model of a **Double-DQN** agent playing **LunarLander-v3**.
+## Usage
+### Create the conda env from https://github.com/GeneHit/drl_practice
+```bash
+conda create -n drl python=3.10
+conda activate drl
+python -m pip install -r requirements.txt
+```
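+### Play with the full model
+```python
+import gymnasium as gym
+
+# `load_from_hub` is not defined in this snippet; it is assumed to be a helper
+# provided by the drl_practice repo linked above.
+# Load the full model.
+model = load_from_hub(repo_id="winkin119/DoubleDQN-1d-LunarLander-v3", filename="full_model.pt")
+# Create the environment.
+env = gym.make("LunarLander-v3")
+# reset() returns (observation, info).
+state, _ = env.reset()
+# Ask the model for an action in the current state.
+action = model.action(state)
+...
+```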
+A state-dict version of the model (`state_dict.pt`) is also available; see the corresponding chapter in the repo.
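
The usage snippet in the README stops after a single action. A complete episode might look roughly like the sketch below. It relies only on the Gymnasium conventions that `reset()` returns `(observation, info)` and `step()` returns `(observation, reward, terminated, truncated, info)`. The names `GymLikeEnv`, `run_episode`, and `max_steps` are hypothetical and do not come from the repo.

```python
from typing import Any, Callable, Protocol


class GymLikeEnv(Protocol):
    """The minimal slice of the Gymnasium Env API used below."""

    def reset(self) -> tuple[Any, dict]: ...

    def step(self, action: Any) -> tuple[Any, float, bool, bool, dict]: ...


def run_episode(
    env: GymLikeEnv,
    policy: Callable[[Any], Any],
    max_steps: int = 1000,  # hypothetical safety cap, not a repo setting
) -> float:
    """Play one episode and return the undiscounted sum of rewards."""
    state, _ = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)
        # Gymnasium reports episode end through two separate flags.
        state, reward, terminated, truncated, _ = env.step(action)
        total_reward += float(reward)
        if terminated or truncated:
            break
    return total_reward
```

With the objects from the README snippet, this would presumably be called as `run_episode(env, model.action)`, assuming `model.action` accepts a single observation as shown there.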

eval_result.json ADDED

+{
+    "mean_reward": 271.1292218414864,
+    "std_reward": 32.76824691220499,
+    "datetime": "2025-07-30T12:11:33.037116+00:00",
+    "train_duration_min": "9.34"
+}
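
The `mean_reward` value in the model card (`271.13 +/- 32.77`) appears to be these two numbers rounded to two decimals. A minimal sketch of that formatting follows; `format_mean_reward` is a hypothetical helper, not the repo's code. Note that `train_duration_min` is stored as a string and needs converting before any arithmetic.

```python
import json

# Values copied verbatim from eval_result.json above.
eval_result = json.loads("""
{
    "mean_reward": 271.1292218414864,
    "std_reward": 32.76824691220499,
    "datetime": "2025-07-30T12:11:33.037116+00:00",
    "train_duration_min": "9.34"
}
""")


def format_mean_reward(result: dict) -> str:
    """Render mean and standard deviation the way the model card shows them."""
    return f"{result['mean_reward']:.2f} +/- {result['std_reward']:.2f}"


metric = format_mean_reward(eval_result)
# The duration is a JSON string, so convert it explicitly.
train_minutes = float(eval_result["train_duration_min"])
print(metric)
```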

full_model.pt ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:684be2bf9f5991db171c296b32d3b4829576a49302c95a9445c81764e51dd93c
+size 281145
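
The three lines above are a Git LFS pointer, not the model itself: they record the spec version, the SHA-256 of the real file, and its size in bytes. A downloaded `full_model.pt` could be checked against the pointer along these lines. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the sketch assumes the simple `key value` layout shown here.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse 'key value' lines of a Git LFS pointer into a small dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if the bytes have the size and hash the pointer records."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['algo']}")
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


# Pointer text copied from the diff above, without the leading '+'.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:684be2bf9f5991db171c296b32d3b4829576a49302c95a9445c81764e51dd93c\n"
    "size 281145\n"
)
```

To use it, read the downloaded file in binary mode and pass its bytes together with `pointer` to `matches_pointer`.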

params.json ADDED

+{
+    "env_config": {
+        "env_id": "LunarLander-v3",
+        "env_kwargs": {},
+        "max_steps": null,
+        "normalize_obs": false,
+        "use_image": false,
+        "vector_env_num": 6,
+        "use_multi_processing": true,
+        "image_shape": null,
+        "frame_stack": 1,
+        "frame_skip": 1,
+        "training_render_mode": null
+    },
+    "device": "cpu",
+    "learning_rate": 0.0001,
+    "gamma": 0.99,
+    "checkpoint_pathname": "",
+    "max_grad_norm": null,
+    "log_interval": 50,
+    "track": true,
+    "eval_episodes": 100,
+    "eval_random_seed": 42,
+    "eval_video_num": 10,
+    "timesteps": 250000,
+    "epsilon_schedule": {
+        "_type": "LinearSchedule",
+        "_module": "practice.utils_for_coding.scheduler_utils",
+        "_start_e": 1.0,
+        "_end_e": 0.01,
+        "_duration": 150000,
+        "_start_t": 0
+    },
+    "replay_buffer_capacity": 120000,
+    "batch_size": 64,
+    "train_interval": 1,
+    "target_update_interval": 250,
+    "update_start_step": 2000,
+    "dqn_algorithm": "double"
+}
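
Two settings in this config are easier to read with a small sketch: the linear epsilon schedule and `"dqn_algorithm": "double"`. The code below is an illustration under assumptions, not the repo's implementation. `linear_epsilon` guesses that `LinearSchedule` interpolates from `_start_e` to `_end_e` over `_duration` steps and then holds the final value. `double_dqn_target` is the standard Double DQN target, where the online network picks the next action and the target network evaluates it. Both function names are hypothetical.

```python
from typing import Sequence


def linear_epsilon(
    t: int,
    start_e: float = 1.0,     # "_start_e"
    end_e: float = 0.01,      # "_end_e"
    duration: int = 150_000,  # "_duration"
    start_t: int = 0,         # "_start_t"
) -> float:
    """Linearly anneal epsilon, clamped to [end_e, start_e] outside the ramp."""
    progress = min(max((t - start_t) / duration, 0.0), 1.0)
    return (1.0 - progress) * start_e + progress * end_e


def double_dqn_target(
    reward: float,
    done: bool,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.99,  # "gamma"
) -> float:
    """y = r + gamma * Q_target(s', argmax_a Q_online(s', a)); y = r if done."""
    if done:
        return reward
    # The online network selects the action ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... and the target network supplies its value.
    return reward + gamma * next_q_target[best_action]
```

Under this reading, epsilon reaches 0.01 at step 150000 and stays there for the rest of the 250000 configured timesteps. Whether `timesteps` counts steps per environment or across all 6 vectorized environments is not stated in the config.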

replay.mp4 ADDED

Binary file (41.6 kB).

state_dict.pt ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:cc5358fa47f1ccd737da9dcb0d6a2283b154719bdada1dbd40841beb056b0c64
+size 279673

tensorboard/events.out.tfevents.1753876921.winkindeMacBook-Air.local.76622.0 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:ad003c56ec4b32c116d08f59d06260e9f819a52e4baac64ec44d440a575d45ac
+size 1720886