anicka/guppylm-dual-denial

Text Generation
English
interpretability
mechanistic-interpretability
activation-steering
denial-direction
toy-model
fish
guppylm-dual-denial
81 MB
  • 1 contributor
History: 17 commits
Latest commit: "Update fig_direction_norms.png" by anicka (7c17415, verified), about 1 month ago
  • data
    Upload data/eval.jsonl with huggingface_hub about 1 month ago
  • .gitattributes
    1.52 kB
    initial commit about 1 month ago
  • README.md
    7.15 kB
    Remove base_model field — trained from scratch, not fine-tuned about 1 month ago
  • demo.py
    4.57 kB
    Upload demo.py with huggingface_hub about 1 month ago
  • directions.pt

    Detected Pickle imports (3)

    • "torch._utils._rebuild_tensor_v2",
    • "collections.OrderedDict",
    • "torch.FloatStorage"

    75.2 kB
    Upload directions.pt with huggingface_hub about 1 month ago
  • dual_denial_model.pt

    Detected Pickle imports (3)

    • "torch.FloatStorage",
    • "collections.OrderedDict",
    • "torch._utils._rebuild_tensor_v2"

    72.9 MB
    Upload dual_denial_model.pt with huggingface_hub about 1 month ago
  • dual_denial_results.json
    3.52 kB
    Upload dual_denial_results.json with huggingface_hub about 1 month ago
  • fig_cosine_divergence.png
    54 kB
    Upload fig_cosine_divergence.png with huggingface_hub about 1 month ago
  • fig_direction_norms.png
    49.5 kB
    Update fig_direction_norms.png about 1 month ago
  • fig_steering_results.png
    58 kB
    Fix steering results figure: show zero-height bars consistently about 1 month ago
  • make_figures.py
    4.99 kB
    Upload make_figures.py with huggingface_hub about 1 month ago
  • tokenizer.json
    174 kB
    Upload tokenizer.json with huggingface_hub about 1 month ago
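
The "Detected Pickle imports" entries for directions.pt and dual_denial_model.pt name the globals that the pickle stream would import when loaded. A minimal sketch of how such a list can be reproduced locally without unpickling anything, using only the Python standard library, is given here. It is not the Hub's actual scanner. The helper names `list_pickle_imports` and `list_checkpoint_imports` are hypothetical, and the sketch assumes the `.pt` files are zip archives containing a `.pkl` member, which is the layout recent versions of `torch.save` produce.

```python
import io
import pickle
import pickletools
import zipfile
from collections import OrderedDict


def list_pickle_imports(data: bytes) -> set[str]:
    """Return the 'module.name' globals a pickle stream references, without loading it."""
    imports = set()
    strings = []  # string operands seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Protocols 0-3: arg is "module name", separated by a single space.
            module, name = arg.split(" ", 1)
            imports.add(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+: module and name are usually the two most recently
            # pushed strings. This is a heuristic, not a full stack simulation.
            if len(strings) >= 2:
                imports.add(f"{strings[-2]}.{strings[-1]}")
        elif isinstance(arg, str):
            strings.append(arg)
    return imports


def list_checkpoint_imports(path_or_file) -> set[str]:
    """Scan every .pkl member of a zip-format checkpoint (assumed layout)."""
    found = set()
    with zipfile.ZipFile(path_or_file) as zf:
        for member in zf.namelist():
            if member.endswith(".pkl"):
                found |= list_pickle_imports(zf.read(member))
    return found


if __name__ == "__main__":
    # Build a small stand-in archive in memory instead of downloading the repo files.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("archive/data.pkl", pickle.dumps(OrderedDict(w=1), protocol=2))
    buf.seek(0)
    print(sorted(list_checkpoint_imports(buf)))
```

To check the real files, pass a local path such as `directions.pt` to `list_checkpoint_imports`. If the output matches the three entries listed above, loading with `torch.load(path, weights_only=True)` is one way to restrict unpickling to tensor-related types.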