Parameters / Experts - How to run this model
#16 opened about 1 year ago
by
DavidAU
DeepSeek R1 0528?
#15 opened about 1 year ago
by
Thireus
This model almost completely loses Chinese abilities
3
#14 opened about 1 year ago
by
CHNtentes
Base version?
2
#13 opened about 1 year ago
by
ToastyPigeon
Russian language is missing
1
#12 opened about 1 year ago
by
Kosh69
Please, share the custom vLLM source you made
#11 opened about 1 year ago
by
hyunw55
Update metadata
#10 opened about 1 year ago
by
merve
Model seems to not be performing correctly
1
#9 opened about 1 year ago
by
daniel-ltw
Larger model?
#8 opened about 1 year ago
by
blobbybob
number of experts +
#7 opened about 1 year ago
by
Danioken
Brainstorming
5
#6 opened about 1 year ago
by
Downtown-Case
Further training/distillation needed?
1
#5 opened about 1 year ago
by
mingyi456
Besides pruning..
6
#4 opened about 1 year ago
by
Lockout
Context size? YaRN still supported?
2
#3 opened about 1 year ago
by
Thireus
Variants
#2 opened about 1 year ago
by
someone13574
code
#1 opened about 1 year ago
by
mrfakename