

- Add weight decay.
- Carefully adjust the update scale of each parameter.
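
The two changes above can be sketched as a single update step. This is only an illustrative sketch, not the released implementation: it assumes the optimizer applies an orthogonalized momentum update to each 2D weight matrix, uses an exact SVD in place of whatever approximate orthogonalization the real code uses, and the names `orthogonalize`, `update_scale`, and `step`, the base constant 0.2, and the default hyperparameters are all hypothetical.

```python
import numpy as np

def orthogonalize(m):
    # Exact orthogonalization via SVD: keep the singular vectors and
    # replace every singular value with 1 (a stand-in for any cheaper
    # approximate scheme an actual optimizer might use).
    u, _, vt = np.linalg.svd(m, full_matrices=False)
    return u @ vt

def update_scale(shape, base=0.2):
    # Hypothetical per-matrix scale: grows with the larger dimension so
    # that updates of differently shaped matrices have a comparable RMS.
    rows, cols = shape
    return base * np.sqrt(max(rows, cols))

def step(weight, grad, momentum_buf, lr=0.02, beta=0.95, weight_decay=0.1):
    # Accumulate momentum, orthogonalize it, rescale per matrix, then
    # apply the update together with decoupled weight decay.
    momentum_buf = beta * momentum_buf + grad
    direction = orthogonalize(momentum_buf)
    scaled = update_scale(weight.shape) * direction
    new_weight = weight - lr * (scaled + weight_decay * weight)
    return new_weight, momentum_buf
```

For a full-rank matrix, the orthogonalized direction has an RMS of 1/sqrt(max(rows, cols)), so multiplying by `update_scale` brings every matrix's update to the same RMS (`base`) regardless of shape. The weight decay term shrinks the weights in proportion to their current value on every step, independently of the gradient.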

https://hf-mirror.com/moonshotai/Moonlight-16B-A3B-Instruct
https://hf-mirror.com/moonshotai/Moonlight-16B-A3B
https://github.com/MoonshotAI/Moonlight
(Text: PaperAgent)