

- Add weight decay.
- Carefully adjust the update scale of each parameter.
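
The two adjustments above can be sketched as a single optimizer step. This is a minimal NumPy illustration, not the authors' implementation: the function name `muon_like_step`, the hyperparameter defaults, and the scale factor `rms_scale * sqrt(max(rows, cols))` are assumptions chosen for illustration, and an exact SVD is swapped in as a simple stand-in for whatever orthogonalization routine the real optimizer uses.

```python
import numpy as np

def muon_like_step(weight, momentum_buf, grad, lr=0.02, beta=0.95,
                   weight_decay=0.1, rms_scale=0.2):
    """One hypothetical update step for a 2-D weight matrix.

    All names and default values here are illustrative assumptions.
    """
    # Accumulate momentum from the raw gradient.
    momentum_buf = beta * momentum_buf + grad

    # Orthogonalize the momentum matrix. An exact SVD is used as a
    # stand-in; it replaces all singular values with 1.
    u, _, vt = np.linalg.svd(momentum_buf, full_matrices=False)
    ortho = u @ vt

    # Per-parameter update scale (assumed form): a full-rank orthogonalized
    # matrix has RMS 1/sqrt(max(rows, cols)), so this factor brings the
    # update RMS to roughly `rms_scale` regardless of the matrix shape.
    scale = rms_scale * np.sqrt(max(weight.shape))

    # Decoupled weight decay: shrink the weights directly, alongside
    # the scaled update.
    new_weight = weight - lr * (scale * ortho + weight_decay * weight)
    return new_weight, momentum_buf

# Example: one step on a 4 x 8 matrix starting from zero weights.
rng = np.random.default_rng(0)
w, buf = np.zeros((4, 8)), np.zeros((4, 8))
w, buf = muon_like_step(w, buf, rng.standard_normal((4, 8)))
```

Scaling by the matrix shape keeps the effective step size comparable across layers of different dimensions, which is one plausible reading of "adjusting the update scale of each parameter".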

https://hf-mirror.com/moonshotai/Moonlight-16B-A3B-Instruct
https://hf-mirror.com/moonshotai/Moonlight-16B-A3B
https://github.com/MoonshotAI/Moonlight
(Text: PaperAgent)


