PaperWeekly
GPT-4.5刚出来就被攻破了?成功率破90%, MBZUAI团队出品
文章介绍了一篇关于对抗攻击的研究成果,该研究提出了一种新的方法M-Attack来提高对大型视觉语言模型的攻击成功率,并成功应用于多个商业模型中。
AI Agent真的安全吗?南洋理工最新综述揭秘LLM Agent安全风险与防御策略
A Survey on Trustworthy LLM Agents: Threats and Countermeasures, a study by researchers from Nanyang Technological University and Squirrel AI, categorizes Agent safety issues into internal (Brain, Memory, Tool) and external (User, Agent, Environment) modules. It provides insights for future research directions such as collaborative security mechanisms and adversarial robustness training methods.