Paper reading: Alignment-Enhanced Decoding: Defending via Token-Level Adaptive Refining of Probability Distributions
Paper reading: EEG-Defender: Defending against Jailbreak through Early Exit Generation of Large Language Models
Paper reading: PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition
Paper reading: SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding
Paper reading: Mitigating Large Language Model Hallucination with Faithful Finetuning
Paper reading: A Pathway Towards Responsible AI Generated Content