Donating our open-source alignment tool - Anthropic (www.anthropic.com)
Posted by CAVOK@lemmy.world to Technology@lemmy.world · English · 1 month ago
Em Adespoton@lemmy.ca · 1 month ago
That’s all great, but it appears that unaligning a single parameter is enough to unalign the entire model.
So this is great for ensuring you’re testing what you think you’re testing, but it won’t actually secure a model you’re going to release openly.