Estimating worst case frontier risks of open weight LLMs

May 21, 2026 - 18:15
Updated: 18 hours ago
0 0
Estimating worst case frontier risks of open weight LLMs
In this paper, we study the worst-case frontier risks of releasing gpt-oss. We introduce malicious fine-tuning (MFT), where we attempt to elicit maximum capabilities by fine-tuning gpt-oss to be as capable as possible in two domains: biology and cybersecurity.

What's Your Reaction?

Like Like 0
Dislike Dislike 0
Love Love 0
Funny Funny 0
Wow Wow 0
Sad Sad 0
Angry Angry 0

Comments (0)

User