Discussion about this post

Tom

Excellent stuff! We need more wacky evals like this. Now we need to make this a popular benchmark so future models will train on "The Aristocrats" to get better scores 🤪

David Watkins

I thoroughly enjoyed this research, but was really looking forward to reading all the jokes that were crafted. At least the best (worst) one, please.
