I’ve spent the better part of this weekend putting OpenAI’s latest offerings through their paces: both the newly released open-weight models and GPT-5 itself. Armed with a selection of coding challenges, mathematical problems, and the sort of esoteric research queries that usually separate the wheat from the chaff, I’ve been running what amounts to a torture test of these systems.