Adam Daniel

Freelance AI Engineer


May 28, 2026 1 min read

Quoting Anthropic: Opus 4.8 Safety “somewhat less robust”

Agentic safety. Although it shows improvements in some areas (such as refusing malicious requests), we found Opus 4.8 to be somewhat less robust than Opus 4.7 in several agentic contexts (such as vulnerability to prompt injection attacks). However, the application of our safeguards closes the gap between the models in practice. […]

May 13, 2026 5 min read

Introducing GHA-bench

GHA-bench is a benchmark and set of evals that measure how well different coding agents author and test GitHub Actions in different languages.


Latest Posts