Kolena, a startup building tools to test, benchmark and validate the performance of AI models, today announced that it raised $15 million in a funding round led by Lobby Capital with participation ...
Penetration testing is undergoing a structural shift. For years, automation meant running scanners faster or scripting ...
The U.K. AI Safety Institute, the U.K.’s recently established AI safety body, has released a toolset designed to “strengthen AI safety” by making it easier for industry, research organizations and ...
Anthropic PBC is doubling down on artificial intelligence safety with the release of a new open-source tool that uses AI agents to audit the behavior of large language models. It’s designed to ...
The acquisition points to rising demand for tools that test and secure LLMs before they are deployed in enterprise workflows.
Claude Code Skills 2.0 adds evals and benchmark test sets; the changes aim to keep skills reliable as models update over time.