Agent Reliability
Detection and verification for autonomous AI. Methods to identify when agents hallucinate, fail, or strategically underperform.
Status: Papers Q1 2026

Nine focus areas spanning detection, verification, trust, and physical AI. Papers and code releasing Q1-Q2 2026.

Adversarial evaluation methods that resist gaming. Benchmarks that detect hidden capabilities and strategic behavior.
Status: Papers Q1 2026

Field-theoretic memory treating stored information as continuous fields governed by PDEs. Semantic diffusion, thermodynamic decay, and multi-agent field coupling. arXiv 2602.21220
Status: Published

Verifying AI outputs without ground truth. Methods for code, plans, and decisions from reasoning models such as o3 and R1.
Status: Papers Q1 2026

Practical interpretability for production. Not "understand the model" but "should I trust this output?"
Status: Active

Trust dynamics when agents coordinate with agents. Propagation, verification, and failure modes in multi-agent systems.
Status: Active

Attack taxonomies, detection methods, and defenses for agentic AI systems. Beyond prompt injection: tool poisoning, memory corruption, planning attacks, and coordination exploits in multi-agent systems.
Status: Active

Calibrated confidence for AI decision support. Activation-based uncertainty estimation, propagation through reasoning chains, and calibration methods that work without ground truth.
Status: Active

Beyond language models: AI that understands and predicts the physical world. World models, embodied reasoning, simulation, and physics-aware AI.
Status: Active

Every research area ships production-ready packages. AGPL-3.0 licensed, available on PyPI and npm.
Packages
Research Areas
Sandbagging detection via activation probes and behavioral analysis.
Field-theoretic memory for AI agents with PDE-based dynamics.
Evolutionary adversarial testing with quality-diversity optimization.
Neuro-symbolic verified code synthesis with Z3 and CE2P feedback.
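
To make the idea of PDE-based memory dynamics mentioned above more concrete, here is a minimal sketch of a one-dimensional memory field that spreads (diffusion) and fades (decay) over time. It is an illustration only: the equation du/dt = D * u_xx - k * u, the ring topology, and the names `step_field` and `evolve` are assumptions chosen for this example, not the published model or any package API.

```python
from typing import List


def step_field(u: List[float], diffusion: float, decay: float, dt: float) -> List[float]:
    """One explicit Euler step of du/dt = diffusion * u_xx - decay * u on a ring.

    Hypothetical helper: grid spacing is fixed at 1 and boundaries wrap around.
    """
    n = len(u)
    nxt = []
    for i in range(n):
        left, right = u[(i - 1) % n], u[(i + 1) % n]
        laplacian = left - 2.0 * u[i] + right  # discrete second derivative
        nxt.append(u[i] + dt * (diffusion * laplacian - decay * u[i]))
    return nxt


def evolve(u: List[float], steps: int, diffusion: float = 0.2,
           decay: float = 0.05, dt: float = 1.0) -> List[float]:
    """Apply step_field repeatedly; diffusion * dt should stay at or below 0.5 for stability."""
    for _ in range(steps):
        u = step_field(u, diffusion, decay, dt)
    return u


# A single memory written at position 8 of a 16-cell field.
field = [0.0] * 16
field[8] = 1.0
after = evolve(field, steps=10)
```

In this sketch the stored item leaks into neighbouring cells while the total field strength shrinks by a factor of (1 - decay * dt) per step, which is one simple way to picture semantic diffusion combined with decay.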