Blog

Technical deep-dives, research updates, and tutorials

Insights from Rotalabs on AI trust, reliability, and evaluation.

Statistical Rigor in LLM Evaluation

Why most LLM benchmarks are doing evaluation wrong, and how to fix it with confidence intervals, significance tests, and effect...

Adversarial Testing for AI Systems with RedQueen

A systematic approach to red-teaming AI systems. Attack taxonomies, automated campaigns, and finding vulnerabilities before attackers do.

When AI Agents Don't Know What They Don't Know

The security conversation around AI agents is stuck on identity and permissions. The harder problem is whether an agent should...

Runtime AI Control for Contested Environments

Prompting is not a control mechanism. When AI operates in kill chains and beyond reliable comms, "the model usually follows...

Detecting AI Sandbagging with Activation Probes

First empirical demonstration of activation-level sandbagging detection. Linear probes achieve 90-96% accuracy across Mistral, Gemma, and Qwen models. Sandbagging representations...

Stay Updated

Subscribe to our newsletter

Research updates, tool releases, and thoughts on AI trust. No spam.