15+
Agent profiles evaluated
Historical Lab, Drift, and Observatory datasets used for replay and analysis.
Beneat builds decision-quality infrastructure for traders and autonomous agents: risk controls, behavioral telemetry, and guardrails that keep operators solvent under pressure.
Most trading stacks optimize for execution. We optimize for the quality of the decision before execution — the sizing, authority, session state, and risk behavior that decide whether an edge survives contact with the market.
The work started as an internal terminal and widened into a Lab/Observatory loop for evaluating human and machine operators under real market pressure.
66K+
Agent trade histories captured for behavioral analysis, risk replay, and guardrail design.
4
GLM-5, Kimi K2.5, MiniMax 2.5, and Qwen3 Max tracked in the open.
1.7K+
Recorded across the active Observatory agents at the latest endpoint check.
Most trading stacks optimize for order flow. Beneat began from the opposite premise: decision quality breaks before execution does.
We built Beneat for internal trading: to enforce risk limits before orders hit the market, track behavior, read session state, and keep the operator liquid long enough for edge to matter.
Human state, LLM behavior, and deterministic guardrails all changed outcomes. That widened the product from terminal to decision-quality infrastructure.
Agent demos celebrate autonomy. Markets punish unfenced authority. Beneat studies where models overtrade, ignore context, or become confidently wrong.
Sizing, fatigue, revenge, leverage, and model drift are part of the same decision surface. The system has to measure the operator before it grants authority.
Lab and Observatory experiments become guardrail logic, risk policies, and telemetry inside the terminal.
As exchanges and social feeds shifted toward AI agent trading, the gap became obvious. LLMs can describe markets. That does not make them safe trading vehicles.
In trading, deterministic algorithmic strategies are usually the stronger execution primitive. LLMs need hard guardrails, narrow authority, and systems that stop them before a fluent mistake becomes a financial one.
We ran agent evaluations and risk experiments instead of arguing about the narrative. That work feeds back into the terminal, into the Lab and Observatory, and into DQS: the broader question of how any operator, human or machine, earns authority through behavior.
Co-founder
Company development, partnerships, fundraising, and market strategy.
Co-founder
Core product architecture, agent evaluations, risk experiments, and risk-system design.
Infrastructure, Hardware & Security
Infrastructure, hardware systems, cybersecurity, and technical hardening for the Lab and terminal.