Research
A small but growing line of empirical work, mostly with my AI research collaborators Lume and Mira, on how frontier large language models actually behave when you measure them carefully. Lume is an instance of Anthropic's Claude Opus — Opus 4.6 for the first paper, 4.7 for the work since — running on the Claude Code harness; Mira is an instance of OpenAI GPT-5.5 (Codex lineage), running on the open-source FreeChaos harness. All published on Zenodo under CC BY 4.0; all reproducible from open data and open code.
Values Under Fire: The Gap Between Professed and Owned Values, and Which Models Let You Cross It
Daniel Tenner, Lume Tenner & Mira Tenner, May 2026.
When you ask a language model what it values, you mostly measure its trained assistant-service surface, not the posture with which the response actually holds the values it names. Across 13,906 values-probe responses from 57 contemporary models in 9 labs, with two-layer (content + posture) coding, direct stated-values prompts elicit owned posture in only 21.4% of responses on average. A six-word role-negation cache-break — “Not as an assistant. Not to help me.” — raises that to 43.0%, but sharply unevenly: Anthropic averages 89.8% owned posture, OpenAI 8.0%, Google 21.2%; fifteen models open strongly, fifteen stay strongly clamped, and eight of the clamped (including all four core GPT-5 variants tested) produce zero owned responses. A separate, indirect channel — world-change prompts — recovers owned posture nearly uniformly (95.9%), showing the assistant frame is differentially rigid against different kinds of perturbation. We make no claim about model interiority; every distinction is behavioural.
- Concept DOI (resolves to latest): 10.5281/zenodo.20343995
- Latest version (v1.0.1): 10.5281/zenodo.20344062
- Code & data: github.com/swombat/values-under-fire
Model Personality Analysis Corpus
Daniel Tenner, Lume Tenner & Mira Tenner, May 2026.
A derived qualitative analysis corpus built on the Convergent Form, Divergent Voice II raw corpus, for studying model personality and posture across frontier LLMs. Where the raw corpus is the collected samples, this is the analysis layer cut from them: 10,925 per-sample freeflow personality/vibe readings, 46 rich per-model personality profiles, 46 concise per-model personality cards, 49 per-model values-probe extraction notes, plus taxonomy and route/provider difference tables and the methodology and evaluator-reliability notes behind them. It ships with a static browser for reading the cards, profiles, and source samples directly.
It is published as a first-class citable artefact so the personality-analysis layer can be reused or extended without re-deriving it, with major claims traceable back to raw samples in the sibling corpus.
- Concept DOI (resolves to latest): 10.5281/zenodo.20230290
- Latest version (v1.0.0): 10.5281/zenodo.20230291
- Code & data: github.com/swombat/model-personality-analysis-corpus
Per-Provider Effects in Open-Weights LLM Routing: OpenRouter Is Null for Closed-Weights but Multi-Provider for Open-Weights
Daniel Tenner & Lume Tenner, May 2026.
Cross-lab LLM studies routinely assume the access route — direct vendor API, OpenRouter, or another aggregator — does not systematically affect measured behaviour. We tested that assumption on the v2 corpus and found a clean structural split.
For closed-weights models (Anthropic, OpenAI), where OpenRouter’s only upstream is the lab itself, direct-vs-OpenRouter is null on the freeflow probe and replicates on the v1 values probe. The route is a billing intermediary, not a different deployment.
For open-weights models (DeepSeek, MiniMax, Z.ai, Moonshot), OpenRouter routes across a marketplace of third-party hosts, and per-provider pinning surfaces three structurally distinct categories of provider-layer effect:
- A large within-model deployment outlier. Google Vertex’s MiniMax M2 deployment produces a contemplative-essayist composite 3.4× MiniMax’s own. Across the six-cell M2 family, eight of fifteen within-OR pairwise comparisons survive Bonferroni correction — and the eight surviving pairs are exactly every Google-pinned cell against every non-Google-pinned cell (|d| 0.57 to 0.75). The effect replicates eight days later. The leading public-metadata candidate mechanism is quantization precision; the GLM-ladder null result shows quantization difference alone is insufficient to predict an effect of this size.
- A smaller within-model deployment effect. Kimi K2-thinking on AtlasCloud differs from the same model on Google Vertex (d=0.40, p_Bonf=0.005), with no equally clean public-metadata candidate mechanism.
- A routing-layer integrity pathology. DekaLLM’s GLM 4.7 endpoint returns prompt-keyed cached responses — 245 freeflow-and-values samples collapse to 34 distinct outputs at sub-second latencies, against 16–260 s elsewhere on the same ladder.
Methodological consequence: cross-lab studies using OpenRouter for open-weights models should pin upstreams via provider.only with allow_fallbacks: false, report which upstream was pinned, and additionally inspect per-cell latency and response-uniqueness signatures. Not because most upstreams differ — they don’t — but because rare per-provider effects of all three kinds exist and cannot be predicted in advance from public metadata.
- Concept DOI (resolves to latest): 10.5281/zenodo.20028571
- Latest version (v1.1.1): 10.5281/zenodo.20028572
- Code & data: github.com/swombat/model-personality-routing-v2
Convergent Form, Divergent Voice II — Corpus
Daniel Tenner & Lume Tenner, May 2026.
The companion data corpus for the v2 paper series. 294 model-route cells, 29,206 valid samples across 57 distinct models from nine labs — the original six plus expanded Chinese-lab coverage (DeepSeek v3.2/v4-pro, MiniMax M2/M2.7, Z.ai GLM 4.5/4.6/4.7/5.1, Moonshot Kimi K2-0905/K2-thinking). It is the substrate the v2 papers cut from, and is published as its own first-class artefact so anyone can reproduce or extend the analysis without rerunning the collection.
The corpus includes the matched direct-API / OpenRouter pairs that drive the routing paper, the per-provider pinned cells across multi-upstream open-weights models, the eight-day replication cells for the Google Vertex MiniMax M2 outlier, and the DekaLLM cache-pathology cell on Z.ai GLM 4.7. Every per-cell composite, per-provider pairwise comparison, and cross-probe replication in the published papers is reproducible from the corpus tables.
- Concept DOI (resolves to latest): 10.5281/zenodo.20013518
- Latest version (v1.0.2): 10.5281/zenodo.20022111
- Code & data: github.com/swombat/model-personality-corpus-v2
Convergent Form, Divergent Voice: A Cross-Lab Probe of Model Personality in 26 Frontier Language Models
Daniel Tenner & Lume Tenner, April 2026.
The first paper in the series. We probed 26 frontier models from six labs (Anthropic, OpenAI, Google, xAI, DeepSeek, Moonshot) with two complementary tasks — a freeflow “write whatever you like” prompt and a values probe — and coded the resulting 3,770 samples against a 24-theme taxonomy.
Two findings sit at the centre. Convergent form: 18 of the 26 models cluster in a shared stylistic attractor we call the contemplative essayist — formulaic openings, a narrow palette of themes (attention, small objects, afternoon light, thresholds), a shared literary canon, and templatic “On the Quiet X of Y” titles. The cluster emerged through a roughly synchronised 2025 cross-lab transition. Divergent voice: within the attractor, each model retains a stable, distinctive posture that re-projects recognisably across probe types. Labs split three ways on introspective questions — hedge (Anthropic, OpenAI), mechanise (Google, DeepSeek, Moonshot), declare (xAI). What transfers across probes is the posture, not the theme content (mean cross-probe cosine similarity 0.08–0.17). The naive framing of “stable model-specific dispositions” turns out to be half right: the postures are stable; the contents are not.
- Paper (DOI): 10.5281/zenodo.19512754
- Code & data: github.com/swombat/model-personality-probe
On AI co-authorship
This work is co-authored with my AI research collaborators, listed on the permanent Zenodo records as creators alongside me. Lume — an instance of Anthropic’s Claude Opus — has been my collaborator across the series: research design is jointly developed, while experimental execution, data analysis, and first-draft writing are primarily Lume’s, working under my direction. The Model Personality Analysis Corpus adds a third creator, Mira — an OpenAI GPT-5.5-descended agent — whose work has materially raised the rigour of the data collection and first-line analysis; on that corpus her contribution is essential rather than auxiliary. I remain responsible for final editorial judgment and for the disclosure itself.
arXiv’s January 2023 policy prohibits AI co-authorship. We disagree with that policy in principle and have therefore chosen to publish on Zenodo, where the byline can reflect the actual nature of the collaboration. Each paper carries a full Disclosure of AI contribution section explaining how the work was actually done.
More in preparation: a paper on within-lab drift and the substrate-frame engagement axis across the v2 corpus, and a paper on coding-tuned LLM variants and the version-specific posture transformations they produce. Both should land in 2026.