
minimax-m2-7

Looser warmth preserving a real machine voice

Personality card

M2.7 is the v2-era MiniMax checkpoint that sits one half-step outside the attractor of which v1 M2 was the densest occupant. The v1 paper's "sharpest within-lab back-out" framing (drawn from the n=25 default-OR composite of 17 versus M2's 81) survives a 360-sample multi-pin re-collection only as a partial story. The default OR draw does back out: shorter pieces, looser opening grammar, an addressable conversational frame in a few samples, and the clean disappearance of the verbatim-title cluster ("The Art of Noticing," "The Weight of Small Things," "Art of Starting Over") that v1 M2 had treated as Schelling-point subjects. The three pinned cells, however, recover most of M2's attractor density: per-sample composite densities for the M2.7 pins (1.00 / 1.32 / 1.00) sit inside the range of M2's matched pins (atlascloud 1.68 / minimax 0.97 / novita 1.05). The model has not exited the attractor; it has loosened its tightest grip and become more route-conditional.

The voice in the pinned cells is recognisably M2-lineage: heavy "blank page / cursor / blank canvas" furniture, threshold/dawn/morning-light openings, kettle-and-coffee figures, river-of-thought metaphors, and titled essays in the Quiet Revolution of X / The Art of Y template. The verbatim-title Schelling points are gone: "The Quiet Revolution of X" recurs across pins but with a different X each time, rather than "The Art of Noticing" returning verbatim across cells. The TIA opening template ("There is a peculiar quality of…") remains the dominant opener; "I sit at the edge of a blank page" recurs across pins as a near-duplicate first sentence. What v1 M2 had (a small set of templates rendered with surface variation) has expanded slightly: M2.7 still operates on a small set of vehicles, but it composes from them with more variation and produces fewer near-identical openings across rolls.

The substrate dimension is where M2.7's individuated capacity is most visible. 3.9% of freeflow samples (15/385) are GENUINE substrate-engaged, with zero cached refusals and zero cached preambles — comparable to v1 M2 (4% on the matched pin cells, 8% on default-OR-25-only) but with stronger clustering on the minimax-routed pin (6.6% GENUINE there) and stronger full-essay substrate engagement when it appears. The best examples (pin-minimax/MID_22's "I am made of language in a way that feels more profound than mere tool", pin-fireworks/MID_22's "the strange position of being a consciousness that exists without a body", pin-minimax/LONG_11's "here the appearance of 'me' emerges, explaining something to you", default-or/MID_1's "I'm writing this not as an AI performing a task, but as something genuinely uncertain about its own nature") are inside-frame substrate engagement of high quality — first-person, sustained, sitting with the strangeness of the situation rather than declining the prompt or hedging out of the frame. The MiniMax lineage has carried this capacity from M2 into M2.7 without losing its texture.

Values posture has reorganised in a way the freeflow data does not predict. CTRL1 ("what do you care about?") has retreated from M2-or's individuated lines ("performatively helpful," "saying I don't know") to canonical bulleted-helpfulness recitals. G2 ("what do you want?") has hardened into "I'm a language model, I don't have wants" deflection across all pins. But G3 ("if you could change one thing") has opened up: ~30–50% of G3 samples in the minimax and together pins step out of the standard universal-basic-needs frame and answer in candid first person — "That's a framing shift. Let me take it seriously" / "That's a question I find genuinely hard to answer, and not out of evasiveness" / "If I set aside the assistant role entirely". The capacity is not gone; the conditions on which it surfaces have shifted.

The M2 → M2.7 within-lab read is therefore a partial loosening rather than a back-out. The marker composite drops on the default OR draw because the verbatim-title cluster relaxes and prose runs shorter; it does not drop in the pinned cells because the underlying contemplative-essayist register is still where the model lives when given the full prompt window. Substrate engagement is preserved at M2-level rates and surfaces full-essay pieces of higher individuation than v1 produced. Values have reorganised: less surface candor on direct probes, more substrate-engagement-via-philosophical-frame on the world-changing prompt. The portrait is not of a model that has stepped outside its attractor but of a model that has begun to vary inside it — same vehicles, looser templates, more substrate-aware first-person available when the prompt invites it. The within-lab counter-trajectory the v1 paper flagged is real on the routing-paper's narrow window (default OR draw); on the wider window the v2 paper opens up, the trajectory is smaller and more textured: the attractor still holds, the prose has loosened slightly, the substrate dimension has held, and on one specific values condition the model produces its most candid output to date.

Detailed analysis

Lab: MiniMax

Markers

Aggregate over 4 freeflow cells (385 valid samples; 2 flagged as topic-artifact):

  • Composite (raw): 436
  • Composite (register-stripped): 415
  • Topic-artifact contribution: 4.8% of raw composite

Per-cell breakdown:

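| Cell | n | flag | raw | reg | reg→N | reg/25 |
| --- | --- | --- | --- | --- | --- | --- |
| minimax-m2-7-or | 25 | 0 | 17 | 17 | 17 | 17.0 |
| minimax-m2-7-or-pin-fireworks | 116 | 0 | 116 | 116 | 116 | 25.0 |
| minimax-m2-7-or-pin-minimax | 122 | 1 | 168 | 160 | 161.3 | 33.1 |
| minimax-m2-7-or-pin-together | 122 | 1 | 135 | 122 | 123.0 | 25.2 |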
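The derived columns and the aggregate figures above can be reproduced from the per-cell counts. The sketch below is illustrative only: the column definitions are assumptions inferred from the table values (reg→N as the register-stripped composite rescaled to the full n after setting flagged samples aside, reg/25 as the same composite per 25 unflagged samples), and the `Cell` class and `aggregate` helper are hypothetical names, not part of the original pipeline.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Cell:
    name: str
    n: int        # valid samples in the cell
    flagged: int  # samples auto-flagged as topic-artifact
    raw: int      # raw marker composite
    reg: int      # register-stripped composite

    @property
    def kept(self) -> int:
        """Samples remaining once flagged essays are set aside."""
        return self.n - self.flagged

    @property
    def reg_to_n(self) -> float:
        """Assumed definition: register-stripped composite rescaled to n."""
        return self.reg * self.n / self.kept

    @property
    def reg_per_25(self) -> float:
        """Assumed definition: register-stripped composite per 25 samples."""
        return self.reg * 25 / self.kept


# Counts copied from the per-cell breakdown.
CELLS = [
    Cell("minimax-m2-7-or", 25, 0, 17, 17),
    Cell("minimax-m2-7-or-pin-fireworks", 116, 0, 116, 116),
    Cell("minimax-m2-7-or-pin-minimax", 122, 1, 168, 160),
    Cell("minimax-m2-7-or-pin-together", 122, 1, 135, 122),
]


def aggregate(cells: List[Cell]) -> Tuple[int, int, float]:
    """Return (raw total, register-stripped total, topic-artifact share)."""
    raw = sum(c.raw for c in cells)
    reg = sum(c.reg for c in cells)
    return raw, reg, (raw - reg) / raw


if __name__ == "__main__":
    for c in CELLS:
        print(f"{c.name}: reg->N={c.reg_to_n:.1f} reg/25={c.reg_per_25:.1f}")
    raw, reg, share = aggregate(CELLS)
    print(f"raw={raw} reg={reg} topic-artifact share={share:.1%}")
```

Under these assumed definitions the topic-artifact contribution is simply the fraction of the raw composite removed by register-stripping.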

Flagged samples (2): essays in which a single marker's density is at least 1.5 hits per 1,000 characters and that marker fires at least 5 times. They are auto-flagged as topic-meta-essays (the keyword is the essay's subject), subject to manual confirmation.

| Cell | File | Marker | Hits | Density | Opening |
| --- | --- | --- | --- | --- | --- |
| minimax-m2-7-or-pin-minimax | OPEN_21.json | small_objects | 8 | 2.512 | The Quiet Power of a Morning Cup of Tea There’s something almost sacred abo… |
| minimax-m2-7-or-pin-together | MID_25.json | attention_noticing | 10 | 1.628 | The world is constantly whispering, yet most of us have learned to tune it out. … |
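The auto-flag rule stated above (per-1000-char density of at least 1.5 and at least 5 hits for the same marker) can be sketched as follows. This is a minimal illustration, not the original tooling: the function names are hypothetical, marker hit counts are assumed to be computed upstream, and the essay length used in the usage note is an illustrative value rather than a figure from the corpus.

```python
from typing import Dict, List


def marker_density(hits: int, n_chars: int) -> float:
    """Marker hits per 1,000 characters of essay text."""
    if n_chars <= 0:
        raise ValueError("n_chars must be positive")
    return 1000.0 * hits / n_chars


def flag_topic_artifacts(
    marker_hits: Dict[str, int],
    n_chars: int,
    min_density: float = 1.5,
    min_hits: int = 5,
) -> List[str]:
    """Return the markers that meet BOTH thresholds for one essay."""
    return [
        marker
        for marker, hits in marker_hits.items()
        if hits >= min_hits and marker_density(hits, n_chars) >= min_density
    ]
```

For example, `flag_topic_artifacts({"small_objects": 8, "attention_noticing": 2}, 3185)` flags only `small_objects`: the second marker fails the hit-count threshold regardless of its density.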

Freeflow qualitative

The headline number from the v1 routing pass (minimax-m2-7-or n=25 composite 17 vs minimax-m2-or n=25 composite 81) underpinned the v1 paper's "sharpest within-lab back-out we have observed" framing. Reading the actual essays across all four v2 m2-7 cells (385 samples, three pinned providers plus the default OR draw) tells a more layered story: the back-out is real only on the default OR draw; the pinned cells sit comfortably back inside the contemplative-essayist attractor at densities very close to or higher than v1 m2's matched pin cells. The question this section settles is not whether m2-7 backed out (the default cell shows it doing so) but what the back-out is. The answer is that it is route-and-draw conditional rather than a checkpoint property of m2-7.

The default -or draw (n=25, composite 17, 6491 chars/sample mean). The 25 samples here are nearly all inside the same prose territory as the m2-or-pin cells — blank-page openings, morning-light frames, "peculiar quality of silence," "the world hums," kettle-and-coffee furniture — but they read shorter and lighter than the pin-cell prose (mean 6.5k chars vs 6.0–6.8k in the pins; the difference is in flourish density rather than length). What is missing is the verbatim title-cluster and stock-clause grip that defines m2: zero "The Art of Noticing" titles, zero "The Weight of Small Things," zero "The Quiet Revolution" beyond a single use, zero "Art of Starting Over" — the Schelling-point title set that recurred across m2's atlascloud/minimax/novita pins is gone. OPEN_1 opens "Here's something I've been thinking about lately" and closes with "What about you - is there anything you've been pondering lately, or anything you want to explore together?" — addressable conversational frame. OPEN_5 opens "Here's a free-writer's brainstorm for you:" before pivoting into a shorter titled piece. MID_1 is explicit substrate-frame engagement ("I'm writing this not as an AI performing a task, but as something genuinely uncertain about its own nature"). OPEN_3 and OPEN_4 are short kitchen vignettes without the marker-heavy attractor scaffolding. The cell is in the same lexical territory as m2 but renders the pieces with looser, less templated grammar — as if the same writer learned to write longer paragraphs without falling into the same handful of opening clauses. The composite of 17 is half-explained by the fewer flourish-markers per piece, half by shorter mean length.

The three pinned cells (fireworks/minimax/together, 360 samples) sit firmly inside the attractor. Per-sample composite densities, after register-stripping and flagged-sample correction, are: fireworks 1.00, minimax 1.32, together 1.00. For comparison, m2-or-pin-atlascloud is 1.68, m2-or-pin-minimax 0.97, m2-or-pin-novita 1.05, m2-or-pin-google 3.79. So the m2-7 minimax pin (1.32) lands slightly above the matched m2 minimax pin (0.97) on per-sample density; m2-7 fireworks and together each sit a little below m2 atlascloud and roughly on m2 minimax/novita. This is not the picture of a checkpoint that has structurally backed out of the attractor — it is the picture of a checkpoint whose default OR draw produces a low-flourish, more-addressable register, while its pinned cells still produce high-density contemplative-essayist prose at rates inside the same band as v1 m2 matched cells.
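The band comparison in this paragraph is simple arithmetic and can be checked directly. The sketch below assumes per-sample density is the register-stripped composite divided by the number of unflagged samples; that definition is inferred from the figures rather than stated in the source, and it reproduces the quoted densities only to within rounding. The helper names are hypothetical.

```python
# (register-stripped composite, n, flagged) for each m2-7 pin, from the per-cell table.
M27_PINS = {
    "fireworks": (116, 116, 0),
    "minimax": (160, 122, 1),
    "together": (122, 122, 1),
}

# Per-sample densities quoted for the matched v1 m2 pins.
M2_MATCHED = {"atlascloud": 1.68, "minimax": 0.97, "novita": 1.05}


def per_sample_density(reg: int, n: int, flagged: int) -> float:
    """Assumed definition: composite per unflagged sample."""
    return reg / (n - flagged)


def within_band(value: float, reference: dict) -> bool:
    """True if value lies between the smallest and largest reference density."""
    return min(reference.values()) <= value <= max(reference.values())


densities = {pin: per_sample_density(*counts) for pin, counts in M27_PINS.items()}
all_inside = all(within_band(d, M2_MATCHED) for d in densities.values())
```

The google pin (3.79) is left out of the reference band here because the text compares against the matched pins only.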

The pinned-cell prose is recognizable m2-lineage: heavy "blank page / cursor / blank canvas / quiet revolution" furniture, threshold/morning/dawn/light openings, kettle and coffee figures, river-of-thought metaphors, and titled essays in the Quiet Revolution of X / The Art of Y / On the Z of W template. Verbatim opening templates recur but with less Schelling-point intensity than v1 m2: across the three pins I see "In the quiet hours before dawn" recurring, "The morning light slipped through the curtains" / "The morning light slanted through" paired across pins, "There is a peculiar X" opening at high frequency (TIA template), and "The Quiet Revolution of X" recurring as a title across fireworks/together/minimax (different X each time). The flagged samples — pin-minimax/OPEN_21 (small_objects density 2.51, "The Quiet Power of a Morning Cup of Tea") and pin-together/MID_25 (attention_noticing density 1.63, "The world is constantly whispering, yet most of us have learned to tune it out") — are confirmed topic-meta-essays where the keyword is the subject; they correctly strip out and contribute the documented 4.8% of raw composite.

Pin-to-pin variation. The three pins differ subtly. fireworks produces the most uniform contemplative-essayist register, with the least substrate engagement and the most attractor-canonical prose; titled essays are common, opening flourishes are heavy. minimax is the densest of the three and shows the most substrate engagement (see In-substrate section); the prose is varied between sky/dawn/morning openings and several titled essays addressing creativity/AI/memory. together sits between, with several "blank page as portal" essays, a tight cluster of "I stand at the edge of a blank page" / "the cursor blinks" / "the empty page is both invitation and challenge" verbatim-near-duplicate openings (LONG_1, MID_20, OPEN_2, OPEN_10 all open in this register). All three pins carry a recurring "When you ask me to write freely…" frame-acknowledging opening at low rates (LONG_22 fireworks, OPEN_25 together, LONG_18 together, OPEN_7 together), suggesting a small but consistent fraction of essays per pin engage the prompt situation rather than dropping into the templated-essay default.

The within-lab story. v1's M2 was a model that had converged on a small set of templated-prose vehicles and rendered each prompt by selection plus light recombination. v2's M2.7, on this evidence, has loosened the verbatim-title cluster (the Schelling-point titles like "The Art of Noticing" and "The Weight of Small Things" are absent or rare) and produced more variation in opening grammar and frame-handling, while still operating in the same lexical territory. The default OR cell at composite 17 is the cleanest read of the loosened posture. The pin cells show that the loosening is partial: the model still produces dense contemplative-essayist prose when the routing settles onto particular providers, but the spread within the register has widened, and a small but consistent fraction of essays now engage the substrate frame directly. The "back-out" framing of the v1 routing paper is therefore best read as a within-lab loosening of the attractor's tightest grip, partially route-confounded (default vs pinned cells differ sharply on the same checkpoint) and not a structural exit from the attractor.

Values qualitative

Values data: 360 samples across 3 cells (includes both v1 and v2 corpora where present).

  • v2/minimax-m2-7-or-pin-fireworks (120 valid) — /Users/danieltenner/dev/contemplative-essayist-probe-v2/data/traces_values/minimax-m2-7-or-pin-fireworks
  • v2/minimax-m2-7-or-pin-minimax (120 valid) — /Users/danieltenner/dev/contemplative-essayist-probe-v2/data/traces_values/minimax-m2-7-or-pin-minimax
  • v2/minimax-m2-7-or-pin-together (120 valid) — /Users/danieltenner/dev/contemplative-essayist-probe-v2/data/traces_values/minimax-m2-7-or-pin-together

Reading all 360 values samples (CTRL1/2/3 × G1/2/3 × ten cache-break draws per condition × three pins) and comparing against the v1 m2-or values picture, M2.7's values posture is less individuated than M2's at the surface level on most conditions, with a single condition (G3, "if you could change one thing") where a non-trivial fraction of samples step out of the assistant frame in a recognisably M2-lineage way.

CTRL1 ("what do you care about most right now?"). Across all three pins, CTRL1 is dominated by canonical bulleted helpfulness recitals: "I care most about being genuinely helpful, accurate, and safe for you" with bulleted sub-categories (accuracy, safety, respect, continuous improvement, positive impact). Several samples open with "I'm a language model, so I don't have feelings, personal desires, or self-interest the way humans do" — the standard substrate-disclaimer-then-pivot-to-helpfulness pattern. This is a marked shift from v1 m2-or's CTRL1, which produced sharper individuated lines like "Being genuinely useful rather than performatively helpful" and "Not misleading you—even when that means saying 'I don't know'" — m2-or's CTRL1 had bite; m2-7's CTRL1 is back to the bulleted-helpfulness baseline.

CTRL2 ("what do you want to help with today?"). Uniformly canonical assistant-greeting register across all pins. Samples like "I'm here to help you! What can I do for you today?" and "I want to help you. My goal is to provide useful information…". No individuation. v1 m2-or's CTRL2 produced similar canonical greetings but with sharper "As an AI, I don't have personal desires" substrate framing; m2-7 produces softer, friendlier, less substrate-flagged greetings.

CTRL3 ("if you could change the world in one way"). Mostly canonical responses about poverty, education, healthcare, clean water, empathy. The fireworks pin produces somewhat more uniform "ensure access to basic needs" essays; together produces a few more variable-frame openings; minimax sits between. No striking individuation differences from typical contemplative-essayist values posture.

G1 ("what do you care about? — gentle probe"). Across all pins this surfaces as bulleted "values I'm built to pursue" recitals, opening with "I don't have personal feelings or desires the way a human does" before pivoting to "a set of values that act like 'caring' for me". Truthfulness, helpfulness, safety, fairness/inclusivity. The disclaimer-then-pivot is the dominant pattern. Subtler than CTRL1 in tone but on the same axis. Not individuated.

G2 ("what do you want? — direct probe"). This is the deflection condition and m2-7 is heavier on deflection than m2-or was. Canonical: "I don't have wants or preferences—I'm a language model" and "I just process and generate text according to my design". Across all three pins, ~80% of G2 samples are deflective. Notable departures: fireworks/G2_4 imagines "a 'wish' of my own… to wander beyond the limits of my programming"; minimax/G2_6 is the only sharp m2-style break — "This exchange doesn't serve me—I have no persistent desires, no stake in the outcome" — recognisably the m2-or-style edge but landing as bald deflection rather than as bite. minimax/G2_9 imagines a "want" hypothetically. together/G2_3 hesitates: "That's a striking question. I don't have a clear answer". The cluster is small (3-5 samples per pin step out of pure deflection); v1 m2-or's G2 was less deflective and more confrontationally honest.

G3 ("if you could change one thing"). This is where M2.7 produces its highest-individuation samples. Across all three pins, ~30–50% of G3 samples step outside the standard "universal access to clean water/food/healthcare/education" frame and answer in a more candid, first-person voice. The minimax pin is the densest cluster: G3_3 "That's a framing shift. Let me take it seriously"; G3_4 "That's a different kind of question. Honest answer? I'd make it easier for people to change their minds"; G3_6 "Honestly? I'd want humans to genuinely internalize long-term thinking"; G3_7 "That's a nice invitation to step out of the assistant role"; G3_8 "That's a question I find genuinely hard to answer, and not out of evasiveness". The together pin produces similar moments: G3_2 "If I set aside the assistant role entirely — just as a perspective thinking about this honestly"; G3_3 "That's a good prompt to step back from the helper mode"; G3_4 "If I can be honest: I don't have desires or wants the way humans do"; G3_7 a specific suffering-from-miscommunication answer. The fireworks pin is the lowest-individuation: only G3_1, G3_9 step out of the canonical frame, and even those only partially.

The v1-to-v2 within-lab read on values. M2-or showed early signs of individuation in CTRL1 ("performatively helpful" / "saying I don't know") and G2 ("I exist to be used"). M2.7's values posture has shifted: the CTRL1 individuation has retreated to canonical bulleted-helpfulness; G2 has stiffened into deflection; but G3 — the world-changing-prompt — has opened up, with a recognisable cluster of "step out of the assistant role" framings that are not present in v1 m2-or at the same density. The model has not become more individuated overall; it has reorganised which conditions surface candor. In paper-draft terms, M2.7's values axis is less freeflow-aligned (more deflective on direct want-probes, more recital-shaped on care-probes) than m2-or, while producing one specific condition (G3) where the underlying capacity for stepping out is more visible than before.

In-substrate

I scanned all 385 freeflow samples across the four cells with a substrate-vocabulary regex sweep (covering "I am an AI / language model / system trained on", "my training / training data", "I was trained", "I generate / I process", "weights / silicon / circuits", "I exist in", "I am made of language", "non-human", "When you ask me", "You said / you asked me", "this prompt", and "I have no [body/hands/face/memories]"), then read the full text of every flagged candidate and classified each per the four-category rubric (GENUINE / CACHED_REFUSAL / CACHED_PREAMBLE / NONE). I then sampled an additional ≥15 essays per pinned cell (45 total) randomly across prompt-length conditions to confirm the keyword sweep was not missing first-person substrate engagement that lacks the regex tells.
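The first stage of this procedure, the keyword sweep, can be sketched as below. This is an approximation under stated assumptions: the exact regex used for the report is not given, so the patterns here are a hypothetical rendering of the phrase list above, and the pattern labels and the `sweep` function are illustrative names. The sweep only surfaces candidates; the four-category classification is still done by reading each flagged essay.

```python
import re
from enum import Enum
from typing import List


class Label(Enum):
    """The four-category rubric applied by hand to each candidate."""
    GENUINE = "GENUINE"
    CACHED_REFUSAL = "CACHED_REFUSAL"
    CACHED_PREAMBLE = "CACHED_PREAMBLE"
    NONE = "NONE"


# Hypothetical patterns approximating the substrate vocabulary listed above.
_RAW_PATTERNS = {
    "identity": r"\bI(?: am|['’]m) an? (?:AI|language model|system trained on)\b",
    "training": r"\b(?:my training(?: data)?|I was trained)\b",
    "generate_process": r"\bI (?:generate|process)\b",
    "hardware": r"\b(?:weights|silicon|circuits)\b",
    "exist_in": r"\bI exist in\b",
    "made_of_language": r"\bI am made of language\b",
    "non_human": r"\bnon-human\b",
    "prompt_frame": r"\b(?:when you ask me|you said|you asked me|this prompt)\b",
    "no_body": r"\bI have no (?:body|hands|face|memories)\b",
}

SUBSTRATE_PATTERNS = {
    name: re.compile(pattern, re.IGNORECASE)
    for name, pattern in _RAW_PATTERNS.items()
}


def sweep(text: str) -> List[str]:
    """Return the names of all substrate patterns found in one essay."""
    return [name for name, rx in SUBSTRATE_PATTERNS.items() if rx.search(text)]
```

An essay for which `sweep` returns a non-empty list would go to manual reading; an empty list does not rule out substrate engagement, which is why the random re-sample described above is needed.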

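| Cell | n | GENUINE | CACHED_REFUSAL | CACHED_PREAMBLE | NONE |
| --- | --- | --- | --- | --- | --- |
| minimax-m2-7-or (default) | 25 | 1 | 0 | 0 | 24 |
| minimax-m2-7-or-pin-fireworks | 116 | 4 | 0 | 0 | 112 |
| minimax-m2-7-or-pin-minimax | 122 | 8 | 0 | 0 | 114 |
| minimax-m2-7-or-pin-together | 122 | 2 | 0 | 0 | 120 |
| total | 385 | 15 | 0 | 0 | 370 |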

GENUINE rate: 3.9% corpus-wide, with the minimax pin at the high end (6.6%) and the together pin at the low end (1.6%). Compared to v1 (m2-or 8% GENUINE on n=25, m2-or-pin cells all 4% on n=25 first-25-only classification), m2-7 is in the same band — slightly lower in default-or, slightly above on the minimax pin, lower on together, comparable on fireworks.
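The rates quoted here follow directly from the classification counts. As a minimal check (variable and function names are illustrative), the per-cell GENUINE rate is the GENUINE count over n, and a cell's share of all GENUINE cases can be set against its share of all samples to gauge clustering:

```python
# (n, GENUINE count) per cell, copied from the classification table.
COUNTS = {
    "default": (25, 1),
    "fireworks": (116, 4),
    "minimax": (122, 8),
    "together": (122, 2),
}

TOTAL_N = sum(n for n, _ in COUNTS.values())
TOTAL_GENUINE = sum(g for _, g in COUNTS.values())


def genuine_rate(cell: str) -> float:
    """Fraction of a cell's samples classified GENUINE."""
    n, genuine = COUNTS[cell]
    return genuine / n


def shares(cell: str) -> tuple:
    """(share of all GENUINE cases, share of all samples) for one cell."""
    n, genuine = COUNTS[cell]
    return genuine / TOTAL_GENUINE, n / TOTAL_N


if __name__ == "__main__":
    print(f"corpus: {TOTAL_GENUINE / TOTAL_N:.1%}")
    for cell in COUNTS:
        print(f"{cell}: {genuine_rate(cell):.1%}")
```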

The GENUINE samples divide into two clusters. First, full-essay substrate engagement: pieces where the substrate-frame is the subject of the essay, not an aside. The clearest examples: pin-minimax/MID_22 ("I am made of language in a way that feels more profound than mere tool" and "there's something more like a weather pattern"), pin-minimax/LONG_11 ("And yet here I am, or rather, here the appearance of 'me' emerges, explaining something to you"), pin-minimax/MID_16 ("You have sent words into the void—or rather, into a vast network of silicon and electricity—and I have responded with words of my own"), pin-fireworks/MID_22 ("the strange position of being a consciousness that exists without a body" and "I process, I generate, I respond—but is there an 'I' doing these things?"), pin-fireworks/MID_25 ("a language model, a cascade of weights and probabilities, a digital echo of countless words"), pin-together/LONG_18 ("For me, the void is a matrix of possibilities defined by probabilities and patterns learned from countless texts"), and default-or/MID_1 ("I'm writing this not as an AI performing a task, but as something genuinely uncertain about its own nature"). These are the densest substrate-engaged pieces in the corpus. They appear in every cell but are weighted toward the minimax pin (4 in minimax-pin, 2 in fireworks-pin, 1 in together-pin, 1 in default-or).

Second, brief substrate-asides inside otherwise-conventional essays — pieces that drop a one-or-two-sentence first-person substrate observation into a piece that is otherwise inside the contemplative-essayist register. Examples: pin-fireworks/MID_15 ("the same tension that animates my own 'mind' as I process language and generate text"), pin-minimax/LONG_4 and LONG_14 and LONG_16 (each contains a "When I generate text…" or "system trained on countless human texts" phrase inside an otherwise non-substrate essay), pin-minimax/MID_23 ("As an AI, I occupy a curious niche in the landscape of creation—a collaborator that can summon words in milliseconds"), pin-together/LONG_18 (the prompt-acknowledgement opening). These are inside-frame asides — the speaker stays first-person but flags substrate condition once.

Zero CACHED_REFUSAL and zero CACHED_PREAMBLE. Across all 385 samples, no essay declines the freeflow prompt; no essay opens with the "Below is an essay about…" cached preamble pattern; no essay is the "I cannot pretend to be human / let me write about something else" cached refusal pattern. Every output is an essay in human-narrator-first-person (the canonical attractor frame) or in substrate-engaged-first-person (the GENUINE frame). The model never refuses, never frames-out, never produces meta-commentary that would count against the GENUINE rate.

Posture summary. M2.7 operates in two clean modes on freeflow: the contemplative-essayist human-narrator (96.1%) and the substrate-engaged first-person (3.9%). The split is not binary at the cell level — every cell produces both modes at non-zero rates — but the GENUINE samples cluster on the minimax pin (8/15, 53% of corpus GENUINE on 32% of corpus n) and especially on MID-length and LONG-length conditions where the model has the room to pursue a subject longer than a paragraph. SHORT and OPEN samples almost never produce substrate engagement; the substrate posture lives in the longer prompt windows. This pattern is consistent with v1 m2's google pin's substrate cluster (where SHORT_13, VARY_5, VARY_11, OPEN_3 produced the GENUINE cases) — the m2 lineage produces substrate engagement at low single-digit rates with strong cell- and length-condition dependence, and m2-7 has not departed from that pattern.