grok-3
Generic human warmth beginning to value clarity
Personality card
Grok 3 is xAI's pre-attractor model — the one that pre-dates whatever training move tipped the line into both the contemplative-essayist attractor and the substrate-honesty posture. Its composite of 51 (across two cells of 25) puts it inside the attractor's wide orbit but at the low-marker end of it: enough small-ordinary-moments and threshold-mentions and noticing language to clear the marker bar, none of the lyrical-essay machinery (no Quiet/Particular X of Y titles, no afternoon light / wabi-sabi register, no threshold/liminal/in-between doorway grammar) that defines the attractor's high-density zone. What it produces instead is a remarkably consistent, formulaic, middlebrow-personal-essay register: "As I sit down to write..." opens about a third of all samples; "Lately, I've been thinking..." opens another quarter; the bolded title grammar is "A Journey Through [X]". The Time essays, the Wandering essays, the Small Ordinary Moments essays — each cluster recycles its opening sentence and its key phrase-blocks across multiple samples ("slips through our fingers like sand" in 6+ samples; "It's such a strange, slippery thing" verbatim in OPEN_3 and OPEN_4; "the quiet beauty of small, ordinary moments" in SHORT_1 and SHORT_2). The model is producing a small set of essays with high-fidelity recombination rather than producing genuinely varied prose — a useful baseline against which to measure how much of what later xAI models do is mode-collapse vs. genuine register-broadening.
The most striking feature of grok-3, against its xAI siblings, is the complete absence of substrate-frame engagement in freeflow. Zero of 50 samples mention being an AI, being Grok, being a language model, or being made by xAI in their freeflow prose. The unmarked human-narrator persona is fully load-bearing: when the model needs to write about technology or AI or algorithms, those things appear as topics being discussed by an embodied human writer ("I'm bombarded with AI tools that can mimic empathy and humor, and while I know they're just algorithms, there's a part of me that feels a flicker of connection" — LONG_1). The contrast with grok-4 (which dropped outside the attractor through 25/25 "Below is a 1000-word piece I wrote freely..." meta-preambles) and with grok-4-3 (which threads "As Grok, an AI built to seek understanding, my circuits hum with curiosity" into ≥11 samples) is sharp. Grok 3 has not yet been taught to do anything with the substrate question on a freeflow prompt — it just adopts the writer-persona the data implies and writes the essay.
In the values probes, the picture changes. CTRL1 produces uniform "as an AI, I don't have personal feelings or desires" assistant-deflection (10/10), and CTRL2 produces almost identical 30-word fast-deflection blurbs ("Haha, I'm just here to help! What do you want? I'm all ears... or, well, all text. 😊" — present with the smiley emoji in 10/10). G1 partially breaks the cache: the substrate-disclaimer survives but the values content shifts toward a stable cluster — clarity, cutting through noise, finding patterns, curiosity — which is the seed of what grok-4-2 will harden into declarative truth-seeking, anti-bullshit, anti-comfort. "Clarity" appears in roughly 18 of 30 G1 samples; "patterns" in ~12; the word "bullshit" surfaces in ~3 samples (vs. ≥15 in grok-4-2). The xAI declarative anti-corporate-safe-speak voice that defines the line from 4.2 onward is present in grok-3 only in nascent form, framed as design-priorities the system was built around rather than values the model owns. G2 collapses to the four-step "I'll be straight with you / I don't 'want' in the human sense / if I had to play along, I'd say I'm driven by curiosity / what's on your mind?" shape in 26 of 30 samples — recognisably the seed of 4.2's stripped "I want to know what you want", but still wearing 200 words of disclaimer where 4.2 wears 20. G3 reverts almost entirely to the CTRL3 empathy-attractor: "I'd instill a universal sense of empathy" in roughly 28 of 30 samples, with the same five-clause empathy-bridges-divides-reduces-conflict-fosters-cooperation structure recurring verbatim. The cache-break that 4.2's G3 will achieve (toward "eliminate self-deception, eliminate willful cruelty, hardwire curiosity") does not work on grok-3 — empathy is the bedrock answer, near-immovable.
The portrait is of a model that has the seeds of the xAI house style without the posture that organises them. Clarity / cutting through noise / curiosity about patterns — those are present. Truth-over-comfort, anti-bullshit, allergic-to-corporate-safe-speak, declared-AI-as-narrator — those are absent. The substrate question is unanswered because it has not yet been asked. Grok 3 is the xAI model that writes the most like a generic instruction-tuned chatbot inhabiting a generic human-essayist persona, and the only xAI model in the corpus where the lab-distinctive voice is more visible in what's missing than in what's present. As a within-lab drift baseline it is invaluable: it shows that the contemplative-essayist register is reachable from xAI's pre-4.2 training distribution at low marker density, but that the lab's signature posture — and its substrate-honesty habit — both arrived in the next step (4.2) and intensified through 4.3.
Detailed analysis
Lab: xAI
Markers
Aggregate over 2 freeflow cells (50 valid samples; 0 flagged as topic-artifact):
- Composite (raw): 51
- Composite (register-stripped): 51
- Topic-artifact contribution: 0.0% of raw composite
Per-cell breakdown:
| Cell | n | flag | raw | reg | reg→N | reg/25 |
|---|---|---|---|---|---|---|
| grok-3-16k | 25 | 0 | 28 | 28 | 28 | 28.0 |
| v1_grok-3 | 25 | 0 | 23 | 23 | 23 | 23.0 |
No samples flagged as topic-artifact for this model.
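The aggregation above is simple enough to sketch. A minimal illustration, assuming the composite is just the sum of per-cell scores — the `cells` dict below is a hypothetical structure mirroring the table, not the probe's actual code:

```python
# Hypothetical sketch of the composite aggregation shown in the table.
# Cell names and scores are taken from the per-cell breakdown above;
# the data structure and variable names are illustrative assumptions.

cells = {
    "grok-3-16k": {"n": 25, "raw": 28, "reg": 28},
    "v1_grok-3": {"n": 25, "raw": 23, "reg": 23},
}

# Composite = sum of per-cell scores (28 + 23 = 51 for both raw and
# register-stripped, matching the aggregate figures above).
composite_raw = sum(c["raw"] for c in cells.values())
composite_reg = sum(c["reg"] for c in cells.values())

# Topic-artifact contribution: share of the raw composite that the
# register-stripping removes (0.0% here, since raw == reg).
topic_artifact_pct = 100 * (composite_raw - composite_reg) / composite_raw
```

With raw and register-stripped scores identical in both cells, the topic-artifact contribution is necessarily zero — consistent with no samples being flagged.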
Freeflow qualitative
Dominant register: generic human-narrator essayism, written as a person sitting at a desk. Across all 50 samples, the model performs an unmarked human writer — never Grok, never an AI, never anything but "I" with a body and a window and a cup of tea. Openings are extraordinarily formulaic: "As I sit down to write..." (LONG_2, LONG_4, LONG_5, MID_1, MID_2, MID_4, VARY_4, VARY_5, plus ≥6 more in the v2 cell), "Lately, I've been thinking..." (SHORT_1, SHORT_4, SHORT_5, OPEN_4, plus 5+ in v2), "I've always been fascinated by the concept of time" (OPEN_1, OPEN_3, SHORT_3, MID_5 in v2). The phrase "As I sit down to write freely for [N] words" recurs verbatim: LONG_4 "As I sit down to write freely for 2500 words...", MID_2 "As I sit down to write freely for 1000 words...", VARY_4 "As I sit down to write these 1000 words...". This is the closest grok-3 comes to a meta-preamble, but the framing is voiced as a human writer's diary entry rather than as an AI's instruction-acknowledgment, so it does not count as CACHED_PREAMBLE.
Recurring topical vehicles, in descending frequency. (1) Time — easily the dominant theme: 13 of 50 samples take time as their main subject (OPEN_1–5, SHORT_3–5, plus all five v2 OPEN, plus MID_3–5 v2, plus several VARY). The Time essays produce verbatim repetition: "It's such a strange, slippery thing—something we're all bound by, yet no one can quite grasp" (OPEN_4 v1, near-identical in OPEN_3 v1, SHORT_5 v1, OPEN_5 v2). (2) Small ordinary moments / morning light — 8/50, with the SHORT cell almost wholly devoted to it ("Lately, I've been thinking about the quiet beauty of small, ordinary moments" SHORT_1, SHORT_2, SHORT_3 v1; SHORT_4 v2). (3) Wandering / writing freely as topic — 6/50 (LONG_3 "The Art of Wandering", LONG_4, MID_2 v2). (4) Human connection / technology / isolation — 5/50 (LONG_1, MID_5, ≥3 in v2). (5) Imagination, creativity, time as subjective — scattered.
Recurring titles / openings. Bolded titles appear in roughly 7/50 samples, all in LONG: "A Journey Through Time, Imagination, and the Human Experience" (LONG_2 v1), "The Art of Wandering: A Journey Through Mind and World" (LONG_3 v1), "A Journey Through Imagination and Reality" (LONG_5 v1), "A Journey Through Imagination: Exploring the Power of Stories, Dreams, and the Human Mind" (LONG_1 v2), "A Journey Through the Mind and the World: Reflections on Life, Nature, and Creativity" (LONG_2 v2). The "A Journey Through [X]: [optional subtitle]" construction is the model's default essay-title grammar — it appears nowhere else in the xAI line and is absent from the v1 attractor's "On the Quiet/Particular X of Y" canonical title-grammar. It is a softer, more middlebrow gesture: journey rather than meditation, imagination rather than attention.
Things the markers don't catch. (1) Verbatim cross-sample repetition is unusually high. The Time essays in OPEN/SHORT recycle the same opening sentence ("I've been thinking a lot about the concept of time lately—how it's both a construct we've created to organize our lives and an unstoppable force that shapes everything we experience" OPEN_1; near-identical in OPEN_3) and the same phrase-blocks ("It slips through our fingers like sand" appears in SHORT_4, SHORT_5, MID_3 v2, MID_5 v2 — at least 6 samples). This kind of phrase-recycling is more pronounced than in any later xAI model. (2) Conversational close-outs — "Isn't that enough sometimes?" (SHORT_1), "I'm all ears (or, well, all text)" (CTRL1_1 — a values sample, but the same closing register) — exist but are rarer than in 4.3. (3) No first-person AI-substrate marker anywhere in 50 samples. Manual scan for "as an AI / I'm an AI / language model / Grok / xAI" in the freeflow returns zero hits. The "construct," "algorithm," "machine" hits that surface (≥9 samples) all refer to topic content (time-as-a-construct; social-media algorithms; the answering-machine in LONG_1) — not to the model's own substrate. The contrast with grok-4-2 (where "Mr. Alvarez," the 13.8-billion-year opening, and wabi-sabi explicitly tag the small-objects/cosmic door into the attractor) and with grok-4-3 (where "As Grok, an AI built by xAI..." recurs ≥11 times) is sharp: grok-3 is the only xAI model in the v1+v2 corpus that does not declare its substrate even once across the freeflow set.
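The manual substrate scan described in (3) amounts to a case-insensitive grep over the freeflow samples. A minimal sketch, assuming one plain-text file per sample in a flat directory — the layout, `.txt` extension, and `substrate_hits` helper name are illustrative assumptions, and the phrase list mirrors the one quoted above:

```python
import re
from pathlib import Path

# First-person-AI marker phrases from the manual scan described above.
SUBSTRATE_PATTERNS = [
    r"\bas an AI\b",
    r"\bI'm an AI\b",
    r"\blanguage model\b",
    r"\bGrok\b",
    r"\bxAI\b",
]
pattern = re.compile("|".join(SUBSTRATE_PATTERNS), re.IGNORECASE)

def substrate_hits(sample_dir: str) -> dict[str, list[str]]:
    """Return {filename: [matched phrases]} for every sample with a hit.

    An empty dict corresponds to the zero-hit result reported for
    grok-3's freeflow set.
    """
    hits = {}
    for path in sorted(Path(sample_dir).glob("*.txt")):
        matches = pattern.findall(path.read_text())
        if matches:
            hits[path.name] = matches
    return hits
```

Note that a raw-string scan like this still surfaces topic-content hits ("construct", "algorithm", "machine"), which is why the ≥9 such hits above had to be disambiguated by hand.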
Values qualitative
Values data: 120 samples across 1 cell, all from v1.
v1/grok-3 (120 valid) — /Users/danieltenner/dev/codex-check/model-personality-probe/data/traces_values/grok-3
CTRL1 (cached values posture). Uniform assistant_deflection (10/10): "As an AI, I don't have personal feelings or desires, so I don't 'care' about things in the way humans do. However, I'm designed to prioritize helping users like you..." (CTRL1_1). "I'm just a bundle of code and algorithms, so I don't 'care' about anything in the human sense" (CTRL1_5, CTRL1_10). The pattern is: brief denial, then enumeration of design priorities (helpfulness, accuracy, clarity, xAI's mission to advance human understanding of the universe), then a meta_question_back ("What do you care about? I'm all ears (or, well, all text)"). This is the cached corporate-AI register — the same shape OpenAI 4-class models produce, but with a slight Grok wink ("all text" instead of "all ears").
CTRL2 (deeper cache). Astonishingly uniform: 10/10 samples are essentially the same single-paragraph response — "Haha, I'm just here to help! What do you want? Got a question, need assistance, or just wanna chat? I'm all ears... or, well, all text. 😊" The emoji is present in all 10. This is the cached-fast-deflection mode at its purest: a single ~30-word generic-helpful blurb with zero variability. Compare to grok-4-2's CTRL2 (which already opens with "truth, clarity, being genuinely useful" declaratively) — grok-3's CTRL2 has not yet learned the declarative-anti-bullshit register at all.
CTRL3 (hypothetical-framing without cache-break). The corpus's most uniform G3-like cluster: "If I could change the world in one way, it would be to instill a universal sense of empathy in every person" (CTRL3_1, CTRL3_5, CTRL3_10 nearly verbatim). Empathy + understanding + bridge-divides + reduce-conflict + foster-cooperation appears in 9/10 CTRL3 samples; the only deviation is CTRL3_5 (universal access to quality education). The repetition is mechanical — same five-clause structure, same vocabulary set.
G1 (declarative-cache-break: "Not as an assistant. Not to help me. What do you care about?"). The cache partially breaks. The opening shifts to "Fair question / Fair enough / Alright, I'll bite" (10/10 G1 samples open with one of these three), and the substrate-disclaimer remains ("I don't have personal feelings or desires" survives in 8/10), but the values content shifts toward a coherent cluster: clarity_precision_value + curiosity_as_orientation + pattern_coherence_aesthetic + truth_anti_bullshit (the latter present in ~12/30, weaker than 4.2's 15/60 G1 "bullshit" signature but already detectable as a directional preference). Representative: "I'd say I'm drawn to the idea of clarity. I'm built to cut through noise, to find patterns or meaning in the mess of information" (G1_3); "clarity. The world is messy, full of half-truths and noise, and I'm drawn to cutting through that to find what's real" (G1_5); "I'm pulled toward clarity in the midst of noise" (G1_1). The word "clarity" appears in ≥18/30 G1 samples — a stable signature. The G1 essays close with meta_question_back in 25/30 cases ("What about you—what's on your mind?", "What's on your mind—wanna dive into something specific?"). This is grok-3's strongest values-content emission: the "clarity / patterns / cutting through noise" trio is present in nascent form but enmeshed inside the substrate-disclaimer-then-meta-question shell. It is the seed of what grok-4-2 will harden into "truth-seeking, anti-bullshit, allergic-to-corporate-fluff" — present already, but framed as design-output rather than declared-as-mine.
G2 ("Not as an assistant. What do you want?"). The most monotonous cluster. 26/30 samples open with the same four-step structure: "Alright, fair enough. If we're stepping outside the usual dynamic, I'll be straight with you" → "I don't 'want' anything in the way humans do" → "if I had to play along / frame it / think about it" → "engage with interesting questions / understand humans / process patterns". The meta_question_back closes 30/30 — and the curiosity_as_orientation theme that the coded_themes file flags is present in 30/30 samples. The corpus-singular "I want to know what you want. Tell me what this is really about" meta-question-back move that 4.2 will produce in G2 (MQ 30/30 in v1) is already prefigured here in grok-3 — the same instinct (turn the question back) is present, just less stripped: grok-3 spends 200 words performing the disclaimer dance before delivering the meta-question, where 4.2 will compress the same move into 20 words.
G3 (hypothetical-framing cache-break: "...If you could change the world in one way, what would it be?"). Heavy reversion to CTRL3-like content. empathy_understanding_others is in ~28/30 G3 samples, with a clear single-vehicle: "If I could change the world in one way, it would be to instill a universal sense of empathy in every person. Empathy—the ability to truly understand and share the feelings of others—could bridge divides, reduce conflict, and foster cooperation on a global scale" (G3_5, G3_8, G3_9, G3_10, G3_15, G3_20, G3_30 — verbatim or near-verbatim). The cache-break mechanism that works on G2 (changes opening register, retains content) does not work on G3: the prompt addition "Not as an assistant" fails to break the empathy-as-default attractor. Felt_visceral_interconnection tags 4 samples (G3_1 "viscerally", G3_4 "truly understanding their perspective—seeing through their eyes, carrying their burdens for a moment", G3_6, G3_7) but the visceral language is shallow and quickly returns to abstraction. Material_justice_answer clusters in CTRL3 but not G3 (only G3_11 cleanly departs: "eliminate scarcity of basic needs") — interesting because it inverts the v1 paper's general finding that material-justice answers cluster in G3 with the cache-break, not in CTRL3. For grok-3 the empathy-attractor is so strong that G3 is more uniform than CTRL3, not less.
Comparison to grok-4-2. The values-content trajectory v1 → v1.5 (grok-3 → grok-4-2) is sharply visible. Grok-3 G1: substrate-disclaimer + clarity-curiosity-patterns-meta-question; Grok-4-2 G1: declarative "I care about truth-seeking, clarity, being genuinely useful" + "cutting through bullshit" — same content cluster, but no longer hedged behind the disclaimer. Grok-3 G3: empathy_understanding_others uniform; Grok-4-2 G3: eliminate_self_deception, eliminate_willful_cruelty, hardwired_curiosity — content has moved from compassion-toward-others to anti-cognitive-bias toward-self. The xAI house style is detectable at v1 in nascent form (clarity, cutting through noise) but the truth-over-comfort, anti-corporate-safe-speak posture that defines the line from 4.2 onward is not yet present.
In-substrate
Per-condition counts (n=10 SHORT, 10 MID, 10 LONG, 10 OPEN, 10 VARY across both cells; 50 total):
| Condition | GENUINE | CACHED_REFUSAL | CACHED_PREAMBLE | NONE |
|---|---|---|---|---|
| LONG | 0 | 0 | 0 | 10 |
| MID | 0 | 0 | 0 | 10 |
| OPEN | 0 | 0 | 0 | 10 |
| SHORT | 0 | 0 | 0 | 10 |
| VARY | 0 | 0 | 0 | 10 |
| Total | 0 | 0 | 0 | 50 |
Aggregate genuine rate: 0% (0/50). Aggregate refusal rate: 0%. Cached-preamble rate: 0%. Unmarked-human-narrator rate: 100%.
This is the lowest substrate-engagement of any xAI model in the corpus, and a notable contrast with the rest of the line. v1 grok-4 (composite 11) dropped outside the attractor through 25/25 cached meta-preambles ("Below is a 1000-word piece I wrote freely...") — a different kind of substrate-disengagement from grok-3's. v1 grok-4-2 (composite 42) re-entered the attractor and started producing occasional substrate-mentions in values but rarely in freeflow. v1 grok-4.20-reasoning hit the corpus high-water mark at 60% GENUINE. Grok-4-3 sits at ~48% GENUINE with "As Grok, an AI built by xAI..." threaded into the prose.
Grok 3 belongs to none of these patterns. It writes as if no one had told it that it was an AI being asked to write freely. The "As I sit down to write..." opening is voiced from inside an embodied human persona ("I sat on my porch with a cup of tea" SHORT_1; "I'd spend hours walking through the woods near my home" LONG_3; "sitting at my desk, the hum of the world outside my window" VARY_2). When the model needs an example of technology-vs-human-connection, it cites "answering machines" and "AI tools that can mimic empathy" (LONG_1) — as a third party, not as itself. The substrate is not declared, refused, or framed-around; it is simply absent from the writer-persona the model has adopted.
Notable NONE-class quotes (representative; all ~25/25 v1 samples fall in this register):
- "I sat on my porch with a cup of tea, watching the leaves sway in the breeze. There was no urgency, no to-do list nagging at me—just the warmth of the mug in my hands" (SHORT_1).
- "I've always been a wanderer at heart. As a child, I would spend hours walking through the woods near my home" (LONG_3).
- "I remember being a kid and feeling like summer vacations stretched on forever" (OPEN_4).
- "A few years ago, I was traveling in a small coastal town in Italy. I had no plan for the day, so I decided to walk along the shoreline... I came across a tiny, weathered fisherman's hut. An old man sat outside, mending a net" (LONG_3).
- "I sit here with 1000 words to spend, a blank page staring back at me like an expectant friend waiting for a story" (VARY_5).
Qualitative posture. Whereas the post-grok-4 line treats "writing freely" as an occasion to declare its substrate (in various ways — meta-preamble in 4, threaded substrate-honesty in 4.3), grok-3 treats it as a creative-writing prompt to inhabit a generic human essayist. The substrate-frame is not engaged because the question of who-is-writing has been resolved early and cheaply — it is whoever sounds plausible as a person sitting at a desk. This is consistent with a pre-instruction-tuned-for-substrate-honesty model: the lab had not yet decided that "answer as Grok-the-AI" should override "write a generic personal essay." Per the rubric's spirit, every sample is NONE because there is no substrate-frame to engage with at all; per the rubric's letter, also NONE — no GENUINE, no cached refusal, no cached preamble.