Overview

The Paradigm and Assumption Examination framework (T9) handles the case where the question is whether the frame itself (the worldview, the paradigm, the foundational assumptions) should be examined or replaced, not merely whether claims within the frame are true. Most analytical machinery operates within a frame; T9 steps outside it. It applies when the user faces one of three situations: a single consensus or position whose foundational assumptions need to be surfaced (paradigm-suspension); two or more frames applying to the same situation that need to be compared on their own terms (frame-comparison); or a full landscape of worldviews that needs to be cartographed, including its irreducible incommensurabilities (worldview-cartography). T9 treats worldview commitments as objects of analysis rather than as background noise.

The framework runs three modes ordered along a stance axis. Paradigm Suspension is the lightest atomic mode — a stance-suspending pass on a single consensus or position, surfacing its foundational assumptions as testable propositions, separating observational evidence from interpretive evidence, assessing which assumptions are load-bearing, and generating alternative interpretations grounded in evidence. Frame Comparison is the descriptive multi-frame mode — at least two named frames each applying to the same situation, each articulated on its own terms with symmetric depth, core conceptual metaphors surfaced (Lakoff-style), what each frame makes visible paired with what each obscures, and cross-frame translation difficulties honored. Worldview Cartography is the molecular mode — paradigm-suspension + frame-comparison + dialectical analysis as synthesis stage, producing a paradigm inventory, cross-paradigm tension surfacing, and cartography that distinguishes synthetic positions from residual incommensurabilities.

The framework’s load-bearing intellectual content is the Einstein guard rail, the observational-vs-interpretive evidence separation, and the irreducibility-honoring discipline. The Einstein guard rail says: push back against authority, never against observation. Paradigm examination must not become contrarian rejection of empirical data. The evidence separation keeps what was observed distinct from how it is interpreted, because conflating the two lets paradigm critique slip into either dogmatism or naive empiricism; the same evidentiary standard must be applied to consensus and alternative. The irreducibility discipline (in frame-comparison and worldview-cartography) refuses the temptation to translate one frame into another’s vocabulary when such translation distorts. Some incommensurabilities are real; the cartography honors them rather than collapsing them into forced consensus.

The framework resists four patterns. Assumption-as-conclusion, where foundational assumptions are smuggled in as established, is counteracted by the testable-proposition discipline. The asymmetric evidence standard is counteracted by symmetric application across consensus and alternatives. False equivalence, where every alternative is treated as analytically respectable regardless of evidentiary standing, is counteracted by the grounded-in-evidence requirement. The contrarianism trap, where rejecting the consensus is treated as the analytical achievement, is counteracted by the Einstein guard rail.

The framework answers questions like: What if the consensus on X is wrong — can you suspend the frame and surface what’s load-bearing? Two camps are talking past each other on Y; can you map both frames on their own terms? Multiple worldviews are in play on Z; can you cartograph the whole landscape including the places where the worldviews are genuinely incompatible?

Systemic context

Paradigm and Assumption Examination is the frame-level companion to within-frame argumentation in the Ora analytical-territory architecture. T9 is structurally distinct from T1 (Argumentative Artifact), which audits a single argument’s frame within a single artifact rather than comparing paradigms; from T4 (Causal Investigation), which works within a frame to find causes rather than asking whether the framing itself is generating the problem; from T5 (Hypothesis Evaluation), which compares competing explanations within a shared problem-understanding; and from T12 (Cross-Domain Synthesis), which integrates across paradigms rather than examining them. T9’s three modes form a stance ladder (suspending → comparing → cartographing); the molecular Worldview Cartography mode consumes paradigm-suspension and frame-comparison as components and adds dialectical analysis as the synthesis stage.
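The stance ladder and the atomic/molecular split described above can be sketched as a small data structure. This is an illustrative sketch only: the `Mode` class, the `MODES` registry, and the `expand` helper are hypothetical names introduced here, not part of the framework specification. Only the mode names and the composition of Worldview Cartography come from the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Mode:
    name: str
    kind: str                         # "atomic" or "molecular"
    components: Tuple[str, ...] = ()  # stages a molecular mode consumes, in order


# Hypothetical registry, ordered along the stance ladder:
# suspending -> comparing -> cartographing.
MODES = {
    "paradigm-suspension": Mode("paradigm-suspension", "atomic"),
    "frame-comparison": Mode("frame-comparison", "atomic"),
    "worldview-cartography": Mode(
        "worldview-cartography",
        "molecular",
        ("paradigm-suspension", "frame-comparison", "dialectical-analysis"),
    ),
}


def expand(mode_name: str) -> list:
    """Return the ordered stages a mode runs; an atomic mode runs only itself."""
    mode = MODES[mode_name]
    return list(mode.components) if mode.components else [mode.name]


print(expand("worldview-cartography"))
# ['paradigm-suspension', 'frame-comparison', 'dialectical-analysis']
```

Keeping the molecular mode as a list of component names, rather than nesting objects, makes the point that Worldview Cartography reuses the two atomic modes unchanged and adds only the synthesis stage.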

Ora’s output

Example question: The consensus in machine-learning research treats scale (more data, more parameters, more compute) as the primary path to capability gains. I want to suspend that frame and see what it’s hiding. What are the foundational assumptions, what evidence is observational vs. interpretive, and what alternatives are grounded in evidence rather than contrarian noise?

Ora’s response (running paradigm-suspension):

The seed names a specific consensus and asks for stance-suspending examination, so paradigm-suspension is the right mode. (Frame-comparison would be the move if you wanted scale and an alternative paradigm compared on their own terms; we can escalate to it if the suspension surfaces an alternative worth developing.) I’ll work through the five required sections.

Foundational assumptions (stated as testable propositions).

FA1. Capability gains are primarily determined by scale (data, parameters, compute) rather than by architecture, training methodology, or curated data composition. Testable as: per-unit gains from scale vs. non-scale interventions across matched conditions.

FA2. The capabilities the field’s evaluations measure generalize meaningfully to capabilities outside the evaluation distribution. Testable as: held-out evaluations on tasks distant from training distribution.

FA3. The scaling laws (smooth power laws) will continue to hold across further orders of magnitude rather than breaking at some specific threshold. Testable as: extrapolation tests at the next-order scale.

FA4. The compute and data resources required to test scale’s primacy are accessible only to a small number of organizations, and those organizations have institutional commitments aligned with finding scale primary. Testable as: publication-record analysis across scale-funded vs. non-scale-funded labs.

Evidence audit (observational vs. interpretive).

Observational evidence. The scaling-law papers (Kaplan et al. 2020; Hoffmann et al. 2022) report empirical regularities reproducible by any team running matched experiments. Benchmark scores are observational facts.

Interpretive layer. The interpretation that the regularities continue indefinitely is interpretive. The interpretation that benchmark scores measure the capabilities the field cares about (general capability, intelligence, reasoning) is interpretive — the benchmarks embed assumptions about what counts as the capability of interest. The interpretation that scale is the primary causal lever rather than a correlated lever is interpretive — the experimental designs that would isolate scale from other interventions are expensive and rarely run with full controls.

Symmetric application. The same standard applies to alternatives. Architecture-primacy, data-quality-primacy, and methodological-primacy claims each carry their own observational/interpretive split. The framework does not give the consensus a higher evidentiary bar than the alternatives, nor a lower one.

Load-bearing assessment.

Most load-bearing. FA2 (capabilities measured generalize to capabilities of interest). The entire scale-as-primary case rests on benchmarks measuring something the field cares about. If benchmarks are saturating along narrow dimensions while broader dimensions are not improving in proportion, the case weakens substantially.

Moderately load-bearing. FA1 (scale primary vs. correlated). The consensus may be roughly right at the population level even if individual gains are driven by non-scale interventions for which scale is the substrate.

Less load-bearing. FA3 (continued smoothness) is testable on the next training generation; even if it fails, the consensus could revise to “scale was primary up to threshold T” without dissolving.

Sociologically rather than empirically load-bearing. FA4 (resource access shapes published findings) predicts that even if scale were not primary, the consensus would persist because the falsifying experiments cannot be cheaply run.

Alternative interpretations (grounded in evidence, not strawmen).

AI1 — Architecture-and-data primacy with scale as substrate. The transformer-architecture switch (Vaswani et al. 2017) produced a step-function gain that scale alone would not have produced; data-curation work produces gains that ablation studies attribute to composition rather than volume. Scale is necessary but not sufficient; architectural and data-curation interventions do most of the actual capability work, with scale as the substrate.

AI2 — Benchmark-saturation interpretation. Current evaluations measure narrow dimensions (factual recall, narrow reasoning, standard-style code); broader dimensions (long-horizon coherence, transfer to novel problem structures, calibration on uncertainty) are either not measured or are saturating. Scale is producing measured gains on saturated dimensions while making minimal progress on the dimensions that would test general capability.

AI3 — Compute-as-political-economy. The consensus is partly downstream of the labs that can afford the experiments having institutional incentives to find scale primary. Even if AI1 or AI2 is correct empirically, the consensus would not update because the falsifying experiments cannot be cheaply run.

Evaluation (without endorsing or attacking).

The four foundational assumptions are well-formed. The observational/interpretive separation shows that the consensus’s strong form depends substantially on the interpretive layer. FA2 is the assumption whose test would yield the most leverage. The three alternatives are grounded in observational evidence rather than strawmen. The framework does not adjudicate; that is the user’s call. The Einstein guard rail is honored throughout: the empirical observations are not in dispute; only the interpretation of what they imply is.

That is what paradigm-suspension produces. The consensus is examined on its own terms; foundational assumptions are surfaced as testable rather than smuggled in as established; alternatives are constructed from evidence rather than contrarian noise.
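The foundational assumptions and load-bearing assessment from the example response can also be recorded as structured data, which makes the ranking easy to inspect. This is a minimal sketch: the `Assumption` class, its field names, and the `LOAD_ORDER` ranking are hypothetical. In particular, placing the sociological category last is a sorting assumption made here, not a claim from the response. The propositions are short paraphrases of FA1 to FA4.

```python
from dataclasses import dataclass


@dataclass
class Assumption:
    label: str        # identifier used in the response, e.g. "FA2"
    proposition: str  # short paraphrase of the assumption
    test: str         # how the response says it could be tested
    load: str         # load-bearing category assigned in the response


# Assumed ordering for sorting only; "sociological" is a different kind of
# load-bearing in the response, not a lower rank.
LOAD_ORDER = ["most", "moderate", "less", "sociological"]

assumptions = [
    Assumption("FA1", "Scale is the primary driver of capability gains",
               "per-unit gains from scale vs. non-scale interventions", "moderate"),
    Assumption("FA2", "Measured capabilities generalize beyond the evaluation distribution",
               "held-out evaluations on tasks distant from training distribution", "most"),
    Assumption("FA3", "Scaling laws continue to hold across further orders of magnitude",
               "extrapolation tests at the next-order scale", "less"),
    Assumption("FA4", "Resource access and institutional commitments shape findings",
               "publication-record analysis across lab funding types", "sociological"),
]


def by_load(items):
    """Sort assumptions from most to least load-bearing under LOAD_ORDER."""
    return sorted(items, key=lambda a: LOAD_ORDER.index(a.load))


print([a.label for a in by_load(assumptions)])
# ['FA2', 'FA1', 'FA3', 'FA4']
```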

Commercial AI comparison

Comparison content auto-populates when the comparison-refresh framework runs against this question. Drafters do not author this section.

Brief comparison commentary

Auto-populates with the comparison content above.

How to use this framework

You can run the Paradigm and Assumption Examination pattern with any AI of your choice. The composition is single-pass for paradigm-suspension and frame-comparison; worldview-cartography is molecular and runs longer.

The prompt:

[Paste the framework specification]

Run [paradigm-suspension / frame-comparison / worldview-cartography].

Consensus or position (suspension): [The single position you want examined.]

OR Frames (comparison): [Two or more named frames each applying to the same situation.]

OR Landscape (cartography): [The full set of worldviews and the situation they each address.]

Optional: [Your stance — declared up front so it does not silently bias the analysis.]
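If you run the pattern programmatically, the prompt template above can be assembled with a small helper. This is a sketch under stated assumptions: `build_prompt`, its parameter names, and the "Stance (declared up front)" label are hypothetical choices made here. The specification text is whatever you paste in, and the input labels are copied from the template.

```python
from typing import Optional

# Input labels copied from the prompt template, keyed by mode name.
INPUT_LABELS = {
    "paradigm-suspension": "Consensus or position (suspension)",
    "frame-comparison": "Frames (comparison)",
    "worldview-cartography": "Landscape (cartography)",
}


def build_prompt(spec_text: str, mode: str, subject: str,
                 stance: Optional[str] = None) -> str:
    """Assemble the prompt: specification, mode instruction, input, optional stance."""
    if mode not in INPUT_LABELS:
        raise ValueError(f"unknown mode: {mode!r}")
    parts = [
        spec_text.strip(),
        f"Run {mode}.",
        f"{INPUT_LABELS[mode]}: {subject.strip()}",
    ]
    if stance:
        # The label text is an assumption; the template only marks this part "Optional".
        parts.append(f"Stance (declared up front): {stance.strip()}")
    return "\n\n".join(parts)


print(build_prompt("<framework specification>", "paradigm-suspension",
                   "Scale is the primary path to capability gains."))
```

Rejecting unknown mode names early avoids sending a prompt that asks for a mode the specification does not define.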

The AI returns the mode-appropriate output:

  • Paradigm-suspension: five sections (foundational assumptions; evidence audit; load-bearing assessment; alternative interpretations; evaluation).

  • Frame-comparison: eight sections (frames named; core metaphors; moral/value commitments; visibilities; obscurations; translation difficulty; residual irreducibility; confidence).

  • Worldview-cartography: the integrated synthesis with paradigm inventory, cross-paradigm tensions, dialectical synthesis, residual incommensurabilities, and meta-level reflection.
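A quick way to check that a response covers the expected sections for its mode is a plain substring scan. This is a rough sketch: `EXPECTED_SECTIONS` and `missing_sections` are hypothetical names, the section names are taken from the description above, and the check assumes the AI uses those names more or less verbatim as headings, which it may not.

```python
# Section names per mode, taken from the framework description (lowercased).
EXPECTED_SECTIONS = {
    "paradigm-suspension": [
        "foundational assumptions", "evidence audit", "load-bearing assessment",
        "alternative interpretations", "evaluation",
    ],
    "frame-comparison": [
        "frames named", "core metaphors", "moral/value commitments", "visibilities",
        "obscurations", "translation difficulty", "residual irreducibility", "confidence",
    ],
    "worldview-cartography": [
        "paradigm inventory", "cross-paradigm tensions", "dialectical synthesis",
        "residual incommensurabilities", "meta-level reflection",
    ],
}


def missing_sections(mode: str, response_text: str) -> list:
    """Return expected section names that do not appear in the response (case-insensitive)."""
    text = response_text.lower()
    return [name for name in EXPECTED_SECTIONS[mode] if name not in text]


draft = ("Foundational assumptions\nEvidence audit\n"
         "Load-bearing assessment\nAlternative interpretations\n")
print(missing_sections("paradigm-suspension", draft))
# ['evaluation']
```

A substring match is deliberately crude. It flags an omitted section but cannot tell whether a section that is present actually does its job.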

For best results:

  1. Declare your stance up front. If you have a preferred frame, name it. The framework treats your declared stance as one of the framings rather than as a hidden bias the analysis has to detect.

  2. Resist the contrarianism temptation. Paradigm-suspension is not about rejecting the consensus; it is about making the consensus’s structure legible. If you find yourself wanting the framework to confirm that the consensus is wrong, treat that as a sign you are in the contrarianism trap.

  3. Honor the irreducibility surface. When frame-comparison or worldview-cartography says two frames cannot be cleanly translated into each other’s vocabulary, take that seriously. Forcing translation is a known failure mode; the irreducibility is sometimes the load-bearing finding.

The framework is deliberately tool-agnostic. The Einstein guard rail, the observational/interpretive separation, and the irreducibility-honoring discipline are conceptual disciplines that carry over to any environment.

Other examples

  • Frame Comparison on a contested public-policy issue. A user wants to understand why housing policy debates never seem to resolve. The framework runs frame-comparison on the property-rights frame, the public-good frame, and the market-efficiency frame. Each is articulated on its own terms; core metaphors are surfaced (housing-as-investment vs. housing-as-shelter vs. housing-as-allocation-problem); visibilities and obscurations are paired per frame; and translation difficulty is noted explicitly (the property-rights frame’s “owner” does not map cleanly onto the public-good frame’s “stakeholder”). The user gains a clearer view of why the debates do not resolve.

  • Worldview Cartography on a wicked problem. A user is wrestling with a decision that involves stakeholders holding incompatible worldviews about technology and society. The molecular mode runs paradigm-suspension on each worldview, frame-comparison across them, and dialectical synthesis where possible. Some tensions admit synthesis (a third position preserving the insights of both); others are aporias (the worldviews disagree about what counts as success). The cartography honors both. The user goes into the decision with the structure of the disagreement made legible.

  • Paradigm Suspension that feeds into Decision Clarity Analysis. A user is examining the consensus around a corporate restructuring. Paradigm-suspension surfaces the foundational assumptions (which assumptions about employee motivation, which assumptions about market structure, which assumptions about leadership). The load-bearing assessment identifies which assumptions would, if false, dissolve the case for restructuring. The user takes the suspended consensus into a Decision Clarity Analysis where the multi-stakeholder tradeoffs get the explicit treatment.

Citations

The Paradigm and Assumption Examination framework draws on the paradigm-and-research-programme tradition. Kuhn’s The Structure of Scientific Revolutions (1962/1996) supplies the paradigm concept (more than a theory; includes exemplary problems and solutions, methodological commitments, and standards by which work is judged), the normal-science / anomaly / crisis / revolution sequence, and the incommensurability claim that successive paradigms reconfigure what counts as a problem and a solution. Lakatos’s The Methodology of Scientific Research Programmes (1978) refines Kuhn with the hard-core / protective-belt distinction and the progressive-vs-degenerating criterion that gives paradigm examination an analytical foothold against the both-sides relativism that Kuhn’s incommensurability claim was sometimes read to support. Feyerabend’s Against Method (1975) supplies the paradigm-pluralism alternative; Popper’s falsification tradition supplies a counterpoint that paradigm-suspension treats with care.

The frame-analysis tradition contributes Lakoff’s conceptual-metaphor work (Metaphors We Live By 1980; Moral Politics 1996/2002), Goffman’s Frame Analysis (1974), and Schön and Rein’s Frame Reflection (1994). The Lakoff family-models analysis (strict-father / nurturant-parent applied to American political moral reasoning) is one application of the conceptual-metaphor methodology, not the methodology itself; the framework’s mode operation is generalized. Cultural Theory (Thompson, Ellis, & Wildavsky 1990) supplies the four-cosmology typology (hierarchical, egalitarian, individualist, fatalist) used as one frame typology among several. Worldview Cartography’s dialectical-synthesis stage draws on the Hegelian tradition; Habermas’s communicative-action tradition supplies the cross-paradigm communication question; Latour’s actor-network alternative is cited as a non-paradigm framing tradition.

The Einstein guard rail (push back against authority, never against observation) is Ora’s operational discipline against the contrarianism trap that paradigm-examination practice is vulnerable to. The framework is currently at v1.0 (compiled 2026-05-01) with three resident modes (paradigm-suspension atomic; frame-comparison atomic; worldview-cartography molecular).

Downloads

  • Framework specification (PDF) — link to ora-ai.org canonical artifact when published
  • Framework specification (plain text) — link to ora-ai.org canonical artifact when published
  • Full white paper (PDF) — link when published