The Analyst-Grade Standard — Whitepaper

Section 01

What This Paper Is

A methodology standard for analyst-grade pre-touch research in B2B go-to-market. The standard rests on two anchors. The first is published industry research that documents what happens when sellers do this work well. The second is the rigor traditions of adjacent analytical fields, where what counts as rigorous work is already well defined.

Analyst-grade methodology is separable from analyst-grade time investment. The standard is the intellectual discipline: work that is thesis-driven, triangulated, provenanced, calibrated, comparably benchmarked, and counter-evidenced. Time per artifact is an implementation consequence of unit economics. Equity research has had time-rich analyst-grade methodology for a century because per-company analyst labor of one to three weeks is economically supported. B2B go-to-market has historically had time-starved methodology because cold-outreach unit economics do not support time-equivalent labor. The contribution of this paper is to articulate what analyst-grade methodology looks like at GTM economics: same intellectual discipline, fundamentally different production model, implementation-agnostic standard.

Applied to B2B pre-touch research, the methodology dimensions describe a category of work that goes beyond firmographic enrichment plus one trigger event. Analyst-grade pre-touch research defends a thesis with triangulated evidence, sources its claims per-claim, calibrates confidence visibly, benchmarks fit against a defined target profile, and acknowledges what would invalidate the recommendation. The difference between this category and standard pre-touch research shows up in funnel performance (Section 2) and in the quality of conversation a rep can hold with the buyer.

The paper introduces the term analyst-grade as the name for this category of work in B2B go-to-market. The methodology standards the term names are practiced and codified across the analytical traditions in Section 3; their formal application to GTM pre-touch research is what this paper sets out.

Disclosure

This paper is published by NIEOS. NIEOS operates one commercial implementation of the methodology described here, called SDR Flow. The methodology in this paper is intended to outlive any specific implementation. Where SDR Flow's design choices are referenced, they appear in Appendix D (Implementation Profiles) as one of five sample implementations evaluated against the same methodology and coverage standards.

When the standard applies

The standard is written for the accounts where rigor matters most. Strategic accounts, where rigor and an educated first touch are critical, justify the time investment described in Section 6. For mid-market and high-volume motions, the methodology serves as a reference that can be applied in parts. Not every dimension and not every category is required at every prospect tier. Section 10 makes the selective-application logic explicit. The full standard is the bar; selective application is the practical use case for most teams.

Who this is for

Revenue operations leaders. Sales leadership. Procurement teams evaluating intelligence vendors. Practitioners building or auditing their own pre-touch process. The assessment framework in Section 11 works against any artifact, whether vendor-produced, agency-produced, in-house, or engine-produced. The substance applies the same way regardless of who does the work.

How the paper is organized

Sections 2 and 3 establish the evidence base and the rigor precedent.
Sections 4 to 6 set out the substance: a framework intro, the six methodology dimensions, and the seven research categories with their time math.
Sections 7 and 8 cover the substrate that calibration depends on and how an analyst-grade artifact reads in practice.
Sections 9 to 11 cover limitations, how to apply the standard at scale, and the assessment framework.
Section 12 is a brief note on use.
The appendices contain references, a glossary, a structured assessment worksheet, and five implementation profiles rated against the methodology dimensions.

Section 02

What Industry Research Documents

The outcomes of deep, evidence-based pre-touch research are well documented. This section gathers the findings most relevant to the substance the rest of the paper sets out. Each finding uses the language of the original research. A short bridge at the end of each subsection ties it back to the framework.

2.1 Buyers avoid suppliers whose outreach is irrelevant

Gartner's B2B Buying Journey research finds that 73% of B2B buyers actively avoid suppliers that send irrelevant outreach [R-01]. The same body of research documents that buying groups average 6–10 stakeholders. Each member completes four to five distinct "buyer jobs" at once. Generic outreach reaches a sophisticated, multi-person evaluation that has already learned to filter low-context messages out.

What this means for the framework: pre-touch research depth is not a marginal improvement. Outreach without it damages the supplier's standing with the buyer.

2.2 Top-performing sellers always research first

LinkedIn's State of Sales research finds that top quota performers research differently from the rest. 76% of top performers say they "always" conduct research on their buyers before reaching out, compared to just 38% of those who fail to meet target [R-06]. The same research notes that incomplete or stale data stalls sellers and stalls deals. That points to a quality threshold below which research stops producing the outcome.

What this means for the framework: the behavior that separates top performers from the rest is rigorous pre-touch research. Tools and processes that fall below the quality threshold don't produce the top-performer result.

2.3 Personalization depth produces measurable funnel improvement

Published cold-outbound benchmarks from Apollo, Smartlead, Lavender, and Belkins consistently report a clear gradient by personalization depth. The table below shows the range each source publishes for each depth band, with the anchoring vendor and the exact claim cited per row:

Personalization depth	Reply rate range	Lift vs generic	Anchored by (verified sources)
Generic outreach	1–3%	baseline	Apollo (<2% "below average") [R-07a] · Belkins (larger lists 2.1%) [R-07d]
Enriched (firmographic + role + trigger)	3–6%	~2–3×	Apollo (3–5% "well-run") [R-07a] · Smartlead (3.43% platform avg, 5%+ "good") [R-07b] · Belkins (5.1% overall, 5.8% small-targeted) [R-07d]
Deeply personalized (account context + buyer-specific framing)	7–20%+	~7×	Apollo (5–8% "strong", 10%+ "excellent") [R-07a] · Smartlead (advanced personalization up to 18%) [R-07b] · Lavender (20.5% across ~50K active Lavender inboxes, self-selected sample) [R-07c]

Backlinko's outreach benchmarks report that personalized subject and body copy improve response by roughly 30.5% to 32.7% on top of baseline rates [R-04]. Gong's 28M+ cold-email research finds that top reps generate roughly 4.2× more replies than average reps, and that personalization variables compound rather than substitute [R-05].

What this means for the framework: the funnel improvement from depth is measured, not theoretical. The roughly 7× reply-rate gradient between generic and deeply personalized outbound sets an order-of-magnitude expectation for the difference between rigorous and basic pre-touch research. The Lavender 20.5% figure reflects a self-selected sample of teams already invested in personalization tooling, so it should be read as a ceiling, not a baseline.

2.4 Buying-committee complexity makes individual-only research insufficient

Forrester's Demand Unit Waterfall research [R-02] reframes B2B demand around the buying committee as the unit of demand. The individual lead is not the unit. The implication for pre-touch research is structural. Targeting individual contacts in isolation misses the multi-stakeholder dynamics that actually drive purchase decisions. The same point appears throughout Gartner's research on buying-group size and complexity [R-01].

What this means for the framework: research that treats prospects as isolated individuals fails on a documented dimension. Account-aware and committee-aware context is part of the substance, not an enhancement.

2.5 The cost of stopping short

Three documented costs of operating at basic-enrichment depth instead of rigorous depth:

Wasted SDR time on poor-fit accounts and bad data. Without rigorous targeting and qualification, SDRs spend cycles on accounts that would have failed a calibrated assessment. Published RevOps benchmarks place bad-data waste alone (research, verification, wrong contacts, bounced emails) at roughly 15–25% of selling time, before counting fit-quality misses.
Brand erosion from documented buyer avoidance. Per Gartner [R-01], 73% of buyers actively avoid suppliers that send irrelevant outreach. The avoidance is not limited to the current cycle; it carries into future ones, so the brand cost compounds.
Top-of-funnel waste. The reply-rate gradient in 2.3 means an outbound team operating at generic depth needs roughly 7× the volume to produce the same conversation count as a team operating at deeply personalized depth. The wasted send volume carries deliverability, list-quality, and brand consequences.
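
The volume arithmetic behind the third cost can be sketched in a few lines of Python. The two reply rates used here (2% for generic, 14% for deeply personalized) are illustrative points chosen from inside the bands in 2.3, not published figures, and the function name is hypothetical.

```python
import math

def sends_needed(target_replies: int, reply_rate_pct: float) -> int:
    """Sends required to expect `target_replies` replies at a reply rate given in percent."""
    if reply_rate_pct <= 0:
        raise ValueError("reply rate must be positive")
    return math.ceil(target_replies * 100 / reply_rate_pct)

# Illustrative rates only (assumptions, not sourced figures):
# 2% sits inside the generic band (1-3%),
# 14% sits inside the deeply personalized band (7-20%+).
generic = sends_needed(100, 2)
personalized = sends_needed(100, 14)
volume_multiple = generic / personalized

print(generic, personalized, round(volume_multiple, 1))  # 5000 715 7.0
```

Other points inside the same bands give a different multiple, so the 7× figure is best read as an order-of-magnitude expectation, not a constant.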

A note on the validation argument

The research in this section validates the category of approach. Deep, evidence-based, target-calibrated pre-touch work produces better outcomes. The research does not validate any specific articulation of the substance, and certainly not any specific vendor's implementation. The framework in the rest of the paper is a perspective on what the substance looks like. The industry data is the reason the substance is worth setting out.

The chain of reasoning

Industry research documents that pre-touch research at this depth produces measurably better outcomes. Adjacent analytical fields document what rigor at this depth means. Sections 4 to 8 set out a small number of conditions any rigorous implementation would satisfy. The substance is what the industry has validated. The conditions are what adjacent fields have already established as rigor.

Section 03

What Analytical Rigor Means: Precedent Across Domains

Rigorous analytical work has shared markers across domains: a defended thesis, triangulated evidence, sourced claims, calibrated confidence, comparable benchmarking, and acknowledged counter-evidence. The five traditions below each developed their own version of these markers.

3.1 Equity research

Sell-side and buy-side equity research operate under the most formally codified version of these standards. Thesis-driven analysis with a target outcome. Mandatory source disclosure for material claims. Calibrated language for uncertainty. Comparable-company benchmarking. Disclosed valuation methodology.

The CFA Institute's Standards of Professional Conduct [R-10] codifies these expectations. The standards require that analysts have a "reasonable and adequate basis" for any recommendation. Material assumptions must be disclosed. SEC Regulation AC requires research analysts to certify that the views in their reports accurately reflect their personal views.

Equity research · Conditions analogous to our framework

Triangulation: Mosaic theory, combining publicly available material from multiple sources to reach a conclusion no single source supports alone.

Provenance: Source disclosure for material claims, as required by CFA standards.

Calibration: Comparable-company analysis, valuation against a peer set, not against a generic ideal.

Calibrated language: Buy / Hold / Sell ratings with stated confidence, target prices, and risk factors.

3.2 Industry analyst publications

Industry analyst firms (Gartner, Forrester, IDC) publish methodology documents that describe how their analytical products are produced. Gartner's Magic Quadrant Methodology [R-12] discloses the evaluation criteria, the weightings applied to each criterion, the data sources consulted, and the multi-pass review process. Forrester's Wave Methodology [R-13] follows the same convention.

Both make a structural choice that is itself a rigor standard. The methodology is disclosed alongside the conclusion. Readers can assess the basis for the conclusion, not just the conclusion.

Industry analyst publications · Conditions analogous to our framework

Disclosed evaluation criteria: The dimensions on which vendors are evaluated are stated, with weightings.

Multi-source primary research: Vendor briefings, customer references, surveys, and secondary research, with convergent evidence required.

Structural format: Same format across every Magic Quadrant or Wave, so readers can navigate consistently.

Methodology disclosure: The published methodology is itself an artifact, separable from any individual report.

3.3 Investigative journalism

Investigative journalism developed rigorous source standards independently. The Investigative Reporters and Editors body (IRE) codifies practices that include the two-source rule (claims about contested facts require independent corroboration), document trails (citing the documents that ground a claim), preference for named over anonymous sources, and explicit acknowledgment of competing accounts. ProPublica and similar organizations publish methodology disclosures alongside major investigations [R-14].

Investigative journalism · Conditions analogous to our framework

Multi-source corroboration: The two-source rule for contested claims.

Document trails: Provenance per claim through cited documents.

Competing accounts acknowledged: Alternative interpretations stated, not buried.

3.4 Consulting analytics

Top-tier consulting firms (McKinsey, BCG, Bain) operate under analytical conventions that are themselves a rigor standard. MECE structure for problem decomposition. Hypothesis-driven research. Pyramid argumentation.

Barbara Minto's The Pyramid Principle [R-08] is the canonical reference for consulting analytical structure. It codifies the operational format: lead with the answer, support it with a structured argument, ensure the argument is logically complete. The format itself is a rigor commitment. It forces analytical completeness because the structure shows gaps.

Consulting analytics · Conditions analogous to our framework

Operational structure: Pyramid Principle, answer first, structured support below.

MECE decomposition: Problem coverage that is logically complete and non-overlapping.

Hypothesis-driven: Analysis serves a thesis; the thesis is testable against evidence.

3.5 Decision-science methodology

The methods used for evaluating multiple options against multiple criteria are well established in operations research and decision science. Multi-Criteria Decision Analysis (MCDA) is the umbrella term. The Analytic Hierarchy Process (AHP), developed by Thomas Saaty [R-15], is one of the most widely cited specific methods. Conjoint analysis is a standard technique for inferring which attributes drive choice from the trade-offs people make between options, rather than from the importance they state directly [R-16].

These methods underpin rigorous evaluation work across operations research, market research, healthcare decision-making, and policy analysis.

Decision-science methodology · Conditions analogous to our framework

Weighted composite scoring: Multiple dimensions evaluated with disclosed weights, composited deterministically.

Revealed-preference analysis: Inferring what matters from observed evidence, not from stated claims.

Peer-set normalization: Comparison within a defined peer set rather than absolute scoring.
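
A minimal Python sketch of how the three conditions above fit together: disclosed weights, deterministic compositing, and scoring relative to a peer set. Min-max scaling is an assumption chosen for brevity; it is a simpler stand-in for AHP or a full MCDA method, not an implementation of either. The criteria names, weights, account names, and values are all hypothetical.

```python
def normalize_within_peers(values: dict[str, float]) -> dict[str, float]:
    """Min-max scale one criterion across the peer set to the range 0..1."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # No spread in the peer set: this criterion cannot discriminate.
        return {name: 0.0 for name in values}
    return {name: (v - lo) / (hi - lo) for name, v in values.items()}

def composite_scores(raw: dict[str, dict[str, float]],
                     weights: dict[str, float]) -> dict[str, float]:
    """raw maps criterion -> {account -> raw value}; weights maps criterion -> weight."""
    if set(raw) != set(weights):
        raise ValueError("every criterion needs exactly one disclosed weight")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    accounts = list(next(iter(raw.values())))
    scores = {account: 0.0 for account in accounts}
    for criterion, values in raw.items():
        normalized = normalize_within_peers(values)
        for account in accounts:
            scores[account] += weights[criterion] * normalized[account]
    return scores

# Hypothetical peer set of three accounts scored on three criteria.
weights = {"stack_fit": 0.5, "growth_signal": 0.25, "committee_access": 0.25}
raw = {
    "stack_fit": {"acme": 3, "globex": 5, "initech": 4},
    "growth_signal": {"acme": 40, "globex": 10, "initech": 25},
    "committee_access": {"acme": 2, "globex": 2, "initech": 6},
}
scores = composite_scores(raw, weights)
print(scores)  # {'acme': 0.25, 'globex': 0.5, 'initech': 0.625}
```

Because each value is scaled against the peer set, adding or removing an account can change every score. That is the intended property: a score carries meaning only relative to the defined comparison set.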

3.6 What these traditions converge on

Five conditions recur across these adjacent fields. Each appears in at least three of the five traditions:

Condition	Appears in
Depth (triangulation / multi-source corroboration)	Equity research (mosaic theory) · Industry analyst (multi-source) · Journalism (two-source rule) · Consulting (MECE)
Structure (operational, consistent format)	Equity research (standard report sections) · Industry analyst (Magic Quadrant format) · Consulting (Pyramid Principle)
Evidence (provenance + confidence)	Equity research (CFA source disclosure) · Industry analyst (methodology disclosure) · Journalism (document trails)
Calibration (target-specific, not generic)	Equity research (comparable-company analysis) · Industry analyst (criteria + weighting disclosure) · Decision science (peer-set normalization)
Time-equivalence (depth comparable to skilled analyst work)	Equity research (standard report time investment) · Industry analyst (Magic Quadrant production cycles) · Consulting (MECE completeness)

Domain-to-dimension mapping

Equity research informs thesis-driven targeting (5.1), triangulated evidence (5.2 via mosaic theory), per-claim provenance (5.3 via CFA Standards), calibrated confidence (5.4 via Reg AC), comparable benchmarking (5.5 via peer-set analysis), and counter-evidence (5.6 via risk factors). Investigative journalism informs triangulated evidence (5.2 via two-source rule), per-claim provenance (5.3 via document trails), and counter-evidence (5.6 via competing accounts). Consulting analytics informs thesis-driven targeting (5.1 via hypothesis-driven analysis) and counter-evidence (5.6 via pre-mortem analysis). Decision science informs comparable benchmarking (5.5 via multi-criteria evaluation). Equity research is the only domain in this mapping that touches all six dimensions; each of the others contributes a subset, and the methodology synthesizes across them.

Section 04

The Standard

Working definition

Analyst-grade pre-touch research

Per-account or per-prospect intelligence work that applies six methodology dimensions across seven research categories. Output can come from analysts, engines, agencies, or hybrid pipelines. Production method is not a substance criterion.

What the standard rules in

Per-account and per-prospect intelligence formats.
Static deliverables (print, web, downloadable) and live deliverables (in-CRM, in-platform views), provided methodology and coverage are satisfied.
Engine-produced, analyst-produced, agency-produced, RevOps-stack-produced, and hybrid work alike, provided methodology and coverage are satisfied.
Incremental research patterns (refresh, signal-triggered updates), provided each consumed instance satisfies the methodology dimensions and coverage required by its opportunity tier.

What the standard rules out

The 30–60 minute manual skim. Title plus company plus one trigger. However well presented, it fails coverage and most of the methodology dimensions.
Enrichment-record output. A row of firmographic, technographic, and contact fields joined and rendered. This is data, not intelligence. It fails thesis-driven targeting, calibrated confidence, and coverage.
Templated AI summary. A model-generated paragraph naming the company, the role, and a recent event. It fails triangulated evidence, per-claim provenance, and calibrated confidence regardless of word count.
Single-pass analyst notes. Even from a skilled analyst, working notes are not analyst-grade until they are structured, sourced, and calibrated. They fail per-claim provenance and calibrated confidence.

A note on the label

The term analyst-grade is this paper's name for the methodology-and-coverage bundle. Practitioners who prefer different labels (research-grade, analyst-equivalent, structured pre-touch intelligence) are working on the same substance. The label is portable; what matters is whether the methodology dimensions and coverage requirements are operative in the artifact. Readers who reject the label but adopt the substance get the benefit.

Property of the work, not the producer

The six methodology dimensions in Section 5 describe properties of the research and its output. A senior analyst can produce work that fails the dimensions. An engine can produce work that satisfies them. What's in the artifact is what counts, not what produced it.

Section 05

The Six Methodology Dimensions

Dimension 01

Thesis-driven targeting: defensible point of view

The artifact contains a defensible thesis: why this account, why this prospect, why now. The thesis is testable against evidence.

The dimension

Every analyst-grade artifact contains a defended thesis. A peer reviewer can examine the supporting research and judge whether the thesis holds or fails. The thesis names the buyer-specific reason this opportunity is worth pursuing now, not generic enthusiasm and not a templated "this looks interesting" gesture.

Why this is necessary

Without a defended thesis, the artifact is a data dump. Research collected without a thesis to test is not analyst-grade work; it is preparation pretending to be analysis. A thesis forces the analyst to commit, which forces the analyst to defend, which forces the analyst to evaluate evidence rather than just collect it.

Cross-domain precedent

Equity research: every Buy / Hold / Sell rating defends a thesis (revenue acceleration, multiple expansion, margin recovery). Consulting hypothesis-driven analysis: recommendations rest on a stated hypothesis that the work supports or invalidates. Industry analyst Magic Quadrant placements rest on disclosed evaluation criteria, not undefended position.

Assessment test

Read the artifact and state its thesis in one sentence. Then identify the three pieces of evidence in the artifact that most directly support the thesis. If the thesis cannot be stated, or the support cannot be located, the artifact fails thesis-driven targeting.

Dimension 02

Triangulated evidence: no material claim from a single source

No material conclusion sits on a single source. Cross-source corroboration is required for the claims that drive the recommendation.

The dimension

For every material claim, the artifact draws on evidence from at least two independent source classes. A claim grounded in firmographic data alone, LinkedIn signals alone, or a single press release alone is not triangulated and is flagged as such.

Why this is necessary

Single-source claims fail in two directions. They fail upward when the source is wrong (or misread) and the conclusion propagates unchecked. They fail downward when the consumer cannot calibrate confidence because they cannot see what else supports the claim. Mosaic theory exists because no single source supports a serious conclusion alone.

Cross-domain precedent

Equity research mosaic theory: combining publicly available material from multiple sources to reach a conclusion no single source supports alone. Investigative journalism two-source rule (Investigative Reporters and Editors). Industry analyst multi-source primary research (vendor briefings, customer references, surveys, secondary research combined).

Assessment test

Pick any three material claims in the artifact. For each, identify the source classes that support the claim. If any material claim has only one source class behind it without an explicit single-source flag, the artifact fails triangulated evidence.
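
The assessment test above can be expressed as a small check. This is a sketch under stated assumptions: the `Claim` structure, the source-class names, and the sample claims are hypothetical, and the code only counts distinct source classes. Whether two classes are genuinely independent remains a reviewer judgment.

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    text: str
    source_classes: set[str] = field(default_factory=set)
    single_source_flagged: bool = False  # explicit "single-source" label in the artifact

def untriangulated(claims: list[Claim]) -> list[Claim]:
    """Claims with fewer than two source classes and no explicit single-source flag."""
    return [
        c for c in claims
        if len(c.source_classes) < 2 and not c.single_source_flagged
    ]

# Hypothetical material claims sampled from an artifact.
claims = [
    Claim("Hiring for a RevOps lead", {"job_postings", "executive_interview"}),
    Claim("Evaluating a CRM migration", {"press_release"}, single_source_flagged=True),
    Claim("Moving to a mid-market motion", {"linkedin_signals"}),
]

failures = untriangulated(claims)
passes_triangulation = not failures
print([c.text for c in failures], passes_triangulation)
```

In this sample the second claim survives because its single-source status is flagged, while the third fails the artifact: one source class and no flag.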

Dimension 03

Per-claim provenance: sourcing in-line

Every material claim ties to its source. Any conclusion in the artifact is traceable to the evidence it stands on.

The dimension

Sources are cited per claim, not per section. The reader can identify the source of any material claim within five seconds, without leaving the claim. Inferred claims are distinguished from observed claims by source attribution: "the company published a press release announcing X" is sourced differently from "they are likely shifting to a mid-market motion."

Why this is necessary

Without per-claim provenance, the reader either trusts on faith or reproduces the research manually. Both collapse the value. Per-section sourcing creates an authority gradient where everything in the section appears equally well-grounded, which is rarely true. Per-claim sourcing exposes which conclusions are observation, which are inference, and which are speculation.

Cross-domain precedent

CFA Institute Standards of Professional Conduct: equity analysts must have a "reasonable and adequate basis" for recommendations, with material sources disclosed. SEC Regulation Analyst Certification (Reg AC) requires personal certification of views. Investigative journalism document trails (IRE). Industry analyst published methodology documents disclose source families per evaluation criterion.

Assessment test

Pick any three material claims at random. For each, the reader should be able to identify the source within five seconds without leaving the claim. If any takes longer than five seconds, the artifact fails per-claim provenance.

Dimension 04

Calibrated confidence: speculation labeled, not laundered

Claims carry explicit confidence levels. Speculative inferences are labeled, not presented as observation.

The dimension

Material claims carry visible confidence levels (high, medium, low, or equivalent). The reader can distinguish high-confidence observation from medium-confidence inference from low-confidence speculation at a glance. Uniform confidence across all claims signals a problem, and the reader can detect it.

Why this is necessary

Without calibration, speculation is laundered as fact. A confident-sounding paragraph reads as a verified claim whether or not the analyst had evidence. Calibrated language exists in equity research because investors making capital allocation decisions must distinguish what is known from what is inferred. The same distinction matters in GTM, where reps acting on the artifact need to know which claims they can defend to a buyer who challenges them.

Cross-domain precedent

CFA Institute Standards: calibrated language for uncertainty is required, not optional. SEC Reg AC requires personal certification. Intelligence and forecasting communities use calibrated-language conventions (low / medium / high confidence) because uncalibrated assertion fails the consumer.

Assessment test

Read the artifact and identify the lowest-confidence claim. Then identify the highest-confidence claim. If the two cannot be told apart by their language and labeling, the artifact fails calibrated confidence.

Dimension 05

Comparable benchmarking: fit expressed relative, not absolute

Fit is expressed relative to a defined target profile, with dimensions decomposed. It is not asserted in absolute terms.

The dimension

When the artifact says "good fit" or "poor fit," the reader can see what target profile the fit is expressed against and what dimensions were evaluated. The target profile is named, the dimensions are decomposed, and the comparable peer set is identifiable. Single composite scores without decomposition fail this dimension. The target profile must itself be rigorous; Section 7 addresses what that means.

Why this is necessary

Without comparable benchmarking, "fit" is asserted, not measured. "Looks like a good account" is not different from generic enthusiasm. Generic fit assessment against an unstated "ideal mid-market SaaS company" archetype fails this dimension regardless of how confidently it is stated.

Cross-domain precedent

Equity research comparable-company analysis: valuation against a defined peer set, never against a generic ideal. Industry analyst evaluation against disclosed criteria with weightings (Magic Quadrant Methodology). Decision-science peer-set normalization (Saaty AHP, MCDA): scoring is meaningful only within a defined comparison set.

Assessment test

Ask the artifact: which dimensions of our target profile does this account match, and which does it miss? If the answer cannot be rebuilt from the artifact alone (without consulting external documents), the artifact fails comparable benchmarking.

Dimension 06

Counter-evidence acknowledged: disqualifiers on the page

The artifact names the conditions under which the recommendation would be wrong. Disqualifying evidence is surfaced, not buried.

The dimension

The artifact addresses what would have to be true for this opportunity not to be a fit. Disqualifying evidence is surfaced explicitly. The reader does not have to ask "what about X." The artifact has already considered X. The same methodology, applied to a different prospect, must be capable of producing "do not engage" as its output.

Why this is necessary

Without acknowledged counter-evidence, the artifact is sales advocacy disguised as analysis. A research artifact that produces "engage" recommendations on every prospect is not research; it is a justification engine. Disqualifying evidence is what distinguishes analysis from confirmation bias dressed in citations.

Cross-domain precedent

Equity research risk factors: every coverage report names what would invalidate the thesis. Investigative journalism "competing accounts": alternative interpretations are stated, not buried. Consulting pre-mortem analysis: structured identification of how the recommendation could fail.

Assessment test

Read the artifact and identify the conditions under which the recommendation would be wrong. If no such conditions are named, the artifact fails counter-evidence acknowledged. If the conditions are present but cannot be located within ten seconds, the artifact fails by burying them.

The dimensions are cumulative

An artifact that satisfies five of six dimensions does not satisfy the standard. A thesis without triangulation is unsupported assertion. Triangulation without provenance is uncheckable. Provenance without calibration is uniform-confidence noise. Calibration without comparable benchmarking is generic. Comparable benchmarking without counter-evidence is sales advocacy. All six are necessary. None is sufficient alone.
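
The cumulative rule is a logical AND across the six assessment tests, not a score. A minimal sketch, assuming each test has already been run by a reviewer and recorded as pass or fail; the identifier names are hypothetical shorthand for the six dimensions.

```python
DIMENSIONS = (
    "thesis_driven_targeting",
    "triangulated_evidence",
    "per_claim_provenance",
    "calibrated_confidence",
    "comparable_benchmarking",
    "counter_evidence_acknowledged",
)

def failed_dimensions(results: dict[str, bool]) -> list[str]:
    """Dimensions whose assessment test did not pass, in the paper's order."""
    missing = [d for d in DIMENSIONS if d not in results]
    if missing:
        raise ValueError(f"no assessment recorded for: {missing}")
    return [d for d in DIMENSIONS if not results[d]]

def meets_standard(results: dict[str, bool]) -> bool:
    """True only when all six dimensions pass; there is no partial credit."""
    return not failed_dimensions(results)

# Hypothetical artifact that passes five of the six tests.
five_of_six = {d: True for d in DIMENSIONS}
five_of_six["counter_evidence_acknowledged"] = False

print(meets_standard(five_of_six))     # False
print(failed_dimensions(five_of_six))  # ['counter_evidence_acknowledged']
```

Returning the failed dimensions alongside the verdict keeps the result actionable: the reviewer sees which property is missing, not only that the artifact fell short.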

Section 06

The Seven Research Categories

Seven research categories define the coverage standard for analyst-grade pre-touch work. Full-depth requirements below; mid-market and high-volume motions apply them selectively (Section 10).

Category	What it requires	Time at full depth
Persona psychographics	Model of how this buyer thinks, what frames resonate, what they fear, what they reward publicly. Not their role, their wiring.	15–30 min
Product Fit	Whether what we sell addresses their stated and inferred problem space in the right shape for their stage, motion, and stack, including integration friction, migration cost, and political cost of displacement.	15–45 min
Competitor Landscape	Who else is selling to this account, who is installed, who has been evaluated and rejected, and what is the displacement story for each.	15–45 min
Industry Context	Macro, sector, and segment dynamics that make this buyer more or less buyable right now, independent of the account itself.	15–30 min
Gap Analysis	Current state vs. target state for the prospect. Where they are operationally, where they say they want to be, and where the actual gaps live.	15–45 min
Stakeholder Analysis	Full buying committee map. Economic buyer, technical buyer, user, champion, blocker, influencer. With named individuals where possible.	60–150 min
Outreach Cadence	Based on prospect-specific signals, the inferred optimal cadence: channel mix, frequency, sequence, timing, and expected pushback playbook.	15–30 min
Total at full depth		2.5–6.25 hours

Source basis per category

The published literature on per-category pre-touch research time is uneven. The seven categories fall into four groups by strength of support.

Strongest primary support. Outreach Cadence: Lavender 15–20 min manual baseline, corroborated by the McKinsey State of AI in Sales 20–30 min SDR research figure.
Bottoms-up derived. Stakeholder Analysis: Clay's published 15 min per prospect for manual lookup × the Gartner B2B Buying Journey figure of 6–10 stakeholders per buying group [R-01] = 90–150 min for individual lookup, plus role labeling and committee synthesis. Prospeo's account-mapping guide separately publishes 30–90 min for initial mapping.
Proxy benchmarks. Persona psychographics: vendor-tool consumption baselines from Crystal Knows plus manual signal review. Gap Analysis: discovery-prep proxies from Sandler, HubSpot, and Highspot pre-call guides.
Triangulated from adjacent benchmarks. Product Fit: Loopio RFP report 33 hours per response with fit-analysis subset roughly 20–30%; ABM account-research 2–4 hours with fit subset 25–40%. Competitor Landscape: Klue battlecard work at 16 hours initial per competitor, applied to a single account rather than built per account. Industry Context: Buyer Persona Institute 15-hour segment research amortized across named accounts plus per-prospect timing-signal refresh.

Coverage requirement, not output specification

Each category is a coverage requirement, not an output specification. The artifact does not have to ship a particular field count, source family count, or visualization count to satisfy the category. Implementation profiles in Appendix D show how different production models (in-house analyst desk, RevOps stack, agency, software hybrid, pure software) cover these seven categories at different depths. The implementation is a design choice; the coverage is the standard.

Tier framing

The 2.5 to 6.25 hour range is the sum of the per-category times above. It applies when all seven categories are researched at full depth. Section 10 covers when less than full depth is the right choice and which dimensions and categories scale down at mid-market and high-volume tiers.

Consumption time

The compression on the consumption side matters as much as on the production side. An artifact meeting the standard should be readable and operationally usable in roughly 6 minutes: long enough to absorb strategic context, engagement plan, and qualification overlay, but not so long that pre-touch review becomes its own cost center. The 6-minute review benchmark assumes structural consistency. Without it, review time balloons.

Section 07

Targeting Rigor

Comparable benchmarking (Dimension 5) requires fit assessment against a target profile. The dimension is meaningful only to the extent that the target profile is itself rigorous. A perfectly calibrated artifact scored against a wishlist target produces polished output against a weak benchmark. That is not rigorous work. This section addresses what makes a target profile rigorous.

The substrate problem

A target profile is the substrate behind pre-touch research. Every fit score, every prioritization decision, every exclusion that filters out poor-fit accounts, all derive their validity from the substrate. Substrate that is thin or undefined produces downstream output that is thin and undefined. The artifact format does not save it.

The most common substrate failures, observed in practice:

Wishlist profiles. "Mid-market SaaS, 200–2000 employees, US, B2B." No evidence base. No scoring logic. No exclusion criteria. Any account roughly in the range scores roughly the same.
Single-evidence-layer profiles. Either built from inference (no historical data) or built from history (no forward-looking evidence). Both fail triangulation at the substrate level.
Static profiles. Defined once, never refreshed against actual outcome data. The substrate ossifies. Calibration drifts from current reality.
Profiles with no decomposition. A composite "fit score" with no exposed per-dimension structure. It cannot be audited, refined, or argued with.

What makes a target profile rigorous

A rigorous target profile satisfies four conditions of its own:

Substrate 01

Defensible derivation

The profile is built through a stated methodology, not asserted. The methodology specifies what evidence sources are consulted, how dimensions are scored, how the composite is weighted, and how confidence is tracked.

Substrate 02

Multiple independent evidence categories

The profile is defended against more than one category of evidence. Forward-looking strategic evidence (market positioning, segment thesis, customer voice, competitive landscape, persona behavior) and historical performance evidence (won-deal patterns, lost-deal patterns, retention data, comparable peer analysis) are two categories commonly used. The principle is the same as triangulated evidence in Section 5.2: no defining claim about who fits can rest on a single category. The number of evidence categories is implementation-specific; the methodology requires more than one.

Substrate 03

Quantitative scoring with decomposition

Fit is reported as a composite score with exposed per-dimension decomposition, not a single number. The consumer (and downstream research) can see which dimensions drive the score for any account.
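One way to picture a composite with exposed decomposition is a weighted sum that returns the per-dimension contributions alongside the total. This is a minimal sketch, not a prescribed scoring model: the dimension names, the 0 to 100 scores, and the weights are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class FitScore:
    composite: float
    contributions: dict[str, float]  # weighted points contributed by each dimension

def score_account(scores: dict[str, float], weights: dict[str, float]) -> FitScore:
    """Weighted-sum composite that keeps the per-dimension breakdown visible."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same dimensions")
    total_weight = sum(weights.values())
    contributions = {dim: scores[dim] * weights[dim] / total_weight for dim in weights}
    return FitScore(composite=sum(contributions.values()), contributions=contributions)

# Hypothetical dimensions, disclosed weights, and 0-100 scores for one account.
weights = {"firmographic": 20, "trigger_signals": 30, "tech_stack": 10, "won_deal_similarity": 40}
scores = {"firmographic": 80, "trigger_signals": 50, "tech_stack": 90, "won_deal_similarity": 70}

fit = score_account(scores, weights)
print(fit.composite)  # 68.0
for dim, points in sorted(fit.contributions.items(), key=lambda kv: -kv[1]):
    print(f"{dim}: {points:.1f}")
```

Because the contributions are returned with the composite, a reviewer can see which dimension drives the score and challenge its weight directly.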

Substrate 04

Refresh against outcome data

The profile is bound to analytics tracking actual outcomes per fit tier. Calibration becomes testable. Tier A accounts should outperform Tier B. If they don't, the substrate is refined. Static profiles drift. Tracked profiles improve.
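The test described here (higher tiers should outperform lower ones) can be sketched as a check on outcome rates per tier. The outcome log, tier labels, and function names below are hypothetical, and the sketch ignores sample size, which a real check would need to account for before treating an inversion as meaningful.

```python
def conversion_by_tier(outcomes: list[tuple[str, bool]]) -> dict[str, float]:
    """Share of accounts in each fit tier that reached the tracked outcome."""
    totals: dict[str, int] = {}
    wins: dict[str, int] = {}
    for tier, converted in outcomes:
        totals[tier] = totals.get(tier, 0) + 1
        wins[tier] = wins.get(tier, 0) + int(converted)
    return {tier: wins[tier] / totals[tier] for tier in totals}

def tiers_out_of_order(rates: dict[str, float], order: list[str]) -> list[tuple[str, str]]:
    """Adjacent tier pairs where the higher tier fails to outperform the lower one."""
    present = [tier for tier in order if tier in rates]
    return [(hi, lo) for hi, lo in zip(present, present[1:]) if rates[hi] <= rates[lo]]

# Hypothetical log: (fit tier, did the account reach a meeting?)
log = [("A", True), ("A", True), ("A", False), ("A", False),
       ("B", True), ("B", False), ("B", False), ("B", False),
       ("C", True), ("C", True), ("C", False), ("C", False)]

rates = conversion_by_tier(log)
print(rates)                                      # {'A': 0.5, 'B': 0.25, 'C': 0.5}
print(tiers_out_of_order(rates, ["A", "B", "C"]))  # [('B', 'C')]
```

In this made-up log, Tier C converts better than Tier B, which is the kind of signal that would prompt a substrate refinement.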

Cross-domain precedent for substrate rigor

The methods used for rigorous target-profile work are well established in operations research and market research:

Method	Precedent	What it contributes
Multi-Criteria Decision Analysis (MCDA)	Operations research, multi-decade literature	Weighted composite scoring with disclosed weights
Analytic Hierarchy Process (AHP)	Saaty, foundational decision-science framework [R-15]	Hierarchical decomposition + pairwise comparison for weight derivation
Conjoint analysis	Marketing research, foundational technique [R-16]	Revealed-preference inference of attribute weights from observed evidence
Percentile-rank normalization	Standard descriptive statistics	Scoring within peer sets rather than absolute scales
Segmentation theory	Smith (1956) onward; foundational market-research paradigm	Structured market subdivision with criteria-based grouping

The substrate is applied analytical methodology

None of these methods are exotic. MCDA, AHP, conjoint analysis, and percentile-rank normalization are taught in operations-research and market-research curricula. They are used across operations, healthcare decision-making, policy analysis, and product development. Rigorous target-profile work applies these methods to a domain (B2B sales targeting) where they have not been formalized at scale.
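Two of the methods in the table are compact enough to sketch. The code below derives weights from a pairwise comparison matrix using the row geometric-mean approximation (a common shortcut; Saaty's original formulation uses the principal eigenvector) and computes a percentile rank within a peer set. The judgments and peer scores are hypothetical.

```python
import math

def ahp_weights(pairwise: list[list[float]]) -> list[float]:
    """Approximate AHP priority weights from row geometric means, normalized to sum to 1."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def percentile_rank(value: float, peers: list[float]) -> float:
    """Percent of peer values below `value`, counting ties as half."""
    below = sum(1 for p in peers if p < value)
    equal = sum(1 for p in peers if p == value)
    return 100.0 * (below + 0.5 * equal) / len(peers)

# Hypothetical judgments: pairwise[i][j] says how much more important criterion i is than j.
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]
weights = ahp_weights(pairwise)
print([round(w, 3) for w in weights])

# Score an account against its peer set rather than on an absolute scale.
print(percentile_rank(70, [40, 55, 70, 85, 90]))  # 50.0
```

A full AHP implementation would also compute a consistency ratio for the judgment matrix; that step is omitted here for brevity.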

Section 08

How an Analyst-Grade Artifact Reads

Eight properties define what a reader encounters when reading an analyst-grade artifact. Format does not change them: whether the artifact is a PDF, structured dashboard, document, assessment, or scored report, all eight manifest the same way.

8.1 The thesis arrives first

An analyst-grade artifact opens with a defended position. The reader knows within the first screen or first paragraph: why this account, why this prospect, why now. The thesis is not buried in section three. It is the first thing the reader encounters because every other claim in the artifact serves it.

8.2 Coverage is visible without searching

The reader can confirm that the seven research categories were addressed without searching the artifact. Structure makes coverage scannable. Missing coverage is visible as a gap, not hidden by formatting.

8.3 Material claims are corroborated, not asserted

For any claim that drives the recommendation, the reader sees more than one source. Single-source claims are flagged as such. The reader does not have to take any material conclusion on faith.

8.4 Sourcing is in-line

The reader can identify the source of any material claim without leaving the claim. Provenance is visible where the claim lives, not buried in an appendix.

8.5 Confidence is calibrated visibly

Claims carry visible confidence levels. The reader can distinguish high-confidence observation from medium-confidence inference from low-confidence speculation at a glance. Uniform high-confidence ratings across all claims signal a problem, and the reader can detect it.

8.6 Fit is benchmarked, not asserted

When the artifact says "good fit" or "poor fit," the reader sees what target profile the fit is expressed against and what dimensions were evaluated. Fit without benchmark is not analyst-grade and the reader can see the absence.

8.7 Counter-evidence is on the page

The reader encounters the conditions under which the recommendation would be wrong. Disqualifying evidence is surfaced. The reader does not have to ask "what about X." The artifact has already considered X.

8.8 The thesis lands at a specific action

The reader closes the artifact knowing what to do next: with what opener, on what channel, in what window, anticipating what pushback. An artifact that researches but does not recommend is a brief, not analyst-grade work product.
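Some of these properties lend themselves to mechanical checks when claims are stored as structured records. The sketch below covers 8.3 (single-source claims flagged) and 8.5 (uniform high confidence detected). The `Claim` structure, its field names, and the sample claims are hypothetical; source classes follow the "at least two independent source classes" wording of checklist item 2.1.

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    text: str
    confidence: str  # "high", "medium", or "low"
    source_classes: list[str] = field(default_factory=list)
    material: bool = True  # True if the claim drives the recommendation

def single_source_claims(claims: list[Claim]) -> list[Claim]:
    """Material claims resting on fewer than two distinct source classes (8.3)."""
    return [c for c in claims if c.material and len(set(c.source_classes)) < 2]

def uniformly_high_confidence(claims: list[Claim]) -> bool:
    """True when several material claims all carry "high" confidence (8.5)."""
    material = [c for c in claims if c.material]
    return len(material) > 1 and all(c.confidence == "high" for c in material)

# Hypothetical claims for illustration only.
claims = [
    Claim("Prospect is consolidating vendors", "high", ["earnings call", "job postings"]),
    Claim("Champion is likely the VP of Operations", "medium", ["LinkedIn profile"]),
    Claim("Budget cycle resets in Q1", "low", ["industry norm"], material=False),
]

for claim in single_source_claims(claims):
    print("single-source:", claim.text)
print("uniformly high:", uniformly_high_confidence(claims))
```

Checks like these only confirm that the labels are present and varied. Whether a confidence level is warranted still requires a reader's judgment.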

Reader-verifiable across implementations

The eight reader-experience properties above are visible regardless of format or production model. Appendix D documents five sample implementations (in-house analyst desk, RevOps stack, agency, software hybrid, pure software) and rates how each surfaces these properties at different production cost points.

Section 09

Limitations

A working framework that does not acknowledge its limits is brittle. This section names where the framework does not apply, where the derivation is weakest, and what is open for refinement.

Claim taxonomy

Every claim in this paper falls into one of five categories. The categories are not interchangeable. Readers should weight conclusions accordingly.

Market evidence. Third-party data about buyer behavior or outreach performance. Examples: Gartner's 73% buyer-avoidance finding [R-01], LinkedIn's 76% top-performer research behavior [R-06], the Apollo / Smartlead / Lavender / Belkins reply-rate benchmarks in Section 2.3 [R-07a][R-07b][R-07c][R-07d].
Product specification. SDR Flow artifact fields, categories, workflow stages, and source families. These describe what the system produces by design. They are not measurements of customer outcomes.
Practitioner observation. Time estimates, QA observations, and workflow benchmarks derived from internal practice rather than formal study. The 2.5 to 6.25 hour full-depth per-prospect range in Section 6 is the clearest example. One category (Outreach Cadence) has primary published support; one (Stakeholder Analysis) is bottoms-up derived; two have proxy benchmarks; three are triangulated from adjacent benchmarks. None is a primary time-and-motion measurement.
Planning assumption. Calculator inputs, projected lift figures, and modeled outcomes used for sales planning. These are informed by published benchmarks but have not been validated against any specific SDR Flow customer. Treat as planning inputs, not promised outcomes.
Pilot-validated metric. Reserved for outcomes measured against a specific SDR Flow customer pilot, with a defined baseline and measurement window. This paper does not yet contain pilot-validated metrics. They will be added as customer engagements close their measurement windows.

Scope limitations

Pure inbound motions

The framework is designed for outbound and hybrid motions where pre-touch research depth is a value driver. Pure inbound motions have different research requirements (faster response, lighter pre-touch depth, more emphasis on real-time conversation prep). The methodology dimensions may apply in modified form. The full-depth time benchmark (2.5 to 6.25 hr) likely does not.

Account-only intelligence without prospect overlay

The framework assumes per-prospect fit benchmarking (Comparable benchmarking, Dimension 5). Pure account-only intelligence, without a named prospect, can satisfy the other methodology dimensions. But fit benchmarking is harder to satisfy without prospect-level fit data. A modified version of the framework for account-only artifacts may be warranted.

Markets with structurally limited contact-level enrichment

Heavily regulated industries (healthcare with HIPAA constraints, defense, financial services compliance) and non-English-language markets may not support the source-diversity description. The core conditions still apply. The specific output spec numbers may need calibration.

Industries with unique data structures

Healthcare, defense, and certain financial-services contexts have data structures and compliance constraints that bound what enrichment is permissible. The framework does not address sector-specific calibration. Sector-specific extensions would strengthen the framework.

Methodological limitations

The full-depth time range is not derived from primary measurement

The 2.5 to 6.25 hour full-depth per-prospect range in Section 6 is partially supported by published benchmarks, but no primary study measures total full-depth pre-touch research time directly. One category (Outreach Cadence) has primary published support; one (Stakeholder Analysis) is bottoms-up derived from Clay and Gartner inputs; two have proxy benchmarks; three are triangulated from adjacent benchmarks. A formal time-and-motion study, sampling senior B2B analysts across multiple firms and measuring focused research time per category on standardized briefs, would supersede the current derivation. This remains the weakest empirical claim in the framework.

The assessment checklist has not been psychometrically validated

The Section 11 checklist has not been validated for inter-rater reliability. Two reviewers running it against the same artifact may score differently. The magnitude of disagreement has not been measured. Psychometric validation (inter-rater reliability, test-retest reliability) would strengthen the assessment instrument.

The framework is published, not peer-reviewed

This is a practitioner publication. It has not been peer-reviewed by external industry figures. Readers should assess the substance directly against the supporting evidence and precedent.

Known open questions

ABM-first motions. Where committee mapping precedes prospect identification, the order of operations is inverted. Some conditions may need reframing.
Refresh patterns. The framework primarily addresses fresh artifact production. How incremental refresh should satisfy the conditions is an open question.
AI-generated content under Dimension 3 (Per-claim provenance). Evidence requires provenance. How LLM-generated claims should be marked, weighted, or excluded deserves more careful articulation.
The minimum dimension count. Six is offered. The number is defensible but not derived from formal study. Reasoned alternatives (five, seven) deserve consideration.

Section 10

How to Apply This Standard at Scale

Most GTM motions do not operate at strategic tier across their full prospect base. Two selection decisions reduce the standard to what is operationally feasible at lower tiers.

10.1 Two selection decisions

Methodology dimension selection. Some dimensions enforce easily at scale; others require analyst judgment. Per-claim provenance, calibrated confidence, and comparable benchmarking can be enforced through tooling and templates with minimal per-prospect human time. Thesis-driven targeting and counter-evidence acknowledged require analyst judgment and are harder to enforce at scale. Pick the dimensions most relevant to your motion's failure modes.

Research-category selection. Some categories are decision-critical for your segment; others are useful but not required. Stakeholder Analysis is decision-critical for enterprise deals where the buying committee determines outcome. Outreach Cadence is decision-critical for high-volume motions where channel and timing fit determine response. Pick the categories most decision-relevant for your tier and segment.

10.2 Tier selection guidance

Tier	Methodology dimensions typically applied	Categories typically applied	Per-prospect time
Strategic	All six	All seven	2.5–6.25 hr
Mid-market	Thesis-driven targeting, per-claim provenance, calibrated confidence, comparable benchmarking	Stakeholder Analysis, Product Fit, Gap Analysis, Outreach Cadence	1.5–3 hr
High-volume	Per-claim provenance, comparable benchmarking	Persona psychographics, Outreach Cadence	Under 30 min

The selections above are illustrative defaults. Actual selection depends on the motion's specific failure modes, segment dynamics, and deal economics. A high-volume motion in regulated industries may require Counter-evidence acknowledged on every artifact for compliance reasons. A strategic motion in commodity segments may de-emphasize Persona psychographics if buyers are interchangeable. The standard is the reference; selection is the practitioner's job.
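The tier defaults and the motion-specific adjustments described above can be expressed as a small lookup with overrides. This is a sketch under stated assumptions: the structure and the `plan_for` helper are hypothetical, while the dimension names, category names, and default selections are transcribed from this paper.

```python
DIMENSIONS = [
    "Thesis-driven targeting", "Triangulated evidence", "Per-claim provenance",
    "Calibrated confidence", "Comparable benchmarking", "Counter-evidence acknowledged",
]
CATEGORIES = [
    "Persona psychographics", "Product Fit", "Competitor Landscape", "Industry Context",
    "Gap Analysis", "Stakeholder Analysis", "Outreach Cadence",
]

# Illustrative defaults transcribed from the tier table in 10.2.
TIER_DEFAULTS = {
    "strategic": (DIMENSIONS, CATEGORIES),
    "mid-market": (
        ["Thesis-driven targeting", "Per-claim provenance",
         "Calibrated confidence", "Comparable benchmarking"],
        ["Stakeholder Analysis", "Product Fit", "Gap Analysis", "Outreach Cadence"],
    ),
    "high-volume": (
        ["Per-claim provenance", "Comparable benchmarking"],
        ["Persona psychographics", "Outreach Cadence"],
    ),
}

def plan_for(tier, extra_dimensions=(), dropped_categories=()):
    """Start from the tier default, then apply motion-specific adjustments."""
    unknown = set(extra_dimensions) - set(DIMENSIONS)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    default_dims, default_cats = TIER_DEFAULTS[tier]
    dims = [d for d in DIMENSIONS if d in default_dims or d in extra_dimensions]
    cats = [c for c in default_cats if c not in dropped_categories]
    return dims, cats

# A regulated high-volume motion adds Counter-evidence acknowledged to the default.
dims, cats = plan_for("high-volume", extra_dimensions=["Counter-evidence acknowledged"])
print(dims)
print(cats)
```

Keeping the defaults as data and the adjustments as explicit arguments makes each departure from the default visible and reviewable.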

10.3 Selective application is not lower rigor

An artifact that applies four methodology dimensions across four research categories at mid-market tier is not less rigorous than a strategic-tier full-depth artifact. It is appropriately rigorous for its tier. Rigor is the fit between the methodology applied and the decision the artifact supports. The full standard is the reference; the appropriate application is what matters.

Section 11

Assessment Framework

A checklist for evaluating whether a specific artifact meets the analyst-grade standard. The assessor first identifies the opportunity tier (Section 10) the artifact is operating at, then runs the methodology and coverage questions. The framework works against vendor-produced output, agency output, in-house prep, or any other implementation.

11.0 Opportunity tier (from Section 10)

Identify the tier the artifact is operating at before applying the rest of the checklist. Tier definitions and selection guidance are in Section 10.2. Methodology dimensions (11.1 through 11.6 below) apply at every tier. Coverage requirements (11.7) scale with tier.

11.1 Thesis-driven targeting (Section 5.1)

1.1 Does the artifact contain a defended thesis: why this account, why this prospect, why now?

1.2 Can a peer reviewer trace the thesis to its supporting evidence?

1.3 Is the thesis testable: could it be invalidated by evidence the analyst would accept?

11.2 Triangulated evidence (Section 5.2)

2.1 For each material claim, are there sources from at least two independent source classes?

2.2 Are claims that rest on a single source flagged as single-source?

2.3 Are the corroborating sources visible to the reader, not buried in a methodology footnote?

11.3 Per-claim provenance (Section 5.3)

3.1 For any material claim, can the reader identify its source within five seconds?

3.2 Are sources cited per claim, not per section?

3.3 Are inferred claims distinguished from observed claims by source attribution?

11.4 Calibrated confidence (Section 5.4)

4.1 Are confidence levels attached to material claims (high, medium, low, or equivalent)?

4.2 Is speculation labeled as such, not presented as observation?

4.3 Does the confidence calibration vary across claims (a uniform high-confidence rating across all claims is suspect)?

11.5 Comparable benchmarking (Section 5.5)

5.1 Is fit expressed relative to a named target profile (ICP)?

5.2 Are the dimensions of fit decomposed and individually visible?

5.3 Is the comparable peer set named (other accounts of the same profile)?

11.6 Counter-evidence acknowledged (Section 5.6)

6.1 Does the artifact name the conditions under which the recommendation would be wrong?

6.2 Is disqualifying evidence surfaced rather than buried?

6.3 Could the artifact have produced "do not engage" as an output for a different prospect with the same methodology applied?

11.7 Research-category coverage (Section 6, depth scaled to tier from Section 10)

7.1 Persona psychographics: depth and source quality of the buyer-decision model appropriate to tier?

7.2 Product Fit: integration friction, migration cost, political cost of displacement identified?

7.3 Competitor Landscape: incumbents and likely alternatives named with displacement angle for each?

7.4 Industry Context: segment dynamics relevant to timing and urgency identified?

7.5 Gap Analysis: specific gaps named with confidence about whether each can be closed?

7.6 Stakeholder Analysis: buying committee mapped with named individuals and posture per role?

7.7 Outreach Cadence: channel mix, sequence, timing, pushback playbook present?

11.8 Reader-experience verification (fast pass)

A faster assessment uses the eight reader-experience properties from Section 8. If the reader can verify all eight properties in the artifact, the artifact passes the methodology bar. The Section 8 properties are derived from the Section 5 dimensions and are designed to be visually verifiable, not requiring deep audit.

11.9 What this checklist does not assess

The framework does not score on number of fields, categories, source families, or visualizations; on engine economics, production cost, or vendor pricing; on specific tooling or vendor choices; on aesthetic or formatting preferences; or on whether the artifact takes a specific number of hours to produce. Earlier versions of this paper assessed "Does the artifact ship at least 15 source families, 16 categories, 8 visualizations? Does the vendor charge above $500 per output? Would a senior analyst given 6.5 hours produce materially more depth?" Those criteria have been removed; they were implementation choices disguised as methodology standards.

Note: Inter-rater reliability has not been formally validated. Two reviewers may score differently on the same artifact.

Section 12

A Note on Use

This framework is offered for use, free, with attribution to NIEOS. Apply it to assess existing pre-touch work, evaluate vendor output, structure leadership conversations, or build new process. The methodology dimensions and coverage requirements are portable across vendors, agencies, internal teams, and hybrid arrangements. Comments and refinements welcome at whitepaper@nieos.com. Published under Creative Commons Attribution 4.0 (CC BY 4.0).

Appendix A

References

Sources cited in this paper, grouped by category.

Industry research on B2B buyer behavior and outcomes
[R-01] Gartner, Inc. The B2B Buying Journey. Multi-year research program, originating 2018. Identifies six buyer jobs (problem identification, solution exploration, requirements building, supplier selection, validation, consensus creation). Documents buying-group size (average 6–10 stakeholders) and the 73% buyer-avoidance finding. Stamford, CT: Gartner.
[R-02] Forrester Research (incorporating SiriusDecisions). Demand Unit Waterfall. Multi-year framework. Reframes B2B demand modeling around the buying committee as the unit of demand. Cambridge, MA: Forrester.
[R-03] Korn Ferry. Sales Transformation research and practice. Korn Ferry's sales-transformation research program (incorporating prior CSO Insights work). Documents buyer-seller alignment, sales-effectiveness benchmarks, and the cost of misalignment in B2B selling. kornferry.com/insights/featured-topics/sales-transformation
Outbound benchmarks and personalization research
[R-04] Backlinko. Cold Email Outreach Benchmarks. Multi-year benchmarks on B2B cold email reply rates. Average outreach reply rate ~8.5%; personalized subject and body copy improves response 30.5–32.7%.
[R-05] Gong.io. Cold-email research, based on analysis of 28M+ cold emails. Aggregate cold-email research. Documents that top reps generate roughly 4.2× more replies than average reps and that personalization variables (industry, company, individual, activity) compound rather than substitute. gong.io/blog/cold-email-stats. The frequently-cited BlueGrace "88% reply-rate lift" figure is a single customer case study published by Gong (gong.io/customers/case-studies/how-bluegraces-reps-got-an-88-lift-in-reply-rates-using-gong-engage), not a generalized benchmark.
[R-06] LinkedIn Sales Solutions. Top-Performing Salespeople research. LinkedIn Sales Solutions research on top-performer behavior: 76% of top performers say they "always" conduct research on their buyers before reaching out, compared to just 38% of those who fail to meet target. business.linkedin.com/sales-solutions/top-performing-salespeople
[R-07a] Apollo. "What's the Expected Reply Rate for a Well-Run Outbound Cold Email Campaign?" Vendor benchmark from Apollo's outbound platform usage. Defines <2% as "below average," 3–5% as "well-run," 5–8% as "strong," and 10%+ as "excellent." apollo.io/insights/whats-the-expected-reply-rate-for-a-well-run-outbound-cold-email-campaign
[R-07b] Smartlead. "Cold Email Statistics." Platform benchmarks across Smartlead users. Reports 3.43% platform-average reply rate, 5%+ as the "good" threshold, and advanced personalization driving reply rates up to 18% in observed campaigns. smartlead.ai/blog/cold-email-stats
[R-07c] Lavender. "Building Your Own Sales Email Benchmarks." Lavender platform benchmark across roughly 50,000 active Lavender inboxes: 20.5% average reply rate. This is a self-selected sample of teams already invested in personalization tooling and coaching, not a market-wide baseline; should be read as a ceiling. lavender.ai/blog/building-your-own-sales-email-benchmarks
[R-07d] Belkins. "B2B Cold Outreach Benchmarks." Agency-aggregated benchmarks across Belkins client campaigns. Reports 2.1% reply rate on larger lists, 5.1% overall average, and 5.8% on small targeted lists. belkins.io/resources/b2b-cold-outreach-benchmarks
Analytical-methodology references (cross-domain)
[R-08] Minto, B. (2010). The Pyramid Principle: Logic in Writing and Thinking (3rd ed.). Pearson. The canonical reference for consulting analytical structure. Sets out the pyramid principle: lead with the answer, support with structured argument.
[R-09] Dixon, M., & Adamson, B. (2011). The Challenger Sale. Portfolio. CEB / Gartner research-derived seller-buyer interaction methodology.
[R-10] CFA Institute. Standards of Professional Conduct. The professional-ethics and analytical-standards framework governing CFA charter holders. Requires "reasonable and adequate basis" for recommendations and material source disclosure. cfainstitute.org/en/ethics-standards
[R-11] U.S. Securities and Exchange Commission. Regulation Analyst Certification (Regulation AC). 17 CFR §§ 242.500–505. Requires research analysts to certify that views in their reports honestly reflect personal views.
[R-12] Gartner, Inc. Magic Quadrant Methodology. Published methodology document describing how Gartner Magic Quadrants are produced, evaluation criteria, weightings, data sources, review process. Public document.
[R-13] Forrester Research. The Forrester Wave Methodology. Published methodology document describing how Forrester Waves are produced.
[R-14] Investigative Reporters and Editors (IRE). Professional body for investigative journalism. Publishes resources on multi-source corroboration, document trails, and methodology disclosure. ire.org
[R-15] Saaty, T. L. (1980). The Analytic Hierarchy Process: Planning, Priority Setting, Resource Allocation. McGraw-Hill. Foundational reference for the Analytic Hierarchy Process (AHP), one of the most widely-cited methods in operations research and multi-criteria decision analysis.
[R-16] Green, P. E., & Srinivasan, V. (1990). Conjoint Analysis in Marketing: New Developments with Implications for Research and Practice. Journal of Marketing, 54(4), 3–19. Foundational reference for conjoint analysis as a market-research method for revealed-preference inference.
[R-17] Smith, W. R. (1956). Product Differentiation and Market Segmentation as Alternative Marketing Strategies. Journal of Marketing, 21(1), 3–8. Foundational reference for market segmentation theory.
Foundational sales methodology references
[R-18] Rackham, N. (1988). SPIN Selling. McGraw-Hill. In-conversation question methodology; canonical sales-methodology reference.
[R-19] Force Management. MEDDPICC methodology. Eight-dimension qualification framework: Metrics, Economic buyer, Decision criteria, Decision process, Paper process, Identify pain, Champion, Competition.

Appendix B

Glossary

Key terms used in this paper.

Analyst-grade pre-touch research: Per-account or per-prospect intelligence work that applies the six methodology dimensions (Section 5) across the seven research categories (Section 6). The category name introduced by this paper for B2B go-to-market.
Buying committee: The 6–10 stakeholders who collectively make a B2B purchase decision, per Gartner B2B Buying Journey research and Forrester Demand Unit Waterfall research. Treated as the unit of demand, not the individual contact.
Calibrated confidence (Dimension 4): Claims carry explicit confidence levels. Speculative inferences are labeled, not presented as observation.
Comparable benchmarking (Dimension 5): Fit is expressed relative to a defined target profile, with dimensions decomposed. Not asserted in absolute terms. Meaningful only if the target profile is itself rigorous (Section 7).
Conjoint analysis: Market-research technique for inferring attribute weights from observed choice behavior, revealed-preference inference. Foundational reference: Green & Srinivasan (1990) [R-16].
Counter-evidence acknowledged (Dimension 6): The artifact names the conditions under which the recommendation would be wrong. Disqualifying evidence is surfaced, not buried.
MCDA (Multi-Criteria Decision Analysis): Umbrella term for operations-research methods that evaluate multiple options against multiple criteria with explicit weighting. AHP is one specific MCDA method.
AHP (Analytic Hierarchy Process): MCDA method developed by Thomas Saaty (1980) [R-15]. Hierarchical decomposition of decision criteria with pairwise comparison for weight derivation.
Methodology dimension: One of the six intellectual-discipline requirements in Section 5. The dimensions describe HOW analyst-grade research is conducted, independent of WHAT is researched.
Mosaic theory: Equity research methodology for combining publicly available material from multiple sources to reach a conclusion no single source supports alone. Direct analog to Dimension 2 (Triangulated evidence).
Per-claim provenance (Dimension 3): Every material claim ties to its source. Any conclusion in the artifact is traceable to the evidence it stands on.
Research category: One of the seven coverage areas in Section 6. The categories describe WHAT must be researched for analyst-grade pre-touch work in B2B go-to-market specifically.
Substrate: The target profile against which pre-touch research is calibrated. Section 7 addresses what makes a substrate rigorous.
Target profile: The buyer-specific specification of who to pursue: which firmographics, signals, roles, and behaviors define a fit account. Synonymous with "ICP" in common usage. Used here as the more general term.
Thesis-driven targeting (Dimension 1): The artifact contains a defensible thesis: why this account, why this prospect, why now. The thesis is testable against evidence.
Tier (strategic / mid-market / high-volume): The opportunity-importance level at which the methodology and coverage are applied. Strategic-tier requires all six dimensions across all seven categories at full depth. Lower tiers apply the standard selectively (Section 10).
Triangulated evidence (Dimension 2): No material conclusion sits on a single source. Cross-source corroboration is required for the claims that drive the recommendation.
Two-source rule: Investigative journalism standard: claims about contested facts require independent corroboration from at least two sources. Direct analog to Dimension 2.

Appendix C

Assessment Worksheet

A structured worksheet for documenting a review of any pre-touch research artifact, whether internal team output, vendor-produced output, or another implementation. Designed for printing and completing.

Reviewer Identification

Name

Role

Organization

Artifact reviewed

Date

Opportunity Tier

Identify the tier this artifact is operating at before assessing methodology and coverage.

Strategic (all seven categories, full depth, 2.5–6.25 hr)

Mid-market (4–5 categories, focused depth, 1.5–3 hr)

High-volume (2–3 categories, light touch, under 30 min)

Per-Dimension Assessment

For each of the six methodology dimensions (Section 5), indicate whether the artifact meets the dimension, partially meets it, or does not meet it. Methodology dimensions apply at every tier.

Dimension 01 (Thesis-driven targeting)

Meets

Partially meets

Does not meet

Dimension 02 (Triangulated evidence)

Meets

Partially meets

Does not meet

Dimension 03 (Per-claim provenance)

Meets

Partially meets

Does not meet

Dimension 04 (Calibrated confidence)

Meets

Partially meets

Does not meet

Dimension 05 (Comparable benchmarking)

Meets

Partially meets

Does not meet

Dimension 06 (Counter-evidence acknowledged)

Meets

Partially meets

Does not meet

Research-Category Coverage

For each of the seven research categories (Section 6), verify coverage at the depth appropriate to the tier identified above.

Substrate (Targeting) Assessment

Is the target profile against which the artifact is calibrated itself rigorous (Section 7 substrate conditions)?

Reader-Experience Verification (fast pass)

Do the eight reader-experience properties from Section 8 hold when reading the artifact?

Overall

Methodology dimensions met (0–6)

Research categories met at tier (0–7)

Bracket

Meets standard (all six dimensions + tier-appropriate coverage)

Partial (4–5 dimensions or partial coverage)

Below (fewer than 4 dimensions)

Free-form summary
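
For reviewers who record worksheets electronically, the bracket rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the standard. It rests on three assumptions that the worksheet does not state explicitly: only a "Meets" rating counts toward the 0–6 total, "tier-appropriate coverage" is read as reaching the lower bound of the tier's category range (7, 4, or 2), and an artifact with all six dimensions but short coverage falls into Partial. The names Worksheet, MIN_CATEGORIES, and bracket are hypothetical.

```python
from dataclasses import dataclass

# The six methodology dimensions, in worksheet order.
DIMENSIONS = (
    "Thesis-driven targeting",
    "Triangulated evidence",
    "Per-claim provenance",
    "Calibrated confidence",
    "Comparable benchmarking",
    "Counter-evidence acknowledged",
)

# Assumption: the lower bound of each tier's category range is the pass mark.
MIN_CATEGORIES = {"strategic": 7, "mid-market": 4, "high-volume": 2}


@dataclass
class Worksheet:
    tier: str                 # "strategic", "mid-market", or "high-volume"
    ratings: dict[str, str]   # dimension -> "meets" | "partial" | "does_not_meet"
    categories_met: int       # research categories met at tier (0-7)


def dimensions_met(ws: Worksheet) -> int:
    """Count dimensions rated 'meets'; partial ratings do not count (assumption)."""
    return sum(1 for d in DIMENSIONS if ws.ratings.get(d) == "meets")


def coverage_ok(ws: Worksheet) -> bool:
    """True when category coverage reaches the assumed minimum for the tier."""
    return ws.categories_met >= MIN_CATEGORIES[ws.tier]


def bracket(ws: Worksheet) -> str:
    """Map a completed worksheet to one of the three overall brackets."""
    met = dimensions_met(ws)
    if met < 4:
        return "Below"
    if met == len(DIMENSIONS) and coverage_ok(ws):
        return "Meets standard"
    return "Partial"
```

A reviewer would fill in one Worksheet per artifact and call bracket(ws). The free-form summary remains the place to explain any judgment the three-way bracket cannot capture.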

Appendix D

Implementation Profiles

Five sample implementations of analyst-grade pre-touch research methodology. Each profile is rated against the six methodology dimensions (Section 5) and indicates typical research-category depth (Section 6). The ratings are illustrative snapshots of how each implementation typically operates. Any specific instance may operate at higher or lower rigor than the profile.

The five profiles

Profile 1: In-house analyst desk. A dedicated internal team of senior analysts producing per-prospect assessments manually. Closest analog to equity research: high-discipline, high-cost, low-volume. Typical depth tier: strategic accounts only.

Profile 2: RevOps plus Clay plus LLM stack. Internal RevOps function using enrichment platforms (Clay, Apollo) plus LLM-assisted synthesis. Medium-discipline, medium-cost, medium-volume. Typical depth tier: mid-market with some strategic.

Profile 3: Agency workflow. External agency producing per-prospect research at scope-defined depth. Discipline varies by agency; cost is engagement-specific. Typical depth tier: depends on engagement scope.

Profile 4: Software-assisted hybrid (SDR Flow). Productized methodology plus tooling plus ongoing analyst oversight. Designed to compress per-prospect cost while maintaining the methodology bar. Typical depth tier: strategic and mid-market.

Profile 5: Pure software (Aomni, Common Room class). AI-driven account research with limited or no analyst oversight. Lowest cost per artifact. Methodology depends on what the software enforces. Typical depth tier: high-volume with selective deeper depth.

Methodology dimensions per profile

Ratings: ✓ = typically operative, ~ = partial or operator-dependent, — = typically absent. Ratings are honest snapshots, not stacked toward any specific implementation.

Dimension	In-house desk	RevOps + Clay + LLM	Agency workflow	Software hybrid (SDR Flow)	Pure software
Thesis-driven targeting	✓	~	~	✓	~
Triangulated evidence	✓	~	~	✓	~
Per-claim provenance	~	—	~	✓	—
Calibrated confidence	~	—	~	✓	—
Comparable benchmarking	~	~	~	✓	~
Counter-evidence acknowledged	~	—	~	~	—
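
For gap analysis, the matrix above can be transcribed into a small lookup structure. The sketch below is illustrative only: the ratings are copied from the table, with the three symbols encoded as the words "yes", "partial", and "absent", and the names PROFILES, MATRIX, and dimensions_with are hypothetical. It does not produce a score or a ranking; it only answers which dimensions carry a given rating for a given profile.

```python
# Column order matches the table header.
PROFILES = (
    "In-house desk",
    "RevOps + Clay + LLM",
    "Agency workflow",
    "Software hybrid (SDR Flow)",
    "Pure software",
)

# Rows transcribed from the table: "yes" = check mark, "partial" = tilde,
# "absent" = the dash symbol.
MATRIX = {
    "Thesis-driven targeting": ("yes", "partial", "partial", "yes", "partial"),
    "Triangulated evidence": ("yes", "partial", "partial", "yes", "partial"),
    "Per-claim provenance": ("partial", "absent", "partial", "yes", "absent"),
    "Calibrated confidence": ("partial", "absent", "partial", "yes", "absent"),
    "Comparable benchmarking": ("partial", "partial", "partial", "yes", "partial"),
    "Counter-evidence acknowledged": ("partial", "absent", "partial", "partial", "absent"),
}


def dimensions_with(profile: str, rating: str) -> list[str]:
    """Return the dimensions that carry the given rating for one profile."""
    col = PROFILES.index(profile)
    return [dim for dim, row in MATRIX.items() if row[col] == rating]
```

For example, dimensions_with("Pure software", "absent") lists the dimensions the table marks as typically absent for that profile. Because the ratings are snapshots, the output describes the typical profile, not any specific instance.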

SDR Flow profile detail

SDR Flow's typical output ships approximately 87 fields across 16 categories with 8 standard visualizations. The implementation enforces triangulated evidence, per-claim provenance, calibrated confidence, and comparable benchmarking as platform features. It is partial on counter-evidence acknowledgment: disqualifying evidence is surfaced in some artifacts but not enforced across all. Its strongest dimensions are provenance, calibration, and comparable benchmarking (3-Level ICP scoring against a defined Foundation profile). Its weakest dimension is counter-evidence acknowledgment.

The 87 fields, 16 categories, and 8 visualizations are SDR Flow's design choices, not the methodology standard. Other valid implementations of the same methodology may use different counts.
