NIEOS
White Paper · v1.0
A methodology standard for analyst-grade pre-touch research in B2B go-to-market. Grounded in published industry research on outcomes and in the rigor traditions of equity research, industry analyst publications, investigative journalism, consulting analytics, and decision science.
The standard rests on two anchors. The first is published industry research that documents what happens when sellers do this work well. The second is the rigor traditions of adjacent analytical fields, where what counts as rigorous work is already well defined.
Analyst-grade methodology is separable from analyst-grade time investment. The intellectual discipline of thesis-driven, triangulated, provenanced, calibrated, comparably benchmarked, and counter-evidenced work is the standard. Time per artifact is an implementation consequence of unit economics. Equity research has had time-rich analyst-grade methodology for a century because its economics support one to three weeks of analyst labor per company. B2B go-to-market has historically had time-starved methodology because cold-outreach unit economics do not support time-equivalent labor. The contribution of this paper is to articulate what analyst-grade methodology looks like at GTM economics: the same intellectual discipline, a fundamentally different production model, and an implementation-agnostic standard.
Applied to B2B pre-touch research, the methodology dimensions describe a category of work that goes beyond firmographic enrichment plus one trigger event. Analyst-grade pre-touch research defends a thesis with triangulated evidence, sources its claims per-claim, calibrates confidence visibly, benchmarks fit against a defined target profile, and acknowledges what would invalidate the recommendation. The difference between this category and standard pre-touch research shows up in funnel performance (Section 2) and in the quality of conversation a rep can hold with the buyer.
The paper introduces the term analyst-grade as the name for this category of work in B2B go-to-market. The methodology standards the term names are practiced and codified across the analytical traditions in Section 3; their formal application to GTM pre-touch research is what this paper sets out.
This paper is published by NIEOS. NIEOS operates one commercial implementation of the methodology described here, called SDR Flow. The methodology in this paper is intended to outlive any specific implementation. Where SDR Flow's design choices are referenced, they appear in Appendix D (Implementation Profiles) as one of five sample implementations evaluated against the same methodology and coverage standards.
This is a standard for where the standard matters. Strategic accounts where rigor and an educated first touch are critical justify the time investment described in Section 6. For mid-market and high-volume motions, the methodology serves as a reference that can be applied in parts. Not every dimension and not every category is required at every prospect tier. Section 10 makes the selective-application logic explicit. The full standard is the bar; selective application is the practical use case for most teams.
Revenue operations leaders. Sales leadership. Procurement teams evaluating intelligence vendors. Practitioners building or auditing their own pre-touch process. The assessment framework in Section 11 works against any artifact, whether vendor-produced, agency-produced, in-house, or engine-produced. The substance applies the same way regardless of who does the work.
The outcomes of deep, evidence-based pre-touch research are well documented. This section gathers the findings most relevant to the substance the rest of the paper sets out. Each finding uses the language of the original research. A short bridge at the end of each subsection ties it back to the framework.
Gartner's B2B Buying Journey research finds that 73% of B2B buyers actively avoid suppliers that send irrelevant outreach [R-01]. The same body of research documents that buying groups average 6–10 stakeholders. Each member completes four to five distinct "buyer jobs" at once. Generic outreach reaches a sophisticated, multi-person evaluation that has already learned to filter low-context messages out.
What this means for the framework: pre-touch research depth is not a marginal improvement. Outreach without it damages the supplier's standing with the buyer.
LinkedIn's State of Sales research finds that top quota performers research differently from the rest. 76% of top performers say they "always" conduct research on their buyers before reaching out, compared to just 38% of those who fail to meet target [R-06]. The same research notes that incomplete or stale data stalls sellers and stalls deals. That points to a quality threshold below which research stops producing the outcome.
What this means for the framework: the behavior that separates top performers from the rest is rigorous pre-touch research. Tools and processes that fall below the quality threshold don't produce the top-performer result.
Published cold-outbound benchmarks from Apollo, Smartlead, Lavender, and Belkins consistently report a clear gradient by personalization depth. The table below shows the range each source publishes for each depth band, with the anchoring vendor and the exact claim cited per row:
| Personalization depth | Reply rate range | Lift vs generic | Anchored by (verified sources) |
|---|---|---|---|
| Generic outreach | 1–3% | baseline | Apollo (<2% "below average") [R-07a] · Belkins (larger lists 2.1%) [R-07d] |
| Enriched (firmographic + role + trigger) | 3–6% | ~2–3× | Apollo (3–5% "well-run") [R-07a] · Smartlead (3.43% platform avg, 5%+ "good") [R-07b] · Belkins (5.1% overall, 5.8% small-targeted) [R-07d] |
| Deeply personalized (account context + buyer-specific framing) | 7–20%+ | ~7× | Apollo (5–8% "strong", 10%+ "excellent") [R-07a] · Smartlead (advanced personalization up to 18%) [R-07b] · Lavender (20.5% across ~50K active Lavender inboxes, self-selected sample) [R-07c] |
Backlinko's outreach benchmarks report that personalized subject and body copy improve response by roughly 30.5% to 32.7% on top of baseline rates [R-04]. Gong's 28M+ cold-email research finds that top reps generate roughly 4.2× more replies than average reps, and that personalization variables compound rather than substitute [R-05].
What this means for the framework: the funnel improvement from depth is measured, not theoretical. The roughly 7× reply-rate gradient between generic and deeply personalized outbound sets an order-of-magnitude expectation for the difference between rigorous and basic pre-touch research. The Lavender 20.5% figure reflects a self-selected sample of teams already invested in personalization tooling, so it should be read as a ceiling, not a baseline.
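As a rough arithmetic check on the lift column, the sketch below takes the midpoint of each published reply-rate range and divides it by the generic midpoint. The midpoint method is an assumption made here for illustration; the sources do not state how the lift figures were derived, and the top band is open-ended (20%+), so its midpoint understates the upper tail.

```python
# Reply-rate bands in percent, copied from the table above.
# The top band is published as "7-20%+"; 20 is used as a conservative cap.
BANDS = {
    "generic": (1.0, 3.0),
    "enriched": (3.0, 6.0),
    "deeply_personalized": (7.0, 20.0),
}


def midpoint(band):
    """Midpoint of a (low, high) range."""
    low, high = band
    return (low + high) / 2


def lift_vs_generic(depth):
    """Ratio of a band's midpoint to the generic midpoint (assumed method)."""
    return midpoint(BANDS[depth]) / midpoint(BANDS["generic"])


for depth in BANDS:
    print(f"{depth}: {lift_vs_generic(depth):.2f}x")
```

Under this assumption the enriched band comes out at 2.25× and the deeply personalized band at 6.75×, which is consistent with the ~2–3× and ~7× figures in the table.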
Forrester's Demand Unit Waterfall research [R-02] reframes B2B demand around the buying committee as the unit of demand. The individual lead is not the unit. The implication for pre-touch research is structural. Targeting individual contacts in isolation misses the multi-stakeholder dynamics that actually drive purchase decisions. The same point appears throughout Gartner's research on buying-group size and complexity [R-01].
What this means for the framework: research that treats prospects as isolated individuals fails on a documented dimension. Account-aware and committee-aware context is part of the substance, not an enhancement.
Three documented costs of operating at basic-enrichment depth instead of rigorous depth:
The research in this section validates the category of approach. Deep, evidence-based, target-calibrated pre-touch work produces better outcomes. The research does not validate any specific articulation of the substance, and certainly not any specific vendor's implementation. The framework in the rest of the paper is a perspective on what the substance looks like. The industry data is the reason the substance is worth setting out.
Industry research documents that pre-touch research at this depth produces measurably better outcomes. Adjacent analytical fields document what rigor at this depth means. Sections 4 to 8 set out a small number of conditions any rigorous implementation would satisfy. The substance is what the industry has validated. The conditions are what adjacent fields have already established as rigor.
Rigorous analytical work has shared markers across domains: a defended thesis, triangulated evidence, sourced claims, calibrated confidence, comparable benchmarking, and acknowledged counter-evidence. The five traditions below each developed their own version of these markers.
Sell-side and buy-side equity research operate under the most formally codified version of these standards. Thesis-driven analysis with a target outcome. Mandatory source disclosure for material claims. Calibrated language for uncertainty. Comparable-company benchmarking. Disclosed valuation methodology.
The CFA Institute's Standards of Professional Conduct [R-10] codifies these expectations. The standards require that analysts have a "reasonable and adequate basis" for any recommendation. Material assumptions must be disclosed. SEC Regulation AC requires research analysts to certify that the views in their reports accurately reflect their personal views.
Triangulation: Mosaic theory, combining publicly available material from multiple sources to reach a conclusion no single source supports alone.
Provenance: Source disclosure for material claims, as required by CFA standards.
Calibration: Comparable-company analysis, valuation against a peer set, not against a generic ideal.
Calibrated language: Buy / Hold / Sell ratings with stated confidence, target prices, and risk factors.
Industry analyst firms (Gartner, Forrester, IDC) publish methodology documents that describe how their analytical products are produced. Gartner's Magic Quadrant Methodology [R-12] discloses the evaluation criteria, the weightings applied to each criterion, the data sources consulted, and the multi-pass review process. Forrester's Wave Methodology [R-13] follows the same convention.
Both make a structural choice that is itself a rigor standard. The methodology is disclosed alongside the conclusion. Readers can assess the basis for the conclusion, not just the conclusion.
Disclosed evaluation criteria: The dimensions on which vendors are evaluated are stated, with weightings.
Multi-source primary research: Vendor briefings, customer references, surveys, secondary research, convergent evidence required.
Structural format: Same format across every Magic Quadrant or Wave, so readers can navigate consistently.
Methodology disclosure: The published methodology is itself an artifact, separable from any individual report.
Investigative journalism developed rigorous source standards independently. The Investigative Reporters and Editors body (IRE) codifies practices that include the two-source rule (claims about contested facts require independent corroboration), document trails (citing the documents that ground a claim), preference for named over anonymous sources, and explicit acknowledgment of competing accounts. ProPublica and similar organizations publish methodology disclosures alongside major investigations [R-14].
Multi-source corroboration: The two-source rule for contested claims.
Document trails: Provenance per claim through cited documents.
Competing accounts acknowledged: Alternative interpretations stated, not buried.
Top-tier consulting firms (McKinsey, BCG, Bain) operate under analytical conventions that are themselves a rigor standard. MECE structure for problem decomposition. Hypothesis-driven research. Pyramid argumentation.
Barbara Minto's The Pyramid Principle [R-08] is the canonical reference for consulting analytical structure. It codifies the operational format: lead with the answer, support it with a structured argument, ensure the argument is logically complete. The format itself is a rigor commitment. It forces analytical completeness because the structure shows gaps.
Operational structure: Pyramid Principle, answer first, structured support below.
MECE decomposition: Problem coverage that is logically complete and non-overlapping.
Hypothesis-driven: Analysis serves a thesis; the thesis is testable against evidence.
The methods used for evaluating multiple options against multiple criteria are well established in operations research and decision science. Multi-Criteria Decision Analysis (MCDA) is the umbrella term. The Analytic Hierarchy Process (AHP), developed by Thomas Saaty [R-15], is one of the most widely cited specific methods. Conjoint analysis is the standard technique for revealed-preference work, inferring what attributes drive choice from observed decisions rather than stated preferences [R-16].
These methods underpin rigorous evaluation work across operations research, market research, healthcare decision-making, and policy analysis.
Weighted composite scoring: Multiple dimensions evaluated with disclosed weights, composited deterministically.
Revealed-preference analysis: Inferring what matters from observed evidence, not from stated claims.
Peer-set normalization: Comparison within a defined peer set rather than absolute scoring.
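Peer-set normalization can be illustrated with a percentile rank. The sketch below is a minimal example, assuming the "at or below" definition of percentile rank (one of several in common use); the metric and peer values are hypothetical.

```python
def percentile_rank(value, peer_values):
    """Percent of peers scoring at or below `value` (one common definition)."""
    if not peer_values:
        raise ValueError("peer set must not be empty")
    at_or_below = sum(1 for v in peer_values if v <= value)
    return 100.0 * at_or_below / len(peer_values)


# Hypothetical metric: annual headcount growth (%) for two different peer sets.
smb_peers = [4.0, 9.0, 12.0, 15.0, 22.0]
high_growth_peers = [12.0, 30.0, 41.0, 55.0]

# The same absolute value ranks differently depending on the peer set.
print(percentile_rank(12.0, smb_peers))          # 60.0
print(percentile_rank(12.0, high_growth_peers))  # 25.0
```

The same raw value lands at the 60th percentile in one peer set and the 25th in the other, which is why a score is only interpretable once the comparison set is named.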
Five conditions recur across these adjacent fields, each appearing in at least three of them:
| Condition | Appears in |
|---|---|
| Depth (triangulation / multi-source corroboration) | Equity research (mosaic theory) · Industry analyst (multi-source) · Journalism (two-source rule) · Consulting (MECE) |
| Structure (operational, consistent format) | Equity research (standard report sections) · Industry analyst (Magic Quadrant format) · Consulting (Pyramid Principle) |
| Evidence (provenance + confidence) | Equity research (CFA source disclosure) · Industry analyst (methodology disclosure) · Journalism (document trails) |
| Calibration (target-specific, not generic) | Equity research (comparable-company analysis) · Industry analyst (criteria + weighting disclosure) · Decision science (peer-set normalization) |
| Time-equivalence (depth comparable to skilled analyst work) | Equity research (standard report time investment) · Industry analyst (Magic Quadrant production cycles) · Consulting (MECE completeness) |
Equity research informs thesis-driven targeting (5.1), triangulated evidence (5.2 via mosaic theory), per-claim provenance (5.3 via CFA Standards), calibrated confidence (5.4 via Reg AC), comparable benchmarking (5.5 via peer-set analysis), and counter-evidence (5.6 via risk factors). Investigative journalism informs triangulated evidence (5.2 via two-source rule), per-claim provenance (5.3 via document trails), and counter-evidence (5.6 via competing accounts). Consulting analytics informs thesis-driven targeting (5.1 via hypothesis-driven analysis) and counter-evidence (5.6 via pre-mortem analysis). Decision science informs comparable benchmarking (5.5 via multi-criteria evaluation). Equity research touches all six dimensions, but no single domain supplies the full set of precedents; the methodology synthesizes across them.
Per-account or per-prospect intelligence work that applies six methodology dimensions across seven research categories. Output can come from analysts, engines, agencies, or hybrid pipelines. Production method is not a substance criterion.
The term analyst-grade is this paper's name for the methodology-and-coverage bundle. Practitioners who prefer different labels (research-grade, analyst-equivalent, structured pre-touch intelligence) are working on the same substance. The label is portable; what matters is whether the methodology dimensions and coverage requirements are operative in the artifact. Readers who reject the label but adopt the substance get the benefit.
The six methodology dimensions in Section 5 describe properties of the research and its output. A senior analyst can produce work that fails the dimensions. An engine can produce work that satisfies them. What's in the artifact is what counts, not what produced it.
The artifact contains a defensible thesis: why this account, why this prospect, why now. The thesis is testable against evidence.
Every analyst-grade artifact contains a defended thesis. A peer reviewer can examine the supporting research and judge whether the thesis holds or fails. The thesis names the buyer-specific reason this opportunity is worth pursuing now, not generic enthusiasm and not a templated "this looks interesting" gesture.
Without a defended thesis, the artifact is a data dump. Research collected without a thesis to test is not analyst-grade work; it is preparation pretending to be analysis. A thesis forces the analyst to commit, which forces the analyst to defend, which forces the analyst to evaluate evidence rather than just collect it.
Equity research: every Buy / Hold / Sell rating defends a thesis (revenue acceleration, multiple expansion, margin recovery). Consulting hypothesis-driven analysis: recommendations rest on a stated hypothesis that the work supports or invalidates. Industry analyst research: Magic Quadrant placements rest on disclosed evaluation criteria, not on undefended positions.
Read the artifact and state its thesis in one sentence. Then identify the three pieces of evidence in the artifact that most directly support the thesis. If the thesis cannot be stated, or the support cannot be located, the artifact fails thesis-driven targeting.
No material conclusion sits on a single source. Cross-source corroboration is required for the claims that drive the recommendation.
For every material claim, the artifact draws on evidence from at least two independent source classes. A claim grounded in firmographic data alone, LinkedIn signals alone, or a single press release alone is not triangulated and is flagged as such.
Single-source claims fail in two directions. They fail upward when the source is wrong (or misread) and the conclusion propagates unchecked. They fail downward when the consumer cannot calibrate confidence because they cannot see what else supports the claim. Mosaic theory exists because no single source supports a serious conclusion alone.
Equity research mosaic theory: combining publicly available material from multiple sources to reach a conclusion no single source supports alone. Investigative journalism two-source rule (Investigative Reporters and Editors). Industry analyst multi-source primary research (vendor briefings, customer references, surveys, secondary research combined).
Pick any three material claims in the artifact. For each, identify the source classes that support the claim. If any material claim has only one source class behind it without an explicit single-source flag, the artifact fails triangulated evidence.
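The triangulation test above can be sketched as a simple check over claim records. This is an illustrative sketch, not a prescribed schema: the `Claim` structure, its field names, and the source-class labels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    """A material claim and the independent source classes behind it (hypothetical schema)."""
    text: str
    source_classes: set = field(default_factory=set)
    single_source_flag: bool = False  # explicit single-source label shown in the artifact


def untriangulated(claims):
    """Claims with fewer than two source classes and no explicit single-source flag."""
    return [
        c for c in claims
        if len(c.source_classes) < 2 and not c.single_source_flag
    ]


def passes_triangulation(claims):
    """True only if every claim is triangulated or explicitly flagged."""
    return len(untriangulated(claims)) == 0


claims = [
    Claim("Hiring for a RevOps lead", {"job_posting", "linkedin"}),
    Claim("Shifting to a mid-market motion", {"press_release"}, single_source_flag=True),
    Claim("Evaluating a CRM migration", {"press_release"}),
]

for c in untriangulated(claims):
    print("Unflagged single-source claim:", c.text)
```

Here the second claim passes because its single-source status is flagged, while the third fails: it rests on one source class and says nothing about it.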
Every material claim ties to its source. Any conclusion in the artifact is traceable to the evidence it stands on.
Sources are cited per claim, not per section. The reader can identify the source of any material claim within five seconds, without leaving the claim. Inferred claims are distinguished from observed claims by source attribution: "the company published a press release announcing X" is sourced differently from "they are likely shifting to a mid-market motion."
Without per-claim provenance, the reader either trusts on faith or reproduces the research manually. Both collapse the value. Per-section sourcing creates an authority gradient where everything in the section appears equally well-grounded, which is rarely true. Per-claim sourcing exposes which conclusions are observation, which are inference, and which are speculation.
CFA Institute Standards of Professional Conduct: equity analysts must have a "reasonable and adequate basis" for recommendations, with material sources disclosed. SEC Regulation Analyst Certification (Reg AC) requires personal certification of views. Investigative journalism document trails (IRE). Industry analyst published methodology documents disclose source families per evaluation criterion.
Pick any three material claims at random. For each, the reader should be able to identify the source within five seconds without leaving the claim. If any takes longer than five seconds, the artifact fails per-claim provenance.
Claims carry explicit confidence levels. Speculative inferences are labeled, not presented as observation.
Material claims carry visible confidence levels (high, medium, low, or equivalent). The reader can distinguish high-confidence observation from medium-confidence inference from low-confidence speculation at a glance. Uniform confidence across all claims signals a problem, and the reader can detect it.
Without calibration, speculation is laundered as fact. A confident-sounding paragraph reads as a verified claim whether or not the analyst had evidence. Calibrated language exists in equity research because investors making capital allocation decisions must distinguish what is known from what is inferred. The same distinction matters in GTM, where reps acting on the artifact need to know which claims they can defend to a buyer who challenges them.
CFA Institute Standards: calibrated language for uncertainty is required, not optional. SEC Reg AC requires personal certification. Intelligence and forecasting communities use calibrated-language conventions (low / medium / high confidence) because uncalibrated assertion fails the consumer.
Read the artifact and identify the lowest-confidence claim. Then identify the highest-confidence claim. If the two cannot be distinguished by their language and labeling, the artifact fails calibrated confidence.
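A minimal sketch of this check, assuming claims are stored as (text, confidence) pairs. The three-level scale follows the high / medium / low convention named above; the function name and the issue messages are hypothetical.

```python
from enum import Enum


class Confidence(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


def calibration_issues(claims):
    """Return a list of calibration problems for (text, Confidence-or-None) pairs."""
    issues = []
    unlabeled = [text for text, level in claims if level is None]
    if unlabeled:
        issues.append(f"{len(unlabeled)} claim(s) have no confidence label")
    levels = {level for _, level in claims if level is not None}
    # Uniform confidence across several claims is itself a warning sign.
    if len(claims) > 1 and len(levels) == 1:
        issues.append("all labeled claims share one confidence level")
    return issues


uniform = [
    ("Raised a Series B in March", Confidence.HIGH),
    ("Likely shifting to a mid-market motion", Confidence.HIGH),
]
mixed = [
    ("Raised a Series B in March", Confidence.HIGH),
    ("Likely shifting to a mid-market motion", Confidence.MEDIUM),
]

print(calibration_issues(uniform))  # ['all labeled claims share one confidence level']
print(calibration_issues(mixed))    # []
```

A mechanical check like this only catches missing or uniform labels; whether each label is deserved still requires a reader to compare the claim with its evidence.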
Fit is expressed relative to a defined target profile, with dimensions decomposed. It is not asserted in absolute terms.
When the artifact says "good fit" or "poor fit," the reader can see what target profile the fit is expressed against and what dimensions were evaluated. The target profile is named, the dimensions are decomposed, and the comparable peer set is identifiable. Single composite scores without decomposition fail this dimension. The target profile must itself be rigorous; Section 7 addresses what that means.
Without comparable benchmarking, "fit" is asserted, not measured. "Looks like a good account" is not different from generic enthusiasm. Generic fit assessment against an unstated "ideal mid-market SaaS company" archetype fails this dimension regardless of how confidently it is stated.
Equity research comparable-company analysis: valuation against a defined peer set, never against a generic ideal. Industry analyst evaluation against disclosed criteria with weightings (Magic Quadrant Methodology). Decision-science peer-set normalization (Saaty AHP, MCDA): scoring is meaningful only within a defined comparison set.
Ask the artifact: which dimensions of our target profile does this account match, and which does it miss? If the answer cannot be rebuilt from the artifact alone (without consulting external documents), comparable benchmarking fails.
The artifact names the conditions under which the recommendation would be wrong. Disqualifying evidence is surfaced, not buried.
The artifact addresses what would have to be true for this opportunity not to be a fit. Disqualifying evidence is surfaced explicitly. The reader does not have to ask "what about X." The artifact has already considered X. The artifact could have produced "do not engage" as an output for a different prospect with the same methodology applied.
Without acknowledged counter-evidence, the artifact is sales advocacy disguised as analysis. A research artifact that produces "engage" recommendations on every prospect is not research; it is a justification engine. Disqualifying evidence is what distinguishes analysis from confirmation bias dressed in citations.
Equity research risk factors: every coverage report names what would invalidate the thesis. Investigative journalism "competing accounts": alternative interpretations are stated, not buried. Consulting pre-mortem analysis: structured identification of how the recommendation could fail.
Read the artifact and identify the conditions under which the recommendation would be wrong. If no such conditions are named, the artifact fails counter-evidence acknowledged. If the conditions are present but cannot be located within ten seconds, the artifact fails by burying them.
An artifact that satisfies five of six dimensions does not satisfy the standard. A thesis without triangulation is unsupported assertion. Triangulation without provenance is uncheckable. Provenance without calibration is uniform-confidence noise. Calibration without comparable benchmarking is generic. Comparable benchmarking without counter-evidence is sales advocacy. All six are necessary. None is sufficient alone.
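The all-six rule is a conjunction rather than a score. A minimal sketch, assuming each dimension's test has already been reduced to a pass/fail result; the dictionary keys are hypothetical identifiers for the six dimensions.

```python
DIMENSIONS = (
    "thesis_driven_targeting",
    "triangulated_evidence",
    "per_claim_provenance",
    "calibrated_confidence",
    "comparable_benchmarking",
    "counter_evidence_acknowledged",
)


def failed_dimensions(results):
    """Dimensions that were not passed; an unassessed dimension counts as failed."""
    return [d for d in DIMENSIONS if not results.get(d, False)]


def meets_standard(results):
    """True only when all six dimensions pass; there is no partial credit."""
    return not failed_dimensions(results)


five_of_six = {d: True for d in DIMENSIONS}
five_of_six["counter_evidence_acknowledged"] = False

print(meets_standard(five_of_six))     # False
print(failed_dimensions(five_of_six))  # ['counter_evidence_acknowledged']
```

Treating a missing result as a failure is a design choice made here for the sketch: a dimension that was never assessed cannot be claimed as satisfied.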
Seven research categories define the coverage standard for analyst-grade pre-touch work. Full-depth requirements below; mid-market and high-volume motions apply them selectively (Section 10).
| Category | What it requires | Time at full depth |
|---|---|---|
| Persona psychographics | Model of how this buyer thinks, what frames resonate, what they fear, what they reward publicly. Not their role, their wiring. | 15–30 min |
| Product Fit | Whether what we sell addresses their stated and inferred problem space in the right shape for their stage, motion, and stack, including integration friction, migration cost, and political cost of displacement. | 15–45 min |
| Competitor Landscape | Who else is selling to this account, who is installed, who has been evaluated and rejected, and what is the displacement story for each. | 15–45 min |
| Industry Context | Macro, sector, and segment dynamics that make this buyer more or less buyable right now, independent of the account itself. | 15–30 min |
| Gap Analysis | Current state vs. target state for the prospect. Where they are operationally, where they say they want to be, and where the actual gaps live. | 15–45 min |
| Stakeholder Analysis | Full buying committee map. Economic buyer, technical buyer, user, champion, blocker, influencer. With named individuals where possible. | 60–150 min |
| Outreach Cadence | Based on prospect-specific signals, the inferred optimal cadence: channel mix, frequency, sequence, timing, and expected pushback playbook. | 15–30 min |
| Total at full depth | | 2.5–6.25 hours |
The published literature on per-category pre-touch research time is uneven. Outreach Cadence has the strongest primary support: Lavender's 15–20 min manual baseline, corroborated by the McKinsey State of AI in Sales figure of 20–30 min for SDR research. Stakeholder Analysis is derived bottom-up: Clay's published 15 min per prospect for manual lookup, multiplied by the 6–10 stakeholders per buying group in Gartner's B2B Buying Journey research [R-01], gives 90–150 min for individual lookup, plus role labeling and committee synthesis; Prospeo's account-mapping guide separately publishes 30–90 min for initial mapping. Two categories have proxy benchmarks: Persona psychographics (vendor-tool consumption baselines from Crystal Knows plus manual signal review) and Gap Analysis (discovery-prep proxies from Sandler, HubSpot, and Highspot pre-call guides). The remaining three are triangulated from adjacent benchmarks: Product Fit (Loopio's RFP report gives 33 hours per response, with a fit-analysis subset of roughly 20–30%; ABM account research runs 2–4 hours, with a fit subset of 25–40%); Competitor Landscape (Klue battlecard work at 16 hours initial per competitor, applied to a single account rather than built per account); and Industry Context (Buyer Persona Institute 15-hour segment research amortized across named accounts, plus a per-prospect timing-signal refresh).
Each category is a coverage requirement, not an output specification. The artifact does not have to ship a particular field count, source family count, or visualization count to satisfy the category. Implementation profiles in Appendix D show how different production models (in-house analyst desk, RevOps stack, agency, software hybrid, pure software) cover these seven categories at different depths. The implementation is a design choice; the coverage is the standard.
The 2.5 to 6.25 hour range is the sum of the per-category times above. It applies when all seven categories are researched at full depth. Section 10 covers when less than full depth is the right choice and which dimensions and categories scale down at mid-market and high-volume tiers.
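The total can be reproduced directly from the per-category ranges in the table. The sketch below sums them and also shows how a partial selection of categories changes the range; the function name and the example subset are illustrative, not a recommended tiering.

```python
# Per-category time at full depth, in minutes (low, high), from the table above.
CATEGORY_MINUTES = {
    "Persona psychographics": (15, 30),
    "Product Fit": (15, 45),
    "Competitor Landscape": (15, 45),
    "Industry Context": (15, 30),
    "Gap Analysis": (15, 45),
    "Stakeholder Analysis": (60, 150),
    "Outreach Cadence": (15, 30),
}


def total_hours(selected=None):
    """Sum the low and high bounds, in hours, for the selected categories (default: all)."""
    names = list(CATEGORY_MINUTES) if selected is None else list(selected)
    low = sum(CATEGORY_MINUTES[name][0] for name in names)
    high = sum(CATEGORY_MINUTES[name][1] for name in names)
    return low / 60, high / 60


print(total_hours())  # (2.5, 6.25)

# Hypothetical partial application: two categories only.
print(total_hours(["Stakeholder Analysis", "Outreach Cadence"]))  # (1.25, 3.0)
```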
The compression on the consumption side matters as much as on the production side. An artifact meeting the standard should be readable and operationally usable in roughly 6 minutes. Long enough to absorb strategic context, engagement plan, and qualification overlay. Not so long that pre-touch review becomes its own cost center. The 6-minute review benchmark assumes structural consistency. Without it, review time balloons.
Calibration (Condition 4) requires fit assessment against a target profile. The condition is meaningful only to the extent that the target profile is itself rigorous. A perfectly calibrated artifact scored against a wishlist target produces polished output against a weak benchmark. That is not rigorous work. This section addresses what makes a target profile rigorous.
A target profile is the substrate behind pre-touch research. Every fit score, every prioritization decision, every exclusion that filters out poor-fit accounts, all derive their validity from the substrate. Substrate that is thin or undefined produces downstream output that is thin and undefined. The artifact format does not save it.
The most common substrate failures, observed in practice:
A rigorous target profile satisfies four conditions of its own:
The profile is built through a stated methodology, not asserted. The methodology specifies what evidence sources are consulted, how dimensions are scored, how the composite is weighted, and how confidence is tracked.
The profile is defended against more than one category of evidence. Forward-looking strategic evidence (market positioning, segment thesis, customer voice, competitive landscape, persona behavior) and historical performance evidence (won-deal patterns, lost-deal patterns, retention data, comparable peer analysis) are two categories commonly used. The principle is the same as triangulated evidence in Section 5.2: no defining claim about who fits can rest on a single category. The number of evidence categories is implementation-specific; the methodology requires more than one.
Fit is reported as a composite score with exposed per-dimension decomposition, not a single number. The consumer (and downstream research) can see which dimensions drive the score for any account.
The profile is bound to analytics tracking actual outcomes per fit tier. Calibration becomes testable. Tier A accounts should outperform Tier B. If they don't, the substrate is refined. Static profiles drift. Tracked profiles improve.
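The last two conditions lend themselves to a short sketch: a composite score that keeps its per-dimension contributions visible, and a check that higher tiers actually outperform lower ones. Dimension names, weights, scores, and outcome records below are all hypothetical; win rate is used as the outcome metric only for illustration.

```python
def composite_fit(scores, weights):
    """Weighted composite plus the per-dimension contributions behind it."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same dimensions")
    total_weight = sum(weights.values())
    contributions = {d: scores[d] * weights[d] / total_weight for d in scores}
    return sum(contributions.values()), contributions


def win_rate_by_tier(outcomes):
    """outcomes: iterable of (tier, won) pairs -> {tier: win rate}."""
    totals, wins = {}, {}
    for tier, won in outcomes:
        totals[tier] = totals.get(tier, 0) + 1
        wins[tier] = wins.get(tier, 0) + int(won)
    return {tier: wins[tier] / totals[tier] for tier in totals}


def tiers_ordered(rates, tier_order):
    """True if each tier's rate is strictly higher than the next tier's."""
    observed = [rates[t] for t in tier_order if t in rates]
    return all(a > b for a, b in zip(observed, observed[1:]))


weights = {"segment_fit": 50, "stack_fit": 30, "timing": 20}
scores = {"segment_fit": 80, "stack_fit": 40, "timing": 90}
composite, contributions = composite_fit(scores, weights)
print(composite, contributions)

outcomes = [("A", True), ("A", True), ("A", False),
            ("B", True), ("B", False), ("B", False)]
rates = win_rate_by_tier(outcomes)
print(tiers_ordered(rates, ["A", "B"]))  # True
```

Returning the contributions alongside the composite lets a reader see that, in this example, a weak stack fit is being offset by segment fit and timing. If `tiers_ordered` returned False on real outcome data, that would be the signal to revisit the weights or the profile itself.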
The methods used for rigorous target-profile work are well established in operations research and market research:
| Method | Precedent | What it contributes |
|---|---|---|
| Multi-Criteria Decision Analysis (MCDA) | Operations research, multi-decade literature | Weighted composite scoring with disclosed weights |
| Analytic Hierarchy Process (AHP) | Saaty, foundational decision-science framework [R-15] | Hierarchical decomposition + pairwise comparison for weight derivation |
| Conjoint analysis | Marketing research, foundational technique [R-16] | Revealed-preference inference of attribute weights from observed evidence |
| Percentile-rank normalization | Standard descriptive statistics | Scoring within peer sets rather than absolute scales |
| Segmentation theory | Smith (1956) onward; foundational market-research paradigm | Structured market subdivision with criteria-based grouping |
None of these methods are exotic. MCDA, AHP, conjoint analysis, and percentile-rank normalization are taught in operations-research and market-research curricula. They are used across operations, healthcare decision-making, policy analysis, and product development. Rigorous target-profile work applies these methods to a domain (B2B sales targeting) where they have not been formalized at scale.
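To make the AHP row in the table concrete, the sketch below derives dimension weights from a pairwise comparison matrix. It uses the row geometric mean method, a common approximation to the principal-eigenvector calculation in Saaty's original formulation, and it omits the consistency check. The three dimensions and the judgment values are hypothetical.

```python
import math


def ahp_weights(matrix):
    """Approximate AHP priority weights using the row geometric mean method."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("pairwise comparison matrix must be square")
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical judgments: pairwise[i][j] says how much more important
# dimension i is than dimension j; pairwise[j][i] is the reciprocal.
dimensions = ["segment_fit", "stack_fit", "timing"]
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]

weights = ahp_weights(pairwise)
for name, weight in zip(dimensions, weights):
    print(f"{name}: {weight:.3f}")
```

The resulting weights sum to one and preserve the ordering implied by the judgments, so they can be disclosed alongside any composite score built from them.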
Eight properties define what a reader encounters in an analyst-grade artifact. Format does not change them: a PDF, structured dashboard, document, dossier, or scored report manifests all eight in the same way.
An analyst-grade artifact opens with a defended position. The reader knows within the first screen or first paragraph: why this account, why this prospect, why now. The thesis is not buried in section three. It is the first thing the reader encounters because every other claim in the artifact serves it.
The reader can confirm that the seven research categories were addressed without searching the artifact. Structure makes coverage scannable. Missing coverage is visible as a gap, not hidden by formatting.
For any claim that drives the recommendation, the reader sees more than one source. Single-source claims are flagged as such. The reader does not have to take any material conclusion on faith.
The reader can identify the source of any material claim without leaving the claim. Provenance is visible where the claim lives, not buried in an appendix.
Claims carry visible confidence levels. The reader can distinguish high-confidence observation from medium-confidence inference from low-confidence speculation at a glance. Uniform high-confidence ratings across all claims signal a problem, and the reader can detect it.
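The triangulation, provenance, and confidence properties above lend themselves to simple mechanical checks. The sketch below assumes a hypothetical `Claim` record with inline source labels and a three-level confidence field. Counting distinct source labels is only a proxy for triangulation, and the example claims are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    text: str
    confidence: str                               # "high", "medium", or "low"
    sources: list = field(default_factory=list)   # provenance kept with the claim
    material: bool = True                         # does it drive the recommendation?


def single_source_claims(claims):
    """Material claims backed by fewer than two distinct sources."""
    return [c for c in claims if c.material and len(set(c.sources)) < 2]


def confidence_is_uniformly_high(claims):
    """True when every claim is rated high, which the standard treats as a warning sign."""
    return bool(claims) and all(c.confidence == "high" for c in claims)


# Made-up claims for illustration only.
claims = [
    Claim("Account is migrating CRM platforms", "high",
          ["job postings", "earnings call"]),
    Claim("Budget cycle resets in Q3", "medium", ["earnings call"]),
    Claim("Champion is open to a vendor change", "low",
          ["social post", "social post"]),
]
flagged = single_source_claims(claims)
suspicious = confidence_is_uniformly_high(claims)
```

Here the second and third claims are flagged as single-source (a repeated source counts once), and the mixed confidence levels mean the uniform-high warning does not fire.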
When the artifact says "good fit" or "poor fit," the reader sees what target profile the fit is expressed against and what dimensions were evaluated. Fit without benchmark is not analyst-grade and the reader can see the absence.
The reader encounters the conditions under which the recommendation would be wrong. Disqualifying evidence is surfaced. The reader does not have to ask "what about X." The artifact has already considered X.
The reader closes the artifact knowing what to do next: with what opener, on what channel, in what window, anticipating what pushback. An artifact that researches but does not recommend is a brief, not analyst-grade work product.
The eight reader-experience properties above are visible regardless of format or production model. Appendix D documents five sample implementations (in-house analyst desk, RevOps stack, agency, software hybrid, pure software) and rates each against the six methodology dimensions from which these properties derive, across different production cost points.
A working framework that does not acknowledge its limits is brittle. This section names where the framework does not apply, where the derivation is weakest, and what is open for refinement.
Every claim in this paper falls into one of five categories. The categories are not interchangeable. Readers should weight conclusions accordingly.
The framework is designed for outbound and hybrid motions where pre-touch research depth is a value driver. Pure inbound motions have different research requirements (faster response, lighter pre-touch depth, more emphasis on real-time conversation prep). The conditions may apply in modified form. The operational benchmark (~6.5 hr) likely does not.
The framework assumes per-prospect calibration (Condition 4). Pure account-only intelligence, without a named prospect, can satisfy depth, structure, evidence, and time-equivalence. But calibration is harder to satisfy without prospect-level fit data. A modified version of the framework for account-only artifacts may be warranted.
Heavily regulated industries (healthcare with HIPAA constraints, defense, financial services compliance) and non-English-language markets may not support the source-diversity description. The core conditions still apply. The specific output spec numbers may need calibration.
Healthcare, defense, and certain financial-services contexts have data structures and compliance constraints that bound what enrichment is permissible. The framework does not address sector-specific calibration. Sector-specific extensions would strengthen the framework.
The 2.5 to 6.25 hour full-depth per-prospect range in Section 6 is partially supported by published benchmarks, but no primary study measures total full-depth pre-touch research time directly. One category (Outreach Cadence) has the strongest primary support; one (Stakeholder Analysis) is derived bottom-up from Clay and Gartner inputs; two have proxy benchmarks; three are triangulated from adjacent benchmarks. A formal time-and-motion study, sampling senior B2B analysts across multiple firms and measuring focused research time per category on standardized briefs, would supersede the current derivation. This remains the weakest empirical claim in the framework.
The Section 11 checklist has not been validated for inter-rater reliability. Two reviewers running it against the same artifact may score differently. The magnitude of disagreement has not been measured. Psychometric validation (inter-rater reliability, test-retest reliability) would strengthen the assessment instrument.
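One common way to quantify agreement between two reviewers is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is an illustration only: the paper does not prescribe a statistic, and the ratings are made up. The labels mirror the meets, partially meets, does not meet scale used in the assessment worksheet.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters scoring the same items with categorical labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(ratings_a)
    # Observed agreement: share of items where both raters gave the same label.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: from each rater's marginal label frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Made-up scores from two reviewers across the six methodology dimensions.
reviewer_a = ["meets", "meets", "partial", "does_not", "meets", "partial"]
reviewer_b = ["meets", "partial", "partial", "does_not", "meets", "meets"]
kappa = cohens_kappa(reviewer_a, reviewer_b)
```

A kappa of 1 indicates perfect agreement and 0 indicates agreement no better than chance. A validation study would compute this across many artifacts and reviewer pairs rather than a single six-item comparison.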
This is a practitioner publication. It has not been peer-reviewed by external industry figures. Readers should assess the substance directly against the supporting evidence and precedent.
Most GTM motions do not operate at strategic tier across their full prospect base. Two selection decisions reduce the standard to what is operationally feasible at lower tiers.
Methodology dimension selection. Some dimensions enforce easily at scale; others require analyst judgment. Per-claim provenance, calibrated confidence, and comparable benchmarking can be enforced through tooling and templates with minimal per-prospect human time. Thesis-driven targeting and counter-evidence acknowledged require analyst judgment and are harder to enforce at scale. Pick the dimensions most relevant to your motion's failure modes.
Research-category selection. Some categories are decision-critical for your segment; others are useful but not required. Stakeholder Analysis is decision-critical for enterprise deals where the buying committee determines outcome. Outreach Cadence is decision-critical for high-volume motions where channel and timing fit determine response. Pick the categories most decision-relevant for your tier and segment.
| Tier | Methodology dimensions typically applied | Categories typically applied | Per-prospect time |
|---|---|---|---|
| Strategic | All six | All seven | 2.5–6.25 hr |
| Mid-market | Thesis-driven targeting, per-claim provenance, calibrated confidence, comparable benchmarking | Stakeholder Analysis, Product Fit, Gap Analysis, Outreach Cadence | 1.5–3 hr |
| High-volume | Per-claim provenance, comparable benchmarking | Persona psychographics, Outreach Cadence | Under 30 min |
The selections above are illustrative defaults. Actual selection depends on the motion's specific failure modes, segment dynamics, and deal economics. A high-volume motion in regulated industries may require Counter-evidence acknowledged on every artifact for compliance reasons. A strategic motion in commodity segments may de-emphasize Persona psychographics if buyers are interchangeable. The standard is the reference; selection is the practitioner's job.
An artifact that applies four methodology dimensions across four research categories at mid-market tier is not less rigorous than a strategic-tier full-depth artifact. It is appropriately rigorous for its tier. Rigor is the fit between the methodology applied and the decision the artifact supports. The full standard is the reference; the appropriate application is what matters.
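The illustrative defaults in the tier table can be encoded as data so an assessor or tool can check an artifact against the selection for its tier. This is a sketch: the structure and the `missing_dimensions` helper are hypothetical, and the strategic tier's categories are recorded only as "all seven" because the full category list is defined in Section 6 rather than here.

```python
ALL_DIMENSIONS = (
    "Thesis-driven targeting",
    "Triangulated evidence",
    "Per-claim provenance",
    "Calibrated confidence",
    "Comparable benchmarking",
    "Counter-evidence acknowledged",
)

# Illustrative defaults copied from the tier table; practitioners adjust them.
TIER_DEFAULTS = {
    "strategic": {
        "dimensions": ALL_DIMENSIONS,
        "categories": "all seven",
        "time": "2.5-6.25 hr",
    },
    "mid_market": {
        "dimensions": (
            "Thesis-driven targeting",
            "Per-claim provenance",
            "Calibrated confidence",
            "Comparable benchmarking",
        ),
        "categories": (
            "Stakeholder Analysis",
            "Product Fit",
            "Gap Analysis",
            "Outreach Cadence",
        ),
        "time": "1.5-3 hr",
    },
    "high_volume": {
        "dimensions": ("Per-claim provenance", "Comparable benchmarking"),
        "categories": ("Persona psychographics", "Outreach Cadence"),
        "time": "under 30 min",
    },
}


def missing_dimensions(tier, dimensions_met):
    """Dimensions selected for the tier that the artifact does not yet meet."""
    required = TIER_DEFAULTS[tier]["dimensions"]
    return [d for d in required if d not in dimensions_met]


gaps = missing_dimensions("high_volume", {"Per-claim provenance"})
```

Keeping the selection as data rather than logic makes it easy to override per motion, for example adding "Counter-evidence acknowledged" to a high-volume tier operating in a regulated industry.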
A checklist for evaluating whether a specific artifact meets the analyst-grade standard. The assessor first identifies the opportunity tier (Section 10) the artifact is operating at, then runs the methodology and coverage questions. The framework works against vendor-produced output, agency output, in-house prep, or any other implementation.
Identify the tier the artifact is operating at before applying the rest of the checklist. Tier definitions and selection guidance are in Section 10.2. Methodology dimensions (11.1 through 11.6 below) apply at every tier. Coverage requirements (11.7) scale with tier.
A faster assessment uses the eight reader-experience properties from Section 8. If the reader can verify all eight properties in the artifact, the artifact passes the methodology bar. The Section 8 properties are derived from the Section 5 dimensions and are designed to be verifiable by inspection, without a deep audit.
The framework does not score on number of fields, categories, source families, or visualizations; on engine economics, production cost, or vendor pricing; on specific tooling or vendor choices; on aesthetic or formatting preferences; or on whether the artifact takes a specific number of hours to produce. Earlier versions of this paper asked: "Does the artifact ship at least 15 source families, 16 categories, 8 visualizations? Does the vendor charge above $500 per output? Would a senior analyst given 6.5 hours produce materially more depth?" Those criteria have been removed because they were implementation choices disguised as methodology standards.
Note: Inter-rater reliability has not been formally validated. Two reviewers may score differently on the same artifact.
This framework is offered for use, free, with attribution to NIEOS. Apply it to assess existing pre-touch work, evaluate vendor output, structure leadership conversations, or build new process. The methodology dimensions and coverage requirements are portable across vendors, agencies, internal teams, and hybrid arrangements. Comments and refinements welcome at whitepaper@nieos.com. Published under Creative Commons Attribution 4.0 (CC BY 4.0).
Sources cited in this paper, grouped by category.
Key terms used in this paper.
A structured worksheet for documenting a review of any pre-touch research artifact, whether internal team output, vendor-produced output, or another implementation. Designed for printing and completing.
Identify the tier this artifact is operating at before assessing methodology and coverage.
For each of the six methodology dimensions (Section 5), indicate whether the artifact meets the dimension, partially meets it, or does not meet it. Methodology dimensions apply at every tier.
For each of the seven research categories (Section 6), verify coverage at the depth appropriate to the tier identified above.
Is the target profile against which the artifact is calibrated itself rigorous (Section 7 substrate conditions)?
Do the eight reader-experience properties from Section 8 hold when reading the artifact?
Five sample implementations of analyst-grade pre-touch research methodology. Each profile is rated against the six methodology dimensions (Section 5) and indicates typical research-category depth (Section 6). The ratings are illustrative snapshots of how each implementation typically operates. Any specific instance may operate at higher or lower rigor than the profile.
Profile 1: In-house analyst desk. A dedicated internal team of senior analysts producing per-prospect dossiers manually. Closest analog to equity research: high-discipline, high-cost, low-volume. Typical depth tier: strategic accounts only.
Profile 2: RevOps plus Clay plus LLM stack. Internal RevOps function using enrichment platforms (Clay, Apollo) plus LLM-assisted synthesis. Medium-discipline, medium-cost, medium-volume. Typical depth tier: mid-market with some strategic.
Profile 3: Agency workflow. External agency producing per-prospect research at scope-defined depth. Discipline varies by agency; cost is engagement-specific. Typical depth tier: depends on engagement scope.
Profile 4: Software-assisted hybrid (SDR Flow). Productized methodology plus tooling plus ongoing analyst oversight. Designed to compress per-prospect cost while maintaining the methodology bar. Typical depth tier: strategic and mid-market.
Profile 5: Pure software (Aomni, Common Room class). AI-driven account research with limited or no analyst oversight. Lowest cost per artifact. Methodology depends on what the software enforces. Typical depth tier: high-volume with selective deeper depth.
Ratings: ✓ = typically operative, ~ = partial or operator-dependent, — = typically absent. Ratings are honest snapshots, not stacked toward any specific implementation.
| Dimension | In-house desk | RevOps + Clay + LLM | Agency workflow | Software hybrid (SDR Flow) | Pure software |
|---|---|---|---|---|---|
| Thesis-driven targeting | ✓ | ~ | ~ | ✓ | ~ |
| Triangulated evidence | ✓ | ~ | ~ | ✓ | ~ |
| Per-claim provenance | ~ | — | ~ | ✓ | — |
| Calibrated confidence | ~ | — | ~ | ✓ | — |
| Comparable benchmarking | ~ | ~ | ~ | ✓ | ~ |
| Counter-evidence acknowledged | ~ | — | ~ | ~ | — |
SDR Flow's typical output ships approximately 87 fields across 16 categories with 8 standard visualizations. The implementation enforces triangulated evidence, per-claim provenance, calibrated confidence, and comparable benchmarking as platform features. It is partial on counter-evidence acknowledgment: disqualifying evidence is surfaced in some artifacts but not enforced across all. Its strongest dimensions are provenance, calibration, and comparable benchmarking (3-Level ICP scoring against a defined Foundation profile). Its weakest dimension is counter-evidence acknowledgment.
The 87 fields, 16 categories, and 8 visualizations are SDR Flow's design choices, not the methodology standard. Other valid implementations of the same methodology may use different counts.