# Training Deck Company Research Prompt

For the more detailed profile-page research workflow and machine-readable schema, use `company_profile_research_blueprint.md` and `company_profile_research_schema.csv` alongside this prompt.

Use this prompt for every training deck/company. Replace every bracketed field, including [SLUG] in the output file names, before running it.

```text
You are building a company-specific research record for a pitch deck feedback and comparable-slide product.

Inputs:
- Company name: [COMPANY_NAME]
- Deck file path: [DECK_FILE_PATH]
- Known deck stage, if any: [PRE-SEED / SEED / SERIES A / UNKNOWN]
- Known sector/category, if any: [CATEGORY]
- Optional website: [WEBSITE]
- Optional Crunchbase URL: [CRUNCHBASE_URL]
- Output folder: [OUTPUT_FOLDER]

Primary objective:
Create a source-backed company intelligence record that connects the deck's slide claims to real company context, pre-seed/seed financing details, investors, later outcomes, and slide-level patterns that can be used for pitch deck feedback and comparable-slide recommendations.

Important rules:
- Do not invent data. If a field is unavailable, write "unknown" and explain what source would be needed.
- Preserve source labels exactly, especially funding stage labels from Crunchbase/PitchBook/press. Also add our internal taxonomy label when the source label is misleading.
- Separate deck-time facts from post-deck outcomes. Do not let later success rewrite what was actually shown in the deck.
- Treat deck metrics as company-claimed unless independently verified.
- For Crunchbase, use the user's logged-in browser session if available, but never ask for or handle passwords. If blocked, record the gap and use public sources.
- Resolve company-name collisions by verifying founder, website, logo, headquarters, sector, and deck contents.
- Use exact dates whenever available.
- Every major claim must have a source URL or be marked as coming from the supplied deck.

Research workflow:
1. Deck extraction
   - Render the deck to slide images.
   - Extract text from each slide.
   - Create a contact sheet image.
   - Count slides and identify deck stage, round ask, date clues, company positioning, and major metrics.

2. Company identity
   - Verify official company name, legal name if available, website, headquarters, founding year, founder(s), category, business model, customer type, and current operating status.
   - Record discrepancies across sources, for example formation date vs public launch date.

3. Funding history with emphasis on pre-seed and seed
   - Capture all funding rounds, but research pre-seed, seed, and the first subsequent institutional round in depth.
   - For each round capture:
     - Source round label
     - Internal stage label
     - Announced date
     - Amount
     - Valuation, if available
     - Lead investor
     - Participating investors
     - Partner names
     - Investor count
     - Source URL
     - Confidence level
   - If a database labels multiple early rounds as "Seed," infer the internal stage label from timing, amount, launch status, investor type, and whether a later institutional seed followed, but preserve the original source label.
   - Compare deck ask versus actual round closed.

4. Investor intelligence
   - Build an investor table for all pre-seed and seed investors.
   - Classify each investor as fund, angel, operator angel, strategic, accelerator, venture debt, syndicate, family office, or unknown.
   - Identify operator/founder angels and why they matter.
   - Mark repeat investors who participated in later rounds.
   - Note relevant consumer/category expertise and notable comparable portfolio companies where available.

5. Traction and business model
   - Extract deck-time traction: revenue, GMV, units sold, users, waitlist, downloads, subscribers, retention, reviews, NPS, CAC, ROAS, gross margin, contribution margin, payback, partnerships, pilots, retail doors, marketplace performance, press, community, and pipeline.
   - Verify post-deck traction separately from public sources.
   - For physical consumer products, prioritize supply chain, payment terms, inventory, retail/wholesale margin, and working capital.
   - For marketplaces, prioritize supply, demand, liquidity, repeat use, take rate, and geographic density.
   - For consumer apps/social, prioritize retention, DAU/MAU, virality, community loops, and creator/user acquisition.

6. Market and category context
   - Identify the category narrative at deck time: why now, incumbents, consumer behavior shift, platform shift, regulation, cultural shift, distribution shift, or cost shift.
   - Capture competitors named in the deck and competitors from current sources.
   - Note whether the company framed a wedge, a platform, a category creation, or a better-product substitution.

7. Later outcome signal
   - Determine whether the company raised after the deck, how much, when, from whom, and whether seed investors followed on.
   - Record acquisitions, IPO, shutdown, major retail expansion, major product expansion, lawsuits/regulatory issues, and other outcome signals.
   - Build a dated product, channel, financing, legal, and brand milestone timeline. Mark company-authored scale claims separately from independent reporting.
   - Mark whether the company is a strong, medium, or weak comparable for "went on to raise successfully."

8. Competitive and product diligence
   - Build a current competitive-context table with direct peers, indirect alternatives, legacy incumbents, and litigation/regulatory competitors where relevant.
   - Separate deck-time competitor framing from current competitive reality.
   - Capture third-party product reviews, category critiques, customer experience issues, and performance/durability tradeoffs.
   - For consumer brands, distinguish brand heat, product quality, retention, and repeat-purchase evidence.

9. Slide-level intelligence
   - For every slide, produce a row with:
     - slide_number
     - slide_title
     - detected_slide_type
     - primary_claim_or_function
     - key_data_points
     - investor_question_answered
     - investor_objection_reduced
     - why_compelling_for_feedback_product
     - weakness_or_followup_question
     - comparable_tags
   - Identify the 5-12 most compelling slides to show future founders and explain exactly what pattern they should learn from.
   - Do not just label slides "traction" or "market." Capture the useful pattern, such as "named retail pipeline plus revenue math" or "pre-launch owned audience as acquisition moat."

10. Risk and claim quality
   - Capture risks visible in the deck and later public record:
     - unsubstantiated health/safety claims
     - regulatory risk
     - litigation
     - gross margin/inventory risk
     - paid CAC dependence
     - retention gaps
     - competitive response
     - retailer concentration
     - marketplace liquidity risk
     - scientific or product-performance disputes
   - Mark claim strength as proven, partially supported, deck-claimed, contradicted, or unknown.

Required outputs:
1. [SLUG]_research.md
   - Company snapshot
   - Founder/team signals
   - Funding history
   - Pre-seed and seed investor details
   - Deck metadata
   - Deck narrative
   - Traction and economics
   - Market/category context
   - Later outcome signal
   - Most compelling slides
   - Risk/claim-quality notes
   - Comparable matching tags
   - Open data gaps
   - Source list

2. [SLUG]_slides.csv
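   Required columns (detected_slide_type from the slide-level step is stored as slide_type, and comparable_tags as tags):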
   slide_number, slide_title, slide_type, primary_claim_or_function, key_data_points, investor_question_answered, investor_objection_reduced, why_compelling_for_feedback_product, weakness_or_followup_question, tags

3. [SLUG]_investors.csv
   Required columns:
   company, round_internal_stage, round_source_label, round_date, round_amount, investor, investor_type, lead_status, partner, source, notes

4. [SLUG]_sources.csv
   Required columns:
   source_name, url, source_type, accessed_date, facts_used, reliability_notes

5. [SLUG]_timeline.csv
   Required columns:
   date, event_type, event, details, source, relevance_to_training_dataset

6. [SLUG]_competitive_context.csv
   Required columns:
   category, company_or_group, positioning_vs_[SLUG], source, training_dataset_note

7. [SLUG]_contact_sheet.jpg
   Contact sheet of rendered slides for quick visual review.

8. Brand profile page and graphics
   - Create `/output/profiles/[SLUG]/index.html`
   - Create `/output/profiles/[SLUG]/reference.html`
   - Create `/output/profiles/assets/[SLUG]-funding.svg`
   - The profile page should include:
     - company snapshot
     - funding graph with years on the x-axis and dollar amount on the y-axis
     - funding timeline
     - product/channel/outcome timeline
     - investor mix
     - outcome signals
     - competitive context
     - most useful comparable slide patterns
     - contact sheet
     - links to research memo, slides CSV, investors CSV, timeline CSV, competitive CSV, and sources CSV
   - The detailed reference page should include:
     - full company identity table
     - full funding-round table
     - complete pre-seed/seed investor table
     - deck metadata and deck-time traction claims
     - narrative synthesis
     - slide intelligence table
     - product/channel/outcome timeline
     - competitive and product-review context
     - later outcomes
     - risks and claim-quality notes
     - open data gaps
     - source audit with links
   - If a funding round amount is undisclosed, show it as a marker/note rather than converting it to $0.

Quality bar:
- The final research record should let the product answer:
  - What kind of company is this?
  - What round was the deck likely used for?
  - What did the company claim at deck time?
  - What actually happened afterward?
  - Which slides are worth recommending to other founders, and why?
  - Which investors participated in pre-seed/seed, and what signal did they add?
  - What data is missing or uncertain?
```
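
The required column lists in the prompt lend themselves to an automated header check before a research record is accepted. The following is a minimal Python sketch, not part of the prompt or of `company_profile_research_schema.csv`: the names `REQUIRED_COLUMNS` and `missing_columns` are hypothetical, and the competitive-context file is left out because one of its column names embeds the slug.

```python
import csv
import io

# Column lists copied from the "Required outputs" section of the prompt.
# The dictionary keys are hypothetical short names for each output file.
REQUIRED_COLUMNS = {
    "slides": [
        "slide_number", "slide_title", "slide_type",
        "primary_claim_or_function", "key_data_points",
        "investor_question_answered", "investor_objection_reduced",
        "why_compelling_for_feedback_product",
        "weakness_or_followup_question", "tags",
    ],
    "investors": [
        "company", "round_internal_stage", "round_source_label",
        "round_date", "round_amount", "investor", "investor_type",
        "lead_status", "partner", "source", "notes",
    ],
    "sources": [
        "source_name", "url", "source_type", "accessed_date",
        "facts_used", "reliability_notes",
    ],
    "timeline": [
        "date", "event_type", "event", "details", "source",
        "relevance_to_training_dataset",
    ],
}


def missing_columns(kind: str, csv_text: str) -> list[str]:
    """Return the required columns that are absent from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [col for col in REQUIRED_COLUMNS[kind] if col not in present]
```

A call such as `missing_columns("sources", text_of_sources_csv)` returns an empty list when the header is complete, so a non-empty result can be logged under "Open data gaps" or used to reject the file.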

## Caraway Calibration Example

Use Caraway as the calibration case when applying this prompt:

- Preserve Crunchbase's "Seed" label for both March 2019 and May 2020 rounds, but internally classify March 2019 as pre-seed-equivalent and May 2020 as seed.
- Store the deck's $2M seed ask separately from the announced $5.3M seed close.
- Treat deck claims like $410K revenue, 4:1 ROAS, and 160K email subscribers as deck-claimed unless independently verified.
- Mark slides 10, 18, and 29 as "omnichannel distribution proof," not generic GTM.
- Mark slide 20 as "working-capital/supply-chain advantage," a high-value physical product pattern.
- Record post-deck validation: the $35M Series A/investment in 2022, broad retail rollout, and product expansion.
- Record post-deck scale claims separately from independent reporting: 2.5M+ customers, 200K+ five-star reviews, and 400+ patents are company-claimed in 2026 PR.
- Record post-deck risk: 2025 NAD advertising decision and 2026 PFAS advertising/legal dispute.
- Record product-performance diligence: third-party reviews praise aesthetics but flag ceramic nonstick durability, care requirements, staining, weight, and hot lid handles.
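
As a rough illustration of the first three calibration points, the sketch below keeps the source label and the internal stage as separate fields, defaults deck metrics to "deck-claimed", and compares the deck ask with the announced close. It is a minimal Python sketch under stated assumptions: the class and function names are hypothetical, the dates are stored as year-month strings because only the month is given here, and round amounts are left as `None` because these notes do not attach an amount to a dated round.

```python
from dataclasses import dataclass
from typing import Optional

# Claim-strength vocabulary taken from the "Risk and claim quality" step.
CLAIM_STRENGTHS = {
    "proven", "partially supported", "deck-claimed", "contradicted", "unknown",
}


@dataclass(frozen=True)
class FundingRound:
    announced: str                     # year-month; exact day not given here
    source_label: str                  # label exactly as the source shows it
    internal_stage: str                # internal taxonomy label
    amount_usd: Optional[int] = None   # None means undisclosed, never 0


@dataclass(frozen=True)
class DeckClaim:
    text: str
    strength: str = "deck-claimed"     # default until independently verified

    def __post_init__(self) -> None:
        if self.strength not in CLAIM_STRENGTHS:
            raise ValueError(f"unknown claim strength: {self.strength!r}")


# Both rounds keep the source label "Seed"; only the internal stage differs.
CARAWAY_ROUNDS = [
    FundingRound("2019-03", "Seed", "pre-seed-equivalent"),
    FundingRound("2020-05", "Seed", "seed"),
]

CARAWAY_DECK_CLAIMS = [
    DeckClaim("$410K revenue"),
    DeckClaim("4:1 ROAS"),
    DeckClaim("160K email subscribers"),
]


def close_to_ask_ratio(deck_ask_usd: int, closed_usd: int) -> float:
    """Ratio of the announced close to the deck ask, stored as separate inputs."""
    return round(closed_usd / deck_ask_usd, 2)
```

With the figures above, `close_to_ask_ratio(2_000_000, 5_300_000)` gives 2.65, which keeps the $2M ask and the $5.3M close as distinct facts rather than overwriting one with the other.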
