The AI Gold Rush: Why Building Shovels Beats Mining Gold
A Case Study: What You Need to Build an AI Customer Support Product (and Where the Opportunities Lie)
The Gold Rush Analogy
In the 1850s, thousands of hopeful miners flocked to California with dreams of striking gold. Most never did. But those who sold the shovels, pans, and jeans to the miners (Levi Strauss, Samuel Brannan, and the merchants) built generational wealth.
Fast forward to today, and we’re in another kind of gold rush.
The gold is AI applications: everything from AI tutors, AI doctors, and AI copilots to AI customer support bots. Everyone wants to strike it rich.
But here’s the hard truth:
Most of these “miners” won’t succeed. Not because AI isn’t real (it very much is), but because building a durable, profitable AI product requires more than just plugging into GPT-4 and calling it a day.
The winners (and there will be many) are the ones who understand the shovel sellers of AI: the infrastructure stack that makes it possible to build, scale, and secure AI products.
Today, let’s walk through what it actually takes to build an AI product end to end, using customer support automation as our running example.
We’ll map this onto the Gen AI App Infrastructure Stack (from Sapphire Ventures, Sep 2024), explain each layer, show who the big players are, and highlight where the white spaces (opportunities for new startups) exist.
By the end, you’ll see why the AI “shovels” might just be the smartest bet in this gold rush.
The Use Case: AI for Customer Support
Let’s pick a concrete example.
Imagine you’re building “SupportAI”: an AI-powered platform that helps companies automate their customer support.
Your product vision:
A chatbot that can resolve 80% of Tier-1 support tickets automatically.
Seamless escalation to human agents when confidence is low.
Continuous learning from new tickets and agent resolutions.
Multi-modal support: text, voice, maybe even screen-sharing.
Sounds great, right? But to actually deliver this, you’ll need an entire stack of infrastructure: from data pipelines to model deployment to compliance.
Let’s break it down.
1. Data Management & Processing
AI is useless without high-quality data. For customer support, this means:
Historical chat transcripts
Email support logs
Call center transcripts (voice → text)
CRM data (Zendesk, Salesforce, Freshdesk, Intercom, etc.)
Why this stack is needed
You’ll need to collect, clean, label, and structure this data before it’s usable for training or fine-tuning models.
Key layers
Data Pipeline – Moving raw support tickets into structured storage.
Players: Databricks, Astronomer, Fivetran, Airbyte, Prophecy
White space: Most pipelines aren’t tailored for unstructured conversational data (long chats, messy support logs). Opportunity to build customer-support-specific ETL tools.
Data Labeling, Classification, and Curation – Tagging tickets (billing, tech issue, refund request).
Players: Scale AI, Snorkel, Labelbox, Surge AI
White space: Labeling is still expensive and slow. Opportunity to build synthetic labeling platforms or self-labeling models fine-tuned for support domains.
Data Storage & Retrieval – Storing past tickets and enabling fast semantic search (RAG).
Players: Pinecone, Weaviate, Chroma, PostgresML, Vespa
White space: Support tickets often need hierarchical retrieval (conversation → issue → resolution). Vector DBs don’t yet do this well.
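To make the retrieval step concrete, here is a minimal sketch of semantic search over past tickets. It stands in a toy bag-of-words similarity for a real embedding model and vector DB (Pinecone, Weaviate, etc.), and the ticket data and `embed`/`retrieve` helpers are hypothetical:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would use a
    # sentence-embedding model and store vectors in a vector DB.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical historical tickets with their resolutions.
tickets = [
    {"issue": "I was charged twice for my subscription", "resolution": "Refund issued"},
    {"issue": "App crashes when I upload a file", "resolution": "Patched in v2.1"},
]

def retrieve(query: str) -> dict:
    # Return the past ticket most similar to the incoming query.
    return max(tickets, key=lambda t: cosine(embed(query), embed(t["issue"])))

print(retrieve("why was I billed two times")["resolution"])
```

The hierarchical gap noted above shows up immediately: this flat search matches issues, but a production system would need to retrieve at the conversation, issue, and resolution levels separately.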
Synthetic Data Generation – Augmenting your limited dataset with synthetic tickets.
Players: Mostly AI, Gretel, Synthetaic
White space: Synthetic data often lacks realistic edge cases (angry customers, multi-turn escalations). Big gap here.
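A simple way to see both the value and the limits of synthetic tickets is template-based augmentation. The sketch below uses hypothetical templates and slot fillers; note how hard it is to template realistic edge cases like multi-turn escalations, which is exactly the gap described above:

```python
import random

# Hypothetical templates and slot fillers for synthetic support tickets.
TEMPLATES = [
    "I was {charge_issue} and I want a {remedy}.",
    "Your app {bug} every time I {action}. This is unacceptable!",
]
SLOTS = {
    "charge_issue": ["double charged", "billed after cancelling"],
    "remedy": ["refund", "credit"],
    "bug": ["crashes", "freezes"],
    "action": ["log in", "upload a file"],
}

def synth_ticket(rng: random.Random) -> str:
    # Fill a random template with random slot values.
    template = rng.choice(TEMPLATES)
    slots = {k: rng.choice(v) for k, v in SLOTS.items()}
    return template.format(**slots)

rng = random.Random(0)  # seeded for reproducibility
batch = [synth_ticket(rng) for _ in range(3)]
for t in batch:
    print(t)
```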
2. Model Training & Deployment
Now you’ve got clean, labeled data. The next step is building or fine-tuning models.
Why this stack is needed
Your customer support product won’t just rely on GPT-4. You’ll likely:
Fine-tune smaller models on your own support dataset (for speed/cost).
Run experimentation with multiple models.
Continuously evaluate and retrain as new support tickets come in.
Key layers
Model Discovery – Where do you even find models?
Players: Hugging Face, Replicate
White space: Hugging Face is great for general models, but what about vertical-specific model hubs (support, legal, healthcare)?
Model Testing & Evaluation – How do you know your chatbot won’t hallucinate refund policies?
Players: Deepchecks, Kolena, Prolific AI, Giskard, RagaAI
White space: Few evaluation tools handle multi-turn conversation evals (was the issue resolved?). Big opportunity.
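As a sketch of what a multi-turn eval even looks like, here is a keyword heuristic over a hypothetical transcript format. Real eval tools would typically use an LLM-as-judge instead; the marker lists and `eval_conversation` helper are assumptions for illustration:

```python
# Did the conversation end resolved, escalated, or unresolved?
RESOLVED_MARKERS = ("glad that helped", "issue is resolved", "you're welcome")
ESCALATION_MARKERS = ("transferring you", "human agent")

def eval_conversation(turns: list) -> str:
    # turns: [{"role": "bot" | "user", "text": ...}, ...]
    # Inspect the bot's final message for outcome markers.
    last_bot = next(
        (t["text"].lower() for t in reversed(turns) if t["role"] == "bot"), ""
    )
    if any(m in last_bot for m in ESCALATION_MARKERS):
        return "escalated"
    if any(m in last_bot for m in RESOLVED_MARKERS):
        return "resolved"
    return "unresolved"

convo = [
    {"role": "user", "text": "I can't reset my password"},
    {"role": "bot", "text": "Try the link I just sent."},
    {"role": "user", "text": "That worked, thanks!"},
    {"role": "bot", "text": "Glad that helped!"},
]
print(eval_conversation(convo))
```

Even this toy version shows why the space is hard: "was the issue resolved?" is a judgment over the whole conversation, not a per-response score.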
Model Experimentation – Trying out 10 different fine-tunes.
Players: Weights & Biases, Comet, ClearML, Galileo
White space: Experiment tracking for non-technical PMs and customer support leaders (drag-and-drop UI).
Model Serving & Inference – Deploying models in production.
Players: OctoAI, Modal, Baseten, Runpod, Anyscale
White space: Optimized serving for low-latency support chat (vs. batch jobs).
3. App Development & Orchestration
This is where your AI product comes to life.
Why this stack is needed
Models alone don’t make a product. You need to:
Chain models together (retrieval → classification → response).
Handle prompt engineering.
Build an app interface for customers.
Monitor and route traffic.
Key layers
Prompt Engineering & Caching – Crafting and storing prompts for common issues.
Players: Humanloop, PromptLayer, Redis, Zilliz
White space: Prompt tools are too generic. Big space for domain-specific prompt libraries (support, HR, finance).
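Caching is the easiest of these to sketch. Below is a minimal exact-match prompt/response cache, assuming normalization on whitespace and case; production systems (Redis-backed or semantic caches) match on embeddings instead, and the `PromptCache` class is hypothetical:

```python
import hashlib

class PromptCache:
    # Minimal exact-match cache: normalize the prompt, hash it, look it up.
    def __init__(self):
        self._store = {}

    def _key(self, prompt: str) -> str:
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode()).hexdigest()

    def get(self, prompt: str):
        # Returns the cached response, or None on a miss.
        return self._store.get(self._key(prompt))

    def put(self, prompt: str, response: str):
        self._store[self._key(prompt)] = response

cache = PromptCache()
cache.put("How do I get a refund?", "Visit billing > refunds.")
print(cache.get("how do i get a  refund?"))  # hit despite case/spacing
```

For common Tier-1 issues ("how do I get a refund?"), even this crude cache skips a model call entirely.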
Orchestration & Routing – Deciding when to call GPT-4 vs. a fine-tuned small model.
Players: LangChain, LlamaIndex, Vellum, OneAI, n8n
White space: Orchestration for multi-modal (text + voice) support is underdeveloped.
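The "when to call GPT-4 vs. a fine-tuned small model" decision can be sketched as confidence-based routing, which also covers the human-escalation requirement from the product vision. The thresholds (0.9 / 0.5) and the `route` function are illustrative assumptions, not tuned values:

```python
def route(query: str, confidence: float) -> str:
    # confidence: score from an upstream intent classifier (hypothetical).
    if confidence >= 0.9:
        return "fine_tuned_small_model"  # cheap, fast, handles Tier-1
    if confidence >= 0.5:
        return "frontier_llm"            # stronger model for ambiguous cases
    return "human_agent"                 # escalate when the system is unsure

for q, c in [("reset password", 0.97), ("weird edge case", 0.6), ("??", 0.2)]:
    print(q, "->", route(q, c))
```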
App & Agent Platforms – Building your chatbot frontend.
Players: LangSmith, Fixie, Relevance AI, Gradio, Streamlit
White space: Most are developer-first. Where are the “Shopify for AI apps” platforms that non-technical teams can use?
Product Analytics – Measuring CSAT, deflection rate, resolution time.
Players: Pendo, Amplitude, Statsig, Context.ai
White space: Analytics rarely measure AI-specific metrics (hallucination rate, fallback rate, escalation rate). Huge gap.
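The AI-specific metrics named above are straightforward to compute once ticket outcomes are logged. A minimal sketch, assuming a hypothetical log format where each record says how the ticket ended:

```python
def support_metrics(tickets: list) -> dict:
    # Each ticket: {"outcome": "bot_resolved" | "escalated", "hallucination": bool}
    total = len(tickets)
    deflected = sum(t["outcome"] == "bot_resolved" for t in tickets)
    escalated = sum(t["outcome"] == "escalated" for t in tickets)
    hallucinated = sum(t.get("hallucination", False) for t in tickets)
    return {
        "deflection_rate": deflected / total,
        "escalation_rate": escalated / total,
        "hallucination_rate": hallucinated / total,
    }

log = [
    {"outcome": "bot_resolved"},
    {"outcome": "bot_resolved", "hallucination": True},
    {"outcome": "escalated"},
    {"outcome": "bot_resolved"},
]
m = support_metrics(log)
print(m)
```

The hard part is not the arithmetic; it is producing the `hallucination` label reliably in the first place, which is where the white space lies.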
4. Validation & Security
Customer support means handling sensitive data: emails, billing info, credit cards. One breach, and you’re dead.
Why this stack is needed
Ensure your model doesn’t leak PII.
Stay compliant with GDPR, HIPAA, SOC2.
Moderate unsafe responses (no offensive answers).
Key layers
Model Security – Protecting against prompt injection, jailbreaks.
Players: Guardrails AI, Protect AI, DeepKeep, TrojAI
White space: Guardrails are still rule-based. Need adaptive, learning-based guardrails.
Governance & Compliance – Proving to auditors that your AI is safe.
Players: Credo AI, Cranium, Holistic AI, Monitaur
White space: Compliance dashboards are too high-level. Need real-time compliance checks inside model pipelines.
End-User Security – Protecting sensitive customer data.
Players: Nightfall, Netskope, Skyflow
White space: Many focus on text. Voice (call transcripts) security is a gap.
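To ground the "rule-based guardrails" point above: here is what rule-based PII redaction looks like in practice. The patterns are illustrative, not exhaustive; real PII detection (Nightfall, Skyflow, etc.) goes well beyond regexes:

```python
import re

# Illustrative PII patterns; production detectors are far more thorough.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "CARD": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def redact(text: str) -> str:
    # Replace each PII match with a placeholder like [EMAIL].
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact("Contact me at jane@example.com, card 4111 1111 1111 1111"))
```

This is exactly why static rules fall short: the moment a customer writes "four one one one...", or the PII arrives in a voice transcript, the regex misses it, which motivates the adaptive guardrails gap above.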
5. Foundation Models
Now comes the elephant in the room: do you build your own model or rely on OpenAI/Anthropic?
Options
Closed Source Models – OpenAI, Anthropic, Google, Cohere, Mistral, AWS.
Pros: Best performance.
Cons: Expensive, black box, vendor lock-in.
Open Source Models – Llama 3, Falcon, MPT (MosaicML), and the broader Hugging Face ecosystem.
Pros: Customizable, cheaper at scale.
Cons: Requires infra, MLOps maturity.
Hybrid Approach – Start with closed source, move to open source once scale demands.
White space
Foundation models are commoditizing fast. The opportunity isn’t in building another LLM (unless you’re Google-scale). It’s in building domain-specialized models (like customer support, legal, or healthcare).
6. Infrastructure
Finally, you need raw compute. GPUs are the new oil.
GPU Cloud Hosting – AWS, Azure, GCP, CoreWeave, Lambda Labs, Crusoe.
PaaS & Serverless GPUs – Modal, RunPod, Replicate, Anyscale.
Specialized Accelerators – Cerebras, Graphcore, SambaNova.
White space
GPU costs are crushing startups. There’s a massive opportunity for AI infra efficiency startups:
Better model compression.
Smarter scheduling.
Renewable-powered GPU farms.
The White Spaces: Where to Build the Next “Shovels”
Looking at the whole stack, here are the biggest gaps:
Vertical-Specific Data Infrastructure → ETL, labeling, and storage tuned for domains (support, legal, healthcare).
Conversation-Specific Evals → Tools to measure AI performance in multi-turn settings.
Non-Technical Experimentation Platforms → Empower PMs, not just ML engineers.
AI-Native Analytics → Metrics beyond DAUs/MAUs (hallucinations, fallbacks, escalations).
Adaptive Guardrails → Security that learns and adapts with new threats.
Multi-Modal Orchestration → Support that blends text, voice, images seamlessly.
GPU Cost Optimization → Infra startups focused on efficiency.
Final Takeaway
If you’re rushing to build the next AI app, remember:
Most won’t scale, because they don’t understand the shovel stack.
The real winners will either:
Build apps on top of this infra with deep domain focus (SupportAI, LegalAI, MedAI).
Or, even smarter: become the shovel sellers building the infra tools every AI app needs.
We’re still in the early innings of the AI gold rush. The ground is fertile. The picks and shovels are still being forged.
The question is: are you mining for gold… or selling the shovels?
If you want to become a top 1% AI PM, I recommend checking out my AI for Product Management course starting 1st Nov.


