TradesMind - The Digital Teacher for Skilled Trades
AI-powered digital teacher for skilled trades giving field technicians instant answers to repair questions, live AR support from remote experts, and bite-sized microlearning that fits between jobs. Built for blue-collar teams in construction, HVAC, electrical, and manufacturing, it captures knowledge from senior workers and trains the next generation through safe AR simulations. Every fix gets logged, building a searchable brain that makes your entire workforce smarter over time.
Potential MCP Stack
Total Volume (Monthly)
150
Avg CPC
$2.29
Avg Competition
0.04
| Keyword | Volume | CPC | Comp. |
|---|---|---|---|
| remote ar support | 10 | $0.00 | 0.14 |
| vocational training app | 0 | $0.00 | 0.00 |
| ai for skilled trades | 0 | $0.00 | 0.00 |
| field service ai assistant | 0 | $0.00 | 0.00 |
| bluecollar ai | 140 | $11.46 | 0.07 |
TradesMind is an AI-assisted knowledge and training system for skilled-trades field teams (HVAC, electrical, construction, manufacturing maintenance). It combines:
- Instant repair Q&A: technicians ask questions (text/voice + photo/video) and receive step-by-step answers grounded in company-approved procedures, manuals, and past fixes.
- Remote expert AR support: a technician can start a live session where a remote expert annotates the technician’s camera feed (markers, arrows, callouts) and co-navigates troubleshooting.
- Microlearning + AR simulations: short, job-adjacent lessons and guided “safe mode” simulations for common tasks and safety procedures.
- Continuous knowledge capture: every resolved issue becomes a structured “Fix Log” (symptoms → diagnostics → actions → parts → outcome), indexed for search and used to improve future answers.
Positioning: a field-first digital teacher that reduces time-to-fix, standardizes quality, and preserves tribal knowledge as senior workers retire. Technically, it is a RAG (retrieval augmented generation) system over enterprise content + fix logs, plus a real-time collaboration subsystem for AR sessions, plus a learning management micro-module engine.
Existing alternatives
- Generic knowledge bases (Confluence/SharePoint/Notion): good for documents, weak for field retrieval, no structured fix capture, no grounded AI answers.
- Traditional LMS (Cornerstone, Docebo): course-centric, not “between jobs”, limited contextual troubleshooting.
- Remote assist tools (TeamViewer Frontline, Dynamics 365 Remote Assist): strong live support, but weak knowledge capture and AI-guided troubleshooting.
- ChatGPT-style assistants: fast answers but not grounded in company procedures; risk of hallucinations and unsafe guidance.
Why insufficient
- Field work is contextual (equipment model, symptoms, environment, safety constraints). Existing tools don’t reliably bind answers to the right asset/manual/version.
- Knowledge is tribal and disappears when senior techs leave; most orgs don’t capture fixes in a reusable, searchable structure.
- Training is time-fragmented; long courses don’t fit the cadence of dispatch-based work.
- Safety and compliance require auditable guidance and controlled content sources.
Pain points
- High mean-time-to-repair (MTTR) due to waiting on experts or searching manuals.
- Repeat mistakes and inconsistent repair quality across crews.
- Slow onboarding; junior techs lack confidence and procedural memory.
- Expert bottleneck: senior techs spend time answering the same questions.
Market inefficiencies
- Companies pay for remote assist and LMS separately, but neither closes the loop into a compounding knowledge system.
- Documentation exists (PDF manuals, SOPs) but is not operationalized into step-by-step, context-aware guidance.
Primary persona: Field Technician (junior to mid-level)
- Works on mobile, often with gloves/noisy environments.
- Needs fast answers, minimal typing, offline tolerance.
- Technical maturity: moderate; comfortable with smartphone apps, not with complex enterprise tools.
Secondary persona: Service Manager / Ops Lead
- Owns MTTR, first-time-fix rate, training completion, safety incidents.
- Wants visibility into recurring issues, parts usage, and knowledge gaps.
- Buying behavior: budget holder or strong influencer; expects pilots, ROI proof, and integration with existing systems.
Additional stakeholders
- Senior technician / SME: contributes knowledge, runs remote sessions.
- Safety/compliance: requires audit trails, approved procedures, and role-based access.
- IT: cares about SSO, device management, data residency, and security.
MVP scope (buildable in 8–12 weeks)
1) Technician Q&A (grounded)
- Ask via text + photo upload (voice optional later).
- AI answers with:
- step-by-step troubleshooting
- required tools/parts
- safety warnings
- citations to source documents and/or prior fix logs
- “Was this helpful?” feedback + flag unsafe/incorrect.
2) Knowledge ingestion + search
- Upload PDFs/manuals/SOPs.
- Automatic chunking + embeddings.
- Full-text + semantic search UI.
- Versioning metadata (document name, revision date).
3) Fix Log capture (structured)
- After a job: guided form to capture:
- asset/equipment type + model
- symptoms
- diagnostics performed
- resolution steps
- parts used
- time spent
- outcome (resolved/partial/return visit)
- Auto-summarize free text into structured fields.
4) Microlearning (minimal LMS)
- Create “cards” (3–7 minutes) tied to equipment categories.
- Assign to teams; track completion.
- Generate microlearning suggestions from recurring fix logs (simple rules-based in MVP).
5) Remote expert session (non-AR MVP)
- Start a live video call with screen annotations (basic overlay) OR fallback to WebRTC video + chat.
- Session notes saved to a Fix Log.
6) Admin console
- Manage users/roles.
- Upload/manage documents.
- View analytics: top questions, unresolved queries, MTTR proxy (time-to-answer), fix log volume.
Explicitly out of MVP
- Full AR headset support, 3D simulations, offline-first sync, deep CMMS/ERP integrations.
High-level components
- Mobile/Web Technician App: Q&A, search, fix log entry, remote session join.
- Admin Web App: document ingestion, user management, analytics, microlearning authoring.
- Backend API: auth, org/workspace, content management, fix logs, learning, analytics.
- AI Service:
- ingestion pipeline (OCR optional)
- embeddings + vector search
- RAG answer generation with citations
- summarization/structuring for fix logs
- Realtime Service: WebRTC signaling + session metadata; optional annotation events.
- Storage: relational DB + object storage + vector index.
Deployment approach
- Containerized services deployed to a single cloud (AWS recommended) with managed Postgres.
- Use a managed vector DB (pgvector in Postgres for MVP) to reduce moving parts.
- WebRTC can be via a managed provider (Twilio Live / Daily) for MVP to avoid NAT traversal complexity.
Mermaid diagram
flowchart LR
subgraph Client
T[Technician Mobile/Web]
A[Admin Web]
end
subgraph Backend
API[API Server]
AUTH[Auth/SSO]
AI[AI Service: RAG + Ingestion]
RT[Realtime Session Service]
end
subgraph Data
PG[(Postgres + pgvector)]
OBJ[(Object Storage: S3)]
end
subgraph External
LLM[LLM Provider]
RTC[WebRTC Provider]
end
T --> API
A --> API
API --> AUTH
API --> PG
API --> OBJ
API --> AI
API --> RT
AI --> PG
AI --> OBJ
AI --> LLM
RT --> RTC
T <--> RTC
Notes
- Multi-tenant: orgs isolate data.
- Use pgvector for embeddings.
- Store raw files in S3; store metadata + extracted text in Postgres.
Core tables (SQL-style)
-- Orgs and users
create table orgs (
id uuid primary key,
name text not null,
created_at timestamptz not null default now()
);
create table users (
id uuid primary key,
org_id uuid not null references orgs(id) on delete cascade,
email citext not null,
name text,
role text not null check (role in ('tech','expert','manager','admin')),
created_at timestamptz not null default now(),
unique (org_id, email)
);
-- Documents
create table documents (
id uuid primary key,
org_id uuid not null references orgs(id) on delete cascade,
title text not null,
source_type text not null check (source_type in ('pdf','url','sop')),
s3_key text not null,
revision text,
status text not null default 'processed' check (status in ('uploaded','processing','processed','failed')),
created_by uuid references users(id),
created_at timestamptz not null default now()
);
create table document_chunks (
id uuid primary key,
org_id uuid not null references orgs(id) on delete cascade,
document_id uuid not null references documents(id) on delete cascade,
chunk_index int not null,
content text not null,
embedding vector(1536),
metadata jsonb not null default '{}',
created_at timestamptz not null default now(),
unique(document_id, chunk_index)
);
-- Q&A
create table questions (
id uuid primary key,
org_id uuid not null references orgs(id) on delete cascade,
asked_by uuid not null references users(id),
text text,
asset_type text,
asset_model text,
created_at timestamptz not null default now()
);
create table question_media (
id uuid primary key,
question_id uuid not null references questions(id) on delete cascade,
s3_key text not null,
media_type text not null check (media_type in ('image','video')),
created_at timestamptz not null default now()
);
create table answers (
id uuid primary key,
question_id uuid not null references questions(id) on delete cascade,
org_id uuid not null references orgs(id) on delete cascade,
answer_md text not null,
citations jsonb not null default '[]',
model text,
created_at timestamptz not null default now()
);
create table answer_feedback (
id uuid primary key,
answer_id uuid not null references answers(id) on delete cascade,
user_id uuid not null references users(id),
rating int check (rating between 1 and 5),
flagged_unsafe boolean not null default false,
comment text,
created_at timestamptz not null default now()
);
-- Fix logs
create table fix_logs (
id uuid primary key,
org_id uuid not null references orgs(id) on delete cascade,
created_by uuid not null references users(id),
asset_type text,
asset_model text,
symptoms text,
diagnostics text,
resolution_steps text,
parts_used jsonb not null default '[]',
duration_minutes int,
outcome text check (outcome in ('resolved','partial','return_visit','unknown')),
source_question_id uuid references questions(id),
created_at timestamptz not null default now()
);
-- Microlearning
create table lessons (
id uuid primary key,
org_id uuid not null references orgs(id) on delete cascade,
title text not null,
content_md text not null,
asset_type text,
created_by uuid references users(id),
created_at timestamptz not null default now()
);
create table lesson_assignments (
id uuid primary key,
org_id uuid not null references orgs(id) on delete cascade,
lesson_id uuid not null references lessons(id) on delete cascade,
user_id uuid not null references users(id) on delete cascade,
status text not null default 'assigned' check (status in ('assigned','completed')),
completed_at timestamptz,
unique(lesson_id, user_id)
);
-- Remote sessions
create table remote_sessions (
id uuid primary key,
org_id uuid not null references orgs(id) on delete cascade,
started_by uuid not null references users(id),
expert_id uuid references users(id),
provider text not null,
provider_room_id text not null,
status text not null default 'active' check (status in ('active','ended')),
started_at timestamptz not null default now(),
ended_at timestamptz
);
Index recommendations
- document_chunks: ivfflat/hnsw index on embedding (pgvector) + btree on (org_id, document_id).
- questions: btree on (org_id, created_at desc).
- fix_logs: btree on (org_id, asset_type), (org_id, created_at desc); GIN on parts_used if queried.
- answers: btree on (question_id).
Auth
- JWT-based sessions with refresh tokens.
- Optional SSO (SAML/OIDC) post-MVP.
- All endpoints scoped by org_id derived from token.
Core endpoints (REST)
Documents
- POST /v1/documents (multipart upload)
- Request: file + metadata
- Response: {id, status}
- GET /v1/documents
- GET /v1/documents/:id
Ask a question
- POST /v1/questions
- Body:
{
"text": "Unit is short cycling and not cooling",
"assetType": "HVAC",
"assetModel": "Trane XR14"
}
- Response: {id}
- POST /v1/questions/:id/media (multipart)
- POST /v1/questions/:id/answer
- Response:
{
"answerId": "...",
"answerMd": "1. Verify thermostat...\n2. Check capacitor...",
"citations": [
{"type":"document_chunk","documentId":"...","chunkId":"...","title":"XR14 Manual","snippet":"..."},
{"type":"fix_log","fixLogId":"...","snippet":"Similar symptom resolved by..."}
]
}
Feedback
- POST /v1/answers/:id/feedback
{ "rating": 4, "flaggedUnsafe": false, "comment": "Worked after replacing capacitor" }
Fix logs
- POST /v1/fix-logs
- GET /v1/fix-logs?assetType=HVAC&assetModel=Trane%20XR14
Microlearning
- POST /v1/lessons
- GET /v1/lessons
- POST /v1/lessons/:id/assign
{ "userIds": ["..."] }
- POST /v1/lesson-assignments/:id/complete
Remote sessions (MVP via provider)
- POST /v1/remote-sessions
{ "provider": "daily" }
- POST /v1/remote-sessions/:id/join → returns meeting URL/token
- POST /v1/remote-sessions/:id/end
Notes on AI answer generation
- POST /v1/questions/:id/answer triggers:
- retrieve top-K chunks by embedding similarity filtered by org
- retrieve top-K similar fix logs (optional: embed fix log summaries)
- generate answer with citations and safety constraints
Recommended stack (MVP)
Frontend
- Next.js (React) + TypeScript for admin + technician web app.
- Why: fast iteration, SSR optional, strong ecosystem.
- Optional: React Native for mobile later; MVP can be mobile-responsive web.
Backend
- Node.js (NestJS) or Fastify + TypeScript
- Why: structured modules, good DX, easy REST.
- Background jobs: BullMQ + Redis for ingestion and AI tasks.
Database
- Postgres (RDS) + pgvector
- Why: single datastore for relational + embeddings; reduces ops.
Object storage
- S3 for PDFs/images/videos.
AI
- LLM: OpenAI / Azure OpenAI (enterprise-friendly) or Anthropic.
- Embeddings: provider embeddings stored in pgvector.
- OCR (optional): AWS Textract for scanned PDFs.
Realtime video
- Daily or Twilio for WebRTC rooms.
- Why: avoids building TURN/STUN infra and signaling complexity.
Hosting/infra
- AWS ECS Fargate (or Kubernetes if team already uses it).
- CloudFront for static assets.
- Terraform for infra-as-code.
Alternatives
- Backend: Python (FastAPI) if team prefers AI-heavy Python tooling.
- Vector DB: Pinecone/Weaviate if scaling beyond pgvector.
Authentication
- Email/password + magic link for MVP; enforce MFA for admins.
- Passwords hashed with Argon2id.
- Short-lived access tokens + rotating refresh tokens.
Authorization
- RBAC by role (tech, expert, manager, admin).
- Org-level isolation enforced in every query (row-level checks).
- Admin-only: document upload, user management, analytics.
Data protection
- TLS everywhere.
- Encrypt S3 objects (SSE-S3 or SSE-KMS).
- Encrypt sensitive fields at rest if needed (KMS envelope) for regulated customers.
- PII minimization: avoid storing unnecessary personal data.
AI safety controls
- RAG-only mode for procedural answers: require citations; if low confidence/no sources, respond with “insufficient info” and suggest remote expert.
- Safety policy layer: block instructions that bypass lockout/tagout, electrical safety, refrigerant handling, etc. Provide warnings and require confirmation.
- Audit log for AI outputs and user feedback.
Rate limiting and abuse
- Per-user and per-org rate limits on Q&A endpoints.
- Upload limits and virus scanning (e.g., ClamAV in pipeline) for documents.
Threat considerations
- Prompt injection from documents: sanitize and isolate retrieved text; use system prompts that treat retrieved content as untrusted.
- Data exfiltration: strict org scoping; do not allow cross-org retrieval.
- Remote session privacy: expiring meeting tokens; restrict join to org members.
Pricing model (B2B SaaS)
- Per-seat monthly pricing with role tiers:
- Technician seat (Q&A + search + fix logs)
- Expert seat (remote support)
- Manager seat (analytics + assignments)
- Add-ons:
- Document ingestion/OCR overage
- Video minutes (pass-through + margin)
- Advanced compliance/audit pack
Revenue streams
- Subscription ARR.
- Usage-based: AI tokens, OCR pages, video minutes.
- Professional services: onboarding, content migration, SOP structuring.
Expansion pricing
- Enterprise tier: SSO, SCIM provisioning, data residency, custom retention, dedicated VPC.
Unit economics assumptions (rough)
- Gross margin target: 75–90%.
- Main COGS: LLM tokens + embeddings + video provider.
- Control levers: caching answers, summarizing fix logs, limiting context window, using cheaper models for extraction.
Distribution strategy
- Start with 20–200 person service orgs in HVAC/electrical where knowledge loss and MTTR are acute.
- Sell to ops/service managers; champion is often a senior tech.
Channels
- Partnerships with:
- trade associations
- equipment distributors
- training providers
- Direct outbound to service companies with high hiring velocity.
- Content-led: “top 50 troubleshooting playbooks” gated downloads.
Launch plan
- 2–3 design partners with clear ROI metrics.
- Pilot: 30 days, limited to one region/team.
- Expand to full org after proving MTTR reduction and adoption.
Growth loops
- Fix logs improve answer quality → faster resolutions → more usage.
- Microlearning suggestions from recurring issues → fewer repeat calls → manager buy-in.
- Expert time saved becomes internal advocacy.
Validate before full build
- Run a concierge pilot:
- Collect manuals/SOPs from a partner.
- Build a lightweight RAG prototype (even script-based) to answer top 50 questions.
- Measure answer usefulness and safety acceptance.
MVP testing approach
- Instrument:
- time-to-first-answer
- % questions with citations
- feedback rating distribution
- escalation rate to expert
- fix log completion rate
- A/B test:
- AI answer vs. search-only
- microlearning prompts vs. none
Metrics to track
- Activation: first question asked within 24h of onboarding.
- Retention: weekly active technicians.
- Outcome: self-reported time saved per job; first-time-fix proxy via outcome field.
- Content health: % of answers with high-confidence citations; top unanswered topics.
Phase 1 (MVP) — 8–12 weeks
- Multi-tenant orgs, RBAC.
- Document upload + chunking + embeddings.
- Technician Q&A with citations.
- Fix logs + basic analytics.
- Microlearning cards + assignments.
- Remote session via provider (no true AR).
Phase 2 — 3–5 months
- Voice input + speech-to-text.
- Better asset context: equipment registry + barcode/QR scan.
- Expert annotation layer (draw/arrows) over video.
- Fix log similarity search + auto-suggest resolutions.
- SSO (OIDC) + SCIM.
Phase 3 — 6–12 months
- True AR workflows (mobile ARCore/ARKit; headset optional).
- AR simulations with step validation (computer vision checkpoints).
- Offline-first mode with queued fix logs and cached procedures.
- Deep integrations: CMMS (ServiceTitan, UpKeep), ERP parts catalogs.
- Advanced analytics: recurring failure modes, parts forecasting.
Technical risks
- Hallucinations causing unsafe guidance.
- Mitigation: citation-required answers, refusal when low evidence, safety policy layer.
- Poor retrieval due to messy PDFs/scans.
- Mitigation: OCR pipeline, manual curation tools, chunking evaluation.
- Realtime reliability in low-bandwidth environments.
- Mitigation: adaptive bitrate via provider, fallback to audio/chat.
Market risks
- Adoption friction: techs may resist logging fixes.
- Mitigation: make fix log fast; auto-fill from Q&A/session; manager incentives.
- Budget competition with existing tools.
- Mitigation: integrate/export to existing KB/LMS; prove ROI quickly.
Legal risks
- Liability for incorrect repair advice.
- Mitigation: disclaimers, safety gating, audit logs, “consult expert” escalation.
- Data handling requirements for enterprise customers.
- Mitigation: encryption, retention controls, DPA, SOC2 roadmap.
Competitive risks
- Large suites (Microsoft, ServiceTitan ecosystem) adding AI copilots.
- Mitigation: specialize in fix-log compounding + trade-specific safety workflows.
Future features
- Asset graph: link fixes to specific customer sites/assets; trend analysis.
- Parts intelligence: integrate parts catalogs; recommend compatible replacements.
- Computer vision assist: detect components from camera image; highlight test points.
- Procedure builder: convert fix logs into standardized SOPs with approvals.
- Skill matrix: map lessons + fix outcomes to technician competency.
- Multi-language: bilingual crews; translate lessons and answers with terminology control.
Scalability direction
- Split AI service into separate autoscaled workers.
- Move embeddings to dedicated vector DB when chunk count grows.
- Event-driven pipeline (SQS/SNS) for ingestion and analytics.
- Data warehouse (BigQuery/Redshift) for long-term reporting.
You are a senior full-stack engineer building the MVP of TradesMind.
Goal
Build a multi-tenant B2B web app that lets skilled-trades technicians ask repair questions and receive grounded AI answers with citations from uploaded manuals/SOPs and prior fix logs. Include fix log capture, microlearning cards, and remote sessions via a managed WebRTC provider.
Tech Stack (must use)
- Frontend: Next.js (App Router) + TypeScript + Tailwind
- Backend: NestJS (TypeScript) REST API
- DB: Postgres + pgvector
- Queue: Redis + BullMQ
- Storage: S3-compatible (use AWS SDK; allow MinIO for local)
- Auth: JWT (access + refresh), Argon2id password hashing
- AI: OpenAI-compatible API for chat + embeddings
- Realtime: Daily (or Twilio) API for room creation; client joins via provider UI
- Infra: Docker Compose for local; Terraform stubs for AWS
Deliverables
1) Monorepo with /apps/web (Next.js) and /apps/api (NestJS) and /packages/shared
2) Database migrations (SQL) implementing the schema below
3) Document ingestion pipeline:
- upload PDF -> store in S3 -> extract text (pdf-parse) -> chunk -> embed -> store in document_chunks
- background job processing with BullMQ
4) Q&A pipeline:
- create question
- optional media upload
- generate answer endpoint that performs retrieval (topK=8) from document_chunks filtered by org_id
- also retrieve topK=5 similar fix logs (store fix log summary embeddings in a new table fix_log_embeddings)
- call LLM with system prompt enforcing: cite sources, refuse if insufficient evidence, include safety warnings
- store answer + citations JSON
5) Fix logs CRUD + “create from question” shortcut
6) Microlearning:
- lessons CRUD
- assignments + completion tracking
7) Remote sessions:
- create room via provider API
- join endpoint returns meeting URL/token
- store session metadata
8) Admin UI:
- user management (basic)
- document upload/status
- analytics page (top questions, feedback, fix log count)
9) Technician UI:
- ask question page + answer view with citations
- search documents (semantic search endpoint)
- create fix log
- join remote session
10) Security basics:
- RBAC guards in API
- org scoping in every query
- rate limit Q&A endpoints
Project Structure
- apps/web
- app/(auth)/login
- app/(tech)/ask
- app/(tech)/questions/[id]
- app/(tech)/fix-logs/new
- app/(admin)/documents
- app/(admin)/users
- app/(admin)/analytics
- apps/api
- src/modules/auth
- src/modules/orgs
- src/modules/users
- src/modules/documents
- src/modules/questions
- src/modules/answers
- src/modules/fix-logs
- src/modules/lessons
- src/modules/remote-sessions
- src/modules/ai
- src/modules/search
- packages/shared
- types, zod schemas, api client
Database Schema (implement migrations)
- Use the schema from the blueprint: orgs, users, documents, document_chunks (embedding vector(1536)), questions, question_media, answers, answer_feedback, fix_logs, lessons, lesson_assignments, remote_sessions.
- Add:
- fix_log_embeddings(id, org_id, fix_log_id, content, embedding vector(1536))
API Endpoints (must implement)
- POST /v1/auth/register, /v1/auth/login, /v1/auth/refresh, /v1/auth/logout
- POST /v1/documents (multipart), GET /v1/documents, GET /v1/documents/:id
- POST /v1/questions, POST /v1/questions/:id/media, GET /v1/questions/:id
- POST /v1/questions/:id/answer
- POST /v1/answers/:id/feedback
- POST /v1/fix-logs, GET /v1/fix-logs
- POST /v1/lessons, GET /v1/lessons
- POST /v1/lessons/:id/assign, POST /v1/lesson-assignments/:id/complete
- POST /v1/remote-sessions, POST /v1/remote-sessions/:id/join, POST /v1/remote-sessions/:id/end
- POST /v1/search (semantic search over document_chunks)
LLM Prompts (implement as templates)
- System: You are a safety-first skilled trades assistant. Only answer using provided sources. Always include safety warnings when relevant. If sources are insufficient, say so and recommend escalation.
- Output format: Markdown with sections: Summary, Steps, Tools/Parts, Safety, When to Escalate, Sources.
Local Development
- Provide docker-compose.yml with:
- postgres + pgvector
- redis
- minio (optional)
- api
- web
- Seed script to create an org + admin user.
Deployment (basic)
- Provide Dockerfiles for web and api.
- Provide Terraform placeholders for:
- RDS Postgres
- ElastiCache Redis
- S3 bucket
- ECS services
Quality Bar
- Type-safe DTOs with Zod or class-validator.
- Centralized error handling.
- Unit tests for retrieval and citation formatting.
- Ensure every DB query is scoped by org_id.
Now implement the MVP end-to-end.



