Training Your Agent

Your AI agent is only as good as its training data. This guide covers how to add knowledge from each supported source, how retraining works, and how much training data each plan allows.

Updated December 2025

Training Overview

Training is the process of teaching your AI agent about your business. Your training data feeds directly into the ChatSpark AI Engine, which uses it to find and deliver accurate answers. The more relevant content you provide, the better your agent can answer customer questions.

You can add training data from multiple sources:

  • Files — Upload documents in various formats
  • Websites — Crawl your site or specific URLs
  • Text — Paste content directly using our editor
  • YouTube — Import video transcripts
  • Google Docs — Connect your documents

Start Here
Begin with your FAQ page and most common support topics. These will immediately make your agent useful for the majority of customer questions.

Supported File Types

Upload documents in any of these formats:

Format     | Extension   | Notes
PDF        | .pdf        | Text-based PDFs work best; scanned images may have issues
Word       | .doc, .docx | Full support including tables and lists
PowerPoint | .ppt, .pptx | Text from slides is extracted
CSV        | .csv        | Great for product catalogs and structured data
Text       | .txt        | Plain text files

Note
Each uploaded file counts toward your training data limit. 1 page = approximately 750 words.
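
To get a rough idea of how much of your limit a document will use before you upload it, you can apply the 750-words-per-page approximation yourself. The sketch below is only an estimate: the name `estimate_pages` is made up for illustration, and rounding up to a whole page is an assumption, since this guide does not say how partial pages are counted.

```python
import math

WORDS_PER_PAGE = 750  # approximation stated in this guide


def estimate_pages(text: str) -> int:
    """Estimate how many pages a piece of text counts as.

    Assumption: partial pages round up to the next whole page.
    """
    word_count = len(text.split())  # split on any whitespace
    return math.ceil(word_count / WORDS_PER_PAGE)


# Example: a 1,600-word document is a little over two pages of 750 words.
sample = "word " * 1600
print(estimate_pages(sample))  # 3
```

The usage figure shown in your dashboard is the one that counts; treat this as a planning aid only.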

Website Crawling

Let ChatSpark automatically learn from your website:

  1. Navigate to Training → Website in your agent settings
  2. Enter your website URL or specific page URLs
  3. Choose to crawl the entire site or just specific pages
  4. Click Crawl — we'll extract all text content

Full Site Crawl

Enter your homepage URL and we'll follow links to discover all pages. This is great for comprehensive coverage.
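
To illustrate what "following links" means, here is a minimal sketch of link discovery on a single page, using only the Python standard library. It is not ChatSpark's crawler: the names `LinkCollector` and `same_site_links` are hypothetical, and the rule of keeping only links on the same host is an assumption about how a full site crawl would stay within your site.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_site_links(page_url: str, html: str) -> list:
    """Return absolute, de-duplicated links that stay on page_url's host."""
    collector = LinkCollector()
    collector.feed(html)
    site = urlparse(page_url).netloc
    found = []
    for href in collector.hrefs:
        # Resolve relative links and drop "#fragment" parts.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        if urlparse(absolute).netloc == site and absolute not in found:
            found.append(absolute)
    return found


html = (
    '<a href="/faq">FAQ</a>'
    '<a href="https://other.example/x">Elsewhere</a>'
    '<a href="pricing#top">Pricing</a>'
    '<a href="/faq">FAQ again</a>'
)
print(same_site_links("https://example.com/", html))
```

A crawler would repeat this step for each newly discovered page until no new links turn up. Pages that nothing links to would never be found this way, which is one reason to add important URLs individually.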

Specific Pages

Add individual URLs to target specific content. Useful for:

  • FAQ pages
  • Product documentation
  • Pricing pages
  • Policy pages (returns, shipping, etc.)

Note
We respect robots.txt and won't crawl pages that are blocked. Dynamic content that requires JavaScript may not be fully captured.
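
If a page is missing after a crawl, it can help to check whether your robots.txt blocks it. The sketch below uses Python's standard `urllib.robotparser` on a made-up robots.txt. The rules, the example URLs, and the helper name `is_crawlable` are illustrative, and the wildcard user agent `*` is an assumption, since this guide does not state which user agent the crawler identifies as.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; substitute the contents of your own file.
robots_txt = """\
User-agent: *
Disallow: /admin/
Disallow: /cart
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def is_crawlable(url: str, user_agent: str = "*") -> bool:
    """Return True if the parsed robots.txt rules allow fetching this URL."""
    return parser.can_fetch(user_agent, url)


print(is_crawlable("https://example.com/faq"))          # True
print(is_crawlable("https://example.com/admin/users"))  # False
```

To test a live site instead of a pasted file, `RobotFileParser` also offers `set_url()` followed by `read()`, which downloads the robots.txt for you.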

Rich Text Editor

Use our built-in editor to add content directly:

  • Perfect for FAQs — Format questions and answers clearly
  • Quick updates — Add new information instantly
  • Rich formatting — Headings, lists, bold, links
  • No file needed — Just paste and save

The text editor is ideal for:

  • Common Q&A pairs
  • Quick policy updates
  • Seasonal information
  • Corrections or clarifications

YouTube Transcripts

Have video content? Import transcripts from YouTube:

  1. Go to Training → YouTube
  2. Paste the YouTube video URL
  3. We'll extract the transcript automatically
  4. Review and save to your training data
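
If you are collecting many video links to import, a quick local check can catch malformed URLs before you paste them. This is a minimal sketch, assuming the two common URL shapes (`youtube.com/watch?v=...` and `youtu.be/...`). The function name `youtube_video_id` is hypothetical, and this guide does not state which URL forms ChatSpark accepts.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumed hostnames for standard watch pages.
WATCH_HOSTS = ("youtube.com", "www.youtube.com", "m.youtube.com")


def youtube_video_id(url: str) -> Optional[str]:
    """Extract the video ID from a YouTube URL, or return None if not found."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        return parsed.path.strip("/").split("/")[0] or None
    if host in WATCH_HOSTS and parsed.path == "/watch":
        # Standard links carry the ID in the "v" query parameter.
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None


print(youtube_video_id("https://www.youtube.com/watch?v=abc123XYZ_-"))
print(youtube_video_id("https://example.com/watch?v=abc123"))
```

A `None` result means the link does not match either assumed shape and is worth checking by hand.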

This works great for:

  • Product demos and tutorials
  • Webinar recordings
  • Training videos
  • CEO messages or company announcements

Best Practices

Follow these guidelines for the best results:

  • Be comprehensive — Include all information customers might ask about
  • Use clear language — Write in plain English, avoid jargon
  • Structure content well — Use headings, lists, and clear organization
  • Include variations — If customers might phrase things differently, include those variations
  • Keep it current — Update training data when policies or products change
  • Review analytics — Check unanswered questions to find gaps

Pro Tip
Review your customer support emails and tickets. The questions people actually ask are the best source of training content.

Retraining Your Agent

Your agent automatically retrains when you:

  • Add new training data
  • Update existing content
  • Delete outdated information
  • Re-crawl your website

Retraining typically takes 1-5 minutes depending on the amount of content. Your agent remains available during retraining.

Training Limits

Training data limits vary by plan:

Plan       | Training Data | Approx. Words
Starter    | 50 pages      | ~37,500 words
Pro        | 200 pages     | ~150,000 words
Enterprise | Unlimited     | Unlimited

View your current usage in the dashboard under Training Data Usage.
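
The approximate word counts above follow from the figure of roughly 750 words per page (50 × 750 = 37,500 and 200 × 750 = 150,000). As a rough planning aid, the sketch below estimates how many pages remain on a plan for a given word count. `PLAN_PAGE_LIMITS` and `remaining_pages` are illustrative names, and rounding partial pages up is an assumption.

```python
import math
from typing import Optional

WORDS_PER_PAGE = 750  # approximation stated in this guide

# Page limits per plan, taken from the table above; None means unlimited.
PLAN_PAGE_LIMITS = {"Starter": 50, "Pro": 200, "Enterprise": None}


def remaining_pages(plan: str, words_used: int) -> Optional[int]:
    """Estimate pages left on a plan; returns None for unlimited plans."""
    limit = PLAN_PAGE_LIMITS[plan]
    if limit is None:
        return None
    pages_used = math.ceil(words_used / WORDS_PER_PAGE)  # assumption: round up
    return max(limit - pages_used, 0)


print(remaining_pages("Starter", 30_000))    # 10
print(remaining_pages("Pro", 150_000))       # 0
print(remaining_pages("Enterprise", 10**6))  # None
```

The dashboard remains the authoritative source for your actual usage.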
