Training Overview
Training is the process of teaching your AI agent about your business. Your training data feeds directly into the ChatSpark AI Engine, which uses it to find and deliver accurate answers. The more relevant content you provide, the better your agent can answer customer questions.
You can add training data from multiple sources:
- Files — Upload documents in various formats
- Websites — Crawl your site or specific URLs
- Text — Paste content directly using our editor
- YouTube — Import video transcripts
- Google Docs — Connect your documents
Begin with your FAQ page and most common support topics. These will immediately make your agent useful for the majority of customer questions.
Supported File Types
Upload documents in any of these formats:
| Format | Extension | Notes |
|---|---|---|
| PDF | .pdf | Text-based PDFs work best; scanned, image-only PDFs may not extract cleanly |
| Word | .doc, .docx | Full support including tables and lists |
| PowerPoint | .ppt, .pptx | Text from slides is extracted |
| CSV | .csv | Great for product catalogs and structured data |
| Text | .txt | Plain text files |
Each uploaded file counts toward your training data limit, which is measured in pages. One page is approximately 750 words.
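To get a rough idea of how many pages a document will use before you upload it, you can estimate from its word count. The sketch below is only an approximation: it assumes words are counted by splitting on whitespace and that partial pages round up, neither of which is confirmed ChatSpark behavior. The function name `estimate_pages` is hypothetical.

```python
import math

WORDS_PER_PAGE = 750  # approximation stated in this guide


def estimate_pages(text: str) -> int:
    """Estimate page usage from a whitespace word count.

    Assumption: partial pages round up. The actual counting
    rules used by the product may differ.
    """
    word_count = len(text.split())
    return math.ceil(word_count / WORDS_PER_PAGE)


sample = "word " * 1600  # 1,600 words of placeholder text
print(estimate_pages(sample))  # → 3
```

Treat the result as a planning figure; the dashboard remains the authoritative source for actual usage.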
Website Crawling
Let ChatSpark automatically learn from your website:
- Navigate to Training → Website in your agent settings
- Enter your website URL or specific page URLs
- Choose to crawl the entire site or just specific pages
- Click Crawl — we'll extract all text content
Full Site Crawl
Enter your homepage URL and we'll follow links to discover all pages. This is great for comprehensive coverage.
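For intuition about what "following links" means, here is a minimal sketch of link discovery using only the Python standard library. It is an illustration, not ChatSpark's crawler: the class and function names are hypothetical, and it handles a single page of already-downloaded HTML rather than a whole site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect links that stay on the same host as base_url."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links and drop any #fragment.
        absolute = urljoin(self.base_url, href).split("#")[0]
        # Keep only links on the same host as the starting page.
        if urlparse(absolute).netloc == urlparse(self.base_url).netloc:
            self.links.add(absolute)


def same_site_links(base_url: str, html: str) -> list:
    collector = LinkCollector(base_url)
    collector.feed(html)
    return sorted(collector.links)


page = '<a href="/faq">FAQ</a> <a href="https://other.example.org/x">Partner</a>'
print(same_site_links("https://example.com/", page))
# → ['https://example.com/faq']
```

A practical takeaway is that a page is only discoverable this way if some other page links to it; add unlinked pages as specific URLs instead.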
Specific Pages
Add individual URLs to target specific content. Useful for:
- FAQ pages
- Product documentation
- Pricing pages
- Policy pages (returns, shipping, etc.)
We respect robots.txt and won't crawl pages that are blocked. Dynamic content that requires JavaScript may not be fully captured.
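If a page is missing after a crawl, it can help to check whether your robots.txt blocks it. The sketch below uses Python's standard `urllib.robotparser` to test a URL against robots.txt rules. The example rules and URLs are placeholders, and the wildcard user agent `*` is an assumption, since this guide does not state which user agent the crawler identifies as.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt rules allow fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder rules for illustration; paste your own robots.txt here.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(is_crawlable(EXAMPLE_ROBOTS, "https://example.com/faq"))            # → True
print(is_crawlable(EXAMPLE_ROBOTS, "https://example.com/private/notes"))  # → False
```

If a page you want included is disallowed, either adjust your robots.txt or add the content another way, such as a file upload or the text editor.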
Rich Text Editor
Use our built-in editor to add content directly:
- Perfect for FAQs — Format questions and answers clearly
- Quick updates — Add new information instantly
- Rich formatting — Headings, lists, bold, links
- No file needed — Just paste and save
The text editor is ideal for:
- Common Q&A pairs
- Quick policy updates
- Seasonal information
- Corrections or clarifications
YouTube Transcripts
Have video content? Import transcripts from YouTube:
- Go to Training → YouTube
- Paste the YouTube video URL
- We'll extract the transcript automatically
- Review and save to your training data
This works great for:
- Product demos and tutorials
- Webinar recordings
- Training videos
- CEO messages or company announcements
Best Practices
Follow these guidelines for the best results:
- Be comprehensive — Include all information customers might ask about
- Use clear language — Write in plain English, avoid jargon
- Structure content well — Use headings, lists, and clear organization
- Include variations — If customers might phrase things differently, include those variations
- Keep it current — Update training data when policies or products change
- Review analytics — Check unanswered questions to find gaps
Review your customer support emails and tickets. The questions people actually ask are the best source of training content.
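One way to act on this is to count which questions show up most often in an export of ticket subjects. The sketch below is a simple illustration that assumes you already have the subjects as a list of strings; the sample data and function names are hypothetical, and the normalization is deliberately basic (lowercasing and trimming punctuation).

```python
from collections import Counter


def normalize(question: str) -> str:
    """Lowercase, trim edge punctuation, and collapse whitespace."""
    return " ".join(question.lower().strip(" ?!.").split())


def top_questions(subjects, n=3):
    """Return the n most frequent normalized questions with counts."""
    counts = Counter(normalize(s) for s in subjects)
    return counts.most_common(n)


# Hypothetical ticket subjects for illustration only.
tickets = [
    "Where is my order?",
    "where is my order",
    "Return policy?",
    "WHERE IS MY ORDER?? ",
    "Return policy",
]
print(top_questions(tickets, 2))
# → [('where is my order', 3), ('return policy', 2)]
```

The questions at the top of the list are good candidates for Q&A pairs in the text editor.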
Retraining Your Agent
Your agent automatically retrains when you:
- Add new training data
- Update existing content
- Delete outdated information
- Re-crawl your website
Retraining typically takes 1-5 minutes depending on the amount of content. Your agent remains available during retraining.
Training Limits
Training data limits vary by plan:
| Plan | Training Data | Approx. Words |
|---|---|---|
| Starter | 50 pages | ~37,500 words |
| Pro | 200 pages | ~150,000 words |
| Enterprise | Unlimited | Unlimited |
View your current usage in the dashboard under Training Data Usage.
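As a quick planning aid, the plan limits in the table above can be expressed as a small lookup. This is a sketch using the page limits from the table; the function name `pages_remaining` is hypothetical, and the dashboard figure should be treated as authoritative.

```python
from typing import Optional

# Page limits from the plan table; None means unlimited.
PLAN_PAGE_LIMITS = {"Starter": 50, "Pro": 200, "Enterprise": None}


def pages_remaining(plan: str, pages_used: int) -> Optional[int]:
    """Return pages left on a plan, or None if the plan is unlimited."""
    limit = PLAN_PAGE_LIMITS[plan]
    if limit is None:
        return None
    return max(limit - pages_used, 0)


print(pages_remaining("Starter", 42))       # → 8
print(pages_remaining("Enterprise", 1000))  # → None
```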