Training Your Agent

Your AI agent is only as good as its training data. This guide covers everything you need to know about adding knowledge to make your agent smarter and more helpful.

Updated April 2026

Training Overview

Training is the process of teaching your AI agent about your business. Your training data feeds directly into the ChatSpark AI Engine, which uses it to find and deliver accurate answers. The more relevant content you provide, the better your agent can answer customer questions.

You can add training data from multiple sources:

  • Files: Upload documents in various formats
  • Websites: Crawl your site or specific URLs
  • Text: Paste content directly using our editor
  • More Sources: Pull resolved tickets and conversations from your helpdesk
Start Here
Begin with your FAQ page and most common support topics. These will immediately make your agent useful for the majority of customer questions.

Supported File Types

Upload documents in any of these formats:

Format     | Extension   | Notes
PDF        | .pdf        | Text-based PDFs work best; scanned images may have issues
Word       | .doc, .docx | Full support including tables and lists
PowerPoint | .ppt, .pptx | Text from slides is extracted
CSV        | .csv        | Great for product catalogs and structured data
Text       | .txt        | Plain text files
Note
Each uploaded file counts toward your training data limit. One page is approximately 750 words.
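
If you are preparing a batch of documents, a quick local check can show which files match the supported formats before you upload them. The following is a minimal sketch in Python and is not part of ChatSpark; the function name `split_by_support` and the example file names are hypothetical.

```python
from pathlib import Path

# Extensions listed under Supported File Types in this guide.
SUPPORTED_EXTENSIONS = {".pdf", ".doc", ".docx", ".ppt", ".pptx", ".csv", ".txt"}


def split_by_support(filenames):
    """Return (supported, unsupported) lists, comparing extensions case-insensitively."""
    supported, unsupported = [], []
    for name in filenames:
        if Path(name).suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(name)
        else:
            unsupported.append(name)
    return supported, unsupported


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    ok, skipped = split_by_support(["faq.PDF", "catalog.csv", "logo.png"])
    print("Ready to upload:", ok)
    print("Convert or skip:", skipped)
```

A check like this only looks at the file extension. It cannot tell whether a PDF is text-based or a scanned image.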

Website Crawling

Let ChatSpark automatically learn from your website:

  1. Navigate to Training → Website in your agent settings
  2. Enter your website URL or specific page URLs
  3. Choose to crawl the entire site or just specific pages
  4. Click Crawl, and we'll extract the text content from the pages we can reach

Full Site Crawl

Enter your homepage URL and we'll follow links to discover all pages. This is great for comprehensive coverage.

Specific Pages

Add individual URLs to target specific content. Useful for:

  • FAQ pages
  • Product documentation
  • Pricing pages
  • Policy pages (returns, shipping, etc.)
Note
We respect robots.txt and won't crawl pages that are blocked. Dynamic content that requires JavaScript may not be fully captured.
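
To see in advance which of your pages a robots.txt file would block, you can test the rules locally. The sketch below uses Python's standard `urllib.robotparser` module. The robots.txt content, the URLs, and the `ExampleCrawler` user agent are placeholders; this guide does not state which user agent ChatSpark's crawler sends, so treat that name as an assumption.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content; a real check would use your site's own file.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /cart
"""


def blocked_urls(robots_txt, urls, user_agent="ExampleCrawler"):
    """Return the URLs that the given robots.txt rules disallow for user_agent.

    "ExampleCrawler" is a placeholder name, not ChatSpark's real user agent.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    pages = [
        "https://example.com/faq",
        "https://example.com/admin/settings",
        "https://example.com/cart",
    ]
    for url in blocked_urls(EXAMPLE_ROBOTS_TXT, pages):
        print("Blocked by robots.txt:", url)
```

If a page you want in your training data shows up as blocked, you could paste its content through the text editor instead.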

Rich Text Editor

Use our built-in editor to add content directly:

  • Perfect for FAQs: Format questions and answers clearly
  • Quick updates: Add new information instantly
  • Rich formatting: Headings, lists, bold, links
  • No file needed: Just paste and save

The text editor is ideal for:

  • Common Q&A pairs
  • Quick policy updates
  • Seasonal information
  • Corrections or clarifications
  • YouTube video transcripts

More Sources

The More Sources tab lets you pull resolved tickets and conversations directly from your helpdesk or support platform. Your agent learns from real customer interactions, which makes it better at answering the questions your customers actually ask.

Supported platforms:

  • HappyFox
  • Zendesk
  • Freshdesk
  • Salesforce
  • Freshchat
  • Intercom

How to import

  1. Go to Training → More Sources in your agent settings
  2. Set up the AI Action for your platform if you have not done so already
  3. Select your date range and how many records to import
  4. Choose any platform-specific filters such as category, group, or case type
  5. Click Import and your records will be queued for training
Note
Ticket and conversation records are formatted automatically. You do not need to format anything manually.
Counts toward your plan limit
Each imported record counts toward your training data limit the same as any other source. One page is approximately 750 words.

HappyFox

HappyFox requires you to select at least one category before importing. Resolved tickets from those categories will be pulled and formatted as question and answer pairs for your agent.

Zendesk and Freshdesk

Import solved tickets from your Zendesk account or resolved tickets from Freshdesk. You can optionally filter by group to target a specific team.

Salesforce

Pull closed cases from Salesforce. You can optionally filter by case type to focus on a specific category of support interactions.

Freshchat and Intercom

Freshchat and Intercom are conversation-based platforms. Each imported record contains the full back-and-forth dialogue between the customer and your team. You can filter by group or team inbox to focus on the most relevant conversations.

Privacy and Security
Every ticket and conversation imported through More Sources is automatically screened before it is added to your training data. Personal information including names, email addresses, phone numbers, and shared credentials is detected and removed. Your agent learns from the resolution patterns and product knowledge in your support history, not from the personal details of individual customers.
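
To give a rough sense of what removing personal details from a record can look like, here is a deliberately simplified Python sketch. It is not ChatSpark's screening process, which this guide does not describe in detail. The patterns, the placeholder labels, and the `redact` function are all hypothetical, and the sketch only handles email addresses and phone-like numbers, not names or credentials.

```python
import re

# Simplified illustrative patterns; real screening needs far broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text):
    """Replace email addresses and phone-like numbers with placeholder labels."""
    text = EMAIL_RE.sub("[email removed]", text)
    return PHONE_RE.sub("[phone removed]", text)


if __name__ == "__main__":
    sample = "Reach me at jane.doe@example.com or +1 (555) 010-2030."
    print(redact(sample))
```

Patterns this simple will miss some formats and may also match long numbers that are not phone numbers, such as order IDs.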

Best Practices

Follow these guidelines for the best results:

  • Be comprehensive: Include all information customers might ask about
  • Use clear language: Write in plain English and avoid jargon
  • Structure content well: Use headings, lists, and clear organization
  • Include variations: If customers phrase the same question in different ways, include those phrasings
  • Keep it current: Update training data when policies or products change
  • Review analytics: Check unanswered questions to find gaps
Pro Tip
Review your customer support emails and tickets. The questions people actually ask are the best source of training content.

Retraining Your Agent

Your agent automatically retrains when you:

  • Add new training data
  • Update existing content
  • Delete outdated information
  • Re-crawl your website

Retraining typically takes 1-5 minutes depending on the amount of content. Your agent remains available during retraining.

Training Limits

Training data limits vary by plan:

Plan       | Training Data | Approx. Words
Starter    | 50 pages      | ~37,500 words
Pro        | 200 pages     | ~150,000 words
Enterprise | Unlimited     | Unlimited

View your current usage in the dashboard under Training Data Usage.
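
The page figures follow from the approximation of 750 words per page used in this guide: 50 pages × 750 words = 37,500 words, and 200 pages × 750 words = 150,000 words. The Python sketch below estimates how many pages a given word count would use and whether that fits a plan. The function names are hypothetical, and rounding partial pages up is an assumption; the dashboard remains the authoritative view of your usage.

```python
WORDS_PER_PAGE = 750  # approximation stated in this guide

# Page limits from the Training Limits table; None means unlimited.
PLAN_PAGE_LIMITS = {"Starter": 50, "Pro": 200, "Enterprise": None}


def estimate_pages(word_count):
    """Estimate pages used, rounding partial pages up (rounding is an assumption)."""
    return (word_count + WORDS_PER_PAGE - 1) // WORDS_PER_PAGE


def fits_plan(word_count, plan):
    """Return True if the estimated page count is within the plan's limit."""
    limit = PLAN_PAGE_LIMITS[plan]
    return limit is None or estimate_pages(word_count) <= limit


if __name__ == "__main__":
    words = 40_000  # hypothetical total across all training sources
    print("Estimated pages:", estimate_pages(words))
    for plan in PLAN_PAGE_LIMITS:
        print(plan, "fits" if fits_plan(words, plan) else "exceeds limit")
```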
