How to Create & Manage Knowledge Sources

Upload, crawl, and manage content sources that power your AI assistant's knowledge and enable accurate, contextual responses to user inquiries.

Understanding Knowledge Sources

Knowledge Sources are the foundation of your AI assistant's intelligence. They provide the specific information your assistant needs to give accurate, relevant responses about your business, products, or services.

[Diagram: Knowledge Source Architecture]

When to Use Knowledge Sources

Create knowledge sources when you need your AI assistant to have access to:

  • Business-specific information - Company policies, procedures, product details
  • Updated documentation - FAQs, user manuals, support guides
  • Website content - Product pages, service descriptions, company information
  • Custom training data - Specialized knowledge for your industry or use case

Quick Start: Choose Your Knowledge Source Type

Decision Matrix

Source Type     | Best For                          | Content Volume   | Processing Speed | Setup Complexity
Upload Document | Existing files, policies, reports | Single documents | Fast ⚡          | Simple
Add URL         | Specific web pages, articles      | Single page      | Fast ⚡          | Simple
Crawl Website   | Comprehensive site knowledge      | Multiple pages   | Moderate 🔄      | Moderate
Text Corpus     | Custom FAQs, structured content   | Manual input     | Instant ⚡⚡     | Simple
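
The decision matrix can be read as a small rule set. The sketch below is one illustrative reading of it, not product logic; the function name and its three boolean inputs are hypothetical.

```python
def suggest_source_type(has_file: bool, has_url: bool, whole_site: bool = False) -> str:
    """Map the kind of content you have to a row of the decision matrix."""
    if has_file:
        # Existing files, policies, reports
        return "Upload Document"
    if has_url:
        # One page vs. comprehensive site knowledge
        return "Crawl Website" if whole_site else "Add URL"
    # Nothing to upload or link: type or paste the content yourself
    return "Text Corpus"


print(suggest_source_type(has_file=False, has_url=True, whole_site=True))
```

When several rows apply, the matrix suggests starting with the simpler, faster option and moving to a crawl only when you need multiple pages.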

Step-by-Step Process

Step 1: Access Knowledge Sources

Navigate to Knowledge Sources in your dashboard sidebar. This is your central hub for managing all content that powers your AI assistants.

Step 2: Add Your First Source

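Click "Add Source" and choose one of the four source types: Upload Document, Add URL, Crawl Website, or Text Corpus. The workflow diagram summarizes the steps that follow:

[Diagram: Knowledge Source Workflow]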

Step 3: Configure Your Source

Upload Document Configuration

  • Supported formats: PDF, DOC/DOCX, TXT, CSV, XLS/XLSX, PPT/PPTX
  • File size limit: Maximum 21MB per file
  • Best practices: Use descriptive names, clean formatting
  • Processing time: Usually under 2 minutes
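
A local pre-upload check can catch format and size problems before you open the dashboard. This is a minimal sketch: `check_upload` is a hypothetical helper, the extension list is taken from the format lists in this guide, and reading "21MB" as 21 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import Path
from typing import List, Optional

# Assumption: "21MB" is interpreted as 21 MiB; the guide does not specify.
MAX_BYTES = 21 * 1024 * 1024
SUPPORTED = {".pdf", ".doc", ".docx", ".txt", ".csv", ".xls", ".xlsx", ".ppt", ".pptx"}


def check_upload(path: str, size_bytes: Optional[int] = None) -> List[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    p = Path(path)
    problems = []
    if p.suffix.lower() not in SUPPORTED:
        problems.append(f"unsupported format: {p.suffix or '(no extension)'}")
    # Read the size from disk unless the caller already knows it.
    size = p.stat().st_size if size_bytes is None else size_bytes
    if size > MAX_BYTES:
        problems.append(f"file is {size} bytes; limit is {MAX_BYTES}")
    return problems


print(check_upload("handbook.pdf", size_bytes=3_000_000))  # no problems expected
```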

Add URL Configuration

  • Requirements: Publicly accessible web pages only
  • Content extraction: Automatic text and media processing
  • Limitations: Cannot access password-protected or dynamic content
  • Processing time: Usually under 1 minute
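
Some URL problems are visible from the address alone. The sketch below is a syntactic check only, under the assumption that embedded credentials and local hostnames indicate a page that is not public; it makes no network request, so it cannot detect login walls or dynamic content. The function name is hypothetical.

```python
from urllib.parse import urlsplit


def looks_like_public_url(url: str) -> bool:
    """Cheap syntactic screen; passing it does not guarantee extraction will work."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL
    if parts.scheme not in ("http", "https"):
        return False
    host = parts.hostname
    if not host:
        return False
    if parts.username or parts.password:
        return False  # credentials in the URL suggest a protected page
    if host == "localhost" or host.endswith(".local"):
        return False  # only reachable from your own machine or network
    return True


print(looks_like_public_url("https://example.com/pricing"))
```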

Crawl Website Configuration

  • Crawl depth: Set to 2-3 levels for best performance
  • Scope options: Comprehensive or focused crawling
  • Monitor progress: Check status for large websites
  • Processing time: 5-30 minutes depending on site size
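
To see why crawl depth drives processing time, it helps to count how many pages each extra level adds. The sketch below runs a depth-limited breadth-first traversal over a toy, in-memory link map; it is not the product's crawler, and treating the start page as depth 0 is an assumption.

```python
from collections import deque
from typing import Dict, List


def pages_within_depth(links: Dict[str, List[str]], start: str, max_depth: int) -> List[str]:
    """Return pages reachable from `start` in at most `max_depth` link hops."""
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for nxt in links.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append((nxt, depth + 1))
    return order


# Toy site used only for illustration.
site = {
    "/": ["/products", "/about"],
    "/products": ["/products/a", "/products/b"],
    "/products/a": ["/products/a/specs"],
    "/about": ["/"],
}
for depth in (1, 2, 3):
    print(depth, len(pages_within_depth(site, "/", depth)))
```

On real sites the page count often grows much faster per level than in this toy map, which is why targeting specific sections tends to finish sooner than crawling an entire site.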

Text Corpus Configuration

  • Minimum length: 50 characters required
  • Format: Plain text, markdown, or structured content
  • Use cases: FAQs, policies, custom knowledge
  • Processing time: Instant
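
The minimum-length rule is easy to check before pasting. A minimal sketch, assuming surrounding whitespace does not count toward the 50 characters (the guide does not say); the function name is hypothetical.

```python
MIN_CHARS = 50  # minimum length stated above


def long_enough_for_corpus(text: str) -> bool:
    """True if the text meets the 50-character minimum after trimming whitespace."""
    return len(text.strip()) >= MIN_CHARS


print(long_enough_for_corpus("Too short."))
```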

Step 4: Assign to Roles

After your knowledge source is processed, assign it to roles so your AI assistants can access the information.

Remember the flow: Knowledge Sources → Roles → Assistants

Supported File Formats

Format-Specific Guidelines

PDF Files:

  • Best for: Reports, manuals, presentations, policies
  • Tips: Ensure text is selectable (not scanned images)
  • Optimization: Use bookmarks and clear headings

Word Documents (DOC/DOCX):

  • Best for: Policies, procedures, documentation
  • Tips: Use styles and headings for better extraction
  • Optimization: Remove track changes and comments

Plain Text (TXT):

  • Best for: Simple content, transcripts, raw data
  • Tips: Use clear formatting and line breaks
  • Optimization: Structure with headings and sections

Spreadsheets (CSV/XLS/XLSX):

  • Best for: Data tables, contact lists, structured information
  • Tips: Include clear column headers
  • Optimization: Remove empty rows and formatting

Presentations (PPT/PPTX):

  • Best for: Training materials, product information
  • Tips: Use speaker notes for additional context
  • Optimization: Ensure text is in text boxes, not images

Managing Your Knowledge Sources

Performance Monitoring

Track the effectiveness of your knowledge sources through:

  • Usage analytics - Which sources are accessed most frequently
  • Response accuracy - How well assistants answer using each source
  • Processing status - Monitor upload and crawling progress
  • Content freshness - Track when sources were last updated

Update Schedules

High Priority (Weekly Updates):

  • Customer support FAQs
  • Current pricing information
  • Active promotions and offers
  • Product availability

Medium Priority (Monthly Updates):

  • Product documentation
  • Feature descriptions
  • Company policies
  • Service offerings

Low Priority (Quarterly Updates):

  • Company background information
  • Historical data
  • General industry information
  • About us content
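
The three priority tiers translate directly into review intervals. The sketch below flags sources that are due for an update; the tuple layout and the function name are hypothetical, and approximating "monthly" as 30 days and "quarterly" as 90 days is an assumption.

```python
from datetime import date, timedelta

REVIEW_INTERVAL = {
    "high": timedelta(days=7),     # weekly
    "medium": timedelta(days=30),  # monthly, approximated
    "low": timedelta(days=90),     # quarterly, approximated
}


def sources_due(sources, today):
    """sources: iterable of (name, priority, last_updated). Return names due for review."""
    return [
        name
        for name, priority, last_updated in sources
        if today - last_updated >= REVIEW_INTERVAL[priority]
    ]


inventory = [
    ("Customer support FAQs", "high", date(2024, 1, 1)),
    ("Company policies", "medium", date(2024, 1, 1)),
    ("About us content", "low", date(2024, 1, 1)),
]
print(sources_due(inventory, today=date(2024, 1, 10)))
```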

Quality Maintenance

Content Optimization:

  • Use clear, descriptive headings in documents
  • Include relevant keywords users might search for
  • Break up long paragraphs into digestible sections
  • Add context to technical terms and acronyms

Source Organization:

  • Use consistent naming conventions: ProductName-Documentation-2024
  • Include dates for time-sensitive content
  • Group related sources logically
  • Remove outdated or redundant information
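
A small helper keeps names in the `ProductName-Documentation-2024` pattern consistent across a team. This is an illustrative sketch; the function name and the choice to strip punctuation and capitalize each word are assumptions.

```python
import re


def source_name(product: str, kind: str, year: int) -> str:
    """Build a name like 'ProductName-Documentation-2024'."""

    def clean(part: str) -> str:
        # Keep letters and digits only, capitalizing the first letter of each word.
        words = re.findall(r"[A-Za-z0-9]+", part)
        return "".join(w[:1].upper() + w[1:] for w in words)

    return f"{clean(product)}-{clean(kind)}-{year}"


print(source_name("product name", "documentation", 2024))
```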

Troubleshooting Common Issues

Quick Solutions

File Upload Issues:

  1. Check file size (maximum 21MB per file)
  2. Verify file format is supported
  3. Try compressing large files
  4. Remove password protection

URL Extraction Problems:

  1. Ensure URL is publicly accessible
  2. Test with a simpler page first
  3. Check if site blocks automated access
  4. Consider using Text Corpus instead

Website Crawling Delays:

  1. Reduce crawl depth to 2-3 levels
  2. Target specific sections, not entire sites
  3. Monitor processing status regularly
  4. Contact support if stuck over 30 minutes

Processing Stuck:

  1. Wait 10-15 minutes for completion
  2. Check file formatting and content
  3. Try uploading smaller test files first
  4. Restart with cleaned content
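
If you track processing from a script, the waiting guidance above amounts to polling with a deadline. The sketch below is generic: `get_status` is a callable you supply, and the status strings `"ready"` and `"failed"` are hypothetical, since this guide does not describe an API. The 30-minute default mirrors the point at which the guide suggests contacting support.

```python
import time
from typing import Callable


def wait_for_processing(
    get_status: Callable[[], str],
    timeout_s: float = 30 * 60,
    poll_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the source is ready or failed, or until the deadline passes."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status in ("ready", "failed"):  # hypothetical terminal states
            return status
        if clock() >= deadline:
            return "timed_out"  # time to contact support
        sleep(poll_s)


# Simulated status feed so the example runs without any service.
feed = iter(["processing", "processing", "ready"])
print(wait_for_processing(lambda: next(feed), sleep=lambda s: None))
```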

Advanced Configuration

Integration with Roles and Assistants

Your knowledge sources work through a three-tier system:

  1. Knowledge Sources → Content repositories (what you just created)
  2. Roles → Logical groupings of knowledge sources
  3. Assistants → AI agents assigned to specific roles

Example Structure:

Customer Support Role:
├── FAQ Database.pdf
├── Return Policy.docx
├── Product Catalog (Website Crawl)
└── Troubleshooting Guide.txt

Sales Assistant Role:
├── Product Pricing.xlsx
├── Feature Comparison.pdf
├── Company Website (URL)
└── Sales Scripts.txt
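
The three tiers can be pictured as a small data model. This sketch is illustrative only: the class and field names are hypothetical, and allowing an assistant to hold more than one role is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KnowledgeSource:
    name: str
    kind: str  # "document", "url", "crawl", or "text"


@dataclass
class Role:
    name: str
    sources: List[KnowledgeSource] = field(default_factory=list)


@dataclass
class Assistant:
    name: str
    roles: List[Role] = field(default_factory=list)

    def accessible_sources(self) -> List[str]:
        """Names of every source reachable through this assistant's roles."""
        names: List[str] = []
        for role in self.roles:
            for source in role.sources:
                if source.name not in names:
                    names.append(source.name)
        return names


support = Role("Customer Support", [
    KnowledgeSource("FAQ Database.pdf", "document"),
    KnowledgeSource("Return Policy.docx", "document"),
    KnowledgeSource("Product Catalog", "crawl"),
    KnowledgeSource("Troubleshooting Guide.txt", "document"),
])
helper = Assistant("Help Desk Bot", [support])
print(helper.accessible_sources())
```

The point of the model is that an assistant never references a source directly: a source that is not assigned to any role is invisible to every assistant.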

Performance Optimization

Content Strategy:

  • Start with essential information (top FAQs, key policies)
  • Add specialized content based on usage patterns
  • Monitor response accuracy after updates
  • Balance comprehensive coverage with response speed

Technical Optimization:

  • Use multiple focused sources rather than one massive source
  • Structure content with clear headings and sections
  • Remove sensitive information before uploading
  • Test assistant responses after major content changes
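
One way to turn a single large document into several focused sources is to split it at its headings before uploading. The sketch below assumes markdown-style `#` headings and a hypothetical function name; sections with the same title overwrite one another, so keep headings unique.

```python
from typing import Dict


def split_by_heading(text: str) -> Dict[str, str]:
    """Split markdown text into {heading: body}, skipping empty sections."""
    sections: Dict[str, str] = {}
    title = "Introduction"  # label for any text before the first heading
    lines = []
    for line in text.splitlines():
        if line.startswith("#"):
            if any(l.strip() for l in lines):
                sections[title] = "\n".join(lines).strip()
            title = line.lstrip("#").strip()
            lines = []
        else:
            lines.append(line)
    if any(l.strip() for l in lines):
        sections[title] = "\n".join(lines).strip()
    return sections


doc = "# Returns\nItems can be returned.\n\n# Shipping\nWe ship weekly.\n"
for heading, body in split_by_heading(doc).items():
    print(heading, "->", body)
```

Each resulting section can then be added as its own Text Corpus source or saved as a separate file.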

Next Steps After Setup

Once your knowledge sources are created and processed:

1. Create Assistant Roles

2. Assign Roles to Assistants

3. Test Knowledge Integration

  • Verify assistants can access and use your content
  • Check response accuracy and relevance
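
A repeatable spot check makes it easier to compare answers before and after a content change. The sketch below is a minimal harness: `ask` stands in for however you query your assistant (for example, pasting questions into the chat and recording the replies), and the keyword-matching approach is an assumption rather than a product feature.

```python
from typing import Callable, Dict, List


def check_answers(ask: Callable[[str], str], cases: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return {question: missing keywords} for every answer that lacks expected terms."""
    failures: Dict[str, List[str]] = {}
    for question, keywords in cases.items():
        answer = ask(question).lower()
        missing = [k for k in keywords if k.lower() not in answer]
        if missing:
            failures[question] = missing
    return failures


# Stand-in assistant so the example runs on its own.
def fake_assistant(question: str) -> str:
    return "You can return items using the Return Policy form."


cases = {
    "How do I return an item?": ["return policy"],
    "Where is the product catalog?": ["catalog"],
}
print(check_answers(fake_assistant, cases))
```

Keyword checks only confirm that the expected source material surfaced; reading a sample of full answers is still needed to judge relevance.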

Best Practices Summary

Before Upload

  • ✅ Organize and clean your content
  • ✅ Use descriptive, consistent naming
  • ✅ Remove sensitive or confidential information
  • ✅ Test with small files first

During Setup

  • ✅ Choose the right source type for your content
  • ✅ Configure appropriate processing settings
  • ✅ Monitor processing status
  • ✅ Verify content extraction quality

After Creation

  • ✅ Assign sources to appropriate roles
  • ✅ Test assistant responses
  • ✅ Monitor usage and performance
  • ✅ Establish regular update schedules

Getting Help

Need assistance with knowledge sources?

  • 📞 Priority Support: Available for enterprise customers

💡 Pro Tip: Start with your top 3 most frequently asked questions or core business processes. Once your assistants are responding well to these, gradually expand your knowledge base based on actual user interactions and feedback.

