How to Create & Manage Knowledge Sources
Upload, crawl, and manage content sources that power your AI assistant's knowledge and enable accurate, contextual responses to user inquiries.
Understanding Knowledge Sources
Knowledge Sources are the foundation of your AI assistant's intelligence. They provide the specific information your assistant needs to give accurate, relevant responses about your business, products, or services.
When to Use Knowledge Sources
Create knowledge sources when you need your AI assistant to have access to:
- Business-specific information - Company policies, procedures, product details
- Updated documentation - FAQs, user manuals, support guides
- Website content - Product pages, service descriptions, company information
- Custom training data - Specialized knowledge for your industry or use case
Quick Start: Choose Your Knowledge Source Type
Decision Matrix
| Source Type | Best For | Content Volume | Processing Speed | Setup Complexity |
|---|---|---|---|---|
| Upload Document | Existing files, policies, reports | Single documents | Fast ⚡ | Simple |
| Add URL | Specific web pages, articles | Single page | Fast ⚡ | Simple |
| Crawl Website | Comprehensive site knowledge | Multiple pages | Moderate 🔄 | Moderate |
| Text Corpus | Custom FAQs, structured content | Manual input | Instant ⚡⚡ | Simple |
Step-by-Step Process
Step 1: Access Knowledge Sources
Navigate to Knowledge Sources in your dashboard sidebar. This is your central hub for managing all content that powers your AI assistants.
Step 2: Add Your First Source
Click "Add Source" and choose one of the four source types: Upload Document, Add URL, Crawl Website, or Text Corpus.
Step 3: Configure Your Source
Upload Document Configuration
- Supported formats: PDF, DOC/DOCX, TXT, CSV, XLS/XLSX, PPT/PPTX
- File size limit: Maximum 21MB per file
- Best practices: Use descriptive names, clean formatting
- Processing time: Usually under 2 minutes
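The format and size limits above can be checked locally before uploading. The following is a minimal sketch, not part of the product: the helper name `check_upload` is hypothetical, and it assumes 1 MB means 1024 × 1024 bytes and that the DOCX, XLSX, and PPTX variants are accepted alongside the older formats.

```python
from pathlib import Path

# Limits taken from this guide; the helper itself is hypothetical.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".doc", ".docx", ".txt", ".csv", ".xls", ".xlsx", ".ppt", ".pptx",
}
MAX_FILE_BYTES = 21 * 1024 * 1024  # assumes 1 MB = 1024 * 1024 bytes


def check_upload(name: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    extension = Path(name).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        size_mb = size_bytes / (1024 * 1024)
        problems.append(f"file is {size_mb:.1f} MB, limit is 21 MB")
    return problems
```

Calling `check_upload("Return Policy.docx", 1000)` returns an empty list, while an unsupported extension or an oversized file produces a readable message for each problem.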
Add URL Configuration
- Requirements: Public, accessible web pages only
- Content extraction: Automatic text and media processing
- Limitations: Cannot access password-protected or dynamic content
- Processing time: Usually under 1 minute
Crawl Website Configuration
- Crawl depth: Set to 2-3 levels for best performance
- Scope options: Comprehensive or focused crawling
- Monitor progress: Check status for large websites
- Processing time: 5-30 minutes depending on site size
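To get a feel for what a crawl depth of 2-3 levels covers, the sketch below walks a small in-memory link map breadth-first and reports how many clicks each page is from the start page. It is an illustration only: the function name, the sample site, and the convention that the start page is depth 0 are all assumptions, and the platform's crawler may count levels differently.

```python
from collections import deque


def pages_within_depth(
    links: dict[str, list[str]], start: str, max_depth: int
) -> dict[str, int]:
    """Breadth-first walk returning each reachable page and its depth from start."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if depths[page] == max_depth:
            continue  # do not follow links beyond the depth limit
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site structure: page -> pages it links to.
site = {
    "/": ["/products", "/about"],
    "/products": ["/products/a", "/products/b"],
    "/products/a": ["/products/a/specs"],
}
reached = pages_within_depth(site, "/", max_depth=2)
```

With `max_depth=2`, the walk reaches the home page, its two children, and the two product pages, but not `/products/a/specs`, which sits three levels down.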
Text Corpus Configuration
- Minimum length: 50 characters required
- Format: Plain text, markdown, or structured content
- Use cases: FAQs, policies, custom knowledge
- Processing time: Instant
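The 50-character minimum is easy to check before pasting text in. This is a small sketch with a hypothetical helper name; whether the platform counts surrounding whitespace toward the minimum is not stated in this guide, so the check conservatively ignores it.

```python
MIN_CORPUS_CHARS = 50  # minimum length stated in this guide


def corpus_is_long_enough(text: str) -> bool:
    """True when the text, ignoring leading and trailing whitespace,
    has at least MIN_CORPUS_CHARS characters."""
    return len(text.strip()) >= MIN_CORPUS_CHARS
```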
Step 4: Assign to Roles
After your knowledge source is processed, assign it to roles so your AI assistants can access the information.
Remember the flow: Knowledge Sources → Roles → Assistants
Supported File Formats
Format-Specific Guidelines
PDF Files:
- Best for: Reports, manuals, presentations, policies
- Tips: Ensure text is selectable (not scanned images)
- Optimization: Use bookmarks and clear headings
Word Documents (DOC/DOCX):
- Best for: Policies, procedures, documentation
- Tips: Use styles and headings for better extraction
- Optimization: Remove track changes and comments
Plain Text (TXT):
- Best for: Simple content, transcripts, raw data
- Tips: Use clear formatting and line breaks
- Optimization: Structure with headings and sections
Spreadsheets (CSV/XLS/XLSX):
- Best for: Data tables, contact lists, structured information
- Tips: Include clear column headers
- Optimization: Remove empty rows and formatting
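Removing empty rows from a CSV export can be scripted with Python's standard `csv` module. The sketch below is one possible approach, assuming a row counts as empty when every cell is blank or whitespace; the function name is hypothetical.

```python
import csv
import io


def drop_empty_rows(csv_text: str) -> str:
    """Return the CSV with rows removed where every cell is blank."""
    reader = csv.reader(io.StringIO(csv_text))
    # A completely blank line parses as an empty list, so any() is False for it too.
    kept = [row for row in reader if any(cell.strip() for cell in row)]
    output = io.StringIO()
    csv.writer(output, lineterminator="\n").writerows(kept)
    return output.getvalue()
```

The header row is kept because it contains text; only rows with no content at all are dropped.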
Presentations (PPT/PPTX):
- Best for: Training materials, product information
- Tips: Use speaker notes for additional context
- Optimization: Ensure text is in text boxes, not images
Managing Your Knowledge Sources
Performance Monitoring
Track the effectiveness of your knowledge sources through:
- Usage analytics - Which sources are accessed most frequently
- Response accuracy - How well assistants answer using each source
- Processing status - Monitor upload and crawling progress
- Content freshness - Track when sources were last updated
Update Schedules
High Priority (Weekly Updates):
- Customer support FAQs
- Current pricing information
- Active promotions and offers
- Product availability
Medium Priority (Monthly Updates):
- Product documentation
- Feature descriptions
- Company policies
- Service offerings
Low Priority (Quarterly Updates):
- Company background information
- Historical data
- General industry information
- About us content
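The three tiers above can be turned into a simple reminder check. The sketch below is an assumption-laden illustration: the priority labels, the 7/30/90-day intervals used to approximate weekly, monthly, and quarterly, and the function name are all hypothetical and not part of the product.

```python
from datetime import date, timedelta

# Intervals approximate the weekly, monthly, and quarterly tiers above.
REVIEW_INTERVALS = {
    "high": timedelta(days=7),
    "medium": timedelta(days=30),
    "low": timedelta(days=90),
}


def is_due_for_update(priority: str, last_updated: date, today: date) -> bool:
    """True when the source has gone at least one full interval without an update."""
    return today - last_updated >= REVIEW_INTERVALS[priority]
```

For example, a high-priority FAQ last updated on 1 January is due again on 8 January, while a medium-priority source updated the same day is not yet due on 20 January.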
Quality Maintenance
Content Optimization:
- Use clear, descriptive headings in documents
- Include relevant keywords users might search for
- Break up long paragraphs into digestible sections
- Add context to technical terms and acronyms
Source Organization:
- Use consistent naming conventions, for example: ProductName-Documentation-2024
- Include dates for time-sensitive content
- Group related sources logically
- Remove outdated or redundant information
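A naming convention is easier to keep consistent when it is checked automatically. The sketch below validates names shaped like ProductName-Documentation-2024; the exact pattern (two alphanumeric parts followed by a four-digit year) is an assumption made for illustration, so adjust it to whatever convention your team adopts.

```python
import re

# Hypothetical pattern for names like ProductName-Documentation-2024.
NAME_PATTERN = re.compile(r"[A-Za-z0-9]+-[A-Za-z0-9]+-\d{4}")


def follows_convention(name: str) -> bool:
    """True when the whole name matches Product-Kind-Year."""
    return NAME_PATTERN.fullmatch(name) is not None
```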
Troubleshooting Common Issues
Quick Solutions
File Upload Issues:
- Check file size (maximum 21MB per file)
- Verify file format is supported
- Try compressing large files
- Remove password protection
URL Extraction Problems:
- Ensure URL is publicly accessible
- Test with a simpler page first
- Check if site blocks automated access
- Consider using Text Corpus instead
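One common way a site signals that it restricts automated access is its robots.txt file. The sketch below uses Python's standard `urllib.robotparser` to test a URL against robots.txt rules supplied as text. Whether the platform's crawler consults robots.txt is an assumption, not something this guide states, and sites can block automated access in other ways; the helper name and sample rules are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Check a URL against robots.txt rules supplied as text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: everything under /private/ is off limits to all agents.
rules = "User-agent: *\nDisallow: /private/\n"
blocked = allowed_by_robots(rules, "https://example.com/private/report")
open_page = allowed_by_robots(rules, "https://example.com/products")
```

If a page you need is disallowed, copying its content into a Text Corpus source is the fallback suggested above.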
Website Crawling Delays:
- Reduce crawl depth to 2-3 levels
- Target specific sections, not entire sites
- Monitor processing status regularly
- Contact support if stuck over 30 minutes
Processing Stuck:
- Wait 10-15 minutes for completion
- Check file formatting and content
- Try uploading smaller test files first
- Restart with cleaned content
Advanced Configuration
Integration with Roles and Assistants
Your knowledge sources work through a three-tier system:
- Knowledge Sources → Content repositories (what you just created)
- Roles → Logical groupings of knowledge sources
- Assistants → AI agents assigned to specific roles
Example Structure:
Customer Support Role:
├── FAQ Database.pdf
├── Return Policy.docx
├── Product Catalog (Website Crawl)
└── Troubleshooting Guide.txt
Sales Assistant Role:
├── Product Pricing.xlsx
├── Feature Comparison.pdf
├── Company Website (URL)
└── Sales Scripts.txt
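The three-tier relationship can be pictured as a small data model. This is a conceptual sketch only: the class and method names are hypothetical and do not reflect the platform's actual API or storage format. It reuses the source names from the example structure above.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    """A logical grouping of knowledge sources."""
    name: str
    sources: list[str] = field(default_factory=list)


@dataclass
class Assistant:
    """An AI agent that reaches knowledge only through its roles."""
    name: str
    roles: list[Role] = field(default_factory=list)

    def accessible_sources(self) -> set[str]:
        """Every source reachable through any assigned role."""
        return {source for role in self.roles for source in role.sources}


support = Role("Customer Support", ["FAQ Database.pdf", "Return Policy.docx"])
sales = Role("Sales", ["Product Pricing.xlsx", "Feature Comparison.pdf"])
helper = Assistant("Help Desk", roles=[support])
```

Here `helper.accessible_sources()` contains only the two support documents; the pricing sheet stays out of reach until the sales role is also assigned.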
Performance Optimization
Content Strategy:
- Start with essential information (top FAQs, key policies)
- Add specialized content based on usage patterns
- Monitor response accuracy after updates
- Balance comprehensive coverage with response speed
Technical Optimization:
- Use multiple focused sources rather than one massive source
- Structure content with clear headings and sections
- Remove sensitive information before uploading
- Test assistant responses after major content changes
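Splitting one large document into several focused sources can be done along its headings. The sketch below assumes the content is plain text or markdown in which each section starts with a line beginning `# `; the function name and the heading marker are assumptions for illustration.

```python
def split_by_heading(text: str, marker: str = "# ") -> dict[str, str]:
    """Split text into {heading: body} wherever a line starts with the marker.

    Lines before the first heading are dropped, and a repeated heading
    replaces the earlier section of the same name.
    """
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        if line.startswith(marker):
            current = line[len(marker):].strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {title: "\n".join(body).strip() for title, body in sections.items()}
```

Each resulting section can then be added as its own Text Corpus source, provided it meets the minimum length.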
Next Steps After Setup
Once your knowledge sources are created and processing:
1. Create Assistant Roles
- Group related knowledge sources by function
- Examples: Customer Support, Sales, Technical Help
- Learn how to create roles →
2. Assign Roles to Assistants
- Connect your knowledge to specific AI assistants
- Test different role combinations
- Set up assistant assignments →
3. Test Knowledge Integration
- Verify assistants can access and use your content
- Check response accuracy and relevance
Best Practices Summary
Before Upload
- ✅ Organize and clean your content
- ✅ Use descriptive, consistent naming
- ✅ Remove sensitive or confidential information
- ✅ Test with small files first
During Setup
- ✅ Choose the right source type for your content
- ✅ Configure appropriate processing settings
- ✅ Monitor processing status
- ✅ Verify content extraction quality
After Creation
- ✅ Assign sources to appropriate roles
- ✅ Test assistant responses
- ✅ Monitor usage and performance
- ✅ Establish regular update schedules
Getting Help
Need assistance with knowledge sources?
- ✉️ Direct Support: support@wiil.io
- 📞 Priority Support: Available for enterprise customers
💡 Pro Tip: Start with your top 3 most frequently asked questions or core business processes. Once your assistants are responding well to these, gradually expand your knowledge base based on actual user interactions and feedback.