Firecrawl
AI-powered web scraping, crawling, and content extraction.
Overview
Firecrawl converts websites into clean, LLM-ready data. It handles JavaScript rendering, removes boilerplate, and extracts structured content.
Provider: Firecrawl.dev
Credit Cost: 3 credits per operation
Authentication: API key required
Available Operations
Scrape URL
Extract content from a single webpage.
Tool: firecrawl_scrape
Parameters:
- url (string, required): Target URL to scrape
- formats (array, optional): Output formats - ["markdown", "html", "links", "screenshot"]
- onlyMainContent (boolean, optional): Extract only the main content (default: true)
- includeTags (array, optional): HTML tags to include
- excludeTags (array, optional): HTML tags to exclude
- waitFor (number, optional): Milliseconds to wait for the page to load
Example:
Scrape a webpage
cURL:
curl -X POST https://api.joinreeva.com/mcp/server_YOUR_ID \
-H "Authorization: Bearer mcpk_your_key" \
-H "Content-Type: application/json" \
-d '{
"jsonrpc": "2.0",
"id": 1,
"method": "tools/call",
"params": {
"name": "firecrawl_scrape",
"arguments": {
"url": "https://example.com/article",
"formats": ["markdown"],
"onlyMainContent": true
}
}
}'
TypeScript:
const response = await fetch('https://api.joinreeva.com/mcp/server_YOUR_ID', {
method: 'POST',
headers: {
'Authorization': 'Bearer mcpk_your_key',
'Content-Type': 'application/json'
},
body: JSON.stringify({
jsonrpc: '2.0',
id: 1,
method: 'tools/call',
params: {
name: 'firecrawl_scrape',
arguments: {
url: 'https://example.com/article',
formats: ['markdown'],
onlyMainContent: true
}
}
})
});
const data = await response.json();
Python:
import httpx
response = httpx.post(
'https://api.joinreeva.com/mcp/server_YOUR_ID',
headers={
'Authorization': 'Bearer mcpk_your_key',
'Content-Type': 'application/json'
},
json={
'jsonrpc': '2.0',
'id': 1,
'method': 'tools/call',
'params': {
'name': 'firecrawl_scrape',
'arguments': {
'url': 'https://example.com/article',
'formats': ['markdown'],
'onlyMainContent': True
}
}
}
)
data = response.json()
Response:
{
"content": [
{
"type": "text",
"text": "# Article Title\n\nArticle content in markdown..."
}
]
}
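The response wraps the scraped markdown inside an MCP content array. A small helper (a sketch, assuming the response shape shown above, where "content" is a list of parts and text parts carry a "text" field) can pull the markdown back out:

```python
def extract_text(result: dict) -> str:
    """Concatenate all text parts from an MCP tool response."""
    return "\n".join(
        part["text"]
        for part in result.get("content", [])
        if part.get("type") == "text"
    )

# Example, matching the response shown above:
result = {
    "content": [
        {"type": "text", "text": "# Article Title\n\nArticle content in markdown..."}
    ]
}
markdown = extract_text(result)
```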
Crawl Website
Recursively crawl multiple pages from a website.
Tool: firecrawl_crawl
Parameters:
- url (string, required): Starting URL
- maxDepth (number, optional): Maximum crawl depth (default: 2)
- limit (number, optional): Maximum pages to crawl (default: 10)
- allowedDomains (array, optional): Restrict crawling to specific domains
- excludePaths (array, optional): Paths to skip
Note: Returns a job ID. Pass it to firecrawl_check_crawl_status to retrieve results.
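A crawl request uses the same JSON-RPC envelope as the scrape example above. A minimal sketch of the payload builder (argument names follow the Parameters list; the endpoint and key placeholders match the earlier examples):

```python
def crawl_payload(url: str, max_depth: int = 2, limit: int = 10) -> dict:
    """Build a tools/call request for firecrawl_crawl."""
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {
            "name": "firecrawl_crawl",
            "arguments": {"url": url, "maxDepth": max_depth, "limit": limit},
        },
    }

# Send with httpx.post('https://api.joinreeva.com/mcp/server_YOUR_ID',
#                      headers={...}, json=crawl_payload("https://example.com/docs"))
# and keep the returned job ID for firecrawl_check_crawl_status.
```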
Check Crawl Status
Get results from a crawl job.
Tool: firecrawl_check_crawl_status
Parameters:
- job_id (string, required): Job ID returned by the crawl operation
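Because crawls run asynchronously, results are fetched by polling. A hedged sketch of a polling loop; the status values "completed" and "failed" are assumptions about the job result shape, not confirmed by the docs above. `check` stands for any callable that invokes firecrawl_check_crawl_status and returns the parsed result:

```python
import time

def wait_for_crawl(job_id: str, check, interval: float = 2.0, max_tries: int = 30) -> dict:
    """Poll check(job_id) until the job reaches a terminal status."""
    for _ in range(max_tries):
        result = check(job_id)
        # Assumed terminal states; adjust to the actual status values returned.
        if result.get("status") in ("completed", "failed"):
            return result
        time.sleep(interval)
    raise TimeoutError(f"crawl job {job_id} did not finish")
```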
Extract Structured Data
Extract structured data using AI.
Tool: firecrawl_extract
Parameters:
- url (string, required): Target URL
- schema (object, required): JSON schema for extraction
- prompt (string, optional): Instructions for extraction
Example schema:
{
"type": "object",
"properties": {
"company_name": {"type": "string"},
"description": {"type": "string"},
"founded_year": {"type": "number"}
}
}
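The schema above plugs directly into the tool's arguments. A sketch, assuming the argument names from the Parameters list (the URL and prompt text are illustrative):

```python
# JSON schema from the example above.
schema = {
    "type": "object",
    "properties": {
        "company_name": {"type": "string"},
        "description": {"type": "string"},
        "founded_year": {"type": "number"},
    },
}

# Arguments for a firecrawl_extract tools/call request.
arguments = {
    "url": "https://example.com/about",
    "schema": schema,
    "prompt": "Extract the company profile from this page.",
}
```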
Authentication
Setting Up Firecrawl
1. Get API Key:
   - Sign up at firecrawl.dev
   - Get your API key from the dashboard
2. Add to Reeva:
   - Dashboard → Accounts
   - Click "Add Account"
   - Select "Firecrawl"
   - Paste your API key
   - Save
3. Link to Custom Tool:
   - When creating a custom tool, select your Firecrawl account
   - Credentials are auto-injected
Use Cases
Documentation Scraping
Scrape technical documentation for knowledge bases:
firecrawl_crawl → Get all doc pages
↓
firecrawl_scrape → Extract each page
↓
Store in vector database
Competitor Analysis
Monitor competitor websites:
Schedule regular scrapes
↓
Compare changes over time
↓
Alert on updates
Content Aggregation
Build content databases:
firecrawl_scrape → Multiple sources
↓
Clean and format
↓
Store in Notion/Supabase
Research Automation
Gather research materials:
Google Search → Find URLs
↓
firecrawl_scrape → Get content
↓
Perplexity → Analyze
Best Practices
Performance
- ✅ Use onlyMainContent: true to reduce noise
- ✅ Set reasonable waitFor times (avoid excessive delays)
- ✅ Limit crawl depth to avoid fetching too many pages
Cost Management
- ✅ Each operation costs 3 credits
- ✅ Crawls cost 3 credits × pages crawled
- ✅ Test on single pages before crawling
Reliability
- ✅ Handle rate limits (429 errors)
- ✅ Check for successful responses
- ✅ Use appropriate formats for your use case
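Handling 429s usually means retrying with backoff. A minimal sketch (the backoff schedule is an assumption, not a documented rate limit); `post` is any zero-argument callable that sends the request and returns a response with a `.status_code`, e.g. `httpx.post` bound to the endpoint:

```python
import time

def post_with_retry(post, max_retries: int = 3, base_delay: float = 1.0):
    """Retry on HTTP 429 with exponential backoff; return the last response."""
    for attempt in range(max_retries + 1):
        response = post()
        if response.status_code != 429:
            return response
        if attempt < max_retries:
            # 1s, 2s, 4s, ... (illustrative schedule)
            time.sleep(base_delay * (2 ** attempt))
    return response
```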
Troubleshooting
"Failed to scrape"
Causes:
- Website blocks scraping
- URL is invalid
- Timeout occurred
Solutions:
- Increase the waitFor parameter
- Check that the URL is accessible
- Try different format options
"Insufficient credits"
Solution: Purchase more credits in dashboard.
Slow scraping
Expected: 5-15 seconds per page is normal.
If slower: Check network connection and target website speed.
Related Tools
- Web Search - Find URLs to scrape
- Perplexity - Analyze scraped content
- Apify - Alternative scraping actors
Credit Cost
- Scrape: 3 credits
- Crawl: 3 credits × pages
- Extract: 3 credits
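The crawl multiplier is worth estimating before launching a job. Following the table above (3 credits per operation, crawls multiplied by pages):

```python
CREDITS_PER_OPERATION = 3

def crawl_cost(pages: int) -> int:
    """Estimated credit cost of a crawl that fetches `pages` pages."""
    return CREDITS_PER_OPERATION * pages
```

For example, a crawl limited to 10 pages costs an estimated 30 credits.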