Apify

Social media and web scraping powered by Apify actors.

Provider: Apify.com
Authentication: API key required
Category: Social Media & Web Scraping
Credit Cost: 5 credits per actor run

Overview

Apify tools provide access to powerful scraping actors for social media platforms and specialized websites. These actors handle complex scraping challenges including anti-bot protection, rate limiting, and data normalization.

Setup

Get Apify API Key

  1. Sign up at apify.com
  2. Navigate to Settings → Integrations
  3. Copy your API token
  4. Add to Reeva:
    • Dashboard → Accounts → Add Account
    • Select Apify
    • Paste API key
    • Save

Available Tools

Reddit Scraper

Scrape Reddit posts, comments, communities, and user profiles.

Tool ID: apify_Reddit_Scraper
Credit Cost: 5 credits
Apify Actor: reddit-scraper-lite

Parameters:

  • start_urls (array, optional): Direct URLs to Reddit posts or profiles
  • searches (array, optional): Search terms to find posts
  • subreddits (array, optional): Subreddit names to scrape
  • users (array, optional): User profiles to scrape
  • search_posts (boolean, optional): Include posts in search
    • Default: true
  • search_comments (boolean, optional): Include comments in search
    • Default: false
  • sort (string, optional): Sort order (new, hot, top, rising)
    • Default: "new"
  • time (string, optional): Time filter (hour, day, week, month, year, all)
    • Default: "all"
  • max_items (integer, optional): Maximum items to scrape
    • Default: 50

Example Usage:

# Python - Scrape subreddit
response = client.call_tool(
    name="apify_Reddit_Scraper",
    arguments={
        "subreddits": ["machinelearning"],
        "sort": "top",
        "time": "week",
        "max_items": 25
    }
)

for post in response["items"]:
    print(f"{post['title']} - {post['upvotes']} upvotes")

// TypeScript - Search Reddit
const response = await client.callTool({
  name: "apify_Reddit_Scraper",
  arguments: {
    searches: ["MCP protocol"],
    search_posts: true,
    max_items: 10
  }
});

Use Cases:

  • Market research and sentiment analysis
  • Track brand mentions
  • Gather community feedback
  • Monitor competitor discussions
  • Research trends and topics

Tweet Scraper

Extract tweets via searches, handles, or direct URLs.

Tool ID: apify_Tweet_Scraper
Credit Cost: 5 credits
Apify Actor: tweet-scraper
Status: Currently inactive

Parameters:

  • start_urls (array, optional): Direct tweet URLs
  • search_terms (array, optional): Keywords to search for
  • twitter_handles (array, optional): User handles to scrape
  • conversation_ids (array, optional): Conversation thread IDs
  • max_items (integer, optional): Maximum tweets to scrape
  • sort (string, optional): Sort order
  • tweet_language (string, optional): Filter by language code (e.g., "en")
  • only_verified_users (boolean, optional): Only verified accounts
  • only_twitter_blue (boolean, optional): Only Twitter Blue subscribers
  • only_image (boolean, optional): Only tweets with images
  • only_video (boolean, optional): Only tweets with videos
  • only_quote (boolean, optional): Only quote tweets
  • minimum_retweets (integer, optional): Minimum retweet count
  • minimum_favorites (integer, optional): Minimum like count
  • minimum_replies (integer, optional): Minimum reply count

Example Usage:

# Python - Monitor brand mentions
response = client.call_tool(
    name="apify_Tweet_Scraper",
    arguments={
        "search_terms": ["@YourBrand"],
        "max_items": 50,
        "sort": "Latest"
    }
)

Use Cases:

  • Social media monitoring
  • Influencer analysis
  • Sentiment tracking
  • Competitive intelligence
  • Trend analysis

YouTube Transcripts

Extract video transcripts and metadata from YouTube.

Tool ID: apify_Scrape_YouTube_Transcripts
Credit Cost: 5 credits
Apify Actor: youtube-transcripts

Parameters:

  • urls (array, required): YouTube video URLs
  • output_format (string, optional): Format for transcripts
    • Default: "captions"
  • max_retries (integer, optional): Retry attempts for failed requests
    • Default: 8
  • include_channel_name (boolean, optional): Include channel name
    • Default: true
  • include_channel_id (boolean, optional): Include channel ID
    • Default: true
  • include_date_published (boolean, optional): Include publish date
    • Default: true
  • include_view_count (boolean, optional): Include view count
    • Default: false
  • include_likes (boolean, optional): Include likes count
    • Default: false
  • include_comments (boolean, optional): Include comments
    • Default: false
  • include_keywords (boolean, optional): Include video keywords
    • Default: false
  • include_thumbnail (boolean, optional): Include thumbnail URL
    • Default: false
  • include_description (boolean, optional): Include video description
    • Default: false

Response:

{
  "items": [
    {
      "video_id": "dQw4w9WgXcQ",
      "title": "Video Title",
      "channel_name": "Channel Name",
      "channel_id": "UCxxxxxxxx",
      "date_published": "2025-11-22",
      "transcript": "Full transcript text...",
      "duration": "3:45"
    }
  ]
}

Example Usage:

# Python - Get video transcript
response = client.call_tool(
    name="apify_Scrape_YouTube_Transcripts",
    arguments={
        "urls": ["https://www.youtube.com/watch?v=VIDEO_ID"],
        "include_description": True,
        "include_view_count": True
    }
)

transcript = response["items"][0]["transcript"]
print(f"Transcript length: {len(transcript)} characters")

// TypeScript - Batch transcript extraction
const videoUrls = [
  "https://www.youtube.com/watch?v=VIDEO1",
  "https://www.youtube.com/watch?v=VIDEO2",
  "https://www.youtube.com/watch?v=VIDEO3"
];

const response = await client.callTool({
  name: "apify_Scrape_YouTube_Transcripts",
  arguments: {
    urls: videoUrls,
    include_keywords: true,
    include_thumbnail: true
  }
});

Use Cases:

  • Content analysis and summarization
  • SEO research and keyword extraction
  • Educational content processing
  • Accessibility (add captions to videos)
  • Research and fact-checking

Zillow Scraper

Extract real estate property details from Zillow.

Tool ID: apify_Zillow_Scraper
Credit Cost: 5 credits
Apify Actor: zillow-detail-scraper

Parameters:

  • startUrls (array, optional): Direct Zillow property URLs
    • Format: https://www.zillow.com/homedetails/Address/12345678_zpid/
  • addresses (array, optional): Property addresses
    • Format: "123 Main St, City, State"
  • propertyStatus (string, optional): Property status filter
    • Options: "FOR_SALE", "RECENTLY_SOLD", "FOR_RENT"
    • Default: "RECENTLY_SOLD"
  • extractBuildingUnits (string, optional): Extract individual units from buildings
    • Options: "disabled", "all", "for_sale", "recently_sold", "for_rent", "off_market"
    • Default: "disabled"

Response:

{
  "items": [
    {
      "address": "123 Main St, San Francisco, CA 94102",
      "zpid": "12345678",
      "price": 1250000,
      "bedrooms": 3,
      "bathrooms": 2.5,
      "sqft": 1800,
      "lot_size": 4500,
      "year_built": 2010,
      "property_type": "Single Family",
      "listing_url": "https://www.zillow.com/...",
      "images": ["url1", "url2"],
      "description": "Beautiful home..."
    }
  ]
}

Example Usage:

# Python - Scrape by address
response = client.call_tool(
    name="apify_Zillow_Scraper",
    arguments={
        "addresses": [
            "123 Main St, San Francisco, CA",
            "456 Oak Ave, Los Angeles, CA"
        ],
        "propertyStatus": "FOR_SALE"
    }
)

for prop in response["items"]:
    print(f"{prop['address']}: ${prop['price']:,}")

// TypeScript - Scrape by URLs
const response = await client.callTool({
  name: "apify_Zillow_Scraper",
  arguments: {
    startUrls: [
      { url: "https://www.zillow.com/homedetails/Address/12345678_zpid/" }
    ],
    propertyStatus: "RECENTLY_SOLD"
  }
});

Use Cases:

  • Real estate market analysis
  • Property investment research
  • Comparative market analysis (CMA)
  • Price trend monitoring
  • Rental market research

Cost Management

Understanding Apify Credits

  • Apify charges separately for actor runs (consumes Apify credits)
  • Reeva charges 5 credits per tool execution
  • Total cost = Reeva credits (5) + Apify credits (varies)

Optimization Tips

  1. Batch Requests: Scrape multiple items in one call

    • Use max_items to control volume
    • Pass multiple URLs/searches in single request
  2. Filter Early: Use parameters to reduce data

    • only_verified_users for Twitter
    • propertyStatus for Zillow
    • Time filters for Reddit
  3. Cache Results: Store scraped data to avoid re-scraping

    • Use Supabase or Notion to cache
    • Check cache before making new requests
  4. Monitor Usage: Track Apify actor consumption

    • Review Apify dashboard regularly
    • Set up usage alerts
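The caching tip above can be sketched as a thin wrapper around a tool call. This is an illustrative in-memory version; `run_scraper` is a hypothetical stand-in for `client.call_tool`, and a production cache would persist results in Supabase or Notion as suggested:

```python
import hashlib
import json

_cache = {}  # illustrative; persist this in Supabase or Notion in practice


def cached_call(tool_name, arguments, run_scraper):
    """Return cached results when the same tool + arguments were seen before.

    The cache key is a stable hash of the tool name and its arguments, so
    identical requests are only scraped (and billed) once.
    """
    key = hashlib.sha256(
        json.dumps({"tool": tool_name, "args": arguments}, sort_keys=True).encode()
    ).hexdigest()
    if key not in _cache:
        _cache[key] = run_scraper(tool_name, arguments)  # scrape only on a miss
    return _cache[key]
```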

Best Practices

Rate Limiting

  • Apify actors handle rate limiting automatically
  • Spread large scraping jobs over time
  • Use Apify's built-in retry mechanisms
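Spreading a large job over time can be done by chunking the inputs and pausing between runs. A minimal sketch; `run_batch` is a hypothetical stand-in for whatever scrape call processes one chunk of URLs or search terms:

```python
import time


def scrape_in_batches(items, batch_size, run_batch, pause_seconds=60, sleep=time.sleep):
    """Split a large scrape into smaller runs with a pause between them.

    Avoids hammering a platform with one huge job: each chunk of `batch_size`
    items is scraped, then the loop waits before starting the next chunk.
    """
    results = []
    for start in range(0, len(items), batch_size):
        results.extend(run_batch(items[start:start + batch_size]))
        if start + batch_size < len(items):  # no pause after the final chunk
            sleep(pause_seconds)
    return results
```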

Data Quality

  • Validate scraped data before processing
  • Handle missing fields gracefully
  • Be aware of platform changes affecting scrapers

Ethics & Compliance

  • Review platform terms of service
  • Respect robots.txt and rate limits
  • Use scraped data ethically
  • Attribute data sources appropriately
Integration Examples

Example 1: Reddit Sentiment Analysis

# Scrape Reddit, analyze sentiment, store in Notion
def analyze_brand_sentiment(brand_name):
    # Scrape Reddit mentions
    reddit_data = client.call_tool(
        name="apify_Reddit_Scraper",
        arguments={
            "searches": [brand_name],
            "max_items": 100,
            "time": "week"
        }
    )

    # Analyze each post
    for post in reddit_data["items"]:
        # Use Perplexity to analyze sentiment
        sentiment = client.call_tool(
            name="Perplexity_Ask",
            arguments={
                "question": f"Analyze sentiment of this Reddit post: {post['title']} {post['body']}"
            }
        )

        # Store in Notion
        client.call_tool(
            name="notion_create_page",
            arguments={
                "title": post["title"],
                "properties": {
                    "Source": "Reddit",
                    "Sentiment": sentiment["answer"],
                    "Upvotes": post["upvotes"],
                    "URL": post["url"]
                }
            }
        )

Example 2: YouTube Content Aggregator

# Extract transcripts and create summaries
def process_youtube_playlist(video_urls):
    # Get transcripts
    transcripts = client.call_tool(
        name="apify_Scrape_YouTube_Transcripts",
        arguments={
            "urls": video_urls,
            "include_description": True,
            "include_keywords": True
        }
    )

    # Process each video
    for video in transcripts["items"]:
        # Summarize transcript
        summary = client.call_tool(
            name="web_scraper_Summarize_Webpage",
            arguments={
                "url": f"https://youtube.com/watch?v={video['video_id']}",
                "max_length": 200
            }
        )

        # Store in database
        client.call_tool(
            name="supabase_create_records",
            arguments={
                "table": "youtube_content",
                "records": [{
                    "video_id": video["video_id"],
                    "title": video["title"],
                    "transcript": video["transcript"],
                    "summary": summary["summary"],
                    "keywords": video.get("keywords", [])
                }]
            }
        )

Example 3: Real Estate Market Monitor

# Track property prices in target areas
def monitor_real_estate(addresses):
    # Scrape current listings
    properties = client.call_tool(
        name="apify_Zillow_Scraper",
        arguments={
            "addresses": addresses,
            "propertyStatus": "FOR_SALE"
        }
    )

    # Analyze and alert
    for prop in properties["items"]:
        # Check if price is below threshold
        if prop["price"] < 500000:
            # Send alert
            client.call_tool(
                name="HTTPS_Call",
                arguments={
                    "method": "POST",
                    "url": "https://hooks.slack.com/services/YOUR/WEBHOOK/URL",
                    "json": {
                        "text": f"🏠 Deal Alert: {prop['address']} for ${prop['price']:,}"
                    }
                }
            )

Troubleshooting

"Actor not found" Error

Cause: Required actor not available in your Apify account

Solutions:

  • Verify actor name in Apify dashboard
  • Check actor is active and accessible
  • Ensure Apify subscription includes required actors

Slow Execution

Cause: Actors can take time to complete

Solutions:

  • Expect longer runtimes for large scraping jobs; this is normal
  • Reduce max_items for faster results
  • Use Apify's async execution for very large jobs

Missing Data

Cause: Platform blocks or rate limits

Solutions:

  • Actors handle most blocking automatically, but some items can still be missed
  • Retry failed requests
  • Check Apify actor run logs
  • Reduce scraping frequency
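Retrying failed requests can be wrapped in a small backoff helper. A sketch only; `run_tool` is a hypothetical stand-in for `client.call_tool`:

```python
import time


def call_with_retries(run_tool, tool_name, arguments,
                      attempts=3, base_delay=2.0, sleep=time.sleep):
    """Retry a flaky scrape with exponential backoff.

    Waits base_delay, then 2x, 4x, ... between attempts; the last failure
    is re-raised so the caller can inspect the actor run logs.
    """
    for attempt in range(attempts):
        try:
            return run_tool(tool_name, arguments)
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```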

See Also