Scrape tweets from a Nitter instance and serve them as RSS feeds.
This application scrapes tweets from a Nitter instance (a privacy-focused Twitter frontend) and stores them in a database. It uses headless Chrome via nodriver to render JavaScript and scroll through search results, extracting tweets progressively. Make sure infinite scrolling is enabled in your Nitter instance.
Key features:
- Scrape tweets for multiple Twitter/X users
- Organize users into lists
- Generate RSS feeds per list
- Automatic periodic scraping (configurable interval)
- Tweet type detection (post, reply, quote, retweet)
- Progressive saving (tweets saved as they're found)
- Django admin for management
- Docker support with minimal Alpine-based image
A self-hosted Nitter instance is strongly recommended. Public Nitter instances are often rate-limited or blocked by Twitter, resulting in empty or incomplete results. Running your own instance makes scraping far more reliable.
Popular Nitter deployments:
- nitter - Original
- nitter-tee - Fork with tweet embedding
├── scripts/ # Standalone scraper (optional)
│ ├── scraper.py # Core scraping module
│ ├── scrape.py # CLI
│ └── requirements.txt
├── tweets_rss/ # Django web app
│ ├── manage.py
│ ├── settings.py
│ ├── urls.py
│ ├── requirements.txt
│ └── rss_aggregator/
│ ├── models.py # TwitterUser, Tweet, TweetList
│ ├── admin.py # Admin interface
│ ├── feeds.py # RSS feed generation
│ ├── scraper.py # Integrated scraper
│ ├── scheduler.py # Background task scheduler
│ └── views.py # YAML export endpoint
├── Dockerfile
├── docker-compose.yml
├── entrypoint.sh
├── .env.example
└── README.md
- Clone and configure:
git clone <repo-url>
cd tweets_retrieve
cp .env.example .env
- Edit `.env` with your Nitter instance URL:
NITTER_INSTANCE=http://your-nitter-instance:8080
- Start the container:
docker compose up -d
# or with podman
podman compose up -d
- Access admin: http://localhost:8000/admin/
Default credentials: admin / admin (change via ADMIN_USER / ADMIN_PASSWORD env vars)
cd tweets_rss
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python manage.py migrate
python manage.py createsuperuser
python manage.py runserver
- Go to Admin > Twitter Users > Add
- Enter the Twitter username (without @)
- Set the Start Date (tweets before this date won't be scraped)
- Choose whether to Include Replies
- Save - scraping starts automatically in the background
- Go to Admin > Tweet Lists > Add
- Create a list with a name and slug
- Add users to the list
- Subscribe to RSS at `/rss/<slug>/`
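To check a feed without an RSS reader, you can parse it with the standard library. This is a minimal sketch, assuming the feed is standard RSS 2.0 with `<item>` elements carrying `<title>`, `<link>`, and `<pubDate>`. The `feed_items` helper and the sample document are hypothetical and not part of this project.

```python
import xml.etree.ElementTree as ET


def feed_items(rss_xml: str) -> list[dict]:
    """Return title, link and publication date for each <item> in an RSS document."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
        })
    return items


# Hypothetical sample. A real feed would be fetched from
# http://localhost:8000/rss/<slug>/ (for example with urllib.request.urlopen).
SAMPLE = """<rss version="2.0"><channel><title>my-list</title>
<item><title>First tweet</title><link>http://example.com/1</link>
<pubDate>Mon, 01 Jan 2024 12:00:00 +0000</pubDate></item>
</channel></rss>"""

for entry in feed_items(SAMPLE):
    print(entry["published"], entry["title"])
```

An empty result from a real feed usually means the list has no users yet or no tweets have been scraped.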
| Endpoint | Description |
|---|---|
| `/admin/` | Django admin interface |
| `/rss/<slug>/` | RSS feed for a list |
| `/export/<username>.yaml` | Export user's tweets as YAML |
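If you call these endpoints from a script, a small helper keeps the URLs consistent. The sketch below assumes the app is reachable at a base URL such as `http://localhost:8000`. The `endpoint_urls` function and the `my-list` slug are hypothetical names used only for illustration.

```python
from urllib.parse import quote, urljoin


def endpoint_urls(base_url: str, slug: str, username: str) -> dict[str, str]:
    """Build the admin, RSS and YAML export URLs for one list and one user."""
    base = base_url.rstrip("/") + "/"  # urljoin needs a trailing slash on the base
    return {
        "admin": urljoin(base, "admin/"),
        "rss": urljoin(base, f"rss/{quote(slug)}/"),
        "export": urljoin(base, f"export/{quote(username)}.yaml"),
    }


urls = endpoint_urls("http://localhost:8000", "my-list", "karpathy")
for name, url in urls.items():
    print(f"{name}: {url}")
```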
From the admin panel:
- Click "Scrape" button next to any user
- Select multiple users and use "Scrape selected users" action
- Click "Test Nitter" to verify your Nitter instance is working
From command line:
# Inside container
python manage.py scrape_tweets
python manage.py scrape_tweets --user karpathy

| Variable | Default | Description |
|---|---|---|
| `SECRET_KEY` | dev key | Django secret key (change in production!) |
| `DEBUG` | `False` | Debug mode |
| `ALLOWED_HOSTS` | `*` | Comma-separated allowed hosts |
| `ADMIN_USER` | `admin` | Admin username |
| `ADMIN_PASSWORD` | `admin` | Admin password |
| `ADMIN_EMAIL` | `admin@example.com` | Admin email |
| `NITTER_INSTANCE` | - | Required. Your Nitter instance URL |
| `SCRAPE_DELAY` | `3.0` | Seconds between scrolls (rate limiting) |
| `SCRAPE_INTERVAL_HOURS` | `6` | Hours between automatic scrapes |
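The scraping-related variables can be read with the defaults from the table. The following is a sketch under assumptions, not the project's actual settings code: the `ScrapeSettings` class and `load_scrape_settings` function are hypothetical, and only the variable names and default values come from the table above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ScrapeSettings:
    nitter_instance: str
    scrape_delay: float          # seconds between scrolls
    scrape_interval_hours: int   # hours between automatic scrapes


def load_scrape_settings(env=None) -> ScrapeSettings:
    """Read scraper settings from a mapping (os.environ when none is given)."""
    env = os.environ if env is None else env
    nitter = env.get("NITTER_INSTANCE", "").strip()
    if not nitter:
        # NITTER_INSTANCE has no default, so fail early with a clear message.
        raise ValueError("NITTER_INSTANCE is required")
    return ScrapeSettings(
        nitter_instance=nitter.rstrip("/"),
        scrape_delay=float(env.get("SCRAPE_DELAY", "3.0")),
        scrape_interval_hours=int(env.get("SCRAPE_INTERVAL_HOURS", "6")),
    )


settings = load_scrape_settings({"NITTER_INSTANCE": "http://your-nitter-instance:8080"})
print(settings)
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to try out without touching the real environment.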
To avoid overloading your Nitter instance:
- SCRAPE_DELAY: Wait time between page scrolls (default: 3 seconds)
- Max tweets: Scraping stops after 500 tweets per user per session
- User delay: 30 seconds wait between scraping different users
- Scraping resumes from `max(start_date, last_scrape - 1 day)` to avoid re-fetching old tweets
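The resume rule in the last bullet can be sketched as follows. This is an illustration only: it assumes both values are plain dates and that a user who has never been scraped starts from `start_date`. The function and constant names are hypothetical; the numeric limits are the ones listed above.

```python
from datetime import date, timedelta
from typing import Optional

MAX_TWEETS_PER_SESSION = 500  # per user, per scraping session
USER_DELAY_SECONDS = 30       # pause between different users


def resume_date(start_date: date, last_scrape: Optional[date]) -> date:
    """Return the earliest date the next scrape should search from."""
    if last_scrape is None:
        # Assumption: a first scrape begins at the configured start date.
        return start_date
    # Step back one day so tweets posted around the previous run are not missed.
    return max(start_date, last_scrape - timedelta(days=1))


print(resume_date(date(2024, 1, 1), date(2024, 3, 10)))
print(resume_date(date(2024, 1, 1), None))
```

The one-day overlap means a few tweets are fetched twice, so whatever stores them has to skip duplicates.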
The scripts/ folder contains a standalone scraper that can be used without Django:
cd scripts
pip install -r requirements.txt
# Basic usage
python scrape.py <username> <since> <until>
# Examples
python scrape.py karpathy 2024-01-01 2024-12-31
python scrape.py karpathy 2024-01-01 2024-12-31 --visible # Show browser
python scrape.py karpathy 2024-01-01 2024-12-31 -o tweets.yaml
- Test your Nitter instance using the "Test Nitter" button in admin
- Ensure your Nitter instance is properly configured and can access Twitter
- Public Nitter instances are often rate-limited - use a self-hosted instance
The scraper runs headless Chrome. If you see sandbox errors:
- The container uses `no_sandbox=True`, which is required for running as root
- Ensure Chromium is installed (included in Docker image)
If you get migration errors after updates:
# Reset database (will lose data)
rm tweets_rss/db.sqlite3
docker compose down && docker compose up -d
License: MIT