This repository provides a Python-based solution to scrape detailed public company information from LinkedIn using the Crawlbase Crawling API.
It includes:
- A scraper that sends asynchronous requests to LinkedIn company page URLs.
- A retrieval script that fetches the structured company data using the request ID (RID).
📖 Read the full tutorial: How to Scrape LinkedIn
Requirements:

- `crawlbase` – for accessing the Crawling and Storage APIs
- `json` – for handling structured output
- Python 3.6+
Install the required package:
```bash
pip install crawlbase
```

File: `linkedin_company_scraper.py`
- Sends an asynchronous scraping request to a LinkedIn company page.
- Returns a rid (request ID) to be used for retrieving the full response.
- Replace `YOUR_API_TOKEN` with your Crawlbase token.
- Set the LinkedIn company page URL (see the sketch after this list).
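
A minimal sketch of what such a script can look like is shown below. It assumes the `crawlbase` library's `CrawlingAPI` class and the Crawling API's `async` option; the tutorial's actual script may pass additional options (for example a scraper/autoparse parameter to get structured output), so treat this as an outline rather than the exact implementation.

```python
import json
from crawlbase import CrawlingAPI

# Initialize the Crawling API client with your Crawlbase token.
api = CrawlingAPI({'token': 'YOUR_API_TOKEN'})

# LinkedIn company page to scrape.
company_url = 'https://www.linkedin.com/company/amazon'

# Send an asynchronous request. Instead of the page body, Crawlbase
# returns a request ID (rid) that is used later to fetch the stored
# result. Further options (e.g. a scraper parameter) may be needed
# for structured output -- check the Crawlbase docs.
response = api.get(company_url, {'async': 'true'})

if response['status_code'] == 200:
    # The body is a small JSON document such as {"rid": "..."}.
    print(json.loads(response['body']))
else:
    print('Request failed with status', response['status_code'])
```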
Run the script:

```bash
python linkedin_company_scraper.py
```

Example output:

```json
{
  "rid": "f270321bbebe203b43cebedd"
}
```

File: `linkedin_company_retrieve.py`
- Retrieves and prints the stored company page data using the `rid`.
- Replace `YOUR_API_TOKEN` and `RID` in the script (see the sketch after this list).
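
Again, a minimal sketch rather than the actual script. It assumes the `crawlbase` library's `StorageAPI` class can fetch a stored response by its RID; the exact `get()` signature is an assumption, so verify it against the version of the library you have installed.

```python
import json
from crawlbase import StorageAPI

# Initialize the Storage API client with the same Crawlbase token.
api = StorageAPI({'token': 'YOUR_API_TOKEN'})

# RID printed earlier by linkedin_company_scraper.py.
RID = 'f270321bbebe203b43cebedd'

# Fetch the stored response for this request ID.
# NOTE: passing the RID directly to get() is an assumption --
# confirm the call signature in the crawlbase library docs.
response = api.get(RID)

if response['status_code'] == 200:
    data = json.loads(response['body'])
    print(json.dumps(data, indent=2))
else:
    print('Retrieval failed with status', response['status_code'])
```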
Run the script:

```bash
python linkedin_company_retrieve.py
```

Example output:

```json
{
  "title": "Amazon",
  "headline": "Software Development",
  "cover_image": "https://media.licdn.com/dms/...",
  "company_image": "https://media.licdn.com/dms/...",
  "url": "https://www.linkedin.com/company/amazon",
  ...
}
```

Future improvements:

- Support for scraping multiple company pages
- Export company data to CSV/JSON
- Add CLI options for input/output
- Implement retry and error-handling logic (see the illustrative sketch after this list)
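
As an illustration of the first and last items above, a batch loop with a simple retry wrapper could look roughly like the following. The URL list and helper name are hypothetical, and the async/RID flow shown earlier still applies to each page.

```python
import time
from crawlbase import CrawlingAPI

api = CrawlingAPI({'token': 'YOUR_API_TOKEN'})

# Hypothetical list of company pages to queue for scraping.
COMPANY_URLS = [
    'https://www.linkedin.com/company/amazon',
    'https://www.linkedin.com/company/microsoft',
]

def request_with_retry(url, retries=3, delay=5):
    """Send an async Crawling API request, retrying on failure."""
    for attempt in range(1, retries + 1):
        response = api.get(url, {'async': 'true'})
        if response['status_code'] == 200:
            return response['body']  # e.g. b'{"rid": "..."}'
        print(f'Attempt {attempt} for {url} failed '
              f'({response["status_code"]}); retrying in {delay}s...')
        time.sleep(delay)
    return None

for url in COMPANY_URLS:
    print(url, '->', request_with_retry(url))
```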

Use cases:

- Research competitors and market trends
- Monitor public-facing company updates
- Build datasets for lead generation and analytics