WTJ Scraper

A Python-based web scraper for extracting company information from Welcome to the Jungle (WTJ).

Description

This scraper automates the collection of company profiles from Welcome to the Jungle, including:

- Company names and locations
- Official websites and social media links
- Industry sectors
- Company descriptions and presentations
- Recruitment information
- Additional company insights

Installation

Clone the repository:

git clone https://github.com/zaidkx7/WTJ_Scrapper.git
cd WTJ_Scrapper

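Create and activate a virtual environment:

python -m venv .venv
.venv\Scripts\activate

The activation command above is for Windows. On macOS or Linux, run source .venv/bin/activate instead.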

Install the required dependencies:

pip install -r requirements.txt

Usage

Simply run the main script:

python main.py

The scraper will:

Fetch company data from WTJ
Save the results in two formats:
- JSON file (response/data.json)
- Excel file (response/companies_info.xlsx)
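
To illustrate the JSON half of the save step, here is a minimal sketch using only the standard library. The helper names `save_json` and `load_json` are hypothetical and do not come from main.py, and the record layout (a list of dictionaries) is an assumption; only the `response/data.json` path is taken from this README.

```python
import json
from pathlib import Path


def save_json(records, out_dir="response", filename="data.json"):
    """Write records to <out_dir>/<filename> and return the resulting path.

    Hypothetical helper: the real main.py may organise this differently.
    """
    out_path = Path(out_dir)
    # Create the output directory on first run; do nothing if it exists.
    out_path.mkdir(parents=True, exist_ok=True)
    target = out_path / filename
    with target.open("w", encoding="utf-8") as fh:
        # ensure_ascii=False keeps accented company names human-readable.
        json.dump(records, fh, ensure_ascii=False, indent=2)
    return target


def load_json(path):
    """Read a previously saved JSON file back into Python objects."""
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)
```

Calling `save_json(records)` with the defaults would write to `response/data.json`, and `load_json("response/data.json")` reads the file back for inspection after a run.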

Output

The scraper writes two output files to the response directory:

data.json: Contains all scraped data in JSON format
companies_info.xlsx: An Excel file with formatted columns containing:
- Company name
- Location
- Website
- URL
- Sectors
- Social media links
- Description
- Presentation
- Recruitment information
- Additional insights
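
As a rough illustration of how one scraped record could map onto the spreadsheet columns listed above, the sketch below flattens a dictionary into a row in that column order. The key names in `COLUMNS` and the function `to_row` are assumptions made for this example; the actual keys used in data.json may differ.

```python
# Hypothetical record keys, ordered like the columns listed above.
COLUMNS = [
    "name",
    "location",
    "website",
    "url",
    "sectors",
    "social_links",
    "description",
    "presentation",
    "recruitment",
    "insights",
]


def to_row(record):
    """Flatten one record (a dict) into a list of cell values."""
    row = []
    for key in COLUMNS:
        value = record.get(key, "")
        if isinstance(value, (list, tuple)):
            # Multi-valued fields such as sectors become one comma-separated cell.
            value = ", ".join(str(item) for item in value)
        # Missing or null fields become empty cells.
        row.append("" if value is None else value)
    return row
```

A row built this way is a plain list, so it could be passed to openpyxl's `Worksheet.append` to add one line per company to the Excel file.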

Dependencies

- beautifulsoup4
- openpyxl
- requests

Note

This scraper is for educational purposes only. Please ensure you comply with WTJ's terms of service and robots.txt when using this tool.

License

This project is licensed under the MIT License - see the LICENSE file for details.
