forked from mordax7/flathunter
Hi,
I am trying to run flathunter on immoscout24 using imagetyperz. I run into the following issue:
$ pipenv run python3 flathunt.py
[2023/01/25 21:04:20|config.py |INFO ]: Using config path /home/max/flathunter/config.yaml
[2023/01/25 21:04:20|chrome_wrapper.py |INFO ]: Initializing Chrome WebDriver for crawler...
[2023/01/25 21:04:21|patcher.py |INFO ]: patching driver executable /home/max/.local/share/undetected_chromedriver/9418e1b60bf980e1_chromedriver
[2023/01/25 21:04:33|abstract_crawler.py |INFO ]: Timeout waiting for iframe element - no captcha verification necessary?
[2023/01/25 21:04:33|crawl_immobilienscout.py|WARNING ]: Unable to find IS24 variable in window
[2023/01/25 21:04:33|crawl_immobilienscout.py|ERROR ]: IS24 bot detection has identified our script as a bot - we've been blocked

What I think is weird is this: if I do not pass "--headless" as a driver_argument, a Chromium window opens. This window has the immoscout bot detection page loaded. If I copy the URL from that window and open it in a new tab in Chromium, I get the same page, but this time with the captcha.

Is this because immoscout24 classified me as a bot, or is there something else going on?
This is my config.yaml:
loop:
    active: yes
    sleeping_time: 600
urls:
    - https://www.immobilienscout24.de/Suche/de/berlin/berlin/wohnung-mieten?enteredFrom=one_step_search
filters:
blacklist:
    - Innenstadt
durations:
    - name: John
      destination: Hauptbahnhof, München
      modes:
          - gm_id: transit
            title: "Öff."
          - gm_id: bicycling
            title: "Rad"
    - name: Jane
      destination: Karlsplatz, München
      modes:
          - gm_id: transit
            title: "Öff."
          - gm_id: driving
            title: "Auto"
message: |
    {title}
    Zimmer: {rooms}
    Größe: {size}
    Preis: {price}
    Ort: {address}
    {url}
google_maps_api:
    key: YOUR_API_KEY
    url: https://maps.googleapis.com/maps/api/distancematrix/json?origins={origin}&destinations={dest}&mode={mode}&sensor=true&key={key}&arrival_time={arrival}
    enable: False
captcha:
    imagetyperz:
        token: 4B59D2B4CC6B4DE0AFC09D310F77D8CE
    # 2captcha:
    #     api_key: alskdjaskldjfklj
    driver_arguments:
        - "--no-sandbox"
        - "--disable-gpu"
        - "--remote-debugging-port=9222"
        - "--disable-dev-shm-usage"
        - "window-size=1024,768"
notifiers:
    - telegram
    # - mattermost
    # - apprise
telegram:
    bot_token: (censored)
    notify_with_images: true
    receiver_ids:
        - (censored)

Thank you so much!