URL crawler – Sample Python application

A simple service that crawls URLs and extracts the HTML <title>.

This has two components:

  • a web server, which serves HTTP requests and schedules crawling tasks
  • a worker, which executes crawling tasks
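The crawling step performed by the worker can be sketched as follows. This is a minimal standard-library version, assuming the task boils down to "fetch the page, return the contents of the first <title> element"; the actual repository may use different libraries, and the names `extract_title` and `crawl` are illustrative.

```python
# Minimal sketch of the crawl step: fetch a URL and extract the HTML <title>.
# Standard library only; names here are illustrative, not the repo's API.
from html.parser import HTMLParser
from urllib.request import urlopen


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_title(html: str) -> str:
    """Return the trimmed <title> text, or an empty string if absent."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


def crawl(url: str) -> str:
    """Fetch the URL and return its page title."""
    with urlopen(url) as response:
        return extract_title(response.read().decode("utf-8", errors="replace"))
```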

For scheduling asynchronous tasks, this application uses Celery, which relies on a message broker such as RabbitMQ or Redis.
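A hedged sketch of how the Celery wiring might look. The module name `tasks` matches the worker invocation below and the environment variable names match the ones used in "Running the application", but the task itself is a placeholder assumption, not the repository's actual code:

```python
# tasks.py -- illustrative Celery wiring (assumed, not the repo's exact code).
import os

from celery import Celery

# Broker and result backend come from the environment, with the Redis
# defaults used in the "Running the application" section as a fallback.
app = Celery(
    "tasks",
    broker=os.environ.get("CELERY_BROKER_URL", "redis://localhost:6379/0"),
    backend=os.environ.get("CELERY_RESULT_BACKEND", "redis://localhost:6379/1"),
)


@app.task
def crawl(url: str) -> str:
    """Fetch the URL and return its HTML <title> (placeholder body)."""
    raise NotImplementedError
```

The web server would then enqueue work with `crawl.delay(url)` and the worker process would execute it, with the result stored in the configured backend.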

Installing the application dependencies

This project uses uv for Python project management. Make sure uv is installed, then run:

uv sync

Running the application

  1. First, you need a working Redis instance. You can start one using Docker:

    docker run -d --name redis -p 6379:6379 redis:latest
  2. Set the CELERY_BROKER_URL and CELERY_RESULT_BACKEND environment variables to point to your Redis instance:

    export CELERY_BROKER_URL=redis://localhost:6379/0
    export CELERY_RESULT_BACKEND=redis://localhost:6379/1
  3. Run the web server:

    uv run flask run
  4. Run the worker:

    uv run celery -A tasks worker

Interesting bits of code
