Elias Dorneles edited this page Aug 7, 2015 · 4 revisions

Porting Scrapy to Python 3

Scrapy is being ported to Python 3 using a single-codebase approach. The goal is to make Scrapy work in Python 2.7 and 3.3+. The six library is used to smooth over differences between Python 2.x and 3.x.

Tests for Python 2.7 and 3.3 are executed on Travis CI after each commit. Not all Scrapy tests pass in Python 3, so there is a py3-ignores.txt file with a list of tests excluded in Python 3. It also acts as a TODO list.

To contribute Python 3 support fixes, pick a test from py3-ignores.txt, run it in Python 3 (e.g. tox -e py34 -- tests/test_logformatter.py), check the failures, port the tested module, make sure the tests pass in both Python 2.7 and 3.x, remove the line from py3-ignores.txt, and send a pull request.
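
The last bookkeeping step, removing the ported test from the ignore list, can be sketched as below. This is only an illustration: it assumes py3-ignores.txt holds one test path per line, and both the helper name drop_ported_test and the sample entry tests/test_example.py are made up for the example.

```python
def drop_ported_test(ignore_text, test_path):
    """Return the ignore-list text without the entry for `test_path`.

    Assumes one test path per line (an assumption about the file format).
    """
    kept = [
        line for line in ignore_text.splitlines()
        if line.strip() != test_path
    ]
    return "\n".join(kept) + "\n"


# Made-up file contents, for illustration only.
sample = "tests/test_logformatter.py\ntests/test_example.py\n"
updated = drop_ported_test(sample, "tests/test_logformatter.py")
```

In practice the same edit is usually done by hand in an editor; the point is that each ported test leaves the list in the same pull request as the port itself.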

See also: Contributing to Scrapy.

When porting, please try to follow these rules:

  • Use a single codebase for Python 2.x and 3.x using the six library.
  • All URLs should be "native strings" - bytes in Python 2.x and unicode in Python 3.x.
  • HTTP headers are bytes in Scrapy (both keys and values). This decision has its downsides and is not set in stone, but we needed a decision to move forward.
  • Don't rely only on existing tests to make the port; Scrapy test coverage is good but not perfect. Even if the tests pass, don't consider the related module ported: please don't remove it from py3-ignores.txt without carefully reading the source code of the tested module and the source code of the test itself. Porting a module to Python 3 often requires writing extra tests.
  • Try to make sure your pull requests are focused and don't contain unrelated changes. If you've ported several modules, it is better to make several pull requests. Reviewing Python 3 porting changes is hard because we can't rely only on tests; one has to check all the related modules, so it is much easier to review and merge a small pull request.
  • It is OK not to port deprecated Scrapy features to Python 3 and skip the related tests for Python 3.
  • It is OK to make a feature optional if it depends on a part of Twisted which is not ported yet. Of course, contributing to Twisted is welcome. See also: PY3: Twisted Dependencies.
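
The string rules above (native-string URLs, bytes headers) can be illustrated with a minimal sketch. The helper names to_unicode, to_bytes and to_native_str are hypothetical and not taken from this page. To stay dependency-free the sketch defines the version flag and type aliases by hand; in ported code they would normally come from six as six.PY2, six.text_type and six.binary_type.

```python
import sys

PY2 = sys.version_info[0] == 2        # six exposes this as six.PY2
text_type = unicode if PY2 else str   # same idea as six.text_type
binary_type = str if PY2 else bytes   # same idea as six.binary_type


def to_unicode(value, encoding="utf-8"):
    """Return `value` as text, decoding it if it is bytes."""
    if isinstance(value, text_type):
        return value
    if isinstance(value, binary_type):
        return value.decode(encoding)
    raise TypeError("expected text or bytes, got %s" % type(value).__name__)


def to_bytes(value, encoding="utf-8"):
    """Return `value` as bytes, encoding it if it is text."""
    if isinstance(value, binary_type):
        return value
    if isinstance(value, text_type):
        return value.encode(encoding)
    raise TypeError("expected text or bytes, got %s" % type(value).__name__)


def to_native_str(value, encoding="utf-8"):
    """Return a "native string": bytes on Python 2, unicode on Python 3."""
    if PY2:
        return to_bytes(value, encoding)
    return to_unicode(value, encoding)


# URLs are native strings; header keys and values are bytes.
url = to_native_str(b"http://example.com/page")
header_name = to_bytes(u"Content-Type")
header_value = to_bytes(u"text/html")
```

On both Python 2.7 and 3.x, url ends up as an instance of the built-in str type, while header_name and header_value are bytes. UTF-8 as the default encoding is an assumption of the sketch, not a rule stated here.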
