Selenium RC
off of your desktop
  and onto the grid
Jennifer Bevan & Jason Huggins
  {jbevan, hugs}@google.com
         Google, Inc.
        August 24, 2007
    What we have to share…
    • Experience report on extending Selenium RC from desktop
      use to running within a grid.
    • Demo: “gridified” (© jason) Selenium RC for the masses.
       – You don’t have to be Google to parallelize your Selenium tests
    • Future directions & addressing needs.
2
    Background
    • Selenium RC (org.openqa.selenium) uses javascript to interact
      with browsers.
       – Provides ability to inspect contents of pages programmatically.
       – Executes only within the scope of a browser session.
           • Can’t handle all modal dialog windows:
                – File upload
                – File download
                – Browser crash report
       – Supports injection “of large chunks of badly written javascript”
         [S. Stewart, GTAC’07] into the page.
    • Google has created a “farm” of Selenium RC machines that
      facilitates parallelized testing of multiple configurations.
3
    High-level Architecture
4
    A couple of usage statistics…
    • Over 10 projects’ automated/continuous test systems, and many
      (uncounted) individual users.
    • Gmail tests, running at 1 thread/test, went from taking over
      40 minutes to 3.5 minutes.
5
    Experience Report
    • We discovered many different issues while deploying and
      during adoption of the Google Selenium RC farm.
       – Browser/OS variances in configuration and capabilities
       – Reliability and scalability issues
       – Limitations that are out of the execution context of Selenium
         RC
    • Changes we make to Selenium RC to support our needs get
      propagated to the open source repository.
6
    Browser/OS Issues
    • Top issue: test isolation
       – Firefox: RC generates a user profile specific to a test session.
       – IE: RC modifies registry settings and configures the LAN
          connection settings directly.
    • Therefore, either IE tests must be externally isolated
      (individual VMs, single-tracked, etc.) or RC needs to manage
      access to the shared resources.
       – The latter approach is not used in the current RC code.
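One way to picture "managing access to the shared resources" is a per-machine lock that lets only one IE session at a time touch the shared settings. This is a minimal sketch, not RC code: the names `ie_session` and `run_ie_tests` are hypothetical, and a real version would change registry and proxy settings where the comment indicates.

```python
import threading
from contextlib import contextmanager

# Hypothetical: one lock per machine guarding the shared IE settings.
_ie_settings_lock = threading.Lock()


@contextmanager
def ie_session(session_name, log):
    """Serialize sessions that share machine-wide IE settings."""
    with _ie_settings_lock:
        log.append(("start", session_name))
        try:
            yield session_name
        finally:
            log.append(("stop", session_name))


def run_ie_tests(names):
    """Run each named test in its own thread; return the start/stop log."""
    log = []

    def worker(name):
        with ie_session(name, log):
            pass  # a real test would configure IE and drive the browser here

    threads = [threading.Thread(target=worker, args=(n,)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Because the lock is held from start to stop, sessions never interleave, at the cost of running IE tests one at a time on that machine.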
7
    Browser/OS Issues (cont.)
    • Javascript evaluation within the browser
       – Firefox (*chrome) allows tests to bypass security checks in
          specific situations
       – IE (*iehta) is expected to perform similarly, but is
          experimental within Selenium RC.
           • Therefore, we’ve only deployed *iexplore
    • Basic tests (no https, etc.) are not a problem.
    • Complex tests that work in *chrome are not immediately
      usable on *iehta.
    • Limits ability to immediately test multiple configurations
      (requires test modification)
8
    Reliability Issues
     • When we initially deployed our Selenium RC farm, the
       deadlock in the current RC code was not yet known.
        – Concurrently, ongoing work on RC increased the deadlock
           rate, which made the problem more widely known.
     • The deadlock is not related to browser type or test complexity,
       so all users experience spurious test failures.
        – Defensive strategy: retry failed tests
     • Google and the Selenium RC development team are
       actively addressing this problem (top priority).
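The "retry failed tests" strategy can be sketched as a small wrapper. This is an illustration only: `run_with_retry` and its `max_attempts` parameter are hypothetical names, and the test is assumed to signal failure by raising an exception.

```python
def run_with_retry(test_fn, max_attempts=3):
    """Run test_fn up to max_attempts times.

    Returns (passed, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            test_fn()
            return True, attempt
        except Exception:
            # Treat the failure as possibly spurious and try again.
            continue
    return False, max_attempts
```

Retries also hide genuinely flaky tests, so it helps to record how many attempts each test needed.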
9
     Reliability Issues (cont.)
     • Selenium RC has a memory leak and a connection leak.
     • We adopted a two-prong approach for this.
       – Defensive strategy: periodically drain off tests and restart RC
       – Offensive strategy: search and destroy source of leaks.
     • Defensive strategy works, so for right now this is not our top
       priority.
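The drain-and-restart strategy can be sketched as a session counter: after a fixed number of sessions the server stops accepting new ones, waits for active sessions to finish, then restarts. The class name, the `restart` callback, and the `max_sessions` threshold are all assumptions made for illustration.

```python
class RecyclingServer:
    """Hypothetical bookkeeping for draining and restarting a leaky server."""

    def __init__(self, restart, max_sessions):
        self._restart = restart      # callback that restarts the server
        self._max = max_sessions     # sessions to serve between restarts
        self._served = 0
        self._active = 0

    def try_acquire(self):
        """Accept a new session unless the server is draining."""
        if self._served >= self._max:
            return False  # draining: caller should use another server
        self._served += 1
        self._active += 1
        return True

    def release(self):
        """End a session; restart once the last one has drained off."""
        self._active -= 1
        if self._served >= self._max and self._active == 0:
            self._restart()
            self._served = 0
```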
10
     Reliability Issues (more)
     • Current Selenium RC regression test suite contains both
       functional and unit tests.
        – And yet, most of them are “happy path” (© patrick) tests.
        – All of the issues we’ve found were in code that passed the
          RC regression tests.
        – Resulting uncertainty affects ability to quickly deploy new
          versions (with new RC features) on the farm.
     • We are contributing tests to the open source repository
       that better exercise our new code and the surrounding
       pre-existing code.
11
     Scalability Issues
     • Session identification
        – Current RC code uses a method of identifying sessions that is
           based solely on time.
            • For our purposes, this was not unique enough.
            • A patch to improve uniqueness has been made and will be
              contributed to the open source repository.
     • Multiple tests per RC instance
       – Unofficial assumption by RC development team that each
           session is in an isolated VM.
            • We are considering tradeoffs between this model and adding
              full support for concurrent tests in one machine.
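To see why a time-only session ID is not unique enough on a grid, compare it with an ID that adds a random component. This is a sketch under stated assumptions: the millisecond resolution and both function names are hypothetical, and this is not the patch mentioned above.

```python
import uuid


def time_based_id(now_ms):
    """Time-only ID: two sessions created in the same millisecond collide."""
    return str(now_ms)


def unique_id(now_ms):
    """Time plus a random component; collisions become very unlikely."""
    return f"{now_ms}-{uuid.uuid4().hex}"
```

Two sessions started at the same instant get the same `time_based_id` but different `unique_id` values.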
12
     What about Performance?
     • Sure -- making RC faster would be great.
        – We’re focusing on fixes that affect RC’s “gridability”: reliability and
          scalability
        – Also, the point of creating a grid is to make all of your tests run in
          the amount of time it takes your longest test to run.
            • At which point performance fixes get more attention.
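The claim that a grid reduces total run time to that of the longest test can be checked with a small calculation. The durations in the usage note are made-up numbers, and the greedy longest-first assignment is just one simple scheduling choice, not necessarily how the farm assigns tests.

```python
def sequential_time(durations):
    """Total time on one server running tests one after another."""
    return sum(durations)


def grid_time(durations, servers):
    """Wall-clock time with greedy longest-first assignment.

    Each test goes to the currently least-loaded server.
    """
    loads = [0.0] * servers
    for d in sorted(durations, reverse=True):
        i = loads.index(min(loads))
        loads[i] += d
    return max(loads)
```

For hypothetical durations `[4, 3, 2, 1]`, one server needs 10 time units, two servers need 5, and four servers need 4, which is the length of the longest test.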
13
     Other contributions…
     • Exposed bug in handling of InterruptedException within
       Selenium RC.
        – A waitFor…(elementName, 45000) statement “timed out” in
          under 4 seconds, not 45 seconds.
        – Added fix and tests for timeout calculation.
     • Exposed tendency of Selenium RC to leave browsers open
       after test session has ended.
        – Usually when a test enters a deadlock or when a javascript “eval”
          stops responding.
        – Current effort focused on eliminating deadlock; followup work will
          strengthen browser shutdown method.
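A common way to keep a wait from "timing out" early is to fix a deadline up front and recompute the remaining time after every wake-up, so an interrupted or shortened sleep never counts as a full interval. This is a generic sketch of that technique, not the fix that went into RC; `wait_for` and its parameters are hypothetical, and the `clock` and `sleep` arguments exist only to make it testable.

```python
import time


def wait_for(condition, timeout_s, poll_s=0.01,
             clock=time.monotonic, sleep=time.sleep):
    """Poll condition until it is true or timeout_s has really elapsed."""
    deadline = clock() + timeout_s
    while True:
        if condition():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # sleep may return early; the loop rechecks the deadline
        # instead of assuming the full interval passed.
        sleep(min(poll_s, remaining))
```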
14
     Out-of-scope RC limitations
     • A grid of Selenium RC machines is not something you want to
       maintain manually.
        – So if IE “encounters an error” and gets wedged to the point
          where no new IE session can be started… do you really
          want to VNC in and click “Don’t Send” to clear the IE state?
     • Out-of-band communication with a farm manager to handle
       platform-specific browser error handling is necessary for large-
       scale deployment.
        – And once you have the requisite ‘watchdog’ client on the RC
          machine, you can also use it for metric collection, managing
          configuration recovery, etc…
15
     (and now we switch to Jason…)
16
     Which is faster? (1 server, sequential tasks)
     [Chart: servers vs. time]
17
     Which is faster? (4 servers, parallel tasks)
     [Chart: servers vs. time]
18
     How do you add more servers?
       Thank this dude
19
     Make computing a true utility (for anyone)
20
     Pricing is cheap, but it’s not free.
     $0.10 per hour == ~$74 per month
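The monthly figure follows from the hourly rate if one machine runs around the clock. The sketch below assumes a 31-day month and a single always-on machine; neither assumption is stated on the slide.

```python
HOURLY_RATE_USD = 0.10  # rate quoted on the slide


def monthly_cost(hourly_rate, days=31, hours_per_day=24):
    """Cost of one machine running continuously for a month."""
    return hourly_rate * hours_per_day * days
```

With these assumptions, `monthly_cost(HOURLY_RATE_USD)` comes to about 74.4, and a 30-day month gives about 72.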
21
     Demo…
22
     Questions?
     1. Are you going to open-source this?
     2. Who is Paul Hammant?
        (http://paulhammant.com/)
23
     Creator of Selenium Driven aka “Remote Control”
24
     Thank you, Paul. :-)
25