Adventures in Task Cluster – Running tests locally

There is a lot of promise that Taskcluster (the replacement for Buildbot in our CI system at Mozilla) will be the best thing since sliced bread.  One of the deliverables on the Engineering Productivity team this quarter is to stand up the Linux debug tests on Taskcluster in parallel to running them normally via Buildbot.  Of course, next quarter it would be logical to turn off the Buildbot tests and run tests via Taskcluster only.

This post will outline some of the things I did to run the tests locally.  What is neat is that we run the Taskcluster jobs inside a Docker image (yes, this is Linux only), and we can download the exact OS container and configuration that runs the tests.

I started out with a try server push which generated some data and a lot of failed tests.  Sadly I found that the Treeherder integration was not really there for results.  We have a fancy popup in Treeherder when you click on a job, but for Taskcluster jobs all you get is a link to inspect the task.  Inspecting a task takes you to a Taskcluster-specific page with information about the task.  In fact, you can watch a test run live (at least from the log output point of view).  In this case my test job is completed and I want to see the errors in the log, so I can click on the link for live.log and search away.  The other piece of critical information is the ‘Task’ tab at the top of the inspect task page.  There you can see details about the docker image used, what binaries and other files were used, and the golden nugget at the bottom of the page: the “Run Locally” script!  You can cut and paste this script into a bash shell and theoretically reproduce the exact same failures!

As you can imagine, this is exactly what I did, and it didn’t work!  Luckily there were a lot of folks in the #taskcluster channel to help me get going.  The problem I had was that I didn’t have a v4l2loopback device available.  This is interesting because many of our unittests need one, and it means that the host operating system running docker has to provide video/audio devices for the docker container to use.  Now it is time to hack this up a bit; let me start:

First, let’s pull down the docker image used (from the “Run Locally” script):

docker pull 'taskcluster/desktop-test:0.4.4'

Next, let’s prepare my local host machine by installing and setting up v4l2loopback:

sudo apt-get install v4l2loopback-dkms

sudo modprobe v4l2loopback devices=2
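To confirm the loopback devices were actually created, listing the video devices on the host should now show them (the device numbers will vary depending on what other capture devices the machine already has):

ls -l /dev/video*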

Now we can try to run docker again, this time adding the --device option:

docker run -ti \
  --name "${NAME}" \
  --device=/dev/video1:/dev/video1 \
  -e MOZILLA_BUILD_URL='https://queue.taskcluster.net/v1/task/c7FbSCQ9T3mE9ieiFpsdWA/artifacts/public/build/target.tar.bz2' \
  -e MOZHARNESS_SCRIPT='mozharness/scripts/desktop_unittest.py' \
  -e MOZHARNESS_URL='https://queue.taskcluster.net/v1/task/c7FbSCQ9T3mE9ieiFpsdWA/artifacts/public/build/mozharness.zip' \
  -e GECKO_HEAD_REPOSITORY='https://hg.mozilla.org/try/' \
  -e MOZHARNESS_CONFIG='mozharness/configs/unittests/linux_unittest.py mozharness/configs/remove_executables.py' \
  -e GECKO_HEAD_REV='5e76c816870fdfd46701fd22eccb70258dfb3b0c' \
  taskcluster/desktop-test:0.4.4
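One aside: if you exit the container’s shell, you do not have to start over with docker run; assuming the container created above still exists, something like this should drop you back into it:

docker start -ai "${NAME}"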

Now when I run the test command, I don’t get v4l2loopback failures!

bash /home/worker/bin/test.sh --no-read-buildbot-config '--installer-url=https://queue.taskcluster.net/v1/task/c7FbSCQ9T3mE9ieiFpsdWA/artifacts/public/build/target.tar.bz2' '--test-packages-url=https://queue.taskcluster.net/v1/task/c7FbSCQ9T3mE9ieiFpsdWA/artifacts/public/build/test_packages.json' '--download-symbols=ondemand' '--mochitest-suite=browser-chrome-chunked' '--total-chunk=7' '--this-chunk=1'

In fact, I get the same failures as I did when the job originally ran 🙂  This is great, except that I don’t have an easy way to run a test by itself, debug it, or watch the screen.  Let me go into a few details on that.

Given a failure in browser/components/search/test/browser_searchbar_keyboard_navigation.js, how do we get more information on that?  Locally I would do:

./mach test browser/components/search/test/browser_searchbar_keyboard_navigation.js

Then I would at least see if anything looks odd in the console, on the screen, etc.  I might look at the test and see where it is failing to give me more clues.  How do I do this in a docker container?

The command above to run the tests calls test.sh, which then calls test-linux.sh as the user ‘worker’ (not as user root).  It is important that we use the ‘worker’ user because the pactl program used to find audio devices will fail as root.  What happens next is that we set up the box for testing, including running pulseaudio, Xvfb, compiz (after bug 1223123), and bootstrapping mozharness.  Finally we call the mozharness script to run the job we care about, in this case ‘mochitest-browser-chrome-chunked’, chunk 1.  It is important to follow these details because mozharness downloads all the python packages, tools, firefox binaries, other binaries, test harnesses, and tests, then creates a python virtualenv to set up the python environment, unpacking all the files into the proper places.  Now mozharness can call the test harness (python runtests.py --browser-chrome …).  Given this overview of what happens, it seems as though we should be able to run:

test.sh <params> --test-path browser/components/search/test
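For reference, here is my rough mental model of that call chain (the script names come from the description above; the arguments and paths inside the scripts are illustrative, not copied from them):

# test.sh  ->  test-linux.sh   (run as the 'worker' user, not root)
#   set up the box: pulseaudio, Xvfb, compiz; bootstrap mozharness
#   -> python mozharness/scripts/desktop_unittest.py --mochitest-suite=browser-chrome-chunked ...
#        download firefox, tests, tools; create and populate the virtualenv
#        -> python runtests.py --browser-chrome ...   (the actual mochitest harness)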

The reason this --test-path idea doesn’t work is that mozharness has no method for passing in a directory or single test, let alone doing other simple things that |./mach test| allows.  In fact, in order to run this single test, we need to:

  • download Firefox binary, tools, and harnesses
  • unpack them (in all the right places)
  • setup the virtual env and install needed dependencies
  • then run the mochitest harness with its dirty dozen of arguments (just too many to memorize)

Of course most of this is scripted, so how can we take advantage of our scripts to set things up for us?  What I did was hack test-linux.sh locally to echo the mozharness command instead of running it, and likewise hack the mozharness script to echo the test harness call instead of invoking it (there is a sketch of that hack after the list below).  Here are the commands I ended up using:

  • bash /home/worker/bin/test.sh --no-read-buildbot-config '--installer-url=https://queue.taskcluster.net/v1/task/c7FbSCQ9T3mE9ieiFpsdWA/artifacts/public/build/target.tar.bz2' '--test-packages-url=https://queue.taskcluster.net/v1/task/c7FbSCQ9T3mE9ieiFpsdWA/artifacts/public/build/test_packages.json' '--download-symbols=ondemand' '--mochitest-suite=browser-chrome-chunked' '--total-chunk=7' --this-chunk=1
  • #now that it failed, we can do:
  • cd workspace/build
  • . venv/bin/activate
  • cd ../build/tests/mochitest
  • python runtests.py --app ../../application/firefox/firefox --utility-path ../bin --extra-profile-file ../bin/plugins --certificate-path ../certs --browser-chrome browser/browser/components/search/test/
  • # NOTE: you might not want --browser-chrome or the specific directory, but you can adjust the parameters used
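For reference, the “echo instead of run” hack mentioned above amounted to nothing more than this kind of change (illustrative only; the real invocation lines in test-linux.sh and the mozharness script look different):

# in test-linux.sh, instead of executing the mozharness script:
#   python mozharness/scripts/desktop_unittest.py "$@"
# just print it:
#   echo python mozharness/scripts/desktop_unittest.py "$@"
# and do the equivalent for the test harness call inside the mozharness script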

This is how I was able to run a single directory, and then a single test.  Unfortunately that just proved that I could hack around the test case a bit and look at the output.  In docker there is no simple way to view the screen.   To solve this I had to install x11vnc:

apt-get install x11vnc
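Depending on the state of the package index inside the image, you may need to refresh it first (and the install has to happen as root inside the container), something along these lines:

apt-get update && apt-get install -y x11vnc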

Assuming the Xvfb server is running, you can then do:

x11vnc &
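If x11vnc does not find the display on its own, pointing it at the display Xvfb is serving may help (the display number here is a guess; it depends on how Xvfb was started in the container):

x11vnc -display :0 -forever &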

This allows you to connect to the docker container with VNC!  The problem is you need the IP address, which I get from the host by doing:

docker ps #find the container id (cid) from the list

docker inspect <cid> | grep IPAddress
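Alternatively, docker can print just the address using a format template (same <cid> as above):

docker inspect --format '{{ .NetworkSettings.IPAddress }}' <cid>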

For me this is 172.17.0.64, and now from my host I can do:

xtightvncviewer 172.17.0.64

This is great as I can now see what is going on with the machine while the test is running!

This is it for now.  I suspect in the future we will make this simpler by:

  • allowing mozharness (and the test.sh/test-linux.sh scripts) to take a directory or single test instead of only suite-level args
  • creating a simple bootstrap script that allows for running ./mach style commands and installing tools like x11vnc
  • figuring out how to run a local objdir in the docker container (I tried mapping in my objdir, but hit GLIBC issues because the container is based on Ubuntu 12.04)

Stay tuned for my next post on how to update your own custom TaskCluster image.  Yes, it is possible if you are patient.


2 responses to “Adventures in Task Cluster – Running tests locally”

  1. jopjopsen

    So an alternative to using vnc would be to map your X display into the docker container. There are images and guides out there for running GUI things under docker. Granted, you probably want to be using the same display server as runs in automation…

    Hmm, maybe one could develop a tool to dump screenshots from a framebuffer inside a container.

    Regarding v4l loopback, I don’t see a good solution to this… Perhaps a vagrant box with all of this already set up would be nice. There are a few special features like this that don’t just magically happen; the “run locally” script is still experimental, and hopefully we improve it or at least warn if the task uses a special device or kernel module.

    One of the things I dream of is exposing vnc from the containers we run in the cloud. That could be fun for debugging 🙂
