Tag Archives: xpcshell

converting xpcshell from listing directories to a manifest

Last year we ventured down the path of adding test manifests for xpcshell in bug 616999. Finding a manifest format is not easy because there are plenty of objections to any given format's syntax and relevance to the project at hand. At the end of the day, we depend too much on our build system to filter tests, and beyond that we have hardcoded data in tests or harnesses to run or skip them based on certain criteria. So for xpcshell unittests, we have added a manifest so we can start to keep track of all these tests and not depend on iterating directories and sorting or reverse sorting head and tail files.

The first step is to get a manifest format for all existing tests. This landed today in bug 616999 and is currently on mozilla-central. It requires that every test file in a directory be listed in the manifest and that every entry in the manifest correspond to a file in the directory (verified at make time). Basically, if you do a build, it will error out if you forget to add a manifest or forget to add a test file to the manifest. Pretty straightforward.
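For the curious, here is a minimal sketch of what that make-time verification amounts to (the real check lives in the build system; the function below and its naming conventions are assumptions for illustration):

import os

def verify_manifest(test_dir, manifest_entries):
    # every test_*.js on disk must be listed, and every listed entry must exist
    on_disk = set(f for f in os.listdir(test_dir)
                  if f.startswith("test_") and f.endswith(".js"))
    listed = set(manifest_entries)
    errors = []
    for name in sorted(on_disk - listed):
        errors.append("%s exists but is not in the manifest" % name)
    for name in sorted(listed - on_disk):
        errors.append("%s is in the manifest but does not exist" % name)
    return errors

In the sketch, a non-empty list is what would make the build error out.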

The manifest we have chosen is the ini format from mozmill. We found that there is no silver bullet for a perfect test manifest, which is why we chose an existing format that met the needs of xpcshell. It is easy to hand edit (as opposed to json) and easy to parse from python and javascript. Compared to reftests, which have a custom manifest format, we just needed a list of test files and, more specifically, a way to associate head and tail script files with them (not easy with reftest manifests). The format might not work for everything, but it gives us a second format to work with depending on the problem we are solving.
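To give a feel for the format, here is a rough sketch of what a manifest might look like and how easily it can be read from python (the real harness has its own manifest parser; the test names and head/tail keys below are illustrative assumptions):

# Rough sketch only: an ini-style manifest read with the standard-library
# configparser.  Test names and keys are invented for illustration.
import configparser

SAMPLE_MANIFEST = """
[DEFAULT]
head = head_common.js
tail =

[test_cache.js]

[test_offline.js]
head = head_offline.js
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_MANIFEST)

for test in parser.sections():
    # values in [DEFAULT] apply to every test unless overridden per section
    print(test, "-> head:", parser.get(test, "head"))

In this sketch, the [DEFAULT] section is what carries the shared head file to every test, which is exactly the head/tail association that was awkward to express before.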


updated status of the winmo automation project

A couple weeks ago, I posted an initial status of the winmo automation project. Here is an update of where we are.

Only a few patches have landed since the last update, but a lot of reviews have happened. Great progress has been made to resolve some of the unknown issues with running xpcshell and reftest remotely and everything is truly up for review. I expect next week at this time to have a much shorter list of patches.


status of the winmo automated tests project

I have been posting about this project for a while, so I figured I should give an update. Currently patches are landing and we are starting to get the final set of patches ready for review.

  • Talos: This was the first part of this project and we have checked in 3 of the 4 patches to get Talos TS running. There is 1 patch remaining, which I need to upload for review.
  • Mochitest: There are 4 patches required for this to work:
    1. Fix tests to not use hardcoded localhost – early review stages
    2. Add CLI options to mochitest for remote webserver – I need to cleanup my patch for review, at the end game
    3. Add devicemanager.py to the source tree – review started, waiting on sutagent.exe to resolve a few minor bugs
    4. Add runtestsremote.py to the source tree – review process started, waiting on other patches

    The good news is that all 4 patches are at the review stage.

  • Reftest: This requires 4 patches (1 is devicemanager.py from mochitest)
    1. Modify reftest.jar to support http url for manifest and test files – up for review
    2. Refactor runreftests.py – up for review
    3. Add remotereftests.py to source tree – needs work before review, but WIP posted

    Keep in mind we are still blocked here on registering the reftest extension. I also have instructions for how to set up and run this.

  • Xpcshell: This requires 3 patches (1 is devicemanager.py) and is still in the WIP stages. There are two pieces to this that we still need to resolve: copying the xpcshell data over to the device and setting up a webserver to serve pages. Here are the two patches to date:
    1. Refactor runxpcshelltests.py to support subclass for winmo – WIP patch posted, close to review stage
    2. Add remotexpcshelltests.py to source tree – WIP patch posted

    I have written some instructions on how to run xpcshell tests on winmo if you are interested.

Stay tuned for updates when we start getting these patches landed and resolving some of our device selection/setup process.


first round of new test harness code has landed

Back in October, I started working on code to run the unittests on Windows CE + Mobile. This is an ongoing project, but I am starting to get the ball rolling in the right direction finally.

Today I checked in my first patch (actually a set of 2, and it didn’t get backed out this time), which converts the bulk of the python test harness code to be OO instead of raw scripts.

This is sort of a halfway point in the code that needs to get checked into mozilla-central in order for us to be running test automation on a Windows Mobile phone. Big thanks to Ted for reviewing all my patches and to Clint for helping me test and do the actual checkin.
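To show why the OO structure matters for the mobile work (this is only a sketch with made-up class and method names, not the actual harness code), the idea is that a remote platform can subclass the desktop harness and override just the platform-specific pieces:

import subprocess

class XPCShellTests(object):
    # desktop harness: builds a command line and launches xpcshell locally
    def build_command(self, test_file):
        return ["xpcshell", "-e", 'const _TEST_FILE = ["%s"];' % test_file]

    def launch(self, cmd):
        return subprocess.call(cmd)

    def run(self, tests):
        for test in tests:
            self.launch(self.build_command(test))

class RemoteXPCShellTests(XPCShellTests):
    # a remote platform overrides only the launch step (e.g. via a device agent)
    def launch(self, cmd):
        print("would run on device:", cmd)
        return 0

This is the pattern the later "subclass for winmo" patches build on.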

NOTE: I originally wrote this Jan 7th, and it finally made it in today :)


Making mobile automation development more efficient

In a recent discussion with ctalbert, we discussed the right course to take for getting automation running on Windows Mobile and how we could avoid this 6-month nightmare for the next platform. The big question we want to answer is: how can we implement our automation for talos and unittests on Windows Mobile and not have to reinvent the wheel for the next platform?

Our current approach to date has been to take what we have for Firefox (which works really well) and adapt it to work on a mobile device. This approach has not been very successful for the development of Fennec (functionality, UI, performance), nor for the automation of unittests and talos.

After further thought on this subject, I find there are 4 areas to focus on:

  1. Infrastructure and Tools
  2. Porting Harnesses and Tests
  3. Managing Automation and Failures
  4. Mobile Specific Automation

Each of these 4 areas is tightly coupled to the others, yet each requires a very different solution. Let me outline each area in a bit more detail, describing the problem, our solution, and a longer term solution.

Infrastructure and Tools:

This is the area of all the assumptions: the OS you run on, network connectivity, available disk space for test and log files, tools to manage your environment and processes, and a method for doing all of this automatically. I am going to leave out the build and reporting infrastructure from this topic as those take place separately and don’t run on the device.

Our first port of this to maemo was not that difficult because we could use python to manage files, the network, and processes just as we do on a desktop OS. Minor adjustments had to be made for running on storage cards, using different python libraries (we have a limited set available on maemo) and system tools, as well as changing requirements for process names and directory structures. Also, maemo has ssh functionality, a cli terminal, and a full set of command line tools to manage stuff.

Moving on to Windows Mobile, we don’t have the tools and infrastructure like we do on Maemo. This means we need to spend a lot of time hacking together the communications required for automation and scripting tools like python. Almost all process interaction (create, query, kill) needs custom code to take care of it. This has presented us with a problem where we don’t have the luxury of OS and tool support. Our approach to date has been to write (or in the case of python, port) our tools to make them work on the device. Unfortunately, after 4 months we don’t have a working set of automation that people are happy with.

How can we create infrastructure that is scalable to all platforms? From what I have seen, we need to move away from our reliance on having all tools on the device. This means no python or web server on the device, no test data on the device, and we should assume we won’t be able to store log files or use common system tools. I would like to see a custom communication layer for each OS. So for Windows Mobile, we would create a server that lives on the device which ensures we have IP connectivity, provides file system and process management tools, and allows the OS to reboot and come back connected. The other half of this is a job server which sends commands to the device and serves/collects test data via a web interface. I know this is a big step from where we are now, but in the future it seems like the easiest approach.
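To make the idea concrete, here is a rough sketch of the kind of device agent I am describing (the commands, port, and protocol are assumptions invented for illustration, not an existing tool):

# Sketch of a minimal line-based agent that could live on the device: the
# host sends one command per line and reads a reply.  Commands and port
# are invented for illustration only.
import os
import socket
import subprocess

PORT = 30000  # arbitrary choice for this sketch

def handle(line):
    parts = line.strip().split(" ", 1)
    cmd = parts[0]
    arg = parts[1] if len(parts) > 1 else ""
    if cmd == "ls":
        return "\n".join(os.listdir(arg or "."))
    if cmd == "mkdir":
        os.makedirs(arg)
        return "ok"
    if cmd == "exec":
        return str(subprocess.call(arg, shell=True))
    return "unknown command"

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("", PORT))
server.listen(1)
while True:
    conn, _ = server.accept()
    request = conn.makefile().readline()
    conn.sendall((handle(request) + "\n").encode("utf-8"))
    conn.close()

The job server half would then connect to this port to push commands and would serve/collect test data over HTTP.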

Porting Harnesses and Tests:

This focus area is more about making sure the environment is setup correctly, the right tests are run and useful logs are created. This can be developed without a full infrastructure in place, but it really requires some knowledge about the infrastructure.

For Maemo, a lot of work was done to extract the unittests from the source tree and retrofit the tools and harnesses to manage tests in "chunks" rather than running them all at once. We have also done a lot of work to clean up tests that assume certain preferences or that look for specific features.
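For reference, the "chunking" itself is simple; here is a sketch of the idea (option names and the split rule are illustrative, not the exact harness behavior):

# Sketch of chunking: split the full test list into N pieces and run only
# one piece per invocation, so a crash or hang only costs part of a run.
def chunk_tests(tests, total_chunks, this_chunk):
    """Return the 1-based this_chunk-th slice of tests out of total_chunks."""
    per_chunk = (len(tests) + total_chunks - 1) // total_chunks
    start = (this_chunk - 1) * per_chunk
    return tests[start:start + per_chunk]

all_tests = ["test_%02d.js" % i for i in range(1, 11)]
print(chunk_tests(all_tests, total_chunks=4, this_chunk=2))
# -> ['test_04.js', 'test_05.js', 'test_06.js']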

The challenge on Windows Mobile is that, without the infrastructure the tests rely on, we need to do things differently. Very few bugs that prevented tests from running were found while porting tests to Maemo. For WinMo, that is a different story. We cannot run the web server locally, cannot load our mochitests in an iframe, and have trouble creating log files. All of these issues force us to morph our testing even further away from where it was and realize that we need to do this differently.

What I see as the ultimate solution here is to set up a "test server" which serves all our test data. Each test would remove its dependencies on localhost and work with a test server given its IP address. We would then have an extension for Firefox/Fennec which would serve as the test driver to set up, run, and report the test results. This would allow any mobile device (that supports addons), desktop, or community member to install the addon and start testing.
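As a sketch of the test server half (the paths, port, and options are assumptions; a real setup would serve the actual test data), something this small would already let a device point at the host by IP instead of assuming localhost:

# Sketch only: a stand-alone "test server" that serves a directory of test
# data over HTTP so any device or community member can reach it by IP.
import argparse
import functools
from http.server import HTTPServer, SimpleHTTPRequestHandler

parser = argparse.ArgumentParser()
parser.add_argument("--test-root", default="tests",
                    help="directory of test data to serve")
parser.add_argument("--port", type=int, default=8888)
args = parser.parse_args()

handler = functools.partial(SimpleHTTPRequestHandler, directory=args.test_root)
print("serving %s on port %d" % (args.test_root, args.port))
HTTPServer(("", args.port), handler).serve_forever()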

Managing Automation and Failures:

This is a much smaller topic than the previous two, but once we do have data generated, how do we keep generating it reliably and use it to our advantage?

Right now our toolset of Tinderbox and Buildbot does a great job for Firefox. I believe that we can still utilize them for mobile. There might be specific issues with them, but for the purposes of running Talos and unittests, we need something that will take a given build, send it to a machine, run tests, and figure out when it is done. We even have great tools to notify us via IRC when a test suite fails.

The danger here is that when testing on a new device or product, we find hundreds if not thousands of failures. Tracking those down and filing bugs is a big job by itself. When you have a large number of bugs waiting to be fixed, they won’t all get fixed in the same week (or quarter). This brings up a problem where nobody pays attention to the reported data because there are always so many failures.

The other problem is that with so many crashes, and with tests running in smaller chunks or 1 by 1, we end up with smaller log files and lose the ability to get the pass/fail status that our test harnesses for Firefox rely upon. I know simply looking for a TEST-UNEXPECTED string in *.log is a reasonable solution, but as we have learned there are a lot of corner cases, and that doesn’t tell you which tests were not run.

How can we make this better? Our first step to solving this problem is LogCompare. This is a log parser that uploads data to a database and lets us compare build by build. This solves the problem of finding new failures and ignoring known failures if we want to. A final solution would be to expand on this idea and have test runners (via the addon) upload test result blobs to a database. Adding more tools to query the status of a test run, missed tests, etc. can be done, giving us more detailed reports than just pass/fail. In the long term a tool like this can be used to look at random orange results as well as to get status from the many community members pitching in CPU time.
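As an illustration of the log-parsing half of this (the file layout and regex below are assumptions; the real LogCompare stores results in a database rather than diffing log directories):

# Sketch of the LogCompare idea: pull TEST-UNEXPECTED lines out of a run's
# logs, then diff against the previous build's failures so only new
# failures demand attention.
import glob
import re

def failures_from_logs(log_dir):
    failures = set()
    for path in glob.glob(log_dir + "/*.log"):
        with open(path) as log:
            for line in log:
                match = re.search(r"TEST-UNEXPECTED-\S+ \| (\S+)", line)
                if match:
                    failures.add(match.group(1))
    return failures

previous = failures_from_logs("logs/build-41")
current = failures_from_logs("logs/build-42")
print("new failures:", sorted(current - previous))
print("fixed:", sorted(previous - current))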

Mobile Specific Automation:

The last piece of the puzzle is to create specific test cases to exercise mobile features. This is fairly trivial and great work has already been done for bookmarks and tabs using browser-chrome. This is important as the more developers we have working on mobile and the more branches we have, the greater the frequency of regressions.

Here is the problem: the mobile UI is changing so rapidly that keeping up with the automation would be a full time job. This is assuming you have comprehensive tests written. It is actually faster to install a build and poke around the UI for 10 minutes than it is to keep the tests maintained. I know in reality there will be moving pieces on a regular basis, but right now we are ripping big pieces out and rewriting everything else. As a reference point, the tab functionality has changed 4 times in the last year.

Looking at the regressions found in the last couple of weeks, we would not have found those with automation. There is a great list of stuff we want to automate in the Fennec UI, and almost none of those tests would have failed as the result of a recent regression. This means we need many more tests than a few hundred core functionality tests. It also points out that we are not going to catch everything even if we all agree that our tests are comprehensive.

What is the best way to utilize automation? Until a 1.0 release, we have to expect that things will be very volatile. We should fix the automation we currently have and require developers to run it on their desktop or device before checking in. This should be a 1.0 requirement: if a developer is changing functionality, fix the tests. This works because we don’t have a lot of tests right now, so it will serve more to fix the process than to find bugs. Post 1.0, let’s build up the automation to have decent coverage of all the major areas (pan, zoom, tabs, controls, awesomebar, bookmarks, downloads, addons, navigation), and keep the requirement that these tests need to run for each patch we check in. Running the tests on a desktop will be fast, under 5 minutes.

Summary:

While we seem to be flopping around like a fish out of water, we just need some clear focus and agreement from all parties about the direction, and we can have a great solution. My goal is to be forward looking and not dwell on the existing techniques that work yet are still being improved. After looking at this from a future point of view, I see that developing a great solution to meet our needs now can also allow for greater community involvement, leading to greater test coverage for Fennec and Firefox. The amount of work required to generalize this is equivalent to the work required for a specialized Windows Mobile solution.

I encourage you to think about ways we can reduce our test requirements while allowing for greater flexibility.


Tackling the large backlog of failed unittests for Fennec

Most of my posts are related to getting the unittests to run on Fennec, but there has not been much communication about how we are tracking failures and getting the tests to be green (zero failures). Simple explanation: up until now there was no plan.

Last December, I went through every failure and documented what I thought was the problem. I created a little web tool to see the differences and track my bugs. Of course this is a static tool and was a real pain to update with new bugs and tests.

Now it is August, many new failures are occurring, and the old failures are not fixed. I am going to outline an approach to get us to ZERO failures by the end of the year. In order to be successful, we need to reduce the variables as much as possible. This means that we will run Fennec on desktop linux builds in tinderbox per checkin instead of on Maemo! This sets us up for getting green tinderboxes in this environment vs a device (I suspect we will be 90%+ passing when run on a device).

Actions to take:

  1. Start with XPCShell tests first (then do Crashtest, Reftest, Mochitest, Chrome one at a time) and for each failure do the next steps
  2. Reproduce failure (twice)
  3. Reduce testcase (if possible)
  4. File bug/update existing bug, add bug # to a master tracking bug
  5. When done with a specific test harness (XPCShell in this case), meet with the devs to prioritize bugs and get everybody on the same page

This sounds simple but could take a long time. The benefit of tackling the smaller test harnesses first is that we can see progress (list of bugs, green) faster and start keeping those harnesses green.

What this does not do:

  1. Help us track new failures
  2. Get green tinderboxes on Maemo and WinMo
  3. Resolve remote web server related issues
  4. Fix issues when running tests one at a time

Stay tuned for an update when we get our first batch of bugs filed for XPCShell.


how we are running xpcshell unittests on windows mobile

Here is another scrap of information on how our progress is coming along in getting unittests to run on Windows Mobile. A lot has changed in our approach and tools, which has allowed us to make measurable progress towards seeing these run end to end. Here are some of the things we have done.

Last month I discussed launching a unittest on WinCE; taking that further, blassey has compiled python25 with the Windows Mobile compilers and added the pipelib code to allow stdout to be written to a file. This has been a huge step forward and resolves a lot of our problems.
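To illustrate the pattern this enables (a sketch only; the xpcshell expression and file names are placeholders), the harness can now redirect everything a spawned xpcshell prints into a log file and read it back:

import subprocess

cmd = ["xpcshell", "-e", 'print("hello from xpcshell");']
with open("xpcshell.log", "w") as log:
    status = subprocess.call(cmd, stdout=log, stderr=subprocess.STDOUT)

with open("xpcshell.log") as log:
    print("exit status:", status)
    print(log.read())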

I found out that runxpcshelltests.py had undergone a major overhaul and my previous work was not compatible. While working with the new code, I ran into a problem where defining a variable with the -e parameter to xpcshell was not working. To work around this I have changed:
xpcshell.exe -e 'const _HEAD_FILES = ["/head.js", "/head.js"];'

to:
xpcshell.exe -e 'var[_HEAD_FILES]=[["/head.js","/head.js"]];'

This new method is ugly but works. We suspect this is related to how subprocess.Popen handles the command line to execute (it appears to require 2 args: app, argv). Regardless, with a series of additional hacks to runxpcshelltests.py, we can launch xpcshell.exe and get results in a log file.
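For context, here is a rough sketch of how such a command line gets built and handed to subprocess.Popen as a list of arguments (file names are placeholders; this is not the actual runxpcshelltests.py code):

import subprocess

head_files = ["head1.js", "head2.js"]
head_expr = "var[_HEAD_FILES]=[[%s]];" % ",".join('"%s"' % f for f in head_files)

cmd = [
    "xpcshell",
    "-e", head_expr,
    "-e", 'const _TEST_FILE = ["test_example.js"];',
]
# each -e script stays a single argv entry; nothing re-splits it
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
output, _ = proc.communicate()
print(output.decode("utf-8", "replace"))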

Lastly, there is another issue where the lack of support for cwd on Windows Mobile is causing some of our test_update/unit/* tests to fail. There is a workaround in bug 458950 where we can support cwd when it is passed on the command line:
xpcshell.exe --environ:CWD=path

Of course we need to have this code for xpcshell.exe (as noted in bug 503137), not just fennec.exe or xulrunner.exe.

With all the changes above, we can launch our python script (including cli args) via the visual studio debugger or a command line version on our device that is tethered via USB cable and activesync.

Unfortunately this is not complete yet. We have to clean up the python code and make it a patch. That hinges on finding better fixes for the -e parameters. Also, while running on my HTC Touch Pro, the device hangs a lot for various reasons (requiring reseating the battery). Stabilizing this could require a tool change, as well as a different way to run the tests.


Launching a unittest on Windows Mobile

This has turned into a larger problem than just retrofitting some python scripts to work on a windows mobile device. I have had success getting xpcshell and reftest results by fixing up the python scripts. Our biggest hurdle is getting subprocess.Popen functionality on a Windows Mobile device, especially capturing the output from the spawned process. How are we going to resolve this? I don’t know, but here are some ideas in the works:

  1. Write native code to create a process. This is the approach we have been taking so far, using osce to launch a process and collect results in a log file. We can work around this, but there are issues calling these APIs multiple times.
  2. Rewrite subprocess to work with Windows Mobile. This is similar to the osce approach but instead returns handles to the python code to simulate stdout as well as kill the running process. This is a lot more code to write and would only be useful if we can fix the problems with osce. Some effort has been made here, as seen in these notes.
  3. Use a remote api (rapi) harness from a host machine. This is similar to the communication process that Visual Studio uses to debug a process on a device. Ideally this would have the most support from Microsoft tools and APIs, but we decided to put this on the backburner as it will limit us to a 1:1 mapping of device/host.
  4. Write a native telnet/ssh server and shell. This is another approach that we have dabbled in but have not made much progress on. Writing a telnet/ssh server on Windows Mobile is not too difficult, but adding capabilities to do common tasks (ls, cd, mkdir, rmdir, cat, more, cp, scp) requires even more code. As it stands this is a good idea but has not had a lot of momentum.
  5. Write a telnet server that runs in xpcshell. This is something that blassey mentioned yesterday. Currently we run an httpd.js web server from xpcshell; why not a telnet server that gives us a js prompt? It could be useful for our tests as we can launch a process and do file I/O. My concern here is: how are we going to run the python code which creates profiles and reads files and directories to build up the list of arguments to then spawn fennec?
  6. Proxy python statements to run remotely on the device. This was also an idea that blassey came up with. Using the RAPI toolset we could have the python test harnesses run on the host, and instead of calling popen on "fennec" directly, they would call "ceexec.exe fennec". This would work for getting results from reftest and xpcshell, but there will be problems with the python code when it is trying to create a profile or read a manifest file. A rough sketch of this wrapping appears after the list.
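Here is a rough sketch of what that wrapping could look like (the ceexec.exe name comes from the idea above; the helper and arguments are otherwise invented for illustration):

import subprocess

def launch(cmd, remote=False):
    # when targeting the device, hand the command to the host-side proxy
    if remote:
        cmd = ["ceexec.exe"] + cmd
    return subprocess.call(cmd)

launch(["fennec"], remote=False)   # runs fennec directly on the host
launch(["fennec"], remote=True)    # asks ceexec.exe to run it on the device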

As you can tell, there are a lot of ideas out there, but no easy approach to resolve our problem. Part of the problem lies in Python, as output from a python script is not easily collected. For example, if I launch a python script which prints out data from the Visual Studio debugger, I will see the PythonCE console on my device along with the output, but I will not see the output in my debugger window. This extends to the processes that are launched from inside of Python, making it more difficult to collect results and debug.

Personally I like the telnet to xpcshell approach. It would require that we execute portions of the python test harness as individual scripts (e.g. creating the temporary profile). Alternatively, using the RAPI tools could achieve the same thing if we had a good method for copying required files over to the device at runtime.

I am sure all of this is giving ctalbert and ted quite a headache just knowing I am talking about it. Minimizing the impact to the tests and harnesses is something I want to strive for, but if not, I can always send a fresh bottle of Tylenol to ease the pain.
