Tag Archives: idea

accessing privileged content from a normal webpage, request for example

One project I am working on is getting mochitests to run in fennec + electrolysis.  The reason this is a project at all is that we no longer allow web pages to access privileged information in the browser.  Mochitests, even in their simplest form, rely on privileged operations such as file logging and event generation.

The communication channel available between the ‘chrome’ and ‘content’ processes is the messageManager protocol.  There are even some great examples of using it in the ipc unit tests.  Unfortunately, I have not been able to load a normal web page and have my extension, which uses messageManager calls, interact with it.

I think what would be nice to see is a real end-to-end example of an extension that demonstrates this functionality on any given webpage.  This would be helpful to all addon authors, not just me :)  If I figure this out, I will update with a new blog post.

1 Comment

Filed under testdev

Making mobile automation development more efficient

In a recent discussion with ctalbert, we looked at the right course to take for getting automation running on Windows Mobile and how we could avoid repeating this 6-month nightmare on the next platform. The big question we want to answer is: how can we implement our automation for Talos and unittests on Windows Mobile without having to reinvent the wheel for the next platform?

Our approach to date has been to take what we have for Firefox (which works really well) and adapt it to work on a mobile device. This approach has not been very successful for the development of Fennec (functionality, UI, performance), nor for the automation of unittests and Talos.

After further thought on this subject, I find there are 4 areas to focus on:

  1. Infrastructure and Tools
  2. Porting Harnesses and Tests
  3. Managing Automation and Failures
  4. Mobile Specific Automation

Each of these 4 areas is tightly coupled to the others, yet requires a very different solution. Let me outline each area in a bit more detail, describing the problem, our current approach, and a longer-term solution.

Infrastructure and Tools:

This is the area where all the assumptions live: the OS you run on, network connectivity, available disk space for test and log files, tools to manage your environment and processes, and a method for doing all of this automatically. I am going to leave the build and reporting infrastructure out of this topic, as those run separately and not on the device.

Our first port of this, to Maemo, was not as difficult because we could use Python to manage files, the network, and processes just as we do on a desktop OS. Minor adjustments had to be made for running on storage cards, for using different Python libraries (we have a limited set available on Maemo) and system tools, and for changed requirements around process names and directory structures. Maemo also has ssh, a CLI terminal, and a full set of command line tools for managing things.

Moving on to Windows Mobile, we don’t have the tools and infrastructure we have on Maemo. This means we need to spend a lot of time hacking together the communications required for automation and scripting tools like Python. Almost all process interaction (create, query, kill) needs custom code to take care of it. We simply don’t have the luxury of OS and tool support. Our approach to date has been to write (or, in the case of Python, port) our tools to make them work on the device. Unfortunately, after 4 months we still don’t have a working set of automation that people are happy with.

How can we create infrastructure that scales to all platforms? From what I have seen, we need to move away from relying on having all of our tools on the device. This means no Python or web server on the device, no test data on the device, and the assumption that we won’t be able to store log files or use common system tools. I would like to see a custom communication layer for each OS. For Windows Mobile, we would create a server that lives on the device which ensures we have IP connectivity, provides file system and process management tools, and allows the OS to reboot and come back connected. The other half of this is a job server which sends commands to the device and serves/collects test data via a web interface. I know this is a big step from where we are now, but in the long run it seems like the easiest approach.
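
To make that split concrete, here is a rough sketch (in Python, purely illustrative) of what the host side of such a communication layer could look like. The command names, port number, and the DeviceAgentClient class are assumptions for illustration, not an existing tool:

    import socket

    class DeviceAgentClient:
        """Host-side client for a hypothetical on-device agent.

        Assumes the agent on the device listens on a TCP port and answers
        simple text commands such as 'ps', 'kill <name>', and 'reboot';
        the command names and port are made up to illustrate the split
        between the on-device server and the host-side job server.
        """

        def __init__(self, host, port=20701, timeout=60):
            self.host = host
            self.port = port
            self.timeout = timeout

        def send_command(self, command):
            # One connection per command keeps the device-side code trivial
            # and makes it easy to recover after a device reboot.
            sock = socket.create_connection((self.host, self.port), self.timeout)
            try:
                sock.sendall(command.encode("utf-8") + b"\n")
                chunks = []
                while True:
                    data = sock.recv(4096)
                    if not data:
                        break
                    chunks.append(data)
                return b"".join(chunks).decode("utf-8", "replace")
            finally:
                sock.close()

        def list_processes(self):
            return self.send_command("ps")

        def kill_process(self, name):
            return self.send_command("kill %s" % name)

        def reboot(self):
            # The agent is expected to reconnect to the job server once
            # the device comes back up.
            return self.send_command("reboot")

The job server would then drive a set of these clients, handing each device a build, a test command, and a place to upload its logs.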

Porting Harnesses and Tests:

This focus area is more about making sure the environment is set up correctly, the right tests are run, and useful logs are created. This can be developed without a full infrastructure in place, but it does require some knowledge of what that infrastructure will look like.

For Maemo, a lot of work was done to extract the unittests from the source tree and retrofit the tools and harnesses to manage tests in “chunks” rather than running them all at once. We have also done a lot of work to clean up tests that assume certain preferences or look for specific features.
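
The chunking itself is simple; a minimal sketch (hypothetical, not the actual harness code) of splitting a test list into chunks might look like this:

    def chunk_tests(tests, total_chunks, this_chunk):
        """Return the slice of tests belonging to chunk `this_chunk` (1-based).

        A sketch of the chunking described above: the full test list is
        divided into roughly equal pieces so each harness run only covers
        a fraction of the suite.
        """
        if not 1 <= this_chunk <= total_chunks:
            raise ValueError("this_chunk must be between 1 and total_chunks")
        per_chunk, remainder = divmod(len(tests), total_chunks)
        start = per_chunk * (this_chunk - 1) + min(this_chunk - 1, remainder)
        end = start + per_chunk + (1 if this_chunk <= remainder else 0)
        return tests[start:end]

    # Example: 10 tests split into 3 chunks -> sizes 4, 3, 3
    tests = ["test_%02d.html" % i for i in range(10)]
    print([len(chunk_tests(tests, 3, n)) for n in (1, 2, 3)])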

The challenge on Windows Mobile is that, without the infrastructure the tests rely on, we need to do things differently. While porting tests to Maemo, very few bugs were found that prevented tests from running; for WinMo, that is a different story. We cannot run the web server locally, cannot load our mochitests in an iframe, and have trouble creating log files. All of these issues force us to morph our testing even further away from where it started and accept that we need to do this differently.

What I see as the ultimate solution here is to set up a “test server” which serves all of our test data. Each test would drop its dependency on localhost and instead run against whatever test server it is given an IP address for. We would then have an extension for Firefox/Fennec which would serve as the test driver to set up, run, and report the test results. This would allow any mobile device (that supports addons), any desktop, or any community member to install the addon and start testing.
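
As a sketch of what removing the localhost dependency could look like from the driver’s point of view (the manifest URL and JSON format here are invented for illustration):

    import json
    from urllib.request import urlopen

    def fetch_manifest(server):
        """Fetch the list of tests from a hypothetical central test server.

        Assumes the server exposes its test list as JSON at
        /tests/manifest.json; both the path and the format are made up.
        """
        with urlopen("http://%s/tests/manifest.json" % server) as response:
            return json.load(response)

    def test_urls(server, manifest):
        # Tests are addressed relative to the test server rather than
        # localhost, so any device that can reach the server can run them.
        return ["http://%s/tests/%s" % (server, path) for path in manifest["tests"]]

The addon side would then just walk that URL list, load each test, and report results back to the same server.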

Managing Automation and Failures:

This is a much smaller topic than the previous two, but once we do have data generated, how do we keep generating it reliably and use it to our advantage?

Right now, our toolset of Tinderbox and Buildbot does a great job for Firefox, and I believe we can still utilize it for mobile. There may be mobile-specific issues, but for the purposes of running Talos and unittests we need something that will take a given build, send it to a machine, run the tests, and figure out when they are done. We even have great tools to notify us via IRC when a test suite fails.

The danger here is that when testing on a new device or product, we find hundreds if not thousands of failures. Tracking those down and filing bugs is a big job by itself, and with that many bugs waiting to be fixed, they won’t all be fixed in the same week (or quarter). The result is that nobody pays attention to the reported data because there are always so many failures.

The other problem is that with so many crashes, and with tests running in smaller chunks or one by one, we end up with many small log files and lose the overall pass/fail status that our Firefox test harnesses rely upon. I know that simply looking for a TEST-UNEXPECTED string in *.log is a reasonable start, but as we have learned there are a lot of corner cases, and it doesn’t tell you which tests were never run.
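
A minimal sketch of the kind of post-processing we end up needing (the TEST-START/TEST-UNEXPECTED markers mirror the harness log format, but treat the exact patterns as an assumption):

    import re

    def summarize_log(log_path, expected_tests):
        """Scan a harness log for failures and for tests that never ran.

        Grepping for TEST-UNEXPECTED finds the failures, but only a
        comparison against the expected test list tells you which tests
        were silently skipped (for example, everything after a crash).
        """
        started = set()
        failures = []
        with open(log_path) as log:
            for line in log:
                match = re.search(r"TEST-START \| (\S+)", line)
                if match:
                    started.add(match.group(1))
                if "TEST-UNEXPECTED" in line:
                    failures.append(line.rstrip())
        never_ran = [test for test in expected_tests if test not in started]
        return failures, never_ran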

How can we make this better? Our first step towards solving this problem is LogCompare, a log parser that uploads data to a database and lets us compare results build by build. This solves the problem of finding new failures and, if we want to, ignoring known ones. A fuller solution would expand on this idea and have test runners (via the addon) upload test result blobs to a database. We could then add tools to query the status of a test run, missed tests, etc., giving us more detailed reports than just pass/fail. In the long term, a tool like this could also be used to look at random orange results and to track the status of many community members pitching in CPU time.
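
The build-to-build comparison at the heart of that idea is small; a sketch (with made-up status values and no database plumbing) might be:

    def new_failures(previous_results, current_results):
        """Return tests failing now that passed in the previous build.

        Each results dict maps a test name to a status string such as
        "PASS", "FAIL", or "CRASH"; the storage backend and the exact
        status values are assumptions.
        """
        return sorted(
            test
            for test, status in current_results.items()
            if status != "PASS" and previous_results.get(test) == "PASS"
        )

    # Example: the known failure is ignored, the regression is reported.
    previous = {"test_a.html": "PASS", "test_b.html": "FAIL"}
    current = {"test_a.html": "FAIL", "test_b.html": "FAIL"}
    print(new_failures(previous, current))  # ['test_a.html']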

Mobile Specific Automation:

The last piece of the puzzle is to create specific test cases to exercise mobile features. This is fairly straightforward, and great work has already been done for bookmarks and tabs using browser-chrome. It matters because the more developers we have working on mobile and the more branches we have, the more frequent regressions become.

Here is the problem: the mobile UI is changing so rapidly that keeping the automation up to date would be a full-time job, and that assumes you already have comprehensive tests written. It is actually faster to install a build and poke around the UI for 10 minutes than it is to keep the tests maintained. I know that in reality there will always be moving pieces, but right now we are ripping big pieces out and rewriting everything else; as a reference point, the tab functionality has changed 4 times in the last year.

Looking at the regressions found in the last couple of weeks, we would not have caught them with automation. There is a great list of things we want to automate in the Fennec UI, and almost none of them would have failed as the result of a recent regression. This means we need many more tests than a few hundred core functionality tests, and it also points out that we are not going to catch everything even if we all agree our tests are comprehensive.

What is the best way to utilize automation? Until a 1.0 release, we have to expect that things will be very volatile. We should fix the automation we currently have and require developers to run it on their desktop or device before checking in; this should be a 1.0 requirement. If a developer is changing functionality, fix the tests. Why this works is that we don’t have a lot of tests right now, so it will do more to fix the process than to find bugs. Post 1.0, let’s build up the automation to have decent coverage of all the major areas (pan, zoom, tabs, controls, awesomebar, bookmarks, downloads, addons, navigation), and keep the requirement that these tests run for each patch we check in. On a desktop, running the tests will be fast: under 5 minutes.

Summary:

While we seem to be flopping around like a fish out of water, we just need clear focus and agreement from all parties about the direction, and we can have a great solution. My goal is to be forward-looking and not dwell on the existing techniques that work yet are still being improved. Looking at this from a future point of view, I see that developing a great solution to meet our needs now can also allow for greater community involvement, leading to greater test coverage for Fennec and Firefox. And the work required to generalize this is about the same as the work required for a specialized Windows Mobile solution.

I encourage you to think about ways we can reduce our test requirements while allowing for greater flexibility.

2 Comments

Filed under general, testdev

considering new approach to mochitest on mobile

Windows Mobile is rocking the boat for us, and we are getting more and more creative. My latest challenge is getting the mochitests up and running, and this has proved very difficult.

Last week I set up the xpcshell httpd.js web server remotely and resolved my pending xpcshell tests. This week I moved on to mochitest. As a sanity check, I ran a desktop Fennec mochitest run against the remote httpd.js web server and it worked (with a lot of new failures/timeouts due to proxy and other issues I didn’t investigate).

Since my experience with xpcshell and reftest on Windows Mobile has been one of more crashes and ever-smaller chunks, I am considering running each test file in the mochitest harness by itself. This requires some retrofitting of the MochiKit integration, as there is no logging available for a single test file, and it will also take more time to run, since the overhead of starting the browser for every test is expensive.
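
A rough sketch of the per-file loop I have in mind (the browser command line, profile handling, and result collection are all hand-waved here):

    import os
    import subprocess

    def run_tests_one_by_one(browser_cmd, test_urls, log_dir, timeout=300):
        """Launch the browser once per test file, keeping a log per test.

        `browser_cmd` is the command list used to launch Fennec against a
        test URL; how the profile is set up and how pass/fail comes back
        are left out of this sketch.
        """
        results = {}
        for index, url in enumerate(test_urls):
            log_path = os.path.join(log_dir, "test-%04d.log" % index)
            with open(log_path, "w") as log:
                try:
                    # One browser process per test: a crash only loses that
                    # one test's results instead of a whole chunk's log.
                    status = subprocess.call(browser_cmd + [url], stdout=log,
                                             stderr=subprocess.STDOUT,
                                             timeout=timeout)
                except subprocess.TimeoutExpired:
                    status = "timeout"
            results[url] = status
        return results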

Any thoughts on this? Should I just find a more powerful phone to run these tests on? Is this something that we could find other uses for within Mozilla?

5 Comments

Filed under testdev

Where is the Fennec QA team?

This title could imply a lot of things, but this post will outline where the Fennec QA team is on developing tests, plans, and processes for our Fennec releases.

As many of you have seen, we released Fennec Maemo Beta2 and Windows Mobile Alpha2 last Friday. We also held the first ever Fennec testday, logging a massive 43 bugs. This brings up some questions: why would we release right after finding 43 bugs, what are our release criteria, and what is the QA team doing?

Here is my take on the Firefox QA process (which the Mozilla community is accustomed to, and expects Fennec to follow in some similar form):

  • Build master test plan for release (alpha, beta, final)
    • This includes a list of new features, updated features, and bugs fixed
    • Define the target audience (hardware), performance goals, and list of must pass tests
    • Outline points of intersection with other products, upgrades, l10n, support, marketing and schedule
  • Start writing test cases and feature specific test plans
  • Test private builds for feature and integration with product and ecosystem
  • Test nightly builds
  • As bugs trend towards zero (naturally or through triage), finalize the test plan for features, dates, and criteria
  • Get release candidate build, start running full tests
  • When there are no blockers and tests are done, start running update, l10n, and documentation tests
  • QA gives sign off, everybody is happy

For Fennec, we are not ready for that type of cycle. Dev is cranking out serious changes to the code. We are building from mozilla-central, which is a roaming beast. Unittests are not running for all platforms. Everybody is asking where they can download Fennec for iPhone! That is a lot of chaos to apply quality measures to. Let’s say (hypothetically) we want to check in a performance fix that gains 25% faster panning but causes some issues with the URL bar and focus. This might get done overnight (not planned), with only a few notes in a bug outlining it. Worse yet, this could happen after we are already testing a release candidate, since it resolves so many side effects of bad performance and only introduces a few new problems.

Here is a quick outline of the current QA cycle for the Fennec releases to date:

  • A list of bugs (~40) is built up for the upcoming release
  • A few weeks later, we are at 30 bugs (many of them new issues)
  • We triage down to the top 5, build a release candidate, and start testing
  • QA runs through the Litmus test suite, plus some manual browsing
  • If no big issues are found, let’s ship tonight

I know this is a lot less formal than Firefox, and rightly so. With our Alpha releases, we are looking for feedback from the motivated community members who are willing to tinker and hack and give us good feedback that we can act upon. For the Beta releases, we need to be much more stable and shift our audience to people who are not as patient but are still willing to accept a lot of bugs. This means little to no crashing, faster performance, and good compatibility with the web.

Going forward, we will step up the QA involvement and evolve into a more formal QA process. Keep in mind that we will be doing alpha, beta, and formal releases all at the same time, just on different platforms (we can’t lose our current methods completely).

Here is what we have in place right now:

These are all key components of providing quality test coverage. If we put some solid release criteria around the things we already have in place, an official release will look more like this:

  • Create test plan with release criteria
  • Test individual changes and new features
  • Develop new test cases and flag cases required for release
  • Get consensus from team on release criteria
  • Request a milestone build every month that we can make a quick pass over to see how close we are
  • Utilize Testdays to keep the product stable at each milestone
  • Bug count trends to zero, RC build is generated, test pass starts
  • Tests pass; move on to final release prep (note: not just calling it good)
  • Test installation on the release-criteria hardware, plus l10n and release note testing

This new approach falls in line with the Firefox QA methodology quite well. It does add more overhead to the process, but it gives appropriate time to each area, along with a general consensus across the whole team on what to expect and how decisions will be made.

It is time to mature the Fennec release process and make Fennec a real piece of software that is reliable and well respected in the mobile community.

2 Comments

Filed under qa

a useful tool – waze

I saw this on Digg and thought back to my days at Microsoft, where I proposed building a similar product (using actual public transit travel times vs. published times to predict traffic levels) and got an “oh, that is nice” reply. This company seems to have made it work using cell phones and GPS as the data source:

http://www.waze.com/

There are a lot of great uses for this data, and over time building a reliable predictive model for traffic becomes a reality.

I hope this is successful, and if it isn’t, maybe I need to start writing code.

Leave a comment

Filed under Uncategorized