Where is the Fennec QA team?

The title could imply a lot of things; in this post I will outline where the Fennec QA team stands on developing tests, plans, and processes for our Fennec releases.

As many of you have seen, we released Fennec Maemo Beta2 and Windows Mobile Alpha2 last Friday. We also held the first ever Fennec testday, logging a massive 43 bugs. This raises some questions: why would we release right after finding 43 bugs, what are our release criteria, and what is the QA team actually doing?

Here is my take on the Firefox QA process (which the Mozilla community is accustomed to, and expects Fennec to follow in kind):

  • Build master test plan for release (alpha, beta, final)
    • This includes a list of new features, updated features, and bugs fixed
    • Define the target audience (hardware), performance goals, and list of must pass tests
    • Outline points of intersection with other products, upgrades, l10n, support, marketing and schedule
  • Start writing test cases and feature specific test plans
  • Test private builds for feature and integration with product and ecosystem
  • Test nightly builds
  • As bugs trend toward zero (naturally or through triage), finalize the test plan: features, dates, criteria
  • Get release candidate build, start running full tests
  • When there are no blockers and tests are done, start running update, l10n, and documentation tests
  • QA gives sign off, everybody is happy

For Fennec, we are not ready for that type of cycle. Dev is cranking out serious changes to the code. We are building from mozilla-central, which is a roaming beast. Unittests are not running for all platforms. Everybody is asking where they can download Fennec for iPhone! That is a lot of chaos to apply quality measures to. Let's say (hypothetically) we want to check in a performance fix that gains 25% faster panning but causes some issues in the URL bar and focus handling. This might get done overnight (not planned), with only a few notes in a bug outlining it. Worse yet, it could happen after we are already testing a release candidate, since it resolves so many side effects of bad performance and only introduces a few new problems.

Here is a quick outline of the current QA cycle for the Fennec releases to date:

  • List of bugs is built up (~40) for the upcoming release
  • A few weeks later, we are down to 30 bugs (with a lot of new issues mixed in)
  • We triage down to top 5, build release candidate and start testing
  • QA runs through litmus test suite, also some manual browsing
  • If no big issues are found, let's ship tonight

I know this is a lot less formal than Firefox, and rightly so. With our alpha releases, we are looking for feedback from motivated community members who are willing to tinker and hack and give us good feedback we can act upon. For the beta releases, we need to be much more stable and shift our audience to people who are less patient but still willing to accept a fair number of bugs. That means little to no crashing, faster performance, and good compatibility with the web.

Going forward, we will step up QA involvement and evolve into a more formal QA process. Keep in mind we will be doing alpha, beta, and formal releases all at the same time, just on different platforms, so we can't abandon our current methods completely.

We already have the key components of quality test coverage in place. If we put solid release criteria around them, an official release will look more like this:

  • Create test plan with release criteria
  • Test individual changes and new features
  • Develop new test cases and flag cases required for release
  • Get consensus from team on release criteria
  • Request a milestone build every month that we can make a quick pass over to see how close we are
  • Utilize Testdays to keep the product stable at each milestone
  • Bug count trends to zero, RC build is generated, test pass starts
  • Tests pass, move on to final release prep (note: not just calling it good)
  • Test install on criteria hardware, l10n, and release note testing

This new approach falls in line with the Firefox QA methodology quite well. It adds more overhead to the process, but it gives appropriate time to each area and builds a general consensus across the whole team about what to expect and how decisions get made.

It is time to mature the Fennec release process and make Fennec a real piece of software that is reliable and well respected in the mobile community.


posting from fennec

In the spirit of the Fennec testday, I am doing a test post. Check out a quick snapshot of the Miami testlab.


Launching a unittest on Windows Mobile

This has turned into a larger problem than just retrofitting some python scripts to work on a Windows Mobile device. I have had success getting xpcshell and reftest results by fixing up the python scripts. Our biggest hurdle is getting subprocess.Popen functionality on a Windows Mobile device, especially capturing the output from the spawned process. How are we going to resolve this? I don't know, but here are some ideas in the works:

  1. Write native code to create a process. This is the approach we have been taking so far, using osce to launch a process and collect results in a log file. We can work around its limitations, but there are issues calling these APIs multiple times.
  2. Rewrite subprocess to work with Windows Mobile. This is similar to the osce approach, but instead returns handles to the python code so it can simulate stdout and kill the running process. This is a lot more code to write and would only be useful if we can fix the problems with osce. Some effort has been done here, as seen in these notes.
  3. Use a remote API (RAPI) harness from a host machine. This is similar to the communication channel Visual Studio uses to debug a process on a device. Ideally this would have the most support from Microsoft tools and APIs, but we decided to put it on the backburner since it limits us to a 1:1 mapping of device to host.
  4. Write a native telnet/ssh server and shell. This is another approach we have dabbled in without making much progress. Writing a telnet/ssh server on Windows Mobile is not too difficult; adding capabilities for common tasks (ls, cd, mkdir, rmdir, cat, more, cp, scp) requires even more code. As it stands this is a good idea, but it has not had a lot of momentum.
  5. Write a telnet server that runs in xpcshell. This is something blassey mentioned yesterday. We already run an httpd.js web server from xpcshell, so why not a telnet server that gives us a js prompt? It could be useful for our tests, since we could launch a process and do file I/O. My concern is how we would run the python code that creates profiles and reads files and directories to build up the list of arguments before spawning fennec.
  6. Proxy python statements to run remotely on the device. This was also an idea that blassey came up with. Using the RAPI toolset, we could run the python test harnesses on the host; instead of calling popen on "fennec <args>", the harness would call "ceexec.exe fennec <args>" (see the sketch just after this list). This would work for getting results from reftest and xpcshell, but the python code will still have problems when it tries to create a profile or read a manifest file.
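
As a rough illustration of idea 6, here is a minimal host-side sketch, assuming ceexec.exe takes the device command as its arguments and relays output back; the rapi_popen helper is hypothetical, not code from our tree:

import subprocess

CEEXEC = "ceexec.exe"  # assumed name of the RAPI-side launcher

def rapi_popen(cmd, args):
    # Hypothetical wrapper: on the host this is an ordinary Popen, but the
    # command is prefixed with ceexec.exe so the process runs on the device.
    return subprocess.Popen([CEEXEC, cmd] + args,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)

# e.g. rapi_popen("fennec", ["-profile", profileDir]) in place of
# subprocess.Popen(["fennec", "-profile", profileDir])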

As you can tell, there are a lot of ideas out there, but no easy way to resolve our problem. Part of the difficulty lies in Python itself: output from a python script is not easily collected. For example, if I launch a python script from the Visual Studio Debugger and it prints data, I will see the PythonCE console on my device along with the output, but I will not see that output in my debugger window. The same applies to processes launched from inside Python, which makes it even harder to collect results and debug.

Personally, I like the telnet-to-xpcshell approach. It would require that we execute portions of the python test harness as individual scripts (e.g. creating the temporary profile; see the sketch below). Alternatively, the RAPI tools could achieve the same thing if we had a good method for copying required files over to the device at runtime.
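
For example, the profile-creation step could become a tiny standalone script along these lines; this is just a sketch, and the pref shown is a placeholder rather than the actual set the harness writes:

import os
import tempfile

def make_profile():
    # Create a throwaway profile directory and seed it with a user.js so
    # this step can run on its own (e.g. driven over a telnet session).
    profile_dir = tempfile.mkdtemp()
    prefs = open(os.path.join(profile_dir, "user.js"), "w")
    # placeholder pref; the real harness writes its own list
    prefs.write('user_pref("browser.dom.window.dump.enabled", true);\n')
    prefs.close()
    return profile_dir

if __name__ == "__main__":
    print make_profile()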

I am sure all of this is giving ctalbert and ted quite a headache just from knowing I am talking about it. Minimizing the impact on the tests and harnesses is something I want to strive for; if that fails, I can always send over a fresh bottle of Tylenol to ease the pain.


tpan fennecmark – update

A quick update on what it took to get this showing up on the graph server.

As Aki tried to get this going, he kept running into issues. I looked into it and found that he was using PerfConfigurator.py to generate the config:

python PerfConfigurator.py -v -e /media/mmc1/fennec/fennec -t `hostname` -b mobile-browser --activeTests tzoom --sampleConfig mobile.config --browserWait 60 --noChrome --oldresultsServer graphs-stage.mozilla.org --oldresultsLink /server/bulk.cgi --output local.config

So I did the same, ran the test, and it spit out a number. The problem was that it was not sending data up to the graph server, as Aki spotted in the console output (notice there are no links after "RETURN: graph links"):

transmitting test: tzoom: 
		Stopped Sat, 13 Jun 2009 09:58:37
RETURN: graph links
RETURN:
RETURN:

Details:
|

Completed sending results: Stopped Sat, 13 Jun 2009 09:58:37

After poking around I had no luck, so I asked Alice (the talos/graph-server expert). She had me try two things:
1) increase the number of cycles to something greater than 1 (I used 5 for the trial)
2) edit run_tests.py to include 'tpan' and 'tzoom' in the list of tests to look for (filed bug 497922)

That did the trick: I got output to the graph server and we are all set. All that remains is to check in the raw code behind fennecmark.jar, which will be done in Aki's talos-maemo repository.
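
For reference, here is a hypothetical sketch of what the run_tests.py tweak boils down to; the list name and the existing entries are assumptions on my part, and the real patch is attached to bug 497922:

# run_tests.py only reports results for test names it recognizes, so the
# new fennecmark tests have to be added to that list (names below assumed)
GRAPH_TESTS = ['ts', 'tp', 'tdhtml']   # assumed existing entries
GRAPH_TESTS += ['tpan', 'tzoom']       # the new Fennec tests

def sends_to_graph(test_name):
    return test_name in GRAPH_TESTS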


reftests for windows mobile

In my never-ending quest to conquer the unittests, I have made another dent by getting reftests running (in a very basic form) on Windows Mobile!

It turns out the changes themselves were much simpler than I expected, but getting my dozen or so modifications to automation.py and runreftest.py right took many hours of debugging and trial and error. The whole process started out as just getting a single test to run, and the journey ended up touching just about every line of code.

I started off by commenting everything out and bringing each chunk of code back one at a time. I knew this would take a full day, but with so many calls to subprocess and os, I was clearly up against a tough problem.

So began the slashing of os.path.* calls and the writing of stubs for anything that referenced subprocess or its returned process handle. I found I had no good way to debug, so I set out to verify that Fennec could run a reftest by hand with a static profile I munged together, and I got results! Could I then get it working from runreftest.py? Not so lucky.

Problem #1 – trouble creating a profile and crashing on rmdir:
Creating a profile seems simple, and the code does appear to create one as expected. For some reason, though, it kept crashing while trying to rmdir the profile. Ugh; let's stick with a static profile and get these tests running. This was actually resolved once problem #2 was figured out.

Problem #2 – fennec restarts each time it is launched:
We call automation.runApp twice, and for some reason fennec is never terminated after the first call, so the second call fails. Ok, I can remove the first instance, since it just registers the reftest.jar extension (remember, we are using a static profile, so we can cheat). To make matters worse, that didn't solve my problem: I found out we were restarting Fennec every time we launched it. This nasty problem was easily resolved by adding --environ:"NO_EM_RESTART=1". Now I can have both the extension registration and the test.

Problem #3 – python script can’t open another python script process:
In order to run reftests on a small device, we need to run them in smaller sets rather than all 3800+ at once; the big bottleneck is the memory required to hold the tests and results until we are done. So I used my maemkit toolkit to call runreftest.py over and over with a different .manifest file each time. No luck :( I cannot get python to create a new process (using osce) when that process is another python instance. Ok, I can work around this and just call runreftest.py's main() from inside maemkit, roughly as sketched below. Problem solved!
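
Roughly, the maemkit loop looks like this sketch; the second manifest path and the exact entry point of runreftest.py are assumptions based on how I am calling it:

import runreftest

# Hypothetical maemkit driver: since osce cannot spawn a second python
# process, call the reftest harness in-process, once per smaller manifest.
manifests = [
    r'\tests\reftest\tests\layout\reftests\image\reftest.list',
    r'\tests\reftest\tests\layout\reftests\text\reftest.list',
]

for manifest in manifests:
    # for now the manifest is hardcoded inside runreftest.py (see below);
    # a cleaner driver would pass it in as an argument
    runreftest.main()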

Problem #4 – osce.blockingtee crashes python script on the third call in a row:
I noticed this while getting xpcshell up and running and never found a solution; unfortunately it keeps me from running the full test suite. blockingtee is what writes the output of the process to a given file. The alternative is to use osce.systema() instead, which works for the majority of the tests (a few appear to hang) and is good enough to prove I can get this done end to end.

So, other than actually collecting the output, I have a working demo. To keep my changes simple, I did the following (the code is attached to bug 493792):

runreftest.py -
remove arg dependencies:

#  if len(args) != 1:
#    print >>sys.stderr, "No reftest.list specified."
#    sys.exit(1)
 

hardcode cli arguments:

#  reftestlist = getFullPath(args[0])
  reftestlist = r'\tests\reftest\tests\layout\reftests\image\reftest.list'
  options.app = r'\tests\fennec\fennec.exe'
  options.xrePath = r'\tests\fennec\xulrunner'

and use tempfile.mkdtemp explicitly, as the bare mkdtemp call would crash all the time:

#    profileDir = mkdtemp()
    profileDir = tempfile.mkdtemp()

Ok, that isn't too bad. Now for automation.py, I needed to change a lot of things. Mostly I branched on the platform:

platform = "wince"
if platform == "wince":
  import osce
else:
  import subprocess

and fix some of the hardcoded variables that are useless in the wince build:

#for wince: this defaults to c:\
DIST_BIN = r'\tests\fennec'
DEFAULT_APP = DIST_BIN + r'\fennec.exe'
CERTS_SRC_DIR = r'\tests\certs'

There is a Process class which manages the subprocess.Popen, so I do this:

if platform == "wince":
  class Process():
    def kill(self):
      pass
else:
  class Process(subprocess.Popen):
    ...

For the last set of changes, in runApp I disable the SSL tunnel (we will have to deal with this for mochitest):

  if platform == "wince":
    runSSLTunnel = False

and since we are not in cygwin, we need the real backslash:

    if platform == "wince":
      profileDirectory = profileDir + "\\"
      args.append(r'--environ:"NO_EM_RESTART=1"')
    else:
      profileDirectory = profileDir + "/"

For launching the actual application, I have to use osce.systema instead of Popen:

  if platform == "wince":
    proc = osce.systema(cmd, args)
    log.info("INFO | (automation.py) | Application pid: None")
  else:
    proc = Process([cmd] + args, env = environment(env), stdout = outputPipe, stderr = subprocess.STDOUT)
    log.info("INFO | (automation.py) | Application pid: %d", proc.pid)

I force outputPipe = None, which skips a lot of the process interaction code:

  if debuggerInfo and debuggerInfo["interactive"]:
    outputPipe = None
  elif platform == "wince":
    outputPipe = None
  else:
    outputPipe = subprocess.PIPE

and finally I force status = 0 so we can avoid more tinkering with the process handle:

  if platform == "wince":
    status = 0
  else:
    status = proc.wait()

That is really all it takes. I think a better approach for the long run is to create a subprocess.py that gives us a Popen and spoofs the other pieces (wait, PIPE, stdout, stderr), and then to debug the blockingtee crash on the third call and fold that into subprocess.py as well.
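
Here is a minimal sketch of that spoofed subprocess.py, assuming osce.systema blocks until the launched process exits (as it appears to); none of this is working code yet:

# subprocess.py stand-in for Windows Mobile: gives harness code the Popen
# surface it expects while delegating the actual launch to osce.
import osce

PIPE = -1
STDOUT = -2

class Popen:
    def __init__(self, args, env=None, stdout=None, stderr=None):
        self.pid = 0               # osce does not hand back a real pid
        self.returncode = None
        # output lands in a log file on the device, not a pipe (for now)
        osce.systema(args[0], args[1:])

    def wait(self):
        # assuming systema blocked until exit, the app is already done;
        # report success until we can capture a real exit status
        self.returncode = 0
        return self.returncode

    def kill(self):
        pass                       # no process handle to kill yet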


tpan – first draft

I got this wrapped up into a patch, and last night I got it working on a deployed n810!

Here is what it took:

  • creating a shell page that took queryString parameters and controlled fennecmark
  • modifying fennecmark to take control commands
  • modifying fennecmark to report data correctly
  • modifying fennecmark to use a local webpage instead of a page on the internet
  • modifying the talos mobile.config to support fennecmark

The shell page and the control commands go hand in hand. Here I created a very simple .html page which uses a custom DOM event to communicate between unprivileged and privileged code:

    var test = "";
    function parseParams() {
      var s = document.location.search.substring(1);
      var params = s.split('&');
      for (var i = 0; i < params.length; ++i) {
        var fields = params[i].split('=');
        switch (fields[0]) {
        case 'test':
          test = fields[1];
          break;
        }
      }
    }
    parseParams();
    if (test == "Zoom" || test == "PanDown") {
      var element = document.createElement("myExtensionDataElement");
      element.setAttribute("attribute1", test);
      document.documentElement.appendChild(element);

      var evt = document.createEvent("Events");
      evt.initEvent("myExtensionEvent", true, false);
      element.dispatchEvent(evt);
    }

This is a very simple and basic page, but it does the trick. On the fennecmark side, I created a listener which, upon receiving the event, gets the test name to run and kicks off fennecmark inside overlay.js:

var myExtension = {
  myListener: function(evt) {
    var test = evt.target.getAttribute("attribute1");
    if (test == "Zoom") { BenchFE.tests = [LagDuringLoad, Zoom]; };
    if (test == "PanDown") { BenchFE.tests = [LagDuringLoad, PanDown]; };

    setTimeout(function() {BenchFE.nextTest(); }, 3000);
  }
}

The next modification to fennecmark is fixing the report.js script to conform to the talos reporting standards:

    if (pretty_array(this.panlag) == null) {
      tpan = "__start_report" + median_array(this.zoominlag) + "__end_report";
    }
    else {
      tpan = "__start_report" + this.pantime + "__end_report";
    }

One quirky thing here: since I only ever run pan or zoom, I check whether the pan data is null and print the zoom result, otherwise I just print pan. I suspect that as I get this code closer to a checkin state I will make it more flexible.

Lastly, to finish this off, I need to point fennecmark at a local page, not one on the internet. My initial stab at doing everything had me developing with the standalone pageset, which doesn't live on the production talos boxes. After some back and forth with Aki, I learned what I needed to do and modified pageload.js like this:

browser.loadURI("http://localhost/page_load_test/pages/www.wikipedia.org/www.wikipedia.org/index.html", null, null, false);

Ok, now I have a toolset to work with. What do we need to do for talos to install and recognize it? I found out that the .config file has a section that deals with extensions:

# Extensions to install in test (use "extensions: {}" for none)
# Need quotes around guid because of curly braces
# extensions : 
#     "{12345678-1234-1234-1234-abcd12345678}" : c:\path\to\unzipped\xpi
#     foo@sample.com : c:\path\to\other\unzipped\xpi
#extensions : { bench@taras.glek : /home/mozilla/Desktop/talos/tpan }
extensions : {}

#any directories whose contents need to be installed in the browser before running the tests
# this assumes that the directories themselves already exist in the firefox path
dirs:
  chrome : page_load_test/chrome
  components : page_load_test/components
  chrome : tpan/chrome

Here, I needed to add a new chrome directory: tpan/chrome. To make this work, I needed to package fennecmark as a .jar file instead of an unzipped extension in the profile (similar to DOM Inspector). This was a frustrating process until I found Ted's wizard. After running the wizard, copying my code in, tweaking config_build.sh to keep the .jar, and running the build.sh script, I had what I needed.

The last step is to add the raw config to mobile.config so fennecmark will run:

tests :

- name: tpan
  shutdown: false
  url: tpan/fennecmark.html?test=PanDown
  resolution: 1
  cycles : 1
  timeout : 60
  win_counters: []
  unix_counters : []
- name: tzoom
  shutdown: false
  url: tpan/fennecmark.html?test=Zoom
  resolution: 1
  cycles : 1
  timeout : 60
  win_counters: []
  unix_counters : []

This leaves me at a good point where we can run fennecmark. Next steps are to solidify the reporting, decide where to check in the fennecmark source (not just the .jar), and make whatever adjustments are needed so installation, running, and reporting are well documented and stable.


tpan for fennec – initial overview

My project this week is to add some Fennec-specific performance measurements to talos. Taras has written a great fennecmark tool, an extension which measures page load, panning, and zooming.

My work flow is to:

  1. integrate fennecmark into talos as tpan
  2. update tpan to include more comprehensive tests
  3. make tpan simulate slow network latencies

Talos is a great framework that the Mozilla crew has developed to measure performance metrics on a regular basis. It is a lightweight toolset that is easy to set up and install locally, even for Fennec; I found it very straightforward to get talos results on my local linux box.

To meet the needs of the Fennec project and get this up and running, we are only implementing step 1 for now, and adding the remaining steps as projects on the fennec testdev page.
