Tag Archives: mozilla

browser-chrome is greener and in many chunks

On Friday we rolled out a big change to split up our browser-chrome tests.  It started out as a great idea to split the devtools tests out into their own suite; then, after testing, we ended up chunking the remaining browser-chrome tests into 3 chunks.

No more 200 minute wait times; in fact, we are probably running too many chunks.  A lot of heavy lifting took place, much of it in releng from Armen and Ben, along with much work from Gavin and RyanVM, who pushed hard and proposed great ideas to see this through.

What is next?

There are a few more test cases to fix, and we still need to get all these changes onto Aurora.  We also have lower-priority work planned on running the tests differently, to help isolate issues where one test affects another test.

In the next few weeks I want to put together a list of projects and bugs that we can work on to make our tests more useful and reliable.  Stay tuned!

Filed under testdev

is a phone too hard to use?

Working at Mozilla, I get to see a lot of great things.  One of them is collaborating with my team (we are almost all remoties), which I have been doing for almost 6 years.  About 3 years ago we switched to using Vidyo to communicate in meetings.  This is great: we can see and hear each other.  Unfortunately Heartbleed came out and affected Mozilla’s Vidyo servers, so yesterday and today we have been without Vidyo.

Now I am getting meeting cancellation notices.  Why are we cancelling meetings?  Did meetings not happen 3 years ago?  Mozilla actually creates an operating system for a … phone.  In fact our old teleconferencing system is still in place.  I thought about this earlier today and wondered why we are cancelling meetings.  Personally I always put Vidyo in the background during meetings and keep IRC in the foreground.  Am I in the minority?

I am not advocating for scrapping Vidyo, instead I would like to attend meetings, and if we find they cannot be held without Vidyo, we should cancel them (and not reschedule them). 

Meetings existed before Vidyo and Open Source existed before GitHub, we don’t need the latest and greatest things to function in life. Pick up a phone and discuss what needs to be discussed.

Filed under Uncategorized

tracking talos alerts across branches

A year without blogging and I am back.  I figured there was some cool stuff to share, here is one tidbit.

In the last year I have picked up looking at talos results and filing regression bugs for them.  This has been useful.  What currently happens is that when results are submitted to g.m.o (graph server), we detect a regression, send an email to the original patch author (if we can determine it), and post to mozilla.dev.tree-management.  I have been using dev.tree-management as a starting point for hunting regressions.  When things are busy it can eat up a couple of hours in a day.  Luckily many developers are responsible about taking action when they receive the emails.

Given that at least half of the regressions are not acted upon by the original developer, it is important to read the newsgroup. One of the things which makes it frustrating is that for a single regression we can get multiple alerts (regular builds vs pgo builds and as the patch merges between branches/projects).

To make my life easier, I have taken all the alerts on dev.tree-management and put them in a database (local right now).  The final goal is a webUI that lets me easily annotate these alerts, similar to tbpl for random test failures.  One thing I wanted to do was help identify duplicate alerts.  Today, while attempting this, I got a clear picture of what the lifecycle of a regression looks like:

mysql> select date,branch,percent,keyrevision from alerts where test='Paint' and platform='WINNT 6.2 x64' order by date ASC;
+---------------------+-------------------------+---------+--------------+
| date                | branch                  | percent | keyrevision  |
+---------------------+-------------------------+---------+--------------+
| 2014-02-14 19:41:38 | Mozilla-Inbound-Non-PGO | 10.1%   | c7802c9d6eec |
| 2014-02-15 01:03:54 | Fx-Team-Non-PGO         | 9.53%   | 7a3adc5aac28 |
| 2014-02-15 21:43:48 | Mozilla-Inbound         | 10.6%   | c7802c9d6eec |
| 2014-02-16 03:46:12 | Firefox-Non-PGO         | 8.88%   | 5d7caa093f4f |
| 2014-02-16 03:46:13 | B2g-Inbound-Non-PGO     | 9.44%   | 071885f79841 |
| 2014-02-16 14:22:38 | Fx-Team                 | 10.4%   | 7a3adc5aac28 |
| 2014-02-17 04:42:57 | B2g-Inbound             | 10.7%   | 071885f79841 |
| 2014-02-18 11:43:54 | Firefox                 | 9.76%   | eac89fb04bb9 |
+---------------------+-------------------------+---------+--------------+
8 rows in set (0.00 sec)

It is really cool to see how one change can generate alerts over a span of about 4 days.
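One way the duplicate detection could be sketched in Python is shown here. It is only an illustration, not the tool I am building: the rows are copied from the query output above (with the % sign dropped), while the function names and the idea of pairing alerts by stripping the "-Non-PGO" suffix are assumptions of this sketch.

```python
from collections import defaultdict
from datetime import datetime

# Rows copied from the query output above: (date, branch, percent, keyrevision).
ALERTS = [
    ("2014-02-14 19:41:38", "Mozilla-Inbound-Non-PGO", 10.1, "c7802c9d6eec"),
    ("2014-02-15 01:03:54", "Fx-Team-Non-PGO", 9.53, "7a3adc5aac28"),
    ("2014-02-15 21:43:48", "Mozilla-Inbound", 10.6, "c7802c9d6eec"),
    ("2014-02-16 03:46:12", "Firefox-Non-PGO", 8.88, "5d7caa093f4f"),
    ("2014-02-16 03:46:13", "B2g-Inbound-Non-PGO", 9.44, "071885f79841"),
    ("2014-02-16 14:22:38", "Fx-Team", 10.4, "7a3adc5aac28"),
    ("2014-02-17 04:42:57", "B2g-Inbound", 10.7, "071885f79841"),
    ("2014-02-18 11:43:54", "Firefox", 9.76, "eac89fb04bb9"),
]

NON_PGO_SUFFIX = "-Non-PGO"


def base_branch(branch):
    """Map 'Fx-Team-Non-PGO' and 'Fx-Team' to the same name (assumed convention)."""
    if branch.endswith(NON_PGO_SUFFIX):
        return branch[:-len(NON_PGO_SUFFIX)]
    return branch


def group_by_branch(alerts):
    """Group alerts so the PGO and non-PGO alert for a branch sit together."""
    groups = defaultdict(list)
    for date, branch, percent, keyrevision in alerts:
        groups[base_branch(branch)].append((date, branch, percent, keyrevision))
    return dict(groups)


def alert_span(alerts):
    """Time between the first and the last alert in the list."""
    dates = [datetime.strptime(row[0], "%Y-%m-%d %H:%M:%S") for row in alerts]
    return max(dates) - min(dates)


groups = group_by_branch(ALERTS)
span = alert_span(ALERTS)
```

Grouping on the branch name alone is deliberate in this sketch: the two Firefox rows carry different keyrevision values, so the revision by itself would not pair them.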

Stay tuned for more information on this and other topics!

Filed under Uncategorized

Android automation is becoming more stable ~7% failure rate

At Mozilla we have made unit testing on Android devices as important as desktop testing. Earlier today I was asked how we measure this and what our definition of success is. The obvious answer is no failures except for code that breaks a test, but in reality we allow for random failures and infrastructure failures. Our current goal is 5%.

So what are these acceptable failures, and what does 5% really mean? Failures can happen when we have tests which fail randomly, usually poorly written tests or tests which were written a long time ago and hacked to work in today’s environment. This doesn’t mean any test that fails is itself the problem; it could be a previous test that changes a Firefox preference by accident. For Android testing, this currently means the browser failed to launch and load the test webpage properly, or it crashed in the middle of the test. Other failures are the device losing connectivity, our host machine having hiccups, the network going down, sdcard failures, and many other problems. With our current state of testing, these mostly fall into the category of losing connectivity to the device. Infrastructure problems are indicated as Red or Purple, and test-related problems as Orange.

I took a look at the last 10 runs on mozilla-central (where we build Firefox nightlies from) and built this little graph:

Firefox Android Failures

Here you can see that test-related failures account for 6.67% of runs, and that overall we can expect a failure on Android 12.33% of the time.

We have another branch called mozilla-inbound (we merge this into mozilla-central regularly) where most of the latest changes get checked in.  I did the same thing here:

mozilla-inbound Android Failures

Here you can see that test-related failures account for 7.77% of runs, and that overall we can expect a failure on Android 9.89% of the time.

This is only a small sample of the tests, but it should give you a good idea of where we are.
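The arithmetic behind percentages like these can be sketched in a few lines of Python. The raw job counts are not given in this post, so the counts below are hypothetical, picked only so that the results line up with the mozilla-central figures; the colour names follow the Orange/Red/Purple convention described earlier, and the function name is made up for this sketch.

```python
from collections import Counter

# Hypothetical job results (NOT the real data): 300 jobs in total.
RESULTS = ["green"] * 263 + ["orange"] * 20 + ["red"] * 10 + ["purple"] * 7

GOAL_PERCENT = 5.0  # the current goal mentioned above


def failure_rates(results):
    """Return test, infrastructure and overall failure rates as percentages."""
    total = len(results)
    if total == 0:
        return {"test": 0.0, "infra": 0.0, "overall": 0.0}
    counts = Counter(results)
    test_failures = counts["orange"]                    # test related
    infra_failures = counts["red"] + counts["purple"]   # infrastructure
    return {
        "test": 100.0 * test_failures / total,
        "infra": 100.0 * infra_failures / total,
        "overall": 100.0 * (test_failures + infra_failures) / total,
    }


rates = failure_rates(RESULTS)
meets_goal = rates["overall"] <= GOAL_PERCENT
```

Whether the 5% goal should be compared against the overall rate or only the test-related rate is a choice this sketch makes for illustration.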

Filed under testdev

Professional Development, Improv and your audience

I had the opportunity to attend some really exciting professional development sessions at the All Hands.  Personally I found these very interesting, but I heard a lot of grumbling that they did not add much value or were not of interest.

One reason I found these interesting is that in a previous life I had attended a few years of Improv acting classes and did a short stint of real onstage Improv acting.  In looping back to these professional development sessions, they reminded me of the core concepts we learned in Improv 101.  So if you felt that you missed out, sign up for an Improv class.  Maybe if there are professional development sessions at a future event they could just have an Improv acting class.

Related to the professional development courses, I found that most of these were sparsely attended.  Among those who did attend, the courses received great reviews/ratings.  To be fair, the technical tracks that I attended had about the same attendance as the professional development tracks.  Maybe we are not creating sessions that are of interest to our audience?  I know for the technical tracks we just propose something and it magically becomes a session.  I don’t recall being asked for any input on what sessions would be available to me.  Maybe in the future we can do a better job of getting input from the community (a.k.a. the audience)!

Filed under general, reviews

converting xpcshell from listing directories to a manifest

Last year we ventured down the path of adding test manifests for xpcshell in bug 616999.  Finding a manifest format is not easy because there are plenty of objections to the format, syntax and relevance to the project at hand.  At the end of the day, we depend too much on our build system to filter tests and after that we have hardcoded data in tests or harnesses to run or ignore based on certain criteria.  So for xpcshell unittests, we have added a manifest so we can start to keep track of all these tests and not depend on iterating directories and sorting or reverse sorting head and tail files.

The first step is to get a manifest format for all existing tests.  This landed today in bug 616999 and is currently on mozilla-central.  It requires that all test files in a directory be listed in the manifest file and that every file listed in the manifest exists in the directory (verified at make time).  Basically if you do a build, it will error out if you forget to add a manifest, or forget to add a test file to the manifest.  Pretty straightforward.

The manifest we have chosen is the ini format from mozmill.  We found that there is no silver bullet for a perfect test manifest, which is why we chose an existing format that met the needs of xpcshell.  It is easy to hand edit (as opposed to json) and easy to parse from python and javascript.  Compared to reftests, which have a custom manifest format, we just needed a list of test files and, more specifically, a way to associate head and tail script files with them (not easy with reftest manifests).  The format might not work for everything, but it gives us a second format to work with depending on the problem we are solving.
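To give a feel for the format, here is a rough sketch of an ini-style manifest and of the "everything in the directory must be listed" check. It uses Python's standard configparser as a stand-in, not the parser the harness actually uses; the file names, the exact keys and the check_directory helper are made up for illustration.

```python
import configparser

# Hypothetical manifest: one section per test file, with head/tail scripts
# inherited from [DEFAULT] unless a test overrides them.
MANIFEST = """\
[DEFAULT]
head = head_common.js
tail =

[test_alpha.js]
[test_beta.js]
head = head_common.js head_beta.js
"""


def load_manifest(text):
    """Return {test file: {"head": [...], "tail": [...]}} from manifest text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    tests = {}
    for name in parser.sections():  # sections() does not include DEFAULT
        tests[name] = {
            "head": parser.get(name, "head").split(),
            "tail": parser.get(name, "tail").split(),
        }
    return tests


def check_directory(tests, files_on_disk):
    """Return (tests on disk but not listed, tests listed but not on disk)."""
    listed = set(tests)
    present = {f for f in files_on_disk if f.startswith("test_")}
    return sorted(present - listed), sorted(listed - present)


tests = load_manifest(MANIFEST)
unlisted, missing = check_directory(
    tests, ["head_common.js", "test_alpha.js", "test_gamma.js"]
)
```

A build-time check along these lines would fail whenever either returned list is non-empty.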

Filed under testdev

Some notes about adding new tests to talos

Over the last year and a half I have been editing the talos harness for various bug fixes, but just recently I have needed to dive in and add new tests and pagesets to talos for Firefox and Fennec.  Here are some of the things I didn’t realize or had inconveniently forgotten about what goes on behind the scenes.

  • tp4 is really 4 tests: tp4, tp4_nochrome, tp4_shutdown, tp4_shutdown_nochrome.  This is because in the .config file, we have “shutdown: true” which adds _shutdown to the test name, and running with --noChrome adds _nochrome to the test name.  The same goes for any test that is run with the shutdown=true and nochrome options.
  • when adding new tests, we need to add the test information to the graph server (staging and production).  This is done in the hg.mozilla.org/graphs repository by adding to data.sql.
  • when adding new pagesets (as I did for tp4 mobile), we need to provide a .zip of the pages and the pageloader manifest to release engineering as well as modifying the .config file in talos to point to the new manifest file.  see bug 648307
  • Also when adding new pages, we need to add sql for each page we load.  This is also in the graphs repository, but in pages_table.sql.
  • When editing the graph server, you need to file a bug with IT to update the live servers and attach a sql file (not a diff).   Some examples: bug 649774 and bug 650879
  • after the graph servers are updated, the staging run is green, and the review is done, you can check in the patch for talos
  • For new tests, you also have to create a buildbot config patch to add the testname to the list of tests that are run for talos
  • the last step is to file a release engineering bug to update talos on the production servers.  This is done by creating a .zip of talos, posting it on an ftp site somewhere and providing a link to it in the bug.
  • one last thing is to make sure the bug to update talos has an owner and is looked at, otherwise it can sit for weeks with no action!!!

This is my experience from getting ts_paint, tpaint, and tp4m (mobile only) tests added to Talos over the last couple months.
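The naming rule from the first bullet can be written out as a small Python sketch. This is not code from talos itself; the function names are made up, and the suffix order simply follows the four tp4 names listed above.

```python
from itertools import product


def talos_test_name(base, shutdown=False, nochrome=False):
    """Build the reported test name from the base name and the two options."""
    name = base
    if shutdown:
        name += "_shutdown"   # added when the config has shutdown: true
    if nochrome:
        name += "_nochrome"   # added when running with --noChrome
    return name


def all_variants(base):
    """Every name a single test can be reported under."""
    return [
        talos_test_name(base, shutdown, nochrome)
        for shutdown, nochrome in product((False, True), repeat=2)
    ]


tp4_names = all_variants("tp4")
```

This is why adding one new test can mean adding up to four names to the graph server.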

Filed under testdev

Orange Factor and the WOO-Tang Clan

I quietly put up a tool called Orange Factor early last month as part of the War On Orange (WOO) project.  Over the last few weeks I have been iterating on this and working with jgriffin, jhammel, mcote and ctalbert (some have referred to us as the WOO-tang clan) to make this more useful and accurate.  Let me outline a few features of the site to give you a general introduction.

To start off with, I know it takes a long time to load, but it should load in <30 seconds.   All the data is collected from bugs that are blocking randomorange.  This is done by parsing the comments and linked tinderbox logs to determine the frequency and type of failure.  We display a graph that tracks the cumulative orange factor (failure/push) over time.  NOTE: we are going off the number of pushes, not the number of tests run.

Orange Factor graph

Graph of the Orange Factor over time
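As a rough sketch of the failure/push calculation, assuming we already have a failure count and a push count per day, the cumulative value could be computed like this in Python. The dates and counts are invented for illustration, and this is not the site's actual code.

```python
# Hypothetical per-day data: (day, failures, pushes).
DAILY = [
    ("day 1", 30, 10),
    ("day 2", 10, 10),
    ("day 3", 20, 0),   # failures reported on a day with no pushes
]


def orange_factor(failures, pushes):
    """Failures per push; defined as 0.0 when there were no pushes."""
    if pushes == 0:
        return 0.0
    return failures / pushes


def cumulative_orange_factor(daily):
    """Running failures/pushes ratio after each day."""
    total_failures = 0
    total_pushes = 0
    series = []
    for day, failures, pushes in daily:
        total_failures += failures
        total_pushes += pushes
        series.append((day, orange_factor(total_failures, total_pushes)))
    return series


series = cumulative_orange_factor(DAILY)
```

Dividing by pushes rather than by tests run means a busy day with many pushes can lower the factor even if the raw failure count goes up.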

Next there is the Heatmap.  This is similar to what you see on tinderboxpushlog, except it is color coded by the number of failures.

Overall HeatMap

From the HeatMap, you can click on a specific value to see more details about that test run (in the time range).  For example, here is OSX MoOth:

OSX MOth Testrun

Ok, this is really cool.  You can click on each day to filter down to that specific day.  Also, at the top, you see the drop-down select boxes.  This is super awesome because you can slice and dice the data to view it just how you want.

Next I want to show you what the view looks like for a specific day.  On the left hand side of the webpage is a calendar; you can click any day (I clicked Sept 11th) or click the day on a test run or orange factor graph (hover your mouse over the graph and a link will show up).

Daily Test Results for all tests by Platform

You should get the point that there are many ways to view the data.  Actually probably too much information!  So lately we have been working on some bug-centric views.  To start off with, we have a topfails style report, but this is based on bugs, not failures in log files.  To get here, click on the “Research and Top Bugs” link on the right hand side of the page.  Here is a “weekly” view that shows the top 5 bugs per week:

Top 5 Bugs every 7 days

Hover over the color bars to see the bug number and research it in more detail.  Here is what you see when viewing a specific bug (544601):

Individual Bug Graph over Time

Orange Factor has much more to offer, just poke around and see how you can make it useful.  Feedback is welcome, and feel free to ask any questions in #ateam!

Filed under testdev

tests that require privileged access

I have been working on a project to get mochitests running on a build of Fennec + electrolysis.  In general, you can follow along in bug 567417.

One of the large TODO items in getting the tests to run is actually fixing the tests which use UniversalXPConnect.  So my approach was to grep through a mochitest tests/ directory for @mozilla and parse it out.  With a few corner cases, this resulted in a full list of the services we utilize from our tests (a list sorted by frequency, 76 services in total).  Cool, but that didn’t seem useful enough.  Then I took the work I have done for filtering (the json file) and cross referenced that with my original list of tests that use UniversalXPConnect.

Now I have a list of 59 services which all should pass in Fennec (a mozilla-central build from 2 weeks ago on n900), along with the filename of the first test which utilizes each service!
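The grep-and-count step could look something like the Python sketch below. It assumes services show up in the tests as contract IDs of the form @mozilla.org/...;1; the regular expression, the function name and the two sample test files are made up for illustration, and the real corner cases are not handled.

```python
import re
from collections import Counter

# Assumed shape of a contract ID, e.g. "@mozilla.org/preferences-service;1".
CONTRACT_ID = re.compile(r"@mozilla\.org/[\w./-]+;\d+")

# Hypothetical test sources: file name -> file contents.
SOURCES = {
    "test_a.html": 'Components.classes["@mozilla.org/preferences-service;1"]',
    "test_b.html": (
        'Cc["@mozilla.org/preferences-service;1"];'
        'Cc["@mozilla.org/network/io-service;1"]'
    ),
}


def count_services(sources):
    """Return (use count per service, first file name that uses each service)."""
    counts = Counter()
    first_seen = {}
    for filename in sorted(sources):
        for contract_id in CONTRACT_ID.findall(sources[filename]):
            counts[contract_id] += 1
            first_seen.setdefault(contract_id, filename)
    return counts, first_seen


counts, first_seen = count_services(SOURCES)
by_frequency = counts.most_common()  # most used service first
```

In practice the SOURCES mapping would be filled by walking the tests/ directory and reading each file.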

What else would be useful?

Filed under testdev

accessing privileged content from a normal webpage, request for example

One project which I am working on is getting mochitests to run in fennec + electrolysis.  This is a project because we no longer allow web pages to access privileged information in the browser.  Mochitests in the simplest form use file logging and event generation.

The communication channel that is available between the ‘chrome’ and ‘content’ process is the messageManager protocol.  There are even great examples of using this in the unit tests for ipc.  Unfortunately I have not been able to load a normal web page and have my extension, which uses messageManager calls, interact with it.

I think what would be nice to see is a real end to end example of an extension that would demonstrate functionality on any given webpage.  This would be helpful to all addon authors, not just me :)  If I figure this out, I will update with a new blog post.

Filed under testdev