Category Archives: general

Working towards a productive definition of “intermittent orange”

Intermittent Oranges (tests which fail sometimes and pass other times) are an ever-increasing problem with test automation at Mozilla.

While there are many common causes of failures (bad tests, the environment/infrastructure we run on, and bugs in the product), we still do not have a clear definition of what we view as intermittent. Some common statements I have heard:

  • It’s obvious, if it failed last year, the test is intermittent
  • If it failed 3 years ago, I don’t care, but if it failed 2 months ago, the test is intermittent
  • I fixed the test to not be intermittent, I verified by retriggering the job 20 times on try server

These imply very different definitions of what is intermittent. A definition will need to:

  • determine if we should take action on a test (programmatically or manually)
  • define policy sheriffs and developers can use to guide work
  • guide developers to know when a new/fixed test is ready for production
  • provide useful data to release and Firefox product management about the quality of a release

To get a clear definition of what we are working with, I looked over 6 months (2016-04-01 to 2016-10-01) of OrangeFactor data (7330 bugs, 250,000 failures) to find patterns and trends. I was surprised at how many bugs had <10 instances reported (3310 bugs, 45.2%). Likewise, I was surprised that such a small number of bugs (1236) account for >80% of the failures. It made sense to look at things daily, weekly, monthly, and every 6 weeks (our typical release cycle). After much slicing and dicing, I have come up with 4 buckets:

  1. Random Orange: this test has failed, even multiple times in history, but in a given 6 week window we see <10 failures (45.2% of bugs)
  2. Low Frequency Orange: this test might fail up to 4 times in a given day, but typically <=1 time per day; in a 6 week window we see <60 failures (26.4% of bugs)
  3. Intermittent Orange: fails up to 10 times/day or <120 times in 6 weeks.  (11.5% of bugs)
  4. High Frequency Orange: fails >10 times/day on many days and is often seen in try pushes. (16.9% of bugs, or 1236 bugs)

Alternatively, we could simplify our definitions and use:

  • low priority or not actionable (buckets 1 + 2)
  • high priority or actionable (buckets 3 + 4)
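
As a rough illustration, the buckets can be expressed as a small classifier. This is only a sketch: it uses the 6 week failure totals alone and ignores the per-day peaks, it treats anything at or above 120 failures as bucket 4 (an assumption, since bucket 4 is described above in terms of daily failures), and the names `BUCKETS` and `classifyOrange` are hypothetical.

```javascript
// Upper bounds on failures per 6 week window, taken from the bucket list above.
const BUCKETS = [
  { id: 1, name: "Random Orange", below: 10 },
  { id: 2, name: "Low Frequency Orange", below: 60 },
  { id: 3, name: "Intermittent Orange", below: 120 },
  // Assumption: everything at or above 120 failures lands in bucket 4.
  { id: 4, name: "High Frequency Orange", below: Infinity },
];

// Returns the bucket for a bug, plus the simplified actionable flag
// (buckets 1 + 2 are low priority, buckets 3 + 4 are actionable).
function classifyOrange(failuresInSixWeeks) {
  if (!Number.isInteger(failuresInSixWeeks) || failuresInSixWeeks < 0) {
    throw new RangeError("failure count must be a non-negative integer");
  }
  const bucket = BUCKETS.find((b) => failuresInSixWeeks < b.below);
  return { id: bucket.id, name: bucket.name, actionable: bucket.id >= 3 };
}
```

For example, `classifyOrange(75)` falls into Intermittent Orange and is flagged as actionable.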

Does defining these buckets by the number of failures in a given time window help with what we are trying to solve with the definition?

  • Determine if we should take action on a test (programmatically or manually):
    • ideally buckets 1/2 can be detected programmatically with autostar and removed from our view, possibly rerunning to validate that it isn’t a new failure.
    • buckets 3/4 have the best chance of reproducing; we can run them in debuggers (like ‘rr’), or triage to the appropriate developer when we have enough information
  • Define policy sheriffs and developers can use to guide work
    • sheriffs can know when to file bugs (either buckets 2 or 3 as a starting point)
    • developers understand the severity based on the bucket.  Ideally we will need a lot of context, but understanding severity is important.
  • Guide developers to know when a new/fixed test is ready for production
    • If we fix a test, we want to ensure it is stable before we make it tier-1. A developer can do the math from roughly 300 commits/day to work out how many passing runs are needed before calling the test stable.
    • NOTE: SETA and coalescing ensure we don’t run every test for every push, so we more likely see about 100 test runs/day
  • Provide useful data to release and Firefox product management about the quality of a release
    • Release Management can take the OrangeFactor into account
    • new features might be required to have a certain volume of tests <= Random Orange
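
One way to do the "is it stable yet" math is sketched here. It assumes every run fails independently with the same probability, which is a simplification I am introducing (real intermittents often cluster by platform or load), and the function names are hypothetical. If a test fails with probability p per run, the chance of n clean runs in a row is (1 - p)^n, so we can ask how many clean runs are needed before a given failure rate becomes unlikely.

```javascript
// Numbers from the post: about 100 runs/day after SETA and coalescing,
// and a 6 week (42 day) window.
const RUNS_PER_DAY = 100;
const WINDOW_DAYS = 42;

// Per-run failure rate implied by a failure count over the window.
function impliedFailureRate(failures, runsPerDay = RUNS_PER_DAY, days = WINDOW_DAYS) {
  return failures / (runsPerDay * days);
}

// Smallest n such that a test failing at `failureRate` would show at least
// one failure in n runs with probability >= confidence.
function runsNeeded(failureRate, confidence = 0.95) {
  if (!(failureRate > 0 && failureRate < 1)) {
    throw new RangeError("failureRate must be between 0 and 1 (exclusive)");
  }
  if (!(confidence > 0 && confidence < 1)) {
    throw new RangeError("confidence must be between 0 and 1 (exclusive)");
  }
  return Math.ceil(Math.log(1 - confidence) / Math.log(1 - failureRate));
}

// Failure rate above which `cleanRuns` consecutive passes would be
// unlikely (at the given confidence). Lower rates are not ruled out.
function smallestRateRuledOut(cleanRuns, confidence = 0.95) {
  return 1 - Math.pow(1 - confidence, 1 / cleanRuns);
}

// Boundary of the Random Orange bucket: 10 failures in 6 weeks.
const randomOrangeRate = impliedFailureRate(10);
console.log(runsNeeded(randomOrangeRate));
console.log(smallestRateRuledOut(20));
```

Under these assumptions, 20 clean retriggers only rule out failure rates above roughly 14%, while showing that a test sits at or below the Random Orange boundary takes on the order of a thousand clean runs.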

One other way to look at this is what gets put in bugs (by the war on orange Bugzilla robot). There are simple rules:

  • 15+ times/day – post a daily summary (bucket #4)
  • 5+ times/week – post a weekly summary (bucket #3/4 – about 40% of bucket 2 will show up here)
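
These two rules are simple enough to write down directly. The sketch below is not the robot's actual code; the function name and the input shape are hypothetical, and it assumes the two rules are evaluated independently so a bug can qualify for both summaries.

```javascript
// Thresholds from the rules above.
const DAILY_THRESHOLD = 15; // failures in one day
const WEEKLY_THRESHOLD = 5; // failures in one week

// Returns which summaries a bug would receive for the given counts.
function summariesToPost({ failuresToday, failuresThisWeek }) {
  const summaries = [];
  if (failuresToday >= DAILY_THRESHOLD) {
    summaries.push("daily");
  }
  if (failuresThisWeek >= WEEKLY_THRESHOLD) {
    summaries.push("weekly");
  }
  return summaries;
}
```

A bug with 2 failures today and 6 this week would only get the weekly summary.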

Lastly, I would like to cover some exceptions and how some might see this as flawed:

  • missing or incorrect data in OrangeFactor (human error)
  • some issues have many bugs but a single root cause, so we could miscategorize a fixable issue

I do not believe adjusting a definition will fix the above issues; possibly different tools or methods to run the tests would reduce the concerns there.


Filed under general, testdev, Uncategorized

5 days in Portland with Mozillians and 10 great things that came from it

I took a lot of notes in Portland last week.  One might not know that based on the fact that I talked so much my voice ran out of steam by the second day.  Either way, in chatting with some co-workers yesterday about what we took away from Portland, I realized that there is a long list of awesomeness.

Let me caveat this by saying that some of these ideas have been talked about in the past, but despite our efforts to work with others and field interesting and useful ideas, there is a big list of great things that came to light while chatting in person:

  • :bgrins mentioned a mozscreenshot tool and the need for getting a screenshot of new features in development on various platforms so UX can review the changes.  Currently it is a method of asking UX to download the build from try or some other location and run it locally to see the changes.
  • :heycam/:jwatt – had a great and interesting Talos discussion, mostly around how to run it and validate patches/fixes locally and on try server. (check out bug 1109243)
  • :glandium is looking at doing some changes (I recall something with build/pgo) and wanted to know how to compare some Talos numbers to help make the right decision – this can be done with either bug 1109243, or the existing in the Talos repo (we might need some cleanup on this)
  • :bobowen has been working to get csb tests working. After chatting in line to board a plane, it became clear he needs to solve some finer-grained test selection problems, many of which the ateam has on a roadmap in Q2/Q3 – I see some tighter collaboration happening here.
  • Thanks to chatting with :lsblakk, I am motivated to expand the talos sheriff team and look for dedicated Mozillians (or soon to become Mozillians) to work with in keeping a lid on the alerts and overall state of performance (based on what we measure).
  • :lightsofapollo had a great conversation with me about TaskCluster and what barriers stood in the way of running Talos on it – this will result in some initial investigation work!
  • :kats was asking me how to generate alerts for  This is very doable via posting data to graph server
  • After a good session on how to handle intermittents (seems like the same people have this conversation every time a bunch of Mozillians get together), I am motivated to push Titanic further to find the root cause of an intermittent via brute force retriggers (ideally on weekends).  In fact :dbaron has done this a few times in the last month and so have the sheriffs.  This is similar to what we do to verify a talos regression, just with some different parameters.
  • The same conversation about intermittents yielded a stronger desire to look at new tests coming into the system and validate their stability. The simple solution is to run the job 100 times, verify that the new test didn’t have issues and then leave it alone. Of course we could get smart and do this for all test_* files that are edited in the tree. Thanks to :ehsan for spawning this conversation.
  • Discussing the idea of a Talos Sheriff with a few folks, it seems like further conversations are needed with the existing Sheriff team, as well as a chat with :vladan and :avih about what type of policy we should have for existing performance failures which are detected. I would expect some changes to be made early next year as we have more tests and need more help. My initial thoughts are specifically about responding to regressions or getting backed out in XX hours. Yeah, that sounds nasty, but there are probably cut and dry parameters we can set and start enforcing.

Those are 10 specific topics that, despite everybody knowing how to contact me or the ateam to share great ideas or frustrations, only came out of being in the same place at the same time.

Thinking through this, when I see these folks in a real office while working from there for a few days or a week, it seems as though the conversations are smaller and not as intense.  Usually just small talk whilst waiting for a build to finish.  I believe the idea where we are not expected to focus on our day to day work and instead make plans for the future is the real innovation behind getting these topics surfaced.


Filed under general

Say hi to Kaustabh Datta Choudhury, a newer Mozillian

A couple months ago I ran into :kaustabh93 online as he had picked up a couple good first bugs.  Since then he has continued to work very hard and submit a lot of great pull requests to Ouija and Alert Manager (here is his github profile).  After working with him for a couple of weeks, I decided it was time to learn more about him, and I would like to share that with Mozilla as a whole:

Tell us about where you live-

I live in a town called Santragachi in West Bengal. The best thing about this place is its ambience. It is not at the heart of the city but the city is easily accessible. That keeps the maddening crowd of the city away and a calm and peaceful environment prevails here.

Tell us about your school-

 I completed my schooling from Don Bosco School, Liluah. After graduating from there, now I am pursuing an undergraduate degree in Computer Science & Engineering from MCKV Institute of Engineering.

Right from when it was introduced to me, I was in love with the subject ‘Computer Science’. And introduction to coding was one of the best things that has happened to me so far.

Tell us about getting involved with Mozilla-

I was looking for some exciting real life projects to work on during my vacation & it was then that the idea of contributing to open source projects had struck me. Now I have been using Firefox for many years now and that gave me an idea of where to start looking. Eventually I found the volunteer tab and thus started my wonderful journey on Mozilla.

Right from when I was starting out, till now, one thing that I liked very much about Mozilla was that help was always at hand when needed. On my first day , I popped a few questions in the IRC channel #introduction & after getting the basic of where to start out, I started working on Ouija under the guidance of ‘dminor’ & ‘jmaher’. After a few bug fixes there, Dan recommended me to have a look at Alert Manager & I have been working on it ever since. And the experience of working for Mozilla has been great.

Tell us what you enjoy doing-

I really love coding. But apart from it I also am an amateur photographer & enjoy playing computer games & reading books.

Where do you see yourself in 5 years?

In 5 years’ time I prefer to see myself as a successful engineer working on innovative projects & solving problems.

If somebody asked you for advice about life, what would you say?

Rather than following the crowd down the well-worn path, it is always better to explore unchartered territories with a few.

:kaustabh93 is back in school as of this week, but look for activity on bugzilla and github from him. You will find him online once in a while in various channels; I usually find him in #ateam.


Filed under Community, general

Mozilla A-Team – Unraveling the mystery of Talos – part 1 of a googol

Most people at Mozilla have heard of Talos. If you haven’t, Talos is the performance testing framework that runs for every checkin at Mozilla.

Over the course of the last year I have had the opportunity to extend, modify, retrofit, and rewrite many parts of the harness and tests that make up Talos.

It seems that once or twice a month I get a question about Talos.  Wouldn’t it be nice if I documented Talos?  When Alice was the main owner of Talos, she had written up some great documentation and as of today I am announcing that it has gone through an update:

Stay tuned as there will be more updates to come as we make the documentation more useful.


Filed under general, testdev

Two failed attempts with technology today, just one of those days

Today I experienced two WTF moments while trying to use computers:

1) BrowserID ended up being a total failure for me

2) Accessing is next to impossible when trying to share files across computers

I have heard great things about BrowserID, and today was my first real chance at it.  I had an account on, and this was with my <me> email address.  It has been a few months since I had been on there and now it uses BrowserID for all access.  Great!!  I had signed up with BrowserID with my <me> address, but that failed to log me in.  So I clicked the ‘add another email address’, and got a verification email in my inbox.  Trying to verify was impossible with some cryptic error messages.  10 minutes later after trying to log in, I finally found my way to #identity and was told to try it again.  It magically worked.  OK, let me log in to my addons account, no luck.  After 15 more minutes of poking around, I found that my email address worked with BrowserID just fine by testing it on another site, but it still failed on addons.

Here is my take of the problem:

  • BrowserID is supposed to make logging in easier, but after 30 minutes of debugging I still cannot log in.
  • There are no useful error or help messages on the BrowserID site or on AMO. How could my mom figure this out?
  • Where in the world is my ‘I forgot my username/password’ link? Honestly I could have signed up on AMO with a totally random email address and could have been wasting a lot of time.
  • I found it easier to sign up as a new user with a different BrowserID email than to figure out how to log in with my normal account.

My next problem occurs with accessing  I have been using this for 3.5 years on a regular basis.  I put log files up there for people to read, zip files when I want to share some code or an build, and sometimes I create a webpage to outline data.  I depend on this as a workflow since I know of no other file server at mozilla that I can just scp files up to.  Just this past weekend, some work was done on the server and the permissions got messed up.  This was fixed, then it wasn’t, it was fixed and now it isn’t.  I can detect patterns and that is a pretty easy pattern to detect.  What really gets me is this message when I log in:

Last login: Thu May 17 18:41:20 2012 from
All files stored on this server are subject to automated scans.
You shouldn’t store sensitive information on this server, and you should
avoid having production services depend on data stored here.
Files in ~/public_html may be seen by anyone on the Internet.
[jmaher@people1.dmz.scl3 ~]$

Who in their right mind would think that files put in a folder called ‘public_html’ would not be seen by anyone on the Internet? I expect tomorrow I will have to sign an NDA to access my account.

The big problem here is that I wasted 20 minutes doing a task that I normally do in 2 minutes and delayed getting a perma red test fixed because I couldn’t find a place to upload a fix to.

Enough complaining and ranting and back to work on reftests for android native!


Filed under general, personal, reviews

Professional Development, Improv and your audience

I had the opportunity to attend some really exciting professional development sessions at the All Hands.  Personally I found these very interesting, but I heard a lot of grumbling about how these are not adding a lot of value or of interest.

One reason I found these interesting is that in a previous life I had attended a few years of Improv acting classes and did a short stint of real onstage Improv acting.  In looping back to these professional development sessions, they reminded me of the core concepts we learned in Improv 101.  So if you felt that you missed out, sign up for an Improv class.  Maybe if there are professional development sessions at a future event they could just have an Improv acting class.

Related to the professional development courses, I found that most of these were sparsely attended. From those who did attend, the courses received great reviews/ratings. To be fair, the technical tracks that I attended had about the same attendance as the professional development tracks. Maybe we are not creating sessions that are of interest to our audience? I know for the technical tracks we just propose something and it magically becomes a session. I don’t recall being asked for any input on what sessions would be available to me. Maybe in the future we can do a better job of getting input from the community (a.k.a. audience)!


Filed under general, reviews

mochikit.jar changes are in mozilla central

Last night we landed the final patches to make mochikit.jar a reality. This started out as a bug where we would package all the mochikit harness + chrome tests into a single .jar file and then on a remote system copy that to the application directory and run the tests locally. It ended up being much more than that, so let me explain some of the changes that have taken place.

why change all of this?

In order to test remotely (on mobile devices such as Windows Mobile and Android), where there are no options to run tools and Python scripts, we need to put everything the browser needs inside the browser and launch the browser remotely. The solution for tests that are not accessible over the network is to run them from local files.

what is mochikit.jar?

Mochikit.jar is an extension that is installed in the profile and contains all the core files that mochitest (plain, chrome, browser-chrome, a11y) needs to run in a browser. This doesn’t contain any of the external tools such as ssltunnel and python scripts to set up a webserver. When you do a build, you will see a new directory in $(objdir)/_tests/testing/mochitest called mochijar. Feel free to poke around there. As a standalone application all chrome://mochikit/content calls will use this extension, not a pointer to the local file system. The original intention of mochikit.jar was to include test data, but we found that to create an extension using we needed a concrete list of files and that was not reasonable to do for our test files. So we created tests.jar.

what is tests.jar?

tests.jar is the actual test data for browser-chrome, chrome, and a11y tests. These are all tests that are not straightforward to access remotely over http, so we are running these locally out of a .jar file. tests.jar is only created when you do a ‘make package-tests’ and ends up in the root of the mochitest directory as tests.jar. If the harness finds this file, it copies it to the profile and generates a .manifest file for the .jar file; otherwise we generate a plain .manifest file to point to the filesystem. Finally we dynamically register tests.manifest from the profile. Now all your tests will have a chrome://mochitests/content instead of chrome://mochikit/content.
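
To make the two cases concrete, here is a minimal sketch of the kind of line that could end up in tests.manifest. It is not the harness code: the helper name, its parameters, and the content/ path inside the jar are assumptions for illustration. Only the general chrome manifest form (content, package name, location) and the mochitests package name come from the description above.

```javascript
// Hypothetical helper: builds the single "content" registration line that
// maps chrome://mochitests/content/ either into tests.jar or onto a
// directory on the local filesystem.
function testsManifestLine({ jarName = null, testsDir }) {
  if (jarName) {
    // Assumption: test files live under content/ inside the jar.
    return `content mochitests jar:${jarName}!/content/`;
  }
  // No jar available: point straight at the test directory on disk.
  const dir = testsDir.endsWith("/") ? testsDir : testsDir + "/";
  return `content mochitests file://${dir}`;
}

console.log(testsManifestLine({ jarName: "tests.jar", testsDir: "/tmp/profile" }));
console.log(testsManifestLine({ testsDir: "/objdir/_tests/testing/mochitest" }));
```

Either way the tests see the same chrome://mochitests/content/ prefix, which is why test code should not care whether it is running from a jar or from the filesystem.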

What else changed?

A lot of tests had to change to work with this because we had hardcoded chrome://mochikit/content references in our test code and data.  It is fine to have that in there for references to the harness and core utilities, but to reference a local piece of test data, it was hardcoded and didn’t need to be.  A few tests required some more difficult work where we had to extract files temporarily to a temp folder in the profile and reference them with a file path.

what do I need to do when writing new tests?

Please don’t cut and paste code and then change it to reference a data, utility, or other URI that has chrome://mochikit/content/ in it. If you need to access a file with the full URI or as a file path, here are some tips:

* a mochitest-chrome test that needs to reference a file in the same directory or subdir:
let chromeDir = getRootDirectory(window.location.href);

* a browser-chrome test that needs to reference a file in the same directory or subdir:
//NOTE: gTestPath is set because window.location.href is not always set on browser-chrome tests
let chromeDir = getRootDirectory(gTestPath);

* extracting files to temp and accessing them

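  let rootDir = getRootDirectory(gTestPath);
  // getJar() returns a falsy value when the tests are not running from a .jar
  let jar = getJar(rootDir);
  if (jar) {
    // extract the jar contents to a temp folder and use a file:// path instead
    let tmpdir = extractJarToTmp(jar);
    rootDir = "file://" + tmpdir.path + "/";
  }
  loader.loadSubScript(rootDir + "privacypane_tests.js", this);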


Filed under general, testdev