Category Archives: Uncategorized

is a phone too hard to use?

Working at Mozilla, I get to see a lot of great things.  One of them is collaborating with my team (as we are almost all remoties) and I have been doing that for almost 6 years.  Sometime around 3 years ago we switch to using Vidyo as a way to communicate in meetings.  This is great, we can see and hear each other.  Unfortunately heartbleed came out and affects Mozilla’s Vidyo servers.  So yesterday and today we have been without Vidyo.

Now I am getting meeting cancellation notices, why are we cancelling meetings?  Did meetings not happen 3 years ago?  Mozilla actually creates an operating system for a … phone.  In fact our old teleconferencing system is still in place.  I thought about this earlier today and wondered why we are cancelling meetings.  Personally I always put Vidyo in the background during meetings and keep IRC in the foreground.  Am I a minority?

I am not advocating for scrapping Vidyo, instead I would like to attend meetings, and if we find they cannot be held without Vidyo, we should cancel them (and not reschedule them). 

Meetings existed before Vidyo and Open Source existed before GitHub, we don’t need the latest and greatest things to function in life. Pick up a phone and discuss what needs to be discussed.

3 Comments

Filed under Uncategorized

polishing browser-chrome – coming to a branch near you soon

The last 2 weeks I have gone head first into a world of resolving some issues with our mochitest browser-chrome tests with RyanVM, Armen, and the help of Gavin and many developers who are fixing problems left and right.

There are 3 projects I have been focusing on:

1) Moving our Linux debug browser chrome tests off our old fedora slaves in a datacenter and running them on ec2 slave instances, in bug 987892.

These are live and green on all Firefox 29, 30, and 31 trees!  More work is needed for Firefox-28 and ESR-24 which should be wrapped up this week.  Next week we can stop running all linux unittests on fedora slaves.

2) Splitting all the developer tools tests out of the browser-chrome suite into their own suite in bug 984930.

browser-chrome tests have been a thorn in the side of the sheriff team for many months.  More and more the rapidly growing features and tests of developer tools have been causing the entire browser-chrome suite to fail, in cases of debug to run for hours.  Splitting this out gives us a small shield of isolation.  In fact, we have this running well on Cedar, we are pushing hard to have this rolled out to our production and development branches by the end of this week!

3) Splitting the remaining browser chrome tests into 3 chunks, in bug 819963.

Just like the developer tools, we have been running browser-chrome in 3 chunks on Cedar.  With just 7 tests disabled, we are very green and consistently green. 

 

 

While there are a lot of other changes going on under the hood, what will be seen by next week on your favorite branch of Firefox will be:

  • ‘dt’ jobs for opt, and ‘dt1′, ‘dt2′, ‘dt3′ jobs for debug
  • ‘bc’ job will turn into ‘bc1′, ‘bc2′, ‘bc3′
  • much faster turnaround times on bc tests (62 minutes is the slowest right now, the rest are averaging ~20 minutes/job)
  • less random orange cluttering up results

 

Leave a comment

Filed under Uncategorized

Performance Bugs – How to stay on top of Talos regressions

Talos is the framework used for desktop Firefox to measure performance for every patch that gets checked in.  Running tests for every checkin on every platform is great, but who looks at the results?

As I mentioned in a previous blog post, I have been looking at the alerts which are posted to dev.tree-management, and taking action on them if necessary.  I will save discussing my alert manager tool for another day.  One great thing about our alert system is that we send an email to the original patch author if we can determine who it is.  What is great is many developers already take note of this and take actions on their own.  I see many patches backed out or discussed with no one but the developer initiating the action.

So why do we need a Talos alert sheriff?  For the main reason that not even half of the regressions are acted upon.  There are valid reasons for this (wrong patch identified, noisy data, doesn’t seem related to the patch) and of course many regressions are ignored due to lack of time.  When I started filing bugs 6 months ago, I incorrectly assumed all of them would be fixed or resolved as wontfix for a valid reason.  This happens for most of the bugs, but many regressions get forgotten about.

When we did the uplift of Firefox 30 from mozilla-central to mozilla-aurora, we saw 26 regression alerts come in and 4 improvement alerts.  This prompted us to revisit the process of what we were doing and what could be done better.  Here are some of the new things we will be doing:

  • For all regressions found, attempt to find the original bug and reopen/comment in the bug
  • For some regressions that it is not easy to find the original bug, we will open a new bug
  • All bugs that have regression information will be marked as blocking a new tracking bug
  • For each release we will create a new tracking bug for all regressions
  • After an uplift from central->aurora, we will ensure we have all alerts mapped to existing regressions

As this process goes through a cycle or two, we will refine it to ensure we have less noise for developers and more accuracy in tracking regressions faster

 

Leave a comment

Filed under Uncategorized

mochitests and manifests

Of all the tests that are run on tbpl, mochitests are the last ones to receive manifests.  As of this morning, we have landed all the changes that we can to have all our tests defined in mochitest.ini files and have removed the entries in b2g*.json, by putting entries in the appropriate mochitest.ini files.

Ahal, has done a good job of outlining what this means for b2g in his post.  As mentioned there, this work was done by a dedicated community member :vaibhav1994 as he continues to write patches, investigate failures, and repeat until success.

For those interested in the next steps, we are looking forward to removing our build time filtering and start filtering tests at runtime.  This work is being done by billm in bug 938019.  Once that is landed we can start querying which tests are enabled/disabled per platform and track that over time!

Leave a comment

Filed under Uncategorized

quick tip – when all else fails – “reseat”

While chatting with dminor the other day he mentioned his camera stopped working and after a reboot there was no mention of the camera hardware in the logs or via dmesg.  His conclusion, the camera was not working.  Since I have the same hardware and run Ubuntu 13.10 as he does he wanted a sanity check.  My only suggestion was to turn off the computer, unplug it and take the battery out, wait 30 seconds then reassemble and power on.

Hey my suggestion worked and now dminor has a working camera again. 

This general concept of reseating hardware is something that is easily forgotten, yet is so effective.

1 Comment

Filed under Uncategorized

Where did all the good first bugs go?

As this is the short time window of Google Summer of Code applications, I have seen a lot of requests for mochitest related bugs to work on.  Normally, we look for new bugs on the bugs ahoy! tool.  Most of these have been picked through, so I spent some time going through a bunch of mochitest/automation related bugs.  Many of the bugs I found were outdated, duplicates of other things, or didn’t apply to the tools today.

Here is my short list of bugs to get more familiar with automation while fixing bugs which solve real problems for us:

  • bug 958897 – ssltunnel lives if mochitest killed
  • Bug 841808 – mozfile.rmtree should handle windows directory in use better
  • Bug 892283 – consider using shutil.rmtree and/or distutils remove_tree for mozfile
  • Bug 908945 – Fix automation.py’s exit code handling
  • Bug 912243 – Mochitest shouldnt chdir in __init__
  • Bug 939755 – With httpd.js we sometimes don’t get the most recent version of the file

I have added the appropriate tags to those bugs to make them good first bugs.  Please take time to look over the bug and ask questions in the bug to get a full understanding of what needs to be done and how to test it.

Happy hacking!

Leave a comment

Filed under Uncategorized

Hi Vaibhav Agrawal, welcome to the Mozilla Community!

I have had the pleasure to work with Vaibhav for the last 6 weeks as he has joined the Mozilla community as a contributor on the A-Team.  As I have watched him grow in his skills and confidence, I thought it would be useful to introduce him and share a bit more about him.

From Vaibhav:

What is your background?

I currently reside in Pilani, a town in Rajasthan, India. The thing that I like the most about where I live is the campus life. I am surrounded by some awesome and brilliant people, and everyday I learn something new and interesting.

I am a third year student pursuing Electronics and Instrumentation at BITS Pilani. I have always loved and been involved in coding and hacking stuff. My favourite subjects so far have been Computer Programming and Data Structures and Algorithms. These courses have made my basics strong and I find the classes very interesting.

I like to code and hack stuff, and I am an open source enthusiast. Also, I enjoy solving algorithmic problems in my free time. I like following new startups and the technology they are working on, running, playing table tennis, and I am a cricket follower.

How did you get involved with Mozilla?

I have been using Mozilla Firefox for many years now. I have recently started contributing to the Mozilla community when a friend of mine encouraged me to do so. I had no idea how the open source community worked and I had the notion that people generally do not have time to answer silly questions and doubts of newcomers. But guess what? I was totally wrong. The contributors in Mozilla are really very helpful and are ready to answer every trivial question that a newbie faces.

Where do you see yourself in 5 years?

I see myself as a software developer solving big problems, building great products and traveling different places!

What advice would you give someone?

Do what you believe in and have the courage to follow your heart and instincts. They somehow know what you truly want to become.

In January :vaibhav1994 popped into IRC and wanted to contribute to a bug that I was a mentor on.  This was talos bug 931337, from that first bug Vaibhav has fixed many bugs including work on finalizing our work to support manifests in mochitest.  He wrote a great little script to generate patches for bugs 971132 and 970725.

Say hi in IRC and keep an eye out for bugs related to automation where he is uploading patches and fixing many outstanding issues!

 

1 Comment

Filed under Uncategorized

tracking talos alerts across branches

A year without blogging and I am back.  I figured there was some cool stuff to share, here is one tidbit.

In the last year I have picked up looking at talos results and filing regression bugs for results.  This has been useful.  What currently happens is when results are submitted to g.m.o (graph server) we detect a regression and send out an email to the original patch author (if we can determine it) and post to mozilla.dev.tree-management.  I have been using dev.tree-management as a starting point for my hunting regressions.  When things are busy it can eat up a couple hours in a day.  Luckily many developers are responsible in taking action when they receive the emails.

Given that at least half of the regressions are not acted upon by the original developer, it is important to read the newsgroup. One of the things which makes it frustrating is that for a single regression we can get multiple alerts (regular builds vs pgo builds and as the patch merges between branches/projects).

To make my life easier, I have taken all the alerts on dev.tree-management and put them in a database (local right now).  The final goal is a webUI that lets me easily annotate these alerts similar to tbpl for random test failures.  One thing I wanted to do was help identify duplicate alerts.  Today in my attempt I had a clear picture of what the lifecycle of a regression looks like:

mysql> select date,branch,percent,keyrevision from alerts where test=’Paint’ and platform=’WINNT 6.2 x64′ order by date ASC;
+———————+————————-+———+————–+
| date                | branch                  | percent | keyrevision  |
+———————+————————-+———+————–+
| 2014-02-14 19:41:38 | Mozilla-Inbound-Non-PGO | 10.1%   | c7802c9d6eec |
| 2014-02-15 01:03:54 | Fx-Team-Non-PGO         | 9.53%   | 7a3adc5aac28 |
| 2014-02-15 21:43:48 | Mozilla-Inbound         | 10.6%   | c7802c9d6eec |
| 2014-02-16 03:46:12 | Firefox-Non-PGO         | 8.88%   | 5d7caa093f4f |
| 2014-02-16 03:46:13 | B2g-Inbound-Non-PGO     | 9.44%   | 071885f79841 |
| 2014-02-16 14:22:38 | Fx-Team                 | 10.4%   | 7a3adc5aac28 |
| 2014-02-17 04:42:57 | B2g-Inbound             | 10.7%   | 071885f79841 |
| 2014-02-18 11:43:54 | Firefox                 | 9.76%   | eac89fb04bb9 |
+———————+————————-+———+————–+
8 rows in set (0.00 sec)

This is really cool to see how 1 change can generate alerts for 4 days.

Stay tuned for more information on this and other topics!

Leave a comment

Filed under Uncategorized

What is so special about 267194?

Today I landed the final patch for bug 761125.  This is a huge milestone for Firefox Android automation as we had been running 23,531 checks for each push by running known good directories of tests but now we will be running 290,725 checks by running ALL test cases except the 282 known test cases which fail, timeout, hang, crash.

I fully expect a lot of new random failures to crop up, this is just the nature of automated testing.

For the PI geeks out there, 267194 can be found at the 2051st digit of PI :)

Leave a comment

Filed under Uncategorized

Android unittests now have LOLcat

One common critcism of the android unittests is there is no logcat information.  I could go on for a long time about the decisions made from day 1 up to today, but the main thing is we have found adb to be very unreliable and we connect via telnet to a SUT agent on the tegra.  No usb cable or active adb session for all the automation we do.  This means collecting logcat information is not as easy as issuing a ‘adb logcat > testrun.log’. 

Today I landed bug 754873 which collects logcat information and puts it in the log file generated by tinderbox.  This does not capture the entire logcat session, but it does filter out the random noise that shows up in our logcat coming from tegras (dalvikvm and network wifi messages).  We will display up to 500 messages if they exist.  As logcat is a rotating queue of logs, it will be rare that we find a full 500 lines of useful information in the logs.  What this will solve is when we have a crash in Java or some other crash that is not detected by the crashreporter we will be able to see it in our tinderbox logs.

random fact: did you know you can do ‘adb lolcat’ and it will produce the same output as ‘adb logcat’

1 Comment

Filed under Uncategorized