Last week I wrote some notes about re-triggering jobs to find a root cause. This week I decided to look at the Orange Factor email of the top 10 bugs and see how I could help. Of the 10 bugs, I found 3 worth investigating and ignored the other 7:
- Bug 1163911 – test_viewport_resize.html – new test which was added 15 revisions back from the first instance in the bug. The sheriffs had already worked to get this test disabled prior to my results coming in!
- Bug 1081925 – browser_popup_blocker.js – the previous test in the directory was modified to work in e10s 4 revisions back from the first instance reported in the bug, which caused this test to fail.
- Bug 1118277 – browser_popup_blocker.js (different symptom, same test pattern and root cause as bug 1081925)
- Bug 1073442 – Intermittent command timed out; might not be code related and >30 days of history.
- Bug 1096302 – test_collapse.html | Test timed out. >30 days of history.
- Bug 1151786 – testOfflinePage. >30 days of history. (and a patch exists).
- Bug 1145199 – browser_referrer_open_link_in_private.js. >30 days of history.
- Bug 1073761 – test_value_storage.html. >30 days of history.
- Bug 1161537 – test_dev_mode_activity.html. resolved (a result from the previous bisection experiment).
- Bug 1153454 – browser_tabfocus.js. >30 days of history.
Looking at the bugs of interest, I jumped right into retriggering. This time around I did 20 retriggers for the original changeset, then went back 30 revisions, doing the same thing on every 5th one. Effectively this meant 20 retriggers each for the 0th, 5th, 10th, 15th, 20th, 25th, and 30th revisions in the history list (7 revisions, 140 retriggers in total).
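The sampling arithmetic above can be sketched in a few lines of Python. This is only an illustration of the plan, not the tool I actually ran and not mozci's API: the function name `plan_retriggers`, its parameters, and the placeholder revision ids are all made up for the example.

```python
def plan_retriggers(revisions, step=5, max_back=30, retriggers_per_revision=20):
    """Build a {revision: retrigger_count} plan (hypothetical helper).

    Assumes revisions[0] is the original changeset and later entries
    go further back in history.
    """
    # Take the 0th, 5th, ..., 30th entries of the history list.
    sampled = revisions[0:max_back + 1:step]
    return {rev: retriggers_per_revision for rev in sampled}


# Placeholder revision ids; a real run would use changeset hashes.
history = [f"rev{i}" for i in range(31)]
plan = plan_retriggers(history)

print(list(plan))          # the sampled revisions
print(sum(plan.values()))  # total number of retriggers
```

With the defaults this samples 7 revisions and schedules 20 retriggers on each, which matches the 140 total.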
I ran into issues doing this, specifically on Bug 1073761. For about 7 revisions in the history, the Windows 8 builds failed! Luckily the builds got far enough to produce a binary+tests package so we could run tests, but mozci didn’t understand that the build was available, which required some manual retriggering. In a few cases across both sets of retriggers the builds had genuinely failed, so I had to manually pick a different revision to retrigger on. After that, it was fairly easy to run my tool again and fill in the 4 missing revisions using slightly different mozci parameters.
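Picking a different revision by hand amounts to looking for the nearest neighbour that has a usable build. A minimal sketch of that idea follows, assuming we already know which revisions have a usable build; the `pick_replacement` helper and the `usable` list are hypothetical, and this is not something mozci does for you.

```python
def pick_replacement(index, usable, max_distance=2):
    """Return the index nearest to `index` whose build is usable, or None.

    `usable` is a list of booleans, one per revision in the history list
    (index 0 is the original changeset). Newer neighbours are tried
    before older ones at the same distance.
    """
    for distance in range(max_distance + 1):
        for candidate in (index - distance, index + distance):
            if 0 <= candidate < len(usable) and usable[candidate]:
                return candidate
    return None


# Pretend revisions 9 and 10 had real build failures.
usable = [True] * 31
usable[9] = False
usable[10] = False

replacement = pick_replacement(10, usable)
print(replacement)
```

Capping the search with `max_distance` keeps the replacement close to the intended sample point, so the every-5th spacing stays roughly intact.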
This was a bit frustrating, as there was a lot of manual digging and retriggering due to build failures. Luckily 2 of the top 10 bugs share the same root cause and we figured it out. Including IRC chatter and this blog post, I have roughly 3 hours invested in this experiment.