Working towards a productive definition of “intermittent orange”

Intermittent Oranges (tests which fail sometimes and pass other times) are an ever increasing problem with test automation at Mozilla.

While there are many common causes for failures (bad tests, the environment/infrastructure we run on, and bugs in the product)
we still do not have a clear definition of what we view as intermittent.  Some common statements I have heard:

  • It’s obvious, if it failed last year, the test is intermittent
  • If it failed 3 years ago, I don’t care, but if it failed 2 months ago, the test is intermittent
  • I fixed the test to not be intermittent, I verified by retriggering the job 20 times on try server

These are imply much different definitions of what is intermittent, a definition will need to:

  • determine if we should take action on a test (programatically or manually)
  • define policy sheriffs and developers can use to guide work
  • guide developers to know when a new/fixed test is ready for production
  • provide useful data to release and Firefox product management about the quality of a release

Given the fact that I wanted to have a clear definition of what we are working with, I looked over 6 months (2016-04-01 to 2016-10-01) of OrangeFactor data (7330 bugs, 250,000 failures) to find patterns and trends.  I was surprised at how many bugs had <10 instances reported (3310 bugs, 45.1%).  Likewise, I was surprised at how such a small number (1236) of bugs account for >80% of the failures.  It made sense to look at things daily, weekly, monthly, and every 6 weeks (our typical release cycle).  After much slicing and dicing, I have come up with 4 buckets:

  1. Random Orange: this test has failed, even multiple times in history, but in a given 6 week window we see <10 failures (45.2% of bugs)
  2. Low Frequency Orange: this test might fail up to 4 times in a given day, typically <=1 failures for a day. in a 6 week window we see <60 failures (26.4% of bugs)
  3. Intermittent Orange: fails up to 10 times/day or <120 times in 6 weeks.  (11.5% of bugs)
  4. High Frequency Orange: fails >10 times/day many times and are often seen in try pushes.  (16.9% of bugs or 1236 bugs)

Alternatively, we could simplify our definitions and use:

  • low priority or not actionable (buckets 1 + 2)
  • high priority or actionable (buckets 3 + 4)

Does defining these buckets about the number of failures in a given time window help us with what we are trying to solve with the definition?

  • Determine if we should take action on a test (programatically or manually):
    • ideally buckets 1/2 can be detected programatically with autostar and removed from our view.  Possibly rerunning to validate it isn’t a new failure.
    • buckets 3/4 have the best chance of reproducing, we can run in debuggers (like ‘rr’), or triage to the appropriate developer when we have enough information
  • Define policy sheriffs and developers can use to guide work
    • sheriffs can know when to file bugs (either buckets 2 or 3 as a starting point)
    • developers understand the severity based on the bucket.  Ideally we will need a lot of context, but understanding severity is important.
  • Guide developers to know when a new/fixed test is ready for production
    • If we fix a test, we want to ensure it is stable before we make it tier-1.  A developer can use math of 300 commits/day and ensure we pass.
    • NOTE: SETA and coalescing ensures we don’t run every test for every push, so we see more likely 100 test runs/day
  • Provide useful data to release and Firefox product management about the quality of a release
    • Release Management can take the OrangeFactor into account
    • new features might be required to have certain volume of tests <= Random Orange

One other way to look at this is what does gets put in bugs (war on orange bugzilla robot).  There are simple rules:

  • 15+ times/day – post a daily summary (bucket #4)
  • 5+ times/week – post a weekly summary (bucket #3/4 – about 40% of bucket 2 will show up here)

Lastly I would like to cover some exceptions and how some might see this flawed:

  • missing or incorrect data in orange factor (human error)
  • some issues have many bugs, but a single root cause- we could miscategorize a fixable issue

I do not believe adjusting a definition will fix the above issues- possibly different tools or methods to run the tests would reduce the concerns there.

4 Comments

Filed under general, testdev, Uncategorized

QoC.2 – Iterations and thoughts

Quite a few weeks ago now, the Second official Quarter of Contribution wrapped up.  We had advertised 4 projects and found awesome contributors for all 4.  While all hackers gave a good effort, sometimes plans change and life gets in the way.  In the end we had 2 projects with very active contributors.

We had two projects with a lot of activity throughout the project:

First off, this 2nd round of QoC wouldn’t have been possible without the Mentors creating projects and mentoring, nor without the great contributors volunteering their time to build great tools and features.

I really like to look at what worked and what didn’t, let me try to summarize some thoughts.

What worked well:

  • building up interest in others to propose and mentor projects
  • having the entire community in #ateam serve as an environment of encouragement and learning
  • specifying exact start/end dates
  • advertising on blogs/twitter/newsgroups to find great hackers

What I would like to see changed for QoC.3:

  •  Be clear up front on what we expect.  Many contributors waited until the start date before working- that doesn’t give people a chance to ensure mentors and projects are a good fit for them (especially over a couple of months)
  • Ensure each project has clear guidelines on code expectations.  Linting, Tests, self review before PR, etc.  These are all things which might be tough to define and tough to do at first, but it makes for better programmers and end products!
  • Keep a check every other week on the projects as mentors (just a simple irc chat or email chain)
  • Consider timing of the project, either on-demand as mentors want to do it, or continue in batches, but avoid with common mentor time off (work weeks, holidays)
  • Encourage mentors to set weekly meetings and “office hours”

As it stands now, we are pushing on submitting Outreachy and GSoC project proposals, assuming that those programs pick up our projects, we will look at QoC.3 more into September or November.

 

Leave a comment

Filed under Uncategorized

QoC.2 – WPT Results Viewer – wrapping up

Quite a few weeks ago now, the Second official Quarter of Contribution wrapped up.  We had advertised 4 projects and found awesome contributors for all 4.  While all hackers gave a good effort, sometimes plans change and life gets in the way.  In the end we had 2 projects with very active contributors.

This post, I want to talk about WPT Results Viewer.  You can find the code on github, and still find the team on irc in #ateam.  As this finished up, I reached out to :martianwars to learn what his experience was like, here are his own words:

What interested you in QoC?

So I’d been contributing to Mozilla for sometime fixing random bugs here and there. I was looking for something larger and more interesting. I think that was the major motivation behind QoC, besides Manishearth’s recommendation to work on the Web Platform Test Viewer. I guess I’m really happy that QoC came around the right time!

What challenges did you encounter while working on your project?  How did you solve them?

I guess the major issue while working on wptview was the lack of Javascript knowledge and the lack of help online when it came to Lovefield. But like every project, I doubt I would have enjoyed much had I known everything required right from the start. I’m glad I got jgraham as a mentor, who made sure I worked my way up the learning curve as we made steady progress.

What are some things you learned?

So I definitely learnt some Javascript, code styling, the importance of code reviews, but there was a lot more to this project. I think the most important thing that I learnt was patience. I generally tend to search for StackOverflow answers when it I need to perform a programming task I’m unaware of. With Lovefield being a relatively new project, I was compelled to patiently read and understand the documentation and sample programs. I also learnt a bit on how a large open source community functions, and I feel excited being a part of it!  A bit irrelevant to the question, but I think I’ve made some friends in #ateam :) The IRC is like my second home, and helps me escape life’s never ending stress, to a wonderland of ideas and excitement!

If you were to give advice to students looking at doing a QoC, what would you tell them?

Well the first thing I would advice them is not to be afraid, especially of asking the so called “stupid” questions on the IRC. The second thing would be to make sure they give the project a decent amount of time, not with the aim of completing it or something, but to learn as much as they can🙂 Showing enthusiasm is the best way to ensure one has a worthwhile QoC🙂 Lastly, I’ve tried my level best to get a few newcomers into wptview. I think spreading the knowledge one learns is important, and one should try to motivate others to join open source🙂

If you were to give advice to mentors wanting to mentor a project, what would you tell them?

I think jgraham has set a great example of what an ideal mentor should be like. Like I mentioned earlier, James helped me learn while we made steady progress. I especially appreciate the way he had (has rather) planned this project. Every feature was slowly built upon and in the right order, and he ensured the project continued to progress while I was away. He would give me a sufficient insight into each feature, and leave the technical aspects to me, correcting my fallacies after the first commit. I think this is the right approach. Lastly, a quality every mentor MUST have, is to be awake at 1am on a weekend night reviewing PRs😉

Personally I have really enjoyed getting to know :martianwars and seeing the great progress he has made.

Leave a comment

Filed under Uncategorized

Introducing the contributors for the MozCI Project

As I previously announced who will be working on Pulse Guardian, the Web Platform Tests Results Explorer, and the  Web Driver Infrastructure projects, I would like to introduce the contributors for the 4th project this quarter, Mozilla CI Tools – Polish and Packaging:

* MikeLing (:mikeling on IRC) –

What interests you in this specific project?

As its document described, Mozilla CI Tools is designed to allow interacting with the various components which compose Mozilla’s Continuous Integration. So, I think get involved into it can help me know more about how Treeherder and Mozci works and give me better understanding of A-team.

What do you plan to get out of this after 8 weeks?

Keep try my best to contribute! Hope I can push forward this project with Armen, Alice and other contributors in the furture🙂

Are there any interesting facts/hobbies that you would like to share so others can enjoy reading about you?

I’m a guy who would like to keep challenge myself and try new stuff.

* Stefan (:F3real on IRC) –

What interests you in this specific project?

I thought it would be good starting project and help me learn new things.

What do you plan to get out of this after 8 weeks?

Expand my knowledge and meet new people.

Are there any interesting facts/hobbies that you would like to share so others can enjoy reading about you?

I play guitar but I don’ t think that’s really interesting.

* Vaibhav Tulsyan (:xenny on IRC) –

What interests you in this specific project?

Continuous Integration, in general, is interesting for me.

What do you plan to get out of this after 8 weeks?

I want to learn how to work efficiently in a team in spite of working remotely, learn how to explore a new code base and some new things about Python, git, hg and Mozilla. Apart from learning, I want to be useful to the community in some way. I hope to contribute to Mozilla for a long term, and I hope that this helps me build a solid foundation.

Are there any interesting facts/hobbies that you would like to share so others can enjoy reading about you?

One of my hobbies is to create algorithmic problems from real-world situations. I like to think a lot about the purpose of existence, how people think about things/events and what affects their thinking. I like teaching and gaining satisfaction from others’ understanding.

 

Please join me in welcoming all the contributors to this project and the previously mentioned ones as they have committed to work on a larger project with their free time!

Leave a comment

Filed under Community

Introducing a contributor for the WebDriver Infrastructure project

As I previously announced who will be working on Pulse Guardian and the Web Platform Tests Results Explorer, let me introduce who will be working on Web Platform Tests – WebDriver Infrastructure:

* Ravi Shankar (:waffles on IRC) –

What interests you in this specific project?

There are several. Though I love coding, I’m usually more inclined to Python & Rust (so, a “Python project” is what excited me at first). Then, my recently-developed interest in networking code (ever since my work on a network-related issue in Servo), and finally, I’m very curious about how we’re establishing the Python-JS communication and emulate user inputs.

What do you plan to get out of this after 8 weeks?

Over the past few months of my (fractional) contributions to Mozilla, I’ve always learned something useful whenever I finish working on a bug/issue. Since this is a somewhat “giant” implementation that requires more time and commitment, I think I’ll learn some great deal of stuff in relatively less time (which is what excites me).

Are there any interesting facts/hobbies that you would like to share so others can enjoy reading about you?

Well, I juggle, or I (try to) reproduce some random music in my flute (actually, a Bansuri – Indian flute) when I’m away from my keyboard.

 

We look forward to working with Ravi over the next 8 weeks.  Please say hi in irc when you see :waffles in channel🙂

Leave a comment

Filed under Community

Introducing 2 contributors for the Web Platform Tests project

As I previously announced who will be working on Pulse Guardian, let me introduce who will be working on Web Platform Tests – Results Explorer:

* Kalpesh Krishna (:martianwars on irc) –

What interests you in this specific project?

I have been contributing to Mozilla for a couple of months now and was keen on taking up a project on a slightly larger scale. This particular project was recommended to me by Manish Goregaokar. I had worked out a few issues in Servo prior to this and all involved Web Platform Tests in some form. That was the initial motivation. I find this project really interesting as it gives me a chance to help build an interface that will simplify browser comparison so much! This project seems to have more of planning rather than execution, and that’s another reason that I’m so excited! Besides, I think this would be a good chance to try out some statistics / data visualization ideas I have, though they might be a bit irrelevant to the goal.

What do you plan to get out of this after 8 weeks?

I plan to learn as much as I can, make some great friends, and most importantly make a real sizeable contribution to open source🙂

Are there any interesting facts/hobbies that you would like to share so others can enjoy reading about you?

I love to star gaze. Constellations and Messier objects fascinate me. Given a chance, I would love to let my imagination run wild and draw my own set of constellations! I have an unusual ambition in life. Though a student of Electrical Engineering, I have always wanted to own a chocolate factory (too much Roald Dahl as a child) and have done some research regarding the same. Fingers crossed! I also love to collect Rubiks Cube style puzzles. I make it a point to increase my collection by 3-4 puzzles every semester and learn how to solve them. I’m not fast at any of them, but love solving them!

* Daniel Deutsch

What interests you in this specific project?

I am really interested in getting involved in Web Standards. Also, I am excited to be involved in a project that is bigger than itself–something that spans the Internet and makes it better for everyone (web authors and users).

What do you plan to get out of this after 8 weeks?

As primarily a Rails developer, I am hoping to expand my skill-set. Specifically, I am looking forward to writing some Python and learning more about JavaScript. Also, I am excited to dig deeper into automated testing. Lastly, I think Mozilla does a lot of great work and am excited to help in the effort to drive the web forward with open source contribution.

Are there any interesting facts/hobbies that you would like to share so others can enjoy reading about you?

I live in Brooklyn, NY and have terrible taste in music. I like writing long emails, running, and Vim.

 

We look forward to working with these great 2 hackers over the next 8 weeks.

Leave a comment

Filed under Community

Introducing a contributor for the Pulse Guardian project

3 weeks ago we announced the new Quarter of Contribution, today I would like to introduce the participants.  Personally I really enjoy meeting new contributors and learning about them. It is exciting to see interest in all 4 projects.  Let me introduce who will be working on Pulse Guardian – Core Hacker:

Mike Yao

What interests you in this specific project?

Python, infrastructure

What do you plan to get out of this after 8 weeks?

Continue to contribute to Mozilla

Are there any interesting facts/hobbies that you would like to share so others can enjoy reading about you?

Cooking/food lover, I was chef long time ago. Free software/Open source and Linux changed my mind and career.

 

I do recall one other eager contributor who might join in late when exams are completed, meanwhile, enjoy learning a bit about Mike Yao (who was introduced to Mozilla by Mike Ling who did our first every Quarter of Contribution).

Leave a comment

Filed under Uncategorized