Have you ever been working on a change that you think will affect performance numbers, but weren't sure how to verify its impact?
The main use case I needed to cover was running a change on the Try server and verifying that it did in fact fix my performance regression. Normally I would go to tbpl and click on each of my tests to see the reported number(s). For each of those test:number pairs, I would look on graph server (hint: you can get to graph server for a given test by clicking on the reported number) and verify that my numbers were inside the expected range for that test/platform/branch based on its history. If only I were part of a software developers' union, I could complain that this boring, time-intensive work was not in my contract.
To simplify my life, I decided to automate this with a Python script. I wrote compare.py, which spits out a text-based summary of what I described above. Here is a sample of the output:
python compare.py --revision c094aeea5f73 --branch Try --masterbranch Firefox --test tp5n --platform Linux
Linux: tp5n: 292.157 -> 400.444; 308.596
A quick explanation:
- 292.157 is the lowest number reported in the last 7 days for tp5n,linux
- 400.444 is the highest number reported in the last 7 days for tp5n,linux
- 308.596 is the value reported from my test on try server for tp5n,linux
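Under the hood the comparison is simple: gather every value reported for that test/platform/branch over the window, take the lowest and highest, and print the Try value next to them. Here is a minimal sketch of that summary step (the summarize name and the history list are mine for illustration; the real compare.py pulls its data from graph server):

def summarize(platform, test, history, try_value):
    # history: values reported for this test/platform/branch over the
    # comparison window (e.g. the last 7 days of graph server data)
    low, high = min(history), max(history)
    print("%s: %s: %.3f -> %.3f; %.3f" % (platform, test, low, high, try_value))

# summarize("Linux", "tp5n", [292.157, 310.2, 400.444], 308.596)
# -> Linux: tp5n: 292.157 -> 400.444; 308.596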
While this doesn't compare against the previous 30 changesets and the next 5, it gives a pretty good indicator of what to expect. I can run this on a different time range (for example, to check the 7 days prior to the regression I introduced) by adding --skipdays to the command line:
python compare.py --revision c094aeea5f73 --branch Try --masterbranch Firefox --test tp5n --platform Linux --skipdays 6
Linux: **tp5n: 311.975 -> 398.571; 308.596
Here you will see “**tp5n”; the ** indicates that the Try server number is not inside the historical range and should be looked at the old-fashioned way.
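For completeness, here is a sketch of the two pieces of logic behind that output: how I assume --skipdays shifts the comparison window, and how the ** flag gets added when the Try value falls outside the historical range. The function names are hypothetical, and the exact window handling in compare.py may differ:

from datetime import datetime, timedelta

def compare_window(skipdays=0, days=7):
    # Assumption: --skipdays pushes the 7-day window back in time, so
    # --skipdays 6 compares against the week that ended 6 days ago.
    end = datetime.utcnow() - timedelta(days=skipdays)
    return end - timedelta(days=days), end

def label(test, low, high, try_value):
    # Prefix the test name with "**" when the Try value is outside the
    # low..high range seen in the historical data.
    flag = "" if low <= try_value <= high else "**"
    return "%s%s: %.3f -> %.3f; %.3f" % (flag, test, low, high, try_value)

# label("tp5n", 311.975, 398.571, 308.596)
# -> '**tp5n: 311.975 -> 398.571; 308.596'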
Hope this helps in debugging.