Research article · DOI: 10.1145/3368089.3409681

Biases and differences in code review using medical imaging and eye-tracking: genders, humans, and machines

Published: 08 November 2020

ABSTRACT

Code review is a critical step in modern software quality assurance, yet it is vulnerable to human biases. Previous studies have clarified the extent of the problem, particularly regarding biases against the authors of code, but no consensus understanding has emerged. Advances in medical imaging are increasingly applied to software engineering, supporting grounded neurobiological explorations of computing activities, including the review, reading, and writing of source code. In this paper, we present the results of a controlled experiment using both medical imaging and eye tracking to investigate the neurological correlates of biases and differences between genders of humans and machines (e.g., automated program repair tools) in code review. We find that men and women conduct code reviews differently, in ways that are measurable and supported by behavioral, eye-tracking, and medical imaging data. We also find biases in how humans review code as a function of its apparent author, when controlling for code quality. In addition to advancing our fundamental understanding of how cognitive biases relate to the code review process, the results may inform subsequent training and tool design to reduce bias.
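The paper's own measurement pipeline is not reproduced on this page, but as a rough illustration of the kind of eye-tracking aggregation such a study involves, here is a minimal sketch: it assigns gaze fixations to an area of interest (AOI) covering the changed lines of a review stimulus, totals per-participant dwell time, and compares two reviewer groups with a nonparametric test. The AOI layout, identifiers, and synthetic data are all hypothetical; this is not the authors' analysis code.

```python
"""Hedged sketch (not the paper's pipeline): aggregate eye-tracking
fixations over a code region and compare two reviewer groups.
All identifiers, the AOI layout, and the demo data are hypothetical."""
from dataclasses import dataclass
from scipy.stats import mannwhitneyu


@dataclass
class Fixation:
    participant: str
    y_px: float         # vertical screen position of the fixation
    duration_ms: float  # how long the gaze rested there


# Hypothetical area of interest (AOI): the changed lines of the diff,
# expressed as a vertical pixel band on the stimulus image.
CHANGED_LINES_AOI = (400.0, 560.0)


def dwell_time_in_aoi(fixations, aoi):
    """Total fixation duration (ms) per participant inside the AOI."""
    lo, hi = aoi
    totals = {}
    for f in fixations:
        if lo <= f.y_px < hi:
            totals[f.participant] = totals.get(f.participant, 0.0) + f.duration_ms
    return totals


def compare_groups(dwell, group_a, group_b):
    """Nonparametric (Mann-Whitney U) comparison of per-participant dwell."""
    a = [dwell.get(p, 0.0) for p in group_a]
    b = [dwell.get(p, 0.0) for p in group_b]
    return mannwhitneyu(a, b, alternative="two-sided")


if __name__ == "__main__":
    # Tiny synthetic demo: two participants per group.
    fixations = [
        Fixation("p1", 410, 250), Fixation("p1", 600, 180),
        Fixation("p2", 450, 300), Fixation("p3", 500, 120),
        Fixation("p4", 100, 400), Fixation("p4", 420, 90),
    ]
    dwell = dwell_time_in_aoi(fixations, CHANGED_LINES_AOI)
    print(compare_groups(dwell, ["p1", "p2"], ["p3", "p4"]))
```

In practice, dwell time is only one of several standard fixation-based metrics (fixation count, scan time, regressions), and group comparisons of this kind are often modeled with mixed-effects approaches rather than a single two-sample test; the sketch above only shows the basic aggregation step.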


Supplemental Material

• fse20main-p133-p-teaser.mp4 (mp4, 31.4 MB)
• fse20main-p133-p-video.mp4 (mp4, 12.4 MB)


      • Published in

        cover image ACM Conferences
        ESEC/FSE 2020: Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
        November 2020
        1703 pages
        ISBN:9781450370431
        DOI:10.1145/3368089

        Copyright © 2020 ACM


Publisher

Association for Computing Machinery, New York, NY, United States



Acceptance Rates

Overall acceptance rate: 112 of 543 submissions (21%)
