2020
Conference article  Open Access

JTeC: A Large Collection of Java Test Classes for Test Code Analysis and Processing

Corò F., Verdecchia R., Cruciani E., Miranda B., Bertolino A.

Keywords: Java, GitHub, Software Testing, Test Suite, Large Scale, [INFO.INFO-SE] Computer Science [cs]/Software Engineering [cs.SE]

The recent push towards test automation and test-driven development continues to scale up the amount of test code that must be maintained, analysed, and processed side by side with production code. As a consequence, on the one hand, regression testing techniques (e.g., for test suite prioritization or test case selection) capable of handling such large-scale test suites become indispensable; on the other hand, because test code has its own characteristics, specific techniques for its analysis and refactoring are actively sought. We present JTeC, a large-scale dataset of test cases that researchers can use to benchmark the above techniques or any other type of tool expressly targeting test code. JTeC collects more than 2.5M test classes belonging to 31K+ GitHub projects, amounting to more than 430 million SLOC of ready-to-use real-world test code.

Source: 2020 IEEE/ACM 17th International Conference on Mining Software Repositories (MSR), pp. 578–582, 29-30/06/2020

Publisher: ACM, Association for computing machinery, New York, USA


BibTeX entry
@inproceedings{oai:it.cnr:prodotti:435435,
	title = {JTeC: A Large Collection of Java Test Classes for Test Code Analysis and Processing},
	author = {Corò F. and Verdecchia R. and Cruciani E. and Miranda B. and Bertolino A.},
	publisher = {ACM, Association for computing machinery, New York, USA},
	doi = {10.1145/3379597.3387484},
	booktitle = {2020 IEEE/ACM 17th International Conference on Mining Software Repositories (MSR), pp. 578–582, 29-30/06/2020},
	year = {2020}
}