Large-scale Java GitHub search of 'test' in content, filename and file path
| Title: | Large-scale Java GitHub search of 'test' in content, filename and file path |
|---|---|
| Authors: | Matej Madeja, orcid:0000-0002-8197- |
| Contributors: | Jaroslav Porubän |
| Publisher Information: | Zenodo |
| Publication Year: | 2021 |
| Collection: | Zenodo |
| Subject Terms: | GitHub analysis, |
| Description: | Dataset from a large-scale GitHub analysis based on the GHTorrent list of repositories from May 2019. The dataset includes only non-fork repositories whose majority language is Java. Each of the 4.3M repositories was searched for the word "test" via the GitHub Search API in six scopes: (1) content of all files, (2) content of Java files, (3) all filenames, (4) Java filenames, (5) all file paths, (6) Java file paths. At the same time, the current number of commits and watchers of each repository was obtained. The data were collected between 2019-08-20 and 2019-10-01. The dataset is a MySQL dump of one table with the following columns: `id` - internal table ID; `project_id` - ID in the `projects` table of GHTorrent's mirror mysql-2019-05-01; `full_name` - full name of the project; `found_test_in_path_java` - number of occurrences of "test" in Java file paths; `found_test_in_path` - number of occurrences of "test" in all file paths; `found_test_in_body_java` - number of occurrences of "test" in the content of Java files; `found_test_in_body` - number of occurrences of "test" in the content of all files; `found_test_in_filename_java` - number of occurrences of "test" in Java filenames; `found_test_in_filename` - number of occurrences of "test" in all filenames; `watchers` - number of the project's watchers; `created_at` - datetime of data fetching; `last_commit` - datetime of the last commit; `all_commits` - all commits, including those inherited from other projects; `project_commits` - only the project's own commits, without the inherited ones. This work was supported by project VEGA No. 1/0762/19: Interactive pattern-driven language development. |
| Document Type: | text |
| Language: | English |
| Relation: | https://zenodo.org/records/4566198; oai:zenodo.org:4566198; https://doi.org/10.5281/zenodo.4566198 |
| DOI: | 10.5281/zenodo.4566198 |
| Availability: | https://doi.org/10.5281/zenodo.4566198 https://zenodo.org/records/4566198 |
| Rights: | Creative Commons Attribution 4.0 International ; cc-by-4.0 ; https://creativecommons.org/licenses/by/4.0/legalcode |
| Accession Number: | edsbas.A0C64685 |
| Database: | BASE |
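
The column list in the description can be mirrored as a typed record, which may help when loading rows from the dump. The following is a minimal Python sketch, not part of the dataset: the class name `RepoTestCounts`, the helper `share_with_java_test_files`, and all column types (integers for counts, datetimes for timestamps, a possibly missing `last_commit`) are assumptions, since the record does not state the table name or the schema types.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class RepoTestCounts:
    """One row of the dump. Field names follow the documented columns;
    the types are assumptions, not taken from the actual schema."""
    id: int
    project_id: int                    # ID in GHTorrent's `projects` table
    full_name: str
    found_test_in_path_java: int
    found_test_in_path: int
    found_test_in_body_java: int
    found_test_in_body: int
    found_test_in_filename_java: int
    found_test_in_filename: int
    watchers: int
    created_at: datetime               # when the data were fetched
    last_commit: Optional[datetime]    # assumed nullable
    all_commits: int                   # includes inherited commits
    project_commits: int               # the project's own commits only


def share_with_java_test_files(rows: Iterable[RepoTestCounts]) -> float:
    """Fraction of rows with at least one Java filename containing "test".

    Returns 0.0 for an empty input instead of raising.
    """
    materialized = list(rows)
    if not materialized:
        return 0.0
    hits = sum(1 for r in materialized if r.found_test_in_filename_java > 0)
    return hits / len(materialized)
```

A filename match is only a rough proxy for the presence of tests; the same pattern applies to any of the other five `found_test_in_*` columns.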