On the Applicability of the Pareto Principle to Source-Code Growth in Open Source Projects

Context: Research on the laws governing software-project evolution can indirectly influence how we design software development processes. For example, knowing how the content of code repositories grows could help us better monitor the progress of OSS development projects and predict their future development. Goal: We aim to empirically verify the hypothesis that OSS code repositories grow in size according to the Pareto principle. Method: We collected and curated a sample of 31,343 OSS code repositories hosted on GitHub and analyzed their content growth over time to verify whether it follows the Pareto principle. Results: We observed that, on average, monotonically growing OSS repositories reach 75% of their final content size within the first 25% of their revisions. Conclusions: The content size of monotonically growing OSS repositories appears to grow according to the Pareto principle with a 75/25 ratio.
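
The measurement behind the reported 75/25 ratio can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how content size is measured or how the 25% cutoff is rounded, so the function names (is_monotonic, share_of_final_size), the ceiling-based cutoff, and the sample history below are all assumptions made for illustration.

```python
import math
from typing import Sequence


def is_monotonic(sizes: Sequence[float]) -> bool:
    """Return True if the size never shrinks from one revision to the next."""
    return all(later >= earlier for earlier, later in zip(sizes, sizes[1:]))


def share_of_final_size(sizes: Sequence[float], revision_share: float = 0.25) -> float:
    """Fraction of the final size reached after the first `revision_share` of revisions.

    `sizes[i]` is the repository content size after revision i (oldest first).
    Assumption: the cutoff revision is the ceiling of revision_share * number
    of revisions, with at least one revision always included.
    """
    if not sizes:
        raise ValueError("sizes must contain at least one revision")
    if sizes[-1] <= 0:
        raise ValueError("final size must be positive")
    if not 0 < revision_share <= 1:
        raise ValueError("revision_share must be in (0, 1]")
    cutoff = max(1, math.ceil(revision_share * len(sizes)))
    return sizes[cutoff - 1] / sizes[-1]


# Hypothetical size history for one repository (arbitrary units, 12 revisions).
history = [30, 55, 75, 80, 84, 88, 91, 94, 96, 98, 99, 100]

if is_monotonic(history):
    print(share_of_final_size(history))  # 0.75
```

Averaging this share over every monotonically growing repository in a sample would yield a figure comparable to the one stated in the Results, provided the size metric and cutoff convention match those actually used in the study.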
