Discovering power laws in computer programs

Authors:

Abstract

Power-law regularities have been discovered behind many complex natural and social phenomena. We find that such regularities, in particular Zipf’s and Heaps’ laws, also exist in large-scale software systems. Specifically, the distribution of lexical tokens in modern Java, C++ and C programs follows the Zipf–Mandelbrot law, and the growth of program vocabulary follows Heaps’ law. These results are obtained through empirical analysis of real-world software systems. We believe this discovery reveals statistical regularities underlying computer programming.
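To make the two laws concrete, here is a minimal Python sketch of how they might be checked on a token stream. It uses the standard textbook forms: Zipf–Mandelbrot as f(r) = C / (r + q)^alpha for the frequency of the rank-r token, and Heaps’ law as V(n) = K * n^beta for the vocabulary size after n tokens. Everything in it is an assumption made for illustration and not the authors' procedure: the regex tokenizer is a crude stand-in for a real Java, C++ or C lexer, the shift q is fixed by the caller and not fitted, the fit is ordinary least squares in log-log space, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(source):
    """Crude stand-in lexer (assumption): identifiers, integers, or single symbols."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)


def rank_frequency(tokens):
    """Token frequencies sorted from most to least frequent (rank 1 first)."""
    return sorted(Counter(tokens).values(), reverse=True)


def vocabulary_growth(tokens):
    """Number of distinct tokens seen after each position in the stream."""
    seen = set()
    growth = []
    for tok in tokens:
        seen.add(tok)
        growth.append(len(seen))
    return growth


def fit_loglog(xs, ys):
    """Least-squares line through (log x, log y); returns (slope, intercept)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    slope = sxy / sxx
    return slope, my - slope * mx


def fit_zipf_mandelbrot(freqs, q=0.0):
    """Estimate (C, alpha) in f(r) = C / (r + q)**alpha for a caller-chosen q."""
    shifted_ranks = [r + q for r in range(1, len(freqs) + 1)]
    slope, intercept = fit_loglog(shifted_ranks, freqs)
    return math.exp(intercept), -slope


def fit_heaps(tokens):
    """Estimate (K, beta) in V(n) = K * n**beta from the vocabulary growth curve."""
    growth = vocabulary_growth(tokens)
    slope, intercept = fit_loglog(range(1, len(growth) + 1), growth)
    return math.exp(intercept), slope
```

In use, one would concatenate the source files of a project, pass the text through `tokenize`, and feed the result to `fit_heaps` and, via `rank_frequency`, to `fit_zipf_mandelbrot`. A log-log regression is the simplest option and is known to be a rough estimator for power-law exponents, so the fitted values are best read as a sanity check and not as a rigorous test of either law.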

Keywords: Power-law, Zipf’s law, Heaps’ law, Zipf–Mandelbrot law, Infometric, Software metric

Article history: Received 12 May 2008, Revised 20 January 2009, Accepted 3 February 2009, Available online 17 March 2009.

Paper URL: https://doi.org/10.1016/j.ipm.2009.02.001