Power-Laws in a Large Object-Oriented Software System

Bibliographic Details
Published in:	IEEE transactions on software engineering Vol. 33; no. 10; pp. 687 - 708
Main Authors:	Concas, Giulio; Marchesi, Michele; Pinna, Sandro; Serra, Nicola
Format:	Journal Article
Language:	English
Published:	New York IEEE 01.10.2007 IEEE Computer Society
Subjects:	Architecture; C plus plus; Computer programs; D.2.3.a Object-oriented programming; D.2.4.h Statistical methods; D.2.8.a Complexity measures; D.2.8.d Product metrics; D.2.8.e Software science; D.3.2.p Object-oriented languages; Digital Object Identifier; G.3.p Stochastic processes; Graphs; Java; Java (programming language); Mathematical models; Methods; Object oriented; Object oriented modeling; Object oriented programming; Open source software; Power generation; Power system modeling; Process parameters; Searching; Shape measurement; Software; Software engineering; Software systems; Standard deviation; Statistical analysis; Statistical distributions; Stochastic models; Studies; Tail
ISSN:	0098-5589, 1939-3520

Description
Summary:	We present a comprehensive study of an implementation of the Smalltalk object-oriented system, one of the first and purest object-oriented programming environments, searching for scaling laws in its properties. We study ten system properties, including the distributions of variable and method names, inheritance hierarchies, class and method sizes, and the system architecture graph. We systematically found Pareto (or sometimes log-normal) distributions in these properties. This indicates that programming activity, even when viewed from a statistical perspective, cannot be modeled simply as a random addition of independent increments with finite variance; instead, it exhibits strong organic dependencies on what has already been developed. We compare our results with similar ones obtained for large Java systems, either reported in the literature or computed by ourselves for properties never studied before, showing that the behavior found is similar in all the object-oriented systems studied. We show how the Yule process is able to stochastically model the generation of several of the power-laws found, identifying the process parameters and comparing theoretical and empirical tail indexes. Lastly, we discuss how the distributions found are related to existing object-oriented metrics, such as Chidamber and Kemerer's, and how they could provide a starting point for measuring the quality of a whole system rather than that of single classes. In fact, the usual evaluation of systems based on the mean and standard deviation of metrics can be misleading. It is more informative to measure differences in the shape and coefficients of the data's statistical distributions.
DOI:	10.1109/TSE.2007.1019
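
Illustration:	The summary's remarks on the Yule process and on tail indexes can be made concrete with a small simulation. The sketch below is not taken from the article: it assumes a simplified Simon-style variant of the Yule process, and the names simulate_yule, hill_tail_index, p_new, and x_min are hypothetical, chosen only for this example. At each step a new group of size 1 appears with probability p_new; otherwise an existing group grows by one unit, chosen in proportion to its current size. The tail index is then estimated with the Hill estimator, which is one common choice and not necessarily the one the authors used.

```python
import math
import random
import statistics


def simulate_yule(n_steps, p_new, seed=0):
    """Grow groups one unit at a time (simplified Simon/Yule scheme).

    With probability p_new a new group of size 1 is created; otherwise an
    existing group gains one unit, chosen with probability proportional to
    its current size. This is an assumed toy model, not the article's.
    """
    rng = random.Random(seed)
    sizes = [1]   # sizes[g] = current size of group g
    owners = [0]  # one entry per unit, holding the index of its group
    for _ in range(n_steps):
        if rng.random() < p_new:
            sizes.append(1)
            owners.append(len(sizes) - 1)
        else:
            # Picking a unit uniformly picks its group proportionally to size.
            g = owners[rng.randrange(len(owners))]
            sizes[g] += 1
            owners.append(g)
    return sizes


def hill_tail_index(values, x_min):
    """Hill estimate of the exponent a in P(X > x) ~ x**(-a) for x >= x_min."""
    tail = [v for v in values if v >= x_min]
    log_sum = sum(math.log(v / x_min) for v in tail)
    return len(tail) / log_sum


sizes = simulate_yule(n_steps=20000, p_new=0.3, seed=1)
print("groups:", len(sizes))
print("mean:", statistics.mean(sizes), "stdev:", statistics.pstdev(sizes))
print("median:", statistics.median(sizes), "max:", max(sizes))
print("Hill tail index (x_min=5):", hill_tail_index(sizes, 5))
```

In runs of this kind the largest group is typically far larger than the mean would suggest, which is the sense in which mean and standard deviation alone can mislead. For this simplified scheme the tail exponent of P(X > x) is expected to approach 1 / (1 - p_new) for long runs; a finite simulation on integer sizes, with an arbitrary x_min, will only approximate that value.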