Power-Laws in a Large Object-Oriented Software System
| Published in: | IEEE Transactions on Software Engineering, Vol. 33, no. 10, pp. 687-708 |
|---|---|
| Format: | Journal Article |
| Language: | English |
| Published: | New York, IEEE, 01.10.2007, IEEE Computer Society |
| ISSN: | 0098-5589, 1939-3520 |
| Summary: | We present a comprehensive study of an implementation of the Smalltalk object-oriented system, one of the first and purest object-oriented programming environments, searching for scaling laws in its properties. We study ten system properties, including the distributions of variable and method names, inheritance hierarchies, class and method sizes, and the system architecture graph. We systematically found Pareto (or sometimes log-normal) distributions in these properties. This indicates that programming activity, even when viewed from a statistical perspective, cannot be modeled simply as a random addition of independent increments with finite variance, but exhibits strong organic dependencies on what has already been developed. We compare our results with similar ones obtained for large Java systems, either reported in the literature or computed by ourselves for properties not previously studied, showing that the behavior found is similar in all studied object-oriented systems. We show how the Yule process can stochastically model the generation of several of the power laws found, identifying the process parameters and comparing theoretical and empirical tail indexes. Lastly, we discuss how the distributions found relate to existing object-oriented metrics, such as Chidamber and Kemerer's, and how they could provide a starting point for measuring the quality of a whole system rather than that of single classes. In fact, the usual evaluation of systems based on the mean and standard deviation of metrics can be misleading. It is more interesting to measure differences in the shape and coefficients of the data's statistical distributions. |
|---|---|
| DOI: | 10.1109/TSE.2007.1019 |