The means of producing information and the infrastructure for disseminating it are constantly changing. The web mobilizes information in electronic formats, making it easier to copy, modify, remix, and redistribute. This has changed how information is produced, distributed, and used. People are not just consuming information; they are actively producing, remixing, and sharing information, using the web as a platform for creativity and production. This is true of software development as well.
Programmers and researchers who study software development frequently observe that programmers copy and paste code. Although this practice is widely acknowledged, it is rarely studied directly or explicitly accounted for in models of software development. However, this attitude is changing as software becomes more ubiquitous and software development practice shifts away from the formal models of software engineering toward a post-modernist perspective.
This study explores how source code snippets in programming books and on the web are changing software development practice. By examining program source code using clone detection algorithms, this study provides a comprehensive view of code copying across 6,190 PHP-language applications. These data are used to explore the concept of a “remix” method of software production, where software and systems are built out of copied and pasted snippets of code. These findings are contrasted against both traditional models of information production coming from informetrics (e.g., authorship, citation analysis), and models from software engineering (e.g., the Lego Hypothesis). Explanations for observed phenomena are discussed borrowing metaphors from linguistics, which provide a richer explanation of copy-paste programming than offered by the Lego Hypothesis.
The focus and findings of this study ultimately point to a pressing demand for further research centered on the notion of software as information. Software and software repositories hold a large amount of information about how software was produced and how it is used, adapted, and maintained. Software informatics is proposed as an organizing label for the study of information, practice, and communication around software. It studies the individual, collaborative, and social aspects of software production and use, spanning multiple representations of software from design, to source code, to application.
|Advisor:||Downie, J. Stephen, Twidale, Michael B.|
|School:||University of Illinois at Urbana-Champaign|
|School Location:||United States -- Illinois|
|Source:||DAI-A 72/07, Dissertation Abstracts International|
|Subjects:||Library science, Information science, Computer science|
|Keywords:||Copy-paste programming, Programming by Google, Remix programming, Software production, Source code|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved