Title: Copy to Create: Patterns of copying and reuse in Web 2.0
1. Copy to Create: Patterns of copying and reuse in Web 2.0
- M. Cameron Jones
- mjones2_at_uiuc.edu, http://cameronjones.com/
- Graduate School of Library and Information Science
- University of Illinois at Urbana-Champaign
2. How It's Made
- How are information technologies created?
- By professionals
- By novice programmers
- By end-users
- What aspects of the technology landscape enable or facilitate technology creation?
- How can we better design systems and tools to support IT production?
3. The Copying Meme
- People often create things by sharing with, and copying from, each other
- People learn how to create things by copying (Mackay, 1990; Nardi & Miller, 1991)
- Copying and sharing code helps bridge the expertise gap (Nardi & Miller, 1990)
- Professional programmers also copy source code when programming (Kim et al., 2005).
4. What is Web 2.0?
Markus Angermeier, retrieved from http://kosmar.de/wp-content/web20map.png
5. How does Web 2.0 function?
- Who is Web 2.0?
- How do people
- contribute?
- create content?
- program mashups?
- consume and recycle material?
- What is unique about these interactions?
- What is familiar?
6. Outline
- Web Mashups
- Recent research on mashup programming
- Yahoo! Pipes
- Authorship in Pipes
- Social Networks of Cloning
- Mapping Mashups
- Google Maps
- Yahoo! Maps
7. Web Mashups
- Websites which combine data and services from across the web
- What is interesting about web mashups?
- Creativity
- Innovation
- Exploration
- Functional solutions
- Store (and share) knowledge/expertise
8. HousingMaps.com
HousingMaps.com = Google Maps + Craigslist
9. WASABE
WASABE = Amazon + Google + Wikipedia + OPAC
10. Student Projects
Campus Schedule Planner
11. Student Projects
Champaign-Urbana Bus Route Planner
12. ... but Mashups are hard!
- Examples presented are the exception, not the norm (e.g., Jones & Twidale, 2006; Jones, Twidale, & Urban, 2007)
- High threshold (lots of diverse knowledge and skills needed)
- Cryptic APIs (each one is different, and often changes)
- But there may be hope!
13. Yahoo! Pipes
14. Yahoo! Pipes Data
Screenshot taken June 18, 2007 from http://pipes.yahoo.com/pipes/pipes.popular
15. Authorship in Pipes
- 18,680 pipes (collected June 6, 2007)
- 11,868 authors
16. Long-tail Distributions
- Book sales
- City population sizes
- Web page hits
- Authorship in scholarly journals
- Authorship in the Wikipedia
- Citation counts in scholarly writing
- Project membership in Open-Source
17. Yahoo! Pipes Data
18. Social Networks of cloning
- 1,856 clones identified by names containing "Copy of" or "copy"
- Identify who was cloning from whom by trying to determine the author of the original pipe being cloned (an inexact measure)
- 1,579 pipe authors (nodes in the network)
- 1,483 edges representing the "cloned a pipe from" relationship
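The name-based heuristic on this slide could be sketched roughly as follows. This is an illustration only, not the code used in the study: the record shape (`id`, `title`, `author`), the sample data, and the rule of resolving a clone's source by an exact title match are all assumptions.

```javascript
// Hypothetical pipe records; field names and values are made up for illustration.
const pipes = [
  { id: "p1", title: "eBay Price Watch", author: "alice" },
  { id: "p2", title: "Copy of eBay Price Watch", author: "bob" },
  { id: "p3", title: "eBay Price Watch copy", author: "carol" },
  { id: "p4", title: "Aggregated News Alerts", author: "dave" },
  { id: "p5", title: "Copy of Unknown Pipe", author: "erin" },
];

// Strip leading "Copy of" / trailing "copy" markers (repeatedly, for clones of clones).
// Returns null when the title carries no clone marker.
function originalTitle(title) {
  let t = title.trim();
  let stripped = false;
  for (;;) {
    const m = t.match(/^copy of\s+(.+)$/i) || t.match(/^(.+?)\s+copy$/i);
    if (!m) break;
    t = m[1].trim();
    stripped = true;
  }
  return stripped ? t : null;
}

// Build "cloned a pipe from" edges between authors.
function cloneEdges(pipes) {
  // Index non-clone pipes by title. If two originals share a title the last one
  // wins, which is one reason this measure is inexact.
  const authorByTitle = new Map();
  for (const p of pipes) {
    if (originalTitle(p.title) === null) {
      authorByTitle.set(p.title.toLowerCase(), p.author);
    }
  }
  const edges = [];
  let unresolved = 0;
  for (const p of pipes) {
    const source = originalTitle(p.title);
    if (source === null) continue; // not a clone by name
    const from = authorByTitle.get(source.toLowerCase());
    if (from === undefined) {
      unresolved += 1; // original pipe could not be identified
      continue;
    }
    edges.push({ cloner: p.author, source: from });
  }
  return { edges, unresolved };
}

const result = cloneEdges(pipes);
console.log(result.edges, "unresolved:", result.unresolved);
```

Clones whose original cannot be found are counted rather than dropped silently, which is one plausible reason a network can have fewer edges than identified clones.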
19. Social network of Pipe Cloning
20. What pipes are being cloned?
- Examples (DanielRaffel, Edward H, Pasha Sadri)
- Cloned pipes
21. Clusters of clones
Aggregated News Alerts
Example pipes
Apartment NearSomething del.icio.us Web Search
eBay Price Watch
22. Factors determining cloning
franticindustries
23. Factors determining cloning
- Cumulative Advantage Distributions (Simon, 1957; Price, 1976)
24. Factors determining cloning
25. Further topics to be explored
- What modules are most frequently used?
- What modules are most frequently cloned?
- How are the pipes modified or changed when they are cloned? Are they modified?
- Are portions (subsets of modules) of pipes copied into new pipes? What subsets?
26. Copying Code in Programming
- What about mashup programming more generally, where there isn't a simple clone function?
- HTML View Source
- Rosson and Carroll (1993): a qualitative study of professional Smalltalk programmers found code was often copied from documentation examples.
27. Map-based Web Mashups
- 58% of mashups on Programmable Web are mapping mashups (1,178/2,038)
- Three main API providers:
- Google Maps (50% of all mashups)
- Yahoo Maps (4% of all mashups)
- Microsoft Virtual Earth (4% of all mashups)
- How are people coding and constructing map mashups?
28. Data Collection - Mashups
- Downloaded JavaScript source code for all mapping mashups listed
- Problems: dead links and inaccurate URLs
- Google Maps: 494 unique mashups
- Yahoo Maps: 94 unique mashups
- Microsoft Virtual Earth: 17 unique mashups
29. Data Collection - Snippets
- Downloaded JavaScript example snippets from API provider documentation.
- Google Maps: 32 (11 + 21) example snippets
- Yahoo Maps: 16 example snippets
- Microsoft Virtual Earth: 65 example snippets
- Microsoft Virtual Earth excluded from further analysis (not enough data)
30. Data Analysis
- Clone analysis on source code
- Identify code clones in the mashup code
- Clone pair: a pair of source code segments that are structurally or syntactically similar (Kapser & Godfrey, 2003).
- Use source code cloning to identify what code is being copied and how mashups are related.
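A much-simplified way to find clone pairs is to compare normalized lines and report maximal runs of consecutive identical lines. The sketch below assumes that line-based approach. It is not the clone detection tool used in the study, and the function names, the `minLines` threshold, and the two sample snippets are illustrative assumptions.

```javascript
// Normalize source into comparable lines: drop // comments, collapse whitespace,
// skip blank lines. Naive: a "//" inside a string literal is also cut.
function normalize(source) {
  return source
    .split("\n")
    .map((line) => line.replace(/\/\/.*$/, "").replace(/\s+/g, " ").trim())
    .filter((line) => line.length > 0);
}

// Report maximal runs of at least minLines identical consecutive lines.
function findClonePairs(srcA, srcB, minLines = 3) {
  const a = normalize(srcA);
  const b = normalize(srcB);
  const clones = [];
  // prev[j] / cur[j] hold the length of the matching run ending at a[i-1], b[j-1].
  let prev = new Array(b.length + 1).fill(0);
  for (let i = 1; i <= a.length; i++) {
    const cur = new Array(b.length + 1).fill(0);
    for (let j = 1; j <= b.length; j++) {
      if (a[i - 1] !== b[j - 1]) continue;
      cur[j] = prev[j - 1] + 1;
      // The run is maximal if the next pair of lines does not also match.
      const extendable = i < a.length && j < b.length && a[i] === b[j];
      if (!extendable && cur[j] >= minLines) {
        clones.push({ startA: i - cur[j], startB: j - cur[j], length: cur[j] });
      }
    }
    prev = cur;
  }
  return clones;
}

// Two illustrative fragments that share their first five lines.
const snippetA = `
function load() {
  if (GBrowserIsCompatible()) {
    var map = new GMap2(document.getElementById("map"));
    map.addControl(new GSmallMapControl());
    map.setCenter(new GLatLng(37.4419, -122.1419), 13);
    var icon = new GIcon();
`;
const snippetB = `
function load() {
  if (GBrowserIsCompatible()) {
    var map = new GMap2(document.getElementById("map"));
    map.addControl(new GSmallMapControl());
    map.setCenter(new GLatLng(37.4419, -122.1419), 13);
    var marker = new GMarker(map.getCenter());
`;

const clones = findClonePairs(snippetA, snippetB);
console.log(clones);
```

Exact line matching only catches verbatim copies. Detecting structurally similar code with renamed variables would need token- or syntax-tree-based comparison.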
31. Software Clones
    function load() {
      if (GBrowserIsCompatible()) {
        var map = new GMap2(document.getElementById("map"));
        map.addControl(new GSmallMapControl());
        map.addControl(new GMapTypeControl());
        map.setCenter(new GLatLng(37.4419, -122.1419), 13);

        // Create our "tiny" marker icon
        var icon = new GIcon();
        icon.image = "/ridefinder/images/mm_20_red.png";
        icon.shadow = "/ridefinder/images/mm_20_shadow.png";
        icon.iconSize = new GSize(12, 20);
        icon.shadowSize = new GSize(22, 20);
        icon.iconAnchor = new GPoint(6, 20);
        icon.infoWindowAnchor = new GPoint(5, 1);

        // Add 10 markers to the map at random locations
        var bounds = map.getBounds();
        var southWest = bounds.getSouthWest();
        var northEast = bounds.getNorthEast();
        var lngSpan = northEast.lng() - southWest.lng();

    function load() {
      if (GBrowserIsCompatible()) {
        var map = new GMap2(document.getElementById("map"));
        map.addControl(new GSmallMapControl());
        map.addControl(new GMapTypeControl());
        map.setCenter(new GLatLng(37.4419, -122.1419), 13);
From gmap.doc.3.mash
From gmap.doc.15.mash
32. Data Analysis
- Filter applications w/o clones
- Google Maps: 505 applications
- Yahoo Maps: 101 applications
- Filter intra-application clones
- Google Maps: 5,731 clones
- Yahoo Maps: 2,718 clones
- Dichotomous application-by-clone occurrence matrix
- Hamming-difference distance measure
- Classic, metric multi-dimensional scaling
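The matrix and distance steps listed on this slide could look roughly like the sketch below. The input shape (a map from application name to the clone IDs found in it) and the sample values are assumptions, and the multi-dimensional scaling step is omitted. The resulting distance matrix is the kind of input a classical MDS routine would take.

```javascript
// Hypothetical input: for each application, the IDs of the clones found in it.
const clonesByApp = {
  appA: ["c1", "c2", "c3"],
  appB: ["c1", "c2"],
  appC: ["c4"],
};

// Build the dichotomous (0/1) application-by-clone occurrence matrix.
function occurrenceMatrix(clonesByApp) {
  const apps = Object.keys(clonesByApp);
  const cloneIds = [...new Set(apps.flatMap((a) => clonesByApp[a]))].sort();
  const rows = apps.map((a) => {
    const present = new Set(clonesByApp[a]);
    return cloneIds.map((id) => (present.has(id) ? 1 : 0));
  });
  return { apps, cloneIds, rows };
}

// Hamming distance: number of positions where two equal-length vectors differ.
function hamming(u, v) {
  let d = 0;
  for (let k = 0; k < u.length; k++) {
    if (u[k] !== v[k]) d += 1;
  }
  return d;
}

// Pairwise application-by-application distance matrix.
function distanceMatrix(rows) {
  return rows.map((u) => rows.map((v) => hamming(u, v)));
}

const { apps, cloneIds, rows } = occurrenceMatrix(clonesByApp);
const dist = distanceMatrix(rows);
console.log(apps, cloneIds, dist);
```

With this measure, two applications are close when they contain mostly the same clones, so applications that copied the same documentation examples would be expected to cluster together after scaling.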
33. Google Maps
34. Yahoo Maps Clones
Yahoo! Widgets
Microsoft Virtual Earth
MochiKit
35. Conclusions
- Copying and sharing are essential components of technology production
- Examples and documentation are heavily used in the processes of creating and learning to create
- What is familiar?
- Patterns of production appear to be similar to those in other media and contexts
- What is unique about copying on the web?
- The scale of web systems: many of the statistical tests and measures do not adequately cope.
36. Future Research
- Collect code snippets from other sources
- Forums
- Mailing lists
- Coding websites
- Collect code from other sources
- PHP-language Open-Source projects
- Analyze and classify the clones
- What code is being copied? Why?
- Other domains and contexts
37. Copying Across the web
- MySpace layouts
- RSS aggregation
- SpamBlogs
- Video and music mashups
- CyWorld scraping
- Online learning (Inquiry Page)
38. Scrapbooking in CyWorld