Preferential attachment in the growth of social networks: the Internet encyclopedia Wikipedia

Authors: Andrea Capocci, V. D. P. Servedio, F. Colaiori, Luciana S. Buriol, Debora Donato, Stefano Leonardi, Guido Caldarelli
Citation: Physical Review E - Statistical, Nonlinear, and Soft Matter Physics 74 (3). 2006.
Publication type: Journal article
Peer-reviewed: Yes
DOI: 10.1103/PhysRevE.74.036116.
Preferential attachment in the growth of social networks: the Internet encyclopedia Wikipedia is a publication by Andrea Capocci, V. D. P. Servedio, F. Colaiori, Luciana S. Buriol, Debora Donato, Stefano Leonardi, Guido Caldarelli.


Abstract

We present an analysis of the statistical properties and growth of the free on-line encyclopedia Wikipedia. By describing topics by vertices and hyperlinks between them as edges, we can represent this encyclopedia as a directed graph. The topological properties of this graph are in close analogy with those of the World Wide Web, despite the very different growth mechanism. In particular, we measure a scale-invariant distribution of the in- and out-degree and we are able to reproduce these features by means of a simple statistical model. As a major consequence, Wikipedia growth can be described by local rules such as the preferential attachment mechanism, though users, who are responsible for its evolution, can act globally on the network.
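
The graph representation described in the abstract can be illustrated with a minimal sketch. The function name `degree_distributions` and the toy link data are assumptions made for illustration; they are not taken from the paper, which worked on the full Wikipedia link structure. The sketch treats each article as a vertex and each hyperlink as a directed edge, then tabulates how many articles have each in-degree and out-degree.

```python
from collections import Counter

def degree_distributions(links):
    """Return (in_degree_counts, out_degree_counts) for a directed graph.

    `links` maps each article title to the titles it links to.
    Each result maps a degree k to the number of articles with that degree.
    """
    # Collect every vertex, including articles that are only link targets.
    vertices = set(links)
    for targets in links.values():
        vertices.update(targets)

    in_degree = {v: 0 for v in vertices}
    out_degree = {v: 0 for v in vertices}
    for source, targets in links.items():
        for target in set(targets):  # count each (source, target) pair once
            out_degree[source] += 1
            in_degree[target] += 1

    return Counter(in_degree.values()), Counter(out_degree.values())

# Toy data, invented for illustration only.
toy_links = {
    "Physics": ["Mathematics", "Energy"],
    "Mathematics": ["Physics"],
    "Energy": ["Physics", "Mathematics"],
    "Stub": ["Physics"],
}
in_counts, out_counts = degree_distributions(toy_links)
```

On real data, plotting these counts against the degree on log-log axes is the usual way to check for the scale-invariant (power-law) behavior the abstract reports.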

Research questions

"We present an analysis of the statistical properties and growth of the free on-line encyclopedia Wikipedia."

Research details

Topics: Other corpus topics
Domains: Computer science
Theory type: Analysis
Wikipedia coverage: Main topic
Theories: "They use graph theory to analyse Wikipedia, and use preferential attachment to explain the graph of Wikipedia and its growth."
Research design: Statistical analysis
Data source: Wikipedia pages
Collected data time dimension: Cross-sectional
Unit of analysis: Article
Wikipedia data extraction: Live Wikipedia
Wikipedia page type: Article
Wikipedia language: Multiple

Conclusion

"We find that the Wikipedia graph exhibits a topological bow-tie-like structure, as does the WWW. Moreover, the frequency distributions of the number of incoming (in-degree) and outgoing (out-degree) edges show fat-tail power-law behaviors. Further, the in-degrees of connected vertices are not correlated. These last two findings suggest that edges are not drawn toward and from existing topics uniformly. Rather, the large number of incoming and outgoing edges of a node increases the probability of acquiring new incoming and outgoing edges, respectively. In the literature concerning scale-free networks, this phenomenon is called “preferential attachment” and is explained in detail below.

Thus, empirical and theoretical evidences show that traditional models introduced to explain nontrivial features of complex networks by simple algorithms remain qualitatively valid for Wikipedia, whose technological framework would allow a wider variety of evolutionary patterns. This reflects on the role played by the preferential attachment in generating complex networks: such mechanism is traditionally believed to hold when the dissemination of information throughout a social network is not efficient and a “bounded rationality” hypothesis is assumed. In the WWW, for example, the preferential attachment is the result of the difficulty for a webmaster to identify optimal sources of information to refer to, favoring the herding behavior which generates the “rich-get-richer” rule. One would expect the coordination of the collaborative effort to be more effective in the Wikipedia environment since any authoritative agent can use his expertise to tune the linkage from and toward any page in order to optimize information mining. Nevertheless, empirical evidences show that the statistical properties of Wikipedia do not differ substantially from those of the WWW. This suggests two possible scenarios: preferential attachment may be the consequence of the intrinsic organization of the underlying knowledge; alternatively, the preferential attachment mechanism emerges because the Wiki technical capabilities are not fully exploited by Wikipedia contributors: if this is the case, their focus on each specific subject puts much more effort in building a single Wiki entry, with little attention toward the global efficiency of the organization of information across the whole encyclopedia."
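
The "rich-get-richer" rule mentioned in the conclusion can be sketched with a small simulation. This is a generic preferential attachment toy model, not the statistical model used in the paper: the function name `grow_network`, the parameter `m`, and the choice of linking probability proportional to (in-degree + 1) are all assumptions made for illustration.

```python
import random

def grow_network(n_vertices, m=2, seed=0):
    """Grow a directed graph one vertex at a time; return the in-degree list.

    Each new vertex links to up to `m` distinct existing vertices, chosen
    with probability proportional to (in-degree + 1). This rule is an
    illustrative assumption, not the model from the paper.
    """
    rng = random.Random(seed)
    in_degree = [0]  # start from a single vertex with no edges
    urn = [0]        # vertex v appears in_degree[v] + 1 times in the urn

    for new in range(1, n_vertices):
        wanted = min(m, new)  # cannot link to more vertices than exist
        targets = set()
        while len(targets) < wanted:
            # A uniform draw from the urn is a degree-weighted draw of a vertex.
            targets.add(rng.choice(urn))

        in_degree.append(0)
        urn.append(new)
        for target in sorted(targets):
            in_degree[target] += 1
            urn.append(target)  # more links received, more likely to be drawn

    return in_degree

degrees = grow_network(1000)
```

Because vertices that already receive many links occupy more slots in the urn, early or popular vertices tend to accumulate links faster, which is the qualitative mechanism behind the fat-tailed in-degree distributions described above.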

