Wikipedias: collaborative web-based encyclopedias as complex networks

Authors: Vinko Zlatić, Miran Božičević, Hrvoje Štefančić, Mladen Domazet
Citation: Physical Review E - Statistical, Nonlinear, and Soft Matter Physics 74 (1): 016115. 2006.
Publication type: Journal article
Peer-reviewed: Yes
Database(s):
DOI: 10.1103/PhysRevE.74.016115.
Link(s): http://dx.doi.org/10.1103/PhysRevE.74.016115
Added by Wikilit team: Added on initial load
Wikipedias: collaborative web-based encyclopedias as complex networks is a publication by Vinko Zlatić, Miran Božičević, Hrvoje Štefančić, Mladen Domazet.


Abstract

Wikipedia is a popular web-based encyclopedia edited freely and collaboratively by its users. In this paper we present an analysis of Wikipedias in several languages as complex networks. The hyperlinks pointing from one Wikipedia article to another are treated as directed links while the articles represent the nodes of the network. We show that many network characteristics are common to different language versions of Wikipedia, such as their degree distributions, growth, topology, reciprocity, clustering, assortativity, path lengths, and triad significance profiles. These regularities, found in the ensemble of Wikipedias in different languages and of different sizes, point to the existence of a unique growth process. We also compare Wikipedias to other previously studied networks.
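The network construction described in the abstract can be illustrated with a minimal sketch. It assumes the hyperlinks are already available as (source article, target article) pairs; the toy link list and the function names are hypothetical, and the reciprocity computed here is the plain fraction of links whose reverse link also exists, which may differ from the definition used in the paper.

```python
from collections import Counter


def build_graph(links):
    """Map each article (node) to the set of articles it links to."""
    out_nbrs = {}
    for src, dst in links:
        if src == dst:
            continue  # assumption: self-links are ignored
        out_nbrs.setdefault(src, set()).add(dst)
        out_nbrs.setdefault(dst, set())  # make sure link targets are nodes too
    return out_nbrs


def degree_distributions(out_nbrs):
    """Return (in-degree histogram, out-degree histogram) as Counters."""
    in_deg = {node: 0 for node in out_nbrs}
    for targets in out_nbrs.values():
        for dst in targets:
            in_deg[dst] += 1
    out_deg = {node: len(targets) for node, targets in out_nbrs.items()}
    return Counter(in_deg.values()), Counter(out_deg.values())


def reciprocity(out_nbrs):
    """Fraction of directed links whose reverse link also exists."""
    total = sum(len(targets) for targets in out_nbrs.values())
    if total == 0:
        return 0.0
    mutual = sum(
        1
        for src, targets in out_nbrs.items()
        for dst in targets
        if src in out_nbrs[dst]
    )
    return mutual / total


# Hypothetical toy data: five hyperlinks among four articles.
links = [("A", "B"), ("B", "A"), ("A", "C"), ("C", "D"), ("D", "A")]
graph = build_graph(links)
in_hist, out_hist = degree_distributions(graph)
print("in-degree histogram:", dict(in_hist))
print("out-degree histogram:", dict(out_hist))
print("reciprocity:", reciprocity(graph))
```

On a real Wikipedia dump the link pairs would have to be extracted first; that step is not shown here.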

Research questions

"Wikipedia is a popular web-based encyclopedia edited freely and collaboratively by its users. In this paper we present an analysis of Wikipedias in several languages as complex networks. The hyperlinks pointing from one Wikipedia article to another are treated as directed links while the articles represent the nodes of the network."

Research details

Topics: Size of Wikipedia
Domains: Physics
Theory type: Analysis
Wikipedia coverage: Main topic
Theories: "In the last few years the physics community has paid a lot of attention to the field of complex networks. A considerable amount of research has been done on different real world networks, complex network theory, and mathematical models [1–4]. Many real world systems can be described as complex networks: WWW [5], internet routers [6–8], proteins [9], and scientific collaborations [10], among others. Complex network theory benefitted from the study of such networks both from the motivational aspect as well as from the new problems that arise with every newly analyzed system."

Research design: Statistical analysis
Data source: Experiment responses, Websites, Wikipedia pages
Collected data time dimension: Cross-sectional
Unit of analysis: Language
Wikipedia data extraction: Dump
Wikipedia page type: Article, Information categorization and navigation
Wikipedia language: Multiple

Conclusion

"Based on our results, it is very likely that the growth process of Wikipedias is universal. The similarities between Wikipedias in all the measured characteristics suggest that we have observed the same kind of a complex network in different stages of development. We have also found that certain individual Wikipedias, such as Polish or Italian, significantly differ from the other members of the observed set. This difference can be seen most easily in their degree distributions, but also shows in assortativity, clustering and the triad significance profile. In the case of the Polish Wikipedia, where the discrepancies are the greatest, we have found that they were caused by an editorial decision involving calendar pages. This shows that the common growth process we have observed is very sensitive to community-driven decisions. We have shown further that Wikipedia article networks on the whole resemble the WWW networks. Specifically, they belong to the TSP superfamily described in Ref. [24] that includes WWW and social networks, and exhibit small-world behavior, with average shortest path lengths close to those of a random network. In some characteristics, however, large Wikipedias seem to diverge from the WWW. Their reciprocity is lower than that of the WWW reported in Ref. [22], and their average shortest path lengths seem to tend to a stable value. It is possible that the specific properties of Wikipedias are related to the underlying structure of knowledge, but also that their shared features stem from growth dynamics driven by free contributions, common policies, and community decision making. Whichever the case, the regularities we have found point to the existence of a unique growth process. These findings in turn support the method of using statistical ensembles in network research, and, finally, affirm the role of statistical physics in modeling complex social interaction systems such as Wikipedia."
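The comparison of average shortest path lengths with those of a random network, mentioned in the conclusion, can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' procedure: paths follow link direction, only reachable ordered pairs are averaged, and the random reference is a uniformly random directed graph with the same number of nodes and links. All function names and the toy graph are hypothetical.

```python
import random
from collections import deque


def average_path_length(out_nbrs):
    """Mean shortest-path length over ordered pairs (s, t), s != t, with t reachable from s."""
    total, pairs = 0, 0
    for source in out_nbrs:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search along link direction
            node = queue.popleft()
            for nxt in out_nbrs[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else float("nan")


def random_directed_graph(nodes, n_links, seed=0):
    """Uniformly random directed graph with the given nodes and number of links."""
    nodes = list(nodes)
    if n_links > len(nodes) * (len(nodes) - 1):
        raise ValueError("more links requested than distinct ordered pairs")
    rng = random.Random(seed)
    out_nbrs = {node: set() for node in nodes}
    placed = 0
    while placed < n_links:
        src, dst = rng.sample(nodes, 2)  # two distinct nodes, so no self-links
        if dst not in out_nbrs[src]:
            out_nbrs[src].add(dst)
            placed += 1
    return out_nbrs


# Hypothetical toy network: a directed ring of four articles.
ring = {"A": {"B"}, "B": {"C"}, "C": {"D"}, "D": {"A"}}
n_links = sum(len(targets) for targets in ring.values())
reference = random_directed_graph(ring, n_links, seed=1)
print("observed:", average_path_length(ring))
print("random reference:", average_path_length(reference))
```

For networks the size of a real Wikipedia, an all-pairs breadth-first search is expensive, so sampling source nodes would likely be needed.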

Comments

The article network characteristics across multiple Wikipedias showed "that the growth process of Wikipedias is universal" (p. 8).

