Ranking very many typed entities on Wikipedia

From WikiLit
Revision as of 20:30, January 30, 2014 by Fnielsen (Talk | contribs) (Text replace - "|collected_datatype=" to "|data_source=")

Authors: Hugo Zaragoza, Henning Rode, Peter Mika, Jordi Atserias, Massimiliano Ciaramita, Giuseppe Attardi
Citation: CIKM '07 Proceedings of the sixteenth ACM conference on Conference on information and knowledge management: 1015-1018. 2007.
Publication type: Conference paper
Peer-reviewed: Yes
Added by Wikilit team: Added on initial load
Ranking very many typed entities on Wikipedia is a publication by Hugo Zaragoza, Henning Rode, Peter Mika, Jordi Atserias, Massimiliano Ciaramita, Giuseppe Attardi.


Abstract

We discuss the problem of ranking very many entities of different types. In particular we deal with a heterogeneous set of types, some being very generic and some very specific. We discuss two approaches for this problem: i) exploiting the entity containment graph and ii) using a Web search engine to compute entity relevance. We evaluate these approaches on the real task of ranking Wikipedia entities typed with a state-of-the-art named-entity tagger. Results show that both approaches can greatly increase the performance of methods based only on passage retrieval.

Research questions

"We discuss the problem of ranking very many entities of different types. In particular we deal with a heterogeneous set of types, some being very generic and some very specific."

Research details

Topics: Ranking and clustering systems
Domains: Computer science
Theory type: Design and action
Wikipedia coverage: Main topic
Theories: "Undetermined"
Research design: Statistical analysis
Data source: Wikipedia pages
Collected data time dimension: Cross-sectional
Unit of analysis: Article
Wikipedia data extraction: Dump
Wikipedia page type: Article
Wikipedia language: English

Conclusion

"We have taken the first steps towards studying the problem of ad-hoc entity ranking in the presence of a large set of heterogeneous entities. We have constructed a realistic test-bed to carry out evaluation of entity ranking models, and we have provided some initial directions of research. With respect to entity containment graphs our results show that it is important to take into account the notion of inverted entity frequency to discount general types. With respect to Web methods we showed that taking into account the rank of the documents in the computation of correlations can yield significant improvements in performance."
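
The "inverted entity frequency" idea in the conclusion can be illustrated with a small Python sketch. This record does not give the paper's formulas, so the following is only an idf-style analogue under stated assumptions: the entity containment graph is modelled as a mapping from passage ids to the entities they contain, and each entity found in a retrieved passage earns log(N / ef), where ef is the number of passages containing it. The function name `rank_entities` and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def rank_entities(passages, retrieved_ids):
    """Rank entities found in retrieved passages with an idf-style discount.

    passages: dict mapping passage id -> set of entity names (a toy
              stand-in for an entity containment graph).
    retrieved_ids: passage ids returned by passage retrieval for a query.
    NOTE: the log(N / ef) weighting is an assumption made for this sketch,
    not the formula from the paper.
    """
    n_passages = len(passages)

    # Entity frequency: number of passages that contain each entity.
    entity_freq = defaultdict(int)
    for entities in passages.values():
        for entity in entities:
            entity_freq[entity] += 1

    # Each containing retrieved passage adds the entity's discounted weight.
    scores = defaultdict(float)
    for pid in retrieved_ids:
        for entity in passages[pid]:
            scores[entity] += math.log(n_passages / entity_freq[entity])

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy data: "Europe" occurs in every passage.
passages = {
    "p1": {"Pablo Picasso", "Europe"},
    "p2": {"Pablo Picasso", "Europe", "Paris"},
    "p3": {"Europe", "Paris"},
    "p4": {"Europe"},
    "p5": {"Europe", "Madrid"},
}
ranking = rank_entities(passages, ["p1", "p2"])
for entity, score in ranking:
    print(f"{entity}: {score:.3f}")
```

In this toy data "Europe" and "Pablo Picasso" both occur in the two retrieved passages, so a raw count could not separate them. With the discount, "Europe" drops to the bottom because it occurs in every passage.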

Comments

"With respect to entity containment graphs our results show that it is important to take into account the notion of inverted entity frequency to discount general types. With respect to Web methods we showed that taking into account the rank of the documents in the computation of correlations can yield significant improvements in performance" (p. 1018)
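
The remark about taking document rank into account in Web-based correlations can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes the Web results for a query are available as a ranked list of text snippets, uses case-insensitive substring matching as the mention test, and weights each mention by 1/rank. The function name `rank_weighted_scores` and the sample snippets are hypothetical.

```python
def rank_weighted_scores(candidates, ranked_docs):
    """Score candidate entities by rank-weighted mentions in ranked results.

    candidates: iterable of entity names.
    ranked_docs: list of result texts, best-ranked first.
    NOTE: the 1/rank weight and substring matching are assumptions made
    for this sketch; the record does not state the paper's weighting.
    """
    scores = {}
    for entity in candidates:
        needle = entity.lower()
        # A mention in the document at rank r contributes 1/r.
        scores[entity] = sum(
            1.0 / rank
            for rank, doc in enumerate(ranked_docs, start=1)
            if needle in doc.lower()
        )
    return scores


# Hypothetical ranked snippets for some query.
ranked_docs = [
    "Guernica was painted by Pablo Picasso in 1937.",
    "Pablo Picasso biography",
    "Paris travel guide",
    "Hotels in Paris",
]
scores = rank_weighted_scores(["Pablo Picasso", "Paris"], ranked_docs)
for entity, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{entity}: {score:.3f}")
```

Both entities are mentioned in two snippets, so an unweighted count would tie them. Weighting by rank favours the entity mentioned in the top-ranked results.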


Further notes

Facts about "Ranking very many typed entities on Wikipedia"
Added by wikilit team: Added on initial load
Collected data time dimension: Cross-sectional
Data source: Wikipedia pages
Google scholar url: http://scholar.google.com/scholar?ie=UTF-8&q=%22Ranking%2Bvery%2Bmany%2Btyped%2Bentities%2Bon%2BWikipedia%22
Has author: Hugo Zaragoza, Henning Rode, Peter Mika, Jordi Atserias, Massimiliano Ciaramita, Giuseppe Attardi
Has domain: Computer science
Has topic: Ranking and clustering systems
Pages: 1015-1018
Peer reviewed: Yes
Publication type: Conference paper
Published in: CIKM '07 Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
Research design: Statistical analysis
Research questions: We discuss the problem of ranking very many entities of different types. In particular we deal with a heterogeneous set of types, some being very generic and some very specific.
Revid: 10,921
Theories: Undetermined
Theory type: Design and action
Title: Ranking very many typed entities on Wikipedia
Unit of analysis: Article
Wikipedia coverage: Main topic
Wikipedia data extraction: Dump
Wikipedia language: English
Wikipedia page type: Article
Year: 2007