Retrieval and feedback models for blog feed search

From WikiLit
Authors: Jonathan L. Elsas, Jaime Arguello, Jamie Callan, Jaime G. Carbonell
Citation: SIGIR '08 Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval: 347-354. July 20-24, 2008. Singapore, Singapore. Association for Computing Machinery.
Publication type: Conference paper
Peer-reviewed: Yes
DOI: 10.1145/1390334.1390394
"Retrieval and feedback models for blog feed search" is a publication by Jonathan L. Elsas, Jaime Arguello, Jamie Callan, and Jaime G. Carbonell.


Abstract

Blog feed search poses different and interesting challenges from traditional ad hoc document retrieval. The units of retrieval, the blogs, are collections of documents, the blog posts. In this work we adapt a state-of-the-art federated search model to the feed retrieval task, showing a significant improvement over algorithms based on the best performing submissions in the TREC 2007 Blog Distillation task [12]. We also show that typical query expansion techniques such as pseudo-relevance feedback using the blog corpus do not provide any significant performance improvement and in many cases dramatically hurt performance. We perform an in-depth analysis of the behavior of pseudo-relevance feedback for this task and develop a novel query expansion technique using the link structure in Wikipedia. This query expansion technique provides significant and consistent performance improvements for this task, yielding a 22% and 14% improvement in MAP over the unexpanded query for our baseline and federated algorithms respectively.
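
Because the unit of retrieval is a blog rather than a single post, a ranker has to decide how post-level evidence becomes a feed-level score. The following is a minimal sketch of two generic options, not the models from the paper: one scores the concatenation of a feed's posts, the other averages per-post scores. The scorer `term_score`, the function names, and the toy feeds are all assumptions made for illustration.

```python
from collections import Counter


def term_score(query, text):
    """Toy relevance score (assumption): share of tokens that match a query term."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    return sum(counts[t] for t in query.lower().split()) / len(tokens)


def score_feed_concatenated(query, posts):
    """Treat the whole feed as one long document and score it once."""
    return term_score(query, " ".join(posts))


def score_feed_aggregated(query, posts):
    """Score each post separately, then average the post scores."""
    if not posts:
        return 0.0
    return sum(term_score(query, p) for p in posts) / len(posts)


def rank_feeds(query, feeds, scorer):
    """Return feed names ordered from highest to lowest score."""
    return sorted(feeds, key=lambda name: scorer(query, feeds[name]), reverse=True)


# Hypothetical feeds, each a list of post texts.
feeds = {
    "a": ["jazz jazz music", "jazz concert"],
    "b": ["cooking pasta", "jazz"],
}
print(rank_feeds("jazz", feeds, score_feed_concatenated))
print(rank_feeds("jazz", feeds, score_feed_aggregated))
```

The two scorers can disagree when a feed has a few long on-topic posts among many off-topic ones, which is one reason the choice of aggregation matters for feed search.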

Research questions

"In this work we adapt a state-of-the-art federated search model to the feed retrieval task, showing a significant improvement over algorithms based on the best performing submissions in the TREC 2007 Blog Distillation task [12]."

Research details

Topics: Query processing
Domains: Computer science
Theory type: Design and action
Wikipedia coverage: Main topic
Theories: Undetermined
Research design: Experiment
Data source: Experiment responses, Wikipedia pages
Collected data time dimension: Cross-sectional
Unit of analysis: Article
Wikipedia data extraction: Dump
Wikipedia page type: Article
Wikipedia language: English

Conclusion

"we presented an in-depth analysis of query expansion for blog feed retrieval. On this task, our novel Wikipedia link-based approach obtained a greater than 13% improvement over no expansion (across large and small document models) in terms of both MAP and P@10. Although this method did not generalize to the Terabyte Track ad hoc queries it does show promise for queries that represent more general information needs, similar to those typical of feed retrieval."
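
The conclusion reports gains in MAP and P@10 relative to the unexpanded query. As a reminder of what those figures measure, here is a small sketch of the standard definitions of precision at k, average precision, mean average precision, and relative improvement. The function names and the example rankings are assumptions for illustration; none of the numbers are results from the paper.

```python
def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant item appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(runs):
    """MAP over a list of (ranked, relevant) pairs, one pair per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


def relative_improvement(baseline, improved):
    """Relative gain of `improved` over `baseline`, e.g. 0.13 for a 13% gain."""
    return (improved - baseline) / baseline


# Hypothetical ranking for one query: items "a" and "c" are relevant.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c"}
print(average_precision(ranked, relevant))
print(precision_at_k(ranked, relevant, k=2))
```

A statement such as "a greater than 13% improvement" corresponds to `relative_improvement(baseline, improved) > 0.13`, with each score averaged over the query set.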

Comments

"our novel Wikipedia link-based approach obtained a greater than 13% improvement over no expansion (across large and small document models) in terms of both MAP and P@10" (p. 354)


Further notes

URL: http://dl.acm.org/citation.cfm?id=1390394
Google Scholar URL: http://scholar.google.com/scholar?ie=UTF-8&q=%22Retrieval%2Band%2Bfeedback%2Bmodels%2Bfor%2Bblog%2Bfeed%2Bsearch%22