
Crossing textual and visual content in different application scenarios
Abstract This paper deals with multimedia information access. We propose two new approaches for hybrid text-image information processing that can be straightforwardly generalized to the more general multimodal scenario. Both approaches fall in the trans-media pseudo-relevance feedback category. Our first method proposes using a mixture model of the aggregate components, considering them as a single relevance concept. In our second approach, we define trans-media similarities as an aggregation of monomodal similarities between the elements of the aggregate and the new multimodal object. We also introduce the monomodal similarity measures for text and images that serve as basic components for both proposed trans-media similarities. We show how a large variety of problems can be framed so that they are addressed with the proposed techniques: image annotation or captioning, text illustration, and multimedia retrieval and clustering. Finally, we present how these methods can be integrated in two applications: a travel blog assistant system and a tool for browsing Wikipedia that takes into account the multimedia nature of its content.
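As a pointer for readers of this record, the second approach in the abstract (trans-media similarity as an aggregation of monomodal similarities) can be sketched as follows. This is a minimal illustration only: the neighbourhood size k, the product-sum aggregation, and the symbol names are assumptions, not the paper's exact definitions. For annotating an image q, its top-k visual neighbours in the multimodal collection act as the pseudo-relevant aggregate, and their textual parts score a candidate text d:

\[
  s_{V \to T}(q, d) \;=\; \sum_{n \,\in\, N_k^{V}(q)} s_V\!\left(q^{V}, n^{V}\right)\, s_T\!\left(n^{T}, d^{T}\right)
\]

where s_V and s_T are the monomodal visual and textual similarities mentioned in the abstract, and N_k^{V}(q) denotes the k nearest visual neighbours of q. The symmetric s_{T→V}, used for text illustration, is obtained by swapping the roles of the two modalities.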
Added by wikilit team Added on initial load
Collected data time dimension Cross-sectional
Comments Wikipedia pages: This corpus concerns around 8,500 pages taken from the French Wikipedia corpus. We extracted these pages from the XML dump done in September 2007 and provided by the Wikimedia Foundation. The part that describes their program for visualizing Wikipedia data contains little experimentation. "Data source" should not be "Experiment responses". "Research design" should be "design science", and perhaps "mathematical modelling", probably not "experiment". They do not use the whole document, only the title, the free-text image description, and the paragraph where the image is used. Thus the "unit of analysis" is not the full article. The collected time dimension must be "cross-sectional".
Conclusion We have presented a framework for accessing multimodal data. First of all, the theoretical contribution is the extension of the principle of trans-media feedback into a metric view: the definition of trans-media similarities. As shown, these new cross-content similarity measures make it possible to find illustrative images for a text, to annotate an image, and to cluster or retrieve multimodal objects. Moreover, the trans-media similarities are not specific to image and text: they can be applied to any mixture of media (speech, video, text) or views of an object. Most importantly, we have shown how these techniques can be used in two use cases: the travel blog assistant system and the multimedia browsing tool. These two applications stress the necessity of cross-media systems, where no monomedia system can solve the user's problem, nor address all the different user needs at the same time.
Data source Wikipedia pages
Doi 10.1007/s11042-008-0246-8
Google scholar url http://scholar.google.com/scholar?ie=UTF-8&q=%22Crossing%2Btextual%2Band%2Bvisual%2Bcontent%2Bin%2Bdifferent%2Bapplication%2Bscenarios%22
Has author Julien Ah-Pine, Marco Bressan, Stephane Clinchant, Gabriela Csurka, Yves Hoppenot, Jean-Michel Renders
Has domain Computer science
Has topic Multimedia information retrieval
Issue 1
Pages 31-56
Peer reviewed Yes
Publication type Journal article
Published in Multimedia Tools and Applications
Research design Design science
Research questions This paper deals with multimedia information access. We propose two new approaches for hybrid text-image information processing that can be straightforwardly generalized to the more general multimodal scenario. Both approaches fall in the trans-media pseudo-relevance feedback category. Our first method proposes using a mixture model of the aggregate components, considering them as a single relevance concept. In our second approach, we define trans-media similarities as an aggregation of monomodal similarities between the elements of the aggregate and the new multimodal object. We also introduce the monomodal similarity measures for text and images that serve as basic components for both proposed trans-media similarities. We show how a large variety of problems can be framed so that they are addressed with the proposed techniques: image annotation or captioning, text illustration, and multimedia retrieval and clustering. Finally, we present how these methods can be integrated in two applications: a travel blog assistant system and a tool for browsing Wikipedia that takes into account the multimedia nature of its content.
Revid 11,129
Theories First of all, the theoretical contribution is the extension of the principle of trans-media feedback into a metric view: the definition of trans-media similarities. As shown, these new cross-content similarity measures make it possible to find illustrative images for a text, to annotate an image, and to cluster or retrieve multi-modal objects.
Theory type Design and action
Title Crossing textual and visual content in different application scenarios
Unit of analysis Article
Url http://dx.doi.org/10.1007/s11042-008-0246-8
Volume 42
Wikipedia coverage Sample data
Wikipedia data extraction Dump
Wikipedia language French
Wikipedia page type Article
Year 2009
Creation date 15 March 2012 20:25:38
Categories Multimedia information retrieval, Computer science, Publications
Modification date 30 January 2014 20:53:48