There is a lot of work on these topics; you just need to look. Marco Baroni and others have built a joint model of image-based and distributional word similarity (Bruni et al., ACL 2012, "Distributional Semantics in Technicolor"), and Shane Bergsma produced a visual-features model of selectional preferences (Bergsma and Goebel, RANLP 2011, "Using Visual Information to predict selectional preference").
Most of this work on finding object features uses a combination of color-space distribution and SIFT features (both probably have a decent implementation in OpenCV/SimpleCV).
Movies and pictures can probably be handled quite easily with color-space statistics, since color is manipulated in post-production and is subject to deliberate stylistic choices by the people making the film or picture (e.g., National Geographic-style jungle realism vs. the blue-and-orange tints of modern sci-fi movies).
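As a rough illustration of the color-statistics idea, here is a minimal, library-free sketch. It is not taken from any of the papers above: the function names, the choice of a 12-bin hue histogram, and the toy pixel values are all assumptions made for the example. On real images one would more likely compute the histograms with OpenCV (for instance `cv2.calcHist`) and compare frames or paintings in the same way.

```python
import colorsys
from collections import Counter


def hue_histogram(pixels, bins=12):
    """Return a normalized hue histogram for (r, g, b) pixels with values 0..255."""
    counts = Counter()
    total = 0
    for r, g, b in pixels:
        h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # h lies in [0, 1); clamp the index defensively to the last bin.
        counts[min(int(h * bins), bins - 1)] += 1
        total += 1
    if total == 0:
        raise ValueError("no pixels given")
    return [counts[i] / total for i in range(bins)]


def histogram_intersection(p, q):
    """Similarity of two normalized histograms: 0 = disjoint, 1 = identical."""
    return sum(min(a, b) for a, b in zip(p, q))


# Hypothetical toy "frames": flat lists of RGB pixels standing in for images.
jungle_a = [(30, 120, 40)] * 80 + [(90, 70, 30)] * 20          # green and brown
jungle_b = [(40, 140, 50)] * 70 + [(100, 80, 40)] * 30         # similar palette
teal_orange = [(20, 110, 130)] * 50 + [(230, 130, 40)] * 50    # graded sci-fi look

hist_a = hue_histogram(jungle_a)
hist_b = hue_histogram(jungle_b)
hist_c = hue_histogram(teal_orange)

same_style = histogram_intersection(hist_a, hist_b)
other_style = histogram_intersection(hist_a, hist_c)
print(same_style > other_style)
```

The two jungle-like frames share most of their hue mass and so score a higher intersection than the jungle frame against the teal-and-orange one. A real system would add saturation and brightness statistics, and local descriptors such as SIFT for object-level features.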
On Tue, Nov 13, 2012 at 11:31 AM, Albretch Mueller <lbrtchx at gmail.com> wrote:
> Many forms of string metrics are used in corpora research, but I
> don't see anywhere anything about binary/encoded data. This is what I
> have in mind:
> you feed many paintings and the styles/authors are figured out:
> Picassos, Matisses, Kandinskys, ...
> object detection inside pictures: these are all shoes ...
> while watching Michaela Watkins' "Bitch Pleeze" performance or her Ann
> Coulter impersonation, you want to stop the reel at some point and
> have her face at this very moment to match other faces in a corpus,
> follow the faces' transitions and check it as a multi-modal input with
> their speech context and other gestures
> ... stratify all types of face expressions
> you want to know when a video segment has been repeated exactly or
> approximately/similarly, say, the actual Chaplin silent movies in the
> views and rehashings in Richard Attenborough's biographical movie
> Do you know of any research on those topics? Google Scholar didn't
> give me many good leads
-- Dr. Yannick Versley
Sonderforschungsbereich 833 Universität Tübingen Nauklerstr. 35 72074 Tübingen
Tel.: +49-7071-29 77155