neWMW

CBIR and good flickr applications

with 14 comments

Although Content-Based Image Retrieval / CBIR (or Content-Based Visual Information Retrieval / CBVIR) is not new, it has yet to reach the mainstream. Applications like retrievr and xcavator are attracting attention, but they still aren’t very common. Thanks to flickr, CBIR has finally found a good database to draw its images from, and in return CBIR makes for good flickr applications. In that sense flickr has made CBIR useful for the consumer. Even so, CBIR still requires a very different, nonlingual approach to searching.

Just to give you a short introduction on the subject: CBIR searches images by their visual content (colors, shapes, textures) rather than by the words attached to them, which helps with the problem of finding images in a large database. With retrievr, for instance, you make a sketch and the relevant images appear. If you haven’t already, check out the great Art of retrievr page, which contains some very nice sketches people made using the sketch pad. Check the Wikipedia page on CBIR if you want more information and a good introduction to the subject.
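To make the idea of searching by content a bit more concrete, here is a minimal sketch in Python. It is not how retrievr or xcavator actually work (this post does not describe their internals); it only assumes one of the simplest CBIR techniques, comparing color histograms. The function names, the toy "images" (plain lists of RGB pixels) and the choice of four bins per channel are all made up for illustration.

```python
from collections import Counter


def color_histogram(pixels, bins=4):
    """Quantize each RGB pixel into one of bins**3 buckets; return normalized counts."""
    step = 256 // bins
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bucket: n / total for bucket, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means the two color distributions are identical."""
    return sum(min(share, h2.get(bucket, 0.0)) for bucket, share in h1.items())


def rank_by_similarity(query_pixels, database):
    """Return image names sorted from most to least similar to the query."""
    query_hist = color_histogram(query_pixels)
    scores = {
        name: histogram_intersection(query_hist, color_histogram(pixels))
        for name, pixels in database.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: each "image" is just a short list of RGB pixels.
RED, GREEN, BLUE = (250, 10, 10), (10, 250, 10), (10, 10, 250)
database = {
    "sunset": [RED] * 6 + [BLUE] * 2,
    "ocean": [BLUE] * 7 + [GREEN] * 1,
    "forest": [GREEN] * 8,
}
sketch = [RED] * 5 + [BLUE] * 3  # a mostly red "sketch" with some blue

print(rank_by_similarity(sketch, database))
```

No words are involved anywhere in the query: the mostly red sketch ranks the mostly red image first purely on color. A histogram ignores where the colors sit in the picture, so a real sketch-based search would also need some notion of shape and position.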

But research is also being done on searching for relevant images in a movie database, for example by Oxford University with their Video Google. Although it is hard to imagine exactly what kind of impact this will have on our use of search engines, we can say that it will change the way we crawl through the World Wide Web in the future. We will not be bound by language. Think about it: when you are watching a movie and want to know in which movie you saw an object or composition before, you select a part of the screen (for example the shower curtain in Hitchcock’s Psycho) and search for the same image in an IMDB.com-like database containing the data of all the movie images and their compositions. Or think about face recognition.

A quick thought
Just to get some thoughts organised for myself, here is a short theory. A very important question has been on my mind for some time now: is language actually a way for us to overcome our initial limitation in communicating images? Is language simply nothing more than a way to communicate images and views over distance? Or a way to initiate (technological) change and pave the way for the image?

An explanation that fascinated me comes (again) from Marshall McLuhan. The following answer comes from an interview he did with Playboy in 1969. No, it’s not a very academic source, but it is one of the clearer versions of this theory.

“When tribal man becomes phonetically literate, he may have an improved abstract intellectual grasp of the world, but most of the deeply emotional corporate family feeling is excised from his relationship with his social milieu. This division of sight and sound and meaning causes deep psychological effects, and he suffers a corresponding separation and impoverishment of his imaginative, emotional and sensory life. He begins reasoning in a sequential linear fashion; he begins categorizing and classifying data. As knowledge is extended in alphabetic form, it is localized and fragmented into specialties, creating division of function, of social classes, of nations and of knowledge–and in the process, the rich interplay of all the senses that characterized the tribal society is sacrificed.”

So, when someone learns language, he begins to think in a linear fashion. But now with CBIR we go back to the ways of “primitive and pre-alphabetic people who integrate time and space as one and live in an acoustic, horizonless, boundless, olfactory space, rather than in visual space.” Many have debated these remarks of McLuhan, but more and more we see the World Wide Web transforming from text-only into a full visual experience, requiring no language but the programming language behind the screen, like a linguistic version of the Wizard of Oz.

But on the other hand, I’m typing this; how could I possibly have explained all this to you in images? To conclude these thoughts: I think that language is the gateway to the mind (personal; of ourselves), and images are the gateway to the visual (public; of the other) world. The other is always an image in your head; when it becomes a word, it is personal and from your own mind.

On the Wikipedia page are some more examples of CBIR (I also tried imgSeek and IKONA) and some useful papers on the subject. Try a few and notice how different this approach to searching is, and how few words go around in your thoughts while you do it. In my case I suddenly realized that I was only paying attention to colors and shapes; words were far from my mind. CBIR search engines are definitely worth some of your time.

To finish, here is a YouTube introduction to a CBIR application called xcavator, which is also a flickr search engine with an original approach. Its developer, Cognisign, is also looking for feedback on the subject; for more info, read the xcavator blog.



14 Responses


  1. Really interesting post.

    First, retrievr is a lot of fun! I just spent a couple of minutes playing around with it and I can already see massive potential for this kind of thing. There are often times where I’ll know what kind of image I’m looking for, but be unable to express it verbally. So there retrievr scores top marks.

    And you don’t even have to be any good at drawing! Even with a series of lines it managed to find some people posed in roughly the same configuration.

    About your thoughts on language. This happens to be my area, and I think you’d find the Sapir-Whorf hypothesis or “linguistic relativity” interesting reading if you haven’t already heard of it. Similar to what you wrote, it proposes that we organise our thoughts in terms of the distinctions laid out by the different languages we speak.

    What you were proposing seemed to be a more general version wherein language itself causes us to think differently in virtue of its linear nature. I agree, I think. Or at least I think a certain type of thought (maybe conscious thought) is mediated by language.

    lostmoya

    September 4, 2006 at 1:57 pm

  2. Thanks for the tip about Sapir-Whorf. I’m always having trouble finding writers/authors who come close to some of my theories and thoughts, so I keep wondering whether they are original or have been done before 🙂

    Great to hear, by the way, that this is your area, and I’ll definitely check out the Sapir-Whorf hypothesis. And it’s very true that we organise thoughts through the distinctions of different languages. Another fascinating thing. But of course, in viewing images we are (though even this is arguable) all the same.

    newmw

    September 5, 2006 at 12:02 am

  3. Another interesting thing is that some cultures (maybe some South American tribes, but don’t quote me on that) don’t recognise 3D images like a line-drawn cube on a 2D piece of paper — all they “see” is a bunch of lines. I can’t recall any references for that off the top of my head, but I remember reading about it a while back, and the conclusion was that even vision (or at least our perception of what we see) is culturally determined to some extent.

    lostmoya

    September 5, 2006 at 10:10 am

  4. That is pretty amazing and interesting, since that would also mean that objects are subjective once they enter our minds (they become personal, human). But would nobody be able to see their true form?

    newmw

    September 5, 2006 at 10:43 am

  5. I did my thesis on CBIR and metatagging. I feel tagging photos or images is not right (sure, it works a tad, but when you really look at it, it doesn’t do a great job). We are using the textual, linguistic domain to describe the visual domain, only because we can understand and interpret images and the computer cannot. Thus we fall back on something the computer is able to compare more easily. You might want to read my thesis on this at http://photoindex.thingsdesigner.com.

    Matthijs Rouw

    March 13, 2007 at 4:42 am

  6. Hey Matthijs,
    Interesting stuff and thesis. Perhaps the problem is also one of language. Just off the top of my head: if the user has to talk in a language which is foreign (understanding images can perhaps be like trying to understand French), then is it hard to talk to a computer through images?

    Anyway, interesting! The examples that I gave here are just ‘fun’ to play around with and still hard to really use in, for example, research. Perhaps when we are ‘freed’ from the linguistic keyboard, we will become more adept at CBIR (walking through files as if walking through the city, perhaps).

    newmw

    March 14, 2007 at 2:40 pm

  7. […] (by tomato-icon). Retrievr. Interactive search by sketch, but results do not impress. XCavator and the video about it. Search by example and sketch, by dominant colour and colour sets with their percentage, by spatial […]

  8. […] Pixolu MUFIN Piximilar. Visual similarity search for large image collections, it can be used in combination with keywords to refine searches on extremely large collections. XCavator (see video) […]

  9. […] it can be used in combination with keywords to refine searches on extremely large collections. XCavator (see video) ImBrowse. A Browser for Large Image Databases. Recogmission, their porno filter and search by […]

  10. […] it can be used in combination with keywords to refine searches on extremely large collections. XCavator (see video) ImBrowse. A Browser for Large Image Databases. Recogmission, their porno filter and search […]

  11. […] Squared circle Color Selectr Multicolr Search Lab Exalead Chromatik Search Etsy Kuler XCavator (see video) […]

  12. […] Pixolu MUFIN Piximilar. Visual similarity search for large image collections, it can be used in combination with keywords to refine searches on extremely large collections. XCavator (see video) […]

  13. […] it can be used in combination with keywords to refine searches on extremely large collections. XCavator (see video) ImBrowse. A Browser for Large Image Databases. Recogmission, their porno filter and search by […]

  14. […] it’s “face recognition and text recognition, to search your personal photos”. XCavator and the video about it. Search by example and sketch, by dominant colour and colour sets with their percentage, by spatial […]

