CBIR and good flickr applications
Although Content-Based Image Retrieval (CBIR), also known as Content-Based Visual Information Retrieval (CBVIR), is not new, it has still not reached the mainstream. Applications like retrievr and xcavator are attracting attention, but they are still not very common. Thanks to flickr, CBIR finally found a good database to draw its images from, and in turn CBIR makes for good flickr applications. So flickr made CBIR useful for the consumer. Beyond that, CBIR still requires a very different, non-linguistic approach to searching.
Just to give you a short introduction to the subject: CBIR is a way to search images by their visual content (colors, shapes, textures) instead of by the words attached to them, and thus a way to overcome the problem of finding images in a large database. So if you look at retrievr, you can make a sketch and the relevant images appear. If you haven’t already, check out the great Art of retrievr page, which contains some very nice sketches people made using the sketch pad. Check the Wikipedia page on CBIR if you want more information and a good introduction to the subject.
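To make the idea of searching by content a bit more concrete, here is a minimal Python sketch. It is only an illustration and an assumption on my part: it is not how retrievr or xcavator actually work. It uses one of the simplest content features, a coarse color histogram, and compares two histograms with histogram intersection. The function names and the tiny made-up "images" (plain lists of RGB pixels) are hypothetical.

```python
from collections import Counter


def color_histogram(pixels, bins=4):
    """Return a normalized, coarse color histogram for a list of (r, g, b) pixels."""
    step = 256 // bins  # width of one bucket per color channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bucket: n / total for bucket, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity between 0 and 1; 1.0 means identical color distributions."""
    return sum(min(share, h2.get(bucket, 0.0)) for bucket, share in h1.items())


def rank_by_similarity(query_pixels, database):
    """Sort the image names in `database` (name -> pixels), most similar first."""
    query_hist = color_histogram(query_pixels)
    scores = {
        name: histogram_intersection(query_hist, color_histogram(pixels))
        for name, pixels in database.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Made-up example data: a "sketch" and three tiny "photos".
sketch = [(250, 10, 10)] * 6 + [(10, 10, 250)] * 2        # mostly red, some blue
database = {
    "sunset": [(240, 20, 20)] * 5 + [(20, 20, 240)] * 3,  # mostly red, some blue
    "ocean": [(10, 10, 250)] * 8,                         # all blue
    "forest": [(10, 200, 10)] * 8,                        # all green
}

print(rank_by_similarity(sketch, database))
```

With this toy data the script prints ['sunset', 'ocean', 'forest']: the mostly red sketch matches the sunset best, without a single keyword or tag involved. Real CBIR systems use much richer features (shapes, textures, layout), but the principle of comparing images by what they look like is the same.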
But research is also being done on searching for relevant images in a movie database, for example by Oxford University with their Video Google. It is hard to imagine exactly what impact this will have on how we use search engines, but we can say that it will change our way of crawling through the World Wide Web in the future. We will not be bound by language. Think about it: when you are watching a movie and want to know in which movie you saw an object or composition before, you select a part of the screen (for example the shower curtain in Hitchcock’s Psycho) and search for the same image in an IMDB.com-like database containing the data of all the movie images and their compositions. Or think about face recognition.
A quick thought
Just to get some thoughts organised for myself, here is a short theory. A very important question has been on my mind for some time now: is language actually a way for us to overcome our initial inability to communicate images? Is language simply nothing more than a way to communicate images and views over distance? Or a way to initiate (technological) change and pave the way for the image?
An explanation that fascinated me comes (again) from Marshall McLuhan. The following answer comes from an interview he did with Playboy in 1969. No, it’s not a very academic source, but it is one of the clearer versions of this theory.
“When tribal man becomes phonetically literate, he may have an improved abstract intellectual grasp of the world, but most of the deeply emotional corporate family feeling is excised from his relationship with his social milieu. This division of sight and sound and meaning causes deep psychological effects, and he suffers a corresponding separation and impoverishment of his imaginative, emotional and sensory life. He begins reasoning in a sequential linear fashion; he begins categorizing and classifying data. As knowledge is extended in alphabetic form, it is localized and fragmented into specialties, creating division of function, of social classes, of nations and of knowledge–and in the process, the rich interplay of all the senses that characterized the tribal society is sacrificed.”
So, when someone learns language, he begins to think in a linear fashion. But now with CBIR we go back to the ways of “primitive and pre-alphabetic people who integrate time and space as one and live in an acoustic, horizonless, boundless, olfactory space, rather than in visual space.” Many have debated these remarks of McLuhan, but more and more we see that the World Wide Web is transforming from text-only into a fully visual experience, requiring no language except the programming language behind the screens, like a linguistic version of the Wizard of Oz.
But on the other hand, I’m typing this. How could I possibly have explained all this to you in images? To conclude these thoughts, I think that language is the gateway to the mind (personal; of ourselves), and images are the gateway to the visual world (public; of the other). The other is always an image in your head; when it becomes a word, it is personal and from your own mind.
On the Wikipedia page are some more examples of CBIR (I also tried imgSeek and IKONA) and some useful papers on the subject. Try some, and maybe think about the different approach to searching and the number of words that go around in your thoughts. In my case I suddenly realized that I was only paying attention to colors and shapes; words were far from my mind. It is definitely worth some time to check out CBIR search engines.
To finish, here is a YouTube introduction to a CBIR application called xcavator, which is also a flickr search engine with an original approach. Its developer, Cognisign, is also looking for feedback on the subject; for more info, read the xcavator blog.