Influence of downsampling filter characteristics on compression performance in wavelet-based scalable video coding September 23, 2008
Posted by whaldsz in: research
The application of different downsampling filters in video coding directly models visual information at lower resolutions and influences the compression performance of a chosen coding system. In wavelet-based scalable video coding the spatial scalability is achieved by the application of wavelets as downsampling filters. However, characteristics of different wavelets influence the performance at targeting spatio-temporal decoding points. An analysis of different downsampling filters in popular wavelet-based scalable video coding schemes is presented. Evaluation is performed for both intra- and inter-coding schemes using wavelets and standard downsampling strategies. On the basis of the obtained results a new concept of inter-resolution prediction is proposed, which maximises the average performance using a combination of standard downsampling filters and wavelet-based coding.
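To make the idea of "a wavelet as a downsampling filter" concrete, here is a minimal sketch on a single row of pixel values. It assumes the Haar wavelet purely because it is the simplest one; the abstract does not say which wavelets or standard filters were evaluated, and the function names are my own.

```python
import math


def haar_lowpass(signal):
    """One level of the orthonormal Haar low-pass band: (a + b) / sqrt(2) per pair."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    return [(signal[i] + signal[i + 1]) / math.sqrt(2)
            for i in range(0, len(signal), 2)]


def average_downsample(signal):
    """A simple non-wavelet alternative: the mean of each pair of samples."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    return [(signal[i] + signal[i + 1]) / 2
            for i in range(0, len(signal), 2)]


def decimate(signal):
    """No filtering at all: keep every second sample."""
    return signal[::2]


row = [10, 12, 50, 52, 90, 20, 20, 20]
print(haar_lowpass(row))
print(average_downsample(row))
print(decimate(row))
```

All three produce a half-resolution row, but the values differ: the Haar band is scaled by sqrt(2) relative to the pairwise mean, and plain decimation ignores the sharp drop from 90 to 20 entirely. Differences of this kind are what the low-resolution layer of a scalable coder ends up encoding.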
Class-Based Feature Matching Across Unrestricted Transformations September 23, 2008
Posted by whaldsz in: research
We develop a novel method for class-based feature matching across large changes in viewing conditions. The method is based on the property that when objects share a similar part, the similarity is preserved across viewing conditions. Given a feature and a training set of object images, we first identify the subset of objects that share this feature. The transformation of the feature’s appearance across viewing conditions is determined mainly by properties of the feature, rather than of the object in which it is embedded. Therefore, the transformed feature will be shared by approximately the same set of objects. Based on this consistency requirement, corresponding features can be reliably identified from a set of candidate matches. Unlike previous approaches, the proposed scheme compares feature appearances only in similar viewing conditions, rather than across different viewing conditions. As a result, the scheme is not restricted to locally planar objects or affine transformations. The approach also does not require examples of correct matches. We show that by using the proposed method, a dense set of accurate correspondences can be obtained. Experimental comparisons demonstrate that matching accuracy is significantly improved over previous schemes. Finally, we show that the scheme can be successfully used for invariant object recognition.
Design of Multimodal Dissimilarity Spaces for Retrieval of Video Documents September 23, 2008
Posted by whaldsz in: research
This paper proposes a novel representation space for multimodal information, enabling fast and efficient retrieval of video data. We suggest describing the documents not directly by selected multimodal features (audio, visual or text), but rather by considering cross-document similarities relative to their multimodal characteristics. This idea leads us to propose a particular form of dissimilarity space that is adapted to the asymmetric classification problem, and in turn to the query-by-example and relevance feedback paradigm, widely used in information retrieval. Based on the proposed dissimilarity space, we then define various strategies to fuse modalities through a kernel-based learning approach. The problem of automatic kernel setting to adapt the learning process to the queries is also discussed. The properties of our strategies are studied and validated on artificial data. In a second phase, a large annotated video corpus (i.e. TRECVID-05), indexed by visual, audio and text features, is considered to evaluate the overall performance of the dissimilarity space and fusion strategies. The obtained results confirm the validity of the proposed approach for the representation and retrieval of multimodal information in a real-time framework.
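As a rough illustration of what a dissimilarity space is, the sketch below describes a document by its distances to a few reference documents (for example, the positive examples in a query) instead of by its raw features. The Euclidean distance, the per-modality concatenation and all names here are assumptions made for the example; the paper's actual construction and its kernel-based fusion are not reproduced.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dissimilarity_vector(doc, prototypes, modalities):
    """Represent `doc` by its distance to every prototype, one block per modality.

    `doc` and each prototype are dicts mapping a modality name to a feature vector.
    """
    return [euclidean(doc[m], p[m]) for m in modalities for p in prototypes]


# Hypothetical toy data: two prototypes, two modalities.
prototypes = [
    {"visual": [0, 0], "audio": [1]},
    {"visual": [3, 4], "audio": [4]},
]
doc = {"visual": [3, 4], "audio": [1]}

print(dissimilarity_vector(doc, prototypes, ["visual", "audio"]))
# [5.0, 0.0, 0.0, 3.0]
```

The length of the new representation depends on the number of prototypes and modalities, not on the dimensionality of the original features.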
Using convex hull to define the initial boundary of active contour model September 5, 2008
Posted by whaldsz in: research
A few hours ago, I was reading some papers to improve my understanding of support vector machines (SVMs). I came across the term convex hull, and it reminded me of a challenge I had when I was working on image segmentation of medium to long milled rice grains for quality evaluation. Although there are some published papers that deal with this problem (e.g. “automatic segmentation of touching kernels with gradient vector flow” and “Separation and identification of touching kernels and dockage components in digital images”), they were able to separate touching grains only when the objects are roughly circular.
So I thought that using the convex hull to define the initial boundary of an active contour could be an interesting idea. But it looks like somebody has already done it: see “An unsupervised GVF snake approach for white blood cell segmentation based on nucleus”.
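For what it is worth, here is a minimal sketch of the first half of that idea: computing the convex hull of a blob's pixel coordinates, whose vertices could then serve as the starting points of a snake. It uses Andrew's monotone chain algorithm, which is my choice for the example and not necessarily what the cited paper does; the active contour evolution itself is not shown.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive means a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def half_hull(sequence):
        chain = []
        for p in sequence:
            # Drop the last vertex while it would make a clockwise or straight turn.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain

    lower = half_hull(pts)
    upper = half_hull(reversed(pts))
    # The last point of each half is the first point of the other half.
    return lower[:-1] + upper[:-1]


# Hypothetical foreground pixels (x, y) of two touching grains.
pixels = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2)]
print(convex_hull(pixels))
# [(0, 0), (4, 0), (4, 3), (0, 3)]
```

The interior points are discarded, so the hull always encloses the whole blob, which is the property an initial contour needs before it shrinks towards the object boundary.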
People Counting August 28, 2008
Posted by whaldsz in: research
We are currently doing research on people counting. The goal is to develop an algorithm that solves the problem of occlusion. More info later.