Tutorial

Algorithms for generating and evaluating visually sorted grid layouts

Duration: half-day

Speaker: Prof. Dr. Kai Uwe Barthel

256 IKEA kitchenware images shown in random order
The same images sorted by visual similarity
Description

The increasing amount of visual data shared online highlights the importance of organizing and finding related content. However, current efforts to improve visual search and image classification lack support for exploratory image search. Sorting images by similarity offers a solution, allowing users to view and recognize several hundred images at once. This tutorial covers the main principles of image sorting techniques, including visual feature vectors, various sorting algorithms, and metrics used to evaluate sorting results. It also presents a new sorting algorithm (Linear Assignment Sorting), efficient optimization techniques, and coding examples. By the end of the tutorial, participants will be able to implement image sorting techniques and address any special requirements related to layout and positioning constraints.

Motivation

As the internet produces and shares more visual data, the task of organizing and finding related content becomes increasingly difficult and important. Despite efforts to improve visual search and image classification, there is a lack of support for exploratory image search.

Humans can easily understand complex images, but have difficulties with a large number of unordered individual images. When searching photo archives or trying to find products online, users are often presented with vast collections of images. However, as human perception is limited, overview is quickly lost when too many images are displayed at once. Typically, only about 10-20 images can be perceived on a single screen, which is a small fraction of the number of available images. Because image archives and e-commerce websites do not offer visual browsing or exploration of their collection, users are left with unstructured lists of images from keyword or similarity searches.

One solution to this problem is to sort or arrange images by similarity, which enables users to view and recognize up to several hundred images at once (see example above). Although the sorted images may not be recognized perfectly, users can quickly identify where images of interest are located. Conventional dimensionality reduction schemes that project high-dimensional visual feature vectors to 2D cannot be used for image sorting because they result in unequally distributed and overlapping images. Because the number of ways to arrange images in a dense regular grid grows factorially with the number of grid cells, finding the optimal arrangement becomes impractical. However, approximate solutions can be obtained with self-organizing maps (SOMs), self-sorting maps (SSMs), or discrete optimization algorithms.
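To make the factorial growth and the idea of an approximate discrete optimization concrete, the following is a minimal sketch, not one of the algorithms covered in the tutorial. It assumes random RGB triples as stand-ins for visual feature vectors and uses a plain greedy pairwise-swap search; the names `neighbor_cost` and `swap_sort` are hypothetical.

```python
import math
import random

def neighbor_cost(grid, rows, cols):
    """Sum of squared distances between horizontally and vertically adjacent cells."""
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            v = grid[r * cols + c]
            if c + 1 < cols:  # right neighbor
                total += sum((a - b) ** 2 for a, b in zip(v, grid[r * cols + c + 1]))
            if r + 1 < rows:  # lower neighbor
                total += sum((a - b) ** 2 for a, b in zip(v, grid[(r + 1) * cols + c]))
    return total

def swap_sort(vectors, rows, cols, iterations=3000, seed=0):
    """Greedy search: try random pairwise swaps, keep only those that lower the cost."""
    rng = random.Random(seed)
    grid = list(vectors)
    cost = neighbor_cost(grid, rows, cols)
    for _ in range(iterations):
        i, j = rng.sample(range(len(grid)), 2)
        grid[i], grid[j] = grid[j], grid[i]
        new_cost = neighbor_cost(grid, rows, cols)
        if new_cost < cost:
            cost = new_cost
        else:
            grid[i], grid[j] = grid[j], grid[i]  # undo the swap
    return grid, cost

if __name__ == "__main__":
    rows, cols = 6, 6
    rng = random.Random(1)
    # Assumption: random RGB triples stand in for real image feature vectors.
    colors = [(rng.random(), rng.random(), rng.random()) for _ in range(rows * cols)]
    print("possible arrangements:", math.factorial(rows * cols))
    print("cost before:", neighbor_cost(colors, rows, cols))
    _, cost_after = swap_sort(colors, rows, cols)
    print("cost after: ", cost_after)
```

A search like this only reaches a local optimum and recomputes the full cost for every candidate swap, so it does not scale; it is meant to illustrate the problem setting rather than a practical solution.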

In this tutorial, participants acquire knowledge of the main principles of image sorting techniques, including the requirements for the visual feature vectors to be used. Various sorting algorithms are presented, as well as the results of extensive user testing. Various metrics used to evaluate the sorting results are also presented, and their correlation with human perception is evaluated. In addition, the tutorial presents a new sorting algorithm (Linear Assignment Sorting), efficient optimization techniques, and coding examples. By the end of the tutorial, participants will be equipped to implement image sorting techniques and address any special requirements related to layout and positioning constraints.

Indicative breakdown and description
  • Welcome participants and obtain information on their background and goals for attending the tutorial. Provide an overview of the tutorial’s goals, as well as distribute the necessary tutorial materials such as Jupyter notebooks that serve as supplementary materials for programming.
  • Explain the limitations of dimensionality reduction techniques such as PCA, Multidimensional Scaling (MDS), Isomap, Locally Linear Embedding (LLE), and t-Distributed Stochastic Neighbor Embedding (t-SNE), and why they are not suitable for organizing and sorting images.
  • Introduce the main concepts of image sorting techniques, including Self Organizing Maps (SOM), Self-Sorting Maps (SSM), and Linear Assignment Sorting (LAS), as well as neural networks for learning permutations.
  • Provide an overview of which visual feature vectors are best suited for visual image sorting.
  • Report on human perception of large image sets and describe evaluation metrics for 2D image arrangements.
  • Explain optimization techniques for fast image sorting, including filtering using integral images, fast matching / swapping using a Linear Assignment Problem solver, and other techniques.
  • Provide tips and tricks for sorting images, particularly when there are constraints on the layout shape and fixed positioning of specific images.
  • Extend image sorting techniques to the visualization of image graphs for continuously changing image sets to enable visual exploration / recommendation of image collections.
  • Conclude with a final discussion to answer any remaining questions and reinforce key takeaways from the tutorial.
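
As an illustration of the "filtering using integral images" item above, the sketch below computes a box (mean) filter over a 2D map with a summed-area table, so each output cell costs four lookups regardless of the filter radius. It is a generic version of the technique, not the tutorial's implementation: it assumes one scalar per grid cell (a feature vector would be filtered per dimension) and truncates the window at the borders, and the function names are hypothetical.

```python
def integral_image(values):
    """Summed-area table with an extra leading row and column of zeros."""
    rows, cols = len(values), len(values[0])
    sat = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0.0
        for c in range(cols):
            row_sum += values[r][c]
            # Sum of everything above plus the running sum of the current row.
            sat[r + 1][c + 1] = sat[r][c + 1] + row_sum
    return sat

def box_filter(values, radius):
    """Mean over a (2*radius+1)^2 window, truncated at the grid borders."""
    rows, cols = len(values), len(values[0])
    sat = integral_image(values)
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        r0, r1 = max(0, r - radius), min(rows, r + radius + 1)
        for c in range(cols):
            c0, c1 = max(0, c - radius), min(cols, c + radius + 1)
            # Four lookups give the window sum, independent of the radius.
            total = sat[r1][c1] - sat[r0][c1] - sat[r1][c0] + sat[r0][c0]
            out[r][c] = total / ((r1 - r0) * (c1 - c0))
    return out

if __name__ == "__main__":
    grid = [[1.0, 2.0, 3.0],
            [4.0, 5.0, 6.0],
            [7.0, 8.0, 9.0]]
    for row in box_filter(grid, radius=1):
        print(row)
```

Other border policies, such as wrap-around, are possible; truncation is used here only because it keeps the example short.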

The tutorial provides content that may be of interest to the following conference topics:

  • Multimedia content-based search and retrieval
  • Large-scale and web-scale multimedia retrieval
  • User intent and human perception in multimedia retrieval
  • Interactive recommendation systems
  • Multimedia browsing, summarization, and visualization
  • Applications of multimedia retrieval
GitHub

Additional information can be found on GitHub. Further information will be made available there throughout the conference.

Relevant Publications
  1. Kai Uwe Barthel, Nico Hezel, Klaus Jung, and Konstantin Schall. 2023.
    Improved Evaluation and Generation of Grid Layouts Using Distance Preservation Quality and Linear Assignment Sorting, Computer Graphics Forum Journal, https://doi.org/10.1111/cgf.14718
  2. Kai Uwe Barthel, Nico Hezel, Konstantin Schall, and Klaus Jung. 2022.
    Combining Semantic and Visual Image Graphs for Efficient Search and Exploration of Large Dynamic Image Collections, In Proceedings of the 2nd International Workshop on Interactive Multimedia Retrieval (Lisboa, Portugal) (IMuR ’22). Association for Computing Machinery, New York, NY, USA, pp. 1–8, https://doi.org/10.1145/3552467.3554796
  3. Kai Uwe Barthel and Nico Hezel. 2019.
    Visually Exploring Millions of Images using Image Maps and Graphs, John Wiley & Sons, Ltd, ch. 11, pp. 289–315, https://doi.org/10.1002/9781119376996.ch11
  4. Kai Uwe Barthel, Nico Hezel, and Klaus Jung. 2017.
    Visually Browsing Millions of Images Using Image Graphs, In Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval (Bucharest, Romania), Association for Computing Machinery, New York, NY, USA, pp. 475–479, https://doi.org/10.1145/3078971.3079016
Slides
Contact
Prof. Dr. Kai Uwe Barthel

Professor for Media and Computing at HTW Berlin

Wilhelminenhofstraße 75a, 12459 Berlin, Germany

barthel@htw-berlin.de

Kai Uwe Barthel is a professor at the Institute for Media and Computing at the University of Applied Sciences in Berlin, Germany (HTW Berlin), where he leads the Visual Computing Group. His main area of expertise is the development of technologies and applications that simplify the search for media content. In his research and teaching, he emphasizes the theory, design, and development of digital media systems for the analysis and comprehension of digital images and video. His current research interests include automatic image keywording, content-based image retrieval, metric learning, image sorting and clustering, and visual image navigation systems.

As part of his doctoral dissertation at the Communication Systems Lab at Technische Universität Berlin, Germany, Prof. Barthel developed fractal image compression schemes that at the time outperformed the JPEG standard by a large margin. After leading a research project on 3D video coding at the Technical University of Berlin in 1997, he became head of R&D at N-Tec Media and LuraTech Inc. in Berlin, where hardware and software solutions for image and video compression were developed. He led research and development teams in image compression and mixed raster technology, for which two patents were awarded in 1997 and 1999. In addition, he was a member of the JPEG2000 standardization committee.

In 2001, Kai Barthel became a professor for visual computing at HTW Berlin, where he teaches courses such as image analysis, machine learning, computer vision, and visual information retrieval. In 2009, he founded pixolution (https://pixolution.org/), a company for visual image search. Pixolution’s visual search technology is used by many stock image agencies.

Prof. Barthel has numerous publications, presentations, and workshops to his credit. He has received several awards for user-centered approaches to fast searching of images and videos. In 2019, the HTW Visual Computing Group won the ACM Multimedia “Best Demo” award (second place) for showcasing wikiview.net (graph-based image exploration of millions of Wikimedia images). In 2016, 2022, and 2023, he and his team won the annual “Best Video Browser Award”, where the task was to find video clips as fast as possible in over 2300 hours of video content.

Demo systems, publications, and awards can be found at http://www.visual-computing.com/