The twenty-sixth meeting of the Prague computer science seminar

Tomáš Skopal

Similarity search in unstructured data

February 23, 2017

4:00pm

Auditorium E-107, FEL CTU
Karlovo nám. 13, Praha 2

Lecture annotation

Nowadays, in the "Big Data" era, we often encounter data that come from sensors digitizing the "signals of nature", where their technical data structure is used merely for manipulation and reproduction. We often think of multimedia (image, audio) as the prominent non-structured data types; however, general sensory data are much more diverse. There are abstract similarity models used for searching non-structured data, where the data entities are represented by domain-specific descriptors (e.g., high-dimensional vectors, time series or strings). The similarity of two entities is then measured as the distance between their descriptors, so the problem is geometrized as searching for the descriptors nearest to the descriptor of the query object.
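The geometrized problem described above can be illustrated with a minimal sketch. It assumes descriptors are short numeric vectors and uses the Euclidean distance with a plain linear scan; the names `euclidean`, `knn_query` and the toy `database` are hypothetical and are not taken from the lecture.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Descriptor = Sequence[float]


def euclidean(a: Descriptor, b: Descriptor) -> float:
    """Euclidean (L2) distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_query(
    query: Descriptor,
    database: Dict[str, Descriptor],
    distance: Callable[[Descriptor, Descriptor], float],
    k: int = 1,
) -> List[Tuple[str, float]]:
    """Return the k database entities whose descriptors are nearest to the query.

    This is a sequential scan: every descriptor is compared with the query.
    """
    scored = [(name, distance(query, desc)) for name, desc in database.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Hypothetical 3-dimensional descriptors (e.g., coarse color features).
database = {
    "sunset": (0.9, 0.4, 0.1),
    "forest": (0.1, 0.8, 0.2),
    "ocean": (0.1, 0.3, 0.9),
}

nearest = knn_query((0.8, 0.5, 0.2), database, euclidean, k=2)
for name, dist in nearest:
    print(f"{name}: {dist:.3f}")
```

Because the distance function is passed in as a parameter, the same query routine would work unchanged for other descriptor types, such as time series or strings, given a suitable distance.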

The geometry of similarity spaces is very important for database indexing, i.e., techniques for speeding up the search, but also for modeling the similarity and the descriptor itself. In the talk we will show that the implicit Euclidean perception of space is not the only possibility; the more general metric space model is also very popular. One could even develop unique distance spaces whose topological properties are directly derived from the data. We will also discuss problems related to similarity modeling, especially the choice between semantic descriptors and smart similarity functions.
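To make the role of the metric space model in indexing more concrete, here is a small sketch under stated assumptions. It uses the edit (Levenshtein) distance on strings, which is a metric but not a Euclidean distance, and a single pivot with precomputed distances to discard objects through the triangle inequality. The class `PivotIndex` and its methods are hypothetical illustrations of pivot-based filtering in general, not the specific techniques presented in the talk.

```python
from typing import Callable, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


class PivotIndex:
    """Single-pivot table: stores d(object, pivot) for every indexed object."""

    def __init__(self, objects: List[str], pivot: str,
                 distance: Callable[[str, str], int]) -> None:
        self.pivot = pivot
        self.distance = distance
        self.table = [(obj, distance(obj, pivot)) for obj in objects]

    def range_query(self, query: str, radius: int) -> Tuple[List[str], int]:
        """Return objects within `radius` of the query and the number of
        distance computations performed at query time."""
        d_qp = self.distance(query, self.pivot)
        computed = 1
        hits = []
        for obj, d_op in self.table:
            # Triangle inequality: |d(q,p) - d(o,p)| <= d(q,o).
            # If this lower bound exceeds the radius, obj cannot qualify.
            if abs(d_qp - d_op) > radius:
                continue
            computed += 1
            if self.distance(query, obj) <= radius:
                hits.append(obj)
        return hits, computed


words = ["metric", "metrics", "matrix", "similarity", "dissimilarity", "vector"]
index = PivotIndex(words, pivot="metric", distance=edit_distance)
hits, computed = index.range_query("metrik", radius=1)
print(hits, computed)
```

The filtering step relies only on the metric axioms, so it remains correct for any distance that satisfies them; it would not be safe for a similarity function that violates the triangle inequality.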

Lecturer

Tomáš Skopal

Tomáš Skopal works in the area of similarity search and various topics connected to multimedia databases and information retrieval. He is an associate professor and currently the head of the Department of Software Engineering at the Faculty of Mathematics and Physics, Charles University, Prague. He is also the leader of the successful SIRET (SImilarity RETrieval) research group, which he founded in 2006. He received his MSc from Palacký University in Olomouc and his PhD from the Technical University of Ostrava (VŠB). Afterwards, he moved to Charles University in Prague and also worked as a visiting professor and researcher at the University of Konstanz, Germany, and at the DCC, University of Chile, Santiago.

ABOUT THE PRAGUE COMPUTER SCIENCE SEMINAR

The seminar usually takes place on the 4th Thursday of each month at 4:00pm (except June, July, August and December), alternating between the buildings of the Faculty of Electrical Engineering, Czech Technical University, Karlovo nám. 13, Praha 2, and the Faculty of Mathematics and Physics, Charles University, Malostranské nám. 25, Praha 1.

Its program consists of a one-hour lecture followed by a discussion. The lecture is based on an (internationally) exceptional or remarkable achievement of the lecturer, presented in a way which is comprehensible and interesting to a broad computer science community. The lectures are in English.

The seminar is organized by the organizational committee consisting of Roman Barták (Charles University, Faculty of Mathematics and Physics), Jaroslav Hlinka (Czech Academy of Sciences, Computer Science Institute), Michal Chytil, Pavel Kordík (Czech Tech. Univ., Faculty of Information Technologies), Michal Koucký (Charles University, Faculty of Mathematics and Physics), Jan Kybic (Czech Tech. Univ., Faculty of Electrical Engineering), Michal Pěchouček (Czech Tech. Univ., Faculty of Electrical Engineering), Jiří Sgall (Charles University, Faculty of Mathematics and Physics), Vojtěch Svátek (University of Economics, Faculty of Informatics and Statistics), Michal Šorel (Czech Academy of Sciences, Institute of Information Theory and Automation), Tomáš Werner (Czech Tech. Univ., Faculty of Electrical Engineering), and Filip Železný (Czech Tech. Univ., Faculty of Electrical Engineering).

The idea of organizing this seminar emerged from discussions among representatives of several research institutes on how to avoid the undesired fragmentation of the Czech computer science community.
