January 12, 2023
Efficient search for information in large volumes of oral history audiovisual data rose to prominence around the turn of the century. At that time, substantial amounts of archive material (until then stored on film, video tape and various analogue audio media) started to be digitized, and digital personal recording devices became so affordable that the amount of newly generated content also began to grow geometrically. One of the first digital archives in need of efficient search capabilities was the collection of testimonies given by Holocaust survivors, recorded by the Survivors of the Shoah Visual History Foundation (now USC Shoah Foundation). A consortium of research teams from the US and the Czech Republic was established in 2001 and started to build a system that used automatic speech recognition and information retrieval techniques to give users an effective and user-friendly way of accessing the information contained in the archive.
The aim of the talk is to provide a detailed overview of the core methods that our team has been using in both automatic speech recognition (from HMM-based systems used at the beginning to modern end-to-end neural models that are used today) and information retrieval (from the heuristic approach that was implemented about 10 years ago as a proof-of-concept to the current transformer-based solution). We will also show the progress in the system's performance over the two decades of work on this task, as well as the evolution of the graphical user interface used for the actual access to the collection.
Pavel Ircing is an associate professor at the Department of Cybernetics, Faculty of Applied Sciences, University of West Bohemia (UWB). His research interests include speech recognition and information retrieval from speech data. He is the author or co-author of over 80 scientific publications in those areas. He spent over a year in total as a visiting scholar at the Center for Language and Speech Processing, Johns Hopkins University, in 1999, 2000 and 2004. Among other projects, he served as UWB’s principal investigator for the NSF-funded project MALACH (2001-2007), whose aim was to employ speech recognition and information retrieval techniques to improve access to large archives of testimonies given by Holocaust survivors. The international cooperation and research efforts that started during this project are still active today.
Jan Švec is a researcher at the Department of Cybernetics, Faculty of Applied Sciences, University of West Bohemia. His research focuses on spoken dialog systems, speech recognition and spoken term detection. He applies state-of-the-art neural models trained using the transfer learning paradigm in many practical applications, including speech recognition for oral history archives and spoken language understanding. He designed the overall architecture of the web-based audiovisual archive technology used in many projects, such as the MALACH project and the archives of the Czech Institute for the Study of Totalitarian Regimes.
The seminar's program consists of a one-hour lecture followed by a discussion with no time limit. Each lecture is based on something extraordinary (by international standards), or at least remarkable, that the speaker has discovered and will explain in a way that is understandable and interesting to the broader computer science community. Lectures are normally given in English.
The seminar is prepared by an organizing committee consisting of Roman Barták (MFF UK), Jaroslav Hlinka (ÚI AV ČR), Michal Chytil, Pavel Kordík (FIT ČVUT), Michal Koucký (MFF UK), Jan Kybic (FEL ČVUT), Michal Pěchouček (FEL ČVUT), Jiří Sgall (MFF UK), Vojtěch Svátek (FIS VŠE), Michal Šorel (ÚTIA AV ČR), Tomáš Werner (FEL ČVUT) and Filip Železný (FEL ČVUT).
The idea of the Prague Computer Science Seminar (Pražský informatický seminář) arose from discussions among representatives of several research institutions on how to overcome the unnecessary fragmentation of the computer science community in the Czech Republic.