The future of compressed data structures, 20 years after the FM-index
July 19th - July 20th, 2022
Speakers
Aim & Scope
In November 2000, at the 41st Annual Symposium on Foundations of Computer Science, the paper “Opportunistic Data Structures with Applications” showed how the Burrows-Wheeler Transform (BWT), a powerful data transformation introduced in 1994 for data compression, could also be used for efficiently searching patterns in the BWT-compressed string. This result established for the first time that it is possible to compress a dataset up to the theoretical limit of its entropy and efficiently search for the occurrences of arbitrary substrings in it. The resulting data structure, called FM-index, was instrumental in bioinformatics, and indeed it is at the core of two alignment tools. Among them, it is worth mentioning that BWA and Bowtie are widely used in BioMed laboratories worldwide.
After introducing the FM-index, many researchers obtained other significant results for compactly representing different kinds of data and strings, often maintaining optimal search speed. The overall effect has been a flourish of theoretical results, algorithms, and software libraries that have provided solid theoretical and practical foundations for the field of Algorithmics which is now called Compressed Data Structures. In this era of big data, the need for compressing them is even more stringent and challenging because of new types of data to be managed (as Neural Networks), new computing architectures to be used (GPUs/TPUs), and sophisticated computational constraints that vary across users, devices and applications, and possibly evolve.
This event aims to present current state-of-the-art and future challenges on this topic by bringing to the Lipari Ph.D. School the topmost researchers in the field; we foresee strong interactions between audience and speakers to widen the knowledge of these powerful algorithmic techniques and identify new applications and open problems.
This Workshop has been supported by the Italian Ministry of University and Research under the project "Multicriteria Data Structures and Algorithms: from compressed to learned indexes, and beyond" (PRIN no. 2017WR7SHH)
Schedule
July 19th, 3 pm: Compressed indexes and Bioinformatics applications
- Paolo Ferragina
Opening - Richard Durbin
TBA - Gene Myers
Merging Sorted Lists of Similar Strings - Bud Mishra
Genomics Algorithms: Try again! Fail again!! Fail Better!!! - Travis Gagie
Pangenomic FM-indexes: Alignment and Beyond - Simona Rombo
Extraction of Functional Knowledge from Large Biological Graphs and Applications in Precision Medicine - Alfredo Pulvirenti
Knowledge Graphs Inference from Biomedical literature
July 20th, 3 pm: Foundations of data-centric AI and algorithms, with applications
- Gonzalo Navarro
The Ring: Worst-Case Optimal Graph Joins in Compressed Space - Simon Gog
The virtues of compressed data structure in modern information retrieval systems - Nicola Prezza
A theory of (co-lex) ordered regular languages - Giorgio Vinciguerra
Learning-based approaches to compressed data structures design - Antonio Arcidiacono
AI tools to better inform European Citizens
Travel and accommodation
Participants will be arranged in comfortable hotels at very special rates. The conference room (located at Hotel Giardino sul Mare, Via Maddalena, Lipari) is air-conditioned and equipped with all conference materials. Special areas are reserved to the students for the afternoon coursework and study. The island of Lipari can be easily reached from Milazzo, Palermo, Naples, Messina, and Reggio Calabria by ferry or hydrofoil (50 minutes from Milazzo).
Registration
Participants can apply to the Workshop or the Lipari School on Computational Complex and Social Systems. For details regarding the application to the school, please refer to the website https://complex22.liparischool.it.
The registration fee for the workshop participants is 150€, which covers participation in the Workshop, materials, and coffee breaks. Transports and social events are not included.
Applications can be submitted up to July 1st, 2022.
Applicants should include an updated link to an official website. In case of cancellation after fees payments, a 25% penalty will be applied. In the case of credit card payment, processing fees cannot be reimbursed. Payment must be received as soon as possible, and in any case, no later than July 1st, 2022. Only wire transfers or credit card payments can be accepted.