The future of compressed data structures, 20 years after the FM-index

July 19th - July 20th, 2022

Apply

Aim & Scope

In November 2000, at the 41st Annual Symposium on Foundations of Computer Science, the paper “Opportunistic Data Structures with Applications” showed how the Burrows-Wheeler Transform (BWT), a powerful data transformation introduced in 1994  for data compression, could also be used for efficiently searching patterns in the BWT-compressed string. This result established for the first time that it is possible to compress a dataset up to the theoretical limit of its entropy and efficiently search for the occurrences of arbitrary substrings in it. The resulting data structure, called FM-index, was instrumental in bioinformatics, and indeed it is at the core of two alignment tools. Among them, it is worth mentioning that BWA and Bowtie are widely used in BioMed laboratories worldwide.

After introducing the FM-index, many researchers obtained other significant results for compactly representing different kinds of data and strings, often maintaining optimal search speed. The overall effect has been a flourish of theoretical results, algorithms, and software libraries that have provided solid theoretical and practical foundations for the field of Algorithmics which is now called Compressed Data Structures. In this era of big data, the need for compressing them is even more stringent and challenging because of new types of data to be managed (as Neural Networks), new computing architectures to be used (GPUs/TPUs), and sophisticated computational constraints that vary across users, devices and applications, and possibly evolve.

This event aims to present current state-of-the-art and future challenges on this topic by bringing to the Lipari Ph.D. School the topmost researchers in the field; we foresee strong interactions between audience and speakers to widen the knowledge of these powerful algorithmic techniques and identify new applications and open problems.

This Workshop has been supported by the Italian Ministry of University and Research under the project "Multicriteria Data Structures and Algorithms: from compressed to learned indexes, and beyond" (PRIN no. 2017WR7SHH)

Schedule

July 19th, 3 pm: Compressed indexes and Bioinformatics applications

  • Paolo Ferragina
    Opening
  • Richard Durbin
    TBA
  • Gene Myers
    Merging Sorted Lists of Similar Strings
  • Bud Mishra
    Genomics Algorithms: Try again! Fail again!! Fail Better!!!
  • Travis Gagie
    Pangenomic FM-indexes: Alignment and Beyond
  • Simona Rombo
    Extraction of Functional Knowledge from Large Biological Graphs and Applications in Precision Medicine
  • Alfredo Pulvirenti
    Knowledge Graphs Inference from Biomedical literature

July 20th, 3 pm: Foundations of data-centric AI and algorithms, with applications

  • Gonzalo Navarro
    The Ring: Worst-Case Optimal Graph Joins in Compressed Space
  • Simon Gog
    The virtues of compressed data structure in modern information retrieval systems
  • Nicola Prezza
    A theory of (co-lex) ordered regular languages
  • Giorgio Vinciguerra
    Learning-based approaches to compressed data structures design
  • Antonio Arcidiacono
    AI tools to better inform European Citizens
     

Travel and accommodation

Participants will be arranged in comfortable hotels at very special rates. The conference room (located at Hotel Giardino sul Mare, Via Maddalena, Lipari) is air-conditioned and equipped with all conference materials. Special areas are reserved to the students for the afternoon coursework and study. The island of Lipari can be easily reached from Milazzo, Palermo, Naples, Messina, and Reggio Calabria by ferry or hydrofoil (50 minutes from Milazzo).

Registration

Participants can apply to the Workshop or the Lipari School on Computational Complex and Social Systems. For details regarding the application to the school, please refer to the website https://complex22.liparischool.it.

The registration fee for the workshop participants is 150€, which covers participation in the Workshop, materials, and coffee breaks. Transports and social events are not included.

Applications can be submitted up to July 1st, 2022

Applicants should include an updated link to an official website. In case of cancellation after fees payments, a 25% penalty will be applied. In the case of credit card payment, processing fees cannot be reimbursed. Payment must be received as soon as possible, and in any case, no later than July 1st, 2022. Only wire transfers or credit card payments can be accepted.

Sponsors