Decompressing Massive Datasets

Relatore:  Simon J. Puglisi - University of Helsinki
  martedì 27 novembre 2018 alle ore 16.30 - Sala verde
Simple and fast decoding is one of the main advantages of LZ-type text encoding used in many popular file compressors and software systems. However, the standard LZ decoding algorithm - which has not changed for 40 years - makes random accesses to the text and cannot be trivially modified to deal with files whose decompressed size exceeds that of main memory. This talk explores two algorithmic approaches to scaling LZ decoding to massive datasets. One of these approaches constitutes the first external memory algorithms for LZ decoding. We show the I/O complexity of these algorithms is optimal and demonstrate through experiments that they are very fast in practice. The second approach is a suite of algorithms that use working space proportional to the size of the compressed data and that streams their output - avoiding access to it during decoding.

Contact person: Zsuzsanna Lipták


Data pubblicazione
18 ottobre 2018

Offerta formativa