Université de Fribourg

HistoSketch: fast similarity-preserving sketching of streaming histograms with concept drift

Yang, Dingqi ; Li, Bin ; Rettig, Laura ; Cudré-Mauroux, Philippe

In: 2017 IEEE International Conference on Data Mining (ICDM), 2017, p. 545–554

Histogram-based similarity has been widely adopted in many machine learning tasks. However, measuring histogram similarity is a challenging task for streaming data, where the elements of a histogram are observed in a streaming manner. First, the ever-growing cardinality of histogram elements makes any similarity computation inefficient. Second, the concept-drift issue in the data streams also...