In: IEEE 37th International Conference on Data Engineering (ICDE), 2021, p. 2661-2664
Anomaly detection is a fundamental problem that consists of identifying irregular patterns that do not conform to the expected behavior of a system or the generated data. Many anomaly detection techniques have been proposed for time series data. However, selecting the most suitable detection method remains challenging as the proposed techniques widely vary in performance. The appropriate...
|
In: WWW '20: Proceedings of The Web Conference 2020, 2020, vol. April, p. 1851-1862
Finding social influencers is a fundamental task in many online applications ranging from brand marketing to opinion mining. Existing methods heavily rely on the availability of expert labels, whose collection is usually a laborious process even for domain experts. Using open-ended questions, crowdsourcing provides a cost-effective way to find a large number of social influencers in a short...
|
In: Knowledge and Information Systems, 2020, vol. 62, p. 2257-2280
Missing values are very common in real-world data, including time-series data. Failures in power, communication or storage can leave occasional blocks of data missing in multiple series, not only affecting real-time monitoring but also compromising the quality of data analysis. Traditional recovery (imputation) techniques often leverage the correlation across time series to recover missing...
|
In: The Web Conference 2021, Ljubljana, Slovenia, April 12-23, 2021, 2021, p. 1-12
Scientific peer review is pivotal to maintaining quality standards for academic publication. The effectiveness of the reviewing process is currently being challenged by the rapid increase in paper submissions at various conferences. Those venues need to recruit a large number of reviewers with different levels of expertise and background. The submitted reviews often do not meet the conformity...
|
In: Proceedings of the VLDB Endowment, 2020, vol. 14, no. 3, p. 294-306
With the emergence of the Internet of Things (IoT), time series streams have become ubiquitous in our daily life. Recording such data is rarely a perfect process, as sensor failures frequently occur, yielding occasional blocks of data that go missing in multiple time series. These missing blocks not only affect real-time monitoring but also compromise the quality of online data analyses....
|
In: Proceedings of the VLDB Endowment, 2020, vol. 13, no. 5, p. 768-782
Recording sensor data is seldom a perfect process. Failures in power, communication or storage can leave occasional blocks of data missing, not only affecting real-time monitoring but also compromising the quality of near- and off-line data analysis. Several recovery (imputation) algorithms have been proposed to replace missing blocks. Unfortunately, little is known about their relative...
|
In: 2018 IEEE International Conference on Big Data (Big Data), 2018, p. 2253–2262
Large knowledge bases typically contain data adhering to various schemas with incomplete and/or noisy type information. This seriously complicates further integration and post-processing efforts, as type information is crucial in correctly handling the data. In this paper, we introduce a novel statistical type inference method, called StaTIX, to effectively infer instance types in Linked Data...
|
In: 2018 IEEE International Conference on Data Mining Workshops (ICDMW), 2018, p. 1481–1486
There is a great diversity of clustering and community detection algorithms, which are key components of many data analysis and exploration systems. To the best of our knowledge, however, there is not yet any uniform benchmarking framework that is publicly available and suitable for the parallel benchmarking of diverse clustering algorithms on a wide range of synthetic and...
|