Implementation of multidimensional aggregate query service for time series data
CLC Number: TP311

    Abstract:

    As the number of power quality monitoring points continues to grow, large volumes of multidimensional power quality data with time series characteristics are being generated. Existing data query methods cannot meet the need for interactive multidimensional aggregate queries over power quality monitoring data. This paper presents a method for implementing a multidimensional aggregation service for time series data. The method establishes an in-memory hash storage structure for the results of pre-aggregation tasks and a bitmap index storage structure for real-time data, and it keeps as much pre-aggregated historical data in memory as possible. This improves random read and write performance and query efficiency, which makes interactive queries practical. In addition, an optimal aggregation task selection algorithm selects as many pre-aggregation tasks as possible to raise the hit rate of interactive queries. Experiments verify the feasibility of the proposed algorithm: compared with the grouped two-dimensional knapsack algorithm, it has some advantage in the number of pre-aggregation tasks selected.
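The two storage structures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names (`BitmapIndex`, `AggregateService`), the dimension names (`site`, `metric`), and the choice of `sum` as the aggregate are assumptions made for illustration only. Real-time rows are indexed with one bitmap per (dimension, value) pair, and results of pre-aggregation tasks are kept in a hash table keyed by the query filter.

```python
from collections import defaultdict


class BitmapIndex:
    """Bitmap index over real-time rows: one bitmap per (dimension, value).

    Each bitmap is a Python int used as a bit set; bit i is set when
    row i carries that dimension value. (Illustrative assumption.)
    """

    def __init__(self):
        self.rows = []                    # measure value of each row
        self.bitmaps = defaultdict(int)   # (dimension, value) -> bitmap

    def append(self, dims, measure):
        row_id = len(self.rows)
        self.rows.append(measure)
        for dim, value in dims.items():
            self.bitmaps[(dim, value)] |= 1 << row_id

    def match(self, filters):
        """AND the bitmaps of all filters; an empty filter matches every row."""
        result = (1 << len(self.rows)) - 1
        for dim, value in filters.items():
            result &= self.bitmaps.get((dim, value), 0)
        return result

    def sum(self, filters):
        bits, row_id, total = self.match(filters), 0, 0
        while bits:
            if bits & 1:
                total += self.rows[row_id]
            bits >>= 1
            row_id += 1
        return total


class AggregateService:
    """Combines a hash table of pre-aggregated results with the bitmap index."""

    def __init__(self):
        self.realtime = BitmapIndex()
        self.preaggregated = {}   # hash table: filter key -> historical sum

    @staticmethod
    def key(filters):
        return frozenset(filters.items())

    def store_preaggregate(self, filters, value):
        self.preaggregated[self.key(filters)] = value

    def query_sum(self, filters):
        """Return (sum, hit). On a miss only real-time rows are summed;
        a full system would fall back to the historical store instead."""
        historical = self.preaggregated.get(self.key(filters))
        hit = historical is not None
        return (historical or 0) + self.realtime.sum(filters), hit


service = AggregateService()
service.store_preaggregate({"site": "A", "metric": "voltage"}, 100.0)
service.realtime.append({"site": "A", "metric": "voltage"}, 1.5)
service.realtime.append({"site": "B", "metric": "voltage"}, 2.0)
service.realtime.append({"site": "A", "metric": "voltage"}, 2.5)
total, hit = service.query_sum({"site": "A", "metric": "voltage"})
```

In this sketch a query whose filter matches a stored pre-aggregation task is answered from the hash table plus a bitmap scan of recent rows, which is why selecting more pre-aggregation tasks raises the hit rate of interactive queries.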

Get Citation

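SHENG Jia, FANG Jun, GUO Xiaoqian, WANG Chengdong. Implementation of multidimensional aggregate query service for time series data[J]. Journal of Chongqing University, 2020, 43(7): 121-128.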

History
  • Received: January 20, 2020
  • Online: July 18, 2020