nav emailalert searchbtn searchbox tablepage yinyongbenwen piczone journalimg journalInfo searchdiv qikanlogo popupnotification paper paperNew
2023, 02, v.53 271-280
基于多线程通信机制的云数据库查询优化方法
基金项目(Foundation): 国家自然科学基金(61772091,61802035,61962006); 四川省科技计划资助(2021JDJQ0021,2022YFG0186,2021YZD0009,2021ZYD0033,2022NSFSC0511,2022YFG0325,2021YFG0029); 成都市技术创新研发项目(2021-YF05-00491-SN,2021-YF05-02414-GX,2021-YF05-02413-GX); 成都市重大科技创新项目(2021-YF08-00156-GX,2021-YF08-00159-GX); 成都市“揭榜挂帅”科技项目(2021-JB00-00025-GX); 广东省基础与应用基础研究基金(2020B1515120028)~~
邮箱(Email):
DOI:
摘要:

从本地数据库架构转变为云上数据库架构,对于大型复杂系统的高效运行和海量异构数据的存储分析至关重要。云数据库的启发式查询优化方法存在局限性,如难以获得最优执行计划使得查询性能下降、无法从历史查询中学习经验等。基于学习的优化方法虽然能够提升查询性能,但也存在缺陷,如没有对物理运算符进行编码、无法捕捉执行计划的结构特征、依赖于错误的基数估计和代价估计、现有强化学习算法的探索性较差、训练速度较慢以及收敛不稳定。针对上述问题,提出了一种新型基于异步Soft Actor-Critic的连接查询优化器(Asynchronous Soft Actor-Critic for Join Query, ASA-Join)。ASA-Join利用多线程通信机制设计了异步Soft Actor-Critic算法,能够增加策略的随机性和智能体的探索能力,尽可能避免陷入局部最优解;集成了一种新的编码方法,将物理运算符和执行计划的结构特征进行编码,能够准确表示执行计划;设计了状态表示模型,利用Bidirectional Gate Recurrent Unit (BiGRU)来捕捉执行计划的结构特征;设计了一种新的奖励机制,利用多任务学习方法将执行计划的代价和延迟均作为优化目标,使得执行计划能够反映真实的查询时间。在真实数据集JOB和TPC-H上对ASA-Join的代价和延迟进行评估。实验结果表明,ASA-Join的性能优于现有基于强化学习的优化器。

Abstract:

The transformation from local database architecture to cloud database architecture is crucial for the efficient operation of large and complex systems and the storage and analysis of massive heterogeneous data. However, the heuristic query optimization method of cloud database has limitations, such as difficulty in obtaining the optimal execution plan, which reduces query performance and leads to inability to learn experiences from historical queries. Although learning-based optimization methods can improve query performance, these methods still have some shortcomings, such as not encoding physical operators, being unable to capture the structural characteristics of the execution plan, relying on wrong cardinality estimation and cost estimation, poor exploratory properties of existing reinforcement learning algorithms, slow training speed, and unstable convergence. To overcome the above problems, a novel type of join query optimizer based on asynchronous Soft Actor-Critic is proposed, called Asynchronous Soft Actor-Critic for Join Query(ASA-Join). ASA-Join uses the multi-threaded communication mechanism to design an asynchronous Soft Actor-Critic algorithm, which can increase the randomness of the strategy and the exploration ability of the agent, and avoid falling into the local optimal solution as much as possible. ASA-Join integrates a new encoding method, which encodes the physical operator and the structural features of the execution plan, which can accurately represent the execution plan. ASA-Join designs a state representation model and uses Bidirectional Gate Recurrent Unit(BiGRU) to capture the structural features of the execution plan. ASA-Join designs a new reward mechanism, which uses the multi-task learning method to take both the cost and latency of the execution plan as the optimization goal, so the execution plan can reflect the real query time. The cost and latency of ASA-Join are evaluated on real datasets JOB and TPC-H, and the experimental results show that ASA-Join outperforms existing reinforcement learning-based optimizers.

参考文献

[1] 祁思博,黄伟,梁洪泉.面向云计算的虚拟化仿真平台研究[J].无线电工程,2019,49(10):886-892.

[2] 李国良,周煊赫.面向AI的数据管理技术综述[J].软件学报,2021,32(1):21-40.

[3] 李国良,周煊赫.轩辕:AI原生数据库系统[J].软件学报,2020,31(3):831-844.

[4] 李国良,周煊赫,孙佶,等.基于机器学习的数据库技术综述[J].计算机学报,2020,43(11):2019-2049.

[5] NEUMANN T,RADKE B.Adaptive Optimization of Very Large Join Queries[C]//Proceedings of the 2018 International Conference on Management of Data.New York:ACM,2018:677-692.

[6] 宋晓眉,叶晓俊,曾小青,等.PostgreSQL查询优化中的等价类研究与改进[J].计算机工程与应用,2014,50(14):31-38.

[7] WASS F,PELLENKOFT A.Join Order Selection[C]//Proceedings of British National Conference on Databases.Berlin:Springer,2000:256-265.

[8] STILLGER M,LOHMAN G,MARKL M,et al.LEO-DB2’s Learning Optimizer[C]//Proceedings of 27th International Conference on Very Large Data Bases.San Francisco:Morgan Kaufmann Publishers Inc.,2001:19-28.

[9] MARCUS R,PAPAEMMANOUIL O.Deep Reinforcement Learning for Join Order Enumeration[C]//Proceedings of the First International Workshop on Exploiting Artificial Intelligence Techniques for Data Management.New York:ACM,2018:1-4.

[10] KRISHNAN S,YANG Z H,GOLDBERG K,et al.Learning to Optimize Join Queries with Deep Reinforcement Learning[J/OL].(2019-01-10)[2022-08-10].https://arxiv.org/abs/1808.03196.

[11] HASSELT H,GUEZ A,SILVER D.Deep Reinforcement Learning with Double Q-Learning[J/OL].(2015-12-08)[2022-08-09].https://arxiv.org/abs/1509.06461.

[12] SUN J,LI G L,TANG N.Learned Cardinality Estimation for Similarity Queries[C]//Proceedings of the 2021 International Conference on Management of Data.New York:ACM,2021:1745-1757.

[13] MARCUS R,NEGI P,MAO H Z.Neo:A Learned Query Optimizer[J].Proceedings of VLDB Endowment,2019,12(11):1705-1718.

[14] ZHANG J,ABEDJAN Z,HOSE K.AlphaJoin:Join Order Selection à la AlphaGo[C]//Proceedings of the VLDB 2020 PhD Workshop Co-located with the 46th International Conference on Very Large Databases.Tokyo:[s.n.],2020:5-8.

[15] MACUS R,NEGI P J,MAO H Z,et al.Bao:Making Learned Query Optimization Practical[C]//Proceedings of the 2021 International Conference on Management of Data.New York:ACM,2021:1275-1288.

[16] SCHULMAN J,WOLSKI F,DHARIWAL P,et al.Proximal Policy Optimization Algorithms[J/OL].(2017-08-28)[2022-07-15].https://arxiv.org/abs/1707.06347.

[17] 李凯文,张涛,王锐,等.基于深度强化学习的组合优化研究进展[J].自动化学报,2021,47(11):2521-2537.

[18] 刘全,翟建伟,章宗长,等.深度强化学习综述[J].计算机学报,2018,41(1):1-27.

[19] AYIFU M,SLAMU W,PALIDAN M.Multilingual Named Entity Recognition Based on the BiGRU-CNN-CRF Hybrid Model[J].International Arab Journal of Information Technology,2019,15(3):223-242.

[20] ZHOU D,WANG J,JIANG B.Multi-task Multi-view Learning Based on Cooperative Multi-objective Optimization [J].IEEE Access,2018,6:19465-19477.

[21] LEIS V,GUBICHEV A,MIRCHEV A.How Good Are Query Optimizers,Really?[J].Proceedings of the VLDB Endowment,2015,9(3):204-215.

[22] LI G L,ZHOU X H,SUN J,et al.openGauss:An Autonomous Database System[J].Proceedings of the VLDB Endowment,2021,14(12):3028-3042.

基本信息:

DOI:

中图分类号:TP393.09;TP311.13

引用信息:

[1]高瑞玮,叶青,徐小玲等.基于多线程通信机制的云数据库查询优化方法[J].无线电工程,2023,53(02):271-280.

基金信息:

国家自然科学基金(61772091,61802035,61962006); 四川省科技计划资助(2021JDJQ0021,2022YFG0186,2021YZD0009,2021ZYD0033,2022NSFSC0511,2022YFG0325,2021YFG0029); 成都市技术创新研发项目(2021-YF05-00491-SN,2021-YF05-02414-GX,2021-YF05-02413-GX); 成都市重大科技创新项目(2021-YF08-00156-GX,2021-YF08-00159-GX); 成都市“揭榜挂帅”科技项目(2021-JB00-00025-GX); 广东省基础与应用基础研究基金(2020B1515120028)~~

检 索 高级检索

引用

GB/T 7714-2015 格式引文
MLA格式引文
APA格式引文