Should a large data warehouse use columnstore indexes on every table, or is a mix okay?

Updated: 2024-10-28 04:27:37
Question

We are building a Dim-Fact data warehouse. Should the database use columnstore for every table, rowstore (B-tree) for every table, or is it okay to have a mix? Does Azure SQL Server have any guidance on this? Is there any performance degradation when joining across a mix?

If the answer is a mix: in which scenarios is it okay?

For example:

(a) A large clustered columnstore index table joined to a large rowstore B-tree table, say each table is 4 million rows and 5 GB

(b) A large clustered columnstore index table joined to a small rowstore lookup table, e.g. 50 rows and 1 MB

(We are refraining from using Azure Data Warehouse: our whole database is less than 2 TB, there are other issues, etc.)

Recommended answer

The short answer is that it depends. I would not put small lookup tables into columnstore indexes: it will not help performance and could hurt it. So a mix is the correct answer.

Exactly what mix comes down to the structures in question, and that is hard to judge remotely without concrete examples. The general guideline I would follow: anything used for analytics, meaning the tables where grouping and aggregation occur, should probably be in columnstore. Anything that just stores data for integrity, for lookups, or for general reporting, and will not see analysis-style queries, should be in rowstore.

That said, don't forget that you can add nonclustered columnstore indexes to your rowstore tables and nonclustered rowstore indexes to your columnstore tables as needed. Still, I would let the analysis, or the lack of it, drive the fundamental decision here.
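As a rough illustration of that guideline, here is a minimal T-SQL sketch of a mixed layout. All table, column, and index names (`dbo.DimRegion`, `dbo.FactSales`, `dbo.OrderLine`, and so on) are hypothetical and not taken from the question, and the sketch assumes a SQL Server or Azure SQL Database version that supports inline columnstore index definitions. Treat it as a starting point to adapt, not a recommended schema.

```sql
-- Hypothetical small lookup (dimension) table: plain rowstore,
-- clustered B-tree primary key, no columnstore.
CREATE TABLE dbo.DimRegion (
    RegionKey  INT          NOT NULL,
    RegionName NVARCHAR(50) NOT NULL,
    CONSTRAINT PK_DimRegion PRIMARY KEY CLUSTERED (RegionKey)
);

-- Hypothetical large fact table used for grouping and aggregation:
-- stored as a clustered columnstore index.
CREATE TABLE dbo.FactSales (
    SalesKey  BIGINT         NOT NULL,
    RegionKey INT            NOT NULL,
    OrderDate DATE           NOT NULL,
    Amount    DECIMAL(18, 2) NOT NULL,
    INDEX CCI_FactSales CLUSTERED COLUMNSTORE
);

-- Optional: a nonclustered rowstore (B-tree) index on the columnstore
-- table, for occasional single-row lookups by key.
CREATE NONCLUSTERED INDEX IX_FactSales_SalesKey
    ON dbo.FactSales (SalesKey);

-- Hypothetical rowstore table that mostly serves lookups but
-- sometimes needs analysis-style queries.
CREATE TABLE dbo.OrderLine (
    OrderLineKey BIGINT         NOT NULL,
    ProductKey   INT            NOT NULL,
    Quantity     INT            NOT NULL,
    Amount       DECIMAL(18, 2) NOT NULL,
    CONSTRAINT PK_OrderLine PRIMARY KEY CLUSTERED (OrderLineKey)
);

-- Optional: a nonclustered columnstore index on the rowstore table,
-- covering only the columns the analytic queries touch.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_OrderLine
    ON dbo.OrderLine (ProductKey, Quantity, Amount);

-- Scenario (b) from the question: a columnstore fact table joined to
-- a small rowstore lookup table, with aggregation on the fact side.
SELECT r.RegionName,
       SUM(f.Amount) AS TotalAmount
FROM dbo.FactSales AS f
JOIN dbo.DimRegion AS r
    ON r.RegionKey = f.RegionKey
GROUP BY r.RegionName;
```

Whether the two optional indexes pay off depends on the actual workload, so it is worth comparing query plans and timings with and without them before keeping either one.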


Published: 2023-07-26 09:37:02
Article link: https://www.elefans.com/category/jswz/34/1215536.html
Tags: indexes, data warehouse
