SSIS业务密钥重复(多个LOBS)(SSIS Business Key Duplicates (more than one LOBS))

编程入门 行业动态 更新时间:2024-10-12 03:26:29
SSIS业务密钥重复(多个LOBS)(SSIS Business Key Duplicates (more than one LOBS))

提前谢谢你看这个。

我准备了一个数据仓库,并使用来自2个系统的客户信息填充客户表。 来自一个系统的客户与另一个系统中的客户具有相同的业务密钥。

什么是区分这两者的最佳方法,所以我不更新记录不想和保持良好的数据完整性。

我想知道一个系统标志,但我不确定。

欢迎所有建议/问题。

Thanks in advance for looking at this.

I have prepared a data warehouse and am populating a customer table with customer information from 2 systems. Customers from one system have the same Business Key as Customers in the other system.

What would be the best way to distinguish between the two so I don't update records don't want to and maintain good data integrity.

I wondered about a system flag but I am unsure.

All suggestions/questions welcome.

最满意答案

我同意上面的评论 - '源系统'键是来自源系统的复合键+另一个标识实际源系统的字符串或int。 这与前面答案中提到的代理键分开。 您的维度中确实有两个键。 一个是标准IDENTITY代理密钥 - 没有那里的意外。

另一个是一个复合键,由来自源系统的键和一个标识符(我实际上通常只使用一个字符串)组成,它告诉你它来自哪个系统。

所以你的维度看起来像这样:

Customer_SK SRC_Key SRC_System Customer Name 1 5 SAP Jim 2 5 MYOB Joe 来自MYOB> DW的ETL代码知道只查看MYOB数据 来自SAP> DW的ETL代码知道只查看SAP数据 您的数据仓库仅使用Customer_SK代理键

随着DW的发展和新的源系统的引入,您只需继续添加SRC_Systems即可

可以按照另一个答案中的建议将这些列放在不同的列中,但最后会得到以下结果:

Customer_SK SRC_Key_SAP SRC_Key_MYOB Customer Name 1 5 NULL Jim 2 NULL 5 Joe

这似乎有点浪费,并且每次新系统上线时都要求您添加一列。

重要的问题是:两个源系统中是否存在相同的客户? 这种设计实际上允许跨行合并。

还要确保在SRC_Key , SRC_System上放置一个唯一约束,因为这有助于提高性能,确保完整性,并自我记录密钥。

I agree with the comment above - the 'source system' key is a composite key from the source system + another string or int identifying the actual source system. This is seperate to the surrogate key mentioned in the previous answers. You really have two keys in your dimension. One is the standard IDENTITY surrogate key - no suprises there.

The other is a composite key consisting of the key from the source system and an identifier (I actually usually just use a string) that tells you which system it comes from.

So your dimension looks like this:

Customer_SK SRC_Key SRC_System Customer Name 1 5 SAP Jim 2 5 MYOB Joe Your ETL code from MYOB > DW knows to only look at MYOB data Your ETL code from SAP > DW knows to only look at SAP data Your data warehouse only uses the Customer_SK surrogate key

As your DW develops and new source systems are introduced you just keep adding SRC_Systems

You can put these in different columns as suggested in another answer but then you end up with this:

Customer_SK SRC_Key_SAP SRC_Key_MYOB Customer Name 1 5 NULL Jim 2 NULL 5 Joe

Which seems a bit wasteful and requires you to add a column everytime a new system comes online.

The important question is: does the same customer exist in both source systems? This design actually allows for merging across rows if they do.

Also make absolutely sure you put a unique constraint on SRC_Key, SRC_System as this aids in performance, ensures integrity, and self documents the key.

更多推荐

本文发布于:2023-08-07 14:24:00,感谢您对本站的认可!
本文链接:https://www.elefans.com/category/jswz/34/1464737.html
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
本文标签:多个   密钥   业务   SSIS   Duplicates

发布评论

评论列表 (有 0 条评论)
草根站长

>www.elefans.com

编程频道|电子爱好者 - 技术资讯及电子产品介绍!