将数据加载到维度日期表中(loading data into dimension date table)

编程入门 行业动态 更新时间:2024-10-27 04:30:25
将数据加载到维度日期表中(loading data into dimension date table)

好吧,我有一个维度日期表,它有两个属性Month和Year,就像它写在脚本中一样,月(mois)是我的主键。 和源文件节目中的同一个月有不同的一年。 所以我该怎么做? 把它们全部带走还是应该使用不同的?

我的源文件是excel plus。 我有来自表销售的数据(ord-date),我需要从数据库pubs获取。

PS:我正在使用ssis(visual studio)

这是我的来源日期

这就是我创建维度日期表的原因,因为我们的教授告诉我们要创建它(不需要详细信息)

我知道如何从excel导入数据,我分割日期。 我只有当我加载到维度表时才发现问题。 我的问题是我有例如2个日期(仅限月份和年份)05/1995和05/1995并且我想将它们加载到我的维度表中但我不能将我的月份作为我的主键。 主键应该是唯一的

Well I have a dimension date table which has two attributes Month and Year like it's written in the script, and the month (mois) is my primary key. And I have a different year with the same month like in the source file show. So what should I do? take them all or should I use distinct?

My source file is excel plus. I have data (ord-date) from table sales that I need to get from database pubs.

PS: I'm using ssis (visual studio)

This is my source date

This is how I create my dimension date table cause our professor told us to create it like (no need to the details)

I know how to import data from excel and I split the date. I only have from is when I load to the dimension table I found a problem. My problem is I have for example 2 dates (month and year only) 05/1995 and 05/1995 and I want to load them into my dimension table but I can't have my month as my primary key . And primary key should be unique

最满意答案

您必须添加包含Excel Source - > Script Component - > 2x OLEDB Destination的DataFlow Task 在脚本组件中,您必须添加具有2列Month和Year的New Output Dim Date ,并将Synchronous Input属性设置为none 。

在此处输入图像描述

第一个输出将包含包含其他数据的列,并且它与输入缓冲区同步。 在脚本中,您必须创建一个存储不同月份的列表,每次找到新月份时,它都会生成一个Dim Date输出。

你的脚本应该是这样的。

Dim lstDates As New System.Collections.Generic.List(Of String) Public Overrides Sub Input0Buffer_ProcessInputRow(ByVal Row As Input0Buffer) If Not Row.Date_IsNull Then Dim strMonth As String = Row.Date.ToString("MM/yyyy") If Not lstDates.Contains(strMonth) Then With DimDateBuffer .AddRow() .Month = strMonth .Year = Right(strMonth, 4) End With lstDates.Add(strMonth) End If End If End Sub

这样你就不会在Dimension表中出现重复,如果这个包应定期运行,你可以添加一个Execute SQL Task ,从数据仓库获取所有维度,并使用Object变量将它们加载到脚本组件中创建的列表中

最后将DimDate输出列映射到第一个OLEDB destination ,将第一个输出列映射到另一个目标 You have to add a DataFlow Task that contains an Excel Source -> Script Component - > 2x OLEDB Destination In the script component you have to add a New Output Dim Date with 2 Columns Month and Year , and set the Synchronous Input property to none.

enter image description here

The First Output will contains the columns containing the other data and it is synchronous to the Input buffer. In the script you have to create a list that store distinct months and every time a new month is found it will generate a Dim Date output.

your script should look like.

Dim lstDates As New System.Collections.Generic.List(Of String) Public Overrides Sub Input0Buffer_ProcessInputRow(ByVal Row As Input0Buffer) If Not Row.Date_IsNull Then Dim strMonth As String = Row.Date.ToString("MM/yyyy") If Not lstDates.Contains(strMonth) Then With DimDateBuffer .AddRow() .Month = strMonth .Year = Right(strMonth, 4) End With lstDates.Add(strMonth) End If End If End Sub

That way you will have no duplicate in Dimension table, also if this package should run periodically you can add an Execute SQL Task that get all dimensions from data warehouse and load them into the list created in the script component using an Object variable

Finally map the DimDate output columns to the First OLEDB destination and the First Output columns to the other destination

更多推荐

本文发布于:2023-08-03 10:49:00,感谢您对本站的认可!
本文链接:https://www.elefans.com/category/jswz/34/1388233.html
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
本文标签:维度   加载   日期   数据   loading

发布评论

评论列表 (有 0 条评论)
草根站长

>www.elefans.com

编程频道|电子爱好者 - 技术资讯及电子产品介绍!