好吧,我有一个维度日期表,它有两个属性Month和Year,就像它写在脚本中一样,月(mois)是我的主键。 和源文件节目中的同一个月有不同的一年。 所以我该怎么做? 把它们全部带走还是应该使用不同的?
我的源文件是excel plus。 我有来自表销售的数据(ord-date),我需要从数据库pubs获取。
PS:我正在使用ssis(visual studio)
这是我的来源日期
这就是我创建维度日期表的原因,因为我们的教授告诉我们要创建它(不需要详细信息)
我知道如何从excel导入数据,我分割日期。 我只有当我加载到维度表时才发现问题。 我的问题是我有例如2个日期(仅限月份和年份)05/1995和05/1995并且我想将它们加载到我的维度表中但我不能将我的月份作为我的主键。 主键应该是唯一的
Well I have a dimension date table which has two attributes Month and Year like it's written in the script, and the month (mois) is my primary key. And I have a different year with the same month like in the source file show. So what should I do? take them all or should I use distinct?
My source file is excel plus. I have data (ord-date) from table sales that I need to get from database pubs.
PS: I'm using ssis (visual studio)
This is my source date
This is how I create my dimension date table cause our professor told us to create it like (no need to the details)
I know how to import data from excel and I split the date. I only have from is when I load to the dimension table I found a problem. My problem is I have for example 2 dates (month and year only) 05/1995 and 05/1995 and I want to load them into my dimension table but I can't have my month as my primary key . And primary key should be unique
最满意答案
您必须添加包含Excel Source - > Script Component - > 2x OLEDB Destination的DataFlow Task 在脚本组件中,您必须添加具有2列Month和Year的New Output Dim Date ,并将Synchronous Input属性设置为none 。第一个输出将包含包含其他数据的列,并且它与输入缓冲区同步。 在脚本中,您必须创建一个存储不同月份的列表,每次找到新月份时,它都会生成一个Dim Date输出。
你的脚本应该是这样的。
Dim lstDates As New System.Collections.Generic.List(Of String) Public Overrides Sub Input0Buffer_ProcessInputRow(ByVal Row As Input0Buffer) If Not Row.Date_IsNull Then Dim strMonth As String = Row.Date.ToString("MM/yyyy") If Not lstDates.Contains(strMonth) Then With DimDateBuffer .AddRow() .Month = strMonth .Year = Right(strMonth, 4) End With lstDates.Add(strMonth) End If End If End Sub这样你就不会在Dimension表中出现重复,如果这个包应定期运行,你可以添加一个Execute SQL Task ,从数据仓库获取所有维度,并使用Object变量将它们加载到脚本组件中创建的列表中
最后将DimDate输出列映射到第一个OLEDB destination ,将第一个输出列映射到另一个目标 You have to add a DataFlow Task that contains an Excel Source -> Script Component - > 2x OLEDB Destination In the script component you have to add a New Output Dim Date with 2 Columns Month and Year , and set the Synchronous Input property to none. The First Output will contains the columns containing the other data and it is synchronous to the Input buffer. In the script you have to create a list that store distinct months and every time a new month is found it will generate a Dim Date output.your script should look like.
Dim lstDates As New System.Collections.Generic.List(Of String) Public Overrides Sub Input0Buffer_ProcessInputRow(ByVal Row As Input0Buffer) If Not Row.Date_IsNull Then Dim strMonth As String = Row.Date.ToString("MM/yyyy") If Not lstDates.Contains(strMonth) Then With DimDateBuffer .AddRow() .Month = strMonth .Year = Right(strMonth, 4) End With lstDates.Add(strMonth) End If End If End SubThat way you will have no duplicate in Dimension table, also if this package should run periodically you can add an Execute SQL Task that get all dimensions from data warehouse and load them into the list created in the script component using an Object variable
Finally map the DimDate output columns to the First OLEDB destination and the First Output columns to the other destination更多推荐
发布评论