I have an on-premises SAP database and I want to push some of its data (a few tables) to Azure daily. The data will be uploaded using a merge strategy (update if a change is detected, delete if a record is missing, otherwise insert). I am thinking of doing this with the following setup:
- Install SQL Server Integration Services (SSIS) on premises. The client already has a license for this.
- Install the SAP connector on premises.
- Pull data from SAP into text files (on premises, into multiple CSV files), creating ETL packages for this.
- Push those files into Blob Storage (via AzCopy), so I create a local job or process that handles newly created files.
- Load data into the Data Warehouse using PolyBase (https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-get-started-load-with-polybase) with an upsert strategy.

This will certainly work, but I am wondering whether it's possible to upload that data directly with an ETL (an SSIS package), basically skipping these steps:
- Save plain files to an on-premises disk.
- Push them to Azure Blob Storage.
- Load data from the files using an external table over the uploaded files.

My concerns are: performance and costs.
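To make the merge strategy from the question concrete, here is a minimal Python sketch of the change-detection logic (update if changed, delete if missing from the source, otherwise insert). The row shape and key column name are hypothetical:

```python
def diff_rows(source, target, key="id"):
    """Classify rows for a merge load.

    `source` and `target` are lists of dicts sharing a key column
    (a hypothetical shape standing in for SAP and warehouse rows).
    Returns (inserts, updates, deletes).
    """
    src = {r[key]: r for r in source}
    tgt = {r[key]: r for r in target}
    # Insert: present in source but not yet in the target.
    inserts = [src[k] for k in src.keys() - tgt.keys()]
    # Delete: present in the target but gone from the source.
    deletes = [tgt[k] for k in tgt.keys() - src.keys()]
    # Update: present in both but the row content changed.
    updates = [src[k] for k in src.keys() & tgt.keys() if src[k] != tgt[k]]
    return inserts, updates, deletes
```

For example, if the source has rows with ids {1, 2} and the target has {2, 3} (with row 2 changed), the result is one insert (id 1), one update (id 2), and one delete (id 3).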
What do you think is the best way?
Thank you!
Accepted answer
You can use SQL Server Integration Services (SSIS) to load data into Azure SQL Data Warehouse via the Azure SQL DW Upload Task, which is part of the Azure Feature Pack for Integration Services. This task takes files from a local file share and loads them into warehouse tables using PolyBase behind the scenes. With this method you would not have to explicitly push files into Blob Storage or use AzCopy. The design looks like:
SAP > flat files > SQL DW Upload Task
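After the PolyBase load lands data in a staging table, the upsert itself still has to be applied to the target table. Azure SQL Data Warehouse has historically lacked `MERGE` support, so the usual pattern is three separate statements. A hedged sketch that renders them (all table and column names below are placeholders, not anything from the question):

```python
def build_upsert_sql(target, staging, key_cols, cols):
    """Render the three-step upsert (update, delete, insert) applied
    after a PolyBase load into a staging table. Separate statements
    are used because Azure SQL DW historically did not support MERGE.
    """
    on = " AND ".join(f"t.{c} = s.{c}" for c in key_cols)
    sets = ", ".join(f"{c} = s.{c}" for c in cols)
    all_cols = ", ".join(key_cols + cols)
    # Update rows that exist in both staging and target.
    update = (f"UPDATE t SET {sets} FROM {target} t "
              f"JOIN {staging} s ON {on};")
    # Delete target rows no longer present in the staged extract.
    delete = (f"DELETE t FROM {target} t WHERE NOT EXISTS "
              f"(SELECT 1 FROM {staging} s WHERE {on});")
    # Insert staged rows not yet in the target.
    insert = (f"INSERT INTO {target} ({all_cols}) "
              f"SELECT {all_cols} FROM {staging} s WHERE NOT EXISTS "
              f"(SELECT 1 FROM {target} t WHERE {on});")
    return [update, delete, insert]
```

Note the deliberate ordering: delete before insert avoids removing rows that were just inserted in the same run.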
You can also load data directly into a warehouse table using a Data Flow task, but historically the performance has been very slow. The advantage of this design, I suppose, is that you don't have to extract the data to flat files. I haven't tried this recently, as the recommended method is PolyBase. The design looks like:
SAP > Azure SQL Data Warehouse (using Data Flow task)
An alternative to consider is Azure Data Factory (ADF), which now has SAP connectivity (for certain products). You would need a gateway to facilitate the copy from on premises to Azure. ADF is potentially more scalable, but there is a bit of a learning curve if you have not used it before.
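For orientation, an ADF copy activity for this scenario might look roughly like the JSON fragment below. This is an illustrative config sketch only; the dataset names are placeholders, and the exact source type depends on which SAP connector applies:

```json
{
  "name": "CopySapTableToSqlDw",
  "type": "Copy",
  "inputs": [ { "referenceName": "SapSourceDataset", "type": "DatasetReference" } ],
  "outputs": [ { "referenceName": "SqlDwStagingDataset", "type": "DatasetReference" } ],
  "typeProperties": {
    "source": { "type": "RelationalSource" },
    "sink": { "type": "SqlDWSink", "allowPolyBase": true }
  }
}
```

With `allowPolyBase` enabled on the sink, ADF stages the data and uses PolyBase for the warehouse load, similar to the SQL DW Upload Task approach above.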