I'm trying to insert records using a high-performance table-valued parameter method ( www.altdevblogaday/2012/05/16/sql-server-high-performance-inserts/ ), and I'm curious whether it's possible to get back the identity value for each record I insert.
At the moment, the answer appears to be no: I insert the data, then retrieve the identity values, and they don't match. Specifically, they fail to match about 75% of the time, and they fail in unpredictable ways. Here's some code that replicates the issue:
    using System.Data;
    using System.Data.SqlClient;

    // Create a datatable with 100k rows
    DataTable dt = new DataTable();
    dt.Columns.Add(new DataColumn("item_id", typeof(int)));
    dt.Columns.Add(new DataColumn("comment", typeof(string)));
    for (int i = 0; i < 100000; i++)
    {
        dt.Rows.Add(new object[] { 0, i.ToString() });
    }

    // Insert these records and retrieve back the identity
    using (SqlConnection conn = new SqlConnection("Data Source=localhost;Initial Catalog=testdb;Integrated Security=True"))
    {
        conn.Open();
        using (SqlCommand cmd = new SqlCommand("proc_bulk_insert_test", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;

            // Adding a "structured" parameter allows you to insert tons of data with low overhead
            SqlParameter param = new SqlParameter("@mytable", SqlDbType.Structured);
            param.Value = dt;
            cmd.Parameters.Add(param);
            SqlDataReader dr = cmd.ExecuteReader();

            // Set all the records' identity values
            int i = 0;
            while (dr.Read())
            {
                dt.Rows[i].ItemArray = new object[] { dr.GetInt32(0), dt.Rows[i].ItemArray[1] };
                i++;
            }
            dr.Close();
        }

        // Do all the records' ID numbers match what I received back from the database?
        using (SqlCommand cmd = new SqlCommand("SELECT * FROM bulk_insert_test WHERE item_id >= @base_identity ORDER BY item_id ASC", conn))
        {
            cmd.Parameters.AddWithValue("@base_identity", (int)dt.Rows[0].ItemArray[0]);
            SqlDataReader dr = cmd.ExecuteReader();
            DataTable dtresult = new DataTable();
            dtresult.Load(dr);
        }
    }

The database is defined using this SQL Server script:
    CREATE TABLE bulk_insert_test (
        item_id int IDENTITY (1, 1) NOT NULL PRIMARY KEY,
        comment varchar(20)
    )
    GO

    CREATE TYPE bulk_insert_table_type AS TABLE (
        item_id int,
        comment varchar(20)
    )
    GO

    CREATE PROCEDURE proc_bulk_insert_test
        @mytable bulk_insert_table_type READONLY
    AS
        DECLARE @TableOfIdentities TABLE (IdentValue INT)

        INSERT INTO bulk_insert_test (comment)
        OUTPUT Inserted.item_id INTO @TableOfIdentities(IdentValue)
        SELECT comment FROM @mytable

        SELECT * FROM @TableOfIdentities

Here's the problem: the values returned from proc_bulk_insert_test are not in the same order as the original records were inserted. Therefore, I can't programmatically assign each record the item_id value I received back from the OUTPUT statement.
It seems like the only valid solution is to SELECT back the entire list of records I just inserted, but frankly I'd prefer any solution that would reduce the amount of data piped across my SQL Server's network card. Does anyone have better solutions for large inserts while still retrieving identity values?
EDIT: Let me try clarifying the question a bit more. The problem is that I would like my C# program to learn what identity values SQL Server assigned to the data that I just inserted. The order isn't essential; but I would like to be able to take an arbitrary set of records within C#, insert them using the fast table parameter method, and then assign their auto-generated ID numbers in C# without having to requery the entire table back into memory.
Given that this is an artificial test set, I tried to condense it into as small a piece of readable code as possible. Let me describe what methods I have used to resolve this issue:
Basically, this problem would be solved if SQL Server always inserted the records in exactly the order I provided them. Is it possible to make SQL Server insert records in exactly the order they are provided in a table-valued parameter insert?
EDIT2: This approach seems very similar to what Cade Roux cites below:
www.sqlteam/article/using-the-output-clause-to-capture-identity-values-on-multi-row-inserts
However, in the article, the author uses a magic unique value, "ProductNumber", to connect the inserted information from the "output" value to the original table value parameter. I'm trying to figure out how to do this if my table doesn't have a magic unique value.
Solution

Your TVP is an unordered set, just like a regular table. It only has an order when you specify one. Not only do you have no way to indicate the actual order here, you're also just doing a SELECT * at the end with no ORDER BY. What order do you expect? You've effectively told SQL Server that you don't care. That said, I implemented your code and had no problems getting the rows back in the right order. I modified the procedure slightly so that you can actually tell which identity value belongs to which comment:
    DECLARE @TableOfIdentities TABLE (IdentValue INT, comment varchar(20))

    INSERT INTO bulk_insert_test (comment)
    OUTPUT Inserted.item_id, Inserted.comment
        INTO @TableOfIdentities(IdentValue, comment)
    SELECT comment FROM @mytable

    SELECT * FROM @TableOfIdentities

Then I called it using this code (we don't need all the C# for this):
    DECLARE @t bulk_insert_table_type;
    INSERT @t VALUES (5,'foo'), (2,'bar'), (3,'zzz');
    SELECT * FROM @t;
    EXEC dbo.proc_bulk_insert_test @t;

Results:
    1    foo
    2    bar
    3    zzz

If you want to make sure the output is in the order of identity assignment (which isn't necessarily the same "order" that your unordered TVP has), you can add ORDER BY item_id to the last select in your procedure.
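A minimal sketch of that change, assuming the table, type, and procedure from the question already exist. The identity values land in the IdentValue column of the table variable, so that is the column the final SELECT has to sort on:

```sql
-- Sketch only: the modified procedure with an explicit ORDER BY
-- on the result set that is sent back to the client.
ALTER PROCEDURE dbo.proc_bulk_insert_test
    @mytable bulk_insert_table_type READONLY
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @TableOfIdentities TABLE (IdentValue INT, comment varchar(20));

    -- Capture each generated identity together with its comment.
    INSERT INTO bulk_insert_test (comment)
    OUTPUT Inserted.item_id, Inserted.comment
        INTO @TableOfIdentities (IdentValue, comment)
    SELECT comment
    FROM @mytable;

    -- Without ORDER BY the order of this result set is undefined.
    SELECT IdentValue, comment
    FROM @TableOfIdentities
    ORDER BY IdentValue;
END
```

This only fixes the order in which rows come back; it says nothing about which TVP row received which identity value.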
If you want to insert into the destination table so that your identity values are in an order that is important to you, then you have a couple of options:
1. Add a column to your TVP and insert the order into that column, then use a cursor to iterate over the rows in that order, and insert one at a time. Still more efficient than calling the entire procedure for each row, IMHO.
2. Add a column to your TVP that indicates order, and use an ORDER BY on the insert. This isn't guaranteed, but is relatively reliable, particularly if you eliminate parallelism issues using MAXDOP 1.
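The second option might look roughly like the sketch below. The type name bulk_insert_ordered_type, the procedure name proc_bulk_insert_ordered, and the row_order column are made up for illustration; bulk_insert_test is the table from the question. As noted above, treat the resulting order as reliable in practice rather than something to depend on blindly.

```sql
-- Hypothetical table type: same as the original, plus a client-assigned order column.
CREATE TYPE bulk_insert_ordered_type AS TABLE
(
    row_order int NOT NULL PRIMARY KEY,  -- position assigned by the caller
    comment   varchar(20)
);
GO

-- Hypothetical procedure: inserts in row_order order on a single thread.
CREATE PROCEDURE dbo.proc_bulk_insert_ordered
    @mytable bulk_insert_ordered_type READONLY
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @TableOfIdentities TABLE (IdentValue INT, comment varchar(20));

    INSERT INTO bulk_insert_test (comment)
    OUTPUT Inserted.item_id, Inserted.comment
        INTO @TableOfIdentities (IdentValue, comment)
    SELECT comment
    FROM @mytable
    ORDER BY row_order      -- ask for identity assignment in caller order
    OPTION (MAXDOP 1);      -- rule out parallel plans

    -- Return the identities in assignment order.
    SELECT IdentValue, comment
    FROM @TableOfIdentities
    ORDER BY IdentValue;
END
GO

-- Usage: the caller numbers its rows before sending them.
DECLARE @t bulk_insert_ordered_type;
INSERT @t (row_order, comment) VALUES (1, 'foo'), (2, 'bar'), (3, 'zzz');
EXEC dbo.proc_bulk_insert_ordered @t;
```

On the C# side this would mean adding a row_order column to the DataTable and filling it with the loop counter, then pairing the returned rows with the DataTable rows by position.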
In any case, you seem to be placing a lot of relevance on ORDER. What does your order actually mean? If you want to place some meaning on order, you shouldn't be doing so using an IDENTITY column.