I have ~2 million rows of data, each row with an artificial PK and two ID fields (so: PK, ID1, ID2). I have a unique constraint (and index) on ID1+ID2.
I get two sorts of updates, both with a distinct ID1 per update.
- 100-1000 rows of all-new data (ID1 is new)
- 100-1000 rows of largely, but not necessarily completely, overlapping data (ID1 already exists, maybe new ID1+ID2 pairs)

What's the most efficient way to maintain this 'set'? Here are the options as I see them:
1. Delete all the rows with ID1, insert all the new rows (yikes)
2. Query all the existing rows from the set of new data ID1+ID2, only insert the new rows
3. Insert all the new rows, ignore inserts that trigger unique constraint violations

Any thoughts?
Best answer
Not all of your listed solutions are functionally equivalent, so without more knowledge about what you want or need to accomplish, it's hard to say which is most appropriate.
1. You may lose data that you want or need to keep.
2. Based on the table schema that you mentioned, this should be reasonable.
3. This will only work if you perform each INSERT separately.

I'd suggest [2] based on the available info.
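For what it's worth, here is a minimal sketch of what [2] could look like. The question doesn't say which database is in use, so this assumes SQLite through Python's `sqlite3` module purely as a stand-in; the table name `pairs`, the column names, and the helpers `create_schema` and `insert_new_pairs` are all made up for illustration. Since each update carries a single ID1, it fetches the existing ID2 values for that ID1 once and inserts only the pairs that are missing.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical table mirroring the question: artificial PK plus ID1, ID2,
    # with a unique constraint (and therefore an index) on the pair.
    conn.execute(
        """
        CREATE TABLE pairs (
            pk  INTEGER PRIMARY KEY,
            id1 INTEGER NOT NULL,
            id2 INTEGER NOT NULL,
            UNIQUE (id1, id2)
        )
        """
    )


def insert_new_pairs(conn, id1, id2_values):
    """Insert only the (id1, id2) pairs not already stored; return how many were added."""
    # One indexed lookup for everything already stored under this ID1.
    existing = {row[0] for row in conn.execute(
        "SELECT id2 FROM pairs WHERE id1 = ?", (id1,)
    )}
    # Set difference also drops duplicates inside the incoming batch.
    new_rows = [(id1, id2) for id2 in sorted(set(id2_values) - existing)]
    with conn:  # commit the whole batch as one transaction
        conn.executemany(
            "INSERT INTO pairs (id1, id2) VALUES (?, ?)", new_rows
        )
    return len(new_rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    print(insert_new_pairs(conn, 1, [10, 20, 30]))  # all-new ID1
    print(insert_new_pairs(conn, 1, [20, 30, 40]))  # mostly overlapping ID1
```

This never deletes anything, so existing rows (and their PKs) are left alone, and the batch still goes in as a single multi-row insert. If another writer can insert the same pairs between the SELECT and the INSERT, the unique constraint will still reject the batch, so that case would need a retry or locking appropriate to your database.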