我有一个大约100M行的表,我将要复制到alter,添加一个索引。我不是那么关心创建新表的时间,但如果我在插入任何数据或插入数据之前改变表,然后添加索引,那么创建的索引会更高效吗?
I have a table of about 100M rows that I am going to copy to alter, adding an index. I'm not so concerned with the time it takes to create the new table, but will the created index be more efficient if I alter the table before inserting any data or insert the data first and then add the index?
推荐答案在数据插入后创建索引是更有效的方法(甚至经常建议在批量导入之前删除索引,导入后重新创建它)。
Creating index after data insert is more efficient way (it even often recomended to drop index before batch import and after import recreate it).
合成示例(PostgreSQL 9.1,缓慢开发机器,一百万行):
Syntetic example (PostgreSQL 9.1, slow development machine, one million rows):
CREATE TABLE test1(id serial, x integer); INSERT INTO test1(id, x) SELECT x.id, x.id*100 FROM generate_series(1,1000000) AS x(id); -- Time: 7816.561 ms CREATE INDEX test1_x ON test1 (x); -- Time: 4183.614 ms插入,然后创建索引 - 约12秒
Insert and then create index - about 12 sec
CREATE TABLE test2(id serial, x integer); CREATE INDEX test2_x ON test2 (x); -- Time: 2.315 ms INSERT INTO test2(id, x) SELECT x.id, x.id*100 FROM generate_series(1,1000000) AS x(id); -- Time: 25399.460 ms创建索引,然后插入 - 约25.5秒比两倍慢)
Create index and then insert - about 25.5 sec (more than two times slower)
更多推荐
是在数据填充表之前创建索引还是在数据到位后创建索引?
发布评论