我是一个使用python和sqlite的新手。 我正在尝试创建一个从表(rawdata)读取数据的脚本,然后执行一些计算,然后将其存储在新表中。 我正在计算一名球员在该日期之前在特定赛道位置赢得的数字竞赛并计算百分比。 总共有15个轨道位置。 整体而言,脚本非常慢。 任何提高其速度的建议。 我已经使用了PRAGMA参数。
下面是脚本。
for item in result: l1 = str(item[0]) l2 = item[1] l3 = int(item[2]) winpost = [] key = l1.split("|") dt = l2 ###Denominator-------------- cursor.execute( "SELECT rowid FROM rawdata WHERE Track = ? AND Date< ? AND Distance = ? AND Surface =? AND OfficialFinish=1", (key[2], dt, str(key[4]), str(key[5]),)) result_den1 = cursor.fetchall() cursor.execute( "SELECT rowid FROM rawdata WHERE Track = ? AND RaceSN<= ? AND Date= ? AND Distance = ? AND Surface =? AND OfficialFinish=1", (key[2], int(key[3]), dt, str(key[4]), str(key[5]),)) result_den2 = cursor.fetchall() totalmat = len(result_den1) + len(result_den2) if totalmat > 0: for i in range(1, 16): cursor.execute( "SELECT rowid FROM rawdata WHERE Track = ? AND Date< ? AND PolPosition = ? AND Distance = ? AND Surface =? AND OfficialFinish=1", (key[2], dt, i, str(key[4]), str(key[5]),)) result_num1 = cursor.fetchall() cursor.execute( "SELECT rowid FROM rawdata WHERE Track = ? AND RaceSN<= ? AND Date= ? AND PolPosition = ? AND Distance = ? AND Surface =? AND OfficialFinish=1", (key[2], int(key[3]), dt, i, str(key[4]), str(key[5]),)) result_num2 = cursor.fetchall() winpost.append(len(result_num1) + len(result_num2)) winpost = [float(x) / totalmat for x in winpost] rank = rankmin(winpost) franks = list(rank) franks.insert(0, int(key[3])) franks.insert(0, dt) franks.insert(0, l1) table1.append(franks) franks = [] cursor.executemany("INSERT INTO posttable VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)", table1)I am a very newbie in using python and sqlite. I am trying to create a script that reads a data from a table (rawdata) and then performs some calculations which is then stored in a new table. I am counting the number race that a player has won before that date at a particular track position and calculating the percentage. There are 15 track positions in total. Overall the script is very slow. Any suggestions to improve its speed. I have already used the PRAGMA parameters.
Below is the script.
for item in result: l1 = str(item[0]) l2 = item[1] l3 = int(item[2]) winpost = [] key = l1.split("|") dt = l2 ###Denominator-------------- cursor.execute( "SELECT rowid FROM rawdata WHERE Track = ? AND Date< ? AND Distance = ? AND Surface =? AND OfficialFinish=1", (key[2], dt, str(key[4]), str(key[5]),)) result_den1 = cursor.fetchall() cursor.execute( "SELECT rowid FROM rawdata WHERE Track = ? AND RaceSN<= ? AND Date= ? AND Distance = ? AND Surface =? AND OfficialFinish=1", (key[2], int(key[3]), dt, str(key[4]), str(key[5]),)) result_den2 = cursor.fetchall() totalmat = len(result_den1) + len(result_den2) if totalmat > 0: for i in range(1, 16): cursor.execute( "SELECT rowid FROM rawdata WHERE Track = ? AND Date< ? AND PolPosition = ? AND Distance = ? AND Surface =? AND OfficialFinish=1", (key[2], dt, i, str(key[4]), str(key[5]),)) result_num1 = cursor.fetchall() cursor.execute( "SELECT rowid FROM rawdata WHERE Track = ? AND RaceSN<= ? AND Date= ? AND PolPosition = ? AND Distance = ? AND Surface =? AND OfficialFinish=1", (key[2], int(key[3]), dt, i, str(key[4]), str(key[5]),)) result_num2 = cursor.fetchall() winpost.append(len(result_num1) + len(result_num2)) winpost = [float(x) / totalmat for x in winpost] rank = rankmin(winpost) franks = list(rank) franks.insert(0, int(key[3])) franks.insert(0, dt) franks.insert(0, l1) table1.append(franks) franks = [] cursor.executemany("INSERT INTO posttable VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)", table1)最满意答案
就时间而言,发送和检索SQL查询是“昂贵的”。 加快速度的最简单方法是使用SQL函数来减少查询次数。
例如,可以使用COUNT(),UNION和Aliases将前两个查询简化为单个调用。
SELECT COUNT(*) FROM ( SELECT rowid FROM rawdata where ... UNION SELECT rowid FROM rawdata where ... ) totalmatch在这种情况下,我们将两个原始查询(使用您的条件代替“...”)将它们与UNION语句组合,为该联合提供别名“totalmatch”,并计算其中的所有行。
第二组查询也可以完成同样的事情。 而不是通过2次查询循环16次(导致对SQL引擎的32次调用),您也可以使用GROUP BY将其替换为一个查询。
SELECT PolPosition, COUNT(PolPosition) FROM ( SELECT PolPosition FROM rawdata WHERE ... UNION SELECt PolPosition FROM rawdata WHERE ... ) totalmatch GROUP BY PolPosition在这种情况下,我们采用与以前完全相同的查询,并通过PolPosition对其进行分组,使用COUNT显示每组中的行数。
W3Schools是这些功能如何运作的绝佳资源: http : //www.w3schools.com/sql/default.asp
Sending and retrieving an SQL query is "expensive" in terms of time. The easiest way to speed things up would be to use SQL functions to reduce the number of queries.
For example, the first two queries could be reduced to a single call using COUNT(), UNION, and Aliases.
SELECT COUNT(*) FROM ( SELECT rowid FROM rawdata where ... UNION SELECT rowid FROM rawdata where ... ) totalmatchIn this case we take the two original queries (with your conditions in place of the "...") combine them with a UNION statement, give that union the alias "totalmatch", and count all the rows in it.
Same thing can be done with the second set of queries. Instead of cycling 16 times over 2 queries (resulting in 32 calls to the SQL engine) you can replace it with one query by also using GROUP BY.
SELECT PolPosition, COUNT(PolPosition) FROM ( SELECT PolPosition FROM rawdata WHERE ... UNION SELECt PolPosition FROM rawdata WHERE ... ) totalmatch GROUP BY PolPositionIn this case we take the exact same query as before and group it by PolPosition, using COUNT to display how many rows are in each group.
W3Schools is a great resource for how these functions work: http://www.w3schools.com/sql/default.asp
更多推荐
发布评论