Django程序无法获取数据库更新过的记录

之前尝试在django应用中写一个定期从数据库取数据处理的小程序. 这个小程序的代码大体像是这样子:

worker.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
import os, sys
import time

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "settings")

from apps.my_models import Toy

if __name__ == "__main__":
    while True:
        broken_toys = Toy.objects.filter(is_broken=1)

        ## do someting good to these toys
        print broken_toys
        .....

        ## for some rest ;)
        time.sleep(5)

运行后发现: 当数据库toys表记录发生改变时, 程序每次运行时获取的broken_toys都是一样的. Strange~

排查

一开始我以为是QuerySet缓存的影响, 但是后来用pdb跟了一下, 发现没有关系. 在stackoverflow也找了下, 发现一个类似的问题. upvote最高的一个回答提到了这个问题与mysql的表相关. 当时没有仔细看, 后来从google论坛的这个帖子才重视起来.

简而言之, 在innodb默认的情况下, 每个transaction内的读取都是第一次读取(snapshot)时的内容. 上面的样例程序我用的都是同一个连接, 而且都是读取操作, 所以出现这个问题.

从MySQL官网上摘一段比较重要的内容, 尤其注意中间的那段, 加深印象:

consistent read A read operation that uses snapshot information to present query results based on a point in time, regardless of changes performed by other transactions running at the same time. If queried data has been changed by another transaction, the original data is reconstructed based on the contents of the undo log. This technique avoids some of the locking issues that can reduce concurrency by forcing transactions to wait for other transactions to finish.

With the repeatable read isolation level, the snapshot is based on the time when the first read operation is performed. With the read committed isolation level, the snapshot is reset to the time of each consistent read operation.

Consistent read is the default mode in which InnoDB processes SELECT statements in READ COMMITTED and REPEATABLE READ isolation levels. Because a consistent read does not set any locks on the tables it accesses, other sessions are free to modify those tables while a consistent read is being performed on the table.

更加详细的可以参考官网的说明:

  1. set-transaction
  2. innodb-consistent-read

解决方法:

知道了问题所在, 解决方式也比较多:

  1. 调整模式: set transaction isolation level read committed.
  2. 使用update: broken_toys.update(), 会强制刷新一次transaction
  3. 使用使用新的连接: 参考

个人比较倾向于第三种.

PPPPSSSS

其实排查这个问题还捎带学习了descriptor, 就是Toy.objects.filter这样的使用. 有时间再总结吧~

Comments