Opened 13 years ago
Closed 12 years ago
#19895 closed Bug (fixed)
Second iteration over an invalid queryset returns an empty list instead of an exception
| Reported by: | Grzegorz Nosek | Owned by: | Grzegorz Nosek | 
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 1.4 | 
| Severity: | Normal | Keywords: | |
| Cc: | robert.coup@… | Triage Stage: | Accepted | 
| Has patch: | yes | Needs documentation: | no | 
| Needs tests: | no | Patch needs improvement: | yes | 
| Easy pickings: | no | UI/UX: | no | 
Description
As a part of #17664 it was discovered that an invalid queryset only raises exceptions during the first iteration. When iterating over the queryset again, an empty list is returned, i.e. the following test case would fail:
def test_invalid_qs_list(self): qs = Article.objects.order_by('invalid_column') self.assertRaises(FieldError, list, qs) self.assertRaises(FieldError, list, qs)
Attachments (3)
Change History (19)
comment:1 by , 13 years ago
| Status: | new → assigned | 
|---|
by , 13 years ago
| Attachment: | 19895_1.diff added | 
|---|
by , 13 years ago
| Attachment: | 19895_2.diff added | 
|---|
comment:2 by , 13 years ago
comment:3 by , 13 years ago
| Has patch: | set | 
|---|---|
| Patch needs improvement: | set | 
comment:4 by , 13 years ago
| Patch needs improvement: | unset | 
|---|
As suggested by jacobkm on IRC, here's the updated patch:
comment:5 by , 13 years ago
| Resolution: | → fixed | 
|---|---|
| Status: | assigned → closed | 
comment:6 by , 13 years ago
| Patch needs improvement: | set | 
|---|---|
| Resolution: | fixed | 
| Severity: | Normal → Release blocker | 
| Status: | closed → new | 
| Triage Stage: | Unreviewed → Accepted | 
| Type: | Uncategorized → Bug | 
That commit is causing a serious ORM memory leak in one of my applications. It may be that my code is not the cleanest, but anyway, I consider this as a serious regression.
comment:7 by , 13 years ago
Attached is a minimalistic test case that will show the memory leak. The case is simple - have enough objects that one ITERATOR_CHUNK_SIZE will not convert all the objects (that is, more than 100 objects in the queryset). Do bool(qs). This will result in memory leak when this ticket's patch is applied, but will not leak if this ticket's patch isn't applied.
The reason for the leak is a bug in Python itself. The gc.garbage docs say that:
"""
A list of objects which the collector found to be unreachable but could not be freed (uncollectable objects). By default, this list contains only objects with __del__() methods. Objects that have __del__() methods and are part of a reference cycle cause the entire reference cycle to be uncollectable, including objects not necessarily in the cycle but reachable only from it. ...
"""
However, no __del__ method is defined anywhere, so there should not be any uncollectable objects. Also, pypy collects the garbage, so this is another thing pointing to a bug in Python.
I have tested this with Python 2.7.3 and Python 3.2.3, and both of those will leak. Pypy 1.8.0 collects the garbage correctly.
Steps to reproduce: unpack the attachment, run tester.py, see if gc.garbage has reference to _safe_iterator.
Even if this is a bug in Python this has to be fixed in Django itself. The memory leak can be bad. It seems just reverting the commit is the right fix.
Interestingly enough doing this change in Query.iterator() is enough to cause leak:
try:
    iterator() code here...
except Exception:
    raise
by , 13 years ago
| Attachment: | leak.tar.gz added | 
|---|
comment:8 by , 13 years ago
Here is a minimalistic case showing the bug in Python:
class MyObj(object):
    def __iter__(self):
        self._iter = iter(self.iterator())
        return iter(self._iter)
    def iterator(self):
        try:
            while True:
                yield 1
        except Exception:
            raise
i = next(iter(MyObj()))
import gc
gc.collect()
print(gc.garbage)
comment:9 by , 13 years ago
I filed a bug to Python bug tracker. http://bugs.python.org/issue17468
Does anybody see any other solution than reverting the patch?
comment:10 by , 13 years ago
I think we should roll back the patch. Your queryset-iteration simplification patch will fix this bug anyway, correct?
comment:11 by , 13 years ago
The more complex version of the simplification patch has this same issue. It is likely possible to work around this issue in the patch.
As for 1.5 a roll back seems like the only option.
comment:12 by , 13 years ago
| Resolution: | → fixed | 
|---|---|
| Status: | new → closed | 
comment:13 by , 13 years ago
| Resolution: | fixed | 
|---|---|
| Status: | closed → new | 
comment:14 by , 13 years ago
| Cc: | added | 
|---|
comment:15 by , 12 years ago
| Severity: | Release blocker → Normal | 
|---|
This isn't a release blocker any more, the leak is fixed, the second iteration works in the same way as before.
comment:16 by , 12 years ago
| Resolution: | → fixed | 
|---|---|
| Status: | new → closed | 
Test committed in 904084611d740e26eb3cb44af9a3d2f3a6d1b665
All the solutions I can come up with are apparently ugly. I'm attaching two versions of the patch for discussion (with tests stripped).
One solution is wrapping the iterator in another method, the other is putting the required try/catch in the iterator() method itself, which pushes the indentation to six levels deep maximum.