|
View:
New views
18 Messages
—
Rating Filter:
Alert me
|
|
|
django.contrib.sessions problemsI'd like to get some feedback for the following major tickets regarding sessions, all of which are in scope for 1.0. 1) Session key collisions: http://code.djangoproject.com/ticket/1180 Due to the birthday paradox, sqrt(n) is roughly the number you need to have a 50% collision chance when picking items at random from an inexhaustible set of n items. (0, sys.maxint - 1) is currently the random range. On 32-bit platforms the collision bound is thus quite low as sqrt(2^31) is about 46 000 keys. Although get_or_create() is used to guard against collisions, they can still occur in threaded environments that have millions of sessions in session store (multiple threads enter get_or_create(), collision probability is very high when the number of stored sessions is, say, 10 times 46 000). This has been reported by at least one user. There's no problem on 64-bit platforms as the collision bound is sqrt(2^63) ~ 3,000,000,000. Proposal: use 63 bits of randomness regardless of architecture. (Trivial) patch attached to #1180. According to reports, this fixes the problem on 32-bit systems. See analysis at http://code.djangoproject.com/ticket/1180#comment:26 2) Reliable session clearing: http://code.djangoproject.com/ticket/7515 Clearing session data is a very common use case. Currently, the only way to do it is to manually erase session keys in a loop. The patch attached to #7515 fixes that by clearing the underlying dictionary in one step. However, it is not exception-safe, see problem statement at http://code.djangoproject.com/ticket/7515#comment:6 . The right way to solve this should be discussed further. 3) Clear session on logout: http://code.djangoproject.com/ticket/6941 Currently sessions are not related to users. As discussed before, one- way relation is useful, i.e. users are related to sessions and sessions are cleared if they logout or if they login and a different user was logged in previously. This depends on #7515. --- There are also two less acute tickets, http://code.djangoproject.com/ticket/6984 and http://code.djangoproject.com/ticket/6791 . IMHO they are in scope for 1.0 as well. After resolving these issues, the session framework should be ready for 1.0 feature freeze. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@... To unsubscribe from this group, send email to django-developers-unsubscribe@... For more options, visit this group at http://groups.google.com/group/django-developers?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: django.contrib.sessions problemsOn Mon, 2008-07-07 at 06:48 -0700, mrts wrote: > I'd like to get some feedback for the following major tickets > regarding sessions, all of which are in scope for 1.0. > > 1) Session key collisions: http://code.djangoproject.com/ticket/1180 > > Due to the birthday paradox, sqrt(n) is roughly the number you need to > have a 50% collision chance when picking items at random from an > inexhaustible set of n items. (0, sys.maxint - 1) is currently the > random range. On 32-bit platforms the collision bound is thus quite > low as sqrt(2^31) is about 46 000 keys. This isn't correct. The birthday paradox models a situation of choices *with replacement* from a set. That is, when multiple entities can have the same value. We don't have replacement. You have a set of X active sessions, all unique (since that's a database constraint) and somebody is trying to guess one of them at random. So the probability is X/N, not something approaching sqrt(N)/N. Once a session has been created and saved successfully to the database, it's known to be unique amongst all active sessions. I'm not saying there's no issue here, but accuracy is important. [...] > 3) Clear session on logout: http://code.djangoproject.com/ticket/6941 > > Currently sessions are not related to users. As discussed before, one- > way relation is useful, i.e. users are related to sessions and > sessions are cleared if they logout or if they login and a different > user was logged in previously. This depends on #7515. As discussed in the earlier thread, deleting sessions on logout is almost certainly the right solution to this. Regards, Malcolm --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@... To unsubscribe from this group, send email to django-developers-unsubscribe@... For more options, visit this group at http://groups.google.com/group/django-developers?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: django.contrib.sessions problemsOn Mon, Jul 7, 2008 at 8:48 AM, mrts <mrts@...> wrote: > I'd like to get some feedback for the following major tickets > regarding sessions, all of which are in scope for 1.0. I don't really know a nice way of saying this, so I'll trust you to understand that I don't mean to be a dick. You don't get to decide what's in scope for 1.0 and what isn't. You're completely right that these tickets fall into the "bug" category (though there's some feature-creep with #7515), which makes them likely candidates. Still, you don't get to make a unilateral decision here. I appreciate that you think these features are vital, but you must understand that *everyone* has his or her own "must have" list, and if we're to meet deadlines some people are inevitably going to be disappointed. So: please feel free to *suggest* that tickets be closed for 1.0; feel free to *argue* for them... but stop demanding. Further, if you've looked at the roadmap you'll see that we're two weeks out from 1.0 alpha, which is a major deadline for newforms-admin and a couple of other thorny issues. This means that all eyes are going to be on finishing those features, and *not* on bug fixes -- we'll be focusing on those after feature freeze. As I said, though, these are very likely candidates for 1.0. So the best thing for you to do is tag them with the 1.0 milestone and wait a couple weeks until we start focusing on bug fixes. Jacob --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@... To unsubscribe from this group, send email to django-developers-unsubscribe@... For more options, visit this group at http://groups.google.com/group/django-developers?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: django.contrib.sessions problemsOn Jul 7, 5:00 pm, "Jacob Kaplan-Moss" <ja...@...> wrote: > On Mon, Jul 7, 2008 at 8:48 AM, mrts <m...@...> wrote: > > I'd like to get some feedback for the following major tickets > > regarding sessions, all of which are in scope for 1.0. > > I don't really know a nice way of saying this, so I'll trust you to > understand that I don't mean to be a dick. > > You don't get to decide what's in scope for 1.0 and what isn't. That was not my intention (and it appears that somehow many of my posts create a bit of friction :) -- their relevance looks to be mostly accepted but wording not... oh well). What I wanted to say is that robust sessions are a vital component of a web framework and all the issues above need some discussion and decisions before 1.0. So, indeed, I wanted to argue, not demand something. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@... To unsubscribe from this group, send email to django-developers-unsubscribe@... For more options, visit this group at http://groups.google.com/group/django-developers?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: django.contrib.sessions problems> > Due to the birthday paradox, sqrt(n) is roughly the number you need to > > have a 50% collision chance when picking items at random from an > > inexhaustible set of n items. (0, sys.maxint - 1) is currently the > > random range. On 32-bit platforms the collision bound is thus quite > > low as sqrt(2^31) is about 46 000 keys. > > This isn't correct. The birthday paradox models a situation of choices > *with replacement* from a set. That is, when multiple entities can have > the same value. We don't have replacement. You have a set of X active > sessions, all unique (since that's a database constraint) and somebody > is trying to guess one of them at random. So the probability is X/N, not > something approaching sqrt(N)/N. random.randint(0, n) *is* a set with choices with replacements. Multiple picks can, should and, as demonstrated at http://code.djangoproject.com/ticket/1180#comment:26 , do create collisions at around sqrt(n) picks and less. I entirely agree that collisions in picks from the random value set shouldn't directly manifest in collisions in the session key set as 1) time.time() is used as another varying input 2) collision is explicitly checked for in a loop until a unique value is created So, for a session key collision to occur, multiple threads should simultaneously enter the function, get the same time value and a colliding random value (and negative result from the uniqueness check). Admittedly this looks very unlikely when random values collide at around 46 000 picks. However, according to the report http://code.djangoproject.com/ticket/1180#comment:31 , collisions do exist when random range is (0, 2^31) and do not exist when it is (0, 2^63). So perhaps we miss something. To properly solve the problem, the particular scenario that leads to the collision should be pinpointed. Then we can make an informed decision about the amount of randomness needed. Until that, 63 bits can be used as it is safer than 31, does not cause much performance penalty and has proven to be enough to fix the problem. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@... To unsubscribe from this group, send email to django-developers-unsubscribe@... For more options, visit this group at http://groups.google.com/group/django-developers?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: django.contrib.sessions problemsOn Mon, 2008-07-07 at 08:28 -0700, mrts wrote: > > > Due to the birthday paradox, sqrt(n) is roughly the number you need to > > > have a 50% collision chance when picking items at random from an > > > inexhaustible set of n items. (0, sys.maxint - 1) is currently the > > > random range. On 32-bit platforms the collision bound is thus quite > > > low as sqrt(2^31) is about 46 000 keys. > > > > This isn't correct. The birthday paradox models a situation of choices > > *with replacement* from a set. That is, when multiple entities can have > > the same value. We don't have replacement. You have a set of X active > > sessions, all unique (since that's a database constraint) and somebody > > is trying to guess one of them at random. So the probability is X/N, not > > something approaching sqrt(N)/N. > > random.randint(0, n) *is* a set with choices with replacements. > Multiple picks can, should and, as demonstrated at > http://code.djangoproject.com/ticket/1180#comment:26 , do create > collisions at around sqrt(n) picks and less. That comment has no bearing. (1) We pick a random session key. (2) We save it to the database, where it should be unique, otherwise an error is raised. (3) We use that session key to pass back to the user. At this point, the birthday paradox no longer applies. (3) should not happen before (2). If it does, it's a bug that should be fixed, but the database should be being used to guarantee uniqueness here. It's the same database across all threads/processes/machines. Your comment in that ticket avoids step (2), which is a necessary uniqueness constraint. Malcolm --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@... To unsubscribe from this group, send email to django-developers-unsubscribe@... For more options, visit this group at http://groups.google.com/group/django-developers?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: django.contrib.sessions problemsOn Jul 7, 6:32 pm, Malcolm Tredinnick <malc...@...> wrote: > > That comment has no bearing. > > (1) We pick a random session key. > (2) We save it to the database, where it should be unique, otherwise an > error is raised. > (3) We use that session key to pass back to the user. > > At this point, the birthday paradox no longer applies. (3) should not > happen before (2). If it does, it's a bug that should be fixed, but the > database should be being used to guarantee uniqueness here. It's the > same database across all threads/processes/machines. > > Your comment in that ticket avoids step (2), which is a necessary > uniqueness constraint. Agreed, but increasing the range from (0, 2^31) to (0, 2^63) should have no effect on collisions then. The report at #1180 seems to indicate otherwise (unless it is invalid). --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@... To unsubscribe from this group, send email to django-developers-unsubscribe@... For more options, visit this group at http://groups.google.com/group/django-developers?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: django.contrib.sessions problemsAs one of the people experiencing issues with the session collisions I will attempt to explain how it manifests and my setup. My server is a shared host, running Apache, python2.5 and Django is configured though fast-cgi (My host wont let me use mod-python). The way it seems to work is a new thread is spawned on each page request and terminated when the request is finished. The issues manifests when the sessions system generates a duplicate session id for two different users. Only 1 entry is entered into the DB, so both users are using the same session id, thus can (and do) have incorrect permissions and data. I am detecting the collisions by setting a setting a cookie value with the username they logged in with and checking it against the username in the session data. With the SVN implementation of session generation I get about 10% of my users getting an incorrect session when they login, with the latest patch I have gotten 0 collisions. The issue may actually be with the get_or_create() method of blocking collisions since if it generates a duplicate id it just grabs that data from the table and runs with it (I think), but whatever the actual cause of the issue the latest patch prevents it on my setup. *Adding this info to the ticket as well --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@... To unsubscribe from this group, send email to django-developers-unsubscribe@... For more options, visit this group at http://groups.google.com/group/django-developers?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: django.contrib.sessions problemsOn Jul 7, 11:09 am, digitalxero <digitalx...@...> wrote: > *Adding this info to the ticket as well Well Trac decided I was spam and wouldn't let me post the data --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@... To unsubscribe from this group, send email to django-developers-unsubscribe@... For more options, visit this group at http://groups.google.com/group/django-developers?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: django.contrib.sessions problemsOn Mon, Jul 7, 2008 at 12:12 PM, digitalxero <digitalxero@...> wrote: > > On Jul 7, 11:09 am, digitalxero <digitalx...@...> wrote: >> *Adding this info to the ticket as well > > Well Trac decided I was spam and wouldn't let me post the data Log in to trac. See discussion here: http://groups.google.com/group/django-developers/browse_thread/thread/560c324325723ba1/147c30570e4216d1?lnk=gst&q=spam#147c30570e4216d1 --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@... To unsubscribe from this group, send email to django-developers-unsubscribe@... For more options, visit this group at http://groups.google.com/group/django-developers?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: django.contrib.sessions problemsThus, looks there are two problems -- collisions happen too often on 32-bit machines (fixed by my patch) and db session backend doesn't handle them properly (no fix so far). As for the latter -- Malcolm, you were about to add a clear distinction between INSERT and UPDATE to .save() in http://groups.google.com/group/django-developers/browse_thread/thread/179ed55b3bf44af0 . That would help to fix def save(self): Session.objects.create( session_key = self.session_key, session_data = self.encode(self._session), expire_date = self.get_expiry_date() ) As far as I can see, create uses .save() and .save() does an UPDATE when the object exists instead of unconditionally failing as it should. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@... To unsubscribe from this group, send email to django-developers-unsubscribe@... For more options, visit this group at http://groups.google.com/group/django-developers?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: django.contrib.sessions problemsOn Mon, 2008-07-07 at 10:09 -0700, digitalxero wrote: [...] > The issue may actually be with the get_or_create() method of blocking > collisions since if it generates a duplicate id it just grabs that > data from the table and runs with it (I think), but whatever the > actual cause of the issue the latest patch prevents it on my setup. Yes. It's important to view things like this thread in the context of the earlier thread about session fixes. There are a number of inter-related things here and a fix that solves the general problem instead of playing whack-a-mole is in order. Part of that, and implicit in this whole thing, is ensuring that the new session id is saved (and *created*) immediately to avoid collisions. The database is the only way to ensure uniqueness here and there is a bug at that level. Increasing some value from 32 bits to 64 bits is only changing some probabilities; it's not actually solving the problem, just moving the goalposts to make it harder to score an own goal. The rest of the conversation should proceed on the assumption that the bug about creating unique database entries will be fixed first. Regards, Malcolm --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@... To unsubscribe from this group, send email to django-developers-unsubscribe@... For more options, visit this group at http://groups.google.com/group/django-developers?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: django.contrib.sessions problemsMalcolm Tredinnick wrote: > The rest of the conversation should > proceed on the assumption that the bug about creating unique database > entries will be fixed first. Now I think that the problem is only exists if one uses non-transactional DB setup. In this case due to race conditions one of the two simultaneous get_or_create calls will "get" instead of "create". In a transactional setup one of the transactions will fail to commit eventually since both get_or_create will try to do INSERT not seeing each other. May be the fix then will be to always use explicit INSERT instead of get_or_create and let it fail on unique constraint? P.S. Did I miss it in the thread or nobody talks about non-DB session storages? --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@... To unsubscribe from this group, send email to django-developers-unsubscribe@... For more options, visit this group at http://groups.google.com/group/django-developers?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: django.contrib.sessions problemsOn Tue, 2008-07-08 at 10:30 +0400, Ivan Sagalaev wrote: > Malcolm Tredinnick wrote: > > The rest of the conversation should > > proceed on the assumption that the bug about creating unique database > > entries will be fixed first. > > Now I think that the problem is only exists if one uses > non-transactional DB setup. In this case > due to race conditions one of the two simultaneous get_or_create calls > will "get" instead of "create". In a transactional setup one of the > transactions will fail to commit eventually since both get_or_create > will try to do INSERT not seeing each other. Not quite. The root problem is that get_or_create() is simply not designed at the moment to enforce the "create" side. It assumes that if another process creates the object during the call, then it's fine to get the result and carry on. The idea is that after you call it, the object exists in the database and you have a reference to the copy, not that *only* get or *only* create will be called. This is why two people can end up retrieving the same session object. There needs to be a path that says "this must create if it isn't already there". That's a bug that will be fixed. > May be the fix then will be > to always use explicit INSERT instead of get_or_create and let it fail > on unique constraint? That is the fix. It was discussed in the thread a couple of months ago. When I've done my high priority tasks, I'll get around to adding the ability to force insert or update at the model save level and the rest falls out naturally. It's easy and I have a patch that mostly works. But I need to review it again and it's in the queue. It's almost like we need some kind of ticket tracking system so that these things don't get lost and people won't start threads saying "look at mine! look at mine!" Oh, wait ... we do have one of those. :-( > P.S. Did I miss it in the thread or nobody talks about non-DB session > storages? It's more or less a side-issue that we'll work out as we go along. If non-db storages have a way of enforcing uniqueness, we'll use it (and that does require some effort to do -- I'm not dismissing them). However, the storage is the only reliable place to enforce uniqueness, so if the storage cannot guarantee it, Django can only work with what it's given. So, for example, file storage might have problems on some filesystems that don't have good locking semantics. Then it's a case of "go buy yourself a real filesystem". Regards, Malcolm --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@... To unsubscribe from this group, send email to django-developers-unsubscribe@... For more options, visit this group at http://groups.google.com/group/django-developers?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: django.contrib.sessions problemsOn Tue, 2008-07-08 at 19:27 +1000, Malcolm Tredinnick wrote: [...] > It's almost like we need some kind of ticket tracking system so that > these things don't get lost and people won't start threads saying "look > at mine! look at mine!" Oh, wait ... we do have one of those. :-( Sorry. That was uncalled for on my part. I'm feeling buried under too many active threads, but I didn't need to take it out on the people in this one. Malcolm --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@... To unsubscribe from this group, send email to django-developers-unsubscribe@... For more options, visit this group at http://groups.google.com/group/django-developers?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: django.contrib.sessions problemsOn Jul 8, 5:27 am, Malcolm Tredinnick <malc...@...> wrote: > Increasing > some value from 32 bits to 64 bits is only changing some probabilities; > it's not actually solving the problem, just moving the goalposts to make > it harder to score an own goal. The rest of the conversation should > proceed on the assumption that the bug about creating unique database > entries will be fixed first. Agreed that the uniqueness bug takes precedence. But the uniform 63- bit randomness patch is also important. Once collisions happen, their handling is quite expensive, so they should be better avoided in the first place. Moreover, 63 bits are already used on 64-bit machines and I don't see any reason not to use it on 32-bit machines as well. --- I would like to get some feedback on the reliable session clearing problem next, http://code.djangoproject.com/ticket/7515 . Clearing session data in one step is a trivial addition: def clear(self): self._session.clear() self.modified = True Though better than manually erasing all session keys in a loop, this is not robust enough. Consider this scenario: * session with key X is created for user A * A "logs out" and session data is cleared, session key X will not be reset * session db backend tries remove the empty session, but database connection fails, exception is raised * cookie with old session key X remains in browser * user B accesses the site using the same browser, cookie with key X will be sent to the site * database has come online, session exists and A's sensitive information will be happily served to B Thus, reliable session clearing should assure that the session cookie is removed or updated with a different session key even when exceptions occur in the session storage backend. No hurry in replying, but discussing this is important in my humble opinion. Best, Mart Sõmermaa --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@... To unsubscribe from this group, send email to django-developers-unsubscribe@... For more options, visit this group at http://groups.google.com/group/django-developers?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: django.contrib.sessions problemsI've updated the patch at #7515 [1] by adding an exception- safe .destroy() method to session objects that * clears the session dict * removes the old session from session store * generates a new session key I propose the following roadmap for fixing session problems: 1. review and commit #1180 [2]. This depends indirectly on Malcolm's work on unique db entries, whether the following queue should wait for that is up to the core developers. 2. review and commit #7515 [1] as this subtly depends on #1180 (the rare corner case where exists() throws and key uniqueness can not be checked, collision probability should be low in this case). 3. review and commit # 6941 [3], this depends on #7515. On Jul 8, 2:15 pm, mrts <m...@...> wrote: > I would like to get some feedback on the reliable session clearing > problem next,http://code.djangoproject.com/ticket/7515. [1] Add .clear() and .destroy() to session objects. |