|
View:
New views
15 Messages
—
Rating Filter:
Alert me
|
|
|
Identity, tgrepozewho user permissionsToday while working on a project not using TG I was poking around the identity code looking to see what the method was to get all of a user's permissions. I was rather surprised to see that every group the user is a member of is iterated through to build up the permissions[1]. Is there a reason that a sql join isn't used instead[2]? I would have thought a join was better performing. Just wondering what I'm overlooking or misunderstanding, Janzert 1. All the TG related code I found uses something similar to this: @property def permissions(self): p = set() for g in self.groups: p |= set(g.permissions) return p 2. Why not something like the following (untested code)? @property def permissions(self): p_q = session.query(Permission).join(['tg_group', 'tg_user']) p_q.filter_by(user_id = self.user_id) return p_q.all() --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "TurboGears" group. To post to this group, send email to turbogears@... To unsubscribe from this group, send email to turbogears-unsubscribe@... For more options, visit this group at http://groups.google.com/group/turbogears?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: Identity, tgrepozewho user permissionsJanzert schrieb: > 1. All the TG related code I found uses something similar to this: > @property > def permissions(self): > p = set() > for g in self.groups: > p |= set(g.permissions) > return p > > 2. Why not something like the following (untested code)? > @property > def permissions(self): > p_q = session.query(Permission).join(['tg_group', 'tg_user']) > p_q.filter_by(user_id = self.user_id) > return p_q.all() No reason than simplicity and readability, I think. There's always a tradeoff between optimization and readable code. Patches are welcome, though. But please make sure that it works with all SQLAlchemy version >=0.3. Chris --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "TurboGears" group. To post to this group, send email to turbogears@... To unsubscribe from this group, send email to turbogears-unsubscribe@... For more options, visit this group at http://groups.google.com/group/turbogears?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: Identity, tgrepozewho user permissionsOn Thu, Jul 3, 2008 at 2:48 PM, Christopher Arndt <chris.arndt@...> wrote: > > Janzert schrieb: >> 1. All the TG related code I found uses something similar to this: >> @property >> def permissions(self): >> p = set() >> for g in self.groups: >> p |= set(g.permissions) >> return p >> >> 2. Why not something like the following (untested code)? >> @property >> def permissions(self): >> p_q = session.query(Permission).join(['tg_group', 'tg_user']) >> p_q.filter_by(user_id = self.user_id) >> return p_q.all() > > No reason than simplicity and readability, I think. There's always a > tradeoff between optimization and readable code. Patches are welcome, > though. But please make sure that it works with all SQLAlchemy version > >=0.3. > I'll read that one and test, I like the idea of sql optimizing the identity check sounds interesting for tg1 and tg2.... :) Could you provide a patch in our trac system so that I don't lose that nugget? Florent. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "TurboGears" group. To post to this group, send email to turbogears@... To unsubscribe from this group, send email to turbogears-unsubscribe@... For more options, visit this group at http://groups.google.com/group/turbogears?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: Identity, tgrepozewho user permissionsFor what it's worth, I would argue the generative SA query should be more legible to a TG coder than the ior logic (which, frankly, I'm still a little iffy on).
On Thu, Jul 3, 2008 at 9:04 AM, Florent Aide <florent.aide@...> wrote:
--~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "TurboGears" group. To post to this group, send email to turbogears@... To unsubscribe from this group, send email to turbogears-unsubscribe@... For more options, visit this group at http://groups.google.com/group/turbogears?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: Identity, tgrepozewho user permissionsJanzert schrieb: > 2. Why not something like the following (untested code)? > @property > def permissions(self): > p_q = session.query(Permission).join(['tg_group', 'tg_user']) > p_q.filter_by(user_id = self.user_id) > return p_q.all() I would write something like that in a long statement, without the arbitrary p_q variable and avoid using hard coded table names. The other problem is that this code doesn't work at all. Also, the method must return a set, not a list with duplicates. There are basically two ways to collect the permissions via SQL. Either you make the two joins, creating all possible ways a user can have a certain permission, and then remove the duplicates, like that: def permissions(self): return set(Permission.query.join(Permission.groups, Group.users).filter_by(user_id=self.user_id).all()) Or you create the query using exist clauses, so that the query does not produce any duplicates in the first place: def permissions(self): return set(Permission.query.filter(Permission.groups.any( Group.users.any(User.user_id==self.user_id))).all()) The problem with this latter solution only works with SA >= 0.4 (and the former solution even only with SA >= 0.5). (Also, both solutions do a last join with the user table that is actually unnecessary. I currently see no elegant way to avoid this. It does not harm much, though.) -- Christoph --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "TurboGears" group. To post to this group, send email to turbogears@... To unsubscribe from this group, send email to turbogears-unsubscribe@... For more options, visit this group at http://groups.google.com/group/turbogears?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: Identity, tgrepozewho user permissionsChristoph Zwerschke wrote: > Janzert schrieb: >> 2. Why not something like the following (untested code)? >> @property >> def permissions(self): >> p_q = session.query(Permission).join(['tg_group', 'tg_user']) >> p_q.filter_by(user_id = self.user_id) >> return p_q.all() > > I would write something like that in a long statement, without the > arbitrary p_q variable and avoid using hard coded table names. > > The other problem is that this code doesn't work at all. > Like I said it was untested. :/ > Also, the method must return a set, not a list with duplicates. > > There are basically two ways to collect the permissions via SQL. > > Either you make the two joins, creating all possible ways a user can > have a certain permission, and then remove the duplicates, like that: > > def permissions(self): > return set(Permission.query.join(Permission.groups, > Group.users).filter_by(user_id=self.user_id).all()) > > Or you create the query using exist clauses, so that the query does not > produce any duplicates in the first place: > > def permissions(self): > return set(Permission.query.filter(Permission.groups.any( > Group.users.any(User.user_id==self.user_id))).all()) > > The problem with this latter solution only works with SA >= 0.4 (and the > former solution even only with SA >= 0.5). > > (Also, both solutions do a last join with the user table that is > actually unnecessary. I currently see no elegant way to avoid this. It > does not harm much, though.) > > -- Christoph > My understanding and testing seem to show that an inner join doesn't return duplicates. Wrapping the result in a set doesn't seem necessary. After playing around with SA 0.3.10, 0.4.6 and 0.5.0beta1 and taking into account your comments I came up with this. def permissions(self): return Permission.query.join(["groups", "users"]).filter_by(user_id=self.user_id).all() permissions = property(permissions) I tested by setting up a user belonging to two groups with each of them containing two permissions. One of the permissions was unique to the group and one was the same in both groups. I put a diff from a TG 1.0.5 quickstarted project to what I used for testing at http://paste.turbogears.org/paste/3056 and the full project at http://janzert.com/files/jointest.zip If this seems the right way to go I can create a trac issue and try to work up a real patch. Janzert --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "TurboGears" group. To post to this group, send email to turbogears@... To unsubscribe from this group, send email to turbogears-unsubscribe@... For more options, visit this group at http://groups.google.com/group/turbogears?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: Identity, tgrepozewho user permissionsJanzert schrieb: > My understanding and testing seem to show that an inner join doesn't > return duplicates. Wrapping the result in a set doesn't seem necessary. The query creates duplicates, and they are pulled from the database, but they are removed by SQLAlchemy because you are querying object instances and it makes no sense for SQLAlchemy to deliver duplicates in this case. Nevertheless, the result should be converted to a set for backward compatibility and performance reasons - you usually check for a certain permission with "in", and that will be faster for a set than for a list. > After playing around with SA 0.3.10, 0.4.6 and 0.5.0beta1 and taking > into account your comments I came up with this. > > def permissions(self): > return Permission.query.join(["groups", > "users"]).filter_by(user_id=self.user_id).all() > permissions = property(permissions) Yes, this seems to work with all these SA versions. And the names in the join here are attribute names, not table names, so this is ok. Still, this join query produces more results than necessary. You can verify this by using count() instead of all() - you will see more results counted than elements in the list. As I explained above, the duplicates are filtered away by SQLAlchemy because the query requests object instances, that's why you don't see them. So a solution using "any" (exist clauses) may still be more performant. (Of course ehis is all a bit theoretical since users are usually not in hundreds of groups, but we should try to do it right anyway.) -- Christop --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "TurboGears" group. To post to this group, send email to turbogears@... To unsubscribe from this group, send email to turbogears-unsubscribe@... For more options, visit this group at http://groups.google.com/group/turbogears?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: Identity, tgrepozewho user permissionsChristoph Zwerschke wrote: > Still, this join query produces more results than necessary. As a compromize we could eliminate the duplicates on the database level already so that they are not returned by the query: def permissions(self): return set(Permission.query.distinct().join(['groups', 'users']).filter_by(user_id=self.user_id)) Note that I also removed the all(): It is not necessary if we convert to a set, since the query itself is iterable. -- Christoph --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "TurboGears" group. To post to this group, send email to turbogears@... To unsubscribe from this group, send email to turbogears-unsubscribe@... For more options, visit this group at http://groups.google.com/group/turbogears?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: Identity, tgrepozewho user permissionsChristoph Zwerschke wrote: > Janzert schrieb: >> My understanding and testing seem to show that an inner join doesn't >> return duplicates. Wrapping the result in a set doesn't seem necessary. > > The query creates duplicates, and they are pulled from the database, but > they are removed by SQLAlchemy because you are querying object instances > and it makes no sense for SQLAlchemy to deliver duplicates in this case. Ah yes, by running the generated sql directly I see that now. I'll try and see if there is a way to generate the exists clause that is compatible with 0.3 on up. Janzert --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "TurboGears" group. To post to this group, send email to turbogears@... To unsubscribe from this group, send email to turbogears-unsubscribe@... For more options, visit this group at http://groups.google.com/group/turbogears?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: Identity, tgrepozewho user permissionsJanzert schrieb: > I put a diff from a TG 1.0.5 quickstarted project to what I used for > testing at http://paste.turbogears.org/paste/3056 and the full project > at http://janzert.com/files/jointest.zip > > If this seems the right way to go I can create a trac issue and try to > work up a real patch. Please provide a patch to the quickstart templates resp. the identity module then. Please also provide tests and mention in your ticket if anything needs to be added / altered in the Identity docs http://docs.turbogears.org/1.0/UsingIdentity Thx, Chris --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "TurboGears" group. To post to this group, send email to turbogears@... To unsubscribe from this group, send email to turbogears-unsubscribe@... For more options, visit this group at http://groups.google.com/group/turbogears?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: Identity, tgrepozewho user permissionsChristopher Arndt wrote: > Janzert schrieb: >> I put a diff from a TG 1.0.5 quickstarted project to what I used for >> testing at http://paste.turbogears.org/paste/3056 and the full project >> at http://janzert.com/files/jointest.zip >> >> If this seems the right way to go I can create a trac issue and try to >> work up a real patch. > > Please provide a patch to the quickstart templates resp. the identity > module then. Please also provide tests and mention in your ticket if > anything needs to be added / altered in the Identity docs > > http://docs.turbogears.org/1.0/UsingIdentity > > Thx, Chris > I created a ticket at http://trac.turbogears.org/ticket/1879 no patch or anything yet though. One thing looking at the current identity tests, 1.1 is only testing the soprovider. Should a second ticket be opened to create saprovider tests? There shouldn't need to be any documentation changes (or really even test changes if the current tests had covered it) because the change is functionally equivalent just better performing. Janzert --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "TurboGears" group. To post to this group, send email to turbogears@... To unsubscribe from this group, send email to turbogears-unsubscribe@... For more options, visit this group at http://groups.google.com/group/turbogears?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: Identity, tgrepozewho user permissionsJanzert wrote: > Christoph Zwerschke wrote: >> Janzert schrieb: >>> My understanding and testing seem to show that an inner join doesn't >>> return duplicates. Wrapping the result in a set doesn't seem necessary. >> The query creates duplicates, and they are pulled from the database, but >> they are removed by SQLAlchemy because you are querying object instances >> and it makes no sense for SQLAlchemy to deliver duplicates in this case. > > Ah yes, by running the generated sql directly I see that now. I'll try > and see if there is a way to generate the exists clause that is > compatible with 0.3 on up. > > Janzert > Before working on getting a SA 0.3 compatible version I decided to take a look at the speed differences. I initialized the database with a user that belonged to 20 groups with each group having 11 out of 60 permissions. The timing below were done with SA 0.5 and postgresql 8.3.1. I also tried sqlite and the timings were basically the same. python -m timeit -s"import perm_test;perm_test.iter()" "perm_test.iter()" 10000 loops, best of 3: 94.9 usec per loop python -m timeit -s"import perm_test;perm_test.join()" "perm_test.join()" 100 loops, best of 3: 13.1 msec per loop python -m timeit -s"import perm_test;perm_test.exist()" "perm_test.exist()" 100 loops, best of 3: 14.7 msec per loop As you can see the current method of iterating through the groups is actually 130 to 150 times faster than single query methods. I've attached my test project and scripts to the trac issue* if someone wants to double check my methods. Just now I also tried a test with 500 groups, 800 permissions and 250 permissions per group, for what should be a torture test to the current method. python -m timeit -s"import perm_test;perm_test.iter()" "perm_test.iter()" 100 loops, best of 3: 16.9 msec per loop python -m timeit -s"import perm_test;perm_test.join()" "perm_test.join()" 10 loops, best of 3: 808 msec per loop python -m timeit -s"import perm_test;perm_test.exist()" "perm_test.exist()" 10 loops, best of 3: 256 msec per loop While a little closer still the current method is much better. As a side note while watching the process usage during these test, the iterative methods has most of the cpu usage in python, the join query mostly in postgres and the exist query is split about evenly. Janzert * http://trac.turbogears.org/ticket/1879 --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "TurboGears" group. To post to this group, send email to turbogears@... To unsubscribe from this group, send email to turbogears-unsubscribe@... For more options, visit this group at http://groups.google.com/group/turbogears?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: Identity, tgrepozewho user permissionsJanzert schrieb: > I've attached my test project and scripts to the trac issue* if > someone wants to double check my methods. I'm using the following add-on, maybe it's useful for you, too ;-) https://addons.mozilla.org/en-US/thunderbird/addon/5759 > As you can see the current method of iterating through the groups is > actually 130 to 150 times faster than single query methods. I think this is because the current method uses the group.permissions lists in the database session if possible. I.e. if you run it in a loop, it will become very efficient. However, in reality this function will be called only once or a few times per request, always in a fresh session. You can simulate this with turbogears.database.session.expire_all(). Doing so, I measured that the join method is about 3 times faster and the exists method is about 2 times faster than the old method (similar for both SQLite and Postgres 8.3). So it depends on how people use the permissions attribute. If used more than 2 times per request, the old method is faster. One solution for getting good performance in both cases is using the join method and caching the permissions in the User instance: def permissions(self): try: p = User._permissions except AttributeError: p = set(Permission.query.distinct().join(['groups', 'users']).filter_by(user_id=self.user_id)) User._permissions = p return p Another interesting question is why the join method is in fact faster than the exists method, though you would expect the opposite. See also: http://wrschneider.blogspot.com/2004/12/postgresql-performance-of-where-exists.html In this case we have a chained exists, maybe this makes it difficult to optimize for the database. -- Christoph --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "TurboGears" group. To post to this group, send email to turbogears@... To unsubscribe from this group, send email to turbogears-unsubscribe@... For more options, visit this group at http://groups.google.com/group/turbogears?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: Identity, tgrepozewho user permissionsChristoph Zwerschke wrote: > Janzert schrieb: > > I've attached my test project and scripts to the trac issue* if > > someone wants to double check my methods. > > I'm using the following add-on, maybe it's useful for you, too ;-) > https://addons.mozilla.org/en-US/thunderbird/addon/5759 > Yeah, I probably would find that useful. But in this case I really did mean "I've attached ... to the trac issue*". With the url at the end of the message. :) * http://trac.turbogears.org/ticket/1879 >> As you can see the current method of iterating through the groups is >> actually 130 to 150 times faster than single query methods. > > I think this is because the current method uses the group.permissions > lists in the database session if possible. I.e. if you run it in a loop, > it will become very efficient. However, in reality this function will be > called only once or a few times per request, always in a fresh session. > You can simulate this with turbogears.database.session.expire_all(). > Doing so, I measured that the join method is about 3 times faster and > the exists method is about 2 times faster than the old method (similar > for both SQLite and Postgres 8.3). > Ah, yes some sort of caching going on would certainly make sense. I'll take another look at it. Janzert --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "TurboGears" group. To post to this group, send email to turbogears@... To unsubscribe from this group, send email to turbogears-unsubscribe@... For more options, visit this group at http://groups.google.com/group/turbogears?hl=en -~----------~----~----~----~------~----~------~--~--- |
|
|
Re: Identity, tgrepozewho user permissionsJanzert schrieb: > Christoph Zwerschke wrote: >> Janzert schrieb: >> > I've attached my test project and scripts to the trac issue* if >> > someone wants to double check my methods. >> >> I'm using the following add-on, maybe it's useful for you, too ;-) >> https://addons.mozilla.org/en-US/thunderbird/addon/5759 > > Yeah, I probably would find that useful. But in this case I really did > mean "I've attached ... to the trac issue*". With the url at the end of > the message. :) Oh well :) I think I need another add-on to help me not reading over such important details. Somehow I totally missed you created a ticket. So I suggest we continue the discussion there. -- Chris --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "TurboGears" group. To post to this group, send email to turbogears@... To unsubscribe from this group, send email to turbogears-unsubscribe@... For more options, visit this group at http://groups.google.com/group/turbogears?hl=en -~----------~----~----~----~------~----~------~--~--- |
| Free Forum Powered by Nabble | Forum Help |