Major SMP problems with lstat/namei

We have encountered some serious SMP performance/scalability problems that we've tracked back to lstat/namei calls. I've written a quick benchmark with a pair of tests to simplify/measure the problem.

Both tests use a tree of directories: the top level directory contains five subdirectories a, b, c, d, and e. Each subdirectory contains five subdirectories a, b, c, d, and e, and so on: 1 directory at level one, 5 at level two, 25 at level three, 125 at level four, 625 at level five, and 3125 at level six.

In the "realpath" test, a random path is constructed at the bottom of the tree (e.g. /tmp/lstat/a/b/c/d/e) and realpath() is called on that, provoking lstat() calls on the whole tree. This is to simulate a mix of high-contention and low-contention lstat() calls.

In the "lstat" test, lstat() is called directly on a path at the bottom of the tree. Since there are 3125 files, this simulates relatively low-contention lstat() calls.

In both cases, the test repeats as many times as possible for 60 seconds. Each test is run simultaneously by multiple processes, with progressively doubling concurrency from 1 to 512.

What I found was that everything is fine at concurrency 2, probably indicating that the benchmark pegged on some other resource limit. At concurrency 4, realpath drops to 31.8% of concurrency 1. At concurrency 8, performance is down to 18.3%. In the interim, CPU load goes to 80-90% system CPU. I've confirmed via ktrace and rusage that the CPU usage is all system time, and that lstat() is the *only* system call in the test (realpath() is called with an absolute path).

I then reran the 32-process test on 1-7 cores, and found that performance peaks at 2 cores and drops sharply from there. Eight cores runs *fifteen* times slower than two cores.

The full test results are at the bottom of this message.

This is on 6.3-RELEASE-p4 with vfs.lookup_shared=1.

I believe this is the same issue that was previously discussed as "2 x quad-core system is slower that 2 x dual core on FreeBSD" archived here:

http://lists.freebsd.org/pipermail/freebsd-stable/2007-November/038441.html

In that post, Kris Kennaway wrote:

> It is hard to say for certain without a direct profile comparison of the
> workload, but it is probably due to lockmgr contention. lockmgr is used
> for various locking operations to do with VFS data structures. It is
> known to have poor performance and scale very badly.

At this point, what I've got is one of those synthetic benchmarks, but it matches our production problems exactly. The one difference, and a most unfortunate one, is that the production processes need a whole lot more RAM, so when this manifests they backlog and the server death-spirals through swap.

I've chased my way up the kernel source to kern_lstat(), where a shared lock is obtained, and then on to namei, where vfs.lookup_shared comes into play. But unfortunately, I don't understand lockmgr, I don't know how the macros and flags I see here relate to it, I can't figure out what happened to the changes that Attilio Rao was working on, and there didn't seem to be much other hope at the time.

This is becoming a huge problem for us. Is there anything at all that can be done, or any news? In the case linked above, improvement was made by changing a PHP setting that isn't applicable in our case.

Thanks,
Jeff

Results by concurrency (percentages are relative to concurrency 1):

Concurrency  Test      Total     %        Total/Sec  Total/Sec/Worker
1            realpath  1409069   100%     23484      23484
1            lstat     6828763   100%     113812     113812
2            realpath  1450489   100%     24174      12087
2            lstat     6891417   100.9%   114856     57428
4            realpath  448693    31.8%    7478       1869
4            lstat     3047933   44.6%    50798      12699
8            realpath  258281    18.3%    4304       538
8            lstat     1688728   24.7%    28145      3518
16           realpath  179150    12.7%    2985       186
16           lstat     966558    14.1%    16109      1006
32           realpath  116982    8.3%     1949       60
32           lstat     644703    9.4%     10745      335
64           realpath  112050    7.9%     1867       29
64           lstat     572798    8.3%     9546       149
128          realpath  111544    7.9%     1859       14
128          lstat     570800    8.3%     9513       74
256          realpath  96461     6.8%     1607       6
256          lstat     580679    8.5%     9677       37
512          realpath  91224     6.4%     1520       2
512          lstat     498342    7.2%     8305       16

realpath at concurrency 32, by number of cores:

Cores  Total     Total/Sec  Total/Sec/Worker
1      1289527   21492      671
2      1753625   29227      913
3      1197896   19964      623
4      631293    10521      328
5      227814    3796       118
6      153550    2559       79
7      136013    2266       70

_______________________________________________
freebsd-hackers@... mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@..."
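
The "lstat" test described in this message can be approximated with the sketch below. It is not Jeff's benchmark, only a minimal illustration of the description: the names next_random, build_leaf_path and run_lstat_test are hypothetical, the small LCG stands in for whatever random source the real test uses, and checking the clock only every 1024 iterations is an assumption made to keep lstat() the dominant system call. The directory tree is assumed to exist already under the given root.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700        /* expose lstat() under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/stat.h>
#include <time.h>

#define TREE_DEPTH 5             /* components below the root: 5^5 = 3125 leaves */
#define FANOUT     5             /* subdirectories a..e at every level */

/* Tiny linear congruential generator (illustrative, not from the thread). */
unsigned int
next_random(unsigned int *state)
{
    *state = *state * 1103515245u + 12345u;
    return (*state >> 16) & 0x7fffu;
}

/*
 * Write root followed by TREE_DEPTH random components from a..e into buf,
 * e.g. "/tmp/lstat/a/b/c/d/e". Returns 0 on success, -1 if buf is too small.
 */
int
build_leaf_path(char *buf, size_t buflen, const char *root, unsigned int *state)
{
    size_t len;
    int i;

    assert(buf != NULL && root != NULL && state != NULL);
    len = strlen(root);
    if (len + 2 * TREE_DEPTH + 1 > buflen)
        return -1;
    memcpy(buf, root, len);
    for (i = 0; i < TREE_DEPTH; i++) {
        buf[len++] = '/';
        buf[len++] = (char)('a' + next_random(state) % FANOUT);
    }
    buf[len] = '\0';
    return 0;
}

/*
 * Call lstat() on random leaf paths for roughly `seconds` seconds and
 * return the number of successful calls. The clock is polled only every
 * 1024 iterations so that timekeeping does not dominate the loop.
 */
unsigned long
run_lstat_test(const char *root, int seconds, unsigned int seed)
{
    char path[1024];
    struct stat sb;
    unsigned long calls = 0, iter = 0;
    time_t end = time(NULL) + seconds;

    for (;;) {
        if ((iter++ & 1023) == 0 && time(NULL) >= end)
            break;
        if (build_leaf_path(path, sizeof(path), root, &seed) != 0)
            break;
        if (lstat(path, &sb) == 0)
            calls++;
    }
    return calls;
}
```

Running one copy per worker process, for example run_lstat_test("/tmp/lstat", 60, seed) with a different seed in each, and summing the returned counts would give a figure comparable in spirit to the Total values above, though not necessarily in magnitude.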

Re: Major SMP problems with lstat/namei

Hello Jeff,

On Wed, 24 Sep 2008 00:52:59 -0400, Jeff Wheelhouse <freebsd-hackers@...> wrote:
>
> We have encountered some serious SMP performance/scalability problems
> that we've tracked back to lstat/namei calls. I've written a quick

This all seems to be the reason for the very poor performance of PHP when used with open_basedir and safe_mode enabled. It would be nice to see if there's something that could be done to make it better.

-- 
S pozdravom / Best regards
Daniel Geržo

Re: Major SMP problems with lstat/namei

Jeff Wheelhouse wrote:
> This is on 6.3-RELEASE-p4 with vfs.lookup_shared=1.
>
> I believe this is the same issue that was previously discussed as "2 x
> quad-core system is slower that 2 x dual core on FreeBSD" archived here:
>
> http://lists.freebsd.org/pipermail/freebsd-stable/2007-November/038441.html

> This is becoming a huge problem for us. Is there anything that at all
> can be done, or any news? In the case linked above, improvement was
> made by changing a PHP setting that isn't applicable in our case.

There is nothing that can be done within the 6.x branch. 7.x contains many improvements, but I think only 8.x will directly change the lockmgr and the namei cache. The best thing you can try right now is to use 7-STABLE (or the soon-to-be-released 7.1; you might need tuning with 7.0-RELEASE) or try 8-CURRENT (it's quite stable).

Re: Major SMP problems with lstat/namei

Ivan Voras wrote:
> There is nothing that can be done within the 6.x branch. 7.x contains
> many improvements but I think only 8.x will directly change the lockmgr
> and the namei cache. The best things you can try right now is to use
> 7-STABLE (or soon to be released 7.1; you might need tuning with
> 7.0-RELEASE) or try 8-CURRENT (it's quite stable).

I remembered two more things:

* The problematic load can also be generated with benchmarks/blogbench.
* I don't have the numbers here, but I think I remember that ZFS scored noticeably higher than UFS in this workload. Of course, ZFS has other problems.

Re: Major SMP problems with lstat/namei

On Wed, Sep 24, 2008 at 09:26:55AM +0200, Daniel Gerzo wrote:
> Hello Jeff,
>
> On Wed, 24 Sep 2008 00:52:59 -0400, Jeff Wheelhouse
> <freebsd-hackers@...> wrote:
> >
> > We have encountered some serious SMP performance/scalability problems
> > that we've tracked back to lstat/namei calls. I've written a quick
>
> this all seems like a reason of very poor performance of PHP when used with
> open_basedir and safe_mode enabled. It would be nice to see if there's
> something what could be done to make it better.

Both of which are features which will, thankfully, be removed in PHP 6.

Whoever uses these features in PHP deserves the pain -- they're worthless and provide no security what-so-ever. Consider using suPHP or an MPM like mpm-itk.

Also, PHP and performance shouldn't be put in the same sentence. </rant>

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

Re: Major SMP problems with lstat/namei

On Wednesday 24 September 2008 12:52:59 am Jeff Wheelhouse wrote:
> [...]
> This is on 6.3-RELEASE-p4 with vfs.lookup_shared=1.

Shared lookups only work on the NFS client in 6.x. I'm about to turn them on for UFS in HEAD (8.x) and will backport the needed fixes to 7.x after 7.1 (too risky to merge to 7.x this close to a release). So lookup_shared=1 isn't going to really help on 6.x unless you are doing it all over NFS. You also want to backport my fix to cache_enter() before using lookup_shared at all:

jhb         2008-08-23 15:13:39 UTC

  FreeBSD src repository

  Modified files:
    sys/kern             vfs_cache.c
  Log:
  SVN rev 182061 on 2008-08-23 15:13:39Z by jhb

  Fix a race condition with concurrent LOOKUP namecache operations for a
  vnode not in the namecache when shared lookups are enabled
  (vfs.lookup_shared=1, it is currently off by default) and the filesystem
  supports shared lookups (e.g. NFS client). Specifically, if multiple
  concurrent LOOKUPs both miss in the name cache in parallel, each of the
  lookups may each end up adding an entry to the namecache resulting in
  duplicate entries in the namecache for the same pathname. A subsequent
  removal of the mapping of that pathname to that vnode (via remove or
  rename) would only evict one of the entries from the name cache. As a
  result, subsequent lookups for that pathname would still return the old
  vnode.

  This race was observed with shared lookups over NFS where a file was
  updated by writing a new file out to a temporary file name and then
  renaming that temporary file to the "real" file to effect atomic updates
  of a file. Other processes on the same client that were periodically
  reading the file would occasionally receive an ESTALE error from open(2)
  because the VOP_GETATTR() in nfs_open() would receive that error when
  given the stale vnode.

  The fix here is to check for duplicates in cache_enter() and just return
  if an entry for this same directory and leaf file name for this vnode is
  already in the cache. The check for duplicates is done by walking the
  per-vnode list of name cache entries.
  It is expected that this list should be very small in the common case
  (usually 0 or 1 entries during a cache_enter() since most files only
  have 1 "leaf" name).

  Reviewed by:    ups, scottl
  MFC after:      2 months

  Revision  Changes    Path
  1.124     +33 -9     src/sys/kern/vfs_cache.c

If you want to try the UFS stuff on 7, you would probably need to backport at least the following, maybe more:

jeff        2008-04-11 09:44:25 UTC

  FreeBSD src repository

  Modified files:
    sys/ufs/ufs          ufs_lookup.c
  Log:
   - cache dp->i_offset in the local 'i_offset' variable for use in loop
     indexes so directory lookup becomes shared lock safe. In the
     modifying cases an exclusive lock is held here so the commit routine
     may rely on the state of i_offset.
   - Similarly handle i_diroff by fetching at the start and setting only
     once the operation is complete. Without the exclusive lock these are
     only considered hints.
   - Assert that an exclusive lock is held when we're preparing for a
     commit routine.
   - Honor the lock type request from lookup instead of always using
     exclusive locking.

  Tested by:      pho, kris

  Revision  Changes    Path
  1.87      +48 -29    src/sys/ufs/ufs/ufs_lookup.c

jeff        2008-04-22 12:34:16 UTC

  FreeBSD src repository

  Modified files:
    sys/ufs/ufs          inode.h ufs_lookup.c
  Log:
   - Use a local variable for i_ino in ufs_lookup. It is only used to
     communicate between two parts of this one function. This was causing
     problems with shared lookups as each would trash the ino value in the
     inode.
   - Remove the unused i_ino field from the inode structure.

  Revision  Changes    Path
  1.53      +0 -1      src/sys/ufs/ufs/inode.h
  1.88      +10 -13    src/sys/ufs/ufs/ufs_lookup.c

jhb         2008-07-30 21:07:56 UTC

  FreeBSD src repository

  Modified files:
    sys/ufs/ufs          ufs_lookup.c
  Log:
  SVN rev 181018 on 2008-07-30 21:07:56Z by jhb

  Whitespace tweak.

  Revision  Changes    Path
  1.90      +0 -1      src/sys/ufs/ufs/ufs_lookup.c

jhb         2008-09-16 16:18:36 UTC

  FreeBSD src repository

  Modified files:
    sys/ufs/ufs          ufs_lookup.c
  Log:
  SVN rev 183079 on 2008-09-16 16:18:36Z by jhb

  - Only set i_offset in the parent directory's i-node during a lookup for
    non-LOOKUP operations.
  - Relax a VOP assertion for a DELETE lookup. rename() uses WANTPARENT
    instead of LOCKPARENT when looking up the source pathname.
    ufs_rename() uses a relookup() to lock the parent directory when it
    decides to finally remove the source path. Thus, it is ok for a DELETE
    with WANTPARENT set instead of LOCKPARENT to use a shared vnode lock
    rather than an exclusive vnode lock.

  Reported by:    kris (2)
  Reviewed by:    jeff

  Revision  Changes    Path
  1.91      +9 -3      src/sys/ufs/ufs/ufs_lookup.c

jhb         2008-09-16 19:06:44 UTC

  FreeBSD src repository

  Modified files:
    sys/ufs/ufs          inode.h ufs_lookup.c
  Log:
  SVN rev 183093 on 2008-09-16 19:06:44Z by jhb

  Retire the 'i_reclen' field from the in-memory i-node. Previously,
  during a DELETE lookup operation, lookup would cache the length of the
  directory entry to be deleted in 'i_reclen'. Later, the actual VOP to
  remove the directory entry (ufs_remove, ufs_rename, etc.) would call
  ufs_dirremove() which extended the length of the previous directory
  entry to "remove" the deleted entry.

  However, we always read the entire block containing the directory entry
  when doing the removal, so we always have the directory entry to be
  deleted in-memory when doing the update to the directory block. Also, we
  already have to figure out where the directory entry that is being
  removed is in the block so that we can pass the component name to the
  dirhash code to update the dirhash. So, instead of passing 'i_reclen'
  from ufs_lookup() to the ufs_dirremove() routine, just read the
  'd_reclen' field directly out of the entry being removed when updating
  the length of the previous entry in the block.

  This avoids a cosmetic issue of writing to 'i_reclen' while holding a
  shared vnode lock. It also slightly reduces the amount of side-band data
  passed from ufs_lookup() to operations updating a directory via the
  directory's i-node.

  Reviewed by:    jeff

  Revision  Changes    Path
  1.54      +0 -1      src/sys/ufs/ufs/inode.h
  1.92      +9 -6      src/sys/ufs/ufs/ufs_lookup.c

jeff        2008-04-11 09:48:12 UTC

  FreeBSD src repository

  Modified files:
    sys/ufs/ufs          dirhash.h ufs_dirhash.c
  Log:
   - Use a lockmgr lock rather than a mtx to protect dirhash. This lock
     may be held for the duration of the various dirhash operations which
     avoids many complex unlock/lock/revalidate sequences.
   - Permit shared locks on lookup. To protect the ip->i_dirhash pointer
     we use the vnode interlock in the shared case. Callers holding the
     exclusive vnode lock can run without fear of concurrent modification
     to i_dirhash.
   - Hold an exclusive dirhash lock when creating the dirhash structure
     for the first time or when re-creating a dirhash structure which has
     been recycled.

  Tested by:      kris, pho

  Revision  Changes    Path
  1.6       +2 -1      src/sys/ufs/ufs/dirhash.h
  1.24      +289 -227  src/sys/ufs/ufs/ufs_dirhash.c

jhb         2008-09-16 16:23:56 UTC

  FreeBSD src repository

  Modified files:
    sys/ufs/ufs          dirhash.h ufs_dirhash.c
  Log:
  SVN rev 183080 on 2008-09-16 16:23:56Z by jhb

  Fix a race with shared lookups on UFS. If the dirhash code reached the
  cap on memory usage, then shared LOOKUP operations could start free'ing
  dirhash structures. Without these fixes, concurrent free's on the same
  directory could result in one of the threads blocked on a lock in a
  dirhash structure free'd by the other thread.
  - Replace the lockmgr lock in the dirhash structure with an sx lock.
  - Use a reference count managed with ufsdirhash_hold()/drop() to
    determine when to free the dirhash structures. The directory i-node
    holds a reference while the dirhash is attached to an i-node.
    Code that wishes to lock the dirhash while holding a shared vnode lock
    must first acquire a private reference to the dirhash while holding
    the vnode interlock before acquiring the dirhash sx lock. After
    acquiring the sx lock, it drops the private reference after checking
    to see if the dirhash is still used by the directory i-node.

  Revision  Changes    Path
  1.7       +5 -1      src/sys/ufs/ufs/dirhash.h
  1.25      +82 -33    src/sys/ufs/ufs/ufs_dirhash.c

jhb         2008-09-22 20:53:22 UTC

  FreeBSD src repository

  Modified files:
    sys/ufs/ufs          ufs_dirhash.c
  Log:
  SVN rev 183280 on 2008-09-22 20:53:22Z by jhb

  Close a race between concurrent calls to ufsdirhash_recycle() and
  ufsdirhash_free() introduced in my last commit by removing the dirhash
  about to be free'd in ufsdirhash_free() from the global dirhash list
  before dropping the sx lock.

  Tested by:      kris

  Revision  Changes    Path
  1.26      +10 -5     src/sys/ufs/ufs/ufs_dirhash.c

There are additional fixes needed to fix races with umount -f, so if you backport all this stuff, don't use umount -f or you risk panics. :)

Also, you will need to set the flag in the mount flags to enable shared lookups in the mount VOP in ffs_vfsops.c:

--- //depot/projects/smpng/sys/ufs/ffs/ffs_vfsops.c     2008/08/25 16:33:41
+++ //depot/user/jhb/lock/ufs/ffs/ffs_vfsops.c  2008/08/29 15:04:03
@@ -852,7 +852,7 @@
         * Initialize filesystem stat information in mount struct.
         */
        MNT_ILOCK(mp);
-       mp->mnt_kern_flag |= MNTK_MPSAFE;
+       mp->mnt_kern_flag |= MNTK_MPSAFE | MNTK_LOOKUP_SHARED;
        MNT_IUNLOCK(mp);
 #ifdef UFS_EXTATTR
 #ifdef UFS_EXTATTR_AUTOSTART

For 6.x you could in theory backport all of this as well, but there may be other fixes needed as well for 6.x. I'm only planning on merging this stuff back to 7.x myself.

-- 
John Baldwin
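
The duplicate check that the SVN rev 182061 log describes for cache_enter() (walk the per-vnode list, return early if the same directory and leaf name are already present) can be modelled in userland as below. This is a simplified sketch and not the kernel code: struct nc_entry, struct model_vnode, model_cache_enter and model_cache_count are hypothetical names, the fixed-size name buffer is an assumption, and all locking is omitted, whereas real code would have to perform the walk and the insert under whatever lock protects the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for a name cache entry (hypothetical layout). */
struct nc_entry {
    const void      *dir;       /* identity of the parent directory */
    char             name[64];  /* leaf name within that directory */
    struct nc_entry *next;      /* next entry naming the same vnode */
};

/* Simplified stand-in for a vnode: only the per-vnode entry list. */
struct model_vnode {
    struct nc_entry *entries;
};

/*
 * Add (dir, name) -> vp unless an identical entry already exists.
 * Returns 1 if an entry was added, 0 if it was a duplicate, -1 on error.
 */
int
model_cache_enter(struct model_vnode *vp, const void *dir, const char *name)
{
    struct nc_entry *e;

    assert(vp != NULL && name != NULL);
    if (strlen(name) >= sizeof(e->name))
        return -1;

    /* The list is expected to be short: most files have one leaf name. */
    for (e = vp->entries; e != NULL; e = e->next) {
        if (e->dir == dir && strcmp(e->name, name) == 0)
            return 0;           /* a racing lookup already added it */
    }

    e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    e->dir = dir;
    strcpy(e->name, name);      /* length checked above */
    e->next = vp->entries;
    vp->entries = e;
    return 1;
}

/* Count the entries attached to a vnode. */
size_t
model_cache_count(const struct model_vnode *vp)
{
    const struct nc_entry *e;
    size_t n = 0;

    for (e = vp->entries; e != NULL; e = e->next)
        n++;
    return n;
}
```

With this check in place, two lookups that both miss and then both try to insert the same (directory, name) pair leave a single entry behind, so a later remove or rename that evicts one entry cannot leave a stale duplicate.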

Re: Major SMP problems with lstat/namei

On Sep 24, 2008, at 6:12 AM, Ivan Voras wrote:
> There is nothing that can be done within the 6.x branch. 7.x contains
> many improvements but I think only 8.x will directly change the
> lockmgr
> and the namei cache. The best things you can try right now is to use
> 7-STABLE (or soon to be released 7.1; you might need tuning with
> 7.0-RELEASE) or try 8-CURRENT (it's quite stable).

Really? Nothing?

We get lockmgr-related panics on FreeBSD 7.0, as detailed elsewhere on this list.

Stability issues aside, what else would we need to tune on 7.0, besides enabling the ULE scheduler, and how much benefit would we really get?

These servers are in production, so 8-CURRENT is not an option. I've already had my knuckles rapped by a customer for trying 7.1-PRERELEASE on one of their machines.

Thanks,
Jeff

Re: Major SMP problems with lstat/namei

On Sep 24, 2008, at 12:12 PM, John Baldwin wrote:
> Shared lookups only work on the NFS client in 6.x. I'm about to
> turn them on
> for UFS in HEAD (8.x) and will backport the needed fixes to 7.x
> after 7.1
> (too risky to merge to 7.x this close to a release).

Testers available, when you get to that. :-)

> So lookup_shared=1
> isn't going to really help on 6.x unless you are doing it all over
> NFS. You
> also want to backport my fix to cache_enter() before using
> lookup_shared at
> all:

Since it sounds like 6.x is a dead end, we'll focus on 7.x, provided we can get it to be stable for us.

Having never used svn, I do need to figure out how to pull the specific patches you referenced, but I'm sure that's not an unclimbable mountain. :-)

I appreciate your insight on this, it's very helpful.

Thanks,
Jeff

Re: Major SMP problems with lstat/namei

On Wednesday 24 September 2008 01:47:32 pm Jeff Wheelhouse wrote:
>
> On Sep 24, 2008, at 12:12 PM, John Baldwin wrote:
> > Shared lookups only work on the NFS client in 6.x. I'm about to
> > turn them on
> > for UFS in HEAD (8.x) and will backport the needed fixes to 7.x
> > after 7.1
> > (too risky to merge to 7.x this close to a release).
>
> Testers available, when you get to that. :-)
>
> > So lookup_shared=1
> > isn't going to really help on 6.x unless you are doing it all over
> > NFS. You
> > also want to backport my fix to cache_enter() before using
> > lookup_shared at
> > all:
>
> Since it sounds like 6.x is a dead end, we'll focus on 7.x, provided
> we can get it to be stable for us.

Yes.

> Having never used svn, I do need to figure out how to pull the
> specific patches you referenced, but I'm sure that's not an
> unclimbable mountain. :-)

You can still use cvs to pull the revisions. All those e-mail msg's have the CVS revisions in them, too.

-- 
John Baldwin

Re: Major SMP problems with lstat/namei

On Sep 24, 2008, at 2:10 PM, John Baldwin wrote:
> You can still use cvs to pull the revisions. All those e-mail msg's
> have the
> CVS revisions in them, too.

If I'm ever to do anything that will benefit someone besides myself, it's worth my making the effort to learn SVN. We have coasted on the back of FreeBSD without giving back for long enough.

Thanks,
Jeff

Re: Major SMP problems with lstat/namei

Jeff Wheelhouse wrote:
>
> On Sep 24, 2008, at 6:12 AM, Ivan Voras wrote:
>> There is nothing that can be done within the 6.x branch. 7.x contains
>> many improvements but I think only 8.x will directly change the lockmgr
>> and the namei cache. The best things you can try right now is to use
>> 7-STABLE (or soon to be released 7.1; you might need tuning with
>> 7.0-RELEASE) or try 8-CURRENT (it's quite stable).
>
> Really? Nothing?
>
> We get lockmgr-related panics on FreeBSD 7.0, as detailed elsewhere on
> this list.
>
> Stability issues aside, what else would we need to tune on 7.0, besides
> enabling the ULE scheduler, and how much benefit would we really get?
>
> These servers are in production, so 8-CURRENT is not an option. I've
> already had my knuckles rapped by a customer for trying 7.1-PRERELEASE
> on one of their machines.

You are supposed to edit the uname info back to 7.0 before installing experimental 7.1 systems! Didn't you get the memo?

>
> Thanks,
> Jeff

Re: Major SMP problems with lstat/namei

Jeff Wheelhouse <freebsd-hackers@...> writes:
> I've written a quick benchmark with a pair of tests to
> simplify/measure the problem. [...]

Care to share?

DES
-- 
Dag-Erling Smørgrav - des@...

Re: Major SMP problems with lstat/namei

On Sep 25, 2008, at 10:51 AM, Dag-Erling Smørgrav wrote:
> Jeff Wheelhouse <freebsd-hackers@...> writes:
>> I've written a quick benchmark with a pair of tests to
>> simplify/measure the problem. [...]
>
> Care to share?

No problem:

http://software.wheelhouse.org/rptest.tar.bz2

Thanks,
Jeff

Re: Major SMP problems with lstat/namei

On Sep 24, 2008, at 12:12 PM, John Baldwin wrote:
> Shared lookups only work on the NFS client in 6.x. I'm about to
> turn them on
> for UFS in HEAD (8.x) and will backport the needed fixes to 7.x
> after 7.1
> (too risky to merge to 7.x this close to a release).

OK, given all the patches you referenced, I did make a decent effort at backporting to 7.0. Here are the results:

> Revision  Changes    Path
> 1.87      +48 -29    src/sys/ufs/ufs/ufs_lookup.c

Applied, changing a couple of VOP_ISLOCKED() and vn_lock() calls to add "td" as the last parameter.

> Revision  Changes    Path
> 1.53      +0 -1      src/sys/ufs/ufs/inode.h
> 1.88      +10 -13    src/sys/ufs/ufs/ufs_lookup.c

Applied successfully.

> SVN rev 181018 on 2008-07-30 21:07:56Z by jhb

NOT applied, because it was a whitespace tweak on ufs_lookup 1.89 which was not on your list.

> SVN rev 183079 on 2008-09-16 16:18:36Z by jhb

Applied cleanly.

> Modified files:
>   sys/ufs/ufs          inode.h ufs_lookup.c
> Log:
> SVN rev 183093 on 2008-09-16 19:06:44Z by jhb

Applied cleanly.

> 1.6       +2 -1      src/sys/ufs/ufs/dirhash.h
> 1.24      +289 -227  src/sys/ufs/ufs/ufs_dirhash.c

This patch applies but generates an awful lot of errors (enclosed at end). I think it may be dependent on the 8.0 lockmgr. Since most of the remaining patches are against the same files, I bailed out here.

> SVN rev 183080 on 2008-09-16 16:23:56Z by jhb

Skipped.

> SVN rev 183280 on 2008-09-22 20:53:22Z by jhb

Skipped.

> There are additional fixes needed to fix races with umount -f,
> so if you backport all this stuff, don't use umount -f or you
> risk panics. :)

Noted.

> - mp->mnt_kern_flag |= MNTK_MPSAFE;
> + mp->mnt_kern_flag |= MNTK_MPSAFE | MNTK_LOOKUP_SHARED;

Applied.

If I can make the backport work on 7.0 (a big if, given the dirhash changes), I am happy to maintain and test the diffs locally until after the 7.1 release and send them over to you at that time, if it will save you some effort.

Thanks,
Jeff

Dirhash compile errors:

/usr/src/sys/ufs/ufs/ufs_dirhash.c:132:37: error: macro "lockmgr" requires 4 arguments, but only 3 given
/usr/src/sys/ufs/ufs/ufs_dirhash.c: In function 'ufsdirhash_release':
/usr/src/sys/ufs/ufs/ufs_dirhash.c:132: error: 'lockmgr' undeclared (first use in this function)
/usr/src/sys/ufs/ufs/ufs_dirhash.c:132: error: (Each undeclared identifier is reported only once
/usr/src/sys/ufs/ufs/ufs_dirhash.c:132: error: for each function it appears in.)
/usr/src/sys/ufs/ufs/ufs_dirhash.c:161:45: error: macro "lockmgr" requires 4 arguments, but only 3 given
/usr/src/sys/ufs/ufs/ufs_dirhash.c: In function 'ufsdirhash_create':
/usr/src/sys/ufs/ufs/ufs_dirhash.c:161: error: 'lockmgr' undeclared (first use in this function)
/usr/src/sys/ufs/ufs/ufs_dirhash.c:178:17: error: macro "lockmgr" requires 4 arguments, but only 3 given
/usr/src/sys/ufs/ufs/ufs_dirhash.c:193:60: error: macro "lockmgr" requires 4 arguments, but only 3 given
/usr/src/sys/ufs/ufs/ufs_dirhash.c:198:42: error: macro "lockmgr" requires 4 arguments, but only 3 given
/usr/src/sys/ufs/ufs/ufs_dirhash.c:222:39: error: macro "lockmgr" requires 4 arguments, but only 3 given
/usr/src/sys/ufs/ufs/ufs_dirhash.c: In function 'ufsdirhash_acquire':
/usr/src/sys/ufs/ufs/ufs_dirhash.c:222: error: 'lockmgr' undeclared (first use in this function)
/usr/src/sys/ufs/ufs/ufs_dirhash.c:248:17: error: macro "lockmgr" requires 4 arguments, but only 3 given
/usr/src/sys/ufs/ufs/ufs_dirhash.c: In function 'ufsdirhash_free':
/usr/src/sys/ufs/ufs/ufs_dirhash.c:247: error: 'lockmgr' undeclared (first use in this function)
/usr/src/sys/ufs/ufs/ufs_dirhash.c:385:39: error: macro "lockmgr" requires 4 arguments, but only 3 given
/usr/src/sys/ufs/ufs/ufs_dirhash.c: In function 'ufsdirhash_build':
/usr/src/sys/ufs/ufs/ufs_dirhash.c:385: error: 'lockmgr' undeclared (first use in this function)
cc1: warnings being treated as errors
/usr/src/sys/ufs/ufs/ufs_dirhash.c: In function 'ufsdirhash_free_locked':
/usr/src/sys/ufs/ufs/ufs_dirhash.c:403: warning: implicit declaration of function 'lockmgr_assert'
/usr/src/sys/ufs/ufs/ufs_dirhash.c:403: warning: nested extern declaration of 'lockmgr_assert'
/usr/src/sys/ufs/ufs/ufs_dirhash.c:403: error: 'KA_LOCKED' undeclared (first use in this function)
/usr/src/sys/ufs/ufs/ufs_dirhash.c:417:37: error: macro "lockmgr" requires 4 arguments, but only 3 given
/usr/src/sys/ufs/ufs/ufs_dirhash.c:417: error: 'lockmgr' undeclared (first use in this function)
/usr/src/sys/ufs/ufs/ufs_dirhash.c:418:35: error: macro "lockmgr" requires 4 arguments, but only 3 given
/usr/src/sys/ufs/ufs/ufs_dirhash.c:438:37: error: macro "lockmgr" requires 4 arguments, but only 3 given
/usr/src/sys/ufs/ufs/ufs_dirhash.c: In function 'ufsdirhash_lookup':
/usr/src/sys/ufs/ufs/ufs_dirhash.c:473: error: 'KA_LOCKED' undeclared (first use in this function)
/usr/src/sys/ufs/ufs/ufs_dirhash.c: In function 'ufsdirhash_findfree':
/usr/src/sys/ufs/ufs/ufs_dirhash.c:621: error: 'KA_LOCKED' undeclared (first use in this function)
/usr/src/sys/ufs/ufs/ufs_dirhash.c: In function 'ufsdirhash_enduseful':
/usr/src/sys/ufs/ufs/ufs_dirhash.c:692: error: 'KA_LOCKED' undeclared (first use in this function)
/usr/src/sys/ufs/ufs/ufs_dirhash.c: In function 'ufsdirhash_findslot':
/usr/src/sys/ufs/ufs/ufs_dirhash.c:1001: error: 'KA_LOCKED' undeclared (first use in this function)
/usr/src/sys/ufs/ufs/ufs_dirhash.c: In function 'ufsdirhash_delslot':
/usr/src/sys/ufs/ufs/ufs_dirhash.c:1025: error: 'KA_LOCKED' undeclared (first use in this function)
/usr/src/sys/ufs/ufs/ufs_dirhash.c:1101:59: error: macro "lockmgr" requires 4 arguments, but only 3 given
/usr/src/sys/ufs/ufs/ufs_dirhash.c: In function 'ufsdirhash_recycle':
/usr/src/sys/ufs/ufs/ufs_dirhash.c:1101: error: 'lockmgr' undeclared (first use in this function)
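
The repeated 'macro "lockmgr" requires 4 arguments, but only 3 given' message is what a compiler reports when a function-like macro is invoked with fewer arguments than it declares, which fits the patched call sites passing three arguments to a macro that takes four on 7.0. The class of error, and one way a compatibility shim can bridge an arity difference, can be shown with a self-contained toy. Everything here is hypothetical (demo_lockmgr, demo_lockmgr4, demo_lockmgr_new, demo_curthread and the flag values are invented for illustration) and it does not reproduce the real lockmgr interface. Since lockmgr_assert and KA_LOCKED are also reported missing, an arity shim alone would presumably not be enough for the actual backport.

```c
#include <assert.h>
#include <stddef.h>

/* Toy lock; not the kernel's struct lock. */
struct demo_lock {
    int   held;      /* nonzero while "locked" */
    void *last_td;   /* thread token passed on the last call */
};

#define DEMO_LK_RELEASE   0
#define DEMO_LK_EXCLUSIVE 1

/* Stand-in for an older interface that wants an explicit thread argument. */
int
demo_lockmgr4(struct demo_lock *lk, int flags, void *interlock, void *td)
{
    assert(lk != NULL);
    (void)interlock;             /* unused in this toy */
    lk->held = (flags == DEMO_LK_EXCLUSIVE);
    lk->last_td = td;
    return 0;
}

/* "Old tree" macro: four parameters, as the error message implies. */
#define demo_lockmgr(lk, flags, ilk, td) \
    demo_lockmgr4((lk), (flags), (ilk), (td))

/*
 * A call such as demo_lockmgr(&l, DEMO_LK_EXCLUSIVE, NULL) would fail to
 * preprocess with the same "requires 4 arguments, but only 3 given" error.
 */

/* Token standing in for "the current thread" in this toy. */
int demo_current_thread_token;
#define demo_curthread ((void *)&demo_current_thread_token)

/* Shim: accept three arguments and supply the thread argument implicitly. */
#define demo_lockmgr_new(lk, flags, ilk) \
    demo_lockmgr((lk), (flags), (ilk), demo_curthread)
```

The alternative to a shim is the approach Jeff describes for VOP_ISLOCKED() and vn_lock(): edit each call site to pass the extra argument by hand.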