Testing needed - HAMMER PFS exports via NFS

It should now be possible, in HEAD only, to export nullfs mounts of
HAMMER PFSs. This needs testing.

-Matt
Matthew Dillon <dillon@...>

Re: Testing needed - HAMMER PFS exports via NFS

:Is null mount meant to be a permanent solution for PFS nfs export?

Yes.

:Testing:
:Both nfs server and client running dfly HEAD;
:3 PFSs nfs exported: /usr/src, /usr/obj and my home dir;
:did some world/kernel building with above nfs mounts.
:
:Errors:
: - on nfs client (GENERIC) - buildkernel did abort
:Sep 22 13:15:01 boy kernel: nfs_getpages: error 13
:Sep 22 13:15:01 boy kernel: vm_fault: pager read error, pid 75644 (ld)
:Sep 22 13:15:01 boy kernel: pid 75644 (ld), uid 0: exited on signal 11

Error 13? That's 'permission denied'. It sounds like cockpit trouble.
Is the server exporting the filesystem with -maproot=root: ? If not
then there could be issues because the getpages/putpages code might run
with root creds and if you do not export with -maproot=root: the root
creds won't match. There might be nfs mount options to force the creds,
I'm not sure. The best solution is typically to export on the server
with '-maproot=root:'.

: - on SMP nfs server (from 21:19 also server) - didn´t notice any ill effect
:Sep 22 18:24:04 boy kernel: spin_lock: 0xc6087fdc, indefinite wait!
:Sep 22 19:08:10 boy kernel: spin_lock: 0xc6087fdc, indefinite wait!
:Sep 22 21:37:18 boy kernel: spin_lock: 0xc6087fdc, indefinite wait!
:Sep 22 22:03:27 boy kernel: spin_lock: 0xc6087fdc, indefinite wait!
:
: -thomas

Hmm. Well, since they didn't deadlock the kernel I'm guessing it was
just a very long-held spin lock that eventually cleared. We should try
to track it down if possible, though. Don't worry about it for now.

-Matt
Matthew Dillon <dillon@...>

Re: Testing needed - HAMMER PFS exports via NFS

On Tue, Sep 23, 2008 at 12:31:52AM +0200, Thomas Nikolajsen wrote:
>  - on SMP nfs server (from 21:19 also server) - didn´t notice any ill effect
> Sep 22 18:24:04 boy kernel: spin_lock: 0xc6087fdc, indefinite wait!
> Sep 22 19:08:10 boy kernel: spin_lock: 0xc6087fdc, indefinite wait!
> Sep 22 21:37:18 boy kernel: spin_lock: 0xc6087fdc, indefinite wait!
> Sep 22 22:03:27 boy kernel: spin_lock: 0xc6087fdc, indefinite wait!

Does it cease if you set kern.intr_mpsafe to 0? I started seeing it,
especially frequently on DragonFly as a VMWare guest, after it was
changed to 1 by default.

Re: Testing needed - HAMMER PFS exports via NFS

On Wed, Sep 24, 2008 at 9:05 PM, YONETANI Tomokazu <qhwt+dfly@...> wrote:
> On Tue, Sep 23, 2008 at 12:31:52AM +0200, Thomas Nikolajsen wrote:
>>  - on SMP nfs server (from 21:19 also server) - didn´t notice any ill effect
>> Sep 22 18:24:04 boy kernel: spin_lock: 0xc6087fdc, indefinite wait!
>> Sep 22 19:08:10 boy kernel: spin_lock: 0xc6087fdc, indefinite wait!
>> Sep 22 21:37:18 boy kernel: spin_lock: 0xc6087fdc, indefinite wait!
>> Sep 22 22:03:27 boy kernel: spin_lock: 0xc6087fdc, indefinite wait!
>
> Does it cease if you set kern.intr_mpsafe to 0? I started seeing it,
> especially frequently on DragonFly as a VMWare guest, after it was
> changed to 1 by default.
>

I don't remember when I began to see it (I set kern.intr_mpsafe to 1
around May 2007, and I didn't see it at that time).

The spin lock indefinite wait message can be triggered on UFS by:
make installkernel && shutdown -r now

Best Regards,
sephe

--
Live Free or Die

Re: Testing needed - HAMMER PFS exports via NFS

On Wed, Sep 24, 2008 at 9:05 PM, YONETANI Tomokazu <qhwt+dfly@...> wrote:
> On Tue, Sep 23, 2008 at 12:31:52AM +0200, Thomas Nikolajsen wrote:
>>  - on SMP nfs server (from 21:19 also server) - didn´t notice any ill effect
>> Sep 22 18:24:04 boy kernel: spin_lock: 0xc6087fdc, indefinite wait!
>> Sep 22 19:08:10 boy kernel: spin_lock: 0xc6087fdc, indefinite wait!
>> Sep 22 21:37:18 boy kernel: spin_lock: 0xc6087fdc, indefinite wait!
>> Sep 22 22:03:27 boy kernel: spin_lock: 0xc6087fdc, indefinite wait!
>
> Does it cease if you set kern.intr_mpsafe to 0? I started seeing it,
> especially frequently on DragonFly as a VMWare guest, after it was
> changed to 1 by default.

Could you test the following patch on HEAD?
http://leaf.dragonflybsd.org/~sephe/nata_nompsafe.diff

It works for me at least :)

Best Regards,
sephe

--
Live Free or Die

Re: Testing needed - HAMMER PFS exports via NFS

On Fri, Sep 26, 2008 at 11:02:01PM +0800, Sepherosa Ziehau wrote:
> On Wed, Sep 24, 2008 at 9:05 PM, YONETANI Tomokazu <qhwt+dfly@...> wrote:
> > On Tue, Sep 23, 2008 at 12:31:52AM +0200, Thomas Nikolajsen wrote:
> >>  - on SMP nfs server (from 21:19 also server) - didn´t notice any ill effect
> >> Sep 22 18:24:04 boy kernel: spin_lock: 0xc6087fdc, indefinite wait!
> >> Sep 22 19:08:10 boy kernel: spin_lock: 0xc6087fdc, indefinite wait!
> >> Sep 22 21:37:18 boy kernel: spin_lock: 0xc6087fdc, indefinite wait!
> >> Sep 22 22:03:27 boy kernel: spin_lock: 0xc6087fdc, indefinite wait!
> >
> > Does it cease if you set kern.intr_mpsafe to 0? I started seeing it,
> > especially frequently on DragonFly as a VMWare guest, after it was
> > changed to 1 by default.
>
> Could you test the following patch on HEAD?
> http://leaf.dragonflybsd.org/~sephe/nata_nompsafe.diff
>
> It works for me at least :)

Hey, you've already committed it :)
Yes, it works for me, too.

I also instrumented the nata driver (using sys_cputimer->count()) and
found that

- the `indefinite wait!' message is displayed when spin_lock_wr() is
  called from ata_start() to lock state_mtx
- ata_interrupt takes too long while it's holding state_mtx when it
  calls ch->hw.end_transaction. Digging further, it's callout_stop(),
  right below the `end_finished' label in ata_end_transaction(), that
  is taking a long time (> sys_cputimer->freq)

I'll try diving into callout_stop() later.

Cheers.
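
For readers who want to reproduce this kind of measurement, the check described above (sample a free-running counter before and after the call, then compare the difference with the timer frequency) might look roughly like the sketch below. It is a user-space illustration only: the function names and the 32-bit counter width are assumptions, not the actual sys_cputimer interface.

```c
#include <stdint.h>

/*
 * Ticks elapsed between two samples of a free-running counter.
 * Unsigned subtraction gives the right answer even if the counter
 * wrapped around once between the two samples.
 * (Hypothetical helper; the 32-bit width is an assumption.)
 */
static uint32_t ticks_elapsed(uint32_t start, uint32_t end)
{
    return (uint32_t)(end - start);
}

/*
 * Nonzero when the interval lasted longer than one second, i.e. when
 * more than `freq` ticks passed, where `freq` is ticks per second.
 */
static int took_over_one_second(uint32_t freq, uint32_t start, uint32_t end)
{
    return ticks_elapsed(start, end) > freq;
}
```

With a counter running at `freq` ticks per second, a delta larger than `freq` corresponds to the "> sys_cputimer->freq" condition mentioned in the message.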

Re: Testing needed - HAMMER PFS exports via NFS

On Sun, Sep 28, 2008 at 01:59:38AM +0900, YONETANI Tomokazu wrote:
> I also tweaked nata driver(using sys_cputimer->count()) and found that
>
> - `indefinite wait!' message is displayed when spin_lock_wr() is called
>   from ata_start(), to lock state_mtx
> - ata_interrupt takes too long while it's holding state_mtx when it calls
>   ch->hw.end_transaction. digging further, it's callout_stop() right
>   below the `end_finished' label in ata_end_transaction() that is taking
>   long time (> sys_cputimer->freq)
>
> I'll try diving into callout_stop() later.

It turned out that the while loop in lwkt_wait_ipiq() performs a very
large number of iterations (18000000~) in such a case. It's called from
callout_stop() when the caller's CPU isn't the same as that of the
caller of callout_reset(), in which case callout_stop() must issue a
synchronous IPI.

lwkt_wait_ipiq():
        :
        cpu_enable_intr();
        while ((int)(ip->ip_xindex - seq) < 0) {
            crit_enter();
            lwkt_process_ipiq();
            crit_exit();
            :
        }

Since the only caller of lwkt_wait_ipiq() is callout_stop(), I don't
know if the number is extreme or not, but usually the number of
iterations in this loop stays very low, and it doesn't seem to reach
5000 during `make -j2 buildworld' on my Athlon64x2 box. The number
depends on the hardware, of course, but even on VMWare, the upper bound
of the iterations in the normal case is a hundred times lower than in
the abnormal case.

Assuming that the values returned by sys_cputimer->count() are not
skewed, it takes about one second (that is, slightly above
sys_cputimer->freq) to return from lwkt_wait_ipiq() when the
`indefinite wait!' message is shown.
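
The loop condition quoted above, `(int)(ip->ip_xindex - seq) < 0`, is the usual wraparound-tolerant way to ask whether one sequence counter is still behind another. A minimal stand-alone sketch of that comparison follows. The 32-bit unsigned types and the helper names are assumptions made for illustration, and the "remote CPU" is simulated by simply incrementing the index; this is not the kernel code.

```c
#include <stdint.h>

/*
 * Nonzero while the processed index `xindex` has not yet caught up with
 * the target sequence number `seq`. Subtracting first and then looking
 * at the sign keeps the test correct when the counters wrap around.
 * (The conversion to int32_t relies on two's-complement behaviour,
 * which holds on the usual compilers and platforms.)
 */
static int ipi_still_pending(uint32_t xindex, uint32_t seq)
{
    return (int32_t)(xindex - seq) < 0;
}

/*
 * Simulated wait loop: pretend the target CPU advances `xindex` by one
 * per poll and count how many polls are needed to reach `seq`.
 */
static unsigned polls_until_done(uint32_t xindex, uint32_t seq)
{
    unsigned polls = 0;

    while (ipi_still_pending(xindex, seq)) {
        xindex++;           /* stand-in for the remote CPU's progress */
        polls++;
    }
    return polls;
}
```

In the real loop the index only advances when the target CPU processes its IPI queue, so the iteration count is a direct measure of how long that CPU took to respond.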

Re: Testing needed - HAMMER PFS exports via NFS

:...
:> - `indefinite wait!' message is displayed when spin_lock_wr() is called
:>   from ata_start(), to lock state_mtx
:> - ata_interrupt takes too long while it's holding state_mtx when it calls
:>   ch->hw.end_transaction. digging further, it's callout_stop() right
:>   below the `end_finished' label in ata_end_transaction() that is taking
:>   long time (> sys_cputimer->freq)
:>
:> I'll try diving into callout_stop() later.
:
:It turned out that the while loop in lwkt_wait_ipiq() performs so many
:iterations (18000000~) in such a case. It's called from callout_stop()
:when the caller's CPU isn't the same as that of the caller of callout_reset(),
:in which case callout_stop() must issue a synchronous IPI.
:
:lwkt_wait_ipiq():
:        :
:        cpu_enable_intr();
:        while ((int)(ip->ip_xindex - seq) < 0) {
:            crit_enter();
:            lwkt_process_ipiq();
:            crit_exit();
:            :
:        }
:
:Since the only caller of lwkt_wait_ipiq() is callout_stop(), I don't know
:if the number is extreme or not, but usually the number of iterations in
:this loop stays very low, and it doesn't seem to reach 5000 during
:`make -j2 buildworld' on my Athlon64x2 box. The number depends on the
:hardware, of course, but even on VMWare, the upper bound of the iterations
:in the normal case is hundred times lower than the abnormal case.
:Assuming that the values returned by sys_cputimer->count() are not skewed,
:it takes about one second (that is, slightly above sys_cputimer->freq)
:to return from lwkt_wait_ipiq(), when `indefinite wait!' message is shown.

FreeBSD has been struggling with callout races for the last year. I
did callout_stop() that way because I wanted to ensure that upon the
return from callout_stop(), the callout is guaranteed to have been
removed and its dispatch function is guaranteed to have completed if it
was running.

We can probably use spinlocks there to some degree. Most callouts occur
on the same cpu (because the protocol threads are already localized),
so we should not have to worry too much about the performance of the
occasional cross-cpu callout.

If you would like to change it, I suggest using a spinlock to protect
the structure and then something else... perhaps even allow a blocking
condition, if the target callout is in the middle of running its
callout function and we have to wait for it to return before we can
return ourselves. (Currently callout_stop() is considered to be
non-blocking, so some code auditing may be needed. The synchronous IPI
is just as dangerous, though.)

-Matt
Matthew Dillon <dillon@...>
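
To make the suggestion above concrete, here is one possible shape for "a spinlock to protect the structure, plus a wait if the function is running", written as a user-space sketch with C11 atomics. Everything in it is hypothetical: `struct sketch_callout`, the `sc_*` functions and the busy-wait in `sc_stop()` are illustrative stand-ins, not DragonFly's callout implementation, and a kernel would block or yield where the sketch spins.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callout record; not the kernel's struct callout. */
struct sketch_callout {
    atomic_flag lock;       /* spinlock protecting pending/func/arg */
    bool pending;           /* armed but not yet dispatched */
    atomic_bool running;    /* dispatch function currently executing */
    void (*func)(void *);
    void *arg;
};

static void sc_lock(struct sketch_callout *c)
{
    while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
        ;                   /* spin until the flag was previously clear */
}

static void sc_unlock(struct sketch_callout *c)
{
    atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

static void sc_init(struct sketch_callout *c)
{
    atomic_flag_clear(&c->lock);
    atomic_init(&c->running, false);
    c->pending = false;
    c->func = NULL;
    c->arg = NULL;
}

/* Arm the callout with a function and argument. */
static void sc_reset(struct sketch_callout *c, void (*func)(void *), void *arg)
{
    assert(func != NULL);
    sc_lock(c);
    c->func = func;
    c->arg = arg;
    c->pending = true;
    sc_unlock(c);
}

/* Timer side: run the function if the callout is still armed. */
static void sc_dispatch(struct sketch_callout *c)
{
    void (*func)(void *);
    void *arg;

    sc_lock(c);
    if (!c->pending) {
        sc_unlock(c);
        return;
    }
    c->pending = false;
    func = c->func;
    arg = c->arg;
    atomic_store(&c->running, true);    /* set while still holding the lock */
    sc_unlock(c);

    func(arg);                          /* run without holding the spinlock */
    atomic_store(&c->running, false);
}

/*
 * Cancel the callout. Returns true if it was still pending. On return
 * the function is not running. Must not be called from the callout
 * function itself, or the wait below never ends.
 */
static bool sc_stop(struct sketch_callout *c)
{
    bool was_pending;

    sc_lock(c);
    was_pending = c->pending;
    c->pending = false;
    sc_unlock(c);

    while (atomic_load(&c->running))
        ;                   /* a kernel would block or yield here */
    return was_pending;
}

/* Example callback for trying the sketch out: counts invocations. */
static void sc_count_hit(void *arg)
{
    ++*(int *)arg;
}
```

Because `running` is set while the spinlock is still held, a concurrent `sc_stop()` either cancels the callout before it is dispatched or observes `running` and waits for the function to return. That is the guarantee described above, obtained without a synchronous IPI, at the cost of a stop call that may now wait.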