I/O Scheduling results in poor responsiveness

Why is the command below all that is needed to bring the system to its knees? Why doesn't the io scheduler, CFQ, which is supposed to be all about fairness, keep it from starving other processes? For example, if I open a new file in vim and hold down "i" while this is running, the display of new "i"s pauses for seconds, sometimes until the dd write is completely finished. Likewise, applications such as firefox, thunderbird, xchat, and pidgin stop refreshing for 10+ seconds.

  dd if=/dev/zero of=test-file bs=2M count=2048

As I understand it, the main difference between running with and without oflag=direct is whether the io scheduler is used and whether the file is cached. I can see this clearly by watching "cached" rise without oflag=direct, stay the same with it, and go way down when I delete the file after running dd without oflag=direct.

The system in question is running Fedora 8. It is an E6600 with 4 GB of memory and 2x300 GB Seagate SATA drives. The drives are set up with md RAID 1, and the filesystem is ext3. But I also see this on plenty of other systems with more cpu, less cpu, less memory, raid, and no raid.

I have tried various tweaks to the sys.vm settings and tried changing the scheduler to as (anticipatory) or deadline. Nothing seems to get it to behave, other than oflag=direct.

Using dd if=/dev/zero is just an easy test case. I see this when copying large files, creating large files, and using virtualization software that does heavy i/o on large files.

The command below seems to result in cpu idle 0 and io wait 100%, as shown by "vmstat 1".

  dd if=/dev/zero of=test-file bs=2M count=2048

  2048+0 records in
  2048+0 records out
  4294967296 bytes (4.3 GB) copied, 94.7903 s, 45.3 MB/s

The command below seems to work much better for responsiveness. The cpu idle will be around 50, and the io wait will be around 50.

  dd if=/dev/zero of=test-file2 bs=2M count=2048 oflag=direct

  2048+0 records in
  2048+0 records out
  4294967296 bytes (4.3 GB) copied, 115.733 s, 37.1 MB/s

--
fedora-list mailing list
fedora-list@...
To unsubscribe: https://www.redhat.com/mailman/listinfo/fedora-list
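One way to see the difference between the buffered and the oflag=direct run is to watch how much dirty data the page cache is holding while dd runs. The following is only a sketch: the helper name dirty_kb is made up for illustration, and it assumes the Dirty and Writeback counters in /proc/meminfo are the ones of interest.

```shell
#!/bin/sh
# Print the Dirty and Writeback counters (in kB) from a meminfo-style file.
# Defaults to /proc/meminfo; another path can be passed in for testing.
# dirty_kb is a hypothetical helper name, not an existing tool.
dirty_kb() {
    awk '$1 == "Dirty:" || $1 == "Writeback:" { print $1, $2 }' "${1:-/proc/meminfo}"
}

# Example use (commented out so that sourcing this file has no side effects):
#   dd if=/dev/zero of=test-file bs=2M count=2048 &
#   while kill -0 $! 2>/dev/null; do dirty_kb; sleep 1; done
```

If the explanation of the two runs is right, the Dirty value should climb quickly during the buffered run and stay roughly flat when oflag=direct is used.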
|
|
Re: I/O Scheduling results in poor responsiveness

Nathan Grennan wrote:
> Why is the command below all that is needed to bring the system to it's knees? Why doesn't the io scheduler, CFQ, which is supposed to be all about fairness starve other processes? Example, if I open a new file in vim, and hold down "i" while this is running it will pause the display of new "i"s for seconds, sometimes until the dd write is completely finished. Another example is applications like firefox, thunderbird, xchat, and pidgin will stop refreshing for 10+ seconds.
>
> dd if=/dev/zero of=test-file bs=2M count=2048
>
> I understand the main difference between using oflag=direct or not relates to if the io scheduler is used, and if the file is cached or not. I can see this clearly by watching cached rise without oflag=direct, stay the same with it, and go way down when I delete the file after running dd without oflag=direct.
>
> The system in question is running Fedora 8. It is an E6600, 4gb memory, and 2x300gb Seagate sata drives. The drives are setup with md raid 1, and the filesystem is ext3. But I also see this with plenty of other systems with more cpu, less cpu, less memory, raid, and no raid.

Can you compare to systems with SCSI drives? I think this is telling you that your disk controller is eating all the CPU when the controller and DMA should be doing all the work.

--
Les Mikesell
lesmikesell@...
|
|
Re: I/O Scheduling results in poor responsiveness

Les Mikesell wrote:
> Can you compare to systems with SCSI drives? I think this is telling you that your disk controller is eating all the CPU when the controller and DMA should be doing all the work.

Are you saying you think that the controller isn't using DMA? Or do you think the controller or driver is just poorly written?

I will try a SCSI system, but the closest I have is a CentOS 4.6 machine, which uses a kernel based on 2.6.9. It is also a server, so I can't run firefox, thunderbird, xchat, or pidgin on it.
|
|
Re: I/O Scheduling results in poor responsiveness

Nathan Grennan wrote:
>> Can you compare to systems with SCSI drives? I think this is telling you that your disk controller is eating all the CPU when the controller and DMA should be doing all the work.
>
> Are you saying you think that the controller isn't using DMA? You think the controller or driver is just poorly written?

I'm not sure how to tell - but disk activity shouldn't take a lot of CPU other than tying it up in iowait if it doesn't have anything else to do.

> I will try a SCSI system, but the closest I have is a CentOS 4.6 machine, which using a kernel based on 2.6.9. It is also a server, so I can't run firefox, thunderbird, xchat, or pidgin on it.

There are 2 things that could be going wrong. One is that the driver is keeping the cpu too busy to do anything else. The other is that the system might need to page in some needed code or flush a work buffer before doing anything else (neither of which seems likely in your vi insert character example) and the disk heads are far away and busy with the writing. The latter case could be helped by putting the OS on a different drive from your data with large writes.

--
Les Mikesell
lesmikesell@...
|
|
Re: I/O Scheduling results in poor responsiveness

On Tue, Mar 04, 2008 at 11:37:31PM -0800, Nathan Grennan wrote:
> Why is the command below all that is needed to bring the system to it's knees? Why doesn't the io scheduler, CFQ, which is supposed to be all about fairness starve other processes? Example, if I open a new file in vim, and hold down "i" while this is running it will pause the display of new "i"s for seconds, sometimes until the dd write is completely finished. Another example is applications like firefox, thunderbird, xchat, and pidgin will stop refreshing for 10+ seconds.
>
> dd if=/dev/zero of=test-file bs=2M count=2048
>
> I understand the main difference between using oflag=direct or not relates to if the io scheduler is used, and if the file is cached or not. I can see this clearly by watching cached rise without oflag=direct, stay the same with it, and go way down when I delete the file after running dd without oflag=direct.
>
> The system in question is running Fedora 8. It is an E6600, 4gb memory, and 2x300gb Seagate sata drives. The drives are setup with md raid 1, and the filesystem is ext3. But I also see this with plenty of other systems with more cpu, less cpu, less memory, raid, and no raid.

What motherboard/chipset do you have? Which sata chipset?

Are you using ncq?

Did you try limiting the memory to 2G or even 1G?

Are you running 32bit or 64bit OS?

> I have tried various tweaks to sys.vm settings, tried changing the scheduler to as or deadline. Nothing seem to get it to behave, other than oflag=direct.

Did you also try noop?

> Using dd if=/dev/zero is just an easy test case. I see this when copying large files, creating large files, and using virtualization software that does heavy i/o on large files.
>
> The command below seems to result in cpu idle 0 and io wait 100%. As shown by "vmstat 1"

Maybe also try iostat; it may show you something more, or something important, in this case.

There are also some caching/flushing related vm parameters which might affect these things.

-- Pasi
|
|
Re: I/O Scheduling results in poor responsiveness

Nathan Grennan wrote:
> Why is the command below all that is needed to bring the system to it's knees? Why doesn't the io scheduler, CFQ, which is supposed to be all about fairness starve other processes? Example, if I open a new file in vim, and hold down "i" while this is running it will pause the display of new "i"s for seconds, sometimes until the dd write is completely finished. Another example is applications like firefox, thunderbird, xchat, and pidgin will stop refreshing for 10+ seconds.
>
> dd if=/dev/zero of=test-file bs=2M count=2048
>
> I understand the main difference between using oflag=direct or not relates to if the io scheduler is used, and if the file is cached or not. I can see this clearly by watching cached rise without oflag=direct, stay the same with it, and go way down when I delete the file after running dd without oflag=direct.
>
> The system in question is running Fedora 8. It is an E6600, 4gb memory, and 2x300gb Seagate sata drives. The drives are setup with md raid 1, and the filesystem is ext3. But I also see this with plenty of other systems with more cpu, less cpu, less memory, raid, and no raid.
>
> I have tried various tweaks to sys.vm settings, tried changing the scheduler to as or deadline. Nothing seem to get it to behave, other than oflag=direct.
>
> Using dd if=/dev/zero is just an easy test case. I see this when copying large files, creating large files, and using virtualization software that does heavy i/o on large files.
>
> The command below seems to result in cpu idle 0 and io wait 100%. As shown by "vmstat 1"
>
> dd if=/dev/zero of=test-file bs=2M count=2048
>
> 2048+0 records in
> 2048+0 records out
> 4294967296 bytes (4.3 GB) copied, 94.7903 s, 45.3 MB/s
>
> The command below seems to work much better for responsiveness. The cpu idle will be around 50, and the io wait will be around 50.
>
> dd if=/dev/zero of=test-file2 bs=2M count=2048 oflag=direct
>
> 2048+0 records in
> 2048+0 records out
> 4294967296 bytes (4.3 GB) copied, 115.733 s, 37.1 MB/s

CFQ is optimized for throughput, not latency. When you're doing dd without oflag=direct, you're dirtying memory faster than it can be written to disk, so pdflush will spawn up to 8 threads (giving it 8 threads' worth of CFQ time), which starves out vim's extremely frequent syncing of its session file.

The 8 threaded behavior of pdflush is a bit of a hack, and upstream is working on pageout improvements that should obviate it, but that work is still experimental. vim's behavior is a performance/robustness tradeoff, and is expected to be slow when the system is doing a lot of I/O.

As for your virtualization, this is why most virtualization software (including Xen and KVM) allows you to use a block device, such as a logical volume, to which it can do direct I/O, which takes pdflush out of the picture.

Ultimately, if latency is a high priority for you, you should switch to the deadline scheduler.

-- Chris
|
|
Re: I/O Scheduling results in poor responsiveness

Pasi Kärkkäinen wrote:
> What motherboard/chipset do you have? which sata chipset?

ASUS P5B Premium, ICH8R in AHCI mode.

> Are you using ncq?

Yes. I also tried turning it off.

> Did you try limiting the memory to 2G or even 1G ?

No, I haven't tried that one, though I had thought about it.

> Are you running 32bit or 64bit OS?

x86_64 across the board. I have thought of trying it on i686 systems, though the only ones I have running Fedora 8 are laptops.

> Did you also try noop?

No, but I thought about trying that.

> Maybe also try iostat.. maybe it shows you something more/important in this case.

I have looked at it, but not in detail.

> There are also some caching/flushing related vm parameters which might affect these things..

That is what I meant by sys.vm settings, but after some reading this morning I think I might have been configuring them in the wrong direction. In this case less may be more.
|
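To make the "less may be more" point about the vm writeback settings concrete: on a 4 GB machine each percentage point of vm.dirty_ratio is roughly 40 MB of dirty data, which is about one second of writing at the 45 MB/s reported earlier in the thread. The sketch below estimates that limit. The helper name dirty_limit_kb is made up for illustration, and it assumes total memory as the base, whereas the kernel works from "dirtyable" memory, so real thresholds are somewhat lower.

```shell
#!/bin/sh
# Rough upper bound (in kB) on dirty page cache for a given ratio.
# Assumption: the ratio is applied to total memory; the kernel actually
# uses "dirtyable" memory, so the real limit is somewhat lower.
# dirty_limit_kb is a hypothetical helper name.
dirty_limit_kb() {
    total_kb=$1
    ratio=$2
    echo $(( total_kb * ratio / 100 ))
}

# Show the current writeback settings (read-only, no root needed):
#   sysctl vm.dirty_ratio vm.dirty_background_ratio
# Lower them as root (example values only, not a recommendation):
#   sysctl -w vm.dirty_background_ratio=1
#   sysctl -w vm.dirty_ratio=5
```

For example, `dirty_limit_kb 4194304 10` gives the approximate limit for 4 GB at a ratio of 10 percent. Smaller ratios mean writeback starts earlier and a later sync has less queued data to wait behind.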
|
Re: I/O Scheduling results in poor responsiveness

On Tue, 04 Mar 2008 23:37:31 -0800 Nathan Grennan <fedora-list@...> wrote:
> Why is the command below all that is needed to bring the system to it's knees?

I see a lot of variation in this kind of thing. Copying a big backup file to my external USB hard drive made my system stutter a bit (though nothing like what you describe), but I have recently dd'ed up some disk images for xen, and the system worked perfectly well during that operation. This was on Fedora 8 x86_64 with SATA disks.
|
|
Re: I/O Scheduling results in poor responsiveness

On Wed, Mar 05, 2008 at 14:35:41 -0500, Chris Snook <csnook@...> wrote:
> Ultimately, if latency is a high priority for you, you should switch to the deadline scheduler.

Is there an easy way to do this? My desktop seems pretty sluggish since switching to rawhide and I suspect it is disk IO related. I'd like to try out another scheduler and see if it helps.
|
|
Re: I/O Scheduling results in poor responsiveness

Bruno Wolff III wrote:
> On Wed, Mar 05, 2008 at 14:35:41 -0500, Chris Snook <csnook@...> wrote:
>> Ultimately, if latency is a high priority for you, you should switch to the deadline scheduler.
>
> Is there an easy way to do this? My desktop seems pretty sluggish since switching to rawhide and I suspect it is disk IO related. I'd like to try out another scheduler and see if it helps.

At the grub screen, hit 'a' to append kernel arguments, and add elevator=deadline to the list of parameters. If you like the results, you can add it permanently by editing /boot/grub/grub.conf.

It's also possible that your sluggish rawhide performance is due to all the extra debug options that are turned on in the rawhide kernel. I've seen overhead as high as 30% on some workloads. There's been some discussion of adding a 'nodebug' kernel variant to rawhide that's compiled with roughly the same options as the stable Fedora kernel, but I don't know when or if that's going to happen.

-- Chris
|
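Besides the elevator= boot parameter, 2.6 kernels also expose the scheduler per device in sysfs, so it can be inspected and changed without a reboot. The sketch below assumes the disk is sda, which is only an example, and the helper name active_scheduler is made up. The sysfs file lists the available schedulers with the active one in square brackets.

```shell
#!/bin/sh
# Print the active I/O scheduler from a sysfs scheduler file, where the
# active entry is shown in square brackets, for example:
#   noop anticipatory deadline [cfq]
# active_scheduler is a hypothetical helper name.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$1"
}

# Example use (sda is an assumed device name):
#   active_scheduler /sys/block/sda/queue/scheduler
# Switch at runtime, as root; the change is lost on reboot:
#   echo deadline > /sys/block/sda/queue/scheduler
```

A runtime switch like this applies to one device only, so on an md RAID 1 pair it would need to be repeated for each member disk.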
|
Re: I/O Scheduling results in poor responsiveness

Nathan Grennan wrote:
> Why is the command below all that is needed to bring the system to it's knees? Why doesn't the io scheduler, CFQ, which is supposed to be all about fairness starve other processes? Example, if I open a new file in vim, and hold down "i" while this is running it will pause the display of new "i"s for seconds, sometimes until the dd write is completely finished. Another example is applications like firefox, thunderbird, xchat, and pidgin will stop refreshing for 10+ seconds.
>
> dd if=/dev/zero of=test-file bs=2M count=2048
>
> I understand the main difference between using oflag=direct or not relates to if the io scheduler is used, and if the file is cached or not. I can see this clearly by watching cached rise without oflag=direct, stay the same with it, and go way down when I delete the file after running dd without oflag=direct.
>
> The system in question is running Fedora 8. It is an E6600, 4gb memory, and 2x300gb Seagate sata drives. The drives are setup with md raid 1, and the filesystem is ext3. But I also see this with plenty of other systems with more cpu, less cpu, less memory, raid, and no raid.
>
> I have tried various tweaks to sys.vm settings, tried changing the scheduler to as or deadline. Nothing seem to get it to behave, other than oflag=direct.

Known problem with the io schedulers, and discussed from time to time on the RAID list. The current io schedulers don't split drive access fairly between read and write, so when a huge batch of writes gets queued, reads suffer. In your case, the vi problem may be an issue of doing a write to the file and that write being at the end of the io queue.

Note: the optimization is for throughput, not responsiveness, so you may see more pleasing results with the deadline scheduler. You may also want to look at using NCQ and setting the queue_depth in /sys. I can't explain it without looking up the details, so there's something for you to check.

--
Bill Davidsen <davidsen@...>
"We have more to fear from the bungling of the incompetent than from the machinations of the wicked." - from Slashdot
|
|
Re: I/O Scheduling results in poor responsiveness

Chris Snook wrote:
> At the grub screen, hit 'a' to append kernel arguments, and add elevator=deadline to the list of parameters. If you like the results, you can add it permanently by editing /boot/grub/grub.conf.
>
> It's also possible that your sluggish rawhide performance is due to all the extra debug options that are turned on in the rawhide kernel. I've seen overhead as high as 30% on some workloads. There's been some discussion of adding a 'nodebug' kernel variant to rawhide that's compiled with roughly the same options as the stable Fedora kernel, but I don't know when or if that's going to happen.

I figured it all out. The main issue was my use of a Firefox 3.0 nightly. From Firefox 3.0b3 on, Firefox is very fsync happy: every time you load a page it calls fsync about eight times. Do things in the middle of a big write and performance goes all to hell. I have filed a bug upstream, https://bugzilla.mozilla.org/show_bug.cgi?id=421482 .

I ran across this idea by reading http://kerneltrap.org/node/14148 . The first e-mail from Ingo, a reply to one of his own earlier e-mails, mentions a case a lot like mine: quad-core, 4 GB of RAM, and 30 second pauses in vim. He mentions that vim uses fsync. He mentions an option, but it isn't good enough. You also have to change swapsync, like below.

I then straced vim and found that any hiccups in the output were directly related to when vim ran fsync. I set the options, and the problem seemed to go away.

Finally I turned to the Firefox 3.0 nightly and straced it. I found the problem. I then went back to Firefox 2.0.0.12 and straced it, and found it didn't have the same problem. So as nice as Firefox 3.0b3 or later is, it is a recipe for unhappiness.

  set swapsync=sync
  set nofsync

But this is just a symptom of a bigger problem. As the Kernel Trap URL above mentions, ext3 + fsync = crappiness. So my next step will be to talk to the right developers, learn as much as possible, and see if a solution can be found. Otherwise I may completely give up on ext3 and move to another filesystem.
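The strace step described above can be narrowed to just the sync calls and their durations. This is only a sketch of how one might do it, not necessarily what was run in the thread. The filter name slow_fsyncs and the 0.5 second default threshold are assumptions; the strace options used (-f, -T, -e trace=, -p) are standard ones.

```shell
#!/bin/sh
# Read `strace -T` output on stdin and print fsync/fdatasync calls that
# took longer than a threshold in seconds (default 0.5, an arbitrary choice).
# slow_fsyncs is a hypothetical helper name.
slow_fsyncs() {
    awk -v limit="${1:-0.5}" '
        /fsync\(|fdatasync\(/ {
            # strace -T appends the time spent in the call as <seconds>
            if (match($0, /<[0-9.]+>$/)) {
                t = substr($0, RSTART + 1, RLENGTH - 2)
                if (t + 0 > limit + 0) print t, $0
            }
        }'
}

# Example use (PID is a placeholder for the vim or firefox process id;
# strace writes to stderr, hence the redirection):
#   strace -f -T -e trace=fsync,fdatasync -p PID 2>&1 | slow_fsyncs 0.5
```

Running it once while the machine is idle and once during a large buffered dd should show whether the pauses line up with long fsync calls.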
|
|
Re: I/O Scheduling results in poor responsiveness

On Thu, Mar 06, 2008 at 11:18:54 -0500, Chris Snook <csnook@...> wrote:
> At the grub screen, hit 'a' to append kernel arguments, and add elevator=deadline to the list of parameters. If you like the results, you can add it permanently by editing /boot/grub/grub.conf.

I gave this a try and my perception is that it helped a bit when switching windows under high disk IO. It didn't seem to help much with programs taking a long time to exit. I suspect this might be due to lots of dirty pages stacking up and then a sync occurring. I'll try playing with the value for how long they sit around and see if that helps some more.

Thanks for the help with setting the scheduler.
|
|
Re: I/O Scheduling results in poor responsiveness

Nathan Grennan wrote:
> I figured it all out. The main issue was my use of a Firefox 3.0 nightly. From Firefox 3.0b3 on Firefox is very fsync happy. As in ever time you load a page it fsync about eight times. Do things in the middle of a big write and performance goes all to hell. I have filed a bug upstream, https://bugzilla.mozilla.org/show_bug.cgi?id=421482 .
>
> I ran across this idea by reading, http://kerneltrap.org/node/14148 . The first e-mail from Ingo as a reply to another one of his earlier e-mails mentions a case a lot like mine. Quad-Core, 4gb ram, and 30 second pauses in vim. He mentions vim uses fsync. He mentions an option, but it isn't good enough. You have to also change set swapsync, like below.
>
> I then straced vim and found any hiccups in the output were directly related to when vim ran fsync. I set the options, and the problem seemed to go away.
>
> Finally I turned to Firefox 3.0 nightly, and straced it. I found the problem. I then went back to Firefox 2.0.0.12, and straced it. I found it didn't have the same problem. So as nice as Firefox 3.0b3 or later is, it is a recipe for unhappiness.
>
> set swapsync=sync
> set nofsync
>
> But this is just a symptom of a bigger problem. As the Kernel Trap url above mentions, ext3 + fsync = crappiness. So my next step will be to talk to the right developers, learn as much as possible, and see if a solution can be found. Otherwise I may completely give up on ext3 and move to another filesystem.

use of sqlite. There was already another bug about poor zoom performance which relates to sqlite: https://bugzilla.mozilla.org/show_bug.cgi?id=417732
|
|
Re: I/O Scheduling results in poor responsiveness

On Thu, Mar 06, 2008 at 11:36:49AM -0500, Bill Davidsen wrote:
> Nathan Grennan wrote:
>> Why is the command below all that is needed to bring the system to it's knees? Why doesn't the io scheduler, CFQ, which is supposed to be all about fairness starve other processes? Example, if I open a new file in vim, and hold down "i" while this is running it will pause the display of new "i"s for seconds, sometimes until the dd write is completely finished. Another example is applications like firefox, thunderbird, xchat, and pidgin will stop refreshing for 10+ seconds.
>>
>> dd if=/dev/zero of=test-file bs=2M count=2048
>>
>> I understand the main difference between using oflag=direct or not relates to if the io scheduler is used, and if the file is cached or not. I can see this clearly by watching cached rise without oflag=direct, stay the same with it, and go way down when I delete the file after running dd without oflag=direct.
>>
>> The system in question is running Fedora 8. It is an E6600, 4gb memory, and 2x300gb Seagate sata drives. The drives are setup with md raid 1, and the filesystem is ext3. But I also see this with plenty of other systems with more cpu, less cpu, less memory, raid, and no raid.
>>
>> I have tried various tweaks to sys.vm settings, tried changing the scheduler to as or deadline. Nothing seem to get it to behave, other than oflag=direct.
>
> Known problem with the io schedulers, and discussed from time to time on the RAID list. The current io schedulers don't split drive access fairly between read and write, so when you get a huge batch of write queued reads suffer. In your case, the vi problem may be an issue of doing a write to the file and that write being at the end of the io queue.
>
> Note: the optimization is for throughput, not responsiveness, you may see more pleasing results with the deadline scheduler. You also may want to look at using NCQ and setting the queue_depth in /sys. I can't explain it without looking up the details, so there's something for you to check.

Hi!

Do you happen to know if it's possible to check the current queue depth "in use", meaning how many commands are currently queued?

-- Pasi
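On the question of how many requests are currently queued: one place to look, offered here as a sketch rather than a definitive answer, is the block-layer statistics file in sysfs, whose ninth field counts I/Os currently in flight for the device. The configured NCQ limit sits separately in the device's queue_depth file. The device name sda and the helper name in_flight are assumptions.

```shell
#!/bin/sh
# Print the number of I/Os currently in flight, which is the ninth
# whitespace-separated field of a block device "stat" file in sysfs.
# in_flight is a hypothetical helper name.
in_flight() {
    awk '{ print $9 }' "$1"
}

# Example use (sda is an assumed device name):
#   in_flight /sys/block/sda/stat
# Configured NCQ queue depth for the same disk:
#   cat /sys/block/sda/device/queue_depth
# Reduce it to 1 as root, which effectively disables NCQ reordering:
#   echo 1 > /sys/block/sda/device/queue_depth
```

Note that the in-flight count is a block-layer figure covering requests handed to the driver, so it may not match exactly what the drive itself has queued.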