Getattr called on every file following a readdir

Getattr called on every file following a readdir

by Pete Wyckoff-2

Hi,

For some reason, FUSE (2.7.3 on Linux Kernel 2.6.12) is calling getattr for
every file after a readdir which gave it the same information using the
filler function and stat structures.

It does not do this for sub-directories, just files in the directory.

It's killing me for directories with 10s of thousands of files.

Is there some option to set or do I need a newer Kernel or maybe I'm not
supplying the right info in readdir?

Thanks, pete


-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
_______________________________________________
fuse-devel mailing list
fuse-devel@...
https://lists.sourceforge.net/lists/listinfo/fuse-devel

Re: Getattr called on every file following a readdir

by Nikolaus Rath

Pete Wyckoff <pwyckoff@...> writes:
> Hi,
>
> For some reason, FUSE (2.7.3 on Linux Kernel 2.6.12) is calling getattr for
> every file after a readdir which gave it the same information using the
> filler function and stat structures.

I cannot reproduce this here with fuse 2.7.2 and kernel 2.6.24.

Could you provide a simple sample filesystem that exhibits this
behaviour, or try to reproduce it with the attached fs?

Best,

   -Nikolaus

--
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
                                                         -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

#!/usr/bin/env python

#    Copyright (C) 2001  Jeff Epler  <jepler@...>
#    Copyright (C) 2006  Csaba Henk  <csaba.henk@...>
#
#    This program can be distributed under the terms of the GNU LGPL.
#    See the file COPYING.
#

import os, sys
from errno import *
from stat import *
import fcntl
# pull in some spaghetti to make this stuff work without fuse-py being installed
try:
    import _find_fuse_parts
except ImportError:
    pass
import fuse
from fuse import Fuse
import time

if not hasattr(fuse, '__version__'):
    raise RuntimeError, \
        "your fuse-py doesn't know of fuse.__version__, probably it's too old."

fuse.fuse_python_api = (0, 2)

fuse.feature_assert('stateful_files', 'has_init')


def flag2mode(flags):
    md = {os.O_RDONLY: 'r', os.O_WRONLY: 'w', os.O_RDWR: 'w+'}
    m = md[flags & (os.O_RDONLY | os.O_WRONLY | os.O_RDWR)]

    # Test whether the O_APPEND bit is set; `flags | os.O_APPEND` is always true.
    if flags & os.O_APPEND:
        m = m.replace('w', 'a', 1)

    return m


class Xmp(Fuse):

    def __init__(self, *args, **kw):

        Fuse.__init__(self, *args, **kw)

        # do stuff to set up your filesystem here, if you want
        #import thread
        #thread.start_new_thread(self.mythread, ())
        self.root = '/'

#    def mythread(self):
#
#        """
#        The beauty of the FUSE python implementation is that with the python interp
#        running in foreground, you can have threads
#        """
#        print "mythread: started"
#        while 1:
#            time.sleep(120)
#            print "mythread: ticking"

    def getattr(self, path):
        print "getattr for", path, "at", time.time()
        return os.lstat("." + path)

    def readlink(self, path):
        return os.readlink("." + path)

    def readdir(self, path, offset):
        for e in os.listdir("." + path):
            yield fuse.Direntry(e)

    def unlink(self, path):
        os.unlink("." + path)

    def rmdir(self, path):
        os.rmdir("." + path)

    def symlink(self, path, path1):
        os.symlink(path, "." + path1)

    def rename(self, path, path1):
        os.rename("." + path, "." + path1)

    def link(self, path, path1):
        print "link"
        os.link("." + path, "." + path1)

    def chmod(self, path, mode):
        os.chmod("." + path, mode)

    def chown(self, path, user, group):
        os.chown("." + path, user, group)

    def truncate(self, path, len):
        f = open("." + path, "a")
        f.truncate(len)
        f.close()

    def mknod(self, path, mode, dev):
        os.mknod("." + path, mode, dev)

    def mkdir(self, path, mode):
        os.mkdir("." + path, mode)

    def utime(self, path, times):
        os.utime("." + path, times)

#    The following utimens method would do the same as the above utime method.
#    We can't make it better though as the Python stdlib doesn't know of
#    subsecond precision in access/modify times.
#
#    def utimens(self, path, ts_acc, ts_mod):
#      os.utime("." + path, (ts_acc.tv_sec, ts_mod.tv_sec))

    def access(self, path, mode):
        if not os.access("." + path, mode):
            return -EACCES

#    This is how we could add stub extended attribute handlers...
#    (We can't have ones which aptly delegate requests to the underlying fs
#    because Python lacks a standard xattr interface.)
#
#    def getxattr(self, path, name, size):
#        val = name.swapcase() + '@' + path
#        if size == 0:
#            # We are asked for size of the value.
#            return len(val)
#        return val
#
#    def listxattr(self, path, size):
#        # We use the "user" namespace to please XFS utils
#        aa = ["user." + a for a in ("foo", "bar")]
#        if size == 0:
#            # We are asked for size of the attr list, ie. joint size of attrs
#            # plus null separators.
#            return len("".join(aa)) + len(aa)
#        return aa

    def statfs(self):
        """
        Should return an object with statvfs attributes (f_bsize, f_frsize...).
        Eg., the return value of os.statvfs() is such a thing (since py 2.2).
        If you are not reusing an existing statvfs object, start with
        fuse.StatVFS(), and define the attributes.

        To provide usable information (i.e., if you want sensible df(1)
        output), you are advised to specify the following attributes:

            - f_bsize - preferred size of file blocks, in bytes
            - f_frsize - fundamental size of file blocks, in bytes
                [if you have no idea, use the same as blocksize]
            - f_blocks - total number of blocks in the filesystem
            - f_bfree - number of free blocks
            - f_files - total number of file inodes
            - f_ffree - number of free file inodes
        """

        return os.statvfs(".")

    def fsinit(self):
        os.chdir(self.root)

    class XmpFile(object):

        def __init__(self, path, flags, *mode):
            self.file = os.fdopen(os.open("." + path, flags, *mode),
                                  flag2mode(flags))
            self.fd = self.file.fileno()

        def read(self, length, offset):
            self.file.seek(offset)
            return self.file.read(length)

        def write(self, buf, offset):
            self.file.seek(offset)
            self.file.write(buf)
            return len(buf)

        def release(self, flags):
            self.file.close()

        def _fflush(self):
            if 'w' in self.file.mode or 'a' in self.file.mode:
                self.file.flush()

        def fsync(self, isfsyncfile):
            self._fflush()
            if isfsyncfile and hasattr(os, 'fdatasync'):
                os.fdatasync(self.fd)
            else:
                os.fsync(self.fd)

        def flush(self):
            self._fflush()
            # cf. xmp_flush() in fusexmp_fh.c
            os.close(os.dup(self.fd))

        def fgetattr(self):
            return os.fstat(self.fd)

        def ftruncate(self, len):
            self.file.truncate(len)

        def lock(self, cmd, owner, **kw):
            # The code here is much rather just a demonstration of the locking
            # API than something which actually was seen to be useful.

            # Advisory file locking is pretty messy in Unix, and the Python
            # interface to this doesn't make it better.
            # We can't do fcntl(2)/F_GETLK from Python in a platform-independent
            # way. The following implementation *might* work under Linux.
            #
            # if cmd == fcntl.F_GETLK:
            #     import struct
            #
            #     lockdata = struct.pack('hhQQi', kw['l_type'], os.SEEK_SET,
            #                            kw['l_start'], kw['l_len'], kw['l_pid'])
            #     ld2 = fcntl.fcntl(self.fd, fcntl.F_GETLK, lockdata)
            #     flockfields = ('l_type', 'l_whence', 'l_start', 'l_len', 'l_pid')
            #     uld2 = struct.unpack('hhQQi', ld2)
            #     res = {}
            #     for i in xrange(len(uld2)):
            #          res[flockfields[i]] = uld2[i]
            #
            #     return fuse.Flock(**res)

            # Convert fcntl-ish lock parameters to Python's weird
            # lockf(3)/flock(2) medley locking API...
            op = { fcntl.F_UNLCK : fcntl.LOCK_UN,
                   fcntl.F_RDLCK : fcntl.LOCK_SH,
                   fcntl.F_WRLCK : fcntl.LOCK_EX }[kw['l_type']]
            if cmd == fcntl.F_GETLK:
                return -EOPNOTSUPP
            elif cmd == fcntl.F_SETLK:
                if op != fcntl.LOCK_UN:
                    op |= fcntl.LOCK_NB
            elif cmd == fcntl.F_SETLKW:
                pass
            else:
                return -EINVAL

            fcntl.lockf(self.fd, op, kw['l_start'], kw['l_len'])


    def main(self, *a, **kw):

        self.file_class = self.XmpFile

        return Fuse.main(self, *a, **kw)


def main():

    usage = """
Userspace nullfs-alike: mirror the filesystem tree from some point on.

""" + Fuse.fusage

    server = Xmp(version="%prog " + fuse.__version__,
                 usage=usage,
                 dash_s_do='setsingle')

    server.parser.add_option(mountopt="root", metavar="PATH", default='/',
                             help="mirror filesystem from under PATH [default: %default]")
    server.parse(values=server, errex=1)

    try:
        if server.fuse_args.mount_expected():
            os.chdir(server.root)
    except OSError:
        print >> sys.stderr, "can't enter root of underlying filesystem"
        sys.exit(1)

    server.main()


if __name__ == '__main__':
    main()


Re: Getattr called on every file following a readdir

by sqweek

On Sat, Jul 19, 2008 at 8:34 AM, Pete Wyckoff <pwyckoff@...> wrote:
> For some reason, FUSE (2.7.3 on Linux Kernel 2.6.12) is calling getattr for
> every file after a readdir which gave it the same information using the
> filler function and stat structures.
>
> It does not do this for sub-directories, just files in the directory.
>
> It's killing me for directories with 10s of thousands of files.

 I presume it's killing you during ls. In this case, it's nothing to
do with fuse - ls is causing the getattr. readdir(3) doesn't portably
give ls any information except the name of the file. Most likely
you're using some sort of ls alias that uses a flag which forces ls to
stat the file (eg --color). You'll probably find if you bypass the
alias by typing /bin/ls everything is a lot swifter.
 Unfortunately I don't think there's a workaround here, short of
moving to plan 9 (directory reads in 9p include all the stat
information).
-sqweek
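
To illustrate the point above, here is a minimal sketch (Python 3, not tied to FUSE) of the two listing styles. The function names and the throwaway demo directory are made up for this example. `names_only` needs just the directory read, while `names_with_sizes` does one `lstat` per entry, which is roughly what a stat-forcing `ls` flag does and what would show up as one getattr per file on a FUSE mount.

```python
import os
import tempfile


def names_only(directory):
    """List entry names; needs only the directory read, no per-file stat."""
    return sorted(os.listdir(directory))


def names_with_sizes(directory):
    """List entries with their sizes; issues one lstat per entry."""
    result = {}
    for name in os.listdir(directory):
        # On a FUSE mount, each lstat here would reach the filesystem
        # as a separate getattr request.
        st = os.lstat(os.path.join(directory, name))
        result[name] = st.st_size
    return result


def make_demo_dir(count=3):
    """Create a throwaway directory with `count` small files (illustration only)."""
    directory = tempfile.mkdtemp()
    for i in range(count):
        with open(os.path.join(directory, "file%d" % i), "w") as f:
            f.write("x" * i)
    return directory


if __name__ == "__main__":
    demo = make_demo_dir()
    print(names_only(demo))
    print(names_with_sizes(demo))
```

With tens of thousands of entries, the second style costs one round trip to the filesystem per file, so the listing time is likely dominated by how long each getattr takes.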


Re: Getattr called on every file following a readdir

by Pete Wyckoff-2


Thanks for the info; that does work.

But, is there a way to push that stat info into fuse's attribute cache
during the readdir call?  I.e., some function call like
insert_into_cache(path,  stat struct)?

Thanks, pete


On 7/19/08 2:26 AM, "sqweek" <sqweek@...> wrote:

> On Sat, Jul 19, 2008 at 8:34 AM, Pete Wyckoff <pwyckoff@...> wrote:
>> For some reason, FUSE (2.7.3 on Linux Kernel 2.6.12) is calling getattr for
>> every file after a readdir which gave it the same information using the
>> filler function and stat structures.
>>
>> It does not do this for sub-directories, just files in the directory.
>>
>> It's killing me for directories with 10s of thousands of files.
>
>  I presume it's killing you during ls. In this case, it's nothing to
> do with fuse - ls is causing the getattr. readdir(3) doesn't portably
> give ls any information except the name of the file. Most likely
> you're using some sort of ls alias that uses a flag which forces ls to
> stat the file (eg --color). You'll probably find if you bypass the
> alias by typing /bin/ls everything is a lot swifter.
>  Unfortunately I don't think there's a workaround here, short of
> moving to plan 9 (directory reads in 9p include all the stat
> information).
> -sqweek



Re: Getattr called on every file following a readdir

by Miklos Szeredi

On Sat, 19 Jul 2008, Pete Wyckoff wrote:
> But, is there a way to push that stat info into fuse's attribute cache
> during the readdir call?  I.e., some function call like
> insert_into_cache(path,  stat struct)?

No such call exists today; sshfs, for example, implements its own
directory, attribute and symlink caches.

Improvements to the caching interface are planned for the future
though.

Thanks,
Miklos
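
As a rough illustration of the "implement your own cache" approach, here is a minimal sketch in Python 3. It is not sshfs's code and not part of any FUSE API: `AttrCache`, `CachingBackend`, and the `list_dir`/`stat_path` callables are hypothetical names, and the 20-second timeout is an arbitrary assumption. The idea is that readdir stores the attributes it already fetched, and getattr answers from that store until the entry expires.

```python
import time


class AttrCache(object):
    """Path -> attributes cache with a fixed time-to-live (hypothetical helper)."""

    def __init__(self, ttl=20.0, clock=time.time):
        self.ttl = ttl
        self.clock = clock      # injectable so expiry can be tested
        self.entries = {}       # path -> (expiry time, attrs)

    def put(self, path, attrs):
        self.entries[path] = (self.clock() + self.ttl, attrs)

    def get(self, path):
        entry = self.entries.get(path)
        if entry is None:
            return None
        expires, attrs = entry
        if self.clock() >= expires:
            del self.entries[path]   # stale: drop it and report a miss
            return None
        return attrs

    def invalidate(self, path):
        """Call after anything that changes the file (write, chmod, unlink...)."""
        self.entries.pop(path, None)


class CachingBackend(object):
    """Wraps two assumed callables: list_dir(path) -> [(name, attrs)],
    and stat_path(path) -> attrs (the expensive lookup)."""

    def __init__(self, list_dir, stat_path, cache):
        self.list_dir = list_dir
        self.stat_path = stat_path
        self.cache = cache

    def readdir(self, path):
        names = []
        for name, attrs in self.list_dir(path):
            # Remember what the directory read already told us.
            self.cache.put(path.rstrip("/") + "/" + name, attrs)
            names.append(name)
        return names

    def getattr(self, path):
        attrs = self.cache.get(path)
        if attrs is None:
            attrs = self.stat_path(path)   # cache miss: do the slow lookup
            self.cache.put(path, attrs)
        return attrs
```

Note that a cache like this lives inside the filesystem process, so it does not stop the getattr requests from arriving; it only makes answering them cheap, on the assumption that the backend lookup is the slow part. Entries also need to be invalidated whenever the filesystem modifies a file.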
