Simple FUSE Filesystem HOWTO

by Luis Furquim

Hello

This is the latest text I have. It contains my text and the corrections
suggested by Miklos. The simpler corrections are already applied.
The ones that require rewriting part of the text are not integrated
yet; they appear right below the point where they have to be inserted.
After the point where I tell Miklos that "I stopped here" you will find
some chunks of text that I just cut and pasted from e-mails on this
fuse-devel list, for future insertion in the tutorial.

Note that I started the tutorial by explaining that "there are four
main types of filesystems". After this I noticed that the filesystems
page got reorganized and now divides the filesystems into:
ArchiveFileSystems, CompressedFileSystems, DatabaseFileSystems,
EncryptedFileSystems, MediaFileSystems, HardwareFileSystems,
MonitoringFileSystems, NetworkFileSystems, NonNativeFileSystems,
UnionFileSystems, VersioningFileSystems. This classification
is about the purpose of each filesystem, not about how it is internally
structured. I am not sure whether we should include it in this tutorial.

So, currently this is what I have.

Cheers,

Luis Otavio de Colla Furquim



---------- Forwarded message ----------
From: Miklos Szeredi <miklos@...>
Date: Mon, Mar 17, 2008 at 8:08 AM
Subject: Re: FUSE Documentation
To: luisfurquim@...


> Sorry for the long silence, I was loaded with other tasks and had to
> stop writing the tutorial for a while. I made the corrections you suggested
> in the last e-mail and added the beginning of the explanation about the
> callbacks, only statfs and getattr for now. I also did some research
> in the fuse-devel e-mails and added some of the info found there to the
> explanations. As I am not quite sure I understood every bit of what was
> said, I thought of passing it to you to revise and, also, of asking you
> whether adding that info from the discussions is relevant/desired.
>
> So, here's the current draft:

Thanks.  See my comments inline.

>
>
>
> Introduction
>
> To develop a new filesystem using FUSE, you need to write a program
> that calls the function fuse_main, passing to it some command line
> parameters and an object containing a set of function callbacks. These
> callbacks will be the heart of your filesystem. Each time a
> user tries to access your filesystem, one or more of these
> callbacks will be called in order to provide the operations and
> information required. So, you would write a function to provide the
> "open" operation, another one to provide the "read" operation, and so
> on. Some of the callbacks must be provided and others may be left
> unimplemented.
>
> There are four main types of filesystems you can implement:
>   1) a passthrough filesystem: think of it as a proxy or gateway
> filesystem. The data, and maybe the metadata, are physically stored in
> another filesystem, and the FUSE passthrough filesystem ADDS
> features that do not exist in the underlying filesystem.
>       A passthrough filesystem performs its own specific processing
> but always calls the matching libc filesystem functions to
> access the data: the open callback calls the libc open, the
> release callback calls the libc close, the getattr callback calls
> the libc lstat, and so on.
>   2) a block filesystem: it accesses its data without calling the
> libc counterparts. It can read raw sectors from a block device or
> an image and interpret the data itself.
>   3) a virtual filesystem: it accesses files, devices, network
> services or whatever else, which are not filesystems themselves, and
> presents them through the filesystem interface.
>   4) a networked filesystem: this is somewhat similar to a passthrough
> filesystem, but instead of calling the equivalent libc filesystem
> functions, it uses some specific network protocol to perform the
> operations. The open callback sends a request to the network server,
> according to its protocol, telling it to open the file, and returns
> the response (file handle and/or error code) to the caller according
> to the FUSE specification; the read callback sends a request to the
> network server, according to its protocol, asking for the data it wants
> to read, and returns the response (file content and/or error code) to
> the caller according to the FUSE specification; and so on.
>
> FUSE provides an example filesystem, called fusexmp. It is a
> do-nothing passthrough filesystem. *Prior to version 2.7.0*, it just
> provides access to the root of the system under its mountpoint. It does
> not make copies; it works similarly to links. It differs from other
> passthrough filesystems because it does not add any new feature to the
> underlying filesystem, simply because its purpose is to be
> an example of how you can use FUSE. It mirrors the root of the
> filesystem just as a side effect: when FUSE calls the fusexmp callback
> functions it passes the path *relative* to the mountpoint, and when this
> relative path is used in the libc functions, they
> process it as an *absolute* path. So, if the mountpoint is "/foo" and
> the user calls the libc "open" with "/foo/bar.txt" as the path, the libc
> "open" will call VFS, which will call FUSE, which will call the
> "xmp_open" callback with "/bar.txt" as the path (remember that "/foo"
> is the mountpoint, and "/bar.txt" is the path *relative* to "/foo"). At
> this point "xmp_open" just calls the libc open without processing the
> path parameter (i.e. it calls open("/bar.txt",...) ), and it is this use
> of a *relative* path with a function that treats it as
> *absolute* that produces the mirroring of the root.
>
> *Starting in version 2.7.0*, it is possible to mirror other directories
> without modifying the fusexmp filesystem. If you want to mirror the
> */foo* subdirectory instead of the root, call fusexmp with the
> following options: _-omodules=subdir,subdir=/foo_ . In prior versions,
> making fusexmp mirror only a subdirectory of your system would
> require modifying its source code to accept the path to be mirrored as
> a command line parameter and to do string concatenation in each
> callback that uses the path parameter (open or getattr, for
> example). But this would add needless complexity to
> fusexmp, which is useless for anything other than its only
> purpose: to be an *example* of how to construct a FUSE filesystem. You
> don't *use* fusexmp, you just *learn* from it. Most programmers just
> copy fusexmp_fh.c to myfs.c and start changing it to meet their
> goals. So, fusexmp works as a skeleton from which programmers start
> coding their own filesystems.
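
The string concatenation mentioned above could look roughly like the sketch below. The helper name myfs_fullpath and the choice of -ENAMETOOLONG for an oversized result are assumptions made for illustration; they are not taken from fusexmp.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prefix the path received by a callback with the
 * directory being mirrored. Paths handed to callbacks always start with
 * '/', so plain concatenation is enough.
 * Returns 0 on success or -ENAMETOOLONG if the result does not fit. */
static int myfs_fullpath(char *out, size_t outsize,
                         const char *root, const char *path)
{
    int n;

    assert(path[0] == '/');

    n = snprintf(out, outsize, "%s%s", root, path);
    if (n < 0 || (size_t)n >= outsize)
        return -ENAMETOOLONG;

    return 0;
}
```

Each callback that touches the underlying filesystem would build the full path this way first and then hand it to the libc function.
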
>
>
> 1. First steps
>
> Let's start by copying fusexmp to myfs. At this point, let's note that
> there are two versions of fusexmp: fusexmp.c and fusexmp_fh.c. They
> perform the same operations and differ only in the way they work. In
> fusexmp_fh.c file handles are provided to the caller program by the
> fusexmp_fh.c program.

This is not strictly true.  File descriptors returned to the program
are always generated by the kernel in a well defined way (the smallest
unused file descriptor is allocated).  The file handle set in the
open() callback is a way for the filesystem to identify the open file,
and it may contain arbitrary data.  Fuse doesn't use this file handle
in any way except to pass it to the read/write/release/... callbacks.

> In fusexmp.c they are not provided at all (so FUSE itself handles
> the problem, generating file handles and giving them to the caller
> process).

So this is not true either.

> In this case, as fusexmp just doesn't know anything about
> the file handles, when the xmp_read and xmp_write callbacks are called,
> it needs to use the name of the file (and not the file handle)
> provided in the "path" parameter to open it; then it uses the offset
> parameter to seek to the position where the caller process "thinks" it
> is, and performs the access itself (read or write). After this, it closes
> the file. It works, but it is slower than using file handles and
> keeping the file open until the caller process explicitly closes
> it.
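
A minimal sketch of the open/read/close sequence described above, written as a plain POSIX function so it compiles without libfuse. The name read_by_path is an assumption; a real read callback would also receive a struct fuse_file_info pointer. pread() is used here because it combines the seek and the read in one call.

```c
#define _XOPEN_SOURCE 700   /* for pread() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Path-based read: open the file by name, read "size" bytes at "offset",
 * close it again. Returns the number of bytes read or -errno. */
static int read_by_path(const char *path, char *buf, size_t size, off_t offset)
{
    int fd;
    int res;

    assert(path != NULL && buf != NULL);

    fd = open(path, O_RDONLY);
    if (fd == -1)
        return -errno;

    res = pread(fd, buf, size, offset);
    if (res == -1)
        res = -errno;       /* save the error before close() can change errno */

    close(fd);
    return res;
}
```

The cost mentioned in the text is visible here: every single read pays for an open and a close.
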
>
> Why fusexmp.c? Why not use the fh version to develop all
> filesystems? Some filesystems are not intended to be performance
> champions. They may have to access data that has nothing
> like a current position. They may have to access data that is
> so slow that you could not achieve any performance improvement by using
> file handles. So, in these cases, code simplicity may be the primary
> focus of the development.
>
>
> 1.1 System initialization and termination
>
> The program starts like any normal program. It isn't a library or a
> kernel module, and it needs no special build setup beyond compiling and
> linking against libfuse. Just start coding
> as usual. You begin by initializing your variables, checking anything you
> need to check and parsing the command line arguments. Now you have
> your first problem to solve: FUSE needs to know where the
> mountpoint of your filesystem is and which options the user wants to set.
> As said before, you will call fuse_main providing this information.
> You give this information the same way your program
> receives it from the command line: by passing argc and argv. The first
> two parameters to fuse_main are argc and argv. If your filesystem does
> not need more information than fuse_main asks for, you may
> safely pass your own argc and argv to fuse_main. But if you need more
> information than this and require the user to put it on the
> command line, then you may a) strip the extra info from argv
> before passing it to fuse_main; OR b) create another variable to hold
> the info to pass to fuse_main. For example, suppose your program uses
> the following command line syntax:
>     myfs <mountpoint> <myparameter> [-o <fuse_option1>[,...[,<fuse_optionN>]]]
> you could write code like this:
>
> char **fuse_argv, *myparameter;
> int i;
>
> ... some processing you might need ...
>
> // one entry fewer than argv, plus one for the terminating NULL
> fuse_argv = malloc(argc * sizeof(char *));
> ... do some error handling ...
> fuse_argv[0] = argv[0]; // the program name
> fuse_argv[1] = argv[1]; // the mountpoint
> myparameter = argv[2];
>
> // as you can see below, argv[2] will not be inserted in fuse_argv
> // this is because FUSE does not need it and if you pass it to
> // fuse_main, it will generate an error
> for (i = 3; i < argc; ++i) {
>    fuse_argv[i-1] = argv[i];
> }
> fuse_argv[argc-1] = NULL;
>
> ... some other processing you might need ...
>
> // don't worry about my_operations, you will learn about it soon
> fuse_main(argc-1, fuse_argv, &my_operations, NULL);
>
> Note: in the example above, there was just one extra parameter on the
> command line. It is only an example; you may do whatever you want.
> The command line is yours! You define whatever syntax you need to
> start your filesystem. You are bound only to the syntax of what you
> pass to fuse_main! You may define optional parameters like
> "-o<fuse_option1>,<myfs_option>,<fuse_option2>", but you have to strip
> your own option and pass "-o<fuse_option1>,<fuse_option2>" to fuse_main.
> FUSE provides an option parsing API in <fuse_opt.h>, which can help with
> manipulating options before calling fuse_main(). Unfortunately there
> are no examples of this in the fuse package, nor in this document
> for now. But sshfs, for example, uses this interface quite extensively.
> Future versions of this documentation *may* be updated to cover this
> API too.

OK, here are a couple of examples.  This does the same as your example
above:

======================================================================
static char *myparameter;      /* the filesystem's own argument */

int main(int argc, char *argv[])
{
       struct fuse_args args = FUSE_ARGS_INIT(0, NULL);
       int i;

       for(i = 0; i < argc; i++) {
               if (i == 2)
                       myparameter = argv[i];
               else
                       fuse_opt_add_arg(&args, argv[i]);
       }

       return fuse_main(args.argc, args.argv, &my_operations, NULL);
}
======================================================================

Again similar, but using a processing function:

======================================================================
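static char *myparameter;      /* the filesystem's own argument */

/* Called once per argument: return 0 to drop the argument from the list
 * later passed to fuse_main(), 1 to keep it.
 * Note: this takes the FIRST non-option argument as myparameter, so with
 * this version myparameter must come before the mountpoint. */
static int myfs_opt_proc(void *data, const char *arg, int key,
                          struct fuse_args *outargs)
{
       if (key == FUSE_OPT_KEY_NONOPT && myparameter == NULL) {
               myparameter = strdup(arg);
               return 0;
       }
       return 1;
}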

int main(int argc, char *argv[])
{
       struct fuse_args args = FUSE_ARGS_INIT(argc, argv);

       fuse_opt_parse(&args, NULL, NULL, myfs_opt_proc);

       return fuse_main(args.argc, args.argv, &my_operations, NULL);
}
======================================================================

And this one adds a new argument before calling fuse_main():

======================================================================
int main(int argc, char *argv[])
{
       struct fuse_args args = FUSE_ARGS_INIT(argc, argv);

       fuse_opt_parse(&args, NULL, NULL, NULL);
       fuse_opt_add_arg(&args, "-omodules=subdir,subdir=/foo");

       return fuse_main(args.argc, args.argv, &my_operations, NULL);
}
======================================================================


And a more complex example:

======================================================================
struct myfs_config {
       int mynum;
       char *mystring;
       int mybool;
};

enum {
       KEY_HELP,
       KEY_VERSION,
};

#define MYFS_OPT(t, p, v) { t, offsetof(struct myfs_config, p), v }

static struct fuse_opt myfs_opts[] = {
       MYFS_OPT("mynum=%i",          mynum, 0),
       MYFS_OPT("-n %i",             mynum, 0),
       MYFS_OPT("mystring=%s",       mystring, 0),
       MYFS_OPT("mybool",            mybool, 1),
       MYFS_OPT("nomybool",          mybool, 0),
       MYFS_OPT("--mybool=true",     mybool, 1),
       MYFS_OPT("--mybool=false",    mybool, 0),

       FUSE_OPT_KEY("-V",             KEY_VERSION),
       FUSE_OPT_KEY("--version",      KEY_VERSION),
       FUSE_OPT_KEY("-h",             KEY_HELP),
       FUSE_OPT_KEY("--help",         KEY_HELP),
       FUSE_OPT_END
};

static int myfs_opt_proc(void *data, const char *arg, int key,
                         struct fuse_args *outargs)
{
       switch (key) {
       case KEY_HELP:
               fprintf(stderr,
                       "usage: %s mountpoint [options]\n"
                       "\n"
                       "general options:\n"
                       "    -o opt,[opt...]  mount options\n"
                       "    -h   --help      print help\n"
                       "    -V   --version   print version\n"
                       "\n"
                       "Myfs options:\n"
                       "    -o mynum=NUM\n"
                       "    -o mystring=STRING\n"
                       "    -o mybool\n"
                       "    -o nomybool\n"
                       "    -n NUM           same as '-omynum=NUM'\n"
                       "    --mybool=BOOL    same as 'mybool' or 'nomybool'\n"
                       , outargs->argv[0]);
               fuse_opt_add_arg(outargs, "-ho");
               fuse_main(outargs->argc, outargs->argv, &my_operations, NULL);
               exit(1);

       case KEY_VERSION:
               fprintf(stderr, "Myfs version %s\n", PACKAGE_VERSION);
               fuse_opt_add_arg(outargs, "--version");
               fuse_main(outargs->argc, outargs->argv, &my_operations, NULL);
               exit(0);
       }
       return 1;
}

int main(int argc, char *argv[])
{
       struct fuse_args args = FUSE_ARGS_INIT(argc, argv);
       struct myfs_config conf;

       memset(&conf, 0, sizeof(conf));

       fuse_opt_parse(&args, &conf, myfs_opts, myfs_opt_proc);

       return fuse_main(args.argc, args.argv, &my_operations, NULL);
}
======================================================================


>
> The third parameter to fuse_main is a pointer to a fuse_operations
> struct, which holds pointers to your function callbacks; they will handle
> all filesystem calls. So, for example, any time a program tries to open a
> file inside your filesystem, fuse will call the function you provided
> as the open callback. Don't worry about the callbacks now, we will see
> them soon.
>
> When you call fuse_main, your program will block and enter *daemon
> mode*. Once in daemon mode you *lose access* to
> stdin, stdout and stderr. Any debugging output must go to
> syslog or to a file; choose what is best for you. Now FUSE
> is in control, and every filesystem call received by FUSE will
> generate a call to one of the callbacks passed to fuse_main in
> fuse_operations. Remember, as a filesystem may be accessed by any program
> running on the OS, you may receive concurrent calls. Your callbacks
> may even be preempted during execution. So, be cautious and *don't use
> global variables without using locks*. Alternatively, use the '-s' (single
> threaded) option to disable concurrency.
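
As a minimal sketch of the locking advice above, assuming POSIX threads: a global counter that several callbacks might update is only touched while a mutex is held. The names myfs_lock, myfs_open_count and myfs_count_open are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* One lock protecting the global state shared by all callbacks. */
static pthread_mutex_t myfs_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long myfs_open_count;

/* Hypothetical helper that an open callback could call. Callbacks may run
 * concurrently in several threads, so the counter is only modified while
 * the lock is held. Returns the new value of the counter. */
static unsigned long myfs_count_open(void)
{
    unsigned long n;
    int rc;

    rc = pthread_mutex_lock(&myfs_lock);
    assert(rc == 0);
    (void)rc;               /* silence "unused" warning when NDEBUG is set */

    n = ++myfs_open_count;

    pthread_mutex_unlock(&myfs_lock);
    return n;
}
```

With the '-s' option the lock is not needed, at the price of handling one request at a time.
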
>
> When the user unmounts your filesystem, fuse_main ends its processing
> and returns control to your main program. This is the time to run
> finalization code.
>
> There are other places for initialization/finalization code. In the
> fuse_operations struct you will find pointers for an init
> callback and a destroy callback. They are optional and may
> be left unimplemented, but if you supply them, each will be executed
> once: init right after fuse_main is called, and destroy just
> before fuse_main returns to the caller. The init callback
> you provide returns a pointer (or NULL) which will be made available
> to all your callbacks any time they get called (through
> fuse_get_context()->private_data), including the
> destroy callback. If you started threads before calling fuse_main,
> they will be terminated at this point. So, if you need
> threads to keep running after you call fuse_main, you need to *start
> them in the init callback*.
>
>
> 2. The function callbacks.
>
> First, let's state the FUSE error handling convention: your
> callbacks *must* report errors by returning -errno. The convention in
> plain C is to return -1 and set errno, but the callbacks don't follow
> it. When your function calls some libc function, it
> tests whether the call returned -1 and, if so, inspects errno; but when
> your function decides to return an error to the caller, it must return
> -errno. If your function succeeds and no error occurred, it must
> return a non-negative value. Most of the time you will return
> 0. Sometimes, in the read and write callbacks for example, the return
> value has a meaning (like the number of bytes read or
> written), and so is not always zero. But a successful return is always
> non-negative.
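
The convention can be sketched with a passthrough-style access callback. The function below has the same parameters as the access callback (path and mask) but uses only libc, so it can be compiled and tried without libfuse; treat it as an illustration of the return value rule rather than a complete callback.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* libc reports failure as -1 plus errno; the callback reports the same
 * failure by returning the negated errno value, and 0 on success. */
static int myfs_access(const char *path, int mask)
{
    int res;

    assert(path != NULL);

    res = access(path, mask);
    if (res == -1)
        return -errno;

    return 0;
}
```
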
>
> Another convention is the path parameter. It contains a string with
> the pathname of the file to be accessed. It will start with a slash,
> but it is *relative* to the mountpoint you passed to fuse_main as the
> second element of the argv parameter. So if, for example, the caller
> calls open("/foo/bar.txt", O_RDONLY) and your mountpoint is "/foo", then
> your open callback will receive "/bar.txt" in the path
> parameter. If for some reason you need to know the mountpoint to
> reconstruct the *absolute* path, *it is up to you* to store the
> mountpoint in some variable, and the place to do it is in your
> initialization code (somewhere before calling fuse_main or in the init
> callback supplied in fuse_operations).
>
> You tell FUSE which functions are your callbacks using the
> fuse_operations parameter you provide as the third parameter to
> fuse_main. Before calling fuse_main, initialize it this way:
>
> static struct fuse_operations my_operations = {
>       .getattr        = myfs_getattr,
>       .fgetattr       = myfs_fgetattr,
>       .access         = myfs_access,
>       .readlink       = myfs_readlink,
>       .opendir        = myfs_opendir,
>       .readdir        = myfs_readdir,
>       .releasedir     = myfs_releasedir,
>       .mknod          = myfs_mknod,
>       .mkdir          = myfs_mkdir,
>       .symlink        = myfs_symlink,
>       .unlink         = myfs_unlink,
>       .rmdir          = myfs_rmdir,
>       .rename         = myfs_rename,
>       .link           = myfs_link,
>       .chmod          = myfs_chmod,
>       .chown          = myfs_chown,
>       .truncate       = myfs_truncate,
>       .ftruncate      = myfs_ftruncate,
>       .utimens        = myfs_utimens,
>       .create         = myfs_create,
>       .open           = myfs_open,
>       .read           = myfs_read,
>       .write          = myfs_write,
>       .statfs         = myfs_statfs,
>       .flush          = myfs_flush,
>       .release        = myfs_release,
>       .fsync          = myfs_fsync,
>       .setxattr       = myfs_setxattr,
>       .getxattr       = myfs_getxattr,
>       .listxattr      = myfs_listxattr,
>       .removexattr    = myfs_removexattr,
>       .lock           = myfs_lock
> };
>
>
> STATFS - myfs_statfs(const char *path, struct statvfs *stbuf)
>
> Now you will start to code your callback functions. Start with the
> ones that are called first or most frequently, so that you can test
> your filesystem without having to code all the callbacks at once.
>
> So, begin with the statfs callback. It is called to obtain
> some info about your filesystem. Each time a user calls statvfs on
> your filesystem (it is called by "df" or "stat -f path") your statfs
> callback will be called. The first parameter is the _path_,
> which is, in this case, not so useful, because the caller wants info
> about the filesystem where _path_ resides, and the fact that your
> function was called tells you that it is in *your* filesystem that
> _path_ resides. So you don't need to do anything at all with this
> parameter. The second parameter, _stbuf_, is the buffer where
> you have to store the data about the filesystem. It has the same
> structure as the buffer used by the libc statvfs, so refer to your libc
> manual for info on how to fill this buffer. Passthrough filesystems
> will typically pass the pointer to the buffer to the libc statvfs function
> and let it do the whole job.
>
> More info: http://www.opengroup.org/onlinepubs/009695399/basedefs/sys/statvfs.h.html
>
>
> GETATTR - myfs_getattr(const char *path, struct stat *stbuf)
>
> After coding the statfs callback, you may move on to the getattr
> callback. You have to code it before the others because
> fuse calls it before most of them. So, to test opening a file, you
> must already have getattr coded. It receives _path_ and _stbuf_ as
> parameters. The _path_ is the *relative* filename to be checked. The
> _stbuf_ parameter is the buffer where you have to store the attributes
> of the file referenced by path. The _stbuf_ format is the same one used
> by the libc functions _stat_ and _lstat_, because the getattr callback is
> called when the user program calls one of these functions or some
> other similar function.
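
For a virtual filesystem, which has no underlying file to lstat, getattr has to fill the buffer by hand. The sketch below assumes a filesystem containing a single read-only file; the file name "/hello", its contents and the permission bits are made up for illustration.

```c
#define _XOPEN_SOURCE 700   /* for S_IFDIR and S_IFREG */
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/stat.h>

static const char hello_path[] = "/hello";          /* hypothetical file */
static const char hello_data[] = "Hello World!\n";  /* its contents */

/* getattr for a filesystem holding just the root directory and one file.
 * Any other path is reported as nonexistent. */
static int myfs_getattr(const char *path, struct stat *stbuf)
{
    assert(stbuf != NULL);
    memset(stbuf, 0, sizeof(*stbuf));

    if (strcmp(path, "/") == 0) {
        stbuf->st_mode = S_IFDIR | 0755;
        stbuf->st_nlink = 2;
        return 0;
    }

    if (strcmp(path, hello_path) == 0) {
        stbuf->st_mode = S_IFREG | 0444;
        stbuf->st_nlink = 1;
        stbuf->st_size = sizeof(hello_data) - 1;   /* without the '\0' */
        return 0;
    }

    return -ENOENT;
}
```

A passthrough filesystem would instead call lstat(path, stbuf) and return -errno on failure.
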
>
> Note: the data you return from this function is cached. This means that
> the getattr callback will *not* be called again for the same file until
> the cache entry expires. The default timeout for the cache is 1 second.
> You may change it by passing the option *-oattr_timeout=N* to fuse_main.
> If N=0, attribute caching is disabled, but beware: *performance will
> degrade*.
>
> Another note: if, for some reason, your getattr callback tells the
> caller that a file has size zero and the caller program attempts to
> read something from this file, then your read callback will *not be
> called*. If your getattr reports a size smaller than the real one,
> fuse will truncate the result of a
> subsequent read on this file according to the size given by the
> getattr callback. For performance reasons, libfuse doesn't
> attempt to read beyond what it believes to be the end of the file.
> So, if you want to report a file as zero sized (or report a size
> smaller than the real one) and still want every read operation to
> reach your read callback, then you have to set the
> direct_io flag when the file is opened. But be warned: to run an
> executable file, *it must not be opened* with this flag, or it will
> not run at all.
>
> More info: http://www.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html
>
>
> //
> // Miklos, I stopped here. The texts below are extracted from e-mails
> // on fuse-devel and used as background to write the getattr callback
> // explanation. Please revise it because I am not *really* sure if I
> // understood every bit of it.

I think your description is correct.

Thanks again for doing this tutorial!

Miklos

>
>
> I think if the file size is 0 (as returned by the getattr before an
> open), then fuse will not issue the read. It will just return 0. If you
> want it to still issue the read you have to define the direct_io flag in
> the open system call.
> I'm not sure but this might be causing the behavior you're seeing.
>
>
>
> In the current implementation, fetching the contents of a file is a
> quite expensive operation. But to know the size of a file, I have to
> know its content. Because I don't want the getattr() calls to be
> expensive, I omit the size check and want to set the size to "0"
> (similar to /proc).
> But now - if I try to cat the file - I don't get any output. If I set
> filesize to 1 I get the first char. When setting the size to the correct
> value (or higher) it works. Thus it can't be my read() implementation.
>
>
>
>
> this bug comes from the way libfuse handles hard links.  Or rather how
> it does _not_ handle them, and treats the two links as two separate
> files.  This means, that modifying the file through one link and
> checking the attributes through another will not behave as expected,
> due to various caches.
>
> One workaround (that may only work some of the time, or may not work
> at all, due to another bug in 2.5.3) is to disable caching with
> '-oentry_cache=0'.
>
>
> Argh, sorry.  It's 'entry_timeout'.
>
> Safest bet is to do '-oattr_timeout=0,entry_timeout=0'.  Downside is,
> it might slow things down somewhat.
>
>
>
>
>
> > What kind of benefit could I see from going to a 2.6 kernel and thus
> > later version of fuse?
> Just small things, like the entry_cache=0 bug being fixed, and that
> the fix described above could be applied more easily.
>
> > Does 2.7 manage links better than 2.5?
> No, but this issue has been reported more than once recently, so I may
> look at fixing the handling of hard links.
>
>
>
>
> > my fuse filesystem reads file attributes from a file from disk. Does
> > FUSE need to re-read it again and again every time when
> > myfs_getattr() is called?
>
> Yes.
>
> > Or does it cache it anyhow, so if myfs_getattr() is called twice, it
> > remembers the previous result for the given file and returns without
> > re-reading?
>
> Fuse does cache attributes, and then myfs_getattr() simply won't be
> called the second time.  The default timeout for this cache is 1sec,
> but this can be changed with the '-oattr_timeout=N' option.
>
>
>
>
>
> --
> Luis Otavio de Colla Furquim
> Não alimente os pingos
> Don't feed the tribbles
> http://www.furquim.org/chironfs/
>


