@akiross
Last active January 18, 2018 04:31
A simple step-by-step tutorial for llfuse in Python3

llfuse tutorial in python3

The only package I have for FUSE + Python3 is python3-llfuse (llfuse for short). I wanted to start easily, but who cares: I'll go the hard way. Sadly, Fedora ships an outdated version (as of 30 Sept 2015): 0.40, which is from 2013. llfuse has had some nice updates lately, so I want the pip version. Running dnf install fuse-devel libattr-devel is enough (here) to make pip3 install llfuse go smoothly.

I am now using version 0.41.1, but further changes have been made since.

Before continuing, let me make clear that I am not a fuse expert, I am not a kernel expert, and I am not a filesystem or database expert. This is my first experience with fuse.

What's the deal: the llfuse python package uses the low-level FUSE API, which means that it's a lil bit harder, because instead of providing very practical functions to work on your filesystem, you will have to work with various inodes and structures. Well, I don't care right now, I am learning about fuse, so...

So, I am writing this "tutorial" about llfuse. My personal objective is to create a filesystem that allows me to practically group a large number of files using certain criteria. Maybe I will write a tutorial about that, too (unlikely), but in this one I will aim for a simple, yet interesting example.

First, a little introduction to fuse and llfuse.

The idea of fuse is to create user-space filesystems. It means that you can write the code to build a filesystem without the need to work directly with the kernel, which is where filesystem code usually lives.

So, basically, you provide the code that handles certain events, and fuse will call them when required by specific OS operations. So, for instance, you provide a function to create a new file and return to fuse a handler to it, fuse will keep track of that handler and whenever it needs to work with that file (e.g. reading it), it will give you the handler in a specific read_the_content_of_the_file() function.
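
To make the callback idea a bit more concrete, here is a toy model in plain Python with no FUSE involved. ToyOps and fake_kernel_cat are names I made up for illustration; they are not part of llfuse, and the real API looks different (it works with inodes, as described further down).

```python
# Toy model of the callback style described above; no FUSE involved.
# ToyOps and fake_kernel_cat are hypothetical names, not part of llfuse.

class ToyOps:
    """Our side: handlers that get called when something is needed."""

    def __init__(self):
        self._files = {"greeting": b"hello"}   # name -> content
        self._open = {}                        # handle -> name
        self._next_handle = 1

    def open(self, name):
        # Hand back an integer handle; the caller keeps track of it.
        handle = self._next_handle
        self._next_handle += 1
        self._open[handle] = name
        return handle

    def read(self, handle):
        # The caller gives the handle back so we know which file is meant.
        return self._files[self._open[handle]]

    def release(self, handle):
        # Forget the handle once the caller is done with the file.
        del self._open[handle]


def fake_kernel_cat(ops, name):
    """The other side: drives the handlers in the order open, read, release."""
    handle = ops.open(name)
    try:
        return ops.read(handle)
    finally:
        ops.release(handle)


content = fake_kernel_cat(ToyOps(), "greeting")
print(content)
```

The point is only the division of labour: we never decide when open or read happen, we just answer when asked.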

Fuse is pretty convenient: once you create the program, you can use it to mount what you need to mount without being root. Even better: it's so practical, that people use it to create filesystem-based-interfaces for anything: robots, encrypted data, virtual data, you name it.

Doing nothing

Let's start with the absolute basics: a useless filesystem that does nothing besides existing, just to get started with the process. I will use the symbols $ and % to denote two different consoles.

Source file nothingfs.py

#!/usr/bin/env python3

import sys
import llfuse

class NothingOps(llfuse.Operations):
	def __init__(self):
		super().__init__()

if __name__ == '__main__':
	print("Mounting to directory", sys.argv[1])
	ops = NothingOps() # the constructor takes no arguments
	llfuse.init(ops, sys.argv[1], ['fsname=nothingfs'])
	try:
		llfuse.main()
	except:
		llfuse.close()
		raise
	llfuse.close()

In the console where nothingfs.py exists:

$ chmod +x nothingfs.py
$ mkdir Mountpoint
$ ./nothingfs.py Mountpoint
Mounting to directory Mountpoint

In another console, same directory:

% ll -a
ls: cannot access Mountpoint: Function not implemented
total 24
drwxrwxr-x. 3 akiross akiross 4096 Sep 30 14:46 .
drwxrwxr-x. 5 akiross akiross 4096 Sep 30 14:44 ..
d?????????? ? ?       ?          ?            ? Mountpoint
-rwxr-xr-x. 1 akiross akiross  348 Sep 30 14:42 nothingfs.py
%

And to unmount the filesystem, use

% fusermount -u Mountpoint

From this we can see that something is necessary to

  1. access the directory and list its content
  2. get information about the directory (permissions, size, creation time...)

Well, this gives us a clue. Doing nothing is not so useless, after all!

Stats and listing the content

Since the filesystem is mounted in a directory, it would be nice (to say the least) to be able to list its content. Let's start with that.

To do that, we have to fill our "operations" structure, which is NothingOps: this class contains the methods necessary to handle the requests coming from the user, and will grow to contain many methods used to create/open/read/close files and directories, get their stats, etc.

Normally, llfuse throws a "not implemented" error if a handler is not implemented, like in our case above.

But now we are going to define our first handlers: opendir(inode), readdir(fh, off) and releasedir(fh).

The basic idea is that a directory is opened, we return an integer handler which will be used to call the readdir; when we are done with the directory, the handler will be released.

The opendir takes an inode: what is it? Inodes are, in this context, just integers. In the fuse lowlevel C api, you can see typedef uint64_t fuse_ino_t, which is in turn used by the llfuse API.

Inodes are used in the communication between fuse and the kernel, and they are provided to us as identifiers, but on some occasions we will be required to return handlers, which we store internally in our implementation.

The root inode is 1, but to avoid the magic constant, use llfuse.ROOT_INODE to refer to it. Check out this exploratory code.

class NothingOps(llfuse.Operations):
	def __init__(self):
		super().__init__()
	def opendir(self, inode):
		print("Opendir got inode", inode)
		if inode == llfuse.ROOT_INODE:
			print("  Root inode!")
		return 0
	def readdir(self, fh, off):
		print("Readdir got args", fh, off)
		return iter([])
	def releasedir(self, fh):
		print("Releasedir got handler", fh)

When mounting, I get these messages several times:

Opendir got inode 1
  Root inode!
Readdir got args 0 0
Releasedir got handler 0

Note the return value for readdir: it is an iterator, in this case to an empty list.

This is still pretty useless though, so let's try to get the bare minimum: directory stats.

Stats are provided by the getattr(inode) function. Let's create a mock one, that works only for the root directory:

	def getattr(self, inode):
		if inode != llfuse.ROOT_INODE:
			raise llfuse.FUSEError(errno.ENOENT)

		entry = llfuse.EntryAttributes()
		entry.st_ino = inode # inode 
		entry.st_mode = stat.S_IFDIR # it's a dir
		entry.st_nlink = 1
		entry.st_uid = os.getuid() # Process UID
		entry.st_gid = os.getgid() # Process GID
		entry.st_rdev = 0
		entry.st_size = 0
		entry.st_blksize = 1
		entry.st_blocks = 1
		entry.generation = 0
		entry.attr_timeout = 1
		entry.entry_timeout = 1
		entry.st_atime_ns = 0 # Access time (ns), 1 Jan 1970
		entry.st_ctime_ns = 0 # Change time (ns)
		entry.st_mtime_ns = 0 # Modification time (ns)
		return entry

These are not the only attributes, there are more, but for now those are the required ones. You can get info about the st_* attributes looking at man 2 stat, and the other attributes on llfuse doc.
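
About the timestamps: going by the _ns suffix and the comments above, I am assuming these fields hold integer nanoseconds since the Unix epoch. Under that assumption, this little sketch (standard library only, ns_to_utc is my own helper name) shows why 0 is displayed as 1 Jan 1970, and how a current timestamp could be obtained instead.

```python
import time
from datetime import datetime, timezone

NS_PER_SECOND = 10**9

def ns_to_utc(ns):
    """Convert integer nanoseconds since the epoch to a UTC datetime."""
    return datetime.fromtimestamp(ns / NS_PER_SECOND, tz=timezone.utc)

# 0 ns is the epoch itself, which ls shows as "Jan  1  1970".
epoch = ns_to_utc(0)

# A value one could assign to st_atime_ns / st_ctime_ns / st_mtime_ns
# (assumption: the fields expect nanoseconds, as their names suggest).
now_ns = time.time_ns()

print(epoch.isoformat())
print(now_ns)
```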

Generation is not strictly necessary, but I wanted to put it here because it may be of interest. A nice explanation of what a generation is can be found on stackoverflow.

You will see that with this method, even without touching the (open|read|release)dir methods, the output will change:

% ll -a
total 25
drwxrwxr-x. 3 akiross akiross 4096 Sep 30 17:26 .
drwxrwxr-x. 5 akiross akiross 4096 Sep 30 17:03 ..
d---------. 1 akiross akiross    0 Jan  1  1970 Mountpoint
-rwxr-xr-x. 1 akiross akiross 1320 Sep 30 17:00 nothingfs.py
% ll -a Mountpoint
total 0
%
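
The d---------. shown for Mountpoint follows from st_mode holding only the file type, with no permission bits. A quick check with the standard stat module (nothing llfuse-specific here) shows how the type and the permission bits combine; the 0o755 and 0o644 values are just examples I picked, not something the tutorial code sets.

```python
import stat

bare_dir = stat.S_IFDIR               # type bits only, as in the mock getattr
listable_dir = stat.S_IFDIR | 0o755   # type bits OR-ed with rwxr-xr-x
regular_file = stat.S_IFREG | 0o644   # a regular file with rw-r--r--

print(stat.filemode(bare_dir))        # d---------
print(stat.filemode(listable_dir))    # drwxr-xr-x
print(stat.filemode(regular_file))    # -rw-r--r--
```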

The directory is utterly empty: not even '.' and '..' show up in the list! Let's add them. We will create two entries for those directories. Let's modify the readdir so that, when the correct handler is passed (0, from the opendir above) and when offset is 0, we return the list of directories:

def readdir(self, fh, off):
	print("Readdir got args", fh, off)
	if fh == 0: # A valid fh returned by opendir
		if off == 0: # First entry
			yield (b'.', self.getattr(llfuse.ROOT_INODE), 1)
		if off == 1: # Second entry
			yield (b'..', self.getattr(llfuse.ROOT_INODE), 2)

With this code, we can get at least our dots:

% ll -a
total 25
drwxrwxr-x. 3 akiross akiross 4096 Sep 30 18:31 .
drwxrwxr-x. 5 akiross akiross 4096 Sep 30 18:22 ..
d---------. 1 akiross akiross    0 Jan  1  1970 Mountpoint
-rwxr-xr-x. 1 akiross akiross 1659 Sep 30 18:31 nothingfs.py
% ll -a Mountpoint
total 9
d---------. 1 akiross akiross    0 Jan  1  1970 .
drwxrwxr-x. 3 akiross akiross 4096 Sep 30 18:31 ..
%

Note that, until now, the releasedir does nothing. We do not need to release anything for now, so it's fine (that method could be removed entirely in our case).
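
As far as I understand it, the third element of each tuple is the offset to pass back as off in order to resume the listing right after that entry. Here is a plain-Python simulation of that protocol, with no llfuse involved: list_in_batches is a hypothetical caller I invented to play the role of whoever calls readdir, and the attributes are left out.

```python
def readdir(entries, off):
    """Yield (name, attr, next_offset) for every entry from position off on."""
    for i in range(off, len(entries)):
        yield (entries[i], None, i + 1)   # attr omitted in this sketch


def list_in_batches(entries, batch_size):
    """Hypothetical caller: takes at most batch_size entries per call,
    then resumes from the last next_offset it received."""
    names, off = [], 0
    while True:
        batch = []
        for name, _attr, next_off in readdir(entries, off):
            batch.append(name)
            off = next_off
            if len(batch) == batch_size:
                break
        if not batch:          # nothing left: the listing is complete
            return names
        names.extend(batch)


names = list_in_batches([b'.', b'..', b'output', b'input'], batch_size=3)
print(names)
```

Because each entry carries its own resume offset, the caller can stop after any entry and pick up later without skipping or repeating anything.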

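Also, note that even though we provide the same inode when getting the attr of '.' and '..', their ls entries are different. I do not know why for sure, and I may well have made a mistake in my code, but I think it is because those directories get special treatment: the '..' line matches the real parent directory of Mountpoint (compare it with the '.' line of the outer listing), which suggests that '..' at the mount root is resolved without asking our filesystem.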

Virtual files and directory

Let's take another little step further and create a very simple control mechanism: we will create a file input and directory output/. Whenever we write on the input file, an entry gets added to the output directory, and the content of that entry will be the data written onto input. Pretty cool, uh? :) This will be our WriterFS.

To do that, we will need functions to open, read, write and close files, but let's start creating a fake output directory. We do it by modifying our readdir, adding a third entry:

if off == 2:
	yield (b'output', self.getattr(llfuse.ROOT_INODE+1), 3)

Note that, when we get the attributes, we are in charge of building the entry attributes, therefore we decide which inode to use. In this case, we are using the inode ROOT_INODE+1 (=2), but we also have to change the current getattr implementation.

Output will always be there, so we can hardcode it:

def getattr(self, inode):
	r = llfuse.ROOT_INODE
	if inode != r and inode != r+1:
		print("Sorry, wrong!")
		raise llfuse.FUSEError(errno.ENOENT)
	# build the entry like before

And if we see the result of ls...

% ll -a Mountpoint
ls: cannot access Mountpoint/output: Function not implemented
total 9
d---------. 1 akiross akiross    0 Jan  1  1970 .
drwxrwxr-x. 3 akiross akiross 4096 Sep 30 19:01 ..
d?????????? ? ?       ?          ?            ? output

WAT? What's that? Why can't it get the attributes??

Enter the lookup method, which is used to get the attributes of a node by name.

In our case it can be as simple as:

def lookup(self, p_inode, name):
	print("Looking up", name, p_inode)
	if p_inode == llfuse.ROOT_INODE and name == b'output':
		return self.getattr(llfuse.ROOT_INODE+1)
	raise llfuse.FUSEError(errno.ENOENT)

And this makes ls work.

It's still not clear to me how the lookup works and why we didn't need it earlier. My speculation is that, earlier, no named file or directory was present besides the dot directories, which are kind of special.

Let's add the input file. To do it, we have to change our getattr a little, because until now we served only directories. Let's fix it to support the input file:

def getattr(self, inode):
	r = llfuse.ROOT_INODE
	entry = llfuse.EntryAttributes()
	if inode == r or inode == r+1:
		# it's a directory, mode 000
		entry.st_mode = stat.S_IFDIR
	elif inode == r+2:
		# it's a regular file, writable
		entry.st_mode = stat.S_IFREG
	else:
		raise llfuse.FUSEError(errno.ENOENT)
	# Set the others like before

Here I am using a regular file, and I think it is ok because it will be handled normally by our filesystem - even if we will do "magic".

And naturally the lookup has to be fixed as well:

def lookup(self, p_inode, name):
	print("Looking up", name, p_inode)
	if p_inode == llfuse.ROOT_INODE and name == b'output':
		return self.getattr(llfuse.ROOT_INODE+1)
	elif p_inode == llfuse.ROOT_INODE and name == b'input':
		return self.getattr(llfuse.ROOT_INODE+2)
	raise llfuse.FUSEError(errno.ENOENT)
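
One way to picture what lookup does: it is a table from (parent inode, name) to inode. The sketch below is plain Python and only mirrors the layout used in this tutorial. ROOT_INODE is hardcoded to 1 to stand in for llfuse.ROOT_INODE, ENTRIES and lookup_inode are names I made up, and FileNotFoundError stands in for llfuse.FUSEError(errno.ENOENT).

```python
import errno

ROOT_INODE = 1  # stands in for llfuse.ROOT_INODE

# (parent inode, entry name) -> inode, following the tutorial's layout
ENTRIES = {
    (ROOT_INODE, b'output'): ROOT_INODE + 1,
    (ROOT_INODE, b'input'): ROOT_INODE + 2,
}

def lookup_inode(parent_inode, name):
    """Return the inode of name inside parent_inode, or raise ENOENT."""
    try:
        return ENTRIES[(parent_inode, name)]
    except KeyError:
        # Stand-in for llfuse.FUSEError(errno.ENOENT) in this sketch
        raise FileNotFoundError(errno.ENOENT, 'no such entry', name) from None

print(lookup_inode(ROOT_INODE, b'output'))
print(lookup_inode(ROOT_INODE, b'input'))
```

With a table like this, adding a new fixed entry means adding one line instead of another elif branch.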

Let's now do a very simple thing: whenever input is written, we print the data written to it. To do so, a few functions are our friends: open(inode, flags), read(fh, off, size), write(fh, off, buff), flush(fh), fsync(fh, datasync) and release(fh).

To understand the basics, let's implement a trivial version of them:

def open(self, inode, flags):
	print("Requested opening of", inode, "with flags", flags)
	return 1 # A different handler

def read(self, fh, off, size):
	print("Requested to read", fh, off, size)
	return b''

def write(self, fh, off, buff):
	print('Writing', fh, off, buff)
	return 0

def flush(self, fh):
	print("Flushing", fh)

def fsync(self, fh, datasync):
	print("Fsync", fh, datasync)

def release(self, fh):
	print("Releasing handler", fh)

Let's mount and cat input

% cat Mountpoint/input 
%

Will result in these messages printed:

Looking up b'input' 1
Requested opening of 3 with flags 32768
Flushing 1
Releasing handler 1

Echoing something to it will yield:

% echo "Hello, World" >>Mountpoint/input
echo: write error: Input/output error

and the messages:

Looking up b'input' 1
Requested opening of 3 with flags 33793
Flushing 1
Writing 1 0 b'Hello, World\n'
Flushing 1
Releasing handler 1
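
Those flags numbers are the usual open(2) flags packed into an integer. A sketch of how they could be decoded with the constants from the os module follows; describe_open_flags is my own helper, not an llfuse function. On my reading, and assuming x86-64 Linux values, 33793 is 0o102001, that is O_WRONLY plus O_APPEND plus the 0o100000 bit (which I believe is the kernel's large-file flag), while 32768 is that same extra bit alone with read-only access. The constants differ between platforms, so the example builds its flags from os rather than from those raw numbers.

```python
import os

def describe_open_flags(flags):
    """Return a readable list of what an open() flags value asks for."""
    access = {
        os.O_RDONLY: "read-only",
        os.O_WRONLY: "write-only",
        os.O_RDWR: "read-write",
    }.get(flags & os.O_ACCMODE, "unknown")
    parts = [access]
    if flags & os.O_APPEND:
        parts.append("append")
    if flags & os.O_TRUNC:
        parts.append("truncate")
    return parts

# What ">>" and a plain read would roughly correspond to
append_flags = describe_open_flags(os.O_WRONLY | os.O_APPEND)
read_flags = describe_open_flags(os.O_RDONLY)
print(append_flags, read_flags)
```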

Note that I have to use ">>" and not ">": otherwise the file would be overwritten, and to do that we need to implement create!

Also, note that fsync is not called, while flush is: flush is called (possibly multiple times) when the file gets closed, to ensure that all the data has been written, while fsync is related to the synchronization of data that may be dirty. So, basically, we can comment out the fsync.

The next step is to store the data received by write and make it available in the output directory. Let's keep a list of data, one item per file. We will assume that content files have inodes ROOT_INODE+3 and above.

First, let's do the easy part: save written data.

def __init__(self):
	super().__init__()
	self._data = [] # Storage

def write(self, fh, off, buff):
	print('Writing', fh, off, buff)
	self._data.append(buff) # Save
	return len(buff)

Next, we need to list the saved data, and this requires a change in opendir and readdir:

def opendir(self, inode):
	print("Opendir got inode", inode)
	if inode == llfuse.ROOT_INODE:
		print("  Root inode!")
		return 0
	if inode == llfuse.ROOT_INODE+1:
		print("  Output inode!")
		return 2
	raise llfuse.FUSEError(errno.ENOENT)

In this way, when we ask for inode root+1, we get a handler equal to 2 (because 1 was used before, for the input file), and we can now list the files in readdir:

def readdir(self, fh, off):
	print("Readdir got args", fh, off)
	# ...as above...
	if fh == 2:
		print("Listing files", self._data[off:])
		# Start at off so names and offsets stay correct on resumed calls
		for i in range(off, len(self._data)):
			name = 'dat{}'.format(i)
			attr = self.getattr(llfuse.ROOT_INODE+3+i)
			yield (name.encode(), attr, i+1)

But to have it working properly, we also need to satisfy the lookup and the getattr requirement:

def getattr(self, inode):
	r = llfuse.ROOT_INODE
	entry = llfuse.EntryAttributes()
	if inode == r or inode == r+1:
		# it's a directory, mode 000
		entry.st_mode = stat.S_IFDIR
		entry.st_size = 0 # no size
	elif inode == r+2:
		# it's a regular file, writable
		entry.st_mode = stat.S_IFREG
		entry.st_size = 0
	elif inode >= r+3:
		print("  Getting data for", inode-r-3)
		entry.st_mode = stat.S_IFREG
		entry.st_size = len(self._data[inode-r-3])
	else:
		raise llfuse.FUSEError(errno.ENOENT)
	entry.st_ino = inode # inode 
	entry.st_nlink = 1
	entry.st_uid = os.getuid() # Process UID
	entry.st_gid = os.getgid() # Process GID
	entry.st_rdev = 0
	entry.st_blksize = 1
	entry.st_blocks = 1
	entry.generation = 0
	entry.attr_timeout = 1
	entry.entry_timeout = 1
	entry.st_atime_ns = 0 # Access time (ns), 1 Jan 1970
	entry.st_ctime_ns = 0 # Change time (ns)
	entry.st_mtime_ns = 0 # Modification time (ns)
	return entry

def lookup(self, p_inode, name):
	print("Looking up", name, p_inode)
	if p_inode == llfuse.ROOT_INODE and name == b'output':
		return self.getattr(llfuse.ROOT_INODE+1)
	elif p_inode == llfuse.ROOT_INODE and name == b'input':
		return self.getattr(llfuse.ROOT_INODE+2)
	elif p_inode == llfuse.ROOT_INODE+1:
		n = int(name.decode()[3:])
		if 0 <= n < len(self._data):
			return self.getattr(llfuse.ROOT_INODE+3+n)
	raise llfuse.FUSEError(errno.ENOENT)
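
The numbering scheme (data item i lives at inode ROOT_INODE+3+i and is named dat followed by i) can be written down as three small helpers. This is a plain-Python sketch with names I invented, with ROOT_INODE hardcoded to 1 in place of llfuse.ROOT_INODE. It also parses the name a bit more defensively than the lookup above, returning None for anything that is not exactly of the form datN.

```python
import re

ROOT_INODE = 1                      # stands in for llfuse.ROOT_INODE
FIRST_DATA_INODE = ROOT_INODE + 3   # first inode used for stored data

def inode_for_index(i):
    """Inode of the i-th stored item."""
    return FIRST_DATA_INODE + i

def index_for_inode(inode, count):
    """Index of the stored item behind inode, or None if out of range."""
    i = inode - FIRST_DATA_INODE
    return i if 0 <= i < count else None

def index_for_name(name, count):
    """Parse b'datN' into N; None for other names or out-of-range N."""
    match = re.fullmatch(rb'dat([0-9]+)', name)
    if match is None:
        return None
    n = int(match.group(1))
    # Reject non-canonical spellings such as b'dat007'
    if name != b'dat%d' % n or n >= count:
        return None
    return n

print(index_for_name(b'dat2', 3), index_for_name(b'notes', 3))
```

In a real handler, a None result would be the place to raise the ENOENT error instead of letting an exception from int() escape.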

This allows us to write without problems:

% ll Mountpoint/output
total 0
% echo "Hello 1" >>Mountpoint/input
% echo "Hello 2" >>Mountpoint/input
% echo "Hello 3" >>Mountpoint/input
% ll Mountpoint/output
total 2
----------. 1 akiross akiross 8 Jan  1  1970 dat0
----------. 1 akiross akiross 8 Jan  1  1970 dat1
----------. 1 akiross akiross 8 Jan  1  1970 dat2
%

But still we cannot read those files: if we try, the result is empty. Let's fix that:

def open(self, inode, flags):
	print("Requested opening of", inode, "with flags", flags)
	if inode == llfuse.ROOT_INODE+2:
		return 1 # A different handler
	elif inode >= 4:
		return 100 + inode # Handlers 104 and above map to stored data
	raise llfuse.FUSEError(errno.ENOENT)

def read(self, fh, off, size):
	print("Requested to read", fh, off, size)
	if fh >= 104:
		# Return only the requested slice of the stored data
		return self._data[fh - 104][off:off+size]
	return b''
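
A read request carries an offset and a size, so the handler is generally expected to answer with just that slice of the file. Here is the idea in plain Python, with no llfuse involved; read_all is a hypothetical caller I made up, which keeps asking for the next chunk until it gets an empty answer.

```python
def read(data, off, size):
    """Return at most size bytes of data, starting at offset off."""
    return data[off:off + size]

def read_all(data, chunk_size):
    """Hypothetical caller: requests chunk after chunk until nothing is left."""
    chunks, off = [], 0
    while True:
        chunk = read(data, off, chunk_size)
        if not chunk:
            return chunks
        chunks.append(chunk)
        off += len(chunk)

chunks = read_all(b'Hello 1\n', 3)
print(chunks)  # [b'Hel', b'lo ', b'1\n']
```

Slicing past the end of a bytes object simply gives a shorter (or empty) result, which is exactly the end-of-file behaviour a reader expects.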

And that's it: a simple, yet effective, virtual interface with files and directories. Of course, you can control your robot with this.

Few notes

This code is silly and horrible. It's really a mess, but I don't care: it's code made for a tutorial, trying to keep it lean and mean. Of course, this code can be rewritten better, fixing dates, fixing permissions, fixing everything.

Not being an expert, I may have made a few (or lots of) mistakes, especially conceptual ones. In that case, please feel free to report, explain, comment. I am gladly learning about this topic (which I have wanted to study forever), even if the documentation (for the low-level API) is not that comprehensive. This is why I made this tutorial.

Complete source for nothingfs.py

#!/usr/bin/env python3

import os
import sys
import stat
import errno
import llfuse

class NothingOps(llfuse.Operations):
	def __init__(self):
		super().__init__()

	def getattr(self, inode):
		if inode != llfuse.ROOT_INODE:
			raise llfuse.FUSEError(errno.ENOENT)

		entry = llfuse.EntryAttributes()
		entry.st_mode = stat.S_IFDIR
		entry.st_ino = inode # inode
		entry.st_nlink = 1
		entry.st_uid = os.getuid() # Process UID
		entry.st_gid = os.getgid() # Process GID
		entry.st_rdev = 0
		entry.st_size = 0 # no size
		entry.st_blksize = 1
		entry.st_blocks = 1
		entry.generation = 0
		entry.attr_timeout = 1
		entry.entry_timeout = 1
		entry.st_atime_ns = 0 # Access time (ns), 1 Jan 1970
		entry.st_ctime_ns = 0 # Change time (ns)
		entry.st_mtime_ns = 0 # Modification time (ns)
		return entry

	def opendir(self, inode):
		# print("Opendir got inode", inode)
		if inode == llfuse.ROOT_INODE:
			# print(" Root inode! Returning handler 1")
			return 0
		raise llfuse.FUSEError(errno.ENOENT)

	def readdir(self, fh, off):
		# print("Readdir got args", fh, off)
		if fh == 0:
			if off == 0:
				yield (b'.', self.getattr(llfuse.ROOT_INODE), 1)
			if off == 1:
				yield (b'..', self.getattr(llfuse.ROOT_INODE), 2)

	# def releasedir(self, fh):
	# 	print("Releasedir got handler", fh)

if __name__ == '__main__':
	print("Mounting to directory", sys.argv[1])
	ops = NothingOps()
	llfuse.init(ops, sys.argv[1], ['fsname=nothingfs'])
	try:
		llfuse.main()
	except:
		llfuse.close(unmount=False)
		raise
	llfuse.close()

Complete source for WriterFS

#!/usr/bin/env python3

import os
import sys
import stat
import errno
import llfuse

class WriterOps(llfuse.Operations):
	def __init__(self):
		super().__init__()
		self._data = []

	def getattr(self, inode):
		r = llfuse.ROOT_INODE
		entry = llfuse.EntryAttributes()
		if inode == r or inode == r+1:
			# it's a directory, mode 000
			entry.st_mode = stat.S_IFDIR
			entry.st_size = 0 # no size
		elif inode == r+2:
			# it's a regular file, writable
			entry.st_mode = stat.S_IFREG
			entry.st_size = 0
		elif inode >= r+3:
			print(" Getting data for", inode-r-3)
			entry.st_mode = stat.S_IFREG
			entry.st_size = len(self._data[inode-r-3])
		else:
			raise llfuse.FUSEError(errno.ENOENT)
		entry.st_ino = inode # inode
		entry.st_nlink = 1
		entry.st_uid = os.getuid() # Process UID
		entry.st_gid = os.getgid() # Process GID
		entry.st_rdev = 0
		entry.st_blksize = 1
		entry.st_blocks = 1
		entry.generation = 0
		entry.attr_timeout = 1
		entry.entry_timeout = 1
		entry.st_atime_ns = 0 # Access time (ns), 1 Jan 1970
		entry.st_ctime_ns = 0 # Change time (ns)
		entry.st_mtime_ns = 0 # Modification time (ns)
		return entry

	def lookup(self, p_inode, name):
		print("Looking up", name, p_inode)
		if p_inode == llfuse.ROOT_INODE and name == b'output':
			return self.getattr(llfuse.ROOT_INODE+1)
		elif p_inode == llfuse.ROOT_INODE and name == b'input':
			return self.getattr(llfuse.ROOT_INODE+2)
		elif p_inode == llfuse.ROOT_INODE+1:
			n = int(name.decode()[3:])
			if 0 <= n < len(self._data):
				return self.getattr(llfuse.ROOT_INODE+3+n)
		raise llfuse.FUSEError(errno.ENOENT)

	def opendir(self, inode):
		print("Opendir got inode", inode)
		if inode == llfuse.ROOT_INODE:
			print(" Root inode!")
			return 0
		if inode == llfuse.ROOT_INODE+1:
			print(" Output inode!")
			return 2
		raise llfuse.FUSEError(errno.ENOENT)

	def readdir(self, fh, off):
		print("Readdir got args", fh, off)
		if fh == 0:
			if off == 0:
				yield (b'.', self.getattr(llfuse.ROOT_INODE), 1)
			if off == 1:
				yield (b'..', self.getattr(llfuse.ROOT_INODE), 2)
			if off == 2:
				yield (b'output', self.getattr(llfuse.ROOT_INODE+1), 3)
			if off == 3:
				yield (b'input', self.getattr(llfuse.ROOT_INODE+2), 4)
		if fh == 2:
			print("Listing files", self._data[off:])
			# Start at off so names and offsets stay correct on resumed calls
			for i in range(off, len(self._data)):
				name = 'dat{}'.format(i)
				attr = self.getattr(llfuse.ROOT_INODE+3+i)
				yield (name.encode(), attr, i+1)

	def open(self, inode, flags):
		print("Requested opening of", inode, "with flags", flags)
		if inode == llfuse.ROOT_INODE+2:
			return 1 # A different handler
		elif inode >= 4:
			return 100 + inode # Handlers 104 and above map to stored data
		raise llfuse.FUSEError(errno.ENOENT)

	def read(self, fh, off, size):
		print("Requested to read", fh, off, size)
		if fh >= 104:
			# Return only the requested slice of the stored data
			return self._data[fh - 104][off:off+size]
		return b''

	def write(self, fh, off, buff):
		print('Writing', fh, off, buff)
		self._data.append(buff)
		return len(buff)

	def flush(self, fh):
		print("Flushing", fh)

	# def fsync(self, fh, datasync):
	# 	print("Fsync", fh, datasync)

	def release(self, fh):
		print("Releasing handler", fh)

	# def releasedir(self, fh):
	# 	print("Releasedir got handler", fh)

if __name__ == '__main__':
	print("Mounting to directory", sys.argv[1])
	ops = WriterOps()
	llfuse.init(ops, sys.argv[1], ['fsname=writerfs'])
	try:
		llfuse.main()
	except:
		llfuse.close(unmount=False)
		raise
	llfuse.close()