Discussion:
[fuse-devel] When does fuse FORGET?
David Sklar
2007-07-18 15:53:24 UTC
I am seeing filesystem process memory size grow as my filesystem deals
with an increasing number of distinct files. I am able to reproduce
this memory growth by mounting fusexmp_fh on /example-mount and then
doing, e.g. ls -lR /example-mount/usr. The first time I issue such a
command, process memory grows by about 28M (ls -l /usr | wc -l reports
161861), but if I then issue the same command again, process memory
stays constant. I am running fusexmp_fh as root with the command
"fusexmp_fh -o allow_other -o attr_timeout=0 -o entry_timeout=0
/example-mount". I see essentially the same results with fuse 2.6.5
and fuse 2.7.0.

Browsing through the code, my guess on the likely cause of this is the
userspace name_table and/or id_table filling up with info about all of
the entries I am asking for. It seems those tables only shrink when
FORGET requests are issued.

What causes FORGET requests to be issued? Can I force it? My
filesystem is potentially handling tens of millions of files (but not
all at once :) but I'd like to be able to put an upper bound on the
amount of memory it would consume while running.

Alternatively, if I'm totally off on what the memory consumption is
due to, any pointers in the correct direction would be appreciated.

Thanks,
David
Miklos Szeredi
2007-07-18 16:13:16 UTC
Post by David Sklar
I am seeing filesystem process memory size grow as my filesystem deals
with an increasing number of distinct files. I am able to reproduce
this memory growth by mounting fusexmp_fh on /example-mount and then
doing, e.g. ls -lR /example-mount/usr. The first time I issue such a
command, process memory grows by about 28M (ls -l /usr | wc -l reports
161861), but if I then issue the same command again, process memory
stays constant. I am running fusexmp_fh as root with the command
"fusexmp_fh -o allow_other -o attr_timeout=0 -o entry_timeout=0
/example-mount". I see essentially the same results with fuse 2.6.5
and fuse 2.7.0.
Browsing through the code, my guess on the likely cause of this is the
userspace name_table and/or id_table filling up with info about all of
the entries I am asking for. It seems those tables only shrink when
FORGET requests are issued.
Yes.
Post by David Sklar
What causes FORGET requests to be issued?
When the kernel shrinks the dcache (directory entry cache).
Post by David Sklar
Can I force it?
mount -i -oremount /tmp/fuse
Post by David Sklar
My filesystem is potentially handling tens of millions of files (but
not all at once :) but I'd like to be able to put an upper bound on
the amount of memory it would consume while running.
When there's memory pressure the kernel will start cleaning the
caches. So (depending on the amount of free memory) there will be a
point, when the memory use of the filesystem stops growing.

OTOH, if the entry timeout is set to a small value (like the 1sec
default) the fuse kernel module could collect the stale entries, and
so free up some memory in the kernel and the userspace filesystem as
well.
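
A minimal sketch of the bookkeeping discussed above, assuming a
simplified model: every LOOKUP reply bumps a per-node lookup count, and a
FORGET carrying a count of N subtracts N, with the node freed only when
the count reaches zero. The names toy_table, toy_lookup and toy_forget
are hypothetical; this is not libfuse's name_table/id_table code, and the
linked list only stands in for the real hash tables.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one entry in a userspace node table. */
struct toy_node {
    uint64_t ino;          /* node id handed to the kernel */
    uint64_t nlookup;      /* lookups the kernel still remembers */
    struct toy_node *next;
};

struct toy_table {
    struct toy_node *head;
    size_t count;          /* live nodes, i.e. memory currently held */
};

/* Record one LOOKUP reply for ino. Returns 0, or -1 if out of memory. */
int toy_lookup(struct toy_table *t, uint64_t ino)
{
    struct toy_node *n;

    assert(t != NULL);
    for (n = t->head; n != NULL; n = n->next) {
        if (n->ino == ino) {
            n->nlookup++;          /* known node: no new allocation */
            return 0;
        }
    }
    n = calloc(1, sizeof(*n));
    if (n == NULL)
        return -1;
    n->ino = ino;
    n->nlookup = 1;
    n->next = t->head;
    t->head = n;
    t->count++;
    return 0;
}

/* Handle FORGET(ino, nlookup): free the node once its count hits zero. */
void toy_forget(struct toy_table *t, uint64_t ino, uint64_t nlookup)
{
    struct toy_node **pp;

    assert(t != NULL);
    for (pp = &t->head; *pp != NULL; pp = &(*pp)->next) {
        struct toy_node *n = *pp;

        if (n->ino != ino)
            continue;
        if (nlookup >= n->nlookup) {   /* kernel dropped its last lookup */
            *pp = n->next;
            free(n);
            t->count--;
        } else {
            n->nlookup -= nlookup;
        }
        return;
    }
}
```

In this model a second lookup of an already-known inode allocates
nothing, which fits the observation that repeating the same ls -lR does
not grow memory further. The table shrinks only when a FORGET brings a
node's count down to zero.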

Miklos
Anand Avati
2007-07-25 06:54:21 UTC
Post by Miklos Szeredi
Post by David Sklar
What causes FORGET requests to be issued?
When the kernel shrinks the dcache (directory entry cache).
Post by David Sklar
Can I force it?
mount -i -oremount /tmp/fuse
Did you mean that this forces the kernel to shrink the dcache of the
mountpoint? This doesn't seem to be the case with me, in fact not even a
single FORGET comes when I do a 'mount -i -oremount /tmp/fuse'. Or did you
mean something else with 'mount -i -oremount'?

thanks,
avati
--
Anand V. Avati
Miklos Szeredi
2007-07-25 07:26:45 UTC
Post by Anand Avati
Post by Miklos Szeredi
Post by David Sklar
What causes FORGET requests to be issued?
When the kernel shrinks the dcache (directory entry cache).
Post by David Sklar
Can I force it?
mount -i -oremount /tmp/fuse
Did you mean that this forces the kernel to shrink the dcache of the
mountpoint?
Yes.
Post by Anand Avati
This doesn't seem to be the case with me, in fact not even a single
FORGET comes when I do a 'mount -i -oremount /tmp/fuse'.
Were there any LOOKUPs before that?

Miklos
Miklos Szeredi
2007-07-25 07:29:06 UTC
Post by Miklos Szeredi
Post by Anand Avati
This doesn't seem to be the case with me, in fact not even a single
FORGET comes when I do a 'mount -i -oremount /tmp/fuse'.
Were there any LOOKUPs before that?
Also, objects which are still referenced (open file, CWD, mmap) won't
be FORGOTten.

Miklos
Anand Avati
2007-07-25 07:52:05 UTC
Post by Miklos Szeredi
Post by Anand Avati
Post by Miklos Szeredi
Post by David Sklar
Can I force it?
mount -i -oremount /tmp/fuse
Did you mean that this forces the kernel to shrink the dcache of the
mountpoint?
Yes.
Great, this was exactly what I was looking for, but this doesn't seem to
work.
Post by Anand Avati
This doesn't seem to be the case with me, in fact not even a single
FORGET comes when I do a 'mount -i -oremount /tmp/fuse'.
Were there any LOOKUPs before that?
Yes, there is a big dir structure of a few thousand files/dirs. On this I
did an ls -lR. I did a cd / out of the mountpoint, and then a 'mount -i
-oremount /mnt' (/mnt is the mountpoint) and I do not get even a single
FORGET. I tried even cp -a'ing a fresh dir structure into the mountpoint, cd
/, and remount, still no FORGETs. Does sending FORGET on remount depend on
some INIT option in libfuse? (I am using lowlevel interface)

thanks,
avati
--
Anand V. Avati
Anand Avati
2007-07-25 07:54:40 UTC
FWIW,
the mount line from 'strace mount -ni -o remount /mnt'

mount("glusterfs", "/mnt", "fuse", MS_NOSUID|MS_NODEV|MS_REMOUNT|0xc0ed0000,
0x805c990) = 0

does MS_NODEV cause VFS not to shrink the dcache?

avati
--
Anand V. Avati
Miklos Szeredi
2007-07-25 08:01:12 UTC
Post by Anand Avati
the mount line from 'strace mount -ni -o remount /mnt'
mount("glusterfs", "/mnt", "fuse", MS_NOSUID|MS_NODEV|MS_REMOUNT|0xc0ed0000,
0x805c990) = 0
does MS_NODEV cause VFS not to shrink the dcache?
No, it shrinks the dcache unconditionally. What I hadn't realized is
that it doesn't shrink the icache as well, which is the cause of the
problem, I think.

Miklos
Anand Avati
2007-07-25 08:06:52 UTC
Post by Miklos Szeredi
Post by Anand Avati
the mount line from 'strace mount -ni -o remount /mnt'
mount("glusterfs", "/mnt", "fuse", MS_NOSUID|MS_NODEV|MS_REMOUNT|0xc0ed0000,
0x805c990) = 0
does MS_NODEV cause VFS not to shrink the dcache?
No, it shrinks the dcache unconditionally. What I hadn't realized is
that it doesn't shrink the icache as well, which is the cause of the
problem, I think.
Do you mean upgrading to 2.6.22 should make it work? Or should it work with
2.6.18 itself?


avati
--
Anand V. Avati
Miklos Szeredi
2007-07-25 08:19:28 UTC
Post by Anand Avati
Post by Miklos Szeredi
Post by Anand Avati
the mount line from 'strace mount -ni -o remount /mnt'
mount("glusterfs", "/mnt", "fuse", MS_NOSUID|MS_NODEV|MS_REMOUNT|0xc0ed0000,
0x805c990) = 0
does MS_NODEV cause VFS not to shrink the dcache?
No, it shrinks the dcache unconditionally. What I hadn't realized is
that it doesn't shrink the icache as well, which is the cause of the
problem, I think.
Do you mean upgrading to 2.6.22 should make it work?
I think so.
Post by Anand Avati
Or should it work with 2.6.18 itself?
I only tested with 2.6.22, and it works. And if my theory is correct
it should not work with 2.6.18. But I may be wrong.

Also the change is very simple. If you don't want to upgrade, then
you can just apply this patch, which should fix the issue.

Miklos

Index: linux/fs/fuse/inode.c
===================================================================
--- linux.orig/fs/fuse/inode.c 2007-05-18 11:32:21.000000000 +0200
+++ linux/fs/fuse/inode.c 2007-05-18 11:47:36.000000000 +0200
@@ -470,6 +470,7 @@ static const struct super_operations fus
 	.destroy_inode  = fuse_destroy_inode,
 	.read_inode	= fuse_read_inode,
 	.clear_inode	= fuse_clear_inode,
+	.drop_inode	= generic_delete_inode,
 	.remount_fs	= fuse_remount_fs,
 	.put_super	= fuse_put_super,
 	.umount_begin	= fuse_umount_begin,
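
To illustrate why this one-line change could matter, here is a toy model
in plain C (not kernel code) of the theory described in this thread: a
FORGET goes out only when the inode itself leaves the inode cache, so
shrinking the dcache alone is not enough unless the inode is dropped
together with its last reference. All names (toy_inode, toy_iput,
toy_shrink_dcache, toy_shrink_icache, the drop_policy values) are
hypothetical, and the model assumes one dentry reference per inode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What to do with an inode when its last reference goes away. */
enum drop_policy {
    KEEP_UNUSED_INODE,     /* models the behaviour without the patch */
    DELETE_ON_LAST_PUT     /* models .drop_inode = generic_delete_inode */
};

struct toy_inode {
    int refcount;          /* references from dentries, open files, ... */
    bool cached;           /* still present in the inode cache */
    int forgets_sent;      /* FORGET requests sent to userspace */
};

/* Drop one reference; the policy decides what happens at zero. */
void toy_iput(struct toy_inode *inode, enum drop_policy policy)
{
    assert(inode != NULL);
    if (inode->refcount == 0)
        return;
    if (--inode->refcount > 0)
        return;                    /* still referenced elsewhere */
    if (policy == DELETE_ON_LAST_PUT) {
        inode->cached = false;     /* evicted immediately */
        inode->forgets_sent++;
    }
    /* KEEP_UNUSED_INODE: the inode lingers in the cache, no FORGET yet. */
}

/* Models a dcache shrink: the dentry releases its inode reference. */
void toy_shrink_dcache(struct toy_inode *inode, enum drop_policy policy)
{
    toy_iput(inode, policy);
}

/* Models a later icache shrink, e.g. under memory pressure. */
void toy_shrink_icache(struct toy_inode *inode)
{
    assert(inode != NULL);
    if (inode->cached && inode->refcount == 0) {
        inode->cached = false;
        inode->forgets_sent++;
    }
}
```

Under KEEP_UNUSED_INODE, toy_shrink_dcache leaves the inode cached and
sends nothing, which would match a remount producing no FORGETs on
2.6.18 if the theory is right. Under DELETE_ON_LAST_PUT the FORGET goes
out as soon as the last reference is dropped. An inode that still has a
second reference (an open file, say) stays cached in both cases.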
Miklos Szeredi
2007-07-25 07:59:42 UTC
Post by Anand Avati
Yes, there is a big dir structure of a few thousand files/dirs. On this I
did an ls -lR. I did a cd / out of the mountpoint, and then a 'mount -i
-oremount /mnt' (/mnt is the mountpoint) and I do not get even a single
FORGET. I tried even cp -a'ing a fresh dir structure into the mountpoint, cd
/, and remount, still no FORGETs. Does sending FORGET on remount depend on
some INIT option in libfuse? (I am using lowlevel interface)
I think I know. There was a recent change in the fuse module, that
made the kernel drop the inode, when the dentry was dropped. This was
added in linux-2.6.22. If you have an earlier kernel, this may not
work.

Miklos
Anand Avati
2007-07-25 08:05:27 UTC
Post by Miklos Szeredi
I think I know. There was a recent change in the fuse module, that
made the kernel drop the inode, when the dentry was dropped. This was
added in linux-2.6.22. If you have an earlier kernel, this may not
work.
I'm running on 2.6.18. I am going to upgrade to 2.6.22 soon, I will check
how it works with that.

thanks,
avati
--
Anand V. Avati
Jan Engelhardt
2007-07-25 08:35:20 UTC
Post by Anand Avati
Post by Miklos Szeredi
Post by David Sklar
What causes FORGET requests to be issued?
When the kernel shrinks the dcache (directory entry cache).
Post by David Sklar
Can I force it?
mount -i -oremount /tmp/fuse
Did you mean that this forces the kernel to shrink the dcache of the
mountpoint?
Yes.
But only in FUSE, isn't it?


Jan
--
Miklos Szeredi
2007-07-25 08:46:05 UTC
Post by Jan Engelhardt
Post by Anand Avati
Post by Miklos Szeredi
Post by David Sklar
What causes FORGET requests to be issued?
When the kernel shrinks the dcache (directory entry cache).
Post by David Sklar
Can I force it?
mount -i -oremount /tmp/fuse
Did you mean that this forces the kernel to shrink the dcache of the
mountpoint?
Yes.
But only in FUSE, isn't it?
No, it does it in all cases:

| int do_remount_sb(struct super_block *sb, int flags, void *data, int force)
| {
| 	...
| 	shrink_dcache_sb(sb);

Miklos

Gerard on the Road
2007-07-18 19:46:29 UTC
<snip most of David's thread>
OTOH, if the entry timeout is set to a small value (like the 1sec
default) the fuse kernel module could collect the stale entries, and
so free up some memory in the kernel and the userspace filesystem as
well.
Miklos
Hi Miklos,

Are those FORGETs available to the client fuse system?
If they are, then does FUSE guarantee there are no threads in the tree
below the Forget?

Thanks,

Gerard