[fuse-devel] fuse: when are release requests queued?
David Butterfield
2017-05-26 15:17:51 UTC
Looking at fs/fuse/file.c, it looks as if fuse_release() directly calls
fuse_request_send_background() to send the request. But at that point I
can no longer follow the code. Is it possible for another request to
sneak in at this point?
Furthermore, does the VFS call fuse_release() directly while handling
the close() syscall, or does this happen asynchronously later on?
Does the comment in fuse_release_common (called by fuse_release)
(Linux 4.4.0) answer this?

/*
 * Normally this will send the RELEASE request, however if
 * some asynchronous READ or WRITE requests are outstanding,
 * the sending will be delayed.
 *
 * Make the release synchronous if this is a fuseblk mount,
 * synchronous RELEASE is allowed (and desirable) in this case
 * because the server can be trusted not to screw up.
 */
Nikolaus Rath
2017-05-26 23:11:05 UTC
Post by David Butterfield
Looking at fs/fuse/file.c, it looks as if fuse_release() directly calls
fuse_request_send_background() to send the request. But at that point I
can no longer follow the code. Is it possible for another request to
sneak in at this point?
Furthermore, does the VFS call fuse_release() directly while handling
the close() syscall, or does this happen asynchronously later on?
Does the comment in fuse_release_common (called by fuse_release)
(Linux 4.4.0) answer this?
/*
 * Normally this will send the RELEASE request, however if
 * some asynchronous READ or WRITE requests are outstanding,
 * the sending will be delayed.
 *
 * Make the release synchronous if this is a fuseblk mount,
 * synchronous RELEASE is allowed (and desirable) in this case
 * because the server can be trusted not to screw up.
 */
It does give some indication, but I'd rather have someone familiar with the
actual code confirm this.

Specifically, this says that if async read()/write() requests are
pending, the RELEASE will be delayed. But does this guarantee that
if there are no pending requests it will not be delayed? And how
can there be a pending request if the file isn't open anymore?

Best,
-Nikolaus
--
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

»Time flies like an arrow, fruit flies like a Banana.«
Maxim Patlasov
2017-05-27 01:49:14 UTC
Post by Nikolaus Rath
Post by David Butterfield
Looking at fs/fuse/file.c, it looks as if fuse_release() directly calls
fuse_request_send_background() to send the request. But at that point I
can no longer follow the code. Is it possible for another request to
sneak in at this point?
Furthermore, does the VFS call fuse_release() directly while handling
the close() syscall, or does this happen asynchronously later on?
Does the comment in fuse_release_common (called by fuse_release)
(Linux 4.4.0) answer this?
/*
 * Normally this will send the RELEASE request, however if
 * some asynchronous READ or WRITE requests are outstanding,
 * the sending will be delayed.
 *
 * Make the release synchronous if this is a fuseblk mount,
 * synchronous RELEASE is allowed (and desirable) in this case
 * because the server can be trusted not to screw up.
 */
It does give some indication, I'd rather have someone familiar with the
actual code confirm this.
Specifically, this says that if async read()/write() requests are
pending, the RELEASE will be delayed. But does this guarantee that
if there are no pending requests it will not be delayed?
If nothing is pending, it will go to the pending queue immediately. But this
doesn't guarantee that userspace fetches it before fuse_release() returns.
Post by Nikolaus Rath
And how
can there be a pending request if the file isn't open anymore?
I think the comment tells us about pending requests in general, not
specifically those for that given file.
Post by Nikolaus Rath
Best,
-Nikolaus
Maxim Patlasov
2017-05-27 01:39:56 UTC
Hello,
I am trying to debug a sporadic test failure in libfuse
(https://github.com/libfuse/libfuse/issues/157).
Can someone tell me at which point the fuse kernel module will send a
RELEASE request to userspace?
Anytime after fuse_release(). It only puts the request on the background queue.
Later, the request is transferred to the pending queue. And later still,
userspace fetches it via fuse_dev_do_read().
Is it possible that this is delayed until
after the close() syscall for the last fd has returned and userspace has
submitted a different fuse request for the same fs?
I think it's possible. See how flush_bg_queue() does nothing if
fc->active_background > fc->max_background.
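The queueing described in this reply can be illustrated with a small userspace sketch in C. It is a toy model, not kernel code: `struct conn_model`, `send_background()`, `send_foreground()`, `background_done()` and `run_release_race()` are hypothetical names, and the loop condition in `flush_bg_queue_model()` is an assumption of the model, so check flush_bg_queue() in your own kernel tree for the exact comparison. The only ideas taken from the discussion are that RELEASE goes to a background queue first and that it is promoted to the pending queue only while the number of active background requests is below the limit.

```c
#include <assert.h>
#include <string.h>

#define QCAP 16  /* arbitrary capacity for this toy model */

/* Toy stand-in for the connection state; field names follow the thread. */
struct conn_model {
    unsigned max_background;
    unsigned active_background;
    const char *bg_queue[QCAP];
    unsigned bg_len;
    const char *pending[QCAP];
    unsigned pending_len;
};

static void push_pending(struct conn_model *fc, const char *op)
{
    assert(fc->pending_len < QCAP);
    fc->pending[fc->pending_len++] = op;
}

/* Promote queued background requests while below the limit (model assumption). */
static void flush_bg_queue_model(struct conn_model *fc)
{
    while (fc->active_background < fc->max_background && fc->bg_len > 0) {
        push_pending(fc, fc->bg_queue[0]);
        fc->bg_len--;
        memmove(&fc->bg_queue[0], &fc->bg_queue[1],
                fc->bg_len * sizeof(fc->bg_queue[0]));
        fc->active_background++;
    }
}

/* Background requests (async WRITE, RELEASE) pass through bg_queue first. */
void send_background(struct conn_model *fc, const char *op)
{
    assert(fc->bg_len < QCAP);
    fc->bg_queue[fc->bg_len++] = op;
    flush_bg_queue_model(fc);
}

/* Foreground requests (for example UNLINK) go straight to the pending queue. */
void send_foreground(struct conn_model *fc, const char *op)
{
    push_pending(fc, op);
}

/* The server finished one background request, which frees a slot. */
void background_done(struct conn_model *fc)
{
    assert(fc->active_background > 0);
    fc->active_background--;
    flush_bg_queue_model(fc);
}

/* Two slow writes fill the limit, so RELEASE is overtaken by UNLINK. */
void run_release_race(struct conn_model *fc)
{
    fc->max_background = 2;
    send_background(fc, "WRITE#1");
    send_background(fc, "WRITE#2");  /* limit reached */
    send_background(fc, "RELEASE");  /* stays on bg_queue */
    send_foreground(fc, "UNLINK");   /* reaches pending first */
    background_done(fc);             /* only now RELEASE is promoted */
}
```

Starting from a zero-initialized `struct conn_model` and calling `run_release_race()`, the model's pending queue holds UNLINK ahead of RELEASE, which is the ordering asked about in the question.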
Looking at fs/fuse/file.c, it looks as if fuse_release() directly calls
fuse_request_send_background() to send the request. But at that point I
can no longer follow the code. Is it possible for another request to
sneak in at this point?
Yes, but not for that given fuse_file that we're closing now.
Furthermore, does the VFS call fuse_release() directly while handling
the close() syscall, or does this happen asynchronously later on?
It's called directly for well-behaved applications in a well-controlled
environment, but there are some exceptions. You may be interested in
reading https://sourceforge.net/p/fuse/mailman/message/32872225/
Thanks!
-Nikolaus
Nikolaus Rath
2017-05-29 16:49:36 UTC
Hi Maxim,
Post by Maxim Patlasov
Hello,
I am trying to debug a sporadic test failure in libfuse
(https://github.com/libfuse/libfuse/issues/157).
Can someone tell me at which point the fuse kernel module will send a
RELEASE request to userspace?
Anytime after fuse_release(). It only puts request to background
queue. Later, the request will be transferred to pending queue. And
later, the userspace will fetch it by fuse_dev_do_read().
Is it possible that this is delayed until
after the close() syscall for the last fd has returned and userspace has
submitted a different fuse request for the same fs?
I think it's possible. See how flush_bg_queue() do nothing if
fc->active_background > fc->max_background.
Thanks Maxim! Not sure what I'd do with these issues without you :-).


Is there a way to deliberately trigger this behavior for debugging? For
example, is there a kernel equivalent of sleep(1) that I could put into
fuse_release()?
Post by Maxim Patlasov
Looking at fs/fuse/file.c, it looks as if fuse_release() directly calls
fuse_request_send_background() to send the request. But at that point I
can no longer follow the code. Is it possible for another request to
sneak in at this point?
Yes, but not for that given fuse_file that we're closing now.
I assume that a fuse_file refers to the (formerly) opened file, right?
So e.g. an unlink() request for the same directory entry could still go
through before RELEASE has been transferred to the pending queue?


Best,
-Nikolaus
Maxim Patlasov
2017-05-31 17:50:57 UTC
Post by Nikolaus Rath
Hi Maxim,
Post by Maxim Patlasov
Hello,
I am trying to debug a sporadic test failure in libfuse
(https://github.com/libfuse/libfuse/issues/157).
Can someone tell me at which point the fuse kernel module will send a
RELEASE request to userspace?
Anytime after fuse_release(). It only puts request to background
queue. Later, the request will be transferred to pending queue. And
later, the userspace will fetch it by fuse_dev_do_read().
Is it possible that this is delayed until
after the close() syscall for the last fd has returned and userspace has
submitted a different fuse request for the same fs?
I think it's possible. See how flush_bg_queue() do nothing if
fc->active_background > fc->max_background.
Thanks Maxim! Not sure what I'd do with these issues without you :-).
Is there a way to deliberate trigger this behavior for debugging? For
example, is there a kernel equivalent of sleep(1) that I could put into
fuse_release()?
schedule_timeout_interruptible(HZ). But it's better to instrument the fuse
userspace to postpone processing some I/O requests. Then you'll keep
fc->active_background > fc->max_background for a while. During that
period fuse_release may succeed with FUSE_RELEASE queued but not yet passed
to userspace. Then you can try to sneak in another request -- something
not involving the fuse background queue.
Post by Nikolaus Rath
Post by Maxim Patlasov
Looking at fs/fuse/file.c, it looks as if fuse_release() directly calls
fuse_request_send_background() to send the request. But at that point I
can no longer follow the code. Is it possible for another request to
sneak in at this point?
Yes, but not for that given fuse_file that we're closing now.
I assume that a fuse_file refers to the (formerly) opened file, right?
So e.g. a unlink() request for the same directly entry could still go
through before RELEASE has been transferred to the pending queue?
Yes. But FUSE_UNLINK works on the file path; it doesn't depend on fuse_file
at all. Hence an unlink request can go through at any time.
Post by Nikolaus Rath
Best,
-Nikolaus
Nikolaus Rath
2017-05-31 19:19:19 UTC
Post by Maxim Patlasov
Post by Nikolaus Rath
Post by Maxim Patlasov
Can someone tell me at which point the fuse kernel module will send a
RELEASE request to userspace?
Anytime after fuse_release(). It only puts request to background
queue. Later, the request will be transferred to pending queue. And
later, the userspace will fetch it by fuse_dev_do_read().
Is it possible that this is delayed until
after the close() syscall for the last fd has returned and userspace has
submitted a different fuse request for the same fs?
I think it's possible. See how flush_bg_queue() do nothing if
fc->active_background > fc->max_background.
Thanks Maxim! Not sure what I'd do with these issues without you :-).
Is there a way to deliberate trigger this behavior for debugging? For
example, is there a kernel equivalent of sleep(1) that I could put into
fuse_release()?
schedule_timeout_interruptible(HZ).
Hmm. I made the following change in linux 4.10:

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 2401c5..3568a8 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -252,6 +252,9 @@ void fuse_release_common(struct file *file, int opcode)
 	if (unlikely(!ff))
 		return;
 
+	// Wait a little to force race condition in userspace
+	schedule_timeout_interruptible(1);
+
 	req = ff->reserved_req;
 	fuse_prepare_release(ff, file->f_flags, opcode);


But when doing e.g. "echo test > newfile", the RELEASE request still
comes right away (judging from the libfuse debugging output).

Do I need to do something else?
Post by Maxim Patlasov
But it's better to instrument fuse
userspace to postpone processing some i/o requests. Then you'll keep
fc->active_background > fc->max_background for a while. During that
period fuse_release may succeed with FUSE_RELEASE queued, but not
passed to the userspace. Then you cat try to sneak another request --
something not involving fuse background queue.
I don't know.. why is this better? It seems a lot more complicated. I
need to generate the extra request, add some switch to tell libfuse when
to start processing again, synchronize this with sneaking in the other
request...



Best,
-Nikolaus
Nikolaus Rath
2017-05-31 19:41:15 UTC
Post by Nikolaus Rath
Post by Maxim Patlasov
Post by Nikolaus Rath
Post by Maxim Patlasov
Can someone tell me at which point the fuse kernel module will send a
RELEASE request to userspace?
Anytime after fuse_release(). It only puts request to background
queue. Later, the request will be transferred to pending queue. And
later, the userspace will fetch it by fuse_dev_do_read().
Is it possible that this is delayed until
after the close() syscall for the last fd has returned and userspace has
submitted a different fuse request for the same fs?
I think it's possible. See how flush_bg_queue() do nothing if
fc->active_background > fc->max_background.
Thanks Maxim! Not sure what I'd do with these issues without you :-).
Is there a way to deliberate trigger this behavior for debugging? For
example, is there a kernel equivalent of sleep(1) that I could put into
fuse_release()?
schedule_timeout_interruptible(HZ).
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 2401c5..3568a8 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -252,6 +252,9 @@ void fuse_release_common(struct file *file, int opcode)
 	if (unlikely(!ff))
 		return;
 
+	// Wait a little to force race condition in userspace
+	schedule_timeout_interruptible(1);
+
 	req = ff->reserved_req;
 	fuse_prepare_release(ff, file->f_flags, opcode);
But when doing e.g. "echo test > newfile", the RELEASE request still
comes right away (judging from the libfuse debugging output).
Do I need to do something else?
Try HZ*10 instead of 1 as an argument of
schedule_timeout_interruptible.
Ok, now the RELEASE comes a lot later. But now userspace is also
blocking until RELEASE comes in.
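The difference between passing 1 and HZ*10 comes from units: the argument of schedule_timeout_interruptible() is a timeout in jiffies (timer ticks), and HZ is the number of ticks per second. The following userspace helper is only an arithmetic sketch; `timeout_ms()` is a hypothetical name, and the HZ values mentioned in the note after it are example configurations, not values taken from this thread.

```c
#include <assert.h>

/*
 * Convert a timeout given in jiffies to milliseconds for a tick rate of
 * `hz` ticks per second. Userspace arithmetic only, not a kernel API.
 */
unsigned long timeout_ms(unsigned long jiffies, unsigned long hz)
{
    assert(hz > 0);
    return jiffies * 1000UL / hz;
}
```

With an assumed HZ of 250, `timeout_ms(1, 250)` is 4 ms, which is easy to miss in the libfuse debug output, while `timeout_ms(250 * 10, 250)` is 10000 ms. Because the sleep sits inside fuse_release_common(), it lengthens the caller's close() rather than delaying only the delivery of RELEASE, which matches the blocking observed here.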
Post by Nikolaus Rath
Post by Maxim Patlasov
But it's better to instrument fuse
userspace to postpone processing some i/o requests. Then you'll keep
fc->active_background > fc->max_background for a while. During that
period fuse_release may succeed with FUSE_RELEASE queued, but not
passed to the userspace. Then you cat try to sneak another request --
something not involving fuse background queue.
I don't know.. why is this better? It seems a lot more complicated. I
need to generate the extra request, add some switch to tell libfuse when
to start processing again, synchronize this with sneaking in the other
request...
I thought it's better because it would trigger delayed processing of
FUSE_RELEASE: last __fput() succeeded, but fuse userspace will see
FUSE_RELEASE only later. Adding sleep to fuse_release_common would
only extend processing time of last __fput(), is that what you need?
I do not fully understand the difference you describe. What I would like
to construct is the following scenario:

1. Userspace calls close()
2. Userspace close() returns
3. Userspace calls unlink()
4. Userspace unlink() returns
5. libfuse reads UNLINK request from kernel pipe
6. libfuse reads RELEASE request from kernel pipe

What would be the simplest way to do that?

Thanks!
-Nikolaus
Michael Theall
2017-05-31 19:51:47 UTC
Won't unlink(2) block until the fuse server has responded? I'm pretty sure
the close(2) should come back after the fuse server responds to FLUSH. It
sounds like with your RELEASE delay in the kernel, you should get your
steps as described, but steps 4 and 5 must be swapped.

Regards,
Michael Theall
Nikolaus Rath
2017-05-31 20:33:30 UTC
Post by Michael Theall
Post by Nikolaus Rath
I do not fully understand the difference you describe. What I would like
to construct is the following scenario:
1. Userspace calls close()
2. Userspace close() returns
3. Userspace calls unlink()
4. Userspace unlink() returns
5. libfuse reads UNLINK request from kernel pipe
6. libfuse reads RELEASE request from kernel pipe
What would be the simplest way to do that?
Won't unlink(2) block until the fuse server has responded?
Yes, you are right. It should be:

1. Userspace calls close()
2. Userspace close() returns
3. Userspace calls unlink()
4. libfuse reads UNLINK request from kernel pipe
5. Userspace unlink() returns
6. libfuse reads RELEASE request from kernel pipe
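Seen from the application, steps 1 to 5 are just a close() followed by an unlink(). A minimal sketch of such a test client in C is shown here; `close_then_unlink()` is a hypothetical helper name, and the path passed to it is assumed to lie on the FUSE mount under test. Whether RELEASE really arrives after UNLINK (step 6) can only be observed on the server side, for example in the libfuse debug output.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Create a file, write to it, close it, then unlink it.
 * Returns 0 on success and -1 on the first failing call.
 */
int close_then_unlink(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) {
        perror("open");
        return -1;
    }
    if (write(fd, "test\n", 5) != 5) {
        perror("write");
        close(fd);
        return -1;
    }

    /* Steps 1-2: close() may return while RELEASE is still queued. */
    if (close(fd) != 0) {
        perror("close");
        return -1;
    }

    /* Steps 3-5: unlink() returns only after the server has answered. */
    if (unlink(path) != 0) {
        perror("unlink");
        return -1;
    }
    return 0;
}
```

Running this in a loop against the mount while the server logs the order of incoming requests is one way to look for the UNLINK-before-RELEASE ordering.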
Post by Michael Theall
I'm pretty sure
the close(2) should come back after the fuse server responds to FLUSH. It
sounds like with your RELEASE delay in the kernel, you should get your
steps as described, but steps 4 and 5 must be swapped.
No, the delay comes in between (1) and (2).

Best,
-Nikolaus
Nikolaus Rath
2017-05-31 20:31:08 UTC
Post by Nikolaus Rath
I do not fully understand the difference you describe. What I would like
to construct is the following scenario:
1. Userspace calls close()
2. Userspace close() returns
3. Userspace calls unlink()
4. Userspace unlink() returns
5. libfuse reads UNLINK request from kernel pipe
6. libfuse reads RELEASE request from kernel pipe
What would be the simplest way to do that?
I would try to keep fc->active_background elevated somehow. For
example, you add sleep(1) for every incoming write request to libfuse
and serialize processing them. Then you generate enough writes to
reach fc->max_background. If you call close() now, and if it really
ends up in the last __fput(), the corresponding FUSE_RELEASE will sit in
the background queue for a long while (as many seconds as there are
elements in the queue). But the close() from your step 2 will return much
earlier because it doesn't wait for completion of FUSE_RELEASE. Hence
unlink() might succeed.
Ah, got it now, thanks!

Wouldn't a simpler solution be to just patch the kernel module to
*always* put FUSE_RELEASE requests into the background queue, so that I
don't have to manually keep fc->active_background elevated?

I just can't seem to find the code that does this check... I would
expect it in fuse_file_put(), but the condition in there does not seem to
look at the number of background requests at all.


Best,
-Nikolaus
Maxim Patlasov
2017-05-31 20:47:58 UTC
Post by Nikolaus Rath
Post by Nikolaus Rath
I do not fully understand the difference you describe. What I would like
to construct is the following scenario:
1. Userspace calls close()
2. Userspace close() returns
3. Userspace calls unlink()
4. Userspace unlink() returns
5. libfuse reads UNLINK request from kernel pipe
6. libfuse reads RELEASE request from kernel pipe
What would be the simplest way to do that?
I would try to keep fc->active_background elevated somehow. For
example you add sleep(1) for every incoming write request to libfuse
and serialize processing them. Then you generate enough writes to
achieve fc->max_background. If you call close() now, and if it really
ends up in last __fput(), corresponding FUSE_RELEASE will sit in
background queue for long while (as many seconds as # elements in the
queue). But close() from your 2. will return much earlier because it
doesn't wait for completion of FUSE_RELEASE. Hence unlink() might
succeed.
Ah, got it now, thanks!
Wouldn't a simpler solution be to just patch the kernel module to
*always* put FUSE_RELEASE requests into the background queue, so that I
don't have to manually keep fc->active_background elevated?
I just can't seem to find the code that does this check... I would
expect it in fuse_file_put(), but the condition in there does not seem to
look at the number of background requests at all.
The decision is made at mount stage: it's either a fuseblk mount or not.
If it's not a fuseblk mount, the kernel always puts FUSE_RELEASE on the
background queue. And vice versa.

Keeping active_background elevated may help us win the race: you want
the unlink to be queued and processed before userspace reads FUSE_RELEASE
from the kernel.
Post by Nikolaus Rath
Best,
-Nikolaus