patches and low-level development discussion
 help / color / mirror / code / Atom feed
From: Demi Marie Obenour <demiobenour@gmail.com>
To: Stefano Garzarella <sgarzare@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>
Cc: "Eric Dumazet" <edumazet@google.com>,
	"Arseniy Krasnov" <avkrasnov@salutedevices.com>,
	"Bobby Eshleman" <bobbyeshleman@meta.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"David S . Miller" <davem@davemloft.net>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Simon Horman" <horms@kernel.org>,
	netdev@vger.kernel.org, eric.dumazet@gmail.com,
	"Arseniy Krasnov" <AVKrasnov@sberdevices.ru>,
	"Jason Wang" <jasowang@redhat.com>,
	"Xuan Zhuo" <xuanzhuo@linux.alibaba.com>,
	"Eugenio Pérez" <eperezma@redhat.com>,
	kvm@vger.kernel.org, virtualization@lists.linux.dev,
	"virtio-comment@lists.linux.dev" <virtio-comment@lists.linux.dev>,
	"Manuel Stoeckl" <code@mstoeckl.com>,
	"Alyssa Ross" <hi@alyssa.is>,
	"Spectrum Development" <devel@spectrum-os.org>,
	"systemd development" <systemd-devel@lists.freedesktop.org>,
	"Val Packett" <val@invisiblethingslab.com>
Subject: Re: [PATCH net] vsock/virtio: fix potential unbounded skb queue
Date: Sat, 23 May 2026 17:00:25 -0400	[thread overview]
Message-ID: <baf184f6-fefd-45ca-a437-1505533e3c81@gmail.com> (raw)
In-Reply-To: <af21ZGlANY6UUCKx@sgarzare-redhat>


[-- Attachment #1.1.1: Type: text/plain, Size: 4361 bytes --]

On 5/8/26 06:11, Stefano Garzarella wrote:
> On Fri, May 08, 2026 at 05:58:06AM -0400, Michael S. Tsirkin wrote:
>> On Fri, May 08, 2026 at 11:41:21AM +0200, Stefano Garzarella wrote:
>>> On Thu, May 07, 2026 at 06:48:47PM -0400, Michael S. Tsirkin wrote:
>>>> On Thu, May 07, 2026 at 02:59:13PM +0200, Stefano Garzarella wrote:
>>>>> On Thu, May 07, 2026 at 07:45:10AM -0400, Michael S. Tsirkin wrote:
>>>>>> On Thu, May 07, 2026 at 11:09:47AM +0200, Stefano Garzarella wrote:
>>>
>>> [...]
>>>
>>>>>>> For now, we're already doing something:
>>>>>>> merging the skuffs if they don't have EOM set.
>>>>>>
>>>>>>
>>>>>> Right that's good. You could go further and merge with EOM too
>>>>>> if you stick the info about message boundaries somewhere else.
>>>>>
>>>>> This adds a lot of complexity IMO, but we can try.
>>>>>
>>>>> Do you have something in mind?
>>>>
>>>> BER is clearly overkill but here's a POC that claude made for me,
>>>> just to give u an idea. It's clearly has a ton of issues,
>>>> for example I dislike how GFP_ATOMIC is handled.
>>>
>>> Okay, I somewhat understand, but clearly this isn't net material
>>
>> I doubt we have many other options given reverting the regression was
>> ruled out.
> 
> As Eric pointed out, we can't revert it.

Could there be an option to disable the mitigation in guests, for the
situation where the host is trusted?  There are VMMs that implement
AF_VSOCK in userspace with a backing AF_UNIX socket.

>>> so for now
>>> I think the best thing to do is to merge the fixup I sent (or something
>>> similar):
>>> https://lore.kernel.org/netdev/20260508092330.69690-1-sgarzare@redhat.com/
>>
>> I reviewed that one, problem is it's a spec violation/change that we'll
>> have to support forever.
> 
> I have a few points to make on this, but let's discuss them there.
> 
>>
>>> This is a major change that should be merged with more caution.
>>> Could this have too much of an impact on performance?
>>>
>>> Thanks,
>>> Stefano
>>
>> It's really a POC, real patch is left as an excersise for the reader 
>> :).
> 
> eheh, I see, but honestly, this overcomplication scares me. I'll try to 
> think it over.
> 
>> The correct approach IMHO is to only start using this
>> when we wasted a lot of memory on small packets.
>>
>> For example, if sum(truesize) >= buf size.
>>
>> then we'll not see a perf impact unless it's already pathological.
> 
> Agree on this, which is similar to what I'm doing in that patch.  
> Reducing the advertised buf_alloc only in pathological cases (e.g.  
> overhead > buf_alloc).

This isn't enough to prevent data loss due to race conditions.

I'm CCing the virtio-comment list and a few others.

Right now, any application that needs to send massive amount of
data over a vsock is simply broken.  This isn't just theoretical.
It's causing real-world problems for users of Waypipe.
Waypipe forwards Wayland protocol messages over AF_VSOCK,
so it can send a large amount of traffic over the socket.
See https://gitlab.freedesktop.org/mstoceckl/waypipe/work_items/165.

If one is willing to mutate the ring buffer in-place, or to maintain
an auxiliary counter, it's possible to store all messages with bounded
(in practice) overhead.  Specifically:

- If the first byte of a block of data is nonzero, it's a
  variable-length length.  1 byte for messages less than 128 bytes.

- If the first byte of a block of data is zero, the subsequent bytes
  are a variable-length counter that stores the number of consecutive
  zero-byte messages.

That adds a lot of complexity, which is very unfortunate for something
that needs to be backported to stable kernels.  I also suspect it
requires all access to the ring buffer to take a lock rather than
being lock-free.  But it's the only approach that I can think of that
can work with the current spec.

Could there at least be a normative note stating that drivers and
devices should treat each message as consuming 1024 bytes + the size
of the message itself, and warning that anything that doesn't is
going to be broken in practice?

I'm CCing Val Packet (of Invisible Things Lab) and Alyssa Ross
(of Spectrum) because both of them are working on systems that rely
critically on vsock.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 7253 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

       reply	other threads:[~2026-05-23 21:00 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <afn0ZdvZWswBuDMm@sgarzare-redhat>
     [not found] ` <CANn89iLs8DOWJwDpf_ARoMrV+6b2tbhEJ=VVzeC8gCm5dRGaig@mail.gmail.com>
     [not found]   ` <afoF_cHfl6ygcupM@sgarzare-redhat>
     [not found]     ` <20260506113554-mutt-send-email-mst@kernel.org>
     [not found]       ` <afxVH2So9BbZ3Gta@sgarzare-redhat>
     [not found]         ` <20260507074113-mutt-send-email-mst@kernel.org>
     [not found]           ` <afyMCyBvZpzWrLtO@sgarzare-redhat>
     [not found]             ` <20260507163710-mutt-send-email-mst@kernel.org>
     [not found]               ` <af2sXvVCBT05XF0D@sgarzare-redhat>
     [not found]                 ` <20260508055343-mutt-send-email-mst@kernel.org>
     [not found]                   ` <af21ZGlANY6UUCKx@sgarzare-redhat>
2026-05-23 21:00                     ` Demi Marie Obenour [this message]
2026-05-25  2:37                       ` [PATCH net] vsock/virtio: fix potential unbounded skb queue Demi Marie Obenour
2026-05-25  9:14                         ` Stefano Garzarella

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=baf184f6-fefd-45ca-a437-1505533e3c81@gmail.com \
    --to=demiobenour@gmail.com \
    --cc=AVKrasnov@sberdevices.ru \
    --cc=avkrasnov@salutedevices.com \
    --cc=bobbyeshleman@meta.com \
    --cc=code@mstoeckl.com \
    --cc=davem@davemloft.net \
    --cc=devel@spectrum-os.org \
    --cc=edumazet@google.com \
    --cc=eperezma@redhat.com \
    --cc=eric.dumazet@gmail.com \
    --cc=hi@alyssa.is \
    --cc=horms@kernel.org \
    --cc=jasowang@redhat.com \
    --cc=kuba@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sgarzare@redhat.com \
    --cc=stefanha@redhat.com \
    --cc=systemd-devel@lists.freedesktop.org \
    --cc=val@invisiblethingslab.com \
    --cc=virtio-comment@lists.linux.dev \
    --cc=virtualization@lists.linux.dev \
    --cc=xuanzhuo@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://spectrum-os.org/git/crosvm
	https://spectrum-os.org/git/doc
	https://spectrum-os.org/git/mktuntap
	https://spectrum-os.org/git/nixpkgs
	https://spectrum-os.org/git/spectrum
	https://spectrum-os.org/git/ucspi-vsock
	https://spectrum-os.org/git/www

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).