From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20260506113554-mutt-send-email-mst@kernel.org> <20260507074113-mutt-send-email-mst@kernel.org> <20260507163710-mutt-send-email-mst@kernel.org> <20260508055343-mutt-send-email-mst@kernel.org> <96977898-81a0-4c79-a95f-0b288e0572a6@gmail.com>
In-Reply-To: <96977898-81a0-4c79-a95f-0b288e0572a6@gmail.com>
From: Stefano Garzarella
Date: Mon, 25 May 2026 11:14:17 +0200
Subject: Re: [PATCH net] vsock/virtio: fix potential unbounded skb queue
To: Demi Marie
 Obenour
CC: "Michael S. Tsirkin", Eric Dumazet, Arseniy Krasnov, Bobby Eshleman,
 Stefan Hajnoczi, "David S. Miller", Jakub Kicinski, Paolo Abeni,
 Simon Horman, netdev@vger.kernel.org, eric.dumazet@gmail.com,
 Arseniy Krasnov, Jason Wang, Xuan Zhuo, Eugenio Pérez,
 kvm@vger.kernel.org, virtualization@lists.linux.dev,
 "virtio-comment@lists.linux.dev", Manuel Stoeckl, Alyssa Ross,
 Spectrum Development, systemd development, Val Packett
Content-Type: text/plain; charset="UTF-8"
X-MailFrom: sgarzare@redhat.com
Precedence: list
List-Id: Patches and low-level development discussion

On Mon, 25 May 2026 at 04:37, Demi Marie Obenour wrote:
>
> On 5/23/26 17:00, Demi Marie Obenour wrote:
> > On 5/8/26 06:11, Stefano Garzarella wrote:
> >> On Fri, May 08, 2026 at 05:58:06AM -0400, Michael S. Tsirkin wrote:
> >>> On Fri, May 08, 2026 at 11:41:21AM +0200, Stefano Garzarella wrote:
> >>>> On Thu, May 07, 2026 at 06:48:47PM -0400, Michael S. Tsirkin wrote:
> >>>>> On Thu, May 07, 2026 at 02:59:13PM +0200, Stefano Garzarella wrote:
> >>>>>> On Thu, May 07, 2026 at 07:45:10AM -0400, Michael S.
Tsirkin wrote:
> >>>>>>> On Thu, May 07, 2026 at 11:09:47AM +0200, Stefano Garzarella wrote:
> >>>>
> >>>> [...]
> >>>>
> >>>>>>>> For now, we're already doing something:
> >>>>>>>> merging the skbuffs if they don't have EOM set.
> >>>>>>>
> >>>>>>>
> >>>>>>> Right that's good. You could go further and merge with EOM too
> >>>>>>> if you stick the info about message boundaries somewhere else.
> >>>>>>
> >>>>>> This adds a lot of complexity IMO, but we can try.
> >>>>>>
> >>>>>> Do you have something in mind?
> >>>>>
> >>>>> BER is clearly overkill but here's a POC that claude made for me,
> >>>>> just to give u an idea. It clearly has a ton of issues,
> >>>>> for example I dislike how GFP_ATOMIC is handled.
> >>>>
> >>>> Okay, I somewhat understand, but clearly this isn't net material
> >>>
> >>> I doubt we have many other options given reverting the regression was
> >>> ruled out.
> >>
> >> As Eric pointed out, we can't revert it.
> >
> > Could there be an option to disable the mitigation in guests, for the
> > situation where the host is trusted? There are VMMs that implement
> > AF_VSOCK in userspace with a backing AF_UNIX socket.

Here the main problem was an attack on the host from the guest.
Can the application set SO_VM_SOCKETS_BUFFER_SIZE to the max (IIRC 4G)?

> >
> >>>> so for now
> >>>> I think the best thing to do is to merge the fixup I sent (or something
> >>>> similar):
> >>>> https://lore.kernel.org/netdev/20260508092330.69690-1-sgarzare@redhat.com/
> >>>
> >>> I reviewed that one, problem is it's a spec violation/change that we'll
> >>> have to support forever.
> >>
> >> I have a few points to make on this, but let's discuss them there.
> >>
> >>>
> >>>> This is a major change that should be merged with more caution.
> >>>> Could this have too much of an impact on performance?
> >>>>
> >>>> Thanks,
> >>>> Stefano
> >>>
> >>> It's really a POC, real patch is left as an exercise for the reader
> >>> :).
> >>
> >> eheh, I see, but honestly, this overcomplication scares me. I'll try to
> >> think it over.
> >>
> >>> The correct approach IMHO is to only start using this
> >>> when we wasted a lot of memory on small packets.
> >>>
> >>> For example, if sum(truesize) >= buf size.
> >>>
> >>> then we'll not see a perf impact unless it's already pathological.
> >>
> >> Agree on this, which is similar to what I'm doing in that patch.
> >> Reducing the advertised buf_alloc only in pathological cases (e.g.
> >> overhead > buf_alloc).
> >
> > This isn't enough to prevent data loss due to race conditions.

Yep, we didn't go in that direction.

> >
> > I'm CCing the virtio-comment list and a few others.
> >
> > Right now, any application that needs to send a massive amount of
> > data over a vsock is simply broken. This isn't just theoretical.
> > It's causing real-world problems for users of Waypipe.
> > Waypipe forwards Wayland protocol messages over AF_VSOCK,
> > so it can send a large amount of traffic over the socket.
> > See https://gitlab.freedesktop.org/mstoceckl/waypipe/work_items/165.
> >
> > If one is willing to mutate the ring buffer in-place, or to maintain
> > an auxiliary counter, it's possible to store all messages with bounded
> > (in practice) overhead. Specifically:
> >
> > - If the first byte of a block of data is nonzero, it's a
> > variable-length length. 1 byte for messages less than 128 bytes.
> >
> > - If the first byte of a block of data is zero, the subsequent bytes
> > are a variable-length counter that stores the number of consecutive
> > zero-byte messages.
> >
> > That adds a lot of complexity, which is very unfortunate for something
> > that needs to be backported to stable kernels. I also suspect it
> > requires all access to the ring buffer to take a lock rather than
> > being lock-free. But it's the only approach that I can think of that
> > can work with the current spec.
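For illustration only, the two framing rules quoted above can be sketched in userspace Python. This is not the kernel POC discussed in the thread. The LEB128-style varint (7 payload bits per byte, high bit meaning "more bytes follow") is an assumption, since the proposal only says "variable-length"; the names encode_varint, decode_varint, pack_messages and unpack_messages are hypothetical.

```python
from typing import List, Tuple


def encode_varint(n: int) -> bytes:
    """LEB128-style varint (an assumed encoding): 7 bits per byte, high bit = more."""
    if n < 0:
        raise ValueError("varint must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Return (value, next position) for the varint starting at buf[pos]."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def pack_messages(messages: List[bytes]) -> bytes:
    """Store message boundaries inline, as in the two rules quoted above."""
    out = bytearray()
    i = 0
    while i < len(messages):
        if messages[i]:
            # Nonzero first byte: varint length, then the payload.
            # Lengths below 128 cost exactly one byte.
            out += encode_varint(len(messages[i])) + messages[i]
            i += 1
        else:
            # Zero first byte: varint count of consecutive empty messages.
            run = 1
            while i + run < len(messages) and not messages[i + run]:
                run += 1
            out += b"\x00" + encode_varint(run)
            i += run
    return bytes(out)


def unpack_messages(buf: bytes) -> List[bytes]:
    """Inverse of pack_messages."""
    messages = []
    pos = 0
    while pos < len(buf):
        if buf[pos] == 0:
            run, pos = decode_varint(buf, pos + 1)
            messages.extend([b""] * run)
        else:
            length, pos = decode_varint(buf, pos)
            messages.append(buf[pos:pos + length])
            pos += length
    return messages
```

With this encoding the first byte of a nonzero length is never zero, so the two cases cannot be confused. The sketch builds the whole buffer at once; bumping the zero-run counter in place as new empty messages arrive, which is where the locking concern comes from, is not modelled here.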
Yep, we were discussing something like that in this thread, but we are
still working on it. That said, any help on preparing patches for this
would be more than welcome.

> >
> > Could there at least be a normative note stating that drivers and
> > devices should treat each message as consuming 1024 bytes + the size
> > of the message itself, and warning that anything that doesn't is
> > going to be broken in practice?

We would like to fix the issue. This is a transitional situation.

> >
> > I'm CCing Val Packett (of Invisible Things Lab) and Alyssa Ross
> > (of Spectrum) because both of them are working on systems that rely
> > critically on vsock.
>
> Update: I see that patches have been upstreamed (with CC: stable) that
> reset the connection instead of losing data.

Yep, hope that helps the situation. But again, can setting
SO_VM_SOCKETS_BUFFER_SIZE to the max help those applications?

Thanks,
Stefano
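
For anyone who wants to try the SO_VM_SOCKETS_BUFFER_SIZE suggestion from userspace, here is a minimal Python sketch. It makes a few assumptions: the option value is passed as an unsigned 64-bit integer, and SO_VM_SOCKETS_BUFFER_MAX_SIZE is raised first on the assumption that the kernel clamps the requested size to the current maximum. The numeric fallbacks are the values from <linux/vm_sockets.h> and <linux/socket.h>, used only when the Python build does not export the constants. The helper name set_vsock_buffer_size is hypothetical.

```python
import socket
import struct

# Prefer the constants exported by the socket module; fall back to the
# Linux header values if this Python build lacks vsock support.
AF_VSOCK = getattr(socket, "AF_VSOCK", 40)
SO_VM_SOCKETS_BUFFER_SIZE = getattr(socket, "SO_VM_SOCKETS_BUFFER_SIZE", 0)
SO_VM_SOCKETS_BUFFER_MAX_SIZE = getattr(socket, "SO_VM_SOCKETS_BUFFER_MAX_SIZE", 2)


def set_vsock_buffer_size(sock, size: int) -> None:
    """Request a vsock buffer of `size` bytes on an AF_VSOCK socket."""
    value = struct.pack("=Q", size)  # 8-byte unsigned, native byte order
    # Raise the ceiling first (assumption: the size is clamped to it).
    sock.setsockopt(AF_VSOCK, SO_VM_SOCKETS_BUFFER_MAX_SIZE, value)
    sock.setsockopt(AF_VSOCK, SO_VM_SOCKETS_BUFFER_SIZE, value)


# Usage, on a kernel with vsock support:
#   s = socket.socket(AF_VSOCK, socket.SOCK_STREAM)
#   set_vsock_buffer_size(s, 64 * 1024 * 1024)
#   s.connect((cid, port))
```

Whether a larger buffer is enough for a workload like Waypipe is exactly the open question in this thread; the sketch only shows how the option would be set.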