From: Thomas Leonard
Date: Wed, 6 Apr 2022 12:19:14 +0000
Subject: Re: Using virtio-gpu instead of virtwl
To: Alyssa Ross
CC: discuss@spectrum-os.org
List-Id: General high-level discussion about Spectrum

On Tue, 22 Mar 2022 at 20:05, Thomas Leonard wrote:
>
> On Tue, 22 Mar 2022 at 11:16, Alyssa Ross wrote:
> >
> > On Tue, Mar 22, 2022 at 11:08:15AM +0000, Thomas Leonard wrote:
> > > On Mon, 21 Mar 2022 at 16:05, Alyssa Ross wrote:
> > > >
> > > > On Mon, Mar 21, 2022 at 12:10:43PM +0000, Thomas Leonard wrote:
> > > > > Looking at the Linux virtio_gpu driver, it seems that using contexts
> > > > > requires virgl:
> > > > >
> > > > > static int virtio_gpu_context_init_ioctl(struct drm_device *dev, void
> > > > > *data, struct drm_file *file)
> > > > > {
> > > > > ...
> > > > > if (!vgdev->has_context_init || !vgdev->has_virgl_3d)
> > > > > return -EINVAL;
> > > > >
> > > > > https://github.com/torvalds/linux/blob/f443e374ae131c168a065ea1748feac6b2e76613/drivers/gpu/drm/virtio/virtgpu_ioctl.c#L732
> > > > >
> > > > > I think perhaps that crosvm is compiled without the "virgl_renderer"
> > > > > feature (it's not in the default set), and this is causing it to crash
> > > > > because that's also "self.default_component". I don't know how to
> > > > > compile crosvm with virgl enabled, though.
> > > >
> > > > It wasn't easy, but I got it to build[1]. I hope that helps. It adds
> > > > both virgl_renderer and virgl_renderer_next. I think virgl_renderer
> > > > is on by default with --gpu, and virgl_renderer_next is used with the
> > > > --gpu-render-server argument. Hopefully at least one of those does the
> > > > right thing -- let me know!
> > > >
> > > > [1]: https://github.com/NixOS/nixpkgs/pull/165128
> > >
> > > Thanks, that is very helpful!
> > >
> > > I gave it a try, and it got a little further. But now, doing `modprobe
> > > virtio_gpu` in the VM crashes crosvm with:
> > >
> > > Stack trace of thread 2:
> > > #0 0x00007fa5fd0915f6 abort (libc.so.6 + 0x265f6)
> > > #1 0x00007fa5fcfc6bfd get_dlopen_handle.part.0 (libepoxy.so.0 + 0xc7bfd)
> > > #2 0x00007fa5fcfc7366 epoxy_egl_dlsym (libepoxy.so.0 + 0xc8366)
> [...]
> > >
> > > It looks like it should be printing a message to stderr before calling
> > > abort, but I don't see it
> > > (https://github.com/anholt/libepoxy/blob/1.5.9/src/dispatch_common.c#L315).
> >
> > Did you try --disable-sandbox, like I suggested in my other mail?
> > The sandbox blocks writing error messages, and is something I frequently
> > trip over when trying to use crosvm.
>
> It's not very easy because --disable-sandbox seems to conflict with
> --shared-dir, which I use for lots of things.

I got around this by changing `create_gpu_device` to use
`let jail = None;`, so only the GPU device isn't jailed. I suspect the
minijail config needs updating for NixOS (e.g.
https://github.com/google/crosvm/blob/main/src/linux/gpu.rs#L82).

I tried, but failed, to figure out the protocol. I did manage to get a
test application showing a little animation, but it crashes after a
few seconds. The basic idea seems to be:

1. You allocate a page of memory shared with the crosvm on the host.
2. You tell crosvm to read messages from the host compositor and write
   them to this page.
3. After doing this, crosvm signals the guest, which reads the data.

The shared page is referred to as a "ring", but it's not used as a
ring buffer. The host always writes to the start of it.

Separately, to allocate an image buffer:

1. You tell crosvm the width and height, etc.
2. It assigns a blob_id and writes it to the shared page.
3. You wait for the operation to complete, then use the blob_id to
   create the buffer.

The problem is that both these operations write to the same page, and
they race! So sometimes the image information overwrites the Wayland
data, or the Wayland data overwrites the image information, and then
it crashes.

This image_query function shows the problem:

https://chromium.googlesource.com/chromiumos/platform2/+/refs/heads/main/vm_tools/sommelier/virtualization/virtgpu_channel.cc#499

It asks for the image information to be written to "ring_addr_" and
then reads it from there. But at the moment when the function is
called, ring_addr_ may contain Wayland protocol data that hasn't been
read yet.
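
To make the race concrete, here is a minimal Python model of one bad
interleaving. It is only a sketch under my reading of the behaviour
described above, not crosvm or Sommelier code: `SharedPage`,
`host_write`, `guest_read`, `demo_race` and both payloads are made up
for illustration, and the real protocol goes through virtio-gpu
commands rather than plain function calls. The steps run in a fixed
order instead of using threads, so the result is deterministic.

```python
PAGE_SIZE = 4096


class SharedPage:
    """Hypothetical stand-in for the page shared between guest and host."""

    def __init__(self) -> None:
        self.data = bytearray(PAGE_SIZE)

    def host_write(self, payload: bytes) -> None:
        # Assumption taken from the description above: the host always
        # writes at the start of the page, never at a moving ring offset.
        self.data[: len(payload)] = payload

    def guest_read(self, length: int) -> bytes:
        return bytes(self.data[:length])


def demo_race() -> tuple[bytes, bytes]:
    """Replay one interleaving in which the guest loses Wayland data."""
    page = SharedPage()
    wayland_msg = b"wl_surface.commit"  # placeholder for compositor data
    image_info = b"blob_id=7"           # placeholder for an image query reply

    # 1. The host delivers Wayland data and signals the guest.
    page.host_write(wayland_msg)

    # 2. Before the guest handles that signal, it issues an image query.
    #    The reply is written to the start of the same page.
    page.host_write(image_info)

    # 3. The guest now handles the earlier signal and reads what it
    #    believes is the Wayland message.
    received = page.guest_read(len(wayland_msg))
    return wayland_msg, received


if __name__ == "__main__":
    sent, received = demo_race()
    print("sent:    ", sent)
    print("received:", received)
    print("corrupted:", sent != received)
```

Swapping steps 1 and 2 gives the opposite failure, where the Wayland
data overwrites the image information before the guest has read the
blob_id.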

I didn't test it with Sommelier, but that's the problem I had in my
code and I don't see how Sommelier's code can work in general.
Sommelier caches the results, so it might not hit this case too often.
I didn't use a cache, and also added a small sleep to my code to make
the problem easier to reproduce.

Anyone have any ideas how this is supposed to work?

-- 
talex5 (GitHub/Twitter)        http://roscidus.com/blog/