patches and low-level development discussion
 help / color / mirror / code / Atom feed
From: Demi Marie Obenour <demiobenour@gmail.com>
To: Yureka <yuka@yuka.dev>
Cc: devel@spectrum-os.org
Subject: Re: [PATCH] tools: add xdp-forwarder
Date: Sun, 7 Sep 2025 17:30:19 -0400	[thread overview]
Message-ID: <aa7ab0bf-e5d7-48a0-b7e4-756d4eef1b2d@gmail.com> (raw)
In-Reply-To: <9e9e071f-d999-476d-895a-9128dd36a0a5@yuka.dev>


[-- Attachment #1.1.1: Type: text/plain, Size: 16176 bytes --]

On 9/7/25 16:28, Yureka wrote:
> 
> On 9/7/25 21:25, Demi Marie Obenour wrote:
>> On 9/6/25 10:12, Yureka Lilian wrote:
>>> The xdp-forwarder's purpose is implementing the functionality needed
>>> within the net-vm (a VM running the Linux drivers for any physical
>>> interfaces on the spectrum system).
>>>
>>> In the future, the net-vm will load the included XDP programs on the
>>> passed-through physical interfaces as well as the downstream virtio
>>> interface going into the router (recognized by its special MAC address).
>>>
>>> The net-vm needs to multiplex between the physical interfaces, as there
>>> might be several interfaces in the same IOMMU-group.
>>>
>>> For this, the XDP program loaded on the physical interfaces
>>> (`prog_physical.o`) applies a VLAN tag corresponding to the interface id
>>> and redirects the packets to the router interface (identified by the
>>> `router_iface` bpf map). In the other direction the XDP program loaded on
>>> the router interface (`prog_router.o`) removes one layer of VLAN tagging
>>> and redirects the packets to the interface read from the VLAN tag.
>>>
>>> The helper program `set_router_iface` is used to update the 
>>> `router_iface`
>>> bpf map to point to the interface passed as argument to the program.
>>>
>>> Signed-off-by: Yureka Lilian <yureka@cyberchaos.dev>
>>> ---
>>> pkgs/default.nix | 5 +
>>> tools/default.nix | 15 +-
>>> tools/meson.build | 5 +
>>> tools/meson_options.txt | 4 +
>>> tools/xdp-forwarder/include/parsing_helpers.h | 273 ++++++++++++++++++
>>> tools/xdp-forwarder/include/rewrite_helpers.h | 145 ++++++++++
>>> tools/xdp-forwarder/meson.build | 38 +++
>>> tools/xdp-forwarder/prog_physical.c | 37 +++
>>> tools/xdp-forwarder/prog_router.c | 43 +++
>>> tools/xdp-forwarder/set_router_iface.c | 29 ++
>>> 10 files changed, 591 insertions(+), 3 deletions(-)
>>> create mode 100644 tools/xdp-forwarder/include/parsing_helpers.h
>>> create mode 100644 tools/xdp-forwarder/include/rewrite_helpers.h
>>> create mode 100644 tools/xdp-forwarder/meson.build
>>> create mode 100644 tools/xdp-forwarder/prog_physical.c
>>> create mode 100644 tools/xdp-forwarder/prog_router.c
>>> create mode 100644 tools/xdp-forwarder/set_router_iface.c

[...]

>>> diff --git a/tools/xdp-forwarder/include/parsing_helpers.h 
>>> b/tools/xdp-forwarder/include/parsing_helpers.h
>>> new file mode 100644
>>> index 0000000..3d240cd
>>> --- /dev/null
>>> +++ b/tools/xdp-forwarder/include/parsing_helpers.h
>>> @@ -0,0 +1,273 @@
>>> +/* SPDX-License-Identifier: (GPL-2.0-or-later OR BSD-2-clause) */
>>> +/* Vendored from https://github.com/xdp-project/xdp-tutorial/blob/d3d3eed6ea9a63d1302bfa8b5a8e93862bfe11f0/common/parsing_helpers.h */

It looks like your mail client wrapped the lines.  I demangled them in the C code.

>>> [...]
>>> +/*
>>> + * struct vlan_hdr - vlan header
>>> + * @h_vlan_TCI: priority and VLAN ID
>>> + * @h_vlan_encapsulated_proto: packet type ID or len
>>> + */
>>> +struct vlan_hdr {
>>> +	__be16 h_vlan_TCI;
>>> +	__be16 h_vlan_encapsulated_proto;
>>> +};

I think this (and all the subsequent structs) should be
__attribute__((packed)) but am not sure.
[...]

>>> diff --git a/tools/xdp-forwarder/include/rewrite_helpers.h 
>>> b/tools/xdp-forwarder/include/rewrite_helpers.h
>>> new file mode 100644
>>> index 0000000..2deae9a
>>> --- /dev/null
>>> +++ b/tools/xdp-forwarder/include/rewrite_helpers.h
>>> @@ -0,0 +1,145 @@
>>> +/* SPDX-License-Identifier: (GPL-2.0-or-later OR BSD-2-clause) */
>>> +/* Vendored from 
>>> https://github.com/xdp-project/xdp-tutorial/blob/d3d3eed6ea9a63d1302bfa8b5a8e93862bfe11f0/common/rewrite_helpers.h 
>>> */
>>> +/*
>>> + * This file contains functions that are used in the packetXX XDP programs to
>>> + * manipulate on packets data. The functions are marked as __always_inline, and
>>> + * fully defined in this header file to be included in the BPF program.
>>> + */
>>> +
>>> +#ifndef __REWRITE_HELPERS_H
>>> +#define __REWRITE_HELPERS_H
>>> +
>>> +#include <linux/bpf.h>
>>> +#include <linux/ip.h>
>>> +#include <linux/ipv6.h>
>>> +#include <linux/if_ether.h>
>>> +
>>> +#include <bpf/bpf_helpers.h>
>>> +#include <bpf/bpf_endian.h>
>>> +
>>> +/* Pops the outermost VLAN tag off the packet. Returns the popped VLAN ID on
>>> + * success or negative errno on failure.
>>> + */
>>> +static __always_inline int vlan_tag_pop(struct xdp_md *ctx, struct ethhdr *eth)
>>> +{
>>> +	void *data_end = (void *)(long)ctx->data_end;
>>> +	struct ethhdr eth_cpy;
>>> +	struct vlan_hdr *vlh;
>>> +	__be16 h_proto;
>>> +	int vlid;
>>> +
>>> +	if (!proto_is_vlan(eth->h_proto))
>>> +		return -1;
>>> +
>>> +	/* Careful with the parenthesis here */
>>> +	vlh = (void *)(eth + 1);
>>> +
>>> +	/* Still need to do bounds checking */
>>> +	if (vlh + 1 > data_end)
>>> +		return -1;
>>> +
>>> +	/* Save vlan ID for returning, h_proto for updating Ethernet header */
>>> +	vlid = bpf_ntohs(vlh->h_vlan_TCI);
>>> +	h_proto = vlh->h_vlan_encapsulated_proto;
>>> +
>>> +	/* Make a copy of the outer Ethernet header before we cut it off */
>>> +	__builtin_memcpy(&eth_cpy, eth, sizeof(eth_cpy));
>>> +
>>> +	/* Actually adjust the head pointer */
>>> + 	if (bpf_xdp_adjust_head(ctx, (int)sizeof(*vlh)))
>>> +		return -1;
>>> +
>>> + 	/* Need to re-evaluate data *and* data_end and do new bounds checking
>>> + 	 * after adjusting head
>>> + 	 */
>>> + 	eth = (void *)(long)ctx->data;
>>> + 	data_end = (void *)(long)ctx->data_end;
>>> + 	if (eth + 1 > data_end)
>>> + 	 	return -1;
>>> +
>>> + 	/* Copy back the old Ethernet header and update the proto type */
>>> + 	__builtin_memcpy(eth, &eth_cpy, sizeof(*eth));
>>> + 	eth->h_proto = h_proto;
>>> +
>>> + 	return vlid;
>>> +}
>> In Spectrum, eth is always the packet data, so this should be able to be
>> simplified to: [...]
> 
> Not upstreamable

I agree.

> [...]
>> In the Spectrum use-case (where the Ethernet header is always the one in
>> the packet) this should be able to be simplified to (not tested): [...]
> 
> Not upstreamable

That makes sense.

>> This also avoids needing the 'eth' parameter, as it is now assigned in the
>> function body before being read from. I'm not sure this would be
> you're not sure this would be what?

Should have added "upstreamable".

> [...]
>>> +#endif /* __REWRITE_HELPERS_H */
>>> diff --git a/tools/xdp-forwarder/meson.build 
>>> b/tools/xdp-forwarder/meson.build
>>> new file mode 100644
>>> index 0000000..7e60c11
>>> --- /dev/null
>>> +++ b/tools/xdp-forwarder/meson.build
>>> @@ -0,0 +1,38 @@
>>> +# SPDX-License-Identifier: EUPL-1.2+
>>> +# SPDX-FileCopyrightText: 2025 Yureka Lilian <yureka@cyberchaos.dev>
>>> +
>>> +libbpf = dependency('libbpf', version : '1.6.2')
>>> +
>>> +executable('set_router_iface', 'set_router_iface.c',
>>> + dependencies : libbpf,
>>> + install : true)
>>> +
>>> +clang = find_program('clang')
>>> +
>>> +bpf_o_cmd = [
>>> + clang.full_path(),
>>> + '-fno-stack-protector',
>>> + '-O2',
>>> + '-target', 'bpf',
>>> + '-I', meson.current_source_dir() + '/include',
>>> + '-g',
>>> + '-c',
>>> + '@INPUT@',
>>> + '-o',
>>> + '@OUTPUT@',
>>> +]
>> I recommend adding -fno-strict-overflow and -fno-strict-aliasing
>> here. I don't know if those are implied by -target bpf, but better
>> safe than sorry.
> Will do.
> 
>>> +prog_router_o = custom_target(
>>> + input : 'prog_router.c',
>>> + output : 'prog_router.o',
>>> + command : bpf_o_cmd,
>>> + install: true,
>>> + install_dir: 'lib/xdp')
>>> +
>>> +prog_physical_o = custom_target(
>>> + input : 'prog_physical.c',
>>> + output : 'prog_physical.o',
>>> + command : bpf_o_cmd,
>>> + install: true,
>>> + install_dir: 'lib/xdp')
>> Meson allows providing a dependency file so that the
>> target is remade if any of the headers are changed.
>> Clang is able to produce such a file (-MD -MP -MF file).
> 
> I mostly got inspiration from the systemd meson files. I'm not very 
> familiar with meson, but I will try to add this.

I think

bpf_o_cmd = [
	clang.full_path(),
	'-fno-stack-protector',
	'-fno-strict-aliasing',
	'-fno-strict-overflow',
	'-Wall',
	'-Wextra',
	'-O2',
	'-target', 'bpf',
	'-I', meson.current_source_dir() + '/include',
	'-g',
	'-c',
	'-o', '@OUTPUT@',
	'-MD',
	'-MP',
	'-MF', '@DEPFILE@',
	'--',
	'@INPUT@',
]

prog_physical_o = custom_target(
	input: 'prog_physical.c',
	output: 'prog_physical.o',
	depfile: 'prog_physical.o.dep',
	command: bpf_o_cmd,
	install: true,
	install_dir: 'lib/xdp')

should work.  '--' is optional, of course; I am in the habit
of adding it because it does matter in other contexts.  This
is also untested, but it is consistent with the Meson documentation
(https://mesonbuild.com/Reference-manual_functions.html#custom_target).
The above command includes the extra command-line options I mentioned
(-Wall -Wextra -fno-strict-overflow -fno-strict-aliasing).

> [...]
>>> +SEC("xdp")
>>> +int physical(struct xdp_md *ctx)
>>> +{
>>> +	void *data_end = (void *)(long)ctx->data_end;
>>> +	void *data = (void *)(long)ctx->data;
>> Would it make sense to use char * here? I know that Linux uses
>> arithmetic on void *, but it makes the
> it makes the [... what]?

Whoops, forgot to finish the sentence.  It makes the code easier
to understand for those familiar with userspace programming.

>>> +	struct hdr_cursor nh;
>>> +	nh.pos = data;
>>> +
>>> +	struct ethhdr *eth;
>>> +	if (parse_ethhdr(&nh, data_end, &eth) < 0)
>>> +		return XDP_DROP;
>>> +
>>> +
>>> +	if (ctx->ingress_ifindex < 1 || ctx->ingress_ifindex > 4096)
>>> +		return XDP_DROP;

I think this is off-by-1 and should be:

	if (ctx->ingress_ifindex < 1 || ctx->ingress_ifindex > VLAN_VID_MASK)
		return XDP_DROP;

>>> + vlan_tag_push(ctx, eth, ctx->ingress_ifindex);
>> Looks like a missing check for the return value.
> Thanks, will fix.

You're welcome.

> [...]
>>> +	int vlid = vlan_tag_pop(ctx, eth);
>>> +	if (vlid < 0) {
>>> +		return XDP_DROP;
>>> +	}
>> I don’t think Spectrum uses bits other than the VLAN ID,
>> and VLAN 0 is reserved, so perhaps:
>>
>> 	if (vlid < 1 || vlid > 4096) {
>> 		return XDP_DROP;
>> 	}
> The "< 0" check is strictly an error check. In success cases, 
> vlan_tag_pop returns exactly the VLAN ID.

Makes sense.  A check for the VLAN ID being valid could be done
but would be separate (I think `(vlid & VLAN_VID_MASK) != 0`).

> [...]
>> I simplified the code into the following (only compile-tested):
>>
>> /* SPDX-License-Identifier: (GPL-2.0-or-later OR BSD-2-clause) */
>> /*
>>  * This file contains parsing functions that can be used in eXDP programs. The
>>  * functions are marked as __always_inline, and fully defined in this header
>>  * file to be included in the BPF program.
>>  *
>>  * Each helper parses a packet header, including doing bounds checking, and
>>  * returns the type of its contents if successful, and -1 otherwise.
>>  *
>>  * For Ethernet and IP headers, the content type is the type of the payload
>>  * (h_proto for Ethernet, nexthdr for IPv6), for ICMP it is the ICMP type field.
>>  * All return values are in host byte order.
>>  */
>>
>> #include <stddef.h>
>> #include <linux/if_ether.h>
>> #include <linux/if_packet.h>
>> #include <linux/bpf.h>
>> #include <bpf/bpf_endian.h>
>> #include <bpf/bpf_helpers.h>
>>
>> /*
>>  * struct vlan_hdr - vlan header
>>  * @h_vlan_TCI: priority and VLAN ID
>>  * @h_vlan_encapsulated_proto: packet type ID or len
>>  */
>> struct vlan_hdr {
>> 	__be16 h_vlan_TCI;
>> 	__be16 h_vlan_encapsulated_proto;
>> } __attribute__((__packed__));
>>
>> struct eth_vlan_hdr {
>> 	struct ethhdr eth;
>> 	struct vlan_hdr vlan;
>> } __attribute__((__packed__));
>>
>> static __always_inline int proto_is_vlan(__u16 h_proto)
>> {
>> 	return !!(h_proto == bpf_htons(ETH_P_8021Q) ||
>> 	          h_proto == bpf_htons(ETH_P_8021AD));
>> }
>>
>> /* Pushes a new VLAN tag after the Ethernet header. Returns 0 on success,
>>  * -1 on failure.
>>  */
>> static __always_inline int vlan_tag_push(struct xdp_md *ctx, int vlid)
>> {
>> 	 struct eth_vlan_hdr *hdr;
>> 	char *data_end;
>> 	char *data;
>>
>> 	/* Check for valid Ethernet frame */
>> 	if ((unsigned long)ctx->data + ETH_ZLEN > (unsigned long)ctx->data_end)
>> 		return -1;
>>
>> 	/* Add space in front of the packet */
>> 	if (bpf_xdp_adjust_head(ctx, -(int)sizeof(struct vlan_hdr)))
>> 		return -1;
>>
>> 	/* Must re-fetch header pointers */
>> 	data_end = (char *)(long)ctx->data_end;
>> 	data = (char *)(long)ctx->data;
>>
>> 	/* Verifier requires this (redundant) bounds check */
>> 	if (data + ETH_ZLEN + sizeof(struct vlan_hdr) > data_end)
>> 		return -1;
>>
>> 	/* Move source and destination MAC addresses, populate VLAN tag
>> 	 * with ID and proto, and set outer Ethernet header to VLAN type.
>> 	 */
>> 	__builtin_memmove(data, data + sizeof(struct vlan_hdr),
>> 	                  offsetof(struct ethhdr, h_proto));
>> 	hdr = (struct eth_vlan_hdr *)data;
>> 	/* TODO: should this be ETH_P_8021AD? */
>> 	hdr->eth.h_proto = bpf_htons(ETH_P_8021Q);
>> 	hdr->vlan.h_vlan_TCI = bpf_htons(vlid);
>> 	return 0;
>> }
>>
>> /* Pops the outermost VLAN tag off the packet. Returns the popped VLAN ID on
>>  * success or negative errno on failure.
>>  */
>> static __always_inline int vlan_tag_pop(struct xdp_md *ctx)
>> {
>>
>> 	char *data_end = (void *)(long)ctx->data_end;
>> 	struct ethhdr *eth = (void *)(long)ctx->data;
>> 	struct vlan_hdr *vlh;
>> 	int vlid;
>>
>> 	/* Check that the produced Ethernet frame will be long enough */
>> 	if ((unsigned long)eth + ETH_ZLEN + sizeof(*vlh) >
>> 		(unsigned long)data_end)
>> 	return -1;
>>
>> 	/* Careful with the parenthesis here */
>> 	vlh = (void *)(eth + 1);
>>
>> 	if (!proto_is_vlan(eth->h_proto))
>> 		return -1;
>>
>> 	/* Save vlan ID for returning */
>> 	vlid = bpf_ntohs(vlh->h_vlan_TCI);
>>
>> 	/* Move the source and destination MAC addresses.
>> 	 * This moves the Ethernet header by 4 bytes, so
>> 	 * be sure to cast to char * before doing pointer
>> 	 * arithmetic.
>> 	 */
>> 	__builtin_memmove((char *)eth + sizeof(*vlh), eth,
>> 	                  offsetof(struct ethhdr, h_proto));
>>
>> 	/* Actually adjust the head pointer */
>> 	if (bpf_xdp_adjust_head(ctx, (int)sizeof(*vlh)))
>> 		return -1;
>>
>> 	return vlid;
>> }
>>
>> // The maps are actually not used by this program, but just included
>> // to keep the reference-counted pin alive before any physical interfaces
>> // are added.
>> struct {
>> 	__uint(type, BPF_MAP_TYPE_DEVMAP);
>> 	__type(key, int);
>> 	__type(value, int);
>> 	__uint(max_entries, 1);
>> 	__uint(pinning, LIBBPF_PIN_BY_NAME);
>> } physical_iface SEC(".maps");
>>
>> struct {
>> 	__uint(type, BPF_MAP_TYPE_DEVMAP);
>> 	__type(key, int);
>> 	__type(value, int);
>> 	__uint(max_entries, 1);
>> 	__uint(pinning, LIBBPF_PIN_BY_NAME);
>> } router_iface SEC(".maps");
>>
>> SEC("xdp")
>> int physical(struct xdp_md *ctx)
>> {
>> 	if (ctx->ingress_ifindex < 1 || ctx->ingress_ifindex > 4096)
>> 		return XDP_DROP;
>>
>> 	if (vlan_tag_push(ctx, ctx->ingress_ifindex))
>> 		return XDP_DROP;
>>
>> 	return bpf_redirect_map(&physical_iface, 0, 0);
>> }
>>
>> SEC("xdp")
>> int router(struct xdp_md *ctx)
>> {
>> 	int vlid = vlan_tag_pop(ctx);
>>
>> 	/* Negative VLAN IDs indicate errors, and 0 is reserved and invalid.
>> 	 * Values above 4096 indicate that the priority field is used, which
>> 	 * isn't suppported out of simplicity.
>> 	 */
>> 	if (vlid < 1 || vlid > 4096)
>> 		return XDP_DROP;
>>
>> 	return bpf_redirect(vlid, 0);
>> }
> 
> I'm not sure what you intend with this. If there is no significant flaws 
> in the upstream code, why should we write our own version?

Valid point.  I do think it is safe to set VLAN_MAX_DEPTH to 0, as
Spectrum doesn't parse VLAN tags.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 7253 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2025-09-07 21:30 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-06 14:12 [PATCH] tools: add xdp-forwarder Yureka Lilian
2025-09-07 19:25 ` Demi Marie Obenour
2025-09-07 20:28   ` Yureka
2025-09-07 21:30     ` Demi Marie Obenour [this message]
2025-09-08  7:56 ` Alyssa Ross

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aa7ab0bf-e5d7-48a0-b7e4-756d4eef1b2d@gmail.com \
    --to=demiobenour@gmail.com \
    --cc=devel@spectrum-os.org \
    --cc=yuka@yuka.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://spectrum-os.org/git/crosvm
	https://spectrum-os.org/git/doc
	https://spectrum-os.org/git/mktuntap
	https://spectrum-os.org/git/nixpkgs
	https://spectrum-os.org/git/spectrum
	https://spectrum-os.org/git/ucspi-vsock
	https://spectrum-os.org/git/www

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).