# Enable OVS with DPDK

### OVS with DPDK <a href="#ovs-with-dpdk" id="ovs-with-dpdk"></a>

#### OVS WorkerNode Prerequisites <a href="#ovs-workernode-prerequisites" id="ovs-workernode-prerequisites"></a>

1. Bare-metal nodes as worker nodes.
2. Hugepages enabled on the worker nodes.

**Hugepage Support**

Enable hugepages by performing the following steps:

* Update the `/etc/default/grub` file on the worker nodes with the following properties:

`GRUB_CMDLINE_LINUX='console=tty0 console=ttyS1,115200n8 intel_iommu=on iommu=pt default_hugepagesz=1G hugepagesz=1G hugepages=10'`

`GRUB_SERIAL_COMMAND='serial --unit=0 --speed=115200 --word=8 --parity=no --stop=1'`

* Update GRUB

```
$ sudo update-grub
```
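
On RHEL or CentOS family worker nodes, where `update-grub` is not available, the equivalent step would be regenerating the GRUB configuration directly (path shown for BIOS systems; adjust for EFI):

```
$ sudo grub2-mkconfig -o /boot/grub2/grub.cfg
```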

* Update `vm.nr_hugepages` in `/etc/sysctl.conf` and reload the sysctl settings.

```
$ echo "vm.nr_hugepages = 10" | sudo tee -a /etc/sysctl.conf
$ sudo sysctl -p
```

* Reboot the worker nodes

```
$ sudo reboot
```

* Confirm that hugepages are configured.

```
$ grep Huge /proc/meminfo
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:      10
HugePages_Free:       10
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB
Hugetlb:        10485760 kB
```

```
$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-5.4.0-148-generic root=/dev/mapper/vgroot-lvroot ro console=tty0 console=ttyS1,115200n8 intel_iommu=on iommu=pt default_hugepagesz=1G hugepagesz=1G hugepages=10 nvme_core.multipath=0
```

* Mount the hugepages, if not already mounted by default.

```
$ sudo mount -t hugetlbfs -o pagesize=1G none /dev/hugepages
```
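
To keep the mount across reboots, you can optionally add an `/etc/fstab` entry; a minimal sketch, assuming the `/dev/hugepages` mount point used above:

```
none /dev/hugepages hugetlbfs pagesize=1G 0 0
```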

### Setting up OVS-DPDK with Pf9 DHCP Server on a PMK cluster <a href="#setting-up-ovs-dpdk-with-pf9-dhcp-server-on-a-pmk-cluster" id="setting-up-ovs-dpdk-with-pf9-dhcp-server-on-a-pmk-cluster"></a>

Create a PMK cluster using the worker nodes configured in the previous section.

{% hint style="info" %}
**PMK cluster prerequisites**

The PMK cluster must have the following add-ons enabled:

* KubeVirt Add-on
* Advanced Networking Operator (Luigi) Add-on
{% endhint %}

#### 1. Create Network Plugins Custom Resource <a href="#id-1-create-network-plugins-custom-resource" id="id-1-create-network-plugins-custom-resource"></a>

The NetworkPlugins custom resource is used to install and configure advanced networking plugins such as OVS, SR-IOV, and DPDK.

```yaml
$ cat <<EOF | kubectl apply -f -
apiVersion: plumber.k8s.pf9.io/v1
kind: NetworkPlugins
metadata:
  name: networkplugins-ovs-dpdk
  namespace: luigi-system
spec:
  plugins:
    hostPlumber: {}            #Enabled
    multus: {}                 #Enabled
    ovs:                       #Enabled with configuration. 
      dpdk:                    #Enabled with configuration
        lcoreMask: "0x1"
        socketMem: "1024"
        pmdCpuMask: "0xF0"
        hugepageMemory: "2Gi"
    dhcpController: {}         #Enabled
EOF
```

**DPDK configuration parameters:**

* **lcoreMask**: Hex bitmask specifying the CPU cores on which DPDK lcore threads are spawned.
* **socketMem**: Comma-separated list of memory amounts (in MB) to pre-allocate from hugepages on each NUMA socket.
* **pmdCpuMask**: Core bitmask that sets which cores OVS-DPDK uses for datapath (PMD) packet processing.
* **hugepageMemory**: Total hugepage-backed memory for the ovs-dpdk pod (number of hugepages × hugepage size). In the YAML above, 2Gi of hugepage-backed RAM is requested; with a hugepage size of 1Gi, this consumes two hugepages.
  * Note: **hugepageMemory** must be greater than or equal to the total socket memory. For example, if `socketMem: "1024,1024"`, then **hugepageMemory >= 2Gi**.
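
After applying the resource, a quick sanity check is to list the custom resource and the plugin pods; this sketch assumes the CRD registers the `networkplugins` plural, and exact pod names vary by release:

```
$ kubectl get networkplugins -n luigi-system
$ kubectl get pods -n luigi-system
```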

**DHCP controller plugin**

The **DHCP controller plugin** runs the **PF9 DHCP server** inside a pod or virtual machine to serve DHCP requests from virtual machine instances (not pods, in the case of KubeVirt). Multus network-attachment-definitions use this DHCP server to assign IPs. The PF9 DHCP server is an alternative to the IPAM CNIs (whereabouts, host-local), which are invoked as delegates by the backend CNI at pod creation and deletion.

For more information, refer to: <https://platform9.com/docs/kubernetes/enable-p9-dhcp>

#### 2. Create Host Network Template <a href="#id-2-create-host-network-template" id="id-2-create-host-network-template"></a>

The HostNetworkTemplate custom resource defines host-level network configuration, such as the OVS configuration, on the PMK cluster.

```yaml
$ cat <<EOF | kubectl apply -f -
apiVersion: plumber.k8s.pf9.io/v1
kind: HostNetworkTemplate
metadata:
  name: host-network-template-ovs-dpdk
  namespace: luigi-system
spec:
  ovsConfig:
  - bridgeName: "dpdkbr01"
    nodeInterface: "bond0.2"
    dpdk: true
EOF
```

**ovsConfig parameters:**

* **bridgeName**: User-defined name of the OVS bridge.
* **nodeInterface**: Physical network interface used to create the OVS bridge.
* **dpdk**: Boolean that enables DPDK on the host.
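
Once the template is applied, you can verify the bridge on the worker node; a sketch, assuming standard OVS tooling is available on the host (output varies by OVS version):

```
$ kubectl get hostnetworktemplates -n luigi-system
$ sudo ovs-vsctl show                                # run on the worker node; dpdkbr01 should be listed
$ sudo ovs-vsctl get bridge dpdkbr01 datapath_type   # "netdev" indicates a DPDK datapath
```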

#### 3. Create Network Attachment Definition <a href="#id-3-create-network-attachment-definition" id="id-3-create-network-attachment-definition"></a>

Network Attachment Definition is a Multus CRD used to configure additional NICs on pods and virtual machines. Note that the `bridgeName` in the `host` section below must match the OVS bridge configured in the HostNetworkTemplate (`dpdkbr01` in this example).

```yaml
$ cat <<EOF | kubectl apply -f -
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: nad-ovs-dpdk-dhcp
  annotations:
    k8s.v1.cni.cncf.io/resourceName: ovs-cni.network.kubevirt.io/dpdkbr01
spec:
  config: '{
      "cniVersion": "0.3.1",
      "type": "userspace",
      "name": "nad-ovs-dpdk-dhcp",
      "kubeconfig": "/etc/cni/net.d/multus.d/multus.kubeconfig",
      "logFile": "/var/log/userspace-ovs-net-1-cni.log",
      "logLevel": "verbose",
      "host": {
              "engine": "ovs-dpdk",
              "iftype": "vhostuser",
              "netType": "bridge",
              "vhost": {
                      "mode": "client"
                },
              "bridge": {
                      "bridgeName": "dpdkbr01"      #Name of OVS bridge configured in HostNetworkTemplate
                }
        },  
    "container": {
            "engine": "ovs-dpdk",
            "iftype": "vhostuser",
            "netType": "interface",
            "vhost": {
                    "mode": "server"
              }
      }
    }'
EOF
```
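
You can confirm the definition was created using the `net-attach-def` short name registered by Multus:

```
$ kubectl get net-attach-def nad-ovs-dpdk-dhcp -o yaml
```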

#### 4. Create Pf9 DHCP Server <a href="#id-4-create-pf9-dhcp-server" id="id-4-create-pf9-dhcp-server"></a>

```yaml
$ cat <<EOF | kubectl apply -f -
apiVersion: dhcp.plumber.k8s.pf9.io/v1alpha1 
kind: DHCPServer
metadata:
  name: dhcpserver-pf9-ovs-dpdk
spec:
  networks:
    - networkName: nad-ovs-dpdk-dhcp
      interfaceIp: 192.168.15.14/24
      leaseDuration: 10m
      cidr:
        range: 192.168.15.0/24
        range_start: 192.168.15.30
        range_end: 192.168.15.100
        gateway: 192.168.15.1
EOF
```

About the fields:

* **name**: Name of the DHCPServer. The dnsmasq configuration is generated in a ConfigMap with the same name (see the verification sketch after this list).
* **networks**: List of all networks that this pod will serve:
  * **networkName**: Name of the NetworkAttachmentDefinition to provide IPs for. The NAD should not have the DHCP plugin enabled.
  * **interfaceIp**: IP address allocated to the pod. Must include the prefix length to ensure proper routes are added.
  * **leaseDuration**: How long the offered leases remain valid. Provide in a format valid for dnsmasq (e.g. 10m, 5h). Defaults to 1h.
  * **vlanId**: Dnsmasq network identifier, used while restoring IPs. Optional.
  * **cidr**: `range` is compulsory; `range_start`, `range_end`, and `gateway` are optional. If a range start and end are provided, they are used in place of the default start and end.
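
To verify the server, check the DHCPServer resource and the generated dnsmasq ConfigMap of the same name; the ConfigMap namespace shown here is an assumption and may differ depending on where the DHCP controller creates it:

```
$ kubectl get dhcpserver dhcpserver-pf9-ovs-dpdk
$ kubectl get configmap dhcpserver-pf9-ovs-dpdk -o yaml
```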

***At this point the PMK cluster is ready to be used for workloads such as Pods and Virtual Machines.***

#### Create a sample Virtual Machine to use the `nad-ovs-dpdk-dhcp` network <a href="#create-a-sample-virtual-machines-to-use-the-nad-ovs-dpdk-dhcp-network" id="create-a-sample-virtual-machines-to-use-the-nad-ovs-dpdk-dhcp-network"></a>

Let's validate the setup by creating a Virtual Machine that consumes the `nad-ovs-dpdk-dhcp` network.

```yaml
$ cat <<EOF | kubectl apply -f -
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: vm-test-ovs-dpdk
  namespace: default
spec:
  running: true
  template:
    metadata:
      labels:
        debugLogs: "true"
        kubevirt.io/size: small
      annotations:
        kubevirt.io/memfd: "false"
    spec:
      terminationGracePeriodSeconds: 30
      domain:
        resources:
          requests:
            memory: 2Gi
            cpu: 1
        memory:
          hugepages:
            pageSize: "1Gi"
        devices:
          disks:
            - name: containerdisk
              disk:
                bus: virtio
            - name: cloudinitdisk
              disk:
                bus: virtio
          interfaces:
          - name: default
            masquerade: {}
          - name: vhost-user-net-1
            vhostuser: {}
      networks:
      - name: default
        pod: {}
      - name: vhost-user-net-1
        multus:
          networkName: nad-ovs-dpdk-dhcp
      volumes:
        - name: containerdisk
          containerDisk:
            image: quay.io/kubevirt/fedora-cloud-container-disk-demo
        - name: cloudinitdisk
          cloudInitNoCloud:
            userData: |-
              #cloud-config
              password: fedora
              chpasswd: { expire: False }
EOF
```

#### Validate the VM <a href="#validate-the-vm" id="validate-the-vm"></a>

```yaml
$ kubectl get vm vm-test-ovs-dpdk -o yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"kubevirt.io/v1","kind":"VirtualMachine","metadata":{"annotations":{},"name":"vm-test-ovs-dpdk","namespace":"default"},"spec":{"running":true,"template":{"metadata":{"annotations":{"kubevirt.io/memfd":"false"},"labels":{"debugLogs":"true","kubevirt.io/size":"small"}},"spec":{"domain":{"devices":{"disks":[{"disk":{"bus":"virtio"},"name":"containerdisk"},{"disk":{"bus":"virtio"},"name":"cloudinitdisk"}],"interfaces":[{"masquerade":{},"name":"default"},{"name":"vhost-user-net-1","vhostuser":{}}]},"memory":{"hugepages":{"pageSize":"1Gi"}},"resources":{"requests":{"cpu":1,"memory":"2Gi"}}},"networks":[{"name":"default","pod":{}},{"multus":{"networkName":"nad-ovs-dpdk-dhcp"},"name":"vhost-user-net-1"}],"terminationGracePeriodSeconds":30,"volumes":[{"containerDisk":{"image":"quay.io/kubevirt/fedora-cloud-container-disk-demo"},"name":"containerdisk"},{"cloudInitNoCloud":{"userData":"#cloud-config\npassword: fedora\nchpasswd: { expire: False }"},"name":"cloudinitdisk"}]}}}}
    kubemacpool.io/transaction-timestamp: "2023-05-10T12:50:18.130438694Z"
    kubevirt.io/latest-observed-api-version: v1
    kubevirt.io/storage-observed-api-version: v1alpha3
  creationTimestamp: "2023-05-10T12:50:18Z"
  generation: 1
  name: vm-test-ovs-dpdk
  namespace: default
  resourceVersion: "116974"
  uid: 453dd252-2793-4de7-8dc5-9779c7e1828c
spec:
  running: true
  template:
    metadata:
      annotations:
        kubevirt.io/memfd: "false"
      creationTimestamp: null
      labels:
        debugLogs: "true"
        kubevirt.io/size: small
    spec:
      domain:
        devices:
          disks:
          - disk:
              bus: virtio
            name: containerdisk
          - disk:
              bus: virtio
            name: cloudinitdisk
          interfaces:
          - macAddress: "02:55:43:00:00:48"
            masquerade: {}
            name: default
          - macAddress: "02:55:43:00:00:49"
            name: vhost-user-net-1
            vhostuser: {}
        machine:
          type: q35
        memory:
          hugepages:
            pageSize: 1Gi
        resources:
          requests:
            cpu: "1"
            memory: 2Gi
      networks:
      - name: default
        pod: {}
      - multus:
          networkName: nad-ovs-dpdk-dhcp
        name: vhost-user-net-1
      terminationGracePeriodSeconds: 30
      volumes:
      - containerDisk:
          image: quay.io/kubevirt/fedora-cloud-container-disk-demo
        name: containerdisk
      - cloudInitNoCloud:
          userData: |-
            #cloud-config
            password: fedora
            chpasswd: { expire: False }
        name: cloudinitdisk
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-05-10T12:50:40Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: null
    status: "True"
    type: LiveMigratable
  created: true
  printableStatus: Running
  ready: true
  volumeSnapshotStatuses:
  - enabled: false
    name: containerdisk
    reason: Snapshot is not supported for this volumeSource type [containerdisk]
  - enabled: false
    name: cloudinitdisk
    reason: Snapshot is not supported for this volumeSource type [cloudinitdisk]
```

#### Validate the VMI <a href="#validate-the-vmi" id="validate-the-vmi"></a>

*Note that the VMI is annotated with the IP assigned by the PF9 DHCP server:*

`dhcp.plumber.k8s.pf9.io/dhcpserver: '{"02:55:43:00:00:49":"192.168.15.72"}'`
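
You can extract just this annotation with a `jsonpath` query (the dots in the annotation key must be escaped):

```
$ kubectl get vmi vm-test-ovs-dpdk -o jsonpath='{.metadata.annotations.dhcp\.plumber\.k8s\.pf9\.io/dhcpserver}'
```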

```yaml
$ kubectl get vmi vm-test-ovs-dpdk -o yaml                                                                                                                                                    
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  annotations:
    dhcp.plumber.k8s.pf9.io/dhcpserver: '{"02:55:43:00:00:49":"192.168.15.72"}'
    kubevirt.io/latest-observed-api-version: v1
    kubevirt.io/memfd: "false"
    kubevirt.io/storage-observed-api-version: v1alpha3
  creationTimestamp: "2023-05-10T12:50:18Z"
  finalizers:
  - kubevirt.io/virtualMachineControllerFinalize
  - foregroundDeleteVirtualMachine
  generation: 9
  labels:
    debugLogs: "true"
    kubevirt.io/nodeName: 131.153.165.65
    kubevirt.io/size: small
  name: vm-test-ovs-dpdk
  namespace: default
  ownerReferences:
  - apiVersion: kubevirt.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: VirtualMachine
    name: vm-test-ovs-dpdk
    uid: 453dd252-2793-4de7-8dc5-9779c7e1828c
  resourceVersion: "117040"
  uid: 2e399241-953a-41e4-a2ce-78a5f1598fdd
spec:
  domain:
    cpu:
      cores: 1
      model: host-model
      sockets: 1
      threads: 1
    devices:
      disks:
      - disk:
          bus: virtio
        name: containerdisk
      - disk:
          bus: virtio
        name: cloudinitdisk
      interfaces:
      - macAddress: "02:55:43:00:00:48"
        masquerade: {}
        name: default
      - macAddress: "02:55:43:00:00:49"
        name: vhost-user-net-1
        vhostuser: {}
    features:
      acpi:
        enabled: true
    firmware:
      uuid: d7e0dfc2-e769-54e6-8b02-1abcd893cedd
    machine:
      type: q35
    memory:
      hugepages:
        pageSize: 1Gi
    resources:
      requests:
        cpu: "1"
        memory: 2Gi
  networks:
  - name: default
    pod: {}
  - multus:
      networkName: nad-ovs-dpdk-dhcp
    name: vhost-user-net-1
  terminationGracePeriodSeconds: 30
  volumes:
  - containerDisk:
      image: quay.io/kubevirt/fedora-cloud-container-disk-demo
      imagePullPolicy: Always
    name: containerdisk
  - cloudInitNoCloud:
      userData: |-
        #cloud-config
        password: fedora
        chpasswd: { expire: False }
    name: cloudinitdisk
status:
  activePods:
    46e3d75b-c9b9-4af3-9d59-e9e8a9bfa8a1: 131.153.165.65
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-05-10T12:50:40Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: null
    status: "True"
    type: LiveMigratable
  guestOSInfo: {}
  interfaces:
  - infoSource: domain
    ipAddress: 10.20.117.160
    ipAddresses:
    - 10.20.117.160
    mac: "02:55:43:00:00:48"
    name: default
    queueCount: 1
  - infoSource: domain
    mac: "02:55:43:00:00:49"
    name: vhost-user-net-1
    queueCount: 1
  launcherContainerImageVersion: platform9/virt-launcher:v0.58.1
  migrationMethod: BlockMigration
  migrationTransport: Unix
  nodeName: 131.153.165.65
  phase: Running
  phaseTransitionTimestamps:
  - phase: Pending
    phaseTransitionTimestamp: "2023-05-10T12:50:18Z"
  - phase: Scheduling
    phaseTransitionTimestamp: "2023-05-10T12:50:18Z"
  - phase: Scheduled
    phaseTransitionTimestamp: "2023-05-10T12:50:40Z"
  - phase: Running
    phaseTransitionTimestamp: "2023-05-10T12:50:41Z"
  qosClass: Burstable
  runtimeUser: 0
  virtualMachineRevisionName: revision-start-vm-453dd252-2793-4de7-8dc5-9779c7e1828c-1
  volumeStatus:
  - name: cloudinitdisk
    size: 1048576
    target: vdb
  - name: containerdisk
    target: vda
```
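
To confirm the lease from inside the guest, you can connect to the serial console with `virtctl` (log in as `fedora` with the password set by cloud-init); the interface name `eth1` is an assumption and may differ in your guest image:

```
$ virtctl console vm-test-ovs-dpdk
# inside the guest:
$ ip addr show eth1    # expect an address from the 192.168.15.30-100 range
```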

### Variations of OVS networks <a href="#variations-of-ovs-networks" id="variations-of-ovs-networks"></a>

OVS networks can be configured using the `HostNetworkTemplate` Custom Resource.

#### OVS network without DPDK <a href="#ovs-network-without-dpdk" id="ovs-network-without-dpdk"></a>

```yaml
$ cat <<EOF | kubectl apply -f -
apiVersion: plumber.k8s.pf9.io/v1
kind: HostNetworkTemplate
metadata:
  name: host-network-template-ovs
  namespace: luigi-system
spec:
  ovsConfig:
  - bridgeName: "dpdkbr01"
    nodeInterface: "bond0.2"
    dpdk: false
EOF
```

#### OVS Bonded network without DPDK <a href="#ovs-bonded-network-without-dpdk" id="ovs-bonded-network-without-dpdk"></a>

```yaml
$ cat <<EOF | kubectl apply -f -
apiVersion: plumber.k8s.pf9.io/v1
kind: HostNetworkTemplate
metadata:
  name: host-network-template-ovs-bonded
  namespace: luigi-system
spec:
  ovsConfig:
  - bridgeName: "dpdkbr01"
    nodeInterface: "bond0.2,bond0.5"
    dpdk: false
    # optional parameters
    params:
      mtuRequest: 9192
      lacp: "active"  # create ovs bond with lacp enabled
EOF
```

#### OVS Bonded network with DPDK <a href="#ovs-bonded-network-with-dpdk" id="ovs-bonded-network-with-dpdk"></a>

```yaml
$ cat <<EOF | kubectl apply -f -
apiVersion: plumber.k8s.pf9.io/v1
kind: HostNetworkTemplate
metadata:
  name: host-network-template-ovs-dpdk-bonded
  namespace: luigi-system
spec:
  ovsConfig:
  - bridgeName: "dpdkbr01"
    nodeInterface: "bond0.2,bond0.5"
    dpdk: true
    # optional parameters
    params:
      mtuRequest: 9192
      bondMode: "balance-tcp"
      lacp: "active"
EOF
```
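
After applying a bonded template, the bond state (including LACP negotiation) can be inspected on the worker node with standard OVS commands; a sketch, run on the node hosting the bridge:

```
$ sudo ovs-appctl bond/show    # lists configured bonds and member link state
$ sudo ovs-appctl lacp/show    # shows LACP negotiation status when lacp is enabled
```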
