Network Working Group                                        K. Majumdar
Internet Draft                                                 Microsoft
Intended status: Standards Track                               L. Dunbar
Expires: August 20, 2024                                       Futurewei
                                                      V. Kasiviswanathan
                                                                  Arista
                                                           A. Ramchandra
                                                               Microsoft
                                                            A. Choudhary
                                                                Aviatrix
                                                       February 20, 2024


                   Multi-segment SD-WAN via Cloud DCs
                 draft-dmk-rtgwg-multisegment-sdwan-07

Abstract

   This document describes a method for SD-WAN CPEs to use GENEVE
   Encapsulation (RFC8926) to encapsulate IPsec encrypted packets and
   send them to their closest Cloud GWs. The Cloud GWs steer the IPsec
   encrypted payload through the Cloud Backbone, without decryption,
   to the egress Cloud GWs, which then forward the original IPsec
   encrypted payload to the destination CPEs. This method lets the
   Cloud Backbone connect multiple segments of SD-WAN without the
   Cloud GWs decrypting and re-encrypting the payloads.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

xxx, et al.             Expires August 20, 2024                 [Page 1]

Internet-Draft            Multi-segment SD-WAN

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on August 20, 2024.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Conventions used in this document
   3. Use Cases
      3.1. Multi-segment SD-WAN via Single Cloud GW
      3.2. Multi-segment SD-WAN via Cloud Backbone
      3.3. Analysis of Policy-based Traffic Steering
      3.4. End to End Encryption
   4. Data Plane encoding for SD-WAN Transit
      4.1. Multi-Segment SD-WAN Option Class
      4.2. SD-WAN Tunnel Endpoint Sub-TLV
      4.3. SD-WAN Tunnel Originator Sub-TLV
      4.4. Egress GW Sub-TLV
      4.5. Include Transit Sub-TLV
      4.6. Exclude-Transit Sub-TLV
   5. IPsec Flow through Cloud GWs Illustration
      5.1. Single Hop Cloud GW
      5.2. Multi-hop Transit GWs
      5.3. Data Authentication and Integrity Check by Cloud GW
   6. Illustration of Traffic from Private VPN to IPsec Tunnel
   7. Control Plane considerations
      7.1. Control Plane for CPEs
      7.2. Control Plane between CPEs and Cloud GWs
   8. Observability Consideration
   9. Security Considerations
   10. Manageability Considerations
   11. IANA Considerations
   12. References
      12.1. Normative References
      12.2. Informative References
   13. Acknowledgments

1. Introduction

   SD-WAN is widely deployed to connect enterprises' on-premises CPEs
   with services in Cloud DCs. As described in Section 4.1 of
   [Net2Cloud], enterprises have multiple options for connecting to
   Cloud DCs:

   - Direct Interconnect model,
   - Direct Interconnect model with Enterprise's virtual appliances
     in the Cloud,
   - Indirect Interconnect model via SD-WAN paths, and
   - Managed Hybrid WAN model using Enterprise's existing VPN
     connections.

   For enterprise branches with private VPN circuits interconnecting
   with a Cloud GW via an IXP (Internet eXchange Point), the
   Enterprise can extend into the Cloud DC without setting up IPsec
   paths between its on-premises CPEs and the Cloud GWs.

   Enterprises connecting to Cloud DCs may find significant benefits
   in leveraging the Cloud Backbone for transporting traffic between
   their CPEs, such as:

   1. Enterprises can benefit from the robust, high-performance
      infrastructure that cloud service providers offer, leveraging
      diverse paths and the cloud backbones' scalability and global
      reach to reduce the risk of downtime or disruptions.

   2. The scalability of the Cloud Backbone allows for efficient
      handling of increased data traffic, accommodating the growing
      demands of modern enterprises.

   3. Cloud Backbone's centralized management and orchestration
      capabilities contribute to simplified network administration,
      enabling organizations to streamline their operations and
      respond more effectively to changing business requirements.

   To ensure security, enterprise traffic between CPEs is encrypted
   and remains inaccessible to any third party, including the Cloud
   DC. For the encrypted packets to be steered through the Cloud
   Backbone, the packet header must contain information indicating
   the packet's intended route. Given that the IPsec SA between CPEs
   is maintained exclusively between the CPEs and is not accessible
   to Cloud GWs, the encrypted packet needs to be carried by a tunnel
   between the source CPE and the ingress Cloud GW. This tunnel can
   be another layer of IPsec, but that adds processing overhead: the
   Cloud GW has to decrypt the outer IPsec tunnel solely to steer the
   encrypted payload.

   By steering the encrypted traffic through the Cloud Backbone
   without decryption and re-encryption at Cloud GWs, processing
   demands at these GWs can be significantly reduced. This approach
   maintains the integrity of the encrypted traffic while saving
   processing resources within the cloud infrastructure.

   This document introduces a method for SD-WAN CPEs that uses GENEVE
   Encapsulation [RFC8926] to encapsulate IPsec encrypted packets and
   direct them to the nearest Cloud GWs. By inspecting the Sub-TLVs
   within the GENEVE header, as specified in Section 4, these
   gateways can determine whether a packet needs to traverse the
   backbone without decryption. Once a packet is determined to be
   intended for backbone traversal, the IPsec encrypted payload is
   steered through the Cloud Backbone without decryption to the
   optimal egress Cloud GWs. These gateways then forward the original
   IPsec encrypted payload to the destination CPEs. This method lets
   the Cloud Backbone connect multiple SD-WAN segments without Cloud
   GWs decrypting and re-encrypting payloads.

   GENEVE is selected in this document as the encapsulation protocol
   due to its widespread usage in Cloud DC sites.

2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
   NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
   "MAY", and "OPTIONAL" in this document are to be interpreted as
   described in BCP14 [RFC2119] [RFC8174] when, and only when, they
   appear in all capitals, as shown here.

   The following acronyms and terms are used in this document:

   Cloud DC:   Off-premises data center, managed by a third party,
               that hosts applications, services, and workloads for
               different organizations or tenants.

   CPE:        Customer (Edge) Premises Equipment.

   OnPrem:     On-premises data centers and branch offices.

   RR:         Route Reflector.

   SD-WAN:     An overlay connectivity service that optimizes
               transport of IP packets over one or more Underlay
               Connectivity Services and determines forwarding
               behavior by applying Policies to them. [MEF-70.1]

   VPN:        Virtual Private Network.

3. Use Cases

3.1. Multi-segment SD-WAN via Single Cloud GW

   For enterprise branches that have established SD-WAN paths to a
   Cloud GW for accessing Cloud services, the Cloud GW can be used to
   connect those branches, as shown in Figure 1.

   Here are some reasons for connecting those branches via a Cloud
   GW:

   - The public internet among those branches might have limited
     bandwidth, unpredictable connection performance, or be prone to
     cyber-attacks. In comparison, the network paths from CPEs to the
     Cloud GW have more reliable connections and are constantly
     monitored by sophisticated network functions.

   - It is easier to use Cloud-based security functions, such as
     firewalls and DDoS protection, to apply consistent policy
     enforcement for workloads/services to the Cloud and across the
     branches.

   - Proprietary cloud-based tools and SaaS (Software as a Service)
     may be available in specific deployments to collect and analyze
     threats to the traffic.

                      (^^^^^^^^^^^^)
                     (    Cloud     )
                   ( +----+  +----+ )
            + -----(-|Edge|  + GW | )
   Direct   |      ( +----+  +/--\+ )
   Connect  |        (^^^^^^^/^^^^\^)
          {-+---}           /      \ SD-WAN Path CPE<->GW
          { VPN }          /        \
          {-+---}         /     IPsec Tunnel
            +-------+----/------+    \
                    |   /       |     \
                   ++--/+       |    +-\--+
                   |CPE1|       +----+CPE2|
                   +----+            +----+
   Client Route:   11.1.1.x          10.1.1.x
                   21.1.1.x          20.1.1.x
                                     30.1.1.x

       Figure 1 Multi-Segment SD-WAN stitching via a Cloud GW

3.2. Multi-segment SD-WAN via Cloud Backbone

   For geographically distant enterprise branches that have
   established SD-WAN paths to their corresponding Cloud GWs to
   access Cloud services in different geographic locations, the Cloud
   backbone can connect those branches, as shown in Figure 2.

   The reasons for using the Cloud Backbone to interconnect those
   branches are similar to those for interconnecting multiple
   branches via a single Cloud GW, described in the previous section.

                     (^^^^^^^^^^^^^^^)
                     (     Cloud     )
                    ( +----+  +----+ )                +-----+
               + ---(-|Edge|==| GW1|=================== GW2 |
   Direct      |    ( +----+  +/--\+ )                +--|--+
   Connect     |      (^^^^^^^/^^^^\^)                   |
             {-+---}         /      \                    |
             { VPN }        /        \                +-----+
             {-+---}       /    IPsec Tunnel          |CPE10|
               +-------+--/--------+   \              +-----+
                       | /         |    \             10.2.1.x
                      ++/--+       |    +\---+        20.2.1.x
                      |CPE1|       +----+CPE2|        30.2.1.x
                      +----+            +----+
   Client Route:      11.1.1.x          10.1.1.x
                      21.1.1.x          20.1.1.x
                                        30.1.1.x

     Figure 2 Multi-Segment SD-WAN Stitching via Cloud Backbone

3.3. Analysis of Policy-based Traffic Steering

   There are many well-developed methods, such as SRv6 or MPLS-TE,
   to steer traffic through specific nodes.
   Those traffic steering methods are effective when the entire
   network domain is under one administrative control. However,
   traffic from on-premises CPEs to Cloud GWs via the public internet
   can only be forwarded based on the packets' destination addresses.

   SD-WAN allows for the setup of multiple links (paths), some of
   which cross the public Internet, from the same SD-WAN branch CPE
   to a Cloud GW; each link (or path) represents a dual tunnel
   connection from a unique public IP of the SD-WAN CPE to two
   different instances of Cloud GW. Using a Cloud GW to interconnect
   those on-premises CPEs eliminates the need to manage the multiple
   ISPs' links/paths between the CPEs.

3.4. End to End Encryption

   To ensure the confidentiality, integrity, and availability of
   communication among CPEs, the traffic between the CPEs should be
   encrypted by IPsec SAs if it traverses the public Internet.

   When the traffic between the enterprise's CPEs doesn't terminate
   within the Cloud DCs, the processing burden on Cloud GWs can be
   significantly reduced if the Cloud GWs don't need to decrypt and
   re-encrypt transit IPsec encrypted traffic among CPEs.

   This document describes the mechanisms for the IPsec encrypted
   traffic between CPEs to traverse the Cloud GWs without being
   decrypted and re-encrypted by the Cloud GWs.

4. Data Plane encoding for SD-WAN Transit

   For Cloud GWs to differentiate packets destined for their internal
   hosts/services, which require decryption, from transit packets to
   be forwarded to the respective destination branch CPEs, proper
   marking is needed in the packets' header.

   As GENEVE Encapsulation [RFC8926] is supported by most Cloud
   Service Providers, GENEVE is chosen as the encapsulation header
   for Cloud GWs to steer IPsec encrypted packets among CPEs without
   decryption.

4.1. Multi-Segment SD-WAN Option Class

   The Geneve header is specified in Section 3 of [RFC8926].
   A new GENEVE Option Class (Type value=TBD) is added to indicate
   that the Multi-segment SD-WAN relevant Sub-TLVs are encoded in the
   GENEVE header.

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | multi-seg-SD-WAN Option Class |C|    Type     |R|R|R| Length  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                SD-WAN Tunnel Endpoint Sub-TLV                 ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~           Optional SD-WAN Tunnel Originator Sub-TLV           ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                  Optional Egress GW Sub-TLV                   ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   //                                                             //
   //        Optional Type Length Value objects (variable)        //
   //                                                             //
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure 4 Multi Segment SD-WAN Option Class

   The C-bit needs to be set so that a receiving node drops the
   packet if it does not recognize the option [RFC8926].

   Type indicates the various types of multi-segment SD-WAN.

   Type = Multi-seg-SDWAN_subType1 (To be assigned by IANA):
      Single Hop Transit SD-WAN.

   Type = Multi-seg-SDWAN_subType2 (To be assigned by IANA):
      Multi-Hop Transit SD-WAN with explicitly specified egress
      Cloud GW.

   Type = Multi-seg-SDWAN_subType3 (To be assigned by IANA):
      Multi-Hop Transit SD-WAN without specified egress Cloud GW.

   Note: the payload after the multi-seg-SD-WAN Option Class can be
   IPv4 or IPv6. The IP header protocol type = 50 (ESP) [RFC4303]
   indicates the payload is IPsec ESP encrypted.

4.2. SD-WAN Tunnel Endpoint Sub-TLV

   The SD-WAN Endpoint Sub-TLV indicates the destination CPE of the
   IPsec Tunnel.
   For example, for the SD-WAN IPsec SA from CPE1 to CPE2 shown in
   Figure 1, the Tunnel Endpoint Sub-TLV of the Geneve Header carries
   CPE2's IP address.

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |SD-WAN Endpoint|    length     |   Reserved    |      TTL      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    SD-WAN Dst Addr Family     |            Address            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+          (variable)           +
   ~                                                               ~
   |                   SD-WAN end point Address                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 5 SD-WAN Endpoint Sub-TLV

   TTL is set by the SD-WAN Tunnel Originator, e.g., CPE1. Each
   transit node or transit region/zone (visible to the CPEs) SHOULD
   decrement the TTL so that the destination CPE can know the number
   of logical transit nodes (cloud regions or zones) the packet has
   traversed. Enterprises can also use the TTL to limit the number of
   transit nodes/regions the packets traverse.

4.3. SD-WAN Tunnel Originator Sub-TLV

   The SD-WAN Tunnel Originator Sub-TLV is an optional Sub-TLV inside
   the multi-seg-SD-WAN Option Class that indicates the originating
   CPE of the IPsec Tunnel.

   For example, for the SD-WAN IPsec SA from CPE1 to CPE2 shown in
   Figure 1, the Tunnel Originator Sub-TLV inside the Geneve Header
   of the packets indicates CPE1's address.

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |SDWAN Origin   |    length     |           reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    SD-WAN Org Addr Family     |            Address            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+          (variable)           +
   ~                                                               ~
   |               SD-WAN Tunnel Originator Address                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure 6 SD-WAN Tunnel Originator Sub-TLV
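
   As a rough sketch of how the Sub-TLVs in Figures 5 and 6 might be
   serialized, the Python below packs a Tunnel Endpoint and a Tunnel
   Originator Sub-TLV. Several details are assumptions that this
   document does not fix: the Sub-TLV type code points (1 and 2 are
   placeholders), one-octet type, length, Reserved, and TTL fields, a
   two-octet address family taken from the IANA Address Family
   Numbers registry (1 = IPv4, 2 = IPv6), a length that counts the
   octets following the first four, and the absence of padding.

```python
import ipaddress
import struct

# Placeholder code points: the Sub-TLV type values are not assigned
# in this document, so these numbers are purely illustrative.
SUBTLV_TUNNEL_ENDPOINT = 1
SUBTLV_TUNNEL_ORIGINATOR = 2

# IANA Address Family Numbers.
AF_IPV4 = 1
AF_IPV6 = 2


def _encode_address(addr: str) -> bytes:
    """Return a 2-octet address family followed by the packed address."""
    ip = ipaddress.ip_address(addr)
    family = AF_IPV4 if ip.version == 4 else AF_IPV6
    return struct.pack("!H", family) + ip.packed


def encode_tunnel_endpoint(dst_cpe: str, ttl: int) -> bytes:
    """Pack type, length, Reserved, TTL (one octet each, assumed widths)."""
    body = _encode_address(dst_cpe)
    return struct.pack("!BBBB", SUBTLV_TUNNEL_ENDPOINT, len(body), 0, ttl) + body


def encode_tunnel_originator(src_cpe: str) -> bytes:
    """Pack type, length (one octet each) and a 2-octet reserved field."""
    body = _encode_address(src_cpe)
    return struct.pack("!BBH", SUBTLV_TUNNEL_ORIGINATOR, len(body), 0) + body


def decrement_ttl(endpoint_subtlv: bytes) -> bytes:
    """Return a copy with the TTL octet (offset 3) decremented, floor 0."""
    ttl = max(endpoint_subtlv[3] - 1, 0)
    return endpoint_subtlv[:3] + bytes([ttl]) + endpoint_subtlv[4:]
```

   In this sketch, encode_tunnel_endpoint("192.0.2.2", 64) stands in
   for CPE1 naming CPE2 as the tunnel endpoint, and decrement_ttl()
   models what a transit node would do to the TTL field. The
   addresses come from the documentation range and are not taken from
   the figures.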

   The Tunnel Originator Sub-TLV in the GENEVE header can assist
   Cloud transit nodes in applying appropriate policies when
   forwarding the packet.

4.4. Egress GW Sub-TLV

   For the multi-segment SD-WAN via Cloud Backbone scenario, the
   originator CPE can use the Egress GW Sub-TLV to specify the Egress
   Cloud GW for reaching the destination CPE.

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |SDWAN EgressGW |    length     |           reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Egress GW Addr Family     |            Address            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+          (variable)           +
   ~                                                               ~
   |                       Egress GW Address                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 7 SD-WAN Egress GW Sub-TLV

   The originator CPE can get the Egress GW address by configuration
   or through a control plane protocol exchange with the destination
   CPEs. The detailed Control Plane protocol extension is out of the
   scope of this document.

4.5. Include Transit Sub-TLV

   The Include-Transit Sub-TLV is an optional Sub-TLV for explicitly
   including a list of Cloud Availability Regions or Zones for
   reasons like:

   - Those regions have certain OAM and security functions for
     improved visibility.

   - To comply with regulations, etc.

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Include-Transit|    length     |Transit_Type |I|   Reserved    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Transit node ID                        |
   ~                                                               ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 8 Include-Transit Sub-TLV

   Multiple Include-Transit Sub-TLVs can be incorporated into a
   single GENEVE header to denote multiple nodes or regions intended
   for inclusion when steering the packet through the Cloud Backbone.
   Note that the multiple Include-Transit Sub-TLVs constitute a set
   rather than an ordered list.

   Transit_type:
      TBD1: when the Transit node ID is represented as a number, such
      as a Cloud Availability Region or Zone numeric identifier that
      the Cloud Operator provides.
      TBD2: when the Transit node ID is represented as a string, such
      as a Cloud Availability Region or Zone name that the Cloud
      Operator provides.
      TBD3: when the Transit node ID is represented as an IP address.

   I-bit: When set to 0, steering through the transit node ID is done
      on a best-effort basis.
      When set to 1, the Transit Node ID must be traversed on the way
      through the Cloud Backbone. If the Transit Node ID cannot be
      traversed, an alert or alarm must be generated to the
      enterprise via an out-of-band channel. It is out of the scope
      of this document to specify those alerts or alarms.

4.6. Exclude-Transit Sub-TLV

   The Exclude-Transit Sub-TLV is an optional Sub-TLV for explicitly
   excluding a list of Cloud Availability Regions or Zones for
   reasons like:

   - To comply with regulations,

   - To avoid regions that impose certain risks.

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Exclude-Transit|    length     |Transit_Type |E|   Reserved    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Transit node ID                        |
   ~                                                               ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 9 Exclude-Transit Sub-TLV

   Multiple Exclude-Transit Sub-TLVs can be incorporated into a
   single GENEVE header to denote multiple nodes or regions to be
   excluded when steering the packet through the Cloud Backbone.
   Note that the multiple Exclude-Transit Sub-TLVs constitute a set
   rather than an ordered list.

   Transit_type: same as in Section 4.5.

   E-bit: When set to 0, the transit node ID is avoided on a
      best-effort basis.
      When set to 1, the Transit Node ID must be avoided on the way
      through the Cloud Backbone. If the Transit Node ID cannot be
      avoided, an alert or alarm must be generated to the enterprise
      via an out-of-band channel. It is out of the scope of this
      document to specify those alerts or alarms.

5. IPsec Flow through Cloud GWs Illustration

   This section illustrates how Cloud GWs connect traffic flows
   carried by IPsec tunnels.

5.1. Single Hop Cloud GW

   Assume that all CPEs are under one administrative control (e.g.,
   iBGP). Using Figure 1 as an example:

   - There is a bidirectional IPsec tunnel between CPE1 and the Cloud
     GW, with IPsec SA1 for the traffic from CPE1 to the Cloud GW and
     IPsec SA2 for the traffic from the Cloud GW to CPE1.

   - There is a bidirectional IPsec tunnel between CPE2 and the Cloud
     GW, with IPsec SA3 for the traffic from CPE2 to the Cloud GW and
     IPsec SA4 for the traffic from the Cloud GW to CPE2.

   - All the CPEs are under one iBGP administrative domain, with a
     Route Reflector (RR) as their controller. The CPEs notify their
     peers of their corresponding Cloud GW addresses (which is out of
     the scope of this document).

   When 11.1.1.x and 10.1.1.x need to communicate with each other,
   CPE1 and CPE2 establish a bidirectional IPsec Tunnel, with SA5 for
   the traffic from CPE1 to CPE2 and SA6 for the traffic from CPE2 to
   CPE1.

   Assume the IPsec ESP Tunnel Mode is used. A packet from 11.1.1.1
   to 10.1.1.2 has the following outer header:

   Outer IP header:
   +---------------------------+
   | protocol = 17(UDP)        |
   | src = CPE1                |
   | dst = Cloud GW            |
   +---------------------------+
   | Source Port =xxxx         |
   | Dst Port = 6081 (GENEVE)  |
   +===========================+
   |       GENEVE Header       |
   | multi-seg-SD-WAN Option   |
   |GENEVE Proto = 50 (ESP)    |
   +- - - - - - - - - - - - - -+
   |SD-WAN EndPt SubTLV (CPE2) |
   +---------------------------+ <-----------+
   |SPI(Security Parameter Idx)| Authenticated
   +---------------------------+             |
   |      sequence number      |             |
   +---------------------------+ <-+         |
   | payload IP header:        |   |         |
   |   src = 11.1.1.1          |   |         |
   |   dst = 10.1.1.2          |   |         |
   +---------------------------+ Encrypted   |
   |        TCP header +       |   |         |
   ~     payload (variable)    ~   |         |
   |                           |   |         |
   +===========================+ <-+ --------+
   |    Authentication Data    |
   +---------------------------+

     Figure 8 Packet header illustration of traffic to Cloud GWs

5.2. Multi-hop Transit GWs

   Traffic to/from geographically separated CPEs can cross multiple
   Cloud DCs via the Cloud backbone.

   The on-premises CPEs are under one administrative control (e.g.,
   iBGP). Using Figure 2 as an example:

   - There is a bidirectional IPsec tunnel between CPE1 and the Cloud
     GW1, with IPsec SA1 for the traffic from CPE1 to the Cloud GW1
     and IPsec SA2 for the traffic from the Cloud GW1 to CPE1.

   - There is a bidirectional IPsec tunnel between CPE10 and the
     Cloud GW2, with IPsec SA3 for the traffic from CPE10 to the
     Cloud GW2 and IPsec SA4 for the traffic from the Cloud GW2 to
     CPE10.

   - All the CPEs are under one iBGP administrative domain, with a
     Route Reflector (RR) as their controller. CPEs notify their
     peers of their corresponding Cloud GW addresses.

   When 11.1.1.x and 10.2.1.x need to communicate with each other,
   CPE1 and CPE10 establish a bidirectional IPsec Tunnel, with SA5
   for the traffic from CPE1 to CPE10 and SA6 for the traffic from
   CPE10 to CPE1.
   Assume the IPsec ESP Tunnel Mode is used. A packet from 11.1.1.1
   to 10.2.1.2 has the following outer header:

   Outer IP header:
   +---------------------------+
   | proto = 17 (UDP)          |
   | src = CPE1                |
   | dst = Cloud GW1           |
   +===========================+
   |       GENEVE Header       |
   | multi-seg-SD-WAN Option   |
   |GENEVE Proto = 50 (ESP)    |
   +- - - - - - - - - - - - - -+
   |SD-WAN EndPt SubTLV (CPE10)|
   +---------------------------+
   |      EgressGW-SubTLV      |
   +---------------------------+ <-----------+
   |SPI(Security Parameter Idx)| Authenticated
   +---------------------------+             |
   |      sequence number      |             |
   +---------------------------+ <-+         |
   | payload IP header:        |   |         |
   |   src = 11.1.1.1          |   |         |
   |   dst = 10.2.1.2          |   |         |
   +---------------------------+ Encrypted   |
   |        TCP header +       |   |         |
   ~     payload (variable)    ~   |         |
   |                           |   |         |
   +===========================+ <-+ --------+
   |    Authentication Data    |
   +---------------------------+

          Figure 9 GENEVE header encapsulated IPsec packet

5.3. Data Authentication and Integrity Check by Cloud GW

   Because the IPsec SA already encrypts the client payload between
   the CPEs, the Cloud GW doesn't need to decrypt and re-encrypt the
   payload when relaying it to the destination CPE. However, data
   authentication and integrity checks are needed because the traffic
   traverses an untrusted network.

   [RFC2403] and [RFC2404] define authentication algorithms used in
   AH and ESP. SHA2 224/256/384/512 are some of the cryptographic
   hashing algorithms that can be used as part of a Hashed Message
   Authentication Code (HMAC).

5.4. Packet Header Processing

   In Figure 1, upon receiving a GENEVE encapsulated packet with the
   GENEVE Protocol Type = 50 (ESP), the Cloud GW does the following:

   - Authenticate the packet using a preconfigured authentication
     method.

   - Extract the destination CPE address from the SD-WAN Endpoint
     Sub-TLV inside the GENEVE header.
     Replace the outer IP destination address with the destination
     CPE address.

   - Optionally replace the outer IP source address with the Cloud GW
     address.

   - Leave the GENEVE header unchanged.

   - Forward the packet to the destination CPE.

   The Cloud GW SHOULD drop all packets whose source addresses or
   GENEVE header Sub-TLV values are not recognized or registered, to
   prevent unauthorized users from using the Cloud services.

5.5. Error Handling

   As traffic through the Cloud Backbone consumes valuable resources,
   the Cloud GW SHOULD drop packets with unregistered source or
   destination addresses.

   The Cloud GW SHOULD drop packets originated from an unpaid (or
   unregistered) address (CPE).

   The Cloud GW SHOULD validate the value of the SD-WAN Endpoint
   Sub-TLV and drop the packet if that value is an unpaid (or
   unregistered) address.

6. Illustration of Traffic from Private VPN to IPsec Tunnel

   This section illustrates a Cloud GW connecting client traffic from
   a branch CPE via a Private VPN to another CPE via an IPsec tunnel.
   Using Figure 1 as an example:

   - CPE1 sends traffic via a Private VPN (Direct Connect to the
     Cloud Edge) to the Cloud GW. The traffic is not encrypted.

   - There is a bidirectional IPsec tunnel between CPE2 and the Cloud
     GW, with IPsec SA1 for the traffic from CPE2 to the Cloud GW and
     IPsec SA2 for the traffic from the Cloud GW to CPE2.

   - All the CPEs are under one iBGP administrative domain, with a
     Route Reflector (RR) as their controller. CPEs notify their
     peers of their corresponding Cloud GW addresses.

   Assume the IPsec ESP Tunnel Mode is used for the IPsec SA between
   the Cloud GW and CPE2.
   For a packet from 11.1.1.1 to 10.2.1.2, the following header is
   added by CPE1 when sending over the Private VPN:

   Outer IP header:
   +---------------------------+
   | proto = 17 (UDP)          |
   | src = CPE1                |
   | dst = Cloud GW            |
   +===========================+
   |       GENEVE Header       |
   | multi-seg-SD-WAN Option   |
   |GENEVE Proto =TCP/UDP/etc. |
   +- - - - - - - - - - - - - -+
   |SD-WAN EndPt SubTLV (CPE2) |
   +---------------------------+ <-+
   | payload IP header:        |   |
   |   src = 11.1.1.1          |   |
   |   dst = 10.2.1.2          |   |
   +---------------------------+ Not Encrypted
   |        TCP header +       |   |
   ~     payload (variable)    ~   |
   |                           |   |
   +===========================+ <-+

           Figure 10 Illustration of packet through VPN

   Upon receiving the GENEVE encapsulated packet with the
   "Multi-Segment-SD-WAN" option, the Cloud GW extracts the
   destination CPE from the GENEVE header and encrypts the packet
   with IPsec SA2 to forward it to the destination (i.e., CPE2). The
   GENEVE Header is carried to CPE2.

   Outer IP header:
   +---------------------------+
   | proto = 17 (UDP)          |
   | src = Cloud GW            |
   | dst = CPE2                |
   +===========================+
   |       GENEVE Header       |
   | multi-seg-SD-WAN Option   |
   |GENEVE Proto =50 (ESP)     |
   +- - - - - - - - - - - - - -+
   |SD-WAN EndPt SubTLV (CPE2) |
   +---------------------------+ <-----------+
   |SPI(Security Parameter Idx)| Authenticated
   +---------------------------+             |
   |      sequence number      |             |
   +---------------------------+ <-+         |
   | payload IP header:        |   |         |
   |   src = 11.1.1.1          |   |         |
   |   dst = 10.2.1.2          |   |         |
   +---------------------------+ Encrypted   |
   |        TCP header +       |   |         |
   ~     payload (variable)    ~   |         |
   |                           |   |         |
   +===========================+ <-+ --------+
   |    Authentication Data    |
   +---------------------------+

     Figure 11 Illustration of packet from the Egress Cloud GW

7. Control Plane considerations

7.1. Control Plane for CPEs

   The control plane enables SD-WAN edges to discover their
   properties and attached routes.
   The on-premises CPEs and their vCPEs (or Virtual Appliances in the
   Cloud DC) can be controlled by one iBGP instance.
   [SD-WAN-Edge-Discovery] describes the mechanism for SD-WAN edges
   to discover each other's properties. The IPsec Key Exchange
   between on-premises CPEs and the vCPE is carried in iBGP Updates
   through the RR [SD-WAN-Edge-Discovery].

7.2. Control Plane between CPEs and Cloud GWs

   It is common to have eBGP sessions between an enterprise's CPEs
   and the Cloud GWs. An enterprise-owned vCPE can establish an eBGP
   session with the Cloud VPN GW for accessing the workloads hosted
   in the Cloud DCs. If an IPsec tunnel is required between the Cloud
   DC GW and the vCPE, the full IKEv2 exchange for IPsec must be
   carried out between the vCPE and the Cloud GW.

8. Observability Consideration

   Observability considerations encompass monitoring, analysis, and
   reporting mechanisms to gain insights into the behavior and
   performance of the multi-segment SD-WAN infrastructure. Key
   observability aspects include:

   - Performance Metrics: Monitor and collect performance metrics
     related to link utilization, latency, and packet loss across the
     SD-WAN segments and Cloud DC backbone. This data provides
     insights into the overall health and efficiency of the network.
     IP Flow Information Export (IPFIX) [RFC7011] is one of the
     standardized methods to expose traffic flow over the network.

   - Global Network Topology Visualization: Use visualization tools
     to depict the global network topology, showing the
     interconnections and traffic flows between different SD-WAN
     segments and Cloud DCs.

   - Control Plane Monitoring: Monitor the control plane for both
     CPEs and the communication between CPEs and Cloud GWs. This
     includes tracking route discovery, path selection, and any
     changes in network state to ensure proper functioning of the
     SD-WAN control plane.
   - Security Event Logging: Capture and analyze security-related
     events, including threat detection, authentication failures, and
     any unauthorized access attempts. Syslog [RFC5424] is a valuable
     tool for security monitoring and auditing.

   These considerations contribute to the overall success of the
   multi-segment SD-WAN deployment connecting edge devices via a Cloud
   DC backbone.

9. Security Considerations

9.1. Threat Analysis

   As shown in Figure 3, the information carried by the GENEVE Header
   is not encrypted and is therefore susceptible to Man-in-the-Middle
   (MitM) attacks. An attacker can intercept and potentially alter the
   information in the GENEVE header between the branch CPEs and the
   Cloud GWs without the knowledge or consent of the enterprise or the
   Cloud provider.

   Here is the threat analysis of the MitM attacks between CPEs and
   Cloud GWs:

   a) Eavesdropping: Attackers can learn the enterprise's branch
      locations and their respective contracted Cloud GWs. As the
      payload between the CPEs is encrypted, attackers cannot obtain
      any data exchanged between CPEs. This threat is no different
      from direct IPsec SAs between two CPEs.

   b) Data Manipulation: Attackers alter the content (Sub-TLVs) in
      the GENEVE header. As packets with unrecognized source addresses
      or invalid values in the Sub-TLVs of the GENEVE header are
      dropped by Cloud GWs, there might be a higher packet drop rate
      between the CPEs. Packet drop is not a new problem. The
      transport layer, such as TCP or QUIC, can handle packet drop
      well.

   c) Potential stealing of Cloud Backbone bandwidth: A threat actor
      might want to leverage Cloud Backbones to transport its own
      traffic between two locations without paying for the services.
      For example, a legitimate Cloud subscriber pays for the Cloud
      Backbone transport services for traffic between CPE-A and CPE-B.
      The attacker, who has two locations far apart (say Node-A and
      Node-B), can use CPE-A's address as the source address and
      CPE-B as the value in the SD-WAN Endpoint Sub-TLV for a packet
      from Node-A to Node-B before it reaches the ingress Cloud GW.
      When the packet is sent from the egress Cloud GW via the
      Internet towards CPE-B, the actor can change the source address
      back to Node-A and the destination address to Node-B. By doing
      so, Node-A and Node-B can maintain the IPsec tunnel via the
      Cloud Backbone without paying for the service.

   Therefore, it is necessary to have some level of data integrity and
   authentication for traffic between CPEs and Cloud GWs, even though
   it is not necessary for Cloud GWs to decrypt and re-encrypt the
   payload between CPEs.

9.2. HMAC-based Integrity and Authentication

   HMAC (Hash-based Message Authentication Code) is a widely used
   cryptographic technique for ensuring both data integrity and
   authentication. It can be used between CPEs and Cloud GWs to verify
   that the GENEVE header has not been tampered with.

   The basic idea behind HMAC is to combine a secret key and a hash
   function to produce a fixed-size authentication code for the GENEVE
   header between CPEs and the Cloud GW. This authentication code is
   then sent along with the data itself.

   When the Cloud GW and the destination CPEs receive the data and the
   authentication code, they can independently compute the HMAC using
   the same key and hash function. If the computed HMAC matches the
   received authentication code, it indicates that the data has not
   been altered, as long as the secret key remains confidential.
   The HMAC authentication code can be carried by an HMAC Sub-TLV in
   the GENEVE Header, as specified below:

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |MultiSDWAN-HMAC|    length     |           reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                                                               ~
   |       HMAC Authentication Code for entire GENEVE Header       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

           Figure 12 Multi Segment SD-WAN HMAC Sub-TLV

   The HMAC Authentication Code, a.k.a. the HMAC hash value, is
   computed over all the bytes of the GENEVE header, with the value
   field of the MultiSDWAN-HMAC Sub-TLV set to 0.

   The advantages of using HMAC are:

   - Data Integrity: HMAC provides a strong mechanism for verifying
     the integrity of data. By hashing the message together with a
     secret key, it generates a fixed-size hash value that ensures
     the data has not been tampered with during transmission.

   - Authentication: HMAC also verifies the authenticity of the
     sender. Since HMAC requires a shared secret key between the
     sender and receiver, it confirms that the sender holds that key.

   - Efficiency: HMAC is computationally efficient, making it
     suitable for real-time and resource-constrained devices. It uses
     simple bitwise operations and hash functions.

   - Resistance to Tampering: HMAC is designed to resist tampering
     such as message modification and forged-message insertion. Any
     change in the message results in a different HMAC value.
     Detecting replayed or deleted messages additionally requires a
     freshness field, such as a sequence number, covered by the HMAC.

   - Flexibility: HMAC can be used with various hash functions, such
     as SHA-256 or SHA-512, depending on the desired level of
     security.

   - Widely Supported: HMAC is a well-established and widely
     supported authentication mechanism, making it easy to integrate
     into different systems.
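
   The computation described above (HMAC over the whole GENEVE header
   with the Sub-TLV value field set to 0) can be sketched as follows.
   This is a minimal illustration, not a normative encoding. It
   assumes HMAC-SHA-256 with an untruncated 32-byte code, a Sub-TLV
   layout of one type byte, one length byte counting the value in
   bytes, and two reserved bytes, and it uses the Sub-TLV type 4
   requested in the IANA section. The function names, the shared key,
   and the 12-byte placeholder for the rest of the header are
   hypothetical.

```python
import hashlib
import hmac

HMAC_LEN = 32  # assumption: HMAC-SHA-256, code carried untruncated


def build_hmac_subtlv(subtlv_type: int = 4) -> bytes:
    """Return an HMAC Sub-TLV whose value field is all zeros.

    Assumed layout: type (1 byte), value length in bytes (1 byte),
    reserved (2 bytes), then the value field.
    """
    return bytes([subtlv_type, HMAC_LEN, 0, 0]) + bytes(HMAC_LEN)


def compute_code(key: bytes, header: bytes, value_offset: int) -> bytes:
    """HMAC over the entire header with the HMAC value field zeroed."""
    zeroed = bytearray(header)
    zeroed[value_offset:value_offset + HMAC_LEN] = bytes(HMAC_LEN)
    return hmac.new(key, bytes(zeroed), hashlib.sha256).digest()


def sign_header(key: bytes, header: bytes, value_offset: int) -> bytes:
    """Sender side: write the computed code into the value field."""
    code = compute_code(key, header, value_offset)
    return header[:value_offset] + code + header[value_offset + HMAC_LEN:]


def verify_header(key: bytes, header: bytes, value_offset: int) -> bool:
    """Receiver side: recompute the code and compare in constant time."""
    received = header[value_offset:value_offset + HMAC_LEN]
    expected = compute_code(key, header, value_offset)
    return hmac.compare_digest(received, expected)


if __name__ == "__main__":
    key = b"example-shared-secret"      # hypothetical pre-shared key
    rest_of_header = bytes(12)          # placeholder for other header bytes
    header = rest_of_header + build_hmac_subtlv()
    offset = len(rest_of_header) + 4    # value field follows the 4-byte Sub-TLV head

    signed = sign_header(key, header, offset)
    print(verify_header(key, signed, offset))           # True

    tampered = bytearray(signed)
    tampered[0] ^= 0x01                 # flip one bit outside the HMAC field
    print(verify_header(key, bytes(tampered), offset))  # False
```

   A receiver such as a Cloud GW would call verify_header() before
   acting on any Sub-TLV and drop the packet when it returns False.
   The comparison uses hmac.compare_digest() so that the check does
   not leak timing information about the expected code.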
   The following are common limitations of HMAC and the reasons why
   their risks are acceptable in the scenario described in this
   document.

   - Key Management: The security of HMAC depends heavily on the
     confidentiality and management of the shared secret key. If the
     key is compromised, the data packets from CPEs to the Cloud GW
     can be dropped, but the user payloads are not exposed because
     they are protected by the IPsec SA encryption.

   - Lack of Non-Repudiation: HMAC provides data integrity and sender
     authentication but does not provide non-repudiation.
     Non-repudiation is the ability to prove that a message was sent
     by a specific sender, which HMAC alone cannot guarantee. This
     risk is the same as for IPsec-protected traffic sent directly
     between two CPEs.

   - Limited to Symmetric Cryptography: HMAC relies on symmetric key
     cryptography, which means that both parties must share the same
     secret key. As the Cloud backbone interconnecting the CPEs is a
     paid service, there are established channels to distribute the
     symmetric key.

   - No Protection Against Eavesdropping: While HMAC ensures data
     integrity and sender authentication, it does not provide
     encryption. Eavesdropping does not pose additional risks to
     payloads encrypted by the IPsec SA.

   In summary, HMAC-based integrity and authentication offer strong
   security benefits in terms of data integrity and sender
   authentication. Even though HMAC does not provide non-repudiation
   or protection against eavesdropping, the IPsec encrypted payload
   between CPEs is not impacted.

9.3. AH based Integrity and Authentication

   Enterprises or Cloud providers concerned about secret HMAC keys
   being compromised can add another layer of AH [RFC4301] or ESP-NULL
   [RFC2410] [RFC6071] protection on top of the IPsec encryption
   between the two CPEs.
   Both AH and ESP-NULL require pairwise IPsec key management between
   Cloud GWs and the CPEs, and therefore require more processing on
   Cloud GWs and CPEs. In addition, AH-protected packets cannot
   traverse NAT because the outer IP addresses change.

10. Manageability Considerations

   The following manageability considerations are crucial for the
   successful deployment and ongoing operation of the mechanisms
   described in this document:

   - Centralized Orchestration: A centralized orchestration system is
     needed to manage and authenticate multiple SD-WAN segments
     through the Cloud GWs.

   - Policy-based Configuration: Use policy-driven configurations to
     streamline the deployment of SD-WAN segments and their
     connectivity options. This approach allows for efficient
     management of network policies, ensuring consistent behavior
     across diverse deployment scenarios. [RFC8192] can be used to
     automate the security policy configurations.

   - Real-time Monitoring and Analytics: Integrate monitoring and
     analytics tools to provide real-time visibility into the
     performance and health of SD-WAN segments. This includes
     monitoring bandwidth utilization, latency, packet loss, and
     other key performance indicators to promptly identify and
     address any issues.

   - Automated Alerting and Reporting: Implement automated alerting
     mechanisms to promptly notify network administrators of
     potential issues or anomalies within the SD-WAN infrastructure.
     Additionally, generate regular reports to facilitate performance
     analysis, capacity planning, and compliance monitoring.

11. IANA Considerations

   IANA is requested to assign a new GENEVE Option Class from the
   IETF Review range as shown below:

   Option Class  Description           Assignee/Contact  Reference
   ------------  --------------------  ----------------  ---------------
   TBD           Multi Segment SD-WAN  IETF              [this document]

   Multi-seg-
   SDWAN_subType  Description         Assignee/Contact  Reference
   -------------  ------------------  ----------------  ---------------
   TBD1           Single Hop Transit  IETF              [this document]
   TBD2           MultiHopTransit     IETF              [this document]
   TBD3           MultiHop wo egress  IETF              [this document]

   IANA is requested to create the following registry, with the
   initial values shown, in the Multi Segment SD-WAN Geneve Option
   Class registry group:

   Registry: Multi Segment SD-WAN Sub-TLVs
   Assignment Policy: IETF Review
   Reference: [this document]

   Sub-TLV Type  Description         Reference
   ------------  ------------------  ---------------
   0             Reserved
   1             SD-WAN Endpoint     [this document]
   2             SD-WAN Originator   [this document]
   3             SD-WAN Egress GW    [this document]
   4             Multi SD-WAN-HMAC   [this document]
   5-254         Unassigned
   255           Reserved

12. References

12.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2403] C. Madson, R. Glenn, "The Use of HMAC-MD5-96 within ESP
             and AH", RFC 2403, Nov. 1998.

   [RFC2404] C. Madson, R. Glenn, "The Use of HMAC-SHA-1-96 within
             ESP and AH", RFC 2404, Nov. 1998.

   [RFC4301] S. Kent and K. Seo, "Security Architecture for the
             Internet Protocol", RFC 4301, Dec. 2005.

   [RFC4303] S. Kent, "IP Encapsulating Security Payload (ESP)",
             RFC 4303, Dec. 2005.

   [RFC5424] R. Gerhards, "The Syslog Protocol", RFC 5424, March
             2009.

   [RFC7011] B. Claise, B. Trammell, and P. Aitken, "Specification of
             the IP Flow Information Export (IPFIX) Protocol for the
             Exchange of Flow Information", RFC 7011, Sept. 2013.
   [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
             2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
             May 2017.

   [RFC8926] J. Gross, et al, "Geneve: Generic Network Virtualization
             Encapsulation", RFC 8926, Nov. 2020.

12.2. Informative References

   [RFC2410] R. Glenn and S. Kent, "The NULL Encryption Algorithm and
             Its Use with IPsec", RFC 2410, Nov. 1998.

   [RFC6071] S. Frankel and S. Krishnan, "IP Security (IPsec) and
             Internet Key Exchange (IKE) Document Roadmap", RFC 6071,
             Feb. 2011.

   [RFC8192] S. Hares, et al, "Interface to Network Security
             Functions (I2NSF) Problem Statement and Use Cases",
             RFC 8192, July 2017.

   [MEF-70.1] MEF 70.1, "SD-WAN Service Attributes and Service
             Framework", Nov. 2021.

   [Net2Cloud] L. Dunbar and A. Malis, "Dynamic Networks to Hybrid
             Cloud DCs Problem Statement",
             draft-ietf-rtgwg-net2cloud-problem-statement-34, Jan.
             2024.

   [SD-WAN-Edge-Discovery] L. Dunbar, et al, "BGP UPDATE for SD-WAN
             Edge Discovery", draft-ietf-idr-sdwan-edge-discovery-12,
             Oct. 2023.

13. Acknowledgments

   The authors thank Adrian Farrel, Donald Eastlake, and Stephen
   Farrell for their extensive review and suggestions.

   This document was prepared using 2-Word-v2.0.template.dot.

Authors' Addresses

   Linda Dunbar
   Futurewei
   Email: ldunbar@futurewei.com

   Kausik Majumdar
   Microsoft
   Email: kmajumdar@microsoft.com

   Venkit Kasiviswanathan
   Arista
   Email: venkit@arista.com

   Ashok Ramchandra
   Microsoft
   Email: aramchandra@microsoft.com

   Aseem Choudhary
   Aviatrix
   Email: achoudhary@aviatrix.com

Contributors' Addresses