ZebOS PIM Features

20th Apr 2015

No networking suite is complete without multicast services. Alongside its wide array of unicast features, ZebOS has supported multicast features for a very long time.

Here’s a brief introduction to some of the Protocol Independent Multicast (PIM) features supported by ZebOS:

  • PIM SPARSE
  • PIM SSM
  • PIM BIDIR
  • PIM DENSE
  • PIM SPARSE-DENSE
  • PIM ECMP REDIRECT

PIM SPARSE:                                 

Protocol Independent Multicast-Sparse Mode (PIM-SM) is a multicast routing protocol designed to operate efficiently across Wide Area Networks (WANs) with sparsely distributed groups. It helps geographically dispersed network nodes conserve bandwidth and reduce traffic by delivering a single stream of information to multiple locations simultaneously. PIM-SM uses the IP multicast model of receiver-initiated membership, supports both shared and shortest-path trees, and uses soft-state mechanisms to adapt to changing network conditions. It relies on a separate topology-gathering protocol to populate a multicast routing table with routes. ZebOS implements PIM-SM v2 as specified in RFC 4601.

Use the “ip pim sparse-mode” command to configure an interface as sparse mode.

This figure shows a simple PIM-SM topology.

OCNOS_graphic_01

Configuring RP:

Since sources and receivers are not aware of each other and can become active at any time, there needs to be a place where information about sources and receivers can be stored. The Rendezvous Point (RP) serves this purpose. Sources (S) sending data for a multicast group (G) are registered with the RP through the PIM REGISTER mechanism. Receivers express their interest in a group (G) through one of the host protocols (IGMP or MLD), and the multicast router translates this into a PIM join towards the RP. Once both the sources and the receivers are known at the RP, the multicast data starts flowing as shown in the figure, from the sources towards the RP and from the RP towards the receivers. This is known as the RP path or Shared-Tree path, since the same RP might be used by many sources and groups.

Many RPs can be configured in the domain, but on every router each group must map to the same RP, and every router in the domain must know that RP. This can be achieved either statically (using the “ip pim rp-address” CLI command) or through the Bootstrap Router (BSR) protocol. One or more routers in the domain are configured as BSR candidates, and a single router is elected as the Bootstrap Router. Multiple routers can be configured as candidate RPs; the BSR selection algorithm chooses among them and distributes the RP information to all the multicast routers in the domain.
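The requirement that every router maps a group to the same RP can be illustrated with a short Python sketch. It assumes a static group-range-to-RP table resolved by longest-prefix match; the table contents, the addresses and the name rp_for_group are hypothetical and are not ZebOS code. The full PIM-SM mapping rules (priority and hash tie-breaking for BSR-learned RPs) are deliberately left out.

```python
import ipaddress

# Hypothetical static group-range -> RP table (illustrative addresses only).
RP_TABLE = {
    "224.0.0.0/4": "10.0.0.2",
    "239.1.0.0/16": "10.0.0.5",
}

def rp_for_group(group, table=RP_TABLE):
    """Return the RP for a group using longest-prefix match, or None."""
    addr = ipaddress.ip_address(group)
    best = None  # (network, rp) of the most specific match so far
    for prefix, rp in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, rp)
    return best[1] if best else None
```

As long as every router holds the same table, the same group resolves to the same RP everywhere, which is the property the shared tree depends on.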

In the topology above, it can be assumed that one of RTR2’s SM interfaces is configured as both the BSR candidate and the RP candidate.

SPT path:   If the receiver wants traffic from the sources to flow along the shortest path, the SPT switch should be enabled. This tells the last-hop router to request traffic over the shortest path by sending an (S,G) join towards the source. The “ip pim spt-threshold” command controls the switchover of traffic from the RP tree to the Shortest Path Tree. As soon as traffic arrives over the SPT path, the RP path is pruned.

PIM SSM:

A closer look at PIM-SM reveals that the whole process becomes simpler if the receivers already know about the sources through some other means. The RP concept can then be done away with altogether. SSM stands for Source Specific Multicast, which makes use of the receiver’s ability to specify the source along with the group (using IGMPv3 or MLDv2). The multicast router connected to the receiver translates this into a source-specific join (as is done when switching over to the SPT) towards the source.

Below is a figure showing a simple SSM topology.

OCNOS_graphic_01

The range of multicast addresses from 232.0.0.0 to 232.255.255.255 is currently set aside for source-specific multicast in IPv4. By default ZebOS does not enable SSM.

Use the “ip pim ssm default” command to enable SSM for the default range. Only source-specific joins are accepted for the range; no (*,G) or (S,G,RPT) state is created for it. Use the “ip pim ssm range” command to configure a range of your choice by using ACLs.
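The effect of the default SSM range on join handling can be sketched as follows. This is a minimal Python illustration, not ZebOS code: the function names is_ssm_group and accept_join are hypothetical, and a join is modeled simply as a group plus an optional source.

```python
import ipaddress

# Default IPv4 SSM range: 232.0.0.0 to 232.255.255.255.
SSM_DEFAULT_RANGE = ipaddress.ip_network("232.0.0.0/8")

def is_ssm_group(group, ssm_range=SSM_DEFAULT_RANGE):
    """True if the group address falls inside the SSM range."""
    return ipaddress.ip_address(group) in ssm_range

def accept_join(group, source=None):
    """Inside the SSM range only source-specific (S,G) joins are accepted."""
    if is_ssm_group(group):
        return source is not None
    return True  # outside the range, (*,G) and (S,G) joins are both fine
```

A custom range configured through an ACL would, in this sketch, simply be passed in as a different ssm_range value.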

PIM BIDIR:

ZebOS PIM Fig 3

ZebOS supports BIDIR-PIM as described in RFC5015.

Although PIM-SM has largely displaced PIM-DM because of its better state management, not every application deployment needs the full complexity of the state that PIM-SM maintains.

Application models can be ONE-to-MANY, MANY-to-MANY and others. PIM-SM can serve all of these, but some cases do not need the whole of PIM-SM.

One-to-Many applications have a single sender and multiple receivers. Example applications include audio/video lectures and presentations; push media such as weather updates and sports scores; and file distribution and caching.

In Many-to-Many applications the receivers might also act as senders. Example MANY-to-MANY applications include audio/video conferencing, chat groups, distributed interactive simulations, multi-player games.

PIM-SSM, which uses a specific part of PIM-SM, does away with the shared tree (that is, the RP Tree) and relies on Source Tree mechanics to deliver multicast traffic. This is all that is needed to deploy a ONE-to-MANY application.

Deploying a MANY-to-MANY application with PIM-SM introduces considerable overhead: state is created on Shared Trees (hint: Register and Register-Stop) and then switched to Source Trees (hint: SPT). Maintaining both Shared Tree (*,G) and Source Tree (S,G) state for the same group adds further overhead. BIDIR-PIM, unlike SSM, uses only Shared Trees and does away with Source Trees. This allows BIDIR-PIM to scale to a large number of groups and sources.

So none of SM, SSM and BIDIR is a replacement for the others; the variant can be chosen according to the application’s needs. Corporate applications such as sharing training videos (one-way traffic), where the sources are already known or can be discovered easily, can be deployed with SSM, which can save a lot of bandwidth and storage space. BIDIR scales to a large number of sources and receivers, so financial applications such as trading floors can choose BIDIR, while general deployments can choose SM.

Taking this into consideration, ZebOS PIM has been designed to allow all three kinds of state to coexist. Mechanisms to partition the groups into SSM, BIDIR and SM are provided. The priority SSM > BIDIR > SM is followed to decide which state a particular group should use. Partial deployment is also supported, so that BIDIR can be brought into existing SM deployments.
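The SSM > BIDIR > SM priority can be pictured as an ordered lookup. The Python sketch below is illustrative only: the function group_mode and the way ranges are passed in as prefix lists are assumptions, not the ZebOS data model.

```python
import ipaddress

def group_mode(group, ssm_ranges, bidir_ranges):
    """Classify a group as "SSM", "BIDIR" or "SM", checked in that order."""
    addr = ipaddress.ip_address(group)
    if any(addr in ipaddress.ip_network(r) for r in ssm_ranges):
        return "SSM"    # highest priority
    if any(addr in ipaddress.ip_network(r) for r in bidir_ranges):
        return "BIDIR"  # only considered if the group is not in an SSM range
    return "SM"         # everything else falls back to sparse mode
```

Because SSM is checked first, a group covered by both an SSM range and a BIDIR range is treated as SSM.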

An important feature of BIDIR-PIM is the concept of the Designated Forwarder (DF). In PIM-SM, routers connected to the sources have to register the (S,G) information with the Rendezvous Point (RP), and the RP then issues a source-specific join towards the source if it has downstream receivers. So in the case of SM, traffic always flows unidirectionally, either Sources -> RP -> Receivers (Shared Tree) or Source -> Receivers (Source Trees).

Though BIDIR-PIM uses the concept of RP, it does not use the Register mechanism. Instead it relies on the DF mechanism to send the traffic towards the RP. On every link, all the BIDIR enabled routers participate in the DF election. The router having the best metrics towards the RP becomes the Designated Forwarder (DF) for that RP. Routers downstream natively send the traffic towards the DF for the RP which in turn does the same on its upstream interface until it reaches the RP.
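The outcome of a DF election on one link can be sketched in Python as below. The sketch assumes each candidate advertises a metric preference and a metric towards the RP, with lower values being better, and it assumes ties are broken in favor of the higher IP address; the function name elect_df and the tuple layout are hypothetical. The Offer/Winner/Backoff message exchange of the real election is not modeled.

```python
import ipaddress

def elect_df(candidates):
    """Pick the DF among (router_ip, metric_preference, metric) tuples.

    Assumption: lower preference, then lower metric, wins; on a full tie
    the router with the higher IP address wins.
    """
    winner = min(
        candidates,
        key=lambda c: (c[1], c[2], -int(ipaddress.ip_address(c[0]))),
    )
    return winner[0]
```

Every router on the link evaluates the same candidates, so all of them agree on a single forwarder for that RP.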

Similarly if the Source is also a Receiver, the same path is used for the traffic flow from RP.

Traffic is sent towards RP (upstream) only if the router is DF on the link where traffic is received.

Traffic from the RP (upstream) is sent downstream only if the router is the DF on that downstream link.

The “bidir” keyword in the CLI (for static and candidate RP configuration) should be used to mark the configuration as BIDIR.
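The two DF forwarding rules above can be written as one small decision function. This Python sketch is a simplification under stated assumptions: interface names are arbitrary strings, df_links is the set of downstream links on which this router won the DF election, and receiver membership on each link is ignored. The name forward_decision is hypothetical.

```python
def forward_decision(in_if, upstream_if, df_links):
    """Return the interfaces on which a BIDIR packet is forwarded."""
    if in_if == upstream_if:
        # Traffic from the RP: send downstream only where this router is DF.
        return sorted(df_links)
    if in_if in df_links:
        # Traffic from a downstream link: send towards the RP only if this
        # router is the DF on the link where the packet was received.
        return [upstream_if]
    # Not the DF on the receiving link: another router is responsible.
    return []
```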

PIM Dense:                                                                     

ZebOS also supports PIM dense mode as defined in RFC 3973. PIM-DM is a data-driven multicast routing protocol that builds source-based multicast distribution trees on the flood-and-prune principle. Initially the traffic is flooded throughout the domain, and the last-hop routers that are not interested in the traffic (no receivers for the group) prune their links. Traffic continues to flow over the links that are not pruned, that is, those with receivers interested in the traffic.

RPF checks also ensure that traffic is accepted only when it arrives on the shortest path towards the source.

This prune state is not lost as long as the receivers do not come up. The State Refresh mechanism as defined in the RFC refreshes the prune state. But if a receiver happens to come up interested in the traffic, the router sends a Graft message (which is similar to the PIM Join) to its upstream router to express interest in the traffic which in turn does the same on its upstream interface. This continues till the graft reaches the Router connected to the source or a router which is already forwarding traffic on a different interface.

ZebOS PIM Fig 3

The traffic flows RTR1->RTR2->RTR3->RTR4. The RTR3-RTR5 link is in the pruned state, as there are no receivers connected to RTR5.
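The flood, prune and graft behavior at RTR3 can be modeled with a few lines of Python. This is a toy sketch: the class DenseModeRouter, its method names and the interface labels are hypothetical, and prune timers and State Refresh are not modeled.

```python
class DenseModeRouter:
    """Toy model of per-(S,G) dense-mode forwarding on one router."""

    def __init__(self, interfaces):
        self.interfaces = set(interfaces)
        self.pruned = set()  # downstream links currently in pruned state

    def prune(self, ifname):
        """A downstream neighbor reported no receivers on this link."""
        self.pruned.add(ifname)

    def graft(self, ifname):
        """A receiver appeared downstream; resume forwarding on this link."""
        self.pruned.discard(ifname)

    def out_interfaces(self, rpf_if):
        """Flood on every link except the RPF link and the pruned links."""
        return sorted(self.interfaces - self.pruned - {rpf_if})


rtr3 = DenseModeRouter(["to-RTR2", "to-RTR4", "to-RTR5"])
rtr3.prune("to-RTR5")  # RTR5 has no receivers
```

With the prune applied, traffic arriving from RTR2 leaves only towards RTR4; a later graft on the RTR5 link would add it back to the outgoing list.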

Use the “ip pim dense-mode” command to configure an interface as dense mode.

Note:  ZebOS does not support mixed-mode configuration, in which some interfaces are in sparse mode and others are in dense mode.

PIM SPARSE-DENSE:

In PIM-SM, an RP must be available for a particular group address (either by static configuration or dynamically); it is used to form the (*,G) entry. An RP address is mandatory for PIM-SM, and without it the packets are lost.

In PIM-DM, packets for a particular group address are flooded, and the path is formed by pruning the links that are not interested in that group address. No RP configuration is required in this mode.

In case of PIM-SMDM, if an RP address for a particular group address is present, then PIM-SM state-machine would be followed for that particular group address. If RP is not available PIM-DM will be followed.  Similarly if an RP comes up for a particular group which had a DM state, the state is removed and the SM state is created.

This mode addresses a drawback of PIM-SM: with sparse-dense mode, multicast services continue even if the RP is not configured or is lost mid-way.

Use the “ip pim sparse-dense-mode” command to configure an interface as sparse-dense mode.
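The per-group choice made in sparse-dense mode reduces to a single test on RP availability. The Python sketch below assumes a plain dictionary from group address to RP address as the RP lookup; that dictionary and the function name select_mode are hypothetical.

```python
def select_mode(group, rp_for_group):
    """Return "SM" if an RP is known for the group, otherwise "DM"."""
    return "SM" if rp_for_group.get(group) is not None else "DM"


rp_map = {"239.1.1.1": "10.0.0.2"}   # illustrative mapping only
before = select_mode("239.2.2.2", rp_map)  # no RP yet, so dense mode
rp_map["239.2.2.2"] = "10.0.0.2"     # an RP comes up for the group
after = select_mode("239.2.2.2", rp_map)   # group now follows sparse mode
```

In a real router the change from DM to SM would also involve removing the dense-mode state for the group, which this sketch does not show.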

PIM ECMP REDIRECT:

The PIM protocol uses the Reverse Path Forwarding (RPF) mechanism to find the upstream interface and router for building the forwarding state. But when there are equal-cost multipaths (ECMPs), mechanisms such as hashing or choosing the neighbor with the highest IP address do not provide for spreading traffic among the ECMPs. This results in ineffective use of network resources. RFC 6754 provides a mechanism to improve the RPF procedure over ECMPs.

It allows path selection to be based on administratively selected metrics, such as data transmission delays, path preferences and routing metrics.

RFC6754 depends on RFC6395 to uniquely identify the router interfaces.

Note:   The term “ECMP” here refers to parallel, single-hop, equal-cost links between adjacent nodes.

As part of the ZebOS-XP 1.2 release, ZebOS is being extended to support the above mentioned RFCs.

Currently, while choosing the RPF neighbor, PIM depends on the RIB module for unicast route lookup and nexthop updates. When there are multiple equal-cost paths to an RP or Source, the router with the highest address is chosen as the nexthop to build forwarding state.

ZebOS PIM Fig 5

Links 1, 2 and 3 are in order of increasing IP addresses.

Figure: RPF path selection

DS-RTR will always use link 3 to send joins to the US-RTR; the other two links are never taken into consideration. The ECMP-REDIRECT feature proposes a mechanism to overcome this and make better use of the available bandwidth.
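The highest-address rule is easy to reproduce, which also shows why one link always wins. In this Python sketch the next-hop addresses are made up for illustration and the function name rpf_neighbor is hypothetical.

```python
import ipaddress

def rpf_neighbor(nexthops):
    """Pick the equal-cost next hop with the numerically highest address."""
    # Compare as IP addresses, not strings, so 10.0.10.1 > 10.0.9.1.
    return max(nexthops, key=ipaddress.ip_address)


# Hypothetical next-hop addresses on links 1, 2 and 3.
links = {"10.0.1.1": "link 1", "10.0.2.1": "link 2", "10.0.3.1": "link 3"}
chosen = links[rpf_neighbor(list(links))]
```

The result depends only on the addresses, so every join goes out on the same link regardless of how much traffic that link already carries.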

On receiving the join message, the upstream router can perform a check based on administratively selected metrics (say, bandwidth usage or path preference) and send a REDIRECT message asking the downstream router to resend the join on a preferred interface (indicated by the interface ID).

For example, if the desired traffic is already flowing on a different interface, the upstream router sends a REDIRECT message on the interface on which the join arrived. The downstream router then sends the Join on the other interface.  This is a simple example of bandwidth saving.

If the request is for a different group, the upstream router can identify the interface with more bandwidth and request for a Join on that interface. This is a case of load balancing.  The implementation is free to choose the decision framework based on its requirement.

As an example the number of flows on each interface will be used as a cue to accept the Join or not. The Join will be redirected to the interface with the least number of flows. This framework will be extended to support accepting or redirecting Joins based on bandwidth usage.
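The flow-count rule described above can be sketched as an upstream-side decision function. This is a Python illustration under assumptions: flows_per_if is a hypothetical per-interface flow counter, the function name handle_join is invented for the example, and a join is redirected only when another interface carries strictly fewer flows.

```python
def handle_join(join_if, flows_per_if):
    """Decide whether the upstream router accepts or redirects a Join.

    Returns ("accept", join_if) or ("redirect", preferred_if).
    """
    # Interface with the fewest flows; sorting makes ties deterministic.
    best_if = min(sorted(flows_per_if), key=lambda i: flows_per_if[i])
    if flows_per_if[best_if] < flows_per_if[join_if]:
        return ("redirect", best_if)
    return ("accept", join_if)  # already on a least-loaded interface
```

Replacing the flow counter with a bandwidth measurement would give the bandwidth-based variant mentioned above without changing the structure of the decision.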

The downstream router does not do any ECMP analysis.  Only the upstream router currently makes the decision to accept the Join or redirect it.