Can ONF Stratum Meet its Full Potential?

Industry and standards bodies in telecom have, in my personal view and from my own experience, a bit of a tarnished history.  Even the fifty-odd operators I’ve interacted with in the last month think that the activities of these groups take way too long, and a very large majority say the results aren’t “transformative”.  Now, however, the ONF may have found its wings, or at least its legs.  I’ve blogged on its Stratum initiative before, but we need to look at it again, now that it’s been open-sourced by the ONF and in the context of the current market.

Stratum is one of two important developments in the open-switch market sector.  DANOS, the other, is a Linux Foundation project derived from an earlier AT&T initiative (dNOS) to create an “open, disaggregated” alternative to traditional proprietary network devices.  Stratum may take the concept further than DANOS, in part because it’s a more generalized (or at least generalizable) approach, and in part because it includes a model to virtualize custom switching chips, making it an easy fit to new open switch silicon designs.

Open switching products are exceptionally interesting to network operators, particularly those like AT&T that (because of their low demand density) have issues with profitability on network infrastructure.  Operators themselves tell me that white-box switches with open software could reduce their device costs by as much as 50%.  They also believe that these devices could reduce opex by standardizing network operating system features and providing APIs to link with open operations automation tools.

DANOS is important for its role in AT&T’s network.  It’s based on the Vyatta software-hosted router technology that AT&T acquired with the breakup of Brocade.  Many operators were very interested in Vyatta technology, but Brocade seemed unable to leverage that interest once it acquired Vyatta in 2012.  Part of the reason might have been the onrush of interest in NFV, which launched with the Call for Action white paper in the fall of that year.  Part, of course, was likely Brocade’s own failure to do insightful positioning of what they acquired.

As useful as DANOS is, it does have a tight bond with traditional routing.  We are already seeing a transition in data center networking, one moving toward more “programmable” forwarding, and it’s very possible that optical grooming applications, the “packet-layer” stuff I blogged about earlier regarding Cisco and Ciena, might play a role in the future.  If so, there’s a chance that DANOS might be behind the curve of device needs already, and in danger of falling further behind.

Stratum has a very different genesis.  It emerged as an abstraction for network switching, an “open-source thin switch implementation” that’s built to exploit the P4 flow-programming language.  The original goal was to abstract the hardware layer of switches used in SDN with OpenFlow, the original SDN model that was the basis for founding the ONF.  Stratum retains an SDN slant in its positioning, and the ONF seems careful not to be seen as rushing away from its own SDN roots.  That, in fact, may be the big issue they’ll have to address if Stratum is to truly remake networks, and the ONF as well.

If DANOS is a stiletto, then Stratum is a Swiss army knife.  The software in DANOS, the network operating system that makes up the “NOS” in DANOS, is capable of playing a lot of IP-network roles.  Stratum is more generalized in its approach.  The NOS, in the Stratum material, lies above Stratum and interacts with it through APIs, including P4.  The ONF has other projects that fill in the NOS layer, including ONOS, Trellis, and CORD.  These are an SDN cloud controller, a leaf/spine controller for NFV, and an architecture to transform the CO into a data center, respectively.

The strength of Stratum, in my view, has always rested in the P4 flow programming model.  P4, combined with merchant switching silicon, creates the data plane of a network device.  You can use P4 to program flows that are controlled by SDN, by NFV, or by traditional control-plane adaptive routing and switching.  In fact, you could use Stratum with DANOS, to provide a useful hardware abstraction layer.  What this does is separate the capital component of network equipment, the white-box switches, from the protocols and even missions at the service level.
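To make that separation concrete, here is a minimal Python sketch.  It is a toy model of my own, not Stratum's API and not P4 code: the names ToyDataPlane, sdn_controller, and adaptive_routing are all hypothetical.  The idea it tries to show is that one forwarding abstraction (a longest-prefix-match table here) can be populated either by a central controller or by a routing process, and the box doesn't care which.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class ToyDataPlane:
    """Hypothetical stand-in for a white-box switch behind an abstraction layer."""

    # Maps a destination prefix to an egress port number.
    table: Dict[object, int] = field(default_factory=dict)

    def install(self, prefix: str, port: int) -> None:
        """Install (or overwrite) a forwarding entry for a prefix."""
        self.table[ip_network(prefix)] = port

    def forward(self, dst: str) -> Optional[int]:
        """Return the egress port chosen by longest-prefix match, or None."""
        addr = ip_address(dst)
        matches = [net for net in self.table if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self.table[best]


def sdn_controller(dp: ToyDataPlane) -> None:
    """Toy central controller: pushes explicit entries it computed itself."""
    dp.install("10.1.0.0/16", 1)
    dp.install("10.1.2.0/24", 2)


def adaptive_routing(
    dp: ToyDataPlane, advertisements: Iterable[Tuple[str, int, int]]
) -> None:
    """Toy routing process: keeps the lowest-cost advertised port per prefix.

    Each advertisement is (prefix, port, cost); the format is an assumption
    made for this sketch, not a real routing protocol.
    """
    best: Dict[str, Tuple[int, int]] = {}
    for prefix, port, cost in advertisements:
        if prefix not in best or cost < best[prefix][1]:
            best[prefix] = (port, cost)
    for prefix, (port, _cost) in best.items():
        dp.install(prefix, port)
```

The design point is that ToyDataPlane knows nothing about who filled its table.  That is the role I'm suggesting a hardware abstraction layer plays; a real implementation would of course involve far more than a dictionary.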

You can do traditional switching and routing with Stratum, or you can do OpenFlow SDN, or you could do something that’s not even formally defined.  If the 5G User Plane, for example, could benefit from a custom abstraction-based model as I’ve suggested it could, then Stratum could program the forwarding and expose the APIs needed to make the new model work.  And if something else comes along, the boxes are simply repurposed by changing the layers above Stratum.
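The repurposing idea can be sketched the same way.  Again, this is a toy under stated assumptions, not how Stratum or P4 actually load a pipeline: ToyBox, load_program, and the packet fields dst_mac and tunnel_id are names I made up for illustration.  The "program" is just a function that picks the match key plus a table of entries, and swapping it changes what the same box does.

```python
from typing import Callable, Dict, Hashable, Optional

# A packet is modeled as a plain dict of header fields (an assumption).
Packet = Dict[str, Hashable]


class ToyBox:
    """Hypothetical white box: fixed hardware, replaceable forwarding program."""

    def __init__(self) -> None:
        self._key_of: Optional[Callable[[Packet], Hashable]] = None
        self._table: Dict[Hashable, int] = {}

    def load_program(
        self, key_of: Callable[[Packet], Hashable], table: Dict[Hashable, int]
    ) -> None:
        """Replace the match-key extractor and the forwarding entries."""
        self._key_of = key_of
        self._table = dict(table)

    def forward(self, pkt: Packet) -> Optional[int]:
        """Return the egress port for a packet, or None if nothing matches."""
        if self._key_of is None:
            return None
        return self._table.get(self._key_of(pkt))


box = ToyBox()
pkt = {"dst_mac": "aa:bb:cc:00:00:01", "tunnel_id": 7001}

# Mission 1: behave like a conventional switch, matching on destination MAC.
box.load_program(lambda p: p.get("dst_mac"), {"aa:bb:cc:00:00:01": 1})
port_as_switch = box.forward(pkt)

# Mission 2: same box, now matching on a made-up tunnel identifier.
box.load_program(lambda p: p.get("tunnel_id"), {7001: 5})
port_as_custom = box.forward(pkt)
```

Nothing about the box object changed between the two missions; only the layer above it did.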

This “universality strength” brings a collateral strategic strength: Stratum is not bound to the current vision of “virtual networks”, which says that you virtualize a router network by virtualizing the routers.  A network of virtual boxes, as I’ve said before, isn’t a virtual network; it’s the same box network it always was.  You need virtual devices, in the form of collectivized hosted features, to create a virtual network, but what unfetters virtual networks from the past is the ability to collectivize those features rather than link boxes.  Stratum could provide the mechanism to do that at the hardware-abstraction level, and the higher-level models show that the way network services are collectivized can be varied fairly easily.

The big question is whether the ONF understands this, and a second, nearly as important, is whether they can unfetter Stratum from the old SDN story they told.  SDN was, in my terms, a way of changing box networks by pulling the control plane out of the boxes and hosting it in the cloud.  It never proved itself out at scale.  If the ONF insists that Stratum is the box layer of OpenFlow SDN, then Stratum inherits the things that OpenFlow SDN never proved about itself, and limits its own success before it ever really gets started.

The good news is that the ONF does have a sort-of-model in which Stratum fits, and that model is generally suitable for describing most networks and services in the present, as well as the way they seem to be evolving.  They have a central policy process that frames what I’ve called “virtual network abstractions” as well as lower levels to describe functional architecture.  Their only omission lies, I think, in relying on NFV to describe how features are hosted.  It’s taken NFV six years to get where it is, which is not yet where true cloud-think would demand that networks be taken.

What the ONF needs is a compelling vision of the way cloud-think would take networks.  That vision could then be decomposed top-down into layers that could map to the Stratum model already defined.  It could help unite what otherwise might look like a bunch of diverging trends and goals.  It could give the ONF new relevance.

It could also help compete against DANOS, even though the two are not strictly competing technologies.  DANOS is a box model, a way of doing routing without router vendors and switching without switching vendors.  For buyers, that’s worthwhile, but in the end, it commits the buyers to a model of networking defined by the people they’re trying to disintermediate.

I’d like to see AT&T embrace both DANOS as a NOS and Stratum as a P4 engine and foundation for the hardware abstraction process.  The P4 project joined with the ONF last year, and it’s also a Linux Foundation project for open-source administration.  DANOS is also hosted by the Linux Foundation, so it may be that common administration will keep these projects tracking closer.  Formal commitment would be more useful, of course, and for AT&T in particular, the unification of all this stuff could be a step toward creating a model for a future service-and-experience-driven networking approach.  As I said earlier this week, they need that.