Orchestrating computer systems: research into a new protocol

~2014

Introduction

As artists embrace new technologies as an artistic medium, these technologies often provide them with new methods for cooperation. Networking technologies are frequently used for these purposes. The internet has been a driving force behind the development of networking hardware and open standards for connecting any device to any other device. These technologies are now a commodity and thus available for anybody to use. However, we see a rising need to implement these technologies in a flexible, adaptive manner, without fixed setups that require manual configuration.

Open Sound Control (OSC) is a protocol developed for exchanging music performance data. OSC is often used as an alternative to MIDI, but it has found its way into many use cases besides musical performances. We have found OSC to be an ideal de facto standard for connecting applications to each other in order to orchestrate them. However, the flexibility of OSC decreases sharply in large setups because of its hard-coded nature: many applications need to be given specific manual settings and agreements before they can communicate. In large setups containing multiple systems and multiple applications this manual work is inflexible and error prone.

This document describes the results of initial research into new methods and protocols that are better suited to handle these situations.

Use cases and requirements

Most networking technologies are designed to cope with almost any situation. Protocols used on the internet are often designed with reliability in mind: they guarantee delivery of the message or they report an error. In local networks, technologies with different priorities are found, e.g. transferring the message as fast as possible. As a lot of knowledge has been gained from these existing technologies, we are keen on leaning on those resources as much as we can. The first requirements for our envisioned protocol are the following:

KISS (Keep it simple and stupid): We want this protocol not to get in our way and we want to be able to understand it easily.
Zero Configuration: The protocol should be able to handle most, if not all configuration by itself. There is no need for setting up specific parameters unless requested.
Runs on anything TCP/IP: Since TCP/IP is the de facto standard for devices to communicate, the technology should be able to run on any device that is able to talk TCP/IP.
Open Standards: All used technologies, software, protocols should be freely and openly available.

The use cases we foresee for the protocol are broad. However, we initially focus on use cases for live performances and interactive setups. In these setups time is crucial. If a musician is interacting with a device we want the communication to be instant, so any delay should be as small as possible. We usually refer to these circumstances as ‘realtime’. When we require the latency to be as small as possible, guaranteed message delivery is usually not required. E.g. a corrupted message is discarded since the mechanism to recover from the error takes more time than sending a new message. However, there are other use cases in which message reliability is important. E.g. an actor is switching a light on: we need to ensure the light switch message is received. Thus both requirements, message latency and message reliability, need to be taken into account.

Low latency, when needed
Reliability, when needed

In any development situation one of the most important features is debugging. You want to know what is going on, especially when you run into situations different from what is expected. In performative or improvised situations this feature is even more important than in regular development situations. When you want technology to be flexible you need to be able to rely on the technology but, more importantly, to understand what it is doing and to be able to fix it when needed. Meanwhile all this debugging needs to be unintrusive, as it cannot interfere with the live situation.

Unintrusive debugging and monitoring

Transports

As our main focus is TCP/IP because of its wide availability, we have the following de facto transports available:

TCP: For the situations where reliability is required as TCP guarantees delivery
UDP: Where latency is more important

In order for multiple nodes to exchange data there are 3 network communication topologies to consider:

Unicast for 1:1 communication
Multicast for 1:N communication
Broadcast for 1:all communication
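
To make the distinction concrete, the three topologies can be told apart by the destination address alone. The following is a minimal sketch and not part of the protocol design; the function name topology_for and the example addresses are assumptions made for this illustration.

```python
import ipaddress


def topology_for(destination: str, network: str) -> str:
    """Classify an IPv4 destination as unicast, multicast or broadcast.

    `network` is the local subnet in CIDR notation; it is needed to
    recognise the subnet's directed broadcast address.
    """
    address = ipaddress.IPv4Address(destination)
    subnet = ipaddress.IPv4Network(network)
    if address.is_multicast:  # 224.0.0.0/4 is reserved for multicast
        return "multicast"
    limited_broadcast = ipaddress.IPv4Address("255.255.255.255")
    if address == subnet.broadcast_address or address == limited_broadcast:
        return "broadcast"
    return "unicast"


# Example addresses, chosen only for illustration.
for destination in ("192.168.1.20", "224.0.0.251", "192.168.1.255"):
    print(destination, topology_for(destination, "192.168.1.0/24"))
```

A sender could use such a check to decide which socket options it needs before transmitting, although a real implementation would also have to consider IPv6.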

These topologies are essential to consider since they are often handled by networking hardware and can deliver very low latency transfers. They can also reduce traffic, as a message can be sent once while being received by many. However, it is important to understand that multicast and broadcast are generally not available in Wide Area Networks (WAN) like the internet. As we can imagine projects that include some exchange through the internet, we need to provide methods to do so while still utilizing the efficiency of Local Area Network communication topologies. This essentially means every method should also be available through a unicast topology, since devices communicating across the internet can only rely on unicast communication. Only in situations where other topologies are available can these be utilized to provide more efficiency.
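
The unicast fallback described here can be sketched as a small planning step: send once to a multicast group for the peers that can be reached through it, and send individually to everybody else. This is only an illustration under our own assumptions; the Peer class, the plan_sends function and all addresses are hypothetical and do not come from the prototype.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Address = Tuple[str, int]


@dataclass(frozen=True)
class Peer:
    host: str
    port: int
    in_group: bool  # True if the peer is reachable through the multicast group


def plan_sends(peers: List[Peer], group: Optional[Address]) -> List[Address]:
    """Return the destinations one message has to be sent to.

    Peers reachable through the multicast group share a single send;
    all other peers (e.g. across a WAN) get their own unicast send.
    """
    destinations: List[Address] = []
    if group is not None and any(peer.in_group for peer in peers):
        destinations.append(group)
    for peer in peers:
        if group is None or not peer.in_group:
            destinations.append((peer.host, peer.port))
    return destinations


peers = [
    Peer("192.168.1.10", 5670, True),
    Peer("192.168.1.11", 5670, True),
    Peer("203.0.113.5", 5670, False),  # a remote peer, unicast only
]
print(plan_sends(peers, ("239.192.0.1", 5670)))
print(plan_sends(peers, None))
```

With a group available the message is sent twice (once to the group, once to the remote peer); without it the same message is sent three times, once per peer.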

When a program needs to communicate with another program running on the same system, it will be more efficient to communicate through other means than a network socket. Examples include an interprocess socket or named pipes. These need to be provided by the host operating system.

Utilize communication topologies provided by hardware devices or host operating system where available

If all these communication transports need to be accounted for we can conclude that the protocol running on these transports needs to be transport agnostic.

Transport agnostic protocol
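
One way to picture a transport agnostic protocol is to let the protocol logic talk to an abstract transport only, so that TCP, UDP or an interprocess socket can be swapped in underneath. The sketch below is an assumption about how this could look, not the design of the prototype; the names Transport, LoopbackTransport and Protocol are hypothetical, and the in-memory loopback merely stands in for a real socket.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Optional


class Transport(ABC):
    """Anything that can move opaque bytes between two endpoints."""

    @abstractmethod
    def send(self, payload: bytes) -> None:
        ...

    @abstractmethod
    def receive(self) -> Optional[bytes]:
        ...


class LoopbackTransport(Transport):
    """In-memory stand-in for a TCP, UDP or interprocess transport."""

    def __init__(self) -> None:
        self._queue = deque()

    def send(self, payload: bytes) -> None:
        self._queue.append(payload)

    def receive(self) -> Optional[bytes]:
        return self._queue.popleft() if self._queue else None


class Protocol:
    """Protocol logic that only knows the Transport interface."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def send_text(self, text: str) -> None:
        self._transport.send(text.encode("utf-8"))

    def receive_text(self) -> Optional[str]:
        payload = self._transport.receive()
        return None if payload is None else payload.decode("utf-8")


protocol = Protocol(LoopbackTransport())
protocol.send_text("hello")
print(protocol.receive_text())
```

Because Protocol never touches a socket directly, a different transport can be substituted without changing the protocol code.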

Decentralisation

One final category to account for is traffic optimization. In many environments messages are routed through a central device. It acts like the telephone operator connecting you to your requested line. This device becomes a single point of failure as well as a potential bottleneck. In mesh-based networks this is not necessary. If one node knows who to talk to it can do so directly, without intervention by some central director. This prevents single points of failure as well as potential bottlenecks. The protocol therefore needs to contain an ‘out-of-band’ control protocol which acts in parallel to the data it can transfer. When nodes can communicate directly they also need to know how to connect to each other’s interfaces. This requires logic in cases where interfaces don’t match, e.g. when the output is a number and the input is a string.

Parallel out-of-band control protocol
Logic for instructing interfaces to connect
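
The logic for connecting mismatching interfaces could, in its simplest form, be a lookup of a conversion between the output type and the input type. The following sketch only illustrates that idea; the type names, the CONVERTERS table and the connect function are assumptions and are not defined by the protocol.

```python
# Hypothetical table of conversions between an output type and an input type.
CONVERTERS = {
    ("number", "number"): lambda value: value,
    ("string", "string"): lambda value: value,
    ("number", "string"): str,
    ("string", "number"): float,
}


def connect(output_type, input_type):
    """Return a function that adapts values from an output to an input.

    Raises TypeError when no conversion is known, so that a mismatch is
    reported when the connection is made rather than when data flows.
    """
    try:
        return CONVERTERS[(output_type, input_type)]
    except KeyError:
        raise TypeError(
            "cannot connect %s output to %s input" % (output_type, input_type)
        ) from None


adapt = connect("number", "string")
print(repr(adapt(42)))
```

Checking the pair of types at connection time keeps the data path itself free of per-message negotiation.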

Existing technologies

While researching this new protocol we have studied many existing open technologies in order to prevent reinventing the wheel, to find any that meet our requirements, or to cherry-pick from them. Since the list of technologies is too extensive to cover here, as are the technologies themselves, we only mention as a reference the technologies we have considered or studied thoroughly:

Mbus: The Message Bus (Mbus) is a light-weight local coordination protocol for developing component-based distributed applications, developed by Bremen University and University College London. While this was a very interesting candidate, it isn’t suited for use across the internet.
OSC: Open Sound Control is currently the most used protocol in creative applications.
OSPF: Open Shortest Path First is a routing protocol which does neighbour discovery and exchanges information about networks in order to build routing tables for routers using the optimal path. This protocol has been studied extensively in order to copy the mechanisms it uses.
ZeroMQ: A high-performance asynchronous messaging library aimed at use in scalable distributed or concurrent applications.
NanoMsg: A messaging library presented as a successor to ZeroMQ.
D-Bus: An inter-process communication (IPC) system, allowing multiple concurrently running computer programs to communicate with one another.
XMPP: The Extensible Messaging and Presence Protocol, a set of open technologies for instant messaging, presence, multi-party chat, voice and video calls, collaboration, lightweight middleware, content syndication, and generalized routing of XML data.

Phase 1: Discovery

The first thing our protocol needs is some sort of discovery mechanism. The OSPF protocol uses a multicast address to discover neighbours and initiate a unicast handshake. DHCP clients use broadcast to discover servers and can complete the handshake using unicast. mDNS uses multicast to discover services. Bonjour and Avahi are popular Zeroconf implementations which use multicast as well. Most mechanisms rely on either multicast or broadcast technology. As noted before, these technologies are very uncommon on WAN networks, which limits the use case. In a lab-tested prototype we have had very good results using a simple mechanism with multicast. Adapting this mechanism to a broadcast topology is easy to implement.
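
Whatever topology carries the discovery beacons, each node has to keep track of which peers it has heard from and forget peers that have gone silent. The sketch below illustrates that bookkeeping only; it is not the mechanism of our prototype or of any existing library. The PeerTable class, the five second timeout and the example addresses are assumptions.

```python
import time


class PeerTable:
    """Tracks peers by the time their last beacon was received."""

    def __init__(self, timeout=5.0, clock=time.monotonic):
        self._timeout = timeout  # seconds of silence before a peer expires
        self._clock = clock      # injectable so the logic can be tested
        self._peers = {}         # peer id -> (address, time last seen)

    def beacon_received(self, peer_id, address):
        """Record a beacon; return True if the peer was not known yet."""
        is_new = peer_id not in self._peers
        self._peers[peer_id] = (address, self._clock())
        return is_new

    def expire(self):
        """Forget silent peers and return their ids."""
        now = self._clock()
        gone = [
            peer_id
            for peer_id, (_, seen) in self._peers.items()
            if now - seen > self._timeout
        ]
        for peer_id in gone:
            del self._peers[peer_id]
        return gone

    def peers(self):
        """Return a mapping of peer id to address for all known peers."""
        return {peer_id: address for peer_id, (address, _) in self._peers.items()}


table = PeerTable(timeout=5.0)
if table.beacon_received("node-a", ("192.168.1.10", 5670)):
    print("discovered", table.peers())
```

In a running node, beacon_received would be called from the loop that reads the multicast or broadcast socket, and expire would be called periodically.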

Another possibility would be to use an existing protocol for discovery. The most suitable candidate would be a Zeroconf service like Avahi or Bonjour. We have opted not to do so since the scope of these protocols is much wider than our use case. It would make our protocol rely on other software which the operating system would need to provide. In cases where the operating system lacks these services we would need to provide them ourselves. As there is no platform-independent solution for these Zeroconf services, the dependencies would become overly complicated.

While settling on a multicast discovery mechanism, our research into existing technologies found a very similar protocol coming from the ZeroMQ community. The Zyre project is an open-source framework for proximity-based peer-to-peer applications. Zyre has been designed with unreliable wireless networks in mind. Zyre implements Zbeacon, which uses broadcast for the discovery of nodes. We have compared our initial discovery prototype with Zyre and concluded that the differences are too small to justify not embracing this existing technology. However, since we would rather rely on multicast than on broadcast, we have joined the ZeroMQ community in order to add this to the Zyre project.

Designing our protocol on the foundations of the Zyre project is very beneficial since we can build upon the experience of the ZeroMQ community. This would mean we instantly have a lot of cross platform compatibility and an already tested framework.

However, we have no discovery mechanism yet which would work across WAN networks. This needs to be researched. Our initial research points towards implementing a system based on DNS using SRV records, as is done in the XMPP protocol. This is a subject of interest for the Zyre project as well. Since manual discovery remains an option in any case, we will research this further in the future.

Phase 2: Communication and exchange

After nodes have discovered each other they need to communicate. In a real situation where one enters a room full of people, we can imagine one would announce oneself and introduce oneself personally, accompanied by a physical handshake. In the digital realm this is no different. The protocol is able to discover all nodes in its vicinity, so a node can introduce itself personally to another node and can receive introductions. In a human situation it is customary to shake hands. In the digital realm we can do a handshake through the network; however, nodes need information on how to connect to another node. So the discovery beacon contains an address at which the node can be contacted. In TCP/IP terms this address consists of an IP address and a port number.

Fortunately the Zyre framework provides exactly this as well. Besides some other information, its broadcast beacon provides the IP address and port number of the node’s unicast socket.
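
To illustrate how a beacon can carry the contact address of a node, the sketch below packs a node identifier, an IPv4 address and a port number into a small binary message and unpacks it again. The layout, the DEMO magic value and the function names are hypothetical and chosen for this example only; this is explicitly not the wire format used by Zyre.

```python
import socket
import struct
import uuid

# Hypothetical layout: 4 byte magic, 16 byte node UUID, 4 byte IPv4 address,
# 2 byte port, all in network byte order. Not the Zyre beacon format.
BEACON_FORMAT = "!4s16s4sH"
MAGIC = b"DEMO"


def pack_beacon(node_id: uuid.UUID, host: str, port: int) -> bytes:
    """Build a beacon announcing where the node's unicast socket listens."""
    return struct.pack(
        BEACON_FORMAT, MAGIC, node_id.bytes, socket.inet_aton(host), port
    )


def unpack_beacon(data: bytes):
    """Return (node_id, host, port), or raise ValueError for foreign data."""
    magic, raw_id, raw_host, port = struct.unpack(BEACON_FORMAT, data)
    if magic != MAGIC:
        raise ValueError("not a beacon of this protocol")
    return uuid.UUID(bytes=raw_id), socket.inet_ntoa(raw_host), port


beacon = pack_beacon(uuid.uuid4(), "192.168.1.10", 5670)
print(len(beacon), unpack_beacon(beacon))
```

A node receiving such a beacon has everything it needs to open a unicast connection to the sender and start the handshake.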

Phase 3: Interfacing

If we used the protocol as described up to this point, it would be usable as a replacement for OSC. We would no longer need to enter IP addresses and port numbers. We could just select nodes from a list of discovered nodes and let them communicate like they do now using OSC. However, to fully meet our zero configuration requirement we would like to implement a mechanism by which nodes tell each other what they are able to send and receive. In order to do so they need to exchange their capabilities.

Again, in a human social situation we know of means to exchange our capabilities and requirements. One could ask another what they do and so find out how one could be of use to the other. In the digital realm we don’t have such etiquette, so we need to agree on a formalisation of it.

There are many frameworks that provide Remote Procedure Call (RPC) mechanisms. However, only a few provide mechanisms for automatically exchanging which remote procedures can be called. Apart from just exchanging RPC capabilities, a formalisation of capabilities would mean nodes can already expect certain capabilities and can act upon them. It would be the basis for autonomous operation of nodes, which is also necessary for decentralised operation. However, this is a subject for further research beyond the scope of this initial research. Therefore we settle on a practical initial design of how nodes interface with each other.

Before we can start exchanging capabilities we need to agree on what messages look like. We need a serialization format which contains the information of the messages. There are many options, such as BSON, MessagePack, Protocol Buffers and OSC. We have settled on JSON for this research purely for practical reasons. In a further iteration of the design we will focus on the message serialization formats.
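
As a small illustration of JSON as the serialization format, the sketch below encodes a message to bytes and decodes it again. The envelope with the keys method and data is an assumption made for this example and is not the message layout of the specification.

```python
import json


def encode_message(method, data):
    """Serialize a message to UTF-8 encoded JSON (hypothetical envelope)."""
    return json.dumps({"method": method, "data": data}, sort_keys=True).encode("utf-8")


def decode_message(payload):
    """Parse a payload produced by encode_message."""
    message = json.loads(payload.decode("utf-8"))
    return message["method"], message["data"]


payload = encode_message("SET", {"brightness": 0.5})
print(payload)
print(decode_message(payload))
```

Since JSON is text based it is easy to inspect while debugging, which is one of the practical reasons it is convenient during research.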

When we want nodes to communicate there is a multitude of possible things they can exchange. We have reduced these options to the following:

parameters: simple values to get or set on a node
signals: event messages emitted from a node
sensors: event receivers on a node which can receive signals from another node
methods: custom functions which can be called on a node

These possibilities to exchange are all metadata. When this is done using the Zyre framework we are using a reliable unicast transport. However, our requirements also call for low latency methods where reliability is not a concern. Therefore we provide a mechanism to completely bypass the protocol so that direct communication between programs is made possible. This would be the best setup for any low latency situation as it would be done using custom sockets. This mechanism would also provide a method to wrap other protocols inside our protocol, e.g. one could carry a video data stream over an RTP socket. For this mechanism we provide the following extra options:

sources: data emitted from the node through a custom socket using an alien protocol
sinks: receiver for data emitted from a custom socket using an alien protocol

The requirements for exchanging capabilities would then become:

provide information for all capability objects of the node
provide methods to get and modify parameters of these objects
provide methods to call methods on these objects
provide methods to subscribe or unsubscribe signals to/from sensors of these objects
provide methods to subscribe or unsubscribe streams to/from sinks of these objects
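
The requirements listed above can be pictured as a node that answers a handful of request types. The following toy sketch shows one possible shape under our own assumptions: the Node class, the request dictionaries and the type names GET, SET, CALL, SUB and UNSUB are hypothetical and serve only to illustrate the idea. Subscriptions to signals and to streams are treated alike here for brevity.

```python
class Node:
    """A toy node that answers capability requests (hypothetical messages)."""

    def __init__(self, name):
        self.name = name
        self.parameters = {}     # parameter name -> value
        self.methods = {}        # method name -> callable
        self.subscriptions = {}  # signal or source name -> subscriber ids

    def handle(self, sender, request):
        kind = request["type"]
        if kind == "GET":    # describe the capabilities of the node
            return {
                "parameters": dict(self.parameters),
                "methods": sorted(self.methods),
            }
        if kind == "SET":    # modify parameters
            self.parameters.update(request["values"])
            return {"parameters": dict(self.parameters)}
        if kind == "CALL":   # call a method on the node
            function = self.methods[request["method"]]
            return {"result": function(*request.get("args", []))}
        if kind == "SUB":    # subscribe the sender to a signal or stream
            self.subscriptions.setdefault(request["name"], set()).add(sender)
            return {"subscribed": request["name"]}
        if kind == "UNSUB":  # remove the subscription again
            self.subscriptions.get(request["name"], set()).discard(sender)
            return {"unsubscribed": request["name"]}
        raise ValueError("unknown request type: %r" % kind)


lamp = Node("lamp")
lamp.parameters["brightness"] = 0.0
lamp.methods["blink"] = lambda times: "blinked %d times" % times
lamp.handle("desk", {"type": "SET", "values": {"brightness": 0.8}})
print(lamp.handle("desk", {"type": "CALL", "method": "blink", "args": [2]}))
```

In a real node these requests would arrive as serialized messages over the reliable unicast transport, and the replies would be sent back the same way.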

A node’s capability would be a tree-like data structure containing all relevant information, e.g.:

node:          name
  root:        physical base properties where node is located
  objects      objects accessible through this node
    object     object properties and data
      signals  signals emitted from this object
      sensors  available sensors on this object
      methods  available methods to be called on this object
      sinks    available stream sinks at this object
      sources  available stream sources at this object
      ...      any data belonging to the object e.g.
      localMat local matrix
      type     type of object (camera, processor, projector, etc)
      visible  visibility state
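
Since JSON is the serialization format chosen for this research, such a tree maps naturally onto nested dictionaries. The sketch below shows one hypothetical instance of the structure above; all names and values (camera-node, camera1, frame_ready and so on) are invented for illustration, as is the helper find_objects.

```python
import json

# A hypothetical capability tree following the structure outlined above.
capability = {
    "node": "camera-node",
    "root": {"position": [0.0, 0.0, 0.0]},
    "objects": {
        "camera1": {
            "signals": ["frame_ready"],
            "sensors": ["trigger"],
            "methods": ["reset"],
            "sinks": [],
            "sources": ["video_out"],
            "localMat": [
                [1, 0, 0, 0],
                [0, 1, 0, 0],
                [0, 0, 1, 0],
                [0, 0, 0, 1],
            ],
            "type": "camera",
            "visible": True,
        }
    },
}


def find_objects(tree, object_type):
    """Return the names of all objects of the given type in a capability tree."""
    return [
        name
        for name, obj in tree["objects"].items()
        if obj.get("type") == object_type
    ]


print(find_objects(capability, "camera"))
print(json.dumps(capability["objects"]["camera1"]["sources"]))
```

A node receiving this tree from a peer could use such a lookup to decide which objects it is able to connect to.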

Of course we have researched existing technologies. We have found many RPC and IPC technologies and found D-Bus closest to meeting our requirements. We are currently investigating whether we want to use D-Bus fully or whether we want to map D-Bus onto our design.

Initial implementation

During this research we have implemented a prototype. The results have been published on GitHub:

ZOCP implementation in Python:

http://github.com/z25/pyZOCP

Pyre an implementation of Zyre in Python:

http://github.com/zeromq/pyre

Conclusions

Our research has resulted in an initial design for a protocol that allows systems to automatically discover each other and exchange capabilities, and that provides a mechanism for setting up communication. The design provides methods for incorporating other protocols, like the legacy OSC protocol. It therefore provides a migration path towards fully implementing this protocol in current software. As this design has been developed inside a laboratory environment, it still needs to be tested thoroughly in practical situations. New insights gained from these tests will result in modifications of the design. However, we believe the design is a sound foundation for meeting our requirements.

Future work

This document has mentioned multiple areas for future research:

utilizing multicast and broadcast topologies
discovery mechanisms on WAN networks
autonomous interfacing and exchange
serialization formats
implementing or mapping D-Bus
researching logic for connecting interfaces

The current prototype has embraced the Zyre (ZRE) framework as its basis. There are still some design considerations which need to be looked into. One important aspect is our wish to design the protocol as close to hardware implementations as possible. One practical area of research for ZRE would be to compare Zyre’s group messaging system to network-level multicast, where group membership is managed through IGMP.

Lastly, discovery in its current state will only operate on networks with multicast routing enabled; otherwise it will be limited to the local segment of the network. This is a subject for the Zyre project as well, which we hope to tackle together with the ZeroMQ community.

Any future iteration of the protocol will be published in repositories owned by the Z25 Foundation. Currently these can be found on GitHub.

References

ZOCP specification http://projects.z25.org/projects/plab/wiki/OrchestratorControlProtocolSpec Retrieved on 2014-03-06
OSPF discovery http://www.itcertnotes.com/2011/02/ospf-neighbor-establishment-process.html Retrieved on 2014-03-06
CERN ZeroMQ review http://lists.zeromq.org/pipermail/zeromq-dev/attachments/20130502/75df19e8/attachment.pdf Retrieved on 2014-03-06
Solving the discovery problem http://hintjens.com/blog:32 Retrieved on 2014-03-06
ZYRE design http://zguide.zeromq.org/php:chapter8 Retrieved on 2014-03-06