Handling Audio over IP in Broadcast Production Environments
Broadcasters moving to IP infrastructures have so far focused mainly on video transport, because of the bandwidth it requires. In this article, Olivier Suard, Vice President of Marketing at Nevion, talks about the challenges that audio presents, which involve not only a substantially greater number of flows than video, but also a diverse range of standards in production.
Professional audio over Ethernet has been in use since before the turn of the century, when broadcast radio became an early adopter of standardised networking. Over the years, several competing proprietary approaches and standards for audio over IP have emerged, including Dante, RAVENNA, MADI (AES10) and AES67. However, compatibility between them, and even between implementations of specific formats, has been a long-standing issue in audio transport and processing.
“Now that more broadcasters are moving to IP in their facilities, the issue of ensuring audio compatibility has become critical and complex,” Olivier said. “Broadcasters should consider the following key issues: the streaming plane, timing, the control plane and protection.”
Audio Transport Over the Network
AES67 has become a key factor in the basic transport of audio over the network, the streaming plane. First issued in 2013, the AES67 standard has been adopted and integrated by most manufacturers, including providers of products based on proprietary approaches. Olivier said, “Equally important, AES67 is the basis of the recent SMPTE ST 2110-30 standard, which means that compatibility on the streaming plane between most widely used systems is largely assured.
“That said, within the SMPTE ST 2110-30 standard, three levels of conformance are defined, only some of which are currently supported by vendors. The mandatory Level A supports 48 kHz streams with one to eight audio channels, at packet times of 1 ms. Level B adds support for packet times of 125 µs. Level C increases the maximum number of audio channels allowed per stream to 64, which means that MADI, still in wide use, may be carried as-is over the audio network.”
As they plan their move to IP, broadcasters should be aware that many audio-over-IP systems can currently only handle the basic Level A. They may also have limitations on the total number of audio network streams supported, and on the combinations of channel count and stream count that can be used. Olivier advised that such limitations need careful consideration when selecting audio equipment, as they could restrict the flexibility of the overall workflow.
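To make the conformance levels more concrete, the short Python sketch below works out the samples per channel per packet and the resulting RTP payload size; the packet times and channel counts are those quoted above, while the 24-bit (L24) linear PCM encoding is an assumption for illustration.

```python
# Rough illustration of SMPTE ST 2110-30 / AES67 packet sizing.
# Payload maths only: 24-bit (3-byte) linear PCM, no RTP/UDP/IP headers.

SAMPLE_RATE_HZ = 48_000
BYTES_PER_SAMPLE = 3  # L24 (24-bit) audio, assumed for illustration

def payload_bytes(packet_time_s: float, channels: int) -> tuple[int, int]:
    """Return (samples per channel per packet, payload size in bytes)."""
    samples = round(SAMPLE_RATE_HZ * packet_time_s)
    return samples, samples * channels * BYTES_PER_SAMPLE

# Level A: 1 ms packets, one to eight channels
print("Level A, 8ch @ 1 ms:   %d samples, %d-byte payload" % payload_bytes(0.001, 8))
# Level B: adds 125 microsecond packet times
print("Level B, 8ch @ 125us:  %d samples, %d-byte payload" % payload_bytes(0.000125, 8))
# Level C: up to 64 channels per stream (e.g. a MADI frame carried as-is)
print("Level C, 64ch @ 125us: %d samples, %d-byte payload" % payload_bytes(0.000125, 64))
```

The arithmetic also shows why all three levels stay comfortably within a standard Ethernet frame: even 64 channels at 125 µs yields a 1,152-byte payload.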
Network Timing
“On networks made up of AES67-compatible equipment from different manufacturers, Precision Time Protocol (PTP) version 2, or IEEE 1588-2008, can now be used for network timing,” he said. “This development also fits with the SMPTE ST 2110-10 standard, which mandates the use of PTP v2. SMPTE has also published the ST 2059 standard, which extends the media clock concept of AES67 to any kind of periodic media clock, including video and timecode.”
Media clocks control the rate at which information is passed to an external media playing device. For example, a specific clock oversees the rate at which samples should be passed to an audio codec. Time synchronisation is critical in networks because all aspects of managing, securing, planning and debugging a network involve determining when events happen. Time may also be the only frame of reference shared by all network devices.
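As a minimal sketch of how this works in practice, the Python snippet below derives a 32-bit RTP timestamp from PTP time for a 48 kHz audio media clock, assuming the clock is phase-aligned to the PTP epoch with zero offset, the simple case described by AES67 and SMPTE ST 2059-1.

```python
# Minimal sketch: deriving an RTP timestamp for a 48 kHz audio media clock
# from PTP time, assuming the media clock is phase-aligned to the PTP epoch
# with zero offset (the simple case described by AES67 / SMPTE ST 2059-1).

MEDIA_CLOCK_RATE_HZ = 48_000

def rtp_timestamp(ptp_seconds: int, ptp_nanoseconds: int) -> int:
    """Map a PTP time (TAI) to a 32-bit RTP timestamp for the media clock."""
    ticks = ptp_seconds * MEDIA_CLOCK_RATE_HZ
    ticks += (ptp_nanoseconds * MEDIA_CLOCK_RATE_HZ) // 1_000_000_000
    return ticks % 2**32  # RTP timestamps wrap at 32 bits

# Two devices reading the same PTP time compute the same timestamp,
# which is what allows streams from different senders to be aligned.
print(rtp_timestamp(1_700_000_000, 500_000_000))
```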
Control and Orchestration
Audio may not place high demands on network bandwidth compared to video, but according to Olivier it does create a challenge in terms of control and orchestration. He said, “The common production environment has many more audio sources than video sources, and an even greater number of audio destinations. A major sports production could have thousands of audio channels travelling across the network, for example.
“Audio engineers traditionally expect to be able to plug and play equipment and connect sources and destinations without concern for protocols and standards. Conversely, in a broadcast facility, inter-studio routing must be centrally controlled both for the integrity of signals, and for security and access control.”
Proprietary Control Plane
One reason that proprietary approaches have been valued in the past is that they include a comprehensive control plane, whereas standards like AES67, or indeed SMPTE ST 2110, are generic and do not define how the streams should be controlled.
“While the proprietary control planes are effective on their own, they are not compatible with each other. More importantly, they are designed for a local studio environment (LAN), and therefore aren’t suited to a connected, distributed production environment, such as those designed for big campus or inter-campus use, or for remote production over WAN,” said Olivier.
“These control planes also rely on making audio directly available to any equipment in the network by default, meaning no explicit routing of streams is required. This situation could raise security concerns, especially in a distributed, multi-department or multi-organisation environment.
“Finally, the fundamental assumption behind proprietary approaches is that no bandwidth management is needed because audio streams are comparatively small, an assumption that may no longer hold as the size and complexity of the network increase.”
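Some back-of-envelope arithmetic illustrates why that assumption breaks down at scale. The sketch below, using an approximate figure for RTP/UDP/IP/Ethernet overhead, estimates the bandwidth of one typical 8-channel AES67 stream and of the hundreds of such streams a major production might carry.

```python
# Back-of-envelope check on the "audio is small" assumption, for one
# 8-channel AES67 stream (48 kHz, 24-bit, 1 ms packets) and for a large
# production. The header overhead figure is approximate.

PACKETS_PER_S = 1000            # 1 ms packet time
PAYLOAD = 48 * 8 * 3            # samples x channels x bytes per sample
HEADERS = 12 + 8 + 20 + 26      # RTP + UDP + IPv4 + Ethernet framing (approx.)

stream_bps = PACKETS_PER_S * (PAYLOAD + HEADERS) * 8
print(f"one 8-channel stream: {stream_bps / 1e6:.1f} Mbit/s")   # ~9.7 Mbit/s

# A few hundred such streams, plausible for a major sports production,
# is no longer negligible on shared links:
print(f"500 streams: {500 * stream_bps / 1e9:.1f} Gbit/s")      # ~4.9 Gbit/s
```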
MADI Tielines
One familiar method Olivier mentioned for overcoming the issues with control plane interoperability, and for dealing with security and stability concerns, is to bridge the different IP audio islands with baseband MADI tielines. In baseband systems, tielines are created to interface all of the network modules, and every signal path connection requires an equivalent tieline definition.
However, if a tieline has not been defined, or its definition does not match the physical connection, the associated components will not work. This adds complexity to the management of audio routing across the campus and reduces flexibility and agility. In short, tielines would largely defeat the purpose and promise of using a converged media network in the first place.
Standards-based and Software-defined Control
Olivier believes that the Networked Media Open Specifications (NMOS), a family of specifications proposed by the Advanced Media Workflow Association (AMWA) to support the development of products and services within an open industry framework, will bring a way of handling endpoint control for audio that could deliver the true promise of distributed IP production. NMOS aims to support the AV media industry's transition to a completely networked architecture, and the development of a control and management layer complementing the SMPTE ST 2110 transport layer.
He said, “The NMOS specifications are now gaining traction in the industry – although so far audio equipment manufacturers have been slower to adopt them than video equipment vendors. In the meantime, the most promising approach to control is to use software-defined networking (SDN) capabilities to control the audio flows across the IP network.
“This type of networking can be combined with the implementation of standard control interfaces in both endpoint equipment and correspondingly in the broadcast media network controller. It is not only a simple way to connect diverse sources and destinations, but it also adds a layer of predictability, performance guarantees and security by managing bandwidth and only allowing authorised destinations access to specific audio network flows.”
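As a hypothetical illustration of what standards-based endpoint control can look like, the sketch below stages and immediately activates a receiver-to-sender connection using AMWA's NMOS IS-05 Connection API. The device address and the UUIDs are invented for the example; in a real deployment they would be discovered via the NMOS IS-04 registry.

```python
# Hypothetical sketch of connecting an audio receiver to a sender via the
# AMWA NMOS IS-05 Connection API. The host address and UUIDs below are
# made up; the staged/activate pattern follows the published specification.
import json
import urllib.request

RECEIVER_HOST = "http://192.0.2.10:8080"                      # hypothetical device
RECEIVER_ID = "f9a8c6e2-0b1d-4c3e-9a7f-2d5b8e1c4a60"          # hypothetical UUID
SENDER_ID = "3d7e9b14-5a2c-4f81-b6d0-8c4a1e7f2b93"            # hypothetical UUID

# Stage the connection and request immediate activation.
staged = {
    "sender_id": SENDER_ID,
    "master_enable": True,
    "activation": {"mode": "activate_immediate"},
}

req = urllib.request.Request(
    f"{RECEIVER_HOST}/x-nmos/connection/v1.0/single/receivers/{RECEIVER_ID}/staged",
    data=json.dumps(staged).encode(),
    headers={"Content-Type": "application/json"},
    method="PATCH",
)
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.read().decode())
```

In an SDN-managed facility, a broadcast network controller would typically issue requests like this on the operator's behalf, after first checking that the destination is authorised and that bandwidth is available along the path.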
Audio Signal Protection
As production transitions from the LAN environment into the WAN, and IP audio networking is converged with video networking, audio signal protection is becoming an issue. “The SMPTE ST 2022-7 dual path protection standard has now been extended beyond video to cover any RTP media stream, and works to ensure audio signal reliability,” said Olivier. “Compatibility and network addressing issues may persist where different parties need to exchange audio signals, for example between different organisations, or simply between an OB van and the live audio system. Broadcasters can address these concerns through IP Media Edge devices and/or SDN controlling which flows can cross the boundary and how.”
A better approach than bridging the gap with MADI tielines, edge devices serve as entry points into enterprise or service provider core networks, controlling the flow of data. Examples include routers, routing switches and devices that connect a LAN to a high-speed switch or backbone.
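To illustrate the principle behind the ST 2022-7 protection mentioned above, here is a simplified Python sketch of how a receiver merges the same RTP stream arriving over two paths, using sequence numbers to drop duplicates and let either path fill gaps left by the other. Real receivers do this packet by packet within a small alignment buffer, and sequence numbers wrap at 16 bits; both details are omitted here.

```python
# Illustrative sketch of the idea behind SMPTE ST 2022-7 protection:
# the receiver takes the same RTP stream over two paths and reconstructs
# a single stream, using each packet's sequence number to discard
# duplicates and to let either path cover losses on the other.

def merge_dual_paths(path_a, path_b):
    """Merge two lists of (sequence_number, payload) packets into one
    de-duplicated stream, ordered by sequence number. (A real receiver
    works on the fly within a small alignment buffer, and handles
    16-bit sequence-number wrap; both are ignored in this sketch.)"""
    merged = {}
    for seq, payload in list(path_a) + list(path_b):
        merged.setdefault(seq, payload)  # first copy of each packet wins
    return [merged[seq] for seq in sorted(merged)]

# Path A lost packet 3; path B lost packet 5 - together they are complete.
path_a = [(1, "p1"), (2, "p2"), (4, "p4"), (5, "p5")]
path_b = [(1, "p1"), (2, "p2"), (3, "p3"), (4, "p4")]
print(merge_dual_paths(path_a, path_b))  # ['p1', 'p2', 'p3', 'p4', 'p5']
```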
Olivier commented, “Broadcasters often underestimate the challenges of handling audio in an IP-based facility because they focus more closely on how to handle high-bandwidth video. However, new standards and products, combined with the right expertise and experience, will help them overcome these issues and take full advantage of a fully functional IP facility.” nevion.com