In the previous two parts of this four-part series, we covered the basic principles of PTP and explained how time transfer can be made highly reliable using both the inherent methods IEEE1588 provides and various complementary redundancy technologies. In this part, we look deeper into monitoring PTP systems.
Regardless of the level of fault tolerance a PTP infrastructure is provisioned with, it is still crucial to observe the behavior of the complete PTP network as closely as possible. This is absolutely mandatory during the deployment and commissioning phase of any new system, yet it should be continued during normal operation to a justifiable degree. But why is monitoring so essential for PTP and what data should be observed?
Basic PTP Monitoring Requirements
One of the indisputable advantages of PTP is its ability to select a common time reference on its own via the Best Master Clock Algorithm (BMCA). It is worth noting that every BMCA selection round engages all devices, i.e. every PTP-enabled port of every end device and Boundary Clock alike. Although the conditions for state changes are defined very strictly, leaving no room for ambiguous interpretation by implementors, interoperability and compliance with the IEEE1588 standard under all operating conditions should not be taken for granted. Continuous monitoring of every BMCA state change is even more justified when its dependency on timeouts, i.e. the absence of Announce messages over a configurable period of time, is factored in.
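The timeout dependency mentioned above can be illustrated with a minimal sketch. It assumes the relationship defined in IEEE1588, where the timeout interval equals announceReceiptTimeout multiplied by two to the power of logAnnounceInterval. The class name `AnnounceWatch`, its fields, and the default values are hypothetical and chosen for illustration only; they are not part of any PTP stack.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnnounceWatch:
    """Tracks Announce arrivals on one port (hypothetical helper, not a PTP stack API)."""
    log_announce_interval: int = 1     # assumed: 2**1 = 2 s between Announce messages
    announce_receipt_timeout: int = 3  # assumed: intervals that may pass without one
    last_announce_s: Optional[float] = None

    @property
    def timeout_s(self) -> float:
        # Timeout interval = announceReceiptTimeout * 2**logAnnounceInterval
        return self.announce_receipt_timeout * 2.0 ** self.log_announce_interval

    def on_announce(self, now_s: float) -> None:
        # Record the arrival time of the most recent Announce message.
        self.last_announce_s = now_s

    def expired(self, now_s: float) -> bool:
        # Before the first Announce is seen there is nothing to time out against.
        if self.last_announce_s is None:
            return False
        return now_s - self.last_announce_s > self.timeout_s


watch = AnnounceWatch(log_announce_interval=-2, announce_receipt_timeout=3)
watch.on_announce(10.0)
print(watch.timeout_s)      # 0.75
print(watch.expired(10.5))  # False
print(watch.expired(10.8))  # True
```

A monitoring tool could log each moment at which `expired` turns true, together with the port identity, so that BMCA state changes can later be matched against the Announce gaps that caused them.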
Besides state changes, a number of other parameters can affect accuracy and need to be watched. In contrast to an SDI-based environment, monitoring should not be limited to checking the quality of the time sources; rather, it should include as many devices as possible, employing both in-band and out-of-band measurement techniques.
It is good design practice for any mission-critical installation to deploy more than one accurate time source. However, it should be kept in mind that only one Grandmaster will be active at a time, with all others remaining in hot stand-by, merely listening to PTP traffic. To avoid unwelcome surprises during a master failure, the quality of all PTP Grandmasters in the network needs to be continuously verified as a first mandatory step.
The synchronization status of all Slaves (or at least Slaves which play an important role within an All-IP Studio, such as cameras, switchers and mixers) should be periodically monitored together with the status of all PTP-aware network devices. If PTP is deployed within a network without any PTP support, the network load requires both careful planning and continuous monitoring, as it will affect the accuracy. In this way, transient load peaks causing high Packet Delay Variations (PDVs) can either be avoided completely or their effect on the accuracy can be mitigated. As a basic requirement, such peaks need to be logged, and the personnel in charge of the network and broadcast operations need to be alerted.
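As a rough illustration of how load peaks might be logged, the following sketch measures PDV as the excess of each delay sample over the lowest delay observed in the same window and reports the samples exceeding a threshold. The function name `pdv_report`, the threshold, and the sample values are assumptions made for this example; real measurements would come from timestamped PTP packets.

```python
def pdv_report(delays_ns, threshold_ns):
    """Summarise packet delay variation for one window of delay samples.

    delays_ns:    measured one-way delays in nanoseconds (hypothetical input)
    threshold_ns: variation above which a sample counts as a load peak (assumed)
    """
    floor_ns = min(delays_ns)                        # lowest delay seen in the window
    variations = [d - floor_ns for d in delays_ns]   # PDV relative to that floor
    peaks = [(i, v) for i, v in enumerate(variations) if v > threshold_ns]
    return {"floor_ns": floor_ns, "max_pdv_ns": max(variations), "peaks": peaks}


# Invented sample values: one packet was held up by a burst of other traffic.
report = pdv_report([1000, 1020, 1010, 5000, 1005], threshold_ns=500)
print(report["max_pdv_ns"])  # 4000
print(report["peaks"])       # [(3, 4000)]
```

Each entry in `peaks` carries the sample index, which a logging system could translate into a time of day before raising an alert.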
Modern networks provide a high level of redundancy and are able to cope with partial failures such as a broken network connection. Traffic will be re-routed automatically applying protocols such as Rapid Spanning Tree and/or an Interior Gateway Protocol. Such events do impact PTP because the transmission time for PTP messages will suddenly change. Therefore, such events need to be monitored in the same way as load peaks.
PTP Monitoring Techniques
Having established what information we need to gather from as many devices as possible, we need to plan how to do it. PTP specifies a set of management messages for querying the status of nodes as well as setting specific PTP parameters. They are well suited to querying all nodes within a PTP network but should be complemented by additional monitoring measures. Firstly, PTP Management may increase the PTP network load significantly if all nodes within a large network are queried at short intervals. Secondly, PTP Management messages yield only a snapshot of the current state; they do not provide data about past events together with the time information needed to correlate data gathered from different devices in the network.
Therefore, the PTP Management mechanism has to be complemented by other techniques. Aside from monitoring the presence and contents of the Announce messages, PTP event messages (Sync, Delay_Request, and Delay_Response) need to be accounted for as well. Within PTP, a Master failure is detected only via the absence of Announce messages. If a Slave does not receive PTP event messages, it will remain in its state without triggering the BMCA, yet its local clock starts drifting away. Such a situation may well occur due to a malfunctioning network device, be it PTP aware or unaware, but will remain undetected by the PTP Slave. Some PTP device manufacturers provide access to extended statistical data such as packet counters via custom PTP management messages or standard network monitoring protocols such as SNMP (Simple Network Management Protocol). At the network level, PTP traffic can be monitored and analysed easily using open source tools like Wireshark or PTP Track Hound, the latter being specifically dedicated to PTP traffic.
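The failure case described above, where Announce messages keep arriving but Sync messages do not, could be caught with per-type packet counters. The sketch below assumes that the message types seen at a Slave port during one observation window are available as a list of strings, for example from a capture tool or a vendor counter interface. The function name `check_window`, the alarm texts, and the `expected_sync_min` parameter are hypothetical.

```python
from collections import Counter


def check_window(message_types, expected_sync_min=1):
    """Return alarm strings for one observation window (illustrative only).

    message_types:     PTP message type names seen in the window (assumed input)
    expected_sync_min: minimum number of Sync messages expected (assumed)
    """
    counts = Counter(message_types)  # missing types count as zero
    alarms = []
    if counts["Announce"] == 0:
        alarms.append("Announce missing")
    elif counts["Sync"] < expected_sync_min:
        # The Slave keeps its state because Announce still arrives,
        # but it has nothing left to discipline its clock with.
        alarms.append("Announce present but Sync missing")
    return alarms


print(check_window(["Announce", "Sync", "Sync", "Announce"]))  # []
print(check_window(["Announce", "Announce"]))  # ['Announce present but Sync missing']
```

The second alarm is the interesting one for monitoring, because it flags a condition that the Slave itself does not report.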
The current offset of a PTP Slave can be accessed simply via a corresponding PTP Management message. However, this data may be insufficient to assess the synchronization quality of the device in question, because it reflects only the Slave’s point of view. Any well-designed PTP servo loop will keep the mean value of the offset very close to zero. It does so by assuming a symmetric transmission delay within the network; it has no way of detecting asymmetries and thus cannot account for them.
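The effect of an undetected asymmetry can be made explicit with the standard two-way time transfer calculation. The symbols are assumptions introduced here: $t_1$ and $t_4$ are Master timestamps (Sync sent, Delay_Request received), $t_2$ and $t_3$ are Slave timestamps (Sync received, Delay_Request sent), $\theta$ is the true offset of the Slave, and $d_{ms}$ and $d_{sm}$ are the true Master-to-Slave and Slave-to-Master delays.

```latex
\begin{align}
t_2 - t_1 &= d_{ms} + \theta \\
t_4 - t_3 &= d_{sm} - \theta \\
\hat{\theta} &= \frac{(t_2 - t_1) - (t_4 - t_3)}{2}
             = \theta + \frac{d_{ms} - d_{sm}}{2}
\end{align}
```

The offset $\hat{\theta}$ computed by the Slave therefore contains an error of half the delay asymmetry. As an example, if the path to the Slave is 200 ns slower than the return path, the servo will steer $\hat{\theta}$ to zero while the clock is actually 100 ns off, and the Slave will still report an offset close to zero.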
A simple and straightforward way to verify symmetric delay or account for asymmetries is to measure the offset of the PTP Slave to its Master externally. This can be accomplished by comparing signals generated by both devices with appropriate measurement equipment. This approach is equivalent to comparing video sync signals with a vectorscope. Some devices can measure the offset of an external input against their internal PTP-synchronized clock.
Admittedly, proposing out-of-band measurements as an important monitoring tool runs counter to the fundamental principle of PTP, which is to be deployed on a single communication medium together with all other user traffic. However, such measurements should be considered a viable evaluation tool, at least during the initial deployment of PTP.
For mission-critical applications, nodes providing out-of-band offset measurement capabilities should be placed at distinct points within a large network, thus further enhancing observability.
Extended PTP Monitoring
Several PTP vendors led by Meinberg have proposed an enhancement to the PTP standard which greatly improves its monitoring capabilities: the Netsync Monitor. It can either be added to an existing PTP node (preferably a Grandmaster) or be deployed on a separate monitoring device.
Support for this monitoring extension requires only minimal software changes to the PTP stack. The monitoring system initiates and maintains a two-way time transfer similar to the original PTP mechanism and thus can utilize existing hardware for scanning and timestamping of PTP packets without any alteration whatsoever. Regardless of the communication mechanism selected for standard PTP traffic, all monitoring messages are exchanged in unicast.
The transfer is started by sending a Delay_Request message. This message is extended with a special TLV (Type Length Value) field designating it as a monitoring message. The receiver will simply process this message by gathering the ingress timestamp and returning this information to the monitoring system via a Delay_Response message extended by a corresponding TLV. The data contained in the TLV triggers the device to generate one additional Sync message, again extended by a TLV. This message is sent to the monitoring system, which has now gathered the same four timestamps that every Slave uses to calculate the offset of its local clock. The monitoring system, however, will not use this data to adjust its clock, as it is already synchronized using an alternate time feed, e.g. a GNSS source. Instead, it is able to analyse the offset of the device independently. It should be noted that more than one monitoring device may be added to a PTP network. The data gathered by these monitoring systems can be used to evaluate the synchronization accuracy of all devices supporting this extension. Furthermore, it can reveal offsets caused by asymmetries.
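The calculation performed by the monitoring system can be sketched as follows. The timestamp names are assumptions borrowed from the usual PTP convention: `t3` is taken by the monitor when it sends the Delay_Request, `t4` by the device on ingress, `t1` by the device when it sends the extra Sync, and `t2` by the monitor on reception. The function name `device_offset_ns` is hypothetical, and the sketch assumes a symmetric path between monitor and device.

```python
def device_offset_ns(t1, t2, t3, t4):
    """Offset of the monitored device relative to the monitor's reference clock.

    t3: monitor sends Delay_Request   (monitor clock)
    t4: device receives Delay_Request (device clock)
    t1: device sends the extra Sync   (device clock)
    t2: monitor receives that Sync    (monitor clock)
    Returns (offset_ns, mean_path_delay_ns); assumes equal delay in both directions.
    """
    to_device = t4 - t3   # path delay plus device offset
    to_monitor = t2 - t1  # path delay minus device offset
    offset = (to_device - to_monitor) / 2
    mean_path_delay = (to_device + to_monitor) / 2
    return offset, mean_path_delay


# Invented values: device clock runs 250 ns ahead, path delay is 1000 ns each way.
print(device_offset_ns(t1=20_250, t2=21_000, t3=10_000, t4=11_250))  # (250.0, 1000.0)
```

Because the monitor's clock is disciplined by an independent reference, the result is a measurement of the device rather than a correction to be applied, which is what allows it to expose errors that the device's own servo cannot see.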
Time transfer using PTP is a small, yet crucial part of the All-IP Studio. If designed with special attention to providing a sufficient level of fault tolerance, PTP will maintain accurate time throughout the network, requiring little to no user interaction at all. Continuous monitoring of all vital PTP parameters should, however, not be limited to the deployment and commissioning stages. Both in-band and out-of-band measurement and monitoring techniques should be employed whenever manageable.