Synchronization Across Machines

By Chris Pirazzi. Some material stolen from Wiltse Carpenter, Doug Cook, Bryan James, and Bruce Karsh.

The UST and UST/MSC support described in Introduction to UST and UST/MSC lets you relate the times at which signals enter and exit one machine. Sometimes you want to relate the timing of signals on multiple machines. For example:

If you are using one machine to input, output, or process audio, while you are using another machine to input, output, or process video or graphics, and you want to synchronize these operations (in visual simulation, for example).
If you have an array of genlocked video or graphics monitors, each connected to a different SGI machine, and you want to output data to each monitor in sync.
If you have an array of genlocked cameras capturing different views of a scene, each connected to a different SGI machine, and you want to capture data in such a way that you can determine which image on each machine coincides with each image on the other machines.
If you have an array of machines, each connected to the same video signal, and you want to perform a real-time "striped" compression or other processing task (each of the N machines processes every Nth piece of video data) that would not go in real-time on one machine.

UST and UST/MSC solved the problem on one machine by mapping all interesting signal events onto a common clock, the UST clock.

To solve the problem across machines, you need:

1. a clock that is common to all the machines, and
2. a way to map this common global clock onto machine-local events.

Sometimes, it's convenient to map the machine-global clock onto the machine-local UST clock, since then you can use the UST support on the machine to map to other machine-local events. Sometimes you map directly from the machine-global clock to the particular machine-local event you are using.

Global Clocks

For our purposes, a clock is any periodically changing numerical quantity which we can measure from software, such that two pieces of software measuring the clock at the same instant will retrieve the same numerical value (to some stated accuracy). In some cases, that numerical quantity may appear as a timecode (see Timecode).

These are the machine-local clocks we're familiar with:

UST
MSCs from any AL port or VL path
gettimeofday() (with no network time daemon)
The cycle counter (clock_gettime(CLOCK_SGI_CYCLE,...))

How many kinds of clocks can be distributed to more than one machine (requirement 1 above)? The number may surprise you:

gettimeofday() (along with timed or NTP synchronization)
LTC over an audio wire (see Timecode)
MIDI timecode over a MIDI wire (see Timecode)
Any variant on over-the-wire timecode you care to cook up. For example,
- use tserialio to transmit a number over a serial port 10 times a second. The number increments by 1 each time.
- use the AL to transmit a digital or analog audio signal containing a simple code plus an incrementing number.
Now you have a trivial-to-parse clock which you can distribute anywhere!
The combination of video sync (i.e., a common view of where a series of video vertical sync pulse intervals fall in time, but without any particular label on each vertical sync pulse interval) and any of the following labeling events:
- one or more VITC codewords embedded in the video signal (see Timecode). Fields without VITC implicitly follow the previous field in the timecode sequence.
- a GPI trigger assertion, which you can think of as labeling the coincident video field as field number 0. Subsequent fields are implicitly numbered 1, 2, 3, etc.
- one or more messages transmitted over a serial port, which label the coincident fields with a number or a timecode. For example, the messages may come from a V-LAN box or a Sony 9-pin protocol RS-422 deck.
- any variant on the above you care to cook up, for example using tserialio or AL.

There are many more such clocks. We have chosen some clocks which also satisfy requirement 2.

SGI's libraries let you accurately map each of these global clocks (except for video+GPI) onto the machine-local UST clock. For example:

You can bring in LTC with the AL (parsing it with dmLTC()) and use the AL's UST/MSC support to find a UST for each LTC codeword.
You can do the same with the MIDI library for MIDI timecode.
For the examples above that use serial signals, you can use tserialio to pair incoming and outgoing serial bytes with UST.
For the examples that use video signals along with serial signals, you can use the UST support in tserialio and the VL to assemble the complete clock and map it to UST.

You can also bypass UST and map many of these global clocks onto other local events. For example:

A VITC codeword in each incoming video field lets you map each global timecode onto a particular MSC and a particular chunk of data from your VL path.
SGI's video devices which support GPI let you map the video+GPI clock (0, 1, 2, 3) onto the chunks of data you get from your VL path, using GPI-triggered VL transfers.

With the tools described above, and a little imagination, you can build a cross-machine synchronization mechanism for your application.

The rest of this document contains more detailed explanations of a few particular cross-machine synchronization mechanisms.

Relating the UST of Two Machines Using Network Time

The most obvious way to relate the UST of two machines on a network is to sync up their gettimeofday() clocks using your favorite network time protocol (timed, NTP, etc.) and then use dmGetUSTCurrentTimePair() on each machine to pair UST with gettimeofday(). The network time protocols often give you accuracy in the millisecond range on LANs. When using this technique, be sure to update your UST/gettimeofday() relationship with dmGetUSTCurrentTimePair() periodically, for reasons described in Introduction to UST and UST/MSC.
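Once each machine holds a UST/gettimeofday() pair, getting from a machine A UST to the coinciding machine B UST is plain arithmetic through the shared wall clock. Here is a minimal sketch of that arithmetic, assuming both USTs count nanoseconds and the wall clocks are already synchronized. The struct and function names are made up for illustration; on a real system the pair values would come from dmGetUSTCurrentTimePair() on each machine.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for one UST/wall-clock pairing. */
typedef struct {
    int64_t ust_ns;   /* machine-local UST, in nanoseconds */
    int64_t wall_ns;  /* network-synchronized time of day, in nanoseconds */
} ust_wall_pair;

/* Convert a timeval-style (seconds, microseconds) reading to nanoseconds. */
static int64_t wall_to_ns(int64_t sec, int64_t usec)
{
    assert(usec >= 0 && usec < 1000000);
    return sec * 1000000000LL + usec * 1000LL;
}

/* Map a UST on machine A to the coinciding UST on machine B by going
 * through the wall clock that both machines share. */
static int64_t ust_a_to_ust_b(ust_wall_pair a, ust_wall_pair b, int64_t ust_a)
{
    int64_t wall = a.wall_ns + (ust_a - a.ust_ns);  /* A's UST -> wall clock */
    return b.ust_ns + (wall - b.wall_ns);           /* wall clock -> B's UST */
}
```

The result is only as good as the network time sync underneath it, and it goes stale as the clocks drift, which is why both pairs should be refreshed periodically.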

Relating the UST of Two Machines Using a Digital Audio Signal

For machines which are not connected by a network, or for cases where you require more accuracy, consider connecting the machines with another clock. The following hack shows a way to relate the UST clocks of two machines to within ±80us (±3us on newer audio platforms) using a digital audio signal.

This hack works if you are willing to dedicate one digital audio channel to the task (or at least some of it), and plug the connector containing that channel from the output of one machine to the input of the other machine. Say machine A has the digital audio output, and machine B has the digital audio input.

On machine A, use the UST/MSC support in the AL to determine the UST of the next audio sample you are sending out. Say the next sample you're sending out has UST 0xaabbccddeeffgghh. Transmit the following data over the normal audio bits of your digital audio channel using alWriteFrames() (this example assumes 16 bits per sample):

  0xFEED 0xBABE 0xDEAD 0xBEEF 0xaabb 0xccdd 0xeeff 0xgghh

Since 0xFEED was the first frame we transmitted, the UST 0xaabbccddeeffgghh is the UST of 0xFEED. Machine A continues to transmit valid pairs of "0xFEED 0xBABE 0xDEAD 0xBEEF" and UST. The program generating the data on machine A need not call alGetFrameTime() and alGetFrameNumber() between transmission of each pair: once per second should be fine. The program can safely interpolate the remaining USTs, or it could even transmit a "0xFEED 0xBABE 0xDEAD 0xBEEF" and UST pair only once every second or so, and transmit zeroes the rest of the time.
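The transmit side can be sketched in a few lines of portable C. This is only an illustration under some assumptions: the function names are invented, the UST is sent most significant 16 bits first (matching the layout shown above), the samples are held as uint16_t for clarity even though the AL treats them as signed 16-bit audio, and the sample rate is taken to be constant between refreshes of the frame-number/UST pairing.

```c
#include <assert.h>
#include <stdint.h>

#define SYNC_WORDS  4
#define FRAME_WORDS 8   /* 4 marker words + 4 UST words */

static const uint16_t sync_code[SYNC_WORDS] = { 0xFEED, 0xBABE, 0xDEAD, 0xBEEF };

/* Fill out[0..7] with the marker followed by the UST of the 0xFEED sample,
 * most significant 16 bits first. */
static void pack_ust_frames(uint64_t ust, uint16_t out[FRAME_WORDS])
{
    int i;

    for (i = 0; i < SYNC_WORDS; i++)
        out[i] = sync_code[i];
    for (i = 0; i < 4; i++)
        out[SYNC_WORDS + i] = (uint16_t)(ust >> (48 - 16 * i));
}

/* Estimate the UST of frame number msc from one known (msc0, ust0) pairing.
 * Assumes a constant nominal rate and a pairing refreshed about once per
 * second, so the intermediate product stays far away from overflow. */
static uint64_t interpolate_ust(uint64_t msc0, uint64_t ust0,
                                uint64_t msc, uint32_t rate_hz)
{
    assert(rate_hz > 0 && msc >= msc0);
    return ust0 + (msc - msc0) * 1000000000ULL / rate_hz;
}
```

A sender would call interpolate_ust() with the frame number of the next 0xFEED it is about to queue, pack the result, and hand the eight samples to alWriteFrames().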

Now, on machine B, you have another program running which reads 16-bit audio in the normal way with alReadFrames(). The program searches the incoming samples for the pattern "0xFEED 0xBABE 0xDEAD 0xBEEF". When it sees this code, it knows that the next 4 samples (next 8 bytes) contain the machine A UST of 0xFEED. Like machine A, machine B uses the UST/MSC support in the AL on its side to determine a machine B UST for 0xFEED.
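The search on the receiving side might look something like the sketch below. It assumes the UST words arrive most significant first and that the samples are viewed as uint16_t; find_ust_code() is a hypothetical helper, not an AL call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Search samples[0..n-1] for the marker followed by four UST words.
 * Returns the index of the 0xFEED sample and stores the decoded
 * machine A UST in *ust_a, or returns -1 if no complete code is present. */
static long find_ust_code(const uint16_t *samples, size_t n, uint64_t *ust_a)
{
    static const uint16_t sync_code[4] = { 0xFEED, 0xBABE, 0xDEAD, 0xBEEF };
    size_t i;
    int k;

    assert(samples != NULL && ust_a != NULL);
    for (i = 0; i + 8 <= n; i++) {
        if (samples[i]     == sync_code[0] && samples[i + 1] == sync_code[1] &&
            samples[i + 2] == sync_code[2] && samples[i + 3] == sync_code[3]) {
            uint64_t ust = 0;

            for (k = 0; k < 4; k++)
                ust = (ust << 16) | samples[i + 4 + k];
            *ust_a = ust;
            return (long)i;
        }
    }
    return -1;
}
```

The returned index, added to the frame number of the first sample in the buffer, identifies the frame whose machine B UST you need. A real program would also have to cope with a code that straddles two alReadFrames() buffers, which this sketch ignores.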

Now machine B has a machine-B-UST and a machine-A-UST which coincide. The accuracy of this pair depends on the accuracy of the UST/MSC support used to generate it on each machine. The AL's UST/MSC support will give you USTs that are ±1.5us accurate on Octane, Origin, Onyx2, and O2 digital audio. The USTs are 4-audio-sample-accurate (±40us at 48k) on other systems. So if both systems are ±1.5us accurate, this will give you a machine-A-UST/machine-B-UST pair which is accurate to ±3us (since the uncertainties in the transmitter's and receiver's USTs are cumulative).

If both machine A and machine B need to know the pairing of their USTs, then you can do one of these:

1. Machine B can send some of the pairs it computes back to machine A in any old leisurely, non-real-time manner (for example, over a socket connection on an Ethernet). As long as machine A and B are both working with machine-A-UST/machine-B-UST pairs that are within a second or two of each other, you should not lose significant accuracy.
2. If it's easier than #1 and you have another digital audio channel available on each machine, you could repeat this hack in the other direction.

If you were wondering, we purposely chose the code:

  0xFEEDBABEDEADBEEF

because, in addition to being silly, it is an essentially unachievable UST. This means that the receiver will never receive a UST with the value 0xFEEDBABEDEADBEEF and mistake it for the special code that indicates a UST is next.

Why is it unachievable? UST measures elapsed nanoseconds since system startup. In order for the UST to reach 0xFEEDBABEDEADBEEF, machine A would have to be up for:

  0xFEEDBABEDEADBEEF == 18369543784056602351
        
  18369543784056602351 / (1000000000 * 60 * 60 * 24 * 365) = 582 years
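If you want to check that arithmetic yourself, note that the divisor does not fit in a 32-bit int, so the multiplication has to be done in 64 bits. A small sketch (the macro and function names are just for illustration):

```c
#include <assert.h>
#include <stdint.h>

#define SYNC_CODE   0xFEEDBABEDEADBEEFULL
#define NS_PER_YEAR (1000000000ULL * 60 * 60 * 24 * 365)  /* 64-bit product */

/* Whole 365-day years of uptime needed for a nanosecond UST to reach ust_ns. */
static uint64_t whole_years(uint64_t ust_ns)
{
    return ust_ns / NS_PER_YEAR;
}

/* Sanity check of the two figures quoted above. */
static void check_sync_code_uptime(void)
{
    assert(SYNC_CODE == 18369543784056602351ULL);
    assert(whole_years(SYNC_CODE) == 582);
}
```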

You can even do this hack without dedicating an entire digital audio channel to it. If you transmit and receive with 24 bits of precision, you can transmit the 128-bit timing signal 0xFEEDBABEDEADBEEFaabbccddeeffgghh one bit at a time using the least significant bit of the audio samples. Thus the timing information overlays the audio information like a watermark overlays an image. This low bit is very likely to be inaudible. Whether or not you actually need it intact depends on your application.

Relating the UST of Two Machines Using Other Signals

The hack above can easily be adapted to other signals:

You can use tserialio to send and receive the same timing signal (0xFEEDBABEDEADBEEFaabbccddeeffgghh) over a serial port. Since tserialio is ±1ms accurate, this relates the UST of two machines to ±2ms accuracy.
You can use the VL to embed the timing signal inside video data that you send to another machine (perhaps in the vertical interval). Typical VL devices are at least ±100us accurate, so this relates the UST of two machines to at least ±200us. In this case, you might instead want to embed a more standard VITC signal in the video, and then lazily send pairings of VITC timecode and machine-A-UST in non-real-time from machine A to machine B.
You can also send the timing signal over an analog audio line, but you'll have to figure out a way to adapt the timing signal so that it is resilient to analog audio filtering and artifacts. Conceptually, you need to make the signal "smoother" so that analog equipment will properly transmit and receive it.