Flow control

This section addresses some of the mechanisms behind flow control and start of burst.

General Overview

Flow control refers to the mechanisms used to manage the availability of samples when sending data from a host to the SDR. In particular, it introduces a buffer between the asynchronous packets sent over the Ethernet port, and the deterministic consumption of samples by the DSP chain and converter devices.

For proper operation, and to ensure deterministic data transmission, accurate and precise time synchronization between the Host PC and SDR is required to ensure:

  1. All channels have data in their buffers prior to the time transmission should start.

  2. That transmission data streams start at an appropriate time to ensure that the Crimson TNG transmission packet buffer is neither exhausted (causing underflows because we started streaming data too late), nor saturated (resulting in overflows because we started sending data too early).

  3. That sufficient sample data is available throughout operation, including sufficient buffer margin to absorb variations in latency introduced by Network cards or switches. If incoming data is delayed for too long, then the transmit buffer will be exhausted before that data arrives (causing an underflow).
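The timing constraints above can be sketched as a window of valid send-start times. This is an illustrative model only: the buffer depth, sample rate, and function names are assumptions for the example, not actual device parameters.

```python
# Hypothetical sketch of the send-start timing window for a scheduled burst.
# Assumes the host fills the buffer at roughly the sample rate before the
# burst begins; all constants are illustrative, not device specifications.

BUFFER_CAPACITY_SAMPLES = 65536   # assumed transmit buffer depth
TARGET_FILL = 0.85                # nominal fill level (see below)
SAMPLE_RATE = 25e6                # samples per second (illustrative)

def send_window(burst_start_time: float) -> tuple[float, float]:
    """Return (earliest, latest) host send-start times, in seconds.

    Starting earlier than `earliest` risks saturating the buffer before
    consumption begins (overflow); starting later than `latest` leaves
    less than the nominal fill at burst start (risking underflow).
    """
    earliest = burst_start_time - BUFFER_CAPACITY_SAMPLES / SAMPLE_RATE
    latest = burst_start_time - (TARGET_FILL * BUFFER_CAPACITY_SAMPLES) / SAMPLE_RATE
    return earliest, latest
```

Note that the window narrows as the sample rate increases, which is why accurate time synchronization matters most at high rates.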

This is because once streaming commences, the DAC consumes samples at a constant and deterministic rate. In the absence of any compensation mechanism, if those samples are not present, then substitute samples need to be inserted, which could lead to undesirable phase differences between channels.

Flow control is particularly relevant when at least one of the following conditions is true:

  1. Sending data across different clock domains.

  2. Sending data asynchronously or nondeterministically.

In most situations, both of the above conditions are true when sending data from any sort of non-realtime processor to the SDR. This is because the time required to send data can vary substantially every time you call the send() command. If the variance between the time the application attempts to send data and the time that data is “on the wire” exceeds the temporal size of the buffer (calculated as buffer_time = sample_buffer_size/desired_sample_rate_in_samples_per_second), then the application will underflow.
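The relation above can be written out directly. The values below are assumptions chosen for illustration, not device specifications.

```python
# Illustrative sketch of the buffer's temporal size and the underflow
# condition when send-time jitter exceeds it.

def buffer_time_s(sample_buffer_size: int, sample_rate_sps: float) -> float:
    """Temporal size of the buffer: how long streaming can continue with
    no new data arriving before the buffer is exhausted."""
    return sample_buffer_size / sample_rate_sps

def will_underflow(worst_case_send_jitter_s: float,
                   sample_buffer_size: int,
                   sample_rate_sps: float) -> bool:
    # If the variance between when the application attempts to send data
    # and when that data is on the wire exceeds the buffer's temporal
    # size, the device runs out of samples before the delayed data arrives.
    return worst_case_send_jitter_s > buffer_time_s(sample_buffer_size,
                                                    sample_rate_sps)
```

For example, a 65536-sample buffer at 1 MSPS gives roughly 65 ms of margin; the same buffer at 100 MSPS gives well under 1 ms, which is why underflows become far more likely at high sample rates.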

Discussion of Flow Control

In the ideal case, we would not need flow control; the entire network stack would be entirely deterministic, and packets would be sent at uniform intervals. As a practical matter, this is not the case. Consider the following case of an ideal network card, a switch with substantial jitter, and Cyan.

  1. The ideal network card happens to send packets at perfectly uniform intervals, with zero jitter.

  2. The 40G switch introduces a fixed latency component and a variable latency component (jitter).

  3. You start streaming data, and achieve steady state. At this point, your host PC is sending radio data in UDP packets at uniform intervals, which are being re-transmitted by the switch.

  4. The Cyan transmission sample buffer holds steady at 85% - the default value, which helps provide sufficient margin for start of burst and damps out routine fluctuations.

    • We use the sample buffer level to help ensure that the sample rate of the host PC is aligned with the actual sample rate at which Cyan sends data. This is because, when transmitting data, the clock that matters most is the transmit device clock, which is found within Cyan. Taking this clock as an absolute reference, we need to compensate for any drift or variation between the host PC clock and the Cyan reference clock - we can use the buffer level as a very reasonable proxy for this difference, and adjust the transmission rate from the Host PC accordingly.
  5. You’ve achieved steady state, and then (randomly) your switch delays the transmission of a packet. During this time, your network card continues to send packets to the switch, which are buffered.

    • This is the difference between a store-and-forward switch, and a cut-through switch. A store-and-forward switch waits for the entire Ethernet frame to be cached prior to forwarding the data. A cut-through switch starts forwarding frames immediately after getting the first 12 bytes of the Ethernet header. (This doesn’t necessarily affect latency the way you might expect, though.)
  6. At this point, the Cyan transmit buffer is still reasonably full, so there is no interruption. However, depending on your sample rate, we might be consuming samples very quickly, in which case we will underflow (run out of samples). At this point, Cyan will send zero-valued samples to the DAC (the chain does require data), and keep track of how many zero-valued samples it has sent.

  7. At this point, the switch releases all the buffered data - all at once, and at the 40G line rate.

  8. The packet handler on Cyan then drops the first N packets or samples (depending on how many zeros we have inserted), to ensure phase coherency, and then starts buffering data once we’re back in sync.

  9. At this point, the Cyan sample buffer may be very low - so it’s very susceptible to additional variations. Our flow control routine detects this and ensures the buffer returns to the nominal level.
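The buffer-level feedback idea in step 4 can be sketched as a simple proportional control loop: treat the reported fill level as a proxy for host/device clock drift, and trim the host send rate toward the nominal 85% set point. The gain, names, and control law here are illustrative assumptions; the actual flow-control routine may differ.

```python
# Hedged sketch: proportional correction of the host send rate based on
# the device's reported buffer fill level. All names and the gain value
# are assumptions for illustration.

NOMINAL_FILL = 0.85  # nominal buffer fill level (see step 4 above)

def adjusted_send_rate(base_rate_sps: float, fill_level: float,
                       gain: float = 0.05) -> float:
    """If the buffer is draining (fill below nominal), the device clock
    is consuming faster than we are sending, so speed up slightly; if
    the buffer is rising, slow down."""
    error = NOMINAL_FILL - fill_level  # positive => buffer too empty
    return base_rate_sps * (1.0 + gain * error)
```

The key point is that the correction is relative to the device clock: the host never needs to know the absolute drift, only whether the buffer is trending away from its set point.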

If you are observing issues with glitches in the output, confirm that the network route to the device has as low a latency as possible - this helps improve the reliability with which you can send packets to Cyan.

On Crimson TNG

The current code requires the temporal variance between the Host PC and Crimson TNG to be less than 20ppm/s for convergence to be achieved. On non-realtime operating systems operating at the highest sample rates, this number may vary depending on the load placed on the system. As a result, we may occasionally lose convergence.

Though there are a number of compensation mechanisms in place to ensure such losses are handled gracefully, for best performance, we strongly recommend a direct 10G connection between Crimson TNG and the Host PC.

In the event of an underflow, the current implementation keeps track of exactly how many samples we have dropped. When the updated samples come in, we drop as many samples as is required to ensure that relative coherency (with respect to our first packet sent) is preserved.
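The underflow bookkeeping described above can be modelled in a few lines: count the zero-valued samples sent to the DAC during the gap, then drop the same number of samples from the late-arriving stream so that relative coherency (with respect to the first packet sent) is preserved. This is an illustrative model only, not the device implementation; the class and method names are assumptions.

```python
# Minimal sketch of underflow accounting: zeros inserted during a gap
# are later matched by dropping the same number of incoming samples,
# keeping the output aligned with the original timeline.

class UnderflowTracker:
    def __init__(self) -> None:
        self.zeros_inserted = 0  # zero-valued samples sent to the DAC

    def on_underflow(self, n_missing: int) -> list[int]:
        """Device ran dry: emit n_missing zero samples and record them."""
        self.zeros_inserted += n_missing
        return [0] * n_missing

    def on_data(self, samples: list[int]) -> list[int]:
        """Late data arrived: drop as many samples as zeros were inserted,
        so the remaining samples land at their originally intended times."""
        drop = min(self.zeros_inserted, len(samples))
        self.zeros_inserted -= drop
        return samples[drop:]
```

In this model, a 3-sample gap followed by 5 late samples yields 3 zeros on the wire and only the last 2 of the late samples, preserving the sample count relative to the start of the stream.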

On Cyan

The current code requires the temporal variance between the Host PC and Cyan to be less than 20ppm/s for convergence to be achieved. On non-realtime operating systems operating at the highest sample rates, this number may vary depending on the load placed on the system. As a result, we may occasionally lose convergence.

Though there are a number of compensation mechanisms in place to ensure such losses are handled gracefully, for best performance, we strongly recommend a direct 40G connection between Cyan and the Host PC.

In the event of an underflow, the current implementation keeps track of exactly how many samples we have dropped. When the updated samples come in, we drop as many samples as is required to ensure that relative coherency (with respect to our first packet sent) is preserved.