Real-time SDR system with TySOM

Bartosz, Piotr and Tomasz, Electronics Engineers at Aldec

Introduction

This blog post continues on from “Development of real-time SDR systems with Aldec HES”, which was written by Mariusz Grabowski and recounted how the early development phases of a Software Defined Radio (SDR) were achieved using our HES platform. The FPGA boards that are part of HES are intended to work with workstations, to facilitate the quick and easy prototyping of hardware designs. The HES environment was therefore very useful at the beginning of the SDR design and verification process because of its high level of integration with other tools, like our Riviera-PRO HDL simulator and the GNU Radio framework. For instance, a user can easily verify a design that is partially implemented in an FPGA while the rest of the design is simulated in Riviera-PRO or run in the GNU Radio framework.

Once the SDR design was proven in the HES platform it was time to move the design into the embedded environment, and that is what this article recounts.

Into TySOM

Systems like radios should be small and compact, and embedded platforms are ideal because of their small form factor. However, performance is important too and must be verified. Aldec’s TySOM-3-ZU7EV board was selected as the embedded platform for the next stage of the SDR design and verification. The board (see Figure 1) contains a Xilinx Zynq UltraScale+ SoC with a quad-core ARM Cortex-A53 processor.

Figure 1. TySOM-3-ZU7EV board with ADC/DAC FMC expansion card.

Also, an Aldec FMC expansion card with analog-to-digital and digital-to-analog converters was used to produce an analog signal, and the Ubuntu 20.04.3 LTS (Focal) Linux operating system was installed on the TySOM board to support the further development of the software needed for the SDR system. Ubuntu allows the GNU Radio framework to be installed as a system package.
The GNU Radio framework was also installed on the TySOM board to enable testing and debugging of modules implemented in the programmable logic (PL) part of the Zynq device. To communicate with the PL, a custom component was created in the GNU Radio framework. That component was the Xilinx DMA Proxy, which transfers data to the PL using Xilinx’s AXI Direct Memory Access interface (shown in Figure 2).

Figure 2. The GNU Radio framework running on the TySOM board.

It was easy to pass signals from the GNU Radio framework running on the Zynq device’s quad-core ARM Cortex-A53 processor to the PL part of the same device. The signal samples were transferred by the Xilinx DMA Proxy custom component in the GNU Radio framework to the AXI Direct Memory Access component, which was implemented in the PL. An AXI Direct Memory Access component is available in the standard library of the Xilinx Vivado Design Suite.

However, adding a new component in the PL and connecting it to the ARM Cortex-A53 processing system (PS) part of the Zynq device required reconfiguration of the Linux operating system. The Linux OS needed to be aware of the newly connected device in order to give Linux users access to its registers and other resources. Reconfiguring the operating system was relatively easy with Xilinx’s PetaLinux tool, which needed the board definition files to correctly produce updated Linux configuration files. Note, the definition files for Aldec boards are available for free on GitHub: https://github.com/aldec/TySOM-3-ZU7EV/tree/master/Petalinux_BSP

The PetaLinux tool took the board definition files and the Xilinx Vivado design, in which additional components like AXI Direct Memory Access were implemented and connected to the Zynq PS part, and produced updated Linux operating system files. The new Linux files then had to be installed on the TySOM board and the operating system restarted.

Figure 3.
Xilinx PetaLinux configuration window

The figure above shows the Xilinx PetaLinux configuration window, where a user can enable access to the devices implemented in both the PL and PS parts of the Zynq SoC.

SDR Design Description

SDR is a radio system where signal processing is realized using processors or FPGA devices. Such a system is easy to reconfigure to work with various radio protocols. Figure 4 shows the block diagram of the implemented SDR system.

Figure 4. SDR Design Block Diagram

The SDR system is designed to transfer files over an analog channel. In our case, an audio channel (i.e. not RF) was selected so as to avoid causing interference with local radio communication systems.

Originally (and as recounted in our other blog post), the SDR design was fully modeled in the GNU Radio framework to check its functionality at system level. It was now time to map the design modules, one by one, to an FPGA for better performance. GNU Radio and our Riviera-PRO simulator were integrated to allow the design parts implemented in HDL to be co-simulated alongside the software modules (which remain in the GNU Radio framework).

Figure 5. Aldec Riviera-PRO and GNU Radio simulation results

Figure 5 shows co-simulation results in Riviera-PRO’s waveform viewer and a GNU Radio constellation diagram.

The SDR design worked correctly on the embedded platform, but the CPU load was significantly higher than when the design was distributed over a HES platform and a powerful workstation. The decision was taken to implement more of the design in hardware (PL) to reduce the CPU’s workload. To achieve the best performance with a relatively small CPU workload, all modules implemented in GNU Radio were mapped to the Zynq FPGA. The framer, deframer, constellation modulator and polyphase clock synchronizer modules were implemented in the FPGA. Note, originally, only the carrier modulator and the demodulator with Costas loop were implemented in the FPGA.
Additionally, finite impulse response (FIR) filter and numerically controlled oscillator (NCO) modules were implemented in RTL to replace Xilinx hard macros and make the design easy to port to other FPGAs. Let’s discuss the SDR modules implemented in the FPGA.

Framer

An HDLC framer is used to pack input data into frames and perform data serialization for the analog channel. Each frame starts and ends with a specific sequence of bits, namely “01111110”. Address and control fields are added at the beginning of the frame. One frame contains 128 bits of user data, and bit stuffing is used to prevent the start/end sequence from being detected inside the frame content: after every five consecutive ones, a single “0” bit is stuffed into the frame. The address and control fields are not used in this design and are set to constant values. The HDLC framer calculates a CRC-16 checksum using the “8408(hex)” polynomial (the reversed form of “1021(hex)”). The CRC is added to the frame just before the frame end sequence. Figure 6 shows an example of the HDLC frame in the Riviera-PRO waveform viewer.

Figure 6. HDLC frame transmission.

FIR Filter

The FIR filter with an RRC (root-raised-cosine) characteristic is used to filter the demodulated signal. This filter is also used in the constellation modulator to shape the output signal. RRC filters are frequently used in communication systems because they reduce intersymbol interference. The filter is realized using a transposed structure to achieve the best performance using as few FPGA resources as possible. The impulse response of the filter is shown in Figure 7.

Figure 7. FIR filter impulse response

The same filter is also used to remove the carrier component from the output signal (see Figure 8).

Figure 8. Removal of the carrier component.

QPSK Constellation modulator

The purpose of the constellation modulation is segmentation of the input data stream into binary words containing a selected number of bits.
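A minimal Python sketch of this segmentation for QPSK, where each word is two bits wide and is turned into one pair of I and Q values. The function names and the Gray-coded mapping (bit 0 to +1, bit 1 to -1, first bit on I, second bit on Q) are assumptions chosen for illustration; the mapping actually used in the FPGA design is not stated in this post.

```python
from typing import List, Tuple

Symbol = Tuple[int, int]  # (I, Q) pair


def qpsk_map(bits: List[int]) -> List[Symbol]:
    """Split a bit stream into two-bit words and map each word to (I, Q).

    Assumed mapping (hypothetical): bit 0 -> +1, bit 1 -> -1,
    first bit of the word drives I, second bit drives Q.
    """
    if len(bits) % 2 != 0:
        raise ValueError("QPSK needs an even number of bits")
    return [(1 - 2 * bits[k], 1 - 2 * bits[k + 1])
            for k in range(0, len(bits), 2)]


def qpsk_demap(symbols: List[Symbol]) -> List[int]:
    """Inverse decision: the sign of I and of Q gives back the two bits."""
    bits: List[int] = []
    for i_val, q_val in symbols:
        bits.append(0 if i_val >= 0 else 1)
        bits.append(0 if q_val >= 0 else 1)
    return bits


if __name__ == "__main__":
    data = [0, 0, 0, 1, 1, 0, 1, 1]
    symbols = qpsk_map(data)
    print(symbols)                 # four symbols for eight bits
    print(qpsk_demap(symbols) == data)
```

With this mapping, neighbouring constellation points differ in exactly one bit, so a decision error between adjacent points corrupts a single bit.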
In the case of QPSK (Quadrature Phase Shift Keying) modulation the input data stream is divided into two-bit words. For each word a unique pair of analog signals is created, namely the In-phase (I) and Quadrature (Q) signals. Together, the I and Q signals represent a symbol. Each symbol represents the binary word encoded using a discrete set of values of the I and Q signals. In the case of QPSK modulation two bits are transferred per symbol, which means the symbol rate is half of the input bit rate. The symbols are frequently shown in the form of a constellation diagram (see Figure 9).

Figure 9. QPSK Constellation Diagram

The constellation diagram uses a rectangular coordinate system, where the X-axis represents values of the I signal and the Y-axis represents values of the Q signal. Usually, the I and Q signals take several discrete values. Each pair of discrete values denotes a point on the diagram representing a symbol.

Figure 10. Constellation modulator I and Q output signals.

The constellation modulator must also shape the output signals. The bandwidth of the I/Q signals must be narrowed by smoothing edges to eliminate sudden changes. The RRC filter is typically used to shape the signals. Additionally, the selected number of samples per symbol must be produced to correctly generate the signals. An example of I/Q signals is shown in Figure 10.

NCO

The NCO generates two carrier waves, a sine and a cosine. The output frequency of the NCO can be changed by loading a frequency control word (FCW). The following formula shows the relation between the output frequency, the FCW and the input frequency of the NCO. The formula image is missing from this copy of the post; for a phase-accumulator NCO with an N-bit accumulator the relation takes the standard form:

$$f_{out} = \frac{FCW \cdot f_{in}}{2^{N}}$$

Figure 11. NCO Output Wave.

Figure 11 shows an example NCO output wave.

Carrier Modulator

This takes the sine and cosine waves generated by the NCO and the I and Q signals produced by the constellation modulator. The sine wave is amplitude modulated by the I signal and the cosine wave is amplitude modulated by the Q signal.
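The NCO and carrier modulator described above can be sketched in Python as follows. This is an illustrative model only: it assumes a phase-accumulator NCO with the usual relation f_out = FCW * f_in / 2^N, a 32-bit accumulator, and `math.sin`/`math.cos` in place of the lookup table a hardware NCO would typically use. The class and function names, the accumulator width and the 48 kHz clock in the usage lines are all hypothetical and not taken from the RTL.

```python
import math
from typing import List, Tuple


def output_frequency(fcw: int, f_in: float, acc_bits: int = 32) -> float:
    """Assumed relation: f_out = FCW * f_in / 2**N for an N-bit accumulator."""
    return fcw * f_in / (1 << acc_bits)


class Nco:
    """Phase-accumulator NCO model producing one (sine, cosine) pair per step."""

    def __init__(self, fcw: int, acc_bits: int = 32) -> None:
        self.fcw = fcw
        self.acc_bits = acc_bits
        self.phase = 0  # accumulator value, wraps at 2**acc_bits

    def step(self) -> Tuple[float, float]:
        angle = 2.0 * math.pi * self.phase / (1 << self.acc_bits)
        # Advance the accumulator by the frequency control word.
        self.phase = (self.phase + self.fcw) % (1 << self.acc_bits)
        return math.sin(angle), math.cos(angle)


def modulate(i_samples: List[float], q_samples: List[float], nco: Nco) -> List[float]:
    """I scales the sine carrier, Q scales the cosine carrier, then both are summed."""
    out = []
    for i_val, q_val in zip(i_samples, q_samples):
        sine, cosine = nco.step()
        out.append(i_val * sine + q_val * cosine)
    return out


if __name__ == "__main__":
    fcw = 1 << 30                            # a quarter of the accumulator range
    print(output_frequency(fcw, 48_000.0))   # carrier at f_in / 4
    print(modulate([1, 1, 1, 1], [1, 1, 1, 1], Nco(fcw)))
```

Loading a different FCW changes the carrier frequency without touching the rest of the model, which mirrors how the hardware NCO is retuned.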
The final QPSK signal is obtained by adding the two modulated waves. The signal can then be passed to a DAC, ready to be sent over an analog channel.

Figure 12. QPSK Signal

ADC/DAC converters

In our project, both the ADC and the DAC were available on the FMC-compliant expansion card connected to the TySOM board. Our SDR design contains state machines to configure the converters and communicate with them. The samples of the modulated signal were passed to the DAC. The output of the DAC was connected to the input of the ADC to create a closed loop for testing the system. Samples obtained from the ADC were passed to the receiving section of our SDR design.

Carrier Demodulator with Costas loop

The modulated signal sampled by the ADC was passed to the Carrier Demodulator module. Note, the demodulation process is similar to the modulation process described above. Here, the modulated signal was multiplied by the carrier sine and cosine waves. To demodulate the signal, the sine and cosine must be phase aligned with the carriers that were used during modulation. The process of recovering the carriers from the modulated signal is crucial for demodulation. We used a Costas loop in our design to recover the carriers and demodulate the signal. A PD (phase detector), loop filter, PD regulator and NCO are all important parts of the Costas loop. Correctly configured, a Costas loop can produce coherent carriers and correctly demodulate QPSK signals. Figure 13 shows the I and Q signals recovered through demodulation.

Figure 13. “I” and “Q” signals separated. The figure shows the signals before and after filtering.

Constellation Demodulator with Symbol Synchronizer module

A symbol synchronizer module performs timing synchronization for modulated signals to select the samples that best represent any given symbol. Each symbol is represented using multiple samples of the two analog signals I and Q.
The symbol synchronizer analyzes all input samples, selects the point in time when the analog signals are stable and captures a pair of samples, one each of I and Q, that represents the current symbol. The idea is to find the center of a given symbol and capture the samples in the middle of its duration. Note, the symbol synchronizer should avoid capturing samples close to the edges of the input signals (where the input signals change their values), as it is possible to accidentally capture glitches there.

Figure 14. Selecting I/Q samples and mapping to binary values.

Once the I and Q samples are correctly selected, the constellation demodulator remaps the sample values into binary data. This operation is simply the reverse of the constellation modulation operation. The output of the constellation demodulator is a binary stream passed to the deframer.

Deframer

The deframer recognizes the incoming HDLC frames and decomposes them to extract the transferred data. It must recognize the start and stop fields, revert the bit stuffing and perform a CRC check which, if it passes, means the data can be passed to the user’s application. Frames with incorrect CRCs are rejected. As mentioned in the framer description, the address and control fields are ignored in our design because the frames are sent between just two devices.

Summary

The SDR design was originally modeled in the GNU Radio framework to check the correctness of its functionality before implementation in the FPGA. The SDR design block diagram in the GNU Radio framework is shown in Figure 15.

Figure 15. SDR Design Diagram in GNU Radio Framework

The design modules were moved one by one to the FPGA. The availability of the GNU Radio framework on embedded Linux operating systems, such as Ubuntu, was very important in our project, as it gave us great flexibility, for example in deciding which modules to run in hardware and which to run in software.
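As a software reference for two of the modules that were moved, the framer and deframer behaviour described earlier can be sketched in Python. The flag sequence, the 128-bit payload, the stuffing rule and the 0x8408 polynomial come from the text above. Everything else is an assumption made for illustration: the address and control constants, the CRC initial value and final inversion (taken from common HDLC practice), the LSB-first order of the CRC bits, and the choice to compute the CRC over address, control and payload.

```python
from typing import List, Optional

FLAG = [0, 1, 1, 1, 1, 1, 1, 0]          # frame start/end sequence
ADDRESS = [1] * 8                         # hypothetical constant
CONTROL = [1, 1, 0, 0, 0, 0, 0, 0]        # hypothetical constant
PAYLOAD_BITS = 128


def crc16(bits: List[int], poly: int = 0x8408) -> int:
    """Bit-serial reflected CRC-16; init 0xFFFF and final XOR are assumptions."""
    crc = 0xFFFF
    for bit in bits:
        feedback = (crc ^ bit) & 1
        crc >>= 1
        if feedback:
            crc ^= poly
    return crc ^ 0xFFFF


def stuff(bits: List[int]) -> List[int]:
    """Insert a 0 after every five consecutive ones."""
    out, run = [], 0
    for bit in bits:
        out.append(bit)
        run = run + 1 if bit else 0
        if run == 5:
            out.append(0)
            run = 0
    return out


def unstuff(bits: List[int]) -> Optional[List[int]]:
    """Remove stuffed zeros; return None if a sixth consecutive one appears."""
    out, run, expect_stuffed = [], 0, False
    for bit in bits:
        if expect_stuffed:
            if bit != 0:
                return None
            expect_stuffed, run = False, 0
            continue
        out.append(bit)
        run = run + 1 if bit else 0
        expect_stuffed = run == 5
    return out


def build_frame(payload: List[int]) -> List[int]:
    if len(payload) != PAYLOAD_BITS:
        raise ValueError("payload must be 128 bits")
    body = ADDRESS + CONTROL + payload
    crc = crc16(body)
    crc_bits = [(crc >> k) & 1 for k in range(16)]   # LSB first (assumption)
    return FLAG + stuff(body + crc_bits) + FLAG


def parse_frame(frame: List[int]) -> Optional[List[int]]:
    """Return the payload, or None if flags, stuffing, length or CRC are wrong."""
    if len(frame) < 16 or frame[:8] != FLAG or frame[-8:] != FLAG:
        return None
    body = unstuff(frame[8:-8])
    if body is None or len(body) != 16 + PAYLOAD_BITS + 16:
        return None
    data, crc_bits = body[:-16], body[-16:]
    received = sum(bit << k for k, bit in enumerate(crc_bits))
    return data[16:] if crc16(data) == received else None
```

A frame built with `build_frame` should come back unchanged from `parse_frame`, while a frame with a flipped bit is rejected, which is the behaviour the deframer section describes for frames with incorrect CRCs.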
Our TySOM-3 board (the TySOM-3-ZU7EV) with its quad-core ARM Cortex-A53 performed well with our SDR design, and we were able to move some modules to the FPGA to reduce the CPU’s workload (freeing it for other compute-intensive applications needed in SDR systems).

It is worth mentioning that the GNU Radio framework will soon be integrated with the Open Component Portability Infrastructure (OpenCPI). OpenCPI allows applications to be executed on heterogeneous platforms. This means that GNU Radio components will be able to run on CPUs, GPUs and FPGAs without any manual mapping. Like GNU Radio, the OpenCPI infrastructure is component-based. Each component may have various implementations for different platforms (as mentioned, CPUs, GPUs or FPGAs). A user may select a suitable platform for each component, and OpenCPI automatically maps the selected components to the given platform and establishes appropriate communication between the platforms in use.

Our SDR design required extensive work to map GNU Radio components to the FPGA platform. Integrating the GNU Radio framework with OpenCPI will make life much easier. Each GNU Radio component should have an FPGA implementation. Mapping a component to the FPGA will be as easy as clicking on it and selecting the FPGA platform in the component options. The process of mapping selected components to the FPGA will be automated by OpenCPI, which will also configure the communication between components executed on various platforms.

Also worthy of note is that OpenCPI is not limited to CPU, GPU or FPGA platforms. Aldec’s Riviera-PRO HDL simulator is already integrated with OpenCPI (this was announced to the industry trade press in June 2022 and covered by EE Journal) and can be selected as a platform for executing modules implemented in VHDL, Verilog or SystemVerilog. This means the Riviera-PRO simulator will appear in the OpenCPI environment as a debugging platform.
A user can therefore select Riviera-PRO and map a component to the HDL simulator to debug the component’s implementation in the real OpenCPI environment. This significantly improves the design and verification process for components implemented in HDL. Also, the OpenCPI environment allows multiple instances of the Riviera-PRO HDL simulator to be run, so a user may debug many components simultaneously using multiple Riviera-PRO sessions. Aldec Riviera-PRO is a leading HDL simulator on the market and it is prepared ahead of time to work with upcoming state-of-the-art verification environments developed for safety-critical applications.

Tags: ARM, Auto, Design, FPGA, Embedded, TySOM, SoC, Xilinx