Channel: Xilinx Wiki : Xilinx Wiki - all changes

HDMI FrameBuffer Example Design

...
% cd $HDMI_HOME/pl
% vivado -s ./design/setup.tcl
...
hardware design.
Click "Generate Bitstream"
...
be generated.
Click "File->Export->Export Hardware". Check the box that says "Include bitstream" and click OK. This will generate an archive file (.hdf) as shown below: hdmi_project/hdmi_example_zcu102.sdk/hdmi_example_zcu102_wrapper.hdf

Copy the generated hdf file to the PetaLinux hw-description folder and rename it to system.hdf
% cp hdmi_project/hdmi_example_zcu102.sdk/hdmi_example_zcu102_wrapper.hdf $HDMI_HOME/apu/petalinux_bsp/project-spec/hw-description/system.hdf
5.2.2 PetaLinux BSP
This tutorial shows how to build the Linux image and boot image using the PetaLinux build tool.
The petalinux-config step can be skipped because a pre-configured BSP is provided.
The design hardware file should be available in the hw-description folder.
Build the project image file along with the rootfs:
% petalinux-build

HDMI FrameBuffer Example Design

...
This will open the Vivado GUI and generate the hardware design.
Click "Generate Bitstream" under "PROGRAM AND DEBUG" tab and wait for the bit-stream to be generated.
...
file (.hdf) in the following location: hdmi_project/hdmi_example_zcu102.sdk/hdmi_example_zcu102_wrapper.hdf
Copy the generated hdf file to the PetaLinux hw-description folder and rename to system.hdf
% cp hdmi_project/hdmi_example_zcu102.sdk/hdmi_example_zcu102_wrapper.hdf $HDMI_HOME/apu/petalinux_bsp/project-spec/hw-description/system.hdf
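The copy step above can be wrapped in a small guard so that a missing export is caught before the PetaLinux build starts. This is only a sketch: stage_hdf is a hypothetical helper name, not part of the design files, and it assumes the PetaLinux project keeps its hardware description under project-spec/hw-description as in the command above.

```shell
# Hypothetical helper (not part of the design files): copy an exported
# hardware archive into a PetaLinux project as system.hdf.
#   $1 = path to the exported .hdf file
#   $2 = root of the PetaLinux project
stage_hdf() {
    src=$1
    dest_dir=$2/project-spec/hw-description
    if [ ! -f "$src" ]; then
        echo "error: $src not found; export the hardware from Vivado first" >&2
        return 1
    fi
    if [ ! -d "$dest_dir" ]; then
        echo "error: $dest_dir does not exist" >&2
        return 1
    fi
    # The copy also performs the rename to system.hdf.
    cp "$src" "$dest_dir/system.hdf"
}
```

It could be called as stage_hdf hdmi_project/hdmi_example_zcu102.sdk/hdmi_example_zcu102_wrapper.hdf "$HDMI_HOME/apu/petalinux_bsp".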

HDMI FrameBuffer Example Design

...
Design hardware file should be available in the hw-description folder
Build the project image file along with the rootfs
% cd $HDMI_HOME/apu/petalinux_bsp
% petalinux-build
Create a boot image

Zynq UltraScale+ MPSoC Ubuntu part 1 - Running the Pre-Built Ubuntu Image and Power Advantage Tool

...
Micro USB to Standard USB cable.
4K or 1080p Display Port Monitor and DisplayPort Cable.
...
HDMI Cable Note: XaoS Mandelbrot will not display on HDMI. Alternatively, use mbrot.sh script.
USB 3.0 connector or USB 2.0 micro cable to standard USB female adapter, plus a USB hub to connect a USB mouse and USB keyboard (or a USB keyboard with an integrated mouse).
Setup:
...
The installers are located in the C:\ZynqUS_Demos\tools directory
Install Tera Term terminal application by double clicking teraterm-4.87.exe
...
double clicking AutoHotkey112203_Install.exe, which can also be found on the AutoHotkey webpage.
Install the CP210x USB to UART bridge driver as follows: navigate to the C:\ZynqUS_Demos\tools\CP210x_VCP_Windows directory and run CP210xVCPInstaller_x64.exe
Note: This might ask you to restart the machine. Perform the next step before restarting the machine

Zynq UltraScale MPSoC Software Acceleration TRD 2017.3

Revision History
This wiki page complements the 2017.2 version of the Software Acceleration TRD. For other versions, refer to the Zynq UltraScale+ MPSoC Software Acceleration TRD overview page.
Change Log:
Update all projects, IPs, and tools versions to 2017.2
Remove hard-TPG from design
Use SDSoC based function instead of linux based function to measure performance.
Move NE10 headers and library to platform.
Performance improvement in all computation engines, specifically in RPU as Coproc (~1100us -> ~800us) and APU-PL (~120us -> ~70us)
Simplified build-steps for better user-experience
Various fixes and clean-up
Introduction
This wiki page contains information on how to build various components of the Zynq UltraScale+ MPSoC Software Acceleration Targeted Reference Design (TRD), version 2017.2. The page also has information on how to set-up the hardware and software platforms and run the design using the ZCU102 evaluation kit (Rev 1.0 with ES2 or Production silicon).
About the TRD
The Software Acceleration TRD is an embedded signal processing application designed to showcase various features and capabilities of the Zynq UltraScale+ MPSoC ZU9EG device for the embedded domain. The TRD consists of two elements: the Zynq UltraScale+ MPSoC Processing System (PS) and a signal processing application implemented in Programmable Logic (PL). The MPSoC allows you to implement a signal processing algorithm that performs a Fast Fourier Transform (FFT) on samples either as a software program running on the PS or as a hardware accelerator inside the PL. The samples come from a Test Pattern Generator (TPG) in the Application Processing Unit (APU) or from System Monitoring (SYSMON) through an external channel. The design has three accelerator cores generated using SDx for computing 4096, 16384, and 65536 point FFTs. The data transfers of the SDx accelerators are controlled by the APU. There is one accelerator (FFT IP from the Vivado IP catalog) for the 4096 point FFT controlled by the Real-Time Processing Unit (RPU). The TRD demonstrates how to seamlessly switch between a software and a hardware implementation and how to evaluate the cost and benefit of each. The TRD also demonstrates the value of offloading computation-intensive tasks onto the PL, thereby freeing CPU resources for user-specific applications.
For detailed information on the complete feature set, or hardware and software architecture of the design, please refer to the TRD user guide here.
Download the TRD
This TRD has been tested on Rev 1.0 of the ZCU102 board with ES2 silicon and Production silicon. The current design does not support ES1 silicon.
The following design files can be downloaded from here.
ES2 silicon : rdf0435-zcu102-es2-swaccel-trd-2017-2.zip
Production silicon : rdf0376-zcu102-swaccel-trd-2017-2.zip
TRD Directory Structure and Package Contents
The Software Acceleration TRD package is released with the source code, Vivado project, SDK projects, and an SD card image that enables you to run the demonstration and software application. It also includes the binaries necessary to configure and boot the ZCU102 board. Before running the steps described on this wiki page, download the TRD package and extract its contents to a directory referred to as 'TRD_HOME', which is the home directory.
{Soft_Acc_17p2_es2dir3.png}
The table below describes the content of each directory in detail.
Folder/file | Description
apu | Contains the software source files
petalinux | Contains the PetaLinux project's configuration
Qt_gui | Contains GUI sources
zcu102_fft | SDx folder containing the hardware platform, pfm files and FFT accelerator C sources.
rpu |
swaccel_r5_firmware | Contains SDK project for building RPU firmware
sdcard | Contains ready to test binaries
BOOT.BIN | BIN file containing FSBL, PL bitstream, U-Boot and ARM Trusted Firmware
image.ub | Kernel image
libfft.so | FFT accelerator shared-object
r5FFT.elf | R5 FFT computation firmware
README.txt | Contains design version history, steps to implement the design, and Vivado and PetaLinux versions to be used to build the design.
swaccel_qt | Qt GUI application.
THIRD_PARTY_NOTICES.zip | Contains the Copyright text for third-party libraries
IMPORTANT_NOTICE_CONCERNING_THIRD_PARTY-CONTENT.txt | Contains information about the third party licences
Pre-requisites:
ZCU102 Evaluation Kit (Rev 1.0 with ES2 or Production silicon)
A Linux development PC with following tools installed:
Xilinx Vivado Design Suite 2017.2
Xilinx SDx 2017.2
Petalinux 2017.2
Distributed version control system Git installed. For information, refer to the Xilinx Git wiki.
GNU make utility version 3.81 or higher.
Running the Demo
This section provides step by step instructions on bringing up the ZCU102 board for demonstration of the TRD and running different options from the graphical user interface (GUI).
The binaries required to run the design are in the $TRD_HOME/sdcard folder. It also includes the binaries necessary to configure and boot the ZCU102 board.
Before running the demo:
Format the SD-MMC card as FAT32 using an SD-MMC card reader. Copy the contents of $TRD_HOME/sdcard onto the primary partition of the SD-MMC card.
PetaLinux console login details are:
user: root
password: root
Hardware Setup Requirements
Requirements for the TRD demo setup:
The ZCU102 Evaluation Kit (Rev 1.0 with ES2 or Production silicon)
AC power adapter (12 VDC)
Optional: A USB Type-A to USB Micro-B cable (for UART communication) and a Tera Term Pro (or similar) UART terminal program.
USB-UART drivers from Silicon Labs
USB Micro-B to female Adaptor with USB hub is needed for connecting a mouse.
USB mouse
4K monitor with Display Port support
Certified Display Port cable (version 1.2); TRD tested with a 6 feet long Cable Matters cable (E342987)
Optional, required only for testing with external audio input:
XA3 SYSMON Headphone Adapter card from Faster Technology
An audio source like MP3 player
An aux cable with 3.5mm male jack on both ends.
An SD-MMC flash card, formatted as FAT32, containing the TRD binaries. The SD-MMC card should have the required binaries in its primary partition. Copy the binaries from the sdcard folder of the TRD zip file. The required binaries are:
BOOT.BIN
image.ub
libfft.so
r5FFT.elf
swaccel_qt
Note: TRD supports Ultra HD (4K) and Full HD (1080p) resolutions. The binaries provided in the sdcard folder have been tested with ViewSonic (4K), ASUS (4K), Acer (4K) and Dell-P2414Hb (1080p) display monitors. However, the binaries should work well with any Display Port certified monitors supporting 4K/1080p resolution in its EDID database. Please make sure to use a DP certified 1.2 version of the cable for connecting the ZCU102 board to the monitor.
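Before inserting the card, the list of required binaries above can be checked with a short script. This is a minimal sketch: check_sdcard is a hypothetical helper, and the directory passed to it (for example the mount point of the card's primary partition) is an assumption about the host setup, not something the TRD defines.

```shell
# Hypothetical helper: report which of the required TRD binaries are missing.
#   $1 = directory to check, e.g. where the SD card's primary partition is mounted
check_sdcard() {
    dir=$1
    missing=0
    for f in BOOT.BIN image.ub libfft.so r5FFT.elf swaccel_qt; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    # Returns 0 only when every required file is present.
    return "$missing"
}
```

The same check can be pointed at $TRD_HOME/sdcard to confirm the extracted package is complete before copying.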
Board Setup
Connect various cables to the ZCU102 board as shown in the following steps.
{setup-1.jpg}
{2016-05-13 15.33.50.jpg}
1. Connect a 4K monitor to the DP port on ZCU102 using DP 1.2 cable.
2. Connect a USB mouse to the Micro-B USB connector (Jumper J96 on ZCU102 board).
3. Optional: Connect a USB Micro-B cable to the micro USB port (J83) labeled USB UART on the ZCU102 board, and the USB Type-A end of the cable to an open USB port on the host PC for UART communication.
4. Connect the power supply to the ZCU102 board. Do not switch the power ON.
5. Optional: Plug the XA3 Adapter card into the SYSMON header on the ZCU102 board (J3). Connect jumpers J5 and J4 on the XA3 card as shown in the figure below.
{IMG_20160518_143720.jpg}
6. Optional: Connect the 3.5mm auxiliary cable to XA3 card and audio source. One end connects to audio source and the other end connects to 3.5mm female connector on XA3 card.
7. Insert an SD-MMC memory card, which contains the TRD binaries, into the SD receptacle on the ZCU102 board.
8. Make sure the DIP switches (SW6) are set as shown in the figure below, which allows the ZCU102 board to boot from the SD-MMC card.
{sa_2017p1_bm.png}
9. Optional: Open a serial communication terminal program such as Tera Term, and set up a new serial connection as shown in the figure below.
{teraterm_2.png}
Click on "New Connection", select Interface 0, and click OK (as shown in the figure below).
{teraterm_1.png}
Click on Setup -> Serial Port and configure the port as shown in the figure below.
{teraterm_4.png}
The following output appears on the serial terminal.
{teraterm_3.png}
After the Linux boot is complete, the PetaLinux login prompt appears, as shown in the figure below.
{teraterm_5.png}
Run QT GUI application
A Linux application with a Qt-based GUI is provided with the package included on the SD-MMC memory card. This application lets the user exercise different modes of the demonstration. The user can select Test Pattern Generator (TPG) samples or an external audio source (requires the XA3 adapter card, aux cable and audio source for testing).
The user can choose to perform the FFT computation on the APU (run as software code on the PS) or in the PL (run in the FPGA fabric as a hardware IP core).
The user can also apply various windowing techniques to the input samples before performing the FFT.
Powering on the QT-based GUI application demo
Make sure the monitor is set for DP Ultra HD (4K) resolution.
Turn on power switch (J52)
Note: The Linux image and Qt based GUI application will be loaded from the SD-MMC memory card.
The Linux image will load and the frame buffer console is displayed on the monitor.
The Qt based GUI will load
When the GUI starts up, the demonstration starts with FFT being computed by software running in APU on samples coming from TPG in PL.
Running the Qt-based GUI application demo
Exercise different options by pressing the buttons available in the GUI to evaluate the different use cases mentioned below.
{sa_20171_1.jpg}
Test Start/Pause
The demonstration can be paused at any instant by clicking the Pause button, as shown in the figure below.
{IMG_20160511_115138.jpg}
Input Source
There are two sources of data samples.
Use case | Input source
1 | Test Pattern Generator (TPG in software). This is the default option.
2 | External audio input (through XA3 SYSMON Headphone Adapter card)
Note: To test the external audio input (assuming the setup follows the procedure described above), play audio from the MP3 player or phone. The peak voltage of the audio source depends on the manufacturer. The voltage levels of the samples depend on the volume. If the output voltage of the audio signal goes beyond 1V, the waveform will be clipped. Adjust the volume on the audio source so that the voltage of the samples lies within 1V peak-to-peak.
{sa_20171_2.jpg}
FFT Computation Engine
For the two input sources listed in the table above, the user can select one of the following compute engines for FFT computation.
FFT Compute Engine | Description
APU (default) | FFT computation is done by software running on the APU.
NEON | FFT computation is done by software running on the APU. NEON intrinsic APIs are used for FFT computation to make sure that the instructions are executed on NEON.
APU controlled PL Accelerator | FFT computation is done by the FFT core in Programmable Logic (PL).
RPU as Co-processor | FFT computation is done by software running on the RPU. The APU is involved in moving samples from TPG in PL to PS DDR. Samples from PS DDR are copied to OCM by APU software and that information is passed to the RPU through an OpenAMP channel.
RPU controlled PL Accelerator | FFT computation is done by the PL FFT IP. The RPU controls the AXI DMA transfers to/from the PL FFT core from/to PS DDR. The APU is involved in moving samples from TPG in PL to PS DDR. Samples from PS DDR are copied to OCM by APU software and that information is passed to the RPU through an OpenAMP channel. The PL FFT core fetches samples from OCM, computes the FFT on the samples, and writes the samples back to OCM.
All | Runs the FFT on all engines one at a time. This mode is useful for comparing computation times for various engines.
{IMG_20160511_114619.jpg}
FFT Length
FFT length determines the number of samples on which the FFT computation is performed. The user can run the following FFT sizes.
FFT Size
4096 (default)
16384
65536
{IMG_20160804_143712.jpg}
FFT Window
The user can apply one of the following window functions to the input samples before FFT computation.
Window function
None (Default, No windowing)
Hann
Hamming
Blackman
Blackman Harris
{IMG_20160511_114705.jpg}
Frequency Zoom
The user can select the following Frequency Zoom options.
FFT Zoom option | Description
ZOOM | Selecting this option fixes the units on the frequency axis in the frequency domain plot to 512. This enables users to closely observe the values on the frequency axis. This is 5X zoom.
NONE (default) | No zoom. Selecting this option plots all points on the frequency axis (the number of points equals half of the FFT size).
{IMG_20160511_114726.jpg}
FFT Scale
The user can select different scales for the Voltage/Amplitude axis. This option is important when using an external audio source as input, because the voltage of the samples depends on the volume of the audio signal. Select the scale according to the amplitude of the audio samples. Available options are:
FFT Scale
1V (Default)
0.5V
0.25V
0.1V
{IMG_20160511_114742.jpg}
Sample Rate
The sampling rate of the SYSMON in PL can be changed at run time. Supported sampling rates are:
Sampling Rate
200 kSPS (default)
100 kSPS
50 kSPS
{IMG_20160804_143912.jpg}
Time and Frequency domain plots
The time domain plot shows the samples generated by either the TPG or the external audio source. The number of points in the plot depends on the FFT size.
The frequency domain plot shows the power spectral density (not on a logarithmic scale) as voltage versus frequency bin. The value “Fp” in the extreme right corner of the frequency domain plot gives the frequency bin with the highest energy. The number of frequency bins plotted is half of the FFT size (half because of symmetry for real-valued samples) when “NONE” is selected in the Frequency Zoom control, and 512 when ZOOM is enabled.
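As a rough illustration of what the frequency axis covers, the sketch below computes the number of plotted bins (half the FFT size) and the width of each bin. It assumes the plot uses standard DFT bins, so that the bin width is the sample rate divided by the FFT size; fft_bins is a hypothetical helper name, and the 200 kSPS figure is the default SYSMON rate listed under Sample Rate.

```shell
# Hypothetical helper: print the number of plotted bins and the bin width.
#   $1 = sample rate in samples per second
#   $2 = FFT size
fft_bins() {
    awk -v fs="$1" -v n="$2" \
        'BEGIN { printf "%d bins, %.2f Hz per bin\n", n / 2, fs / n }'
}

fft_bins 200000 4096     # smallest supported FFT size
fft_bins 200000 65536    # largest supported FFT size
```

A larger FFT size gives narrower bins at the same sample rate, at the cost of a longer computation.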
FFT Computation time plot
The time taken for FFT computation by each engine is plotted on the “FFT computation plot”. The average computation times for a 4096 point FFT are listed for reference in the table below:
Computation Engine | Approximate average computation time (us)
APU | 400
APU with Neon as Co-processor | 320
APU controlled PL | 70
RPU | 830*
RPU controlled PL | 140*
* The RPU is running at 500 MHz and the APU is running at 1.1 GHz. The OpenAMP communication latency, which is approximately 100 μs, is also included.
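To compare the engines, the approximate times in the table above can be turned into speed-up factors relative to the APU software FFT. This is a small sketch using only the table's reference numbers; speedup is a hypothetical helper, and measured times on a given board will differ.

```shell
# Hypothetical helper: speed-up of an engine relative to a baseline.
#   $1 = baseline time in microseconds (APU software FFT in the table)
#   $2 = time of the engine being compared, in microseconds
speedup() {
    awk -v base="$1" -v t="$2" 'BEGIN { printf "%.2fx\n", base / t }'
}

speedup 400 320   # APU with Neon as Co-processor
speedup 400 70    # APU controlled PL
speedup 400 140   # RPU controlled PL
speedup 400 830   # RPU (a value below 1 means slower than the APU)
```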
CPU Utilization plot
The APU cluster (A53 cores) utilization is plotted in “CPU Utilization Plot”.
PS-PL Interface Performance plot
The bandwidth utilization of Full Power domain and Low power domain high performance ports is plotted by “PS-PL performance plot”. The write and read throughputs are plotted.
PL Die temperature
The PL Die temperature is read from the SYSMON and displayed on the GUI.
Block Diagram view
The top-level block diagram and the blocks involved in the data path for each of the Input Source and FFT Computation Engine modes are displayed in the bottom right corner of the GUI.
Building the Software components
The following tutorials assume that the $TRD_HOME environment variable has been set as below.
For rev 1.0 with production silicon:
$ export TRD_HOME=</path/to/downloaded/zip-file>/rdf0376-zcu102-swaccel-trd-2017-2
For rev 1.0 with ES2 silicon:
$ export TRD_HOME=</path/to/downloaded/zip-file>/rdf0435-zcu102-es2-swaccel-trd-2017-2
For some modules, the $PETALINUX environment variable needs to be set as well. This is done automatically when you source the PetaLinux settings.sh script (see PetaLinux installation guide).
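Since several later steps depend on these variables, it can help to confirm they are set before starting a build. The following is a minimal sketch; require_env is a hypothetical helper, not a PetaLinux or SDx command.

```shell
# Hypothetical helper: report which of the named environment variables
# are unset or empty. Pass variable names without the leading "$".
require_env() {
    status=0
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "not set: $name" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, require_env TRD_HOME PETALINUX could be run before the RPU and PetaLinux steps, and require_env TRD_HOME PETALINUX SYSROOT before the Qt application build.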
Building RPU firmware using XSDK
Source the SDK tool-chain and execute the following commands:
$ cd $TRD_HOME/rpu/swaccel_r5_firmware
$ xsdk -workspace . &
A welcome screen is displayed as shown in the below figure.
{Soft_Acc_17p2_rpu1.png}
Click 'Import Project' from the welcome screen, browse to the current working directory and make sure the r5FFT, r5FFT_bsp and zcu102_fft_wrapper_hw_platform_0 projects are selected. Click Finish.
{Soft_Acc_17p2_rpu2.png}
The projects build automatically and the build fails; this failure can be ignored because the build succeeds in the next step.
From the menu bar, go to Xilinx Tools > Repositories.
{Soft_Acc_17p2_rpu3.png}
Click on New and specify the path to the repository directory in present working directory. Click Apply and then OK.
{Soft_Acc_17p2_rpu4.png}
Right click on r5FFT_bsp, then click on Board Support Package Settings. Board Support Package Settings window is displayed.
{Soft_Acc_17p2_rpu5.png}
Navigate to Overview > drivers > psu_cortexr5_0. Then append -mfloat-abi=hard to the 'value' field for "extra_compiler_flags".
{Soft_Acc_17p2_rpu6_New.png}
Click OK. It will regenerate BSP sources and build the firmware.
Create “images” directory and copy the generated image.
$ mkdir -p $TRD_HOME/images
$ cp r5FFT/Debug/r5FFT.elf $TRD_HOME/images
Petalinux BSP
This tutorial shows how to build the Linux image using the Petalinux build tool.
$ cd $TRD_HOME/apu/petalinux_bsp
$ petalinux-config --oldconfig
$ cd project-spec/meta-user/recipes-bsp/device-tree/files/
$ cp zcu102-swaccel-dm2.dtsi system-user.dtsi
$ petalinux-build
$ cd -
Copy the generated image.ub to $TRD_HOME/images.
$ cp images/linux/image.ub $TRD_HOME/images
Set the SYSROOT environment variable, required for the application build step.
Note: The command below assumes you are using the default Yocto tmp directory. If you are using a custom Yocto tmp directory, modify the path accordingly.
$ export SYSROOT=$TRD_HOME/apu/petalinux_bsp/tmp/sysroots/plnx_aarch64
Build BitStream and FFT Shared Object using SDSoC
Source the SDx tool-chain and execute the following commands:
$ cd $TRD_HOME/apu/swaccel_app
$ sdx -workspace . &
A welcome screen is displayed as shown in the below figure.
{Soft_Acc_17p2_apu1.png}
Create a new SDx Project (File > New > Xilinx SDx Project…).
{Soft_Acc_17p2_apu2.png}
Enter 'fft' as the project name and click Next.
{Soft_Acc_17p2_apu3.png}
Click 'Add Custom Platform', browse to the $TRD_HOME/apu/zcu102_fft directory and click OK. Select the newly added zcu102_fft (custom) platform for production silicon or ES2 from the list and click 'Next'.
{Soft_Acc_17p2_apu4.png}
Check the 'Shared Library' box and click 'Next'.
{Soft_Acc_17p2_apu5.png}
Select the 'FFT Library' template and click 'Finish'.
{Soft_Acc_17p2_apu6.png}
Change the 'Active build configuration' to Release in the SDx Project Settings window.
{Soft_Acc_17p2_apu7.png}
Right-click the fft project, select 'C/C++ Build Settings'. Navigate to the 'Build Artifacts' tab and add the output prefix 'lib'. Click OK.
{Soft_Acc_17p2_apu8.png}
Right-click the fft project and select 'Build Project'.
Copy the content of the generated sd_card folder to the images directory.
$ cp -r fft/Release/sd_card/* $TRD_HOME/images/
QT-application:
This tutorial shows how to build the Qt application.
Set up the Qt environment and generate a Makefile for the Qt project. Make sure the TRD_HOME, PETALINUX, and SYSROOT environment variables are set before running this step.
$ cd $TRD_HOME/apu/swaccel_app/swaccel_qt
$ source qmake_set_env.sh
$ qmake swaccel_qt.pro -r -spec linux-oe-g++
Create a new SDx workspace.
$ cd ..
$ sdx -workspace . &
Click on File > Import > General > Existing Projects into Workspace. Browse to the current working directory and make sure the "swaccel_qt" project is selected. Click Finish.
{Soft_Acc_17p2_qt1.png}
Right-click the swaccel_qt project and click 'Build Project'.
{Soft_Acc_17p2_qt2.png}
Copy the generated swaccel_qt executable to the images directory.
$ cp swaccel_qt/swaccel_qt $TRD_HOME/images
The user can now follow the Board Setup steps above to start the demo.
Support
To obtain technical support for this reference design, go to the:
Xilinx Answers Database to locate answers to known issues
Xilinx Community Forums to ask questions or discuss technical details and issues. Please make sure to browse the existing topics first before filing a new topic. If you do file a new topic, make sure it is filed in the sub-forum that best describes your issue or question e.g. Embedded Linux for any Linux related questions. Please include "ZCU102 Software Acceleration TRD" and the release version in the topic name along with a brief summary of the issue.

Zynq UltraScale MPSoC Software Acceleration TRD

...
Version | Wiki
2017.3 (EA) | Zynq UltraScale MPSoC Software Acceleration TRD 2017.3
2017.2 (EA) | Zynq UltraScale MPSoC Software Acceleration TRD 2017.2



Zynq UltraScale+ MPSoC VCU 4k60 Design Example with HDMI Tx and Rx

...
Description of Revisions
1.0
...
Mogilipaka &
Rajesh Gugulothu
Initial Release
...
This tutorial contains information about:
How to build all the required components based on the provided source files via detailed step-by-step tutorials.
...
use cases and we have used the GStreamer application to create the pipeline and execute it accordingly.
Use Case 1: HDMI capture pipeline with VCU Encode and streaming
{Us_Case1.JPG}
...
The second ZCU106 board, which is identified by its IP address, runs a GStreamer pipeline which captures the encoded video stream from the network, decodes the encoded video packets, and displays the result on the HDMI monitor connected to the HDMI transmit interface. We will use the GStreamer application to create and execute this pipeline, following the steps-to-run instructions provided in a later section of this document.
Use Case 2: HDMI capture pipeline with VCU Encode and streaming in bidirectional mode
{UseCase2.JPG}{design_use_case1.JPG}
Overview:
...
HDMI transmit interface. This scenario happens vice versa
We
interface.
In the same way, raw video captured by the HDMI Rx subsystem on the ZCU106 board2 is encoded and streamed out to zcu106 Board2, displayed on the HDMI monitor connected to it. We
will be
Additional material that is not hosted on the tutorial:
Zynq UltraScale+ MPSoC VCU TRD user guide, UG1250: The UG provides the list of features, software architecture and hardware architecture.
...
Change directory to $Design_HOME/pl
To create the Vivado IPI project and invoke the GUI, run the following command.
...
vivado -source zcu106_4k_demo.tcl
On Windows 7:
Click Start > All Programs > Xilinx Design Tools > Vivado 2017.3 > Vivado 2017.3.
...
In the Tcl console type:
cd </path/to/downloaded/zip-file>/Design_files/pl
source zcu106_4k_demo.tcl
{hardwareflow1.png}
After executing the script, the Vivado IPI block design comes up as shown in the figure below.
...
% petalinux-package --boot --bif=vcu.bif
Copy the generated boot image and Linux image to the SD card directory.
...
BOOT.BIN image.ub $Design_HOME/images/
Preparing the SD Cards:
Preparing SD card for Board1

Zynq UltraScale MPSoC Software Acceleration TRD 2017.3


{under.jpg}
Revision History
This wiki page complements the 2017.2 version of the Software Acceleration TRD. For other versions, refer to the Zynq UltraScale+ MPSoC Software Acceleration TRD overview page.
Please ignore this page.
Change Log:
Update all projects, IPs, and tools versions to 2017.2
Remove hard-TPG from design
Use SDSoC based function instead of linux based function to measure performance.
Move NE10 headers and library to platform.
Performance improvement in all computation engines, especifically in RPU as Coproc (~1100us -> ~800us) and APU-PL (~120us -> ~70us)
Simplified build-steps for better user-experience
Various fixes and clean-up
Introduction
This wiki page contains information on how to build various components of the Zynq UltraScale+ MPSoC Software Acceleration Targeted Reference Design (TRD), version 2017.2. The page also has information on how to set-up the hardware and software platforms and run the design using the ZCU102 evaluation kit (Rev 1.0 with ES2 or Production silicon).
About the TRD
The Software Acceleration TRD is an embedded signal processing application designed to showcase various features and capabilities of the Zynq UltraScale+ MPSoC ZU9EG device for the embedded domain. The TRD consists of two elements: The Zynq UltraScale+ MPSoC Processing System (PS) and a signal processing application implemented in Programmable Logic (PL). The MPSoC allows you to implement a signal processing algorithm that performs Fast Fourier Transform (FFT) on samples (coming from Test Pattern Generator (TPG) in Application Processing Unit (APU) or System Monitoring (SYSMON) through an external channel) either as a software program running on the Zynq UltraScale+ MPSoC based PS or as a hardware accelerator inside the PL. The design has three accelerator cores generated using SDx for computing 4096, 16384, and 65536 point FFTs. The data transfers of the SDx accelerators is controlled by the APU. There is one accelerator (FFT IP from the Vivado IP catalog) for 4096 point FFT controlled by the Real-Time Processing Unit (RPU). The TRD demonstrates how to seamlessly switch between a software or a hardware implementation and to evaluate the cost and benefit of each implementation. The TRD also demonstrates the value of offloading computation-intensive tasks onto PL, thereby freeing the CPU resources to be available for user-specific applications.
For detailed information on the complete feature set, or hardware and software architecture of the design, please refer to the TRD user guide here.
Download the TRD
This TRD has been tested on Rev 1.0 of ZCU102 board with ES2 silicon and Production silicon. The Current design doesn't support ES1 silicon.
The following design files can be downloaded from here.
ES2 silicon : rdf0435-zcu102-es2-swaccel-trd-2017-2.zip
Production silicon : rdf0376-zcu102-swaccel-trd-2017-2.zip
TRD Directory Structure and Package Contents
The Software Acceleration TRD package is released with the source code, Vivado project, SDK projects, and an SD card image that enables you to run the demonstration and software application. It also includes the binaries necessary to configure and boot the ZCU102 board. Prior to running the steps mentioned in this wiki page, download the TRD package and extract its contents to a directory referred to as ‘TRD_HOME' which is the home directory.
{Soft_Acc_17p2_es2dir3.png}
The table below describes the content of each directory in detail.
Folder/file
Description
apu
Contains the software source files
petalinux
Contains the PetaLinux project's configuration
Qt_gui
Contains GUI sources
zcu102_fft
SDx folder containg the hardware platform, pfm files and FFT accelerator C sources.
rpu
swaccel_r5_firmware
Contains SDK project for building RPU firmware
sdcard
Contains ready to test binaries
BOOT.BIN
BIN file containing FSBL, PL bitstream, U-boot and ARM trusted firmware
image.ub
Kernel image
libfft.so
FFT accelerator shared-object
r5FFT.elf
R5 FFT computation firmware
README.txt
Contains design version history, steps to implement the design, and Vivado and PetaLinux versions to be used to build the design.
swaccel_qt
Qt GUI application.
THIRD_PARTY_NOTICES.zip
Contains the Copyright text for third-party libraries
IMPORTANT_NOTICE_CONCERNING_THIRD_PARTY-CONTENT.txt
Contains information about the third party licences
Pre-requisites:
ZCU102 Evaluation Kit (Rev 1.0 with ES2 or Production silicon)
A Linux development PC with following tools installed:
Xilinx Vivado Design Suite 2017.2
Xilinx SDx 2017.2
Petalinux 2017.2
Distributed version control system Git installed. For information, refer to the Xilinx Git wiki.
GNU make utility version 3.81 or higher.
Running the Demo
This section provides step by step instructions on bringing up the ZCU102 board for demonstration of the TRD and running different options from the graphical user interface (GUI).
The binaries required to run the design are in the $TRD_HOME/sdcard folder. It also includes the binaries necessary to configure and boot the ZCU102 board.
Before running the demo:
Format the SD-MMC card as FAT32 using a SD-MMC card reader. Copy the contents of the $TRD_HOME/sdcard onto the primary partition of the SD-MMC card.
PetaLinux console login details are;
user: root
password: root
Hardware Setup Requirements
Requirements for theTRD demo setup:
The ZCU102 Evaluation Kit (Rev 1.0 with ES2 or Production silicon)
AC power adapter (12 VDC)
Optional: A USB Type-A to USB Micro-B cable (for UART communication) and a Tera Term Pro (or similar) UART terminal program.
USB-UART drivers from Silicon Labs
USB Micro-B to female adapter with USB hub (needed for connecting a mouse)
USB mouse
4K monitor with DisplayPort support
Certified DisplayPort cable (version 1.2); the TRD was tested with a 6-foot E342987 cable from Cable Matters
Optional, required only for testing with external audio input:
XA3 SYSMON Headphone Adapter card from Faster Technology
An audio source such as an MP3 player
An aux cable with a 3.5mm male jack on both ends
An SD-MMC flash card formatted as FAT32 and containing the TRD binaries. The SD-MMC card should have the required binaries in its primary partition. Copy the binaries from the sdcard folder of the TRD zip file. The required binaries include:
BOOT.BIN
image.ub
libfft.so
r5FFT.elf
swaccel_qt
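Before inserting the card, it can help to confirm that all five binaries listed above were actually copied. The following is a minimal sketch, not part of the TRD package: the function name check_sdcard_files and the mount point in the usage comment are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the TRD): report which of the
# required binaries are missing from a directory, for example the
# mounted primary partition of the SD-MMC card.
check_sdcard_files() {
    dir=$1
    missing=0
    for f in BOOT.BIN image.ub libfft.so r5FFT.elf swaccel_qt; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    # Exit status 0 when every file is present, 1 otherwise.
    return "$missing"
}

# Usage (the mount point is an assumption, adjust to your system):
#   check_sdcard_files /media/sdcard
```

The function prints one line per missing file and returns a non-zero status if anything is absent, so it can be chained with other commands in a script.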
Note: The TRD supports Ultra HD (4K) and Full HD (1080p) resolutions. The binaries provided in the sdcard folder have been tested with ViewSonic (4K), ASUS (4K), Acer (4K) and Dell-P2414Hb (1080p) display monitors. However, the binaries should work with any DisplayPort-certified monitor that lists a 4K or 1080p resolution in its EDID. Make sure to use a certified DP 1.2 cable for connecting the ZCU102 board to the monitor.
Board Setup
Connect various cables to the ZCU102 board as shown in the following steps.
{setup-1.jpg}
{2016-05-13 15.33.50.jpg}
1. Connect a 4K monitor to the DP port on the ZCU102 using a DP 1.2 cable.
2. Connect a USB mouse to the Micro-B USB connector (J96 on the ZCU102 board).
3. Optional: Connect a USB Micro-B cable to the micro USB port (J83) labeled USB UART on the ZCU102 board and the USB Type-A end of the cable to an open USB port on the host PC for UART communication.
4. Connect the power supply to the ZCU102 board. Do not switch the power ON.
5. Optional: Plug the XA3 Adapter card into the SYSMON header on the ZCU102 board (J3). Connect jumpers J5 and J4 on the XA3 card as shown in the figure below.
{IMG_20160518_143720.jpg}
6. Optional: Connect the 3.5mm auxiliary cable to the XA3 card and the audio source. One end connects to the audio source and the other end connects to the 3.5mm female connector on the XA3 card.
7. Insert the SD-MMC memory card, which contains the TRD binaries, into the SD receptacle on the ZCU102 board.
8. Make sure the DIP switches (SW6) are set as shown in the figure below, which allows the ZCU102 board to boot from the SD-MMC card.
{sa_2017p1_bm.png}
9. Optional: Open a serial terminal program such as Tera Term and set up a new serial connection as shown in the figure below.
{teraterm_2.png}
Click "New Connection", select Interface 0, and click OK (as shown in the figure below).
{teraterm_1.png}
Click Setup -> Serial Port and configure the port as shown in the figure below.
{teraterm_4.png}
The following output appears on the serial terminal.
{teraterm_3.png}
After the Linux boot is complete, the PetaLinux login prompt appears, as shown in the figure below.
{teraterm_5.png}
Run the Qt GUI application
A Linux application with a Qt-based GUI is included on the SD-MMC memory card. The application lets you exercise the different modes of the demonstration. You can select either Test Pattern Generator (TPG) samples or an external audio source (which requires the XA3 adapter card, aux cable, and audio source) as input.
You can perform the FFT computation on the APU (run as software on the PS) or in the PL (run in the FPGA fabric as a hardware IP core).
You can also apply various windowing techniques to the input samples before performing the FFT.
Powering on the Qt-based GUI application demo
Make sure the monitor is set for DP Ultra HD (4K) resolution.
Turn on power switch (J52)
Note: The Linux image and Qt-based GUI application are loaded from the SD-MMC memory card.
The Linux image loads and the frame buffer console is displayed on the monitor.
The Qt-based GUI loads.
When the GUI starts up, the demonstration starts with the FFT being computed by software running on the APU on samples coming from the TPG in PL.
Running the Qt-based GUI application demo
Exercise different options by pressing the buttons available in the GUI to evaluate the different use cases mentioned below.
{sa_20171_1.jpg}
Test Start/Pause
The demonstration can be paused at any time by clicking the Pause button, as shown in the figure below.
{IMG_20160511_115138.jpg}
Input Source
There are two sources of data samples:
Use case 1: Test Pattern Generator (TPG in software). This is the default option.
Use case 2: External audio input (through the XA3 SYSMON Headphone Adapter card).
Note: To test external audio (assuming the setup follows the procedure described above), play audio from an MP3 player or phone. The peak voltage of the audio source depends on the manufacturer, and the voltage levels of the samples depend on the volume. If the output voltage of the audio signal goes beyond 1V, the waveform is clipped. Adjust the volume on the audio source so that the voltage of the samples stays within 1V peak-to-peak.
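The 1V limit in the note can be checked numerically if you have a handful of measured sample voltages. The sketch below is illustrative only: peak_to_peak and within_range are hypothetical helper names, and the sample values are made up rather than taken from the TRD.

```shell
#!/bin/sh
# Hypothetical helper: print the peak-to-peak value of the voltage
# samples passed as arguments (in volts), to four decimal places.
peak_to_peak() {
    printf '%s\n' "$@" | awk '
        NR == 1 { min = $1; max = $1 }
        { if ($1 < min) min = $1; if ($1 > max) max = $1 }
        END { printf "%.4f\n", max - min }'
}

# Succeeds when the samples stay within 1V peak-to-peak, which is the
# limit given in the note above; fails otherwise.
within_range() {
    awk -v pp="$(peak_to_peak "$@")" 'BEGIN { exit !(pp <= 1.0) }'
}

# Usage with made-up sample values:
#   within_range 0.1 0.5 0.9 && echo "no clipping expected"
```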
{sa_20171_2.jpg}
FFT Computation Engine
For either of the two input sources listed above, you can select one of the following compute engines for the FFT computation.
FFT compute engine: Description
APU (default): FFT computation is done by software running on the APU.
NEON: FFT computation is done by software running on the APU. NEON intrinsic APIs are used for the FFT computation to make sure that the instructions are executed on NEON.
APU controlled PL Accelerator: FFT computation is done by the FFT core in Programmable Logic (PL).
RPU as Co-processor: FFT computation is done by software running on the RPU. The APU is involved in moving samples from the TPG in PL to PS DDR. Samples from PS DDR are copied to OCM by APU software, and that information is passed to the RPU through an OpenAMP channel.
RPU controlled PL Accelerator: FFT computation is done by the PL FFT IP. The RPU controls the AXI DMA transfers to/from the PL FFT core from/to PS DDR. The APU is involved in moving samples from the TPG in PL to PS DDR. Samples from PS DDR are copied to OCM by APU software, and that information is passed to the RPU through an OpenAMP channel. The PL FFT core fetches samples from OCM, computes the FFT on the samples, and writes the samples back to OCM.
All: Runs the FFT on all engines, one at a time. This mode is useful for comparing computation times across engines.
{IMG_20160511_114619.jpg}
FFT Length
The FFT length determines the number of samples on which the FFT computation is performed. The following FFT sizes are available:
FFT Size
4096 (default)
16384
65536
{IMG_20160804_143712.jpg}
FFT Window
You can apply one of the following window functions to the input samples before the FFT computation.
Window function
None (Default, No windowing)
Hann
Hamming
Blackman
Blackman Harris
{IMG_20160511_114705.jpg}
Frequency Zoom
The following Frequency Zoom options are available:
FFT Zoom option: Description
ZOOM: Fixes the number of points on the frequency axis of the frequency domain plot to 512, which lets you observe the values on the frequency axis more closely. This is 5X zoom.
NONE (default): No zoom. All points are plotted on the frequency axis (the number of points equals half of the FFT size).
{IMG_20160511_114726.jpg}
FFT Scale
You can select different scales for the voltage/amplitude axis. This option is important when using an external audio source as input, because the voltage of the samples depends on the volume of the audio signal. Select the scale according to the amplitude of the audio samples. Available options are:
FFT Scale
1V (Default)
0.5V
0.25V
0.1V
{IMG_20160511_114742.jpg}
Sample Rate
The sampling rate of the SYSMON in PL can be changed at run time. Supported sampling rates are:
Sampling Rate
200 kSPS (default)
100 kSPS
50 kSPS
{IMG_20160804_143912.jpg}
Time and Frequency domain plots
The time domain plot shows the samples generated by either the TPG or the external audio source. The number of points in the plot depends on the FFT size.
The frequency domain plot shows the power spectral density (on a linear, not logarithmic, scale) as voltage versus frequency bin. The value “Fp” in the far right corner of the frequency domain plot indicates the frequency bin with the highest energy. The number of frequency bins plotted is half of the FFT size (half because of the symmetry for real-valued samples) when “NONE” is selected in the Frequency Zoom control, and 512 when ZOOM is selected.
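The bin counts described above, together with the standard DFT relation that one bin spans the sample rate divided by the FFT size, can be worked out with a short script. This is a sketch under assumptions: plotted_bins and bin_width_hz are hypothetical helper names, and the bin-width relation is the general DFT property rather than something stated by the TRD.

```shell
#!/bin/sh
# Number of frequency bins plotted for FFT size n:
# n/2 when zoom is NONE, fixed at 512 when ZOOM is selected.
plotted_bins() {
    n=$1
    zoom=$2
    if [ "$zoom" = "ZOOM" ]; then
        echo 512
    else
        echo $((n / 2))
    fi
}

# Width of one frequency bin in Hz for sample rate fs (samples per
# second) and FFT size n, using the general relation fs / n.
bin_width_hz() {
    awk -v fs="$1" -v n="$2" 'BEGIN { printf "%.2f\n", fs / n }'
}

# Usage: default settings of the GUI (4096 points, 200 kSPS, no zoom).
#   plotted_bins 4096 NONE
#   bin_width_hz 200000 4096
```

With the default 4096-point FFT at 200 kSPS, each bin covers roughly 48.83 Hz under this relation, and larger FFT sizes give proportionally finer resolution.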
FFT Computation time plot
The time taken for the FFT computation by each engine is plotted on the “FFT computation plot”. Approximate average computation times for a 4096-point FFT are listed below for reference:
Computation engine: approximate average computation time (μs)
APU: 400
APU with NEON as co-processor: 320
APU controlled PL: 70
RPU: 830*
RPU controlled PL: 140*
* The RPU runs at 500 MHz and the APU runs at 1.1 GHz. The OpenAMP communication latency, approximately 100 μs, is also included.
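To compare the engines, the reference times above can be turned into speedup factors relative to the plain APU software implementation. The sketch below only does arithmetic on the published figures; the speedup function and the short engine labels are names chosen for illustration.

```shell
#!/bin/sh
# Speedup of an engine relative to a baseline: baseline time divided
# by engine time, both in microseconds.
speedup() {
    awk -v base="$1" -v t="$2" 'BEGIN { printf "%.2f\n", base / t }'
}

# Reference times for the 4096-point FFT from the table above, with the
# APU software implementation (400 us) as the baseline.
for entry in "NEON:320" "APU-PL:70" "RPU:830" "RPU-PL:140"; do
    name=${entry%%:*}
    time_us=${entry#*:}
    echo "$name $(speedup 400 "$time_us")x"
done
```

A factor below 1 means the engine is slower than the APU baseline; note that the RPU figures include the OpenAMP communication latency mentioned in the footnote.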
CPU Utilization plot
The APU cluster (A53 cores) utilization is plotted in “CPU Utilization Plot”.
PS-PL Interface Performance plot
The bandwidth utilization of Full Power domain and Low power domain high performance ports is plotted by “PS-PL performance plot”. The write and read throughputs are plotted.
PL Die temperature
The PL Die temperature is read from the SYSMON and displayed on the GUI.
Block Diagram view
The top-level block diagram and the blocks involved in the data path for each of the input source and FFT computation engine modes are displayed in the bottom right corner of the GUI.
Building the Software components
The following tutorials assume that the $TRD_HOME environment variable has been set as below.
For rev 1.0 with production silicon:
$ export TRD_HOME=</path/to/downloaded/zip-file>/rdf0376-zcu102-swaccel-trd-2017-2
For rev 1.0 with ES2 silicon:
$ export TRD_HOME=</path/to/downloaded/zip-file>/rdf0376-zcu102-es2-swaccel-trd-2017-2
For some modules, the $PETALINUX environment variable needs to be set as well. This is done automatically when you source the PetaLinux settings.sh script (see the PetaLinux installation guide).
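Because the later build steps depend on these variables, a quick check before starting can save time. The following is a minimal sketch; require_env is a hypothetical helper, not a command provided by the TRD or by PetaLinux.

```shell
#!/bin/sh
# Hypothetical helper: verify that each named environment variable is
# set to a non-empty value. Prints one error per unset variable and
# returns non-zero if any is missing.
require_env() {
    status=0
    for name in "$@"; do
        # Indirect lookup of the variable whose name is in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "error: $name is not set" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage:
#   require_env TRD_HOME PETALINUX || echo "set the variables first"
```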
Building RPU firmware using XSDK
Source the SDK tool-chain and execute the following commands:
$ cd $TRD_HOME/rpu/swaccel_r5_firmware
$ xsdk -workspace . &
A welcome screen is displayed as shown in the below figure.
{Soft_Acc_17p2_rpu1.png}
Click 'Import Project' from the welcome screen, browse to the current working directory and make sure the r5FFT, r5FFT_bsp and zcu102_fft_wrapper_hw_platform_0 projects are selected. Click Finish.
{Soft_Acc_17p2_rpu2.png}
The projects build automatically and the build fails. The failure can be ignored because the build succeeds in a later step.
From the menu bar, go to Xilinx Tools > Repositories.
{Soft_Acc_17p2_rpu3.png}
Click New and specify the path to the repository directory in the present working directory. Click Apply and then OK.
{Soft_Acc_17p2_rpu4.png}
Right-click r5FFT_bsp, then click Board Support Package Settings. The Board Support Package Settings window is displayed.
{Soft_Acc_17p2_rpu5.png}
Navigate to Overview > drivers > psu_cortexr5_0. Then append -mfloat-abi=hard to the 'value' field of "extra_compiler_flags".
{Soft_Acc_17p2_rpu6_New.png}
Click OK. It will regenerate BSP sources and build the firmware.
Create the “images” directory and copy the generated image.
$ mkdir -p $TRD_HOME/images
$ cp r5FFT/Debug/r5FFT.elf $TRD_HOME/images
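Since the first automatic build is expected to fail, it is worth confirming that the file copied to the images directory is a real ELF binary. The sketch below checks only the four-byte ELF magic number; is_elf is a hypothetical helper and does not validate the target architecture.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file starts with the ELF magic
# bytes 0x7f 'E' 'L' 'F', fail otherwise.
is_elf() {
    magic=$(dd if="$1" bs=4 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "7f454c46" ]
}

# Usage:
#   is_elf "$TRD_HOME/images/r5FFT.elf" && echo "r5FFT.elf looks like an ELF file"
```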
PetaLinux BSP
This tutorial shows how to build the Linux image using the PetaLinux build tool.
$ cd $TRD_HOME/apu/petalinux_bsp
$ petalinux-config --oldconfig
$ cd project-spec/meta-user/recipes-bsp/device-tree/files/
$ cp zcu102-swaccel-dm2.dtsi system-user.dtsi
$ petalinux-build
$ cd -
Copy the generated image.ub to $TRD_HOME/images.
$ cp images/linux/image.ub $TRD_HOME/images
Set the SYSROOT environment variable, required for the application build step.
Note: The command below assumes you are using the default Yocto tmp directory. If you are using a custom Yocto tmp directory, modify the path accordingly.
$ export SYSROOT=$TRD_HOME/apu/petalinux_bsp/tmp/sysroots/plnx_aarch64
Build Bitstream and FFT Shared Object using SDSoC
Source the SDx tool-chain and execute the following commands:
$ cd $TRD_HOME/apu/swaccel_app
$ sdx -workspace . &
A welcome screen is displayed as shown in the below figure.
{Soft_Acc_17p2_apu1.png}
Create a new SDx Project (File > New > Xilinx SDx Project…).
{Soft_Acc_17p2_apu2.png}
Enter 'fft' as the project name and click Next.
{Soft_Acc_17p2_apu3.png}
Click 'Add Custom Platform', browse to the $TRD_HOME/apu/zcu102_fft directory and click OK. Select the newly added zcu102_fft (custom) platform for production silicon or ES2 from the list and click 'Next'.
{Soft_Acc_17p2_apu4.png}
Check the 'Shared Library' box and click 'Next'.
{Soft_Acc_17p2_apu5.png}
Select the 'FFT Library' template and click 'Finish'.
{Soft_Acc_17p2_apu6.png}
Change the 'Active build configuration' to Release in the SDx Project Settings window.
{Soft_Acc_17p2_apu7.png}
Right-click the fft project, select 'C/C++ Build Settings'. Navigate to the 'Build Artifacts' tab and add the output prefix 'lib'. Click OK.
{Soft_Acc_17p2_apu8.png}
Right-click the fft project and select 'Build Project'.
Copy the contents of the generated sd_card folder to the images directory.
$ cp -r fft/Release/sd_card/* $TRD_HOME/images/
Qt application
This tutorial shows how to build the Qt application.
Set up the Qt environment and generate a Makefile for the Qt project. Make sure the TRD_HOME, PETALINUX, and SYSROOT environment variables are set before running this step.
$ cd $TRD_HOME/apu/swaccel_app/swaccel_qt
$ source qmake_set_env.sh
$ qmake swaccel_qt.pro -r -spec linux-oe-g++
Create a new SDx workspace.
$ cd ..
$ sdx -workspace . &
Click File > Import > General > Existing Projects into Workspace. Browse to the current working directory and make sure the "swaccel_qt" project is selected. Click Finish.
{Soft_Acc_17p2_qt1.png}
Right-click the swaccel_qt project and click 'Build Project'.
{Soft_Acc_17p2_qt2.png}
Copy the generated swaccel_qt executable to the images directory.
$ cp swaccel_qt/swaccel_qt $TRD_HOME/images
You can now follow the Board Setup steps above to start the demo.
Support
To obtain technical support for this reference design, go to the:
Xilinx Answers Database to locate answers to known issues
Xilinx Community Forums to ask questions or discuss technical details and issues. Please make sure to browse the existing topics first before filing a new topic. If you do file a new topic, make sure it is filed in the sub-forum that best describes your issue or question e.g. Embedded Linux for any Linux related questions. Please include "ZCU102 Software Acceleration TRD" and the release version in the topic name along with a brief summary of the issue.

Zynq UltraScale MPSoC Software Acceleration TRD

...
Version
Wiki
2017.4 (EA)
Zynq
...
Acceleration TRD 2017.4
2017.2 (EA)
Zynq UltraScale MPSoC Software Acceleration TRD 2017.2

Zynq UltraScale MPSoC Software Acceleration TRD 2017.4


{under.jpg}
Revision History
This wiki page complements the 2017.2 version of the Software Acceleration TRD. For other versions, refer to the Zynq UltraScale+ MPSoC Software Acceleration TRD overview page.
Change Log:
Update all projects, IPs, and tools versions to 2017.2
Remove hard-TPG from design
Use an SDSoC-based function instead of a Linux-based function to measure performance.
Move NE10 headers and library to platform.
Performance improvement in all computation engines, specifically in RPU as co-processor (~1100us -> ~800us) and APU-PL (~120us -> ~70us)
Simplified build-steps for better user-experience
Various fixes and clean-up
Introduction
This wiki page contains information on how to build various components of the Zynq UltraScale+ MPSoC Software Acceleration Targeted Reference Design (TRD), version 2017.2. The page also has information on how to set up the hardware and software platforms and run the design using the ZCU102 evaluation kit (Rev 1.0 with ES2 or Production silicon).
About the TRD
The Software Acceleration TRD is an embedded signal processing application designed to showcase various features and capabilities of the Zynq UltraScale+ MPSoC ZU9EG device for the embedded domain. The TRD consists of two elements: the Zynq UltraScale+ MPSoC Processing System (PS) and a signal processing application implemented in Programmable Logic (PL). The MPSoC allows you to implement a signal processing algorithm that performs a Fast Fourier Transform (FFT) on samples, either as a software program running on the PS or as a hardware accelerator inside the PL. The samples come from either the Test Pattern Generator (TPG) in the Application Processing Unit (APU) or System Monitoring (SYSMON) through an external channel. The design has three accelerator cores generated using SDx for computing 4096, 16384, and 65536 point FFTs. The data transfers of the SDx accelerators are controlled by the APU. There is one accelerator (FFT IP from the Vivado IP catalog) for the 4096 point FFT, controlled by the Real-Time Processing Unit (RPU). The TRD demonstrates how to switch seamlessly between a software and a hardware implementation and how to evaluate the cost and benefit of each. The TRD also demonstrates the value of offloading computation-intensive tasks onto the PL, thereby freeing CPU resources for user-specific applications.
For detailed information on the complete feature set, or hardware and software architecture of the design, please refer to the TRD user guide here.
Download the TRD
This TRD has been tested on Rev 1.0 of the ZCU102 board with ES2 silicon and Production silicon. The current design does not support ES1 silicon.
The following design files can be downloaded from here.
ES2 silicon : rdf0435-zcu102-es2-swaccel-trd-2017-2.zip
Production silicon : rdf0376-zcu102-swaccel-trd-2017-2.zip
TRD Directory Structure and Package Contents
The Software Acceleration TRD package is released with the source code, Vivado project, SDK projects, and an SD card image that enables you to run the demonstration and software application. It also includes the binaries necessary to configure and boot the ZCU102 board. Before running the steps in this wiki page, download the TRD package and extract its contents to a directory referred to as 'TRD_HOME', which is the home directory.
{Soft_Acc_17p2_es2dir3.png}
The table below describes the content of each directory in detail.
Folder/file
Description
apu
Contains the software source files
petalinux
Contains the PetaLinux project's configuration
Qt_gui
Contains GUI sources
zcu102_fft
SDx folder containing the hardware platform, pfm files and FFT accelerator C sources.
rpu
swaccel_r5_firmware
Contains SDK project for building RPU firmware
sdcard
Contains ready to test binaries
BOOT.BIN
BIN file containing FSBL, PL bitstream, U-boot and ARM trusted firmware
image.ub
Kernel image
libfft.so
FFT accelerator shared-object
r5FFT.elf
R5 FFT computation firmware
README.txt
Contains design version history, steps to implement the design, and Vivado and PetaLinux versions to be used to build the design.
swaccel_qt
Qt GUI application.
THIRD_PARTY_NOTICES.zip
Contains the Copyright text for third-party libraries
IMPORTANT_NOTICE_CONCERNING_THIRD_PARTY-CONTENT.txt
Contains information about the third party licences
Pre-requisites:
ZCU102 Evaluation Kit (Rev 1.0 with ES2 or Production silicon)
A Linux development PC with the following tools installed:
Xilinx Vivado Design Suite 2017.2
Xilinx SDx 2017.2
PetaLinux 2017.2
Git distributed version control system. For information, refer to the Xilinx Git wiki.
GNU make utility version 3.81 or higher.
Running the Demo
This section provides step-by-step instructions for bringing up the ZCU102 board to demonstrate the TRD and for running the different options from the graphical user interface (GUI).
The binaries required to run the design are in the $TRD_HOME/sdcard folder. It also includes the binaries necessary to configure and boot the ZCU102 board.
Before running the demo:
Format the SD-MMC card as FAT32 using an SD-MMC card reader. Copy the contents of the $TRD_HOME/sdcard folder onto the primary partition of the SD-MMC card.
PetaLinux console login details are:
user: root
password: root
Hardware Setup Requirements
Requirements for the TRD demo setup:
The ZCU102 Evaluation Kit (Rev 1.0 with ES2 or Production silicon)
AC power adapter (12 VDC)
Optional: A USB Type-A to USB Micro-B cable (for UART communication) and a Tera Term Pro (or similar) UART terminal program.
USB-UART drivers from Silicon Labs
USB Micro-B to female adapter with USB hub (needed for connecting a mouse)
USB mouse
4K monitor with DisplayPort support
Certified DisplayPort cable (version 1.2); the TRD was tested with a 6-foot E342987 cable from Cable Matters
Optional, required only for testing with external audio input:
XA3 SYSMON Headphone Adapter card from Faster Technology
An audio source such as an MP3 player
An aux cable with a 3.5mm male jack on both ends
An SD-MMC flash card formatted as FAT32 and containing the TRD binaries. The SD-MMC card should have the required binaries in its primary partition. Copy the binaries from the sdcard folder of the TRD zip file. The required binaries include:
BOOT.BIN
image.ub
libfft.so
r5FFT.elf
swaccel_qt
Note: The TRD supports Ultra HD (4K) and Full HD (1080p) resolutions. The binaries provided in the sdcard folder have been tested with ViewSonic (4K), ASUS (4K), Acer (4K) and Dell-P2414Hb (1080p) display monitors. However, the binaries should work with any DisplayPort-certified monitor that lists a 4K or 1080p resolution in its EDID. Make sure to use a certified DP 1.2 cable for connecting the ZCU102 board to the monitor.
Board Setup
Connect various cables to the ZCU102 board as shown in the following steps.
{setup-1.jpg}
{2016-05-13 15.33.50.jpg}
1. Connect a 4K monitor to the DP port on the ZCU102 using a DP 1.2 cable.
2. Connect a USB mouse to the Micro-B USB connector (J96 on the ZCU102 board).
3. Optional: Connect a USB Micro-B cable to the micro USB port (J83) labeled USB UART on the ZCU102 board and the USB Type-A end of the cable to an open USB port on the host PC for UART communication.
4. Connect the power supply to the ZCU102 board. Do not switch the power ON.
5. Optional: Plug the XA3 Adapter card into the SYSMON header on the ZCU102 board (J3). Connect jumpers J5 and J4 on the XA3 card as shown in the figure below.
{IMG_20160518_143720.jpg}
6. Optional: Connect the 3.5mm auxiliary cable to the XA3 card and the audio source. One end connects to the audio source and the other end connects to the 3.5mm female connector on the XA3 card.
7. Insert the SD-MMC memory card, which contains the TRD binaries, into the SD receptacle on the ZCU102 board.
8. Make sure the DIP switches (SW6) are set as shown in the figure below, which allows the ZCU102 board to boot from the SD-MMC card.
{sa_2017p1_bm.png}
9. Optional: Open a serial terminal program such as Tera Term and set up a new serial connection as shown in the figure below.
{teraterm_2.png}
Click "New Connection", select Interface 0, and click OK (as shown in the figure below).
{teraterm_1.png}
Click Setup -> Serial Port and configure the port as shown in the figure below.
{teraterm_4.png}
The following output appears on the serial terminal.
{teraterm_3.png}
After the Linux boot is complete, the PetaLinux login prompt appears, as shown in the figure below.
{teraterm_5.png}
Run the Qt GUI application
A Linux application with a Qt-based GUI is included on the SD-MMC memory card. The application lets you exercise the different modes of the demonstration. You can select either Test Pattern Generator (TPG) samples or an external audio source (which requires the XA3 adapter card, aux cable, and audio source) as input.
You can perform the FFT computation on the APU (run as software on the PS) or in the PL (run in the FPGA fabric as a hardware IP core).
You can also apply various windowing techniques to the input samples before performing the FFT.
Powering on the Qt-based GUI application demo
Make sure the monitor is set for DP Ultra HD (4K) resolution.
Turn on power switch (J52)
Note: The Linux image and Qt-based GUI application are loaded from the SD-MMC memory card.
The Linux image loads and the frame buffer console is displayed on the monitor.
The Qt-based GUI loads.
When the GUI starts up, the demonstration starts with the FFT being computed by software running on the APU on samples coming from the TPG in PL.
Running the Qt-based GUI application demo
Exercise different options by pressing the buttons available in the GUI to evaluate the different use cases mentioned below.
{sa_20171_1.jpg}
Test Start/Pause
The demonstration can be paused at any time by clicking the Pause button, as shown in the figure below.
{IMG_20160511_115138.jpg}
Input Source
There are two sources of data samples:
Use case 1: Test Pattern Generator (TPG in software). This is the default option.
Use case 2: External audio input (through the XA3 SYSMON Headphone Adapter card).
Note: To test external audio (assuming the setup follows the procedure described above), play audio from an MP3 player or phone. The peak voltage of the audio source depends on the manufacturer, and the voltage levels of the samples depend on the volume. If the output voltage of the audio signal goes beyond 1V, the waveform is clipped. Adjust the volume on the audio source so that the voltage of the samples stays within 1V peak-to-peak.
{sa_20171_2.jpg}
FFT Computation Engine
For either of the two input sources listed above, you can select one of the following compute engines for the FFT computation.
FFT compute engine: Description
APU (default): FFT computation is done by software running on the APU.
NEON: FFT computation is done by software running on the APU. NEON intrinsic APIs are used for the FFT computation to make sure that the instructions are executed on NEON.
APU controlled PL Accelerator: FFT computation is done by the FFT core in Programmable Logic (PL).
RPU as Co-processor: FFT computation is done by software running on the RPU. The APU is involved in moving samples from the TPG in PL to PS DDR. Samples from PS DDR are copied to OCM by APU software, and that information is passed to the RPU through an OpenAMP channel.
RPU controlled PL Accelerator: FFT computation is done by the PL FFT IP. The RPU controls the AXI DMA transfers to/from the PL FFT core from/to PS DDR. The APU is involved in moving samples from the TPG in PL to PS DDR. Samples from PS DDR are copied to OCM by APU software, and that information is passed to the RPU through an OpenAMP channel. The PL FFT core fetches samples from OCM, computes the FFT on the samples, and writes the samples back to OCM.
All: Runs the FFT on all engines, one at a time. This mode is useful for comparing computation times across engines.
{IMG_20160511_114619.jpg}
FFT Length
The FFT length determines the number of samples on which the FFT computation is performed. The following FFT sizes are available:
FFT Size
4096 (default)
16384
65536
{IMG_20160804_143712.jpg}
FFT Window
You can apply one of the following window functions to the input samples before the FFT computation.
Window function
None (Default, No windowing)
Hann
Hamming
Blackman
Blackman Harris
{IMG_20160511_114705.jpg}
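For readers unfamiliar with windowing, the sketch below evaluates textbook coefficients for three of the listed windows. It uses the common symmetric definitions with N - 1 in the denominator; whether the TRD uses these exact forms is an assumption, and window_coeff is a hypothetical helper name. Blackman Harris is omitted to keep the example short.

```shell
#!/bin/sh
# Print the window coefficient w[n] for a window of length N, using
# the textbook symmetric definitions (an assumption, not necessarily
# the TRD implementation). Unknown names fall back to no windowing.
window_coeff() {
    awk -v kind="$1" -v n="$2" -v N="$3" 'BEGIN {
        pi = atan2(0, -1)
        x = 2 * pi * n / (N - 1)
        if (kind == "hann")          w = 0.5 - 0.5 * cos(x)
        else if (kind == "hamming")  w = 0.54 - 0.46 * cos(x)
        else if (kind == "blackman") w = 0.42 - 0.5 * cos(x) + 0.08 * cos(2 * x)
        else                         w = 1
        printf "%.4f\n", w
    }'
}

# Usage: coefficient at the centre of a 5-point Hann window.
#   window_coeff hann 2 5
```

Each input sample is multiplied by its coefficient before the FFT, which tapers the ends of the sample block and reduces spectral leakage.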
Frequency Zoom
The following Frequency Zoom options are available:
FFT Zoom option: Description
ZOOM: Fixes the number of points on the frequency axis of the frequency domain plot to 512, which lets you observe the values on the frequency axis more closely. This is 5X zoom.
NONE (default): No zoom. All points are plotted on the frequency axis (the number of points equals half of the FFT size).
{IMG_20160511_114726.jpg}
FFT Scale
You can select different scales for the voltage/amplitude axis. This option is important when using an external audio source as input, because the voltage of the samples depends on the volume of the audio signal. Select the scale according to the amplitude of the audio samples. Available options are:
FFT Scale
1V (Default)
0.5V
0.25V
0.1V
{IMG_20160511_114742.jpg}
Sample Rate
The sampling rate of the SYSMON in PL can be changed at run time. Supported sampling rates are:
Sampling Rate
200 kSPS (default)
100 kSPS
50 kSPS
{IMG_20160804_143912.jpg}
Time and Frequency domain plots
The time domain plot shows the samples generated by either the TPG or the external audio source. The number of points in the plot depends on the FFT size.
The frequency domain plot shows the power spectral density (on a linear, not logarithmic, scale) as voltage versus frequency bin. The value “Fp” in the far right corner of the frequency domain plot indicates the frequency bin with the highest energy. The number of frequency bins plotted is half of the FFT size (half because of the symmetry for real-valued samples) when “NONE” is selected in the Frequency Zoom control, and 512 when ZOOM is selected.
FFT Computation time plot
The time taken for the FFT computation by each engine is plotted on the “FFT computation plot”. Approximate average computation times for a 4096-point FFT are listed below for reference:
Computation engine: approximate average computation time (μs)
APU: 400
APU with NEON as co-processor: 320
APU controlled PL: 70
RPU: 830*
RPU controlled PL: 140*
* The RPU runs at 500 MHz and the APU runs at 1.1 GHz. The OpenAMP communication latency, approximately 100 μs, is also included.
CPU Utilization plot
The APU cluster (A53 cores) utilization is plotted in “CPU Utilization Plot”.
PS-PL Interface Performance plot
The bandwidth utilization of Full Power domain and Low power domain high performance ports is plotted by “PS-PL performance plot”. The write and read throughputs are plotted.
PL Die temperature
The PL Die temperature is read from the SYSMON and displayed on the GUI.
Block Diagram view
The top-level block diagram and the blocks involved in the data path for each of the input source and FFT computation engine modes are displayed in the bottom right corner of the GUI.
Building the Software components
The following tutorials assume that the $TRD_HOME environment variable has been set as below.
For rev 1.0 with production silicon:
$ export TRD_HOME=</path/to/downloaded/zip-file>/rdf0376-zcu102-swaccel-trd-2017-2
For rev 1.0 with ES2 silicon:
$ export TRD_HOME=</path/to/downloaded/zip-file>/rdf0376-zcu102-es2-swaccel-trd-2017-2
For some modules, the $PETALINUX environment variable needs to be set as well. This is done automatically when you source the PetaLinux settings.sh script (see the PetaLinux installation guide).
Building RPU firmware using XSDK
Source the SDK tool-chain and execute the following commands:
$ cd $TRD_HOME/rpu/swaccel_r5_firmware
$ xsdk -workspace . &
A welcome screen is displayed as shown in the below figure.
{Soft_Acc_17p2_rpu1.png}
Click 'Import Project' from the welcome screen, browse to the current working directory and make sure the r5FFT, r5FFT_bsp and zcu102_fft_wrapper_hw_platform_0 projects are selected. Click Finish.
{Soft_Acc_17p2_rpu2.png}
The projects build automatically and the build fails. The failure can be ignored because the build succeeds in a later step.
From the menu bar, go to Xilinx Tools > Repositories.
{Soft_Acc_17p2_rpu3.png}
Click New and specify the path to the repository directory in the present working directory. Click Apply and then OK.
{Soft_Acc_17p2_rpu4.png}
Right-click r5FFT_bsp, then click Board Support Package Settings. The Board Support Package Settings window is displayed.
{Soft_Acc_17p2_rpu5.png}
Navigate to Overview > drivers > psu_cortexr5_0. Then append -mfloat-abi=hard to the 'value' field of "extra_compiler_flags".
{Soft_Acc_17p2_rpu6_New.png}
Click OK. It will regenerate BSP sources and build the firmware.
Create the “images” directory and copy the generated image.
$ mkdir -p $TRD_HOME/images
$ cp r5FFT/Debug/r5FFT.elf $TRD_HOME/images
PetaLinux BSP
This tutorial shows how to build the Linux image using the PetaLinux build tool.
$ cd $TRD_HOME/apu/petalinux_bsp
$ petalinux-config --oldconfig
$ cd project-spec/meta-user/recipes-bsp/device-tree/files/
$ cp zcu102-swaccel-dm2.dtsi system-user.dtsi
$ petalinux-build
$ cd -
Copy the generated image.ub to $TRD_HOME/images.
$ cp images/linux/image.ub $TRD_HOME/images
Set the SYSROOT environment variable, required for the application build step.
Note: The command below assumes you are using the default Yocto tmp directory. If you are using a custom Yocto tmp directory, modify the path accordingly.
$ export SYSROOT=$TRD_HOME/apu/petalinux_bsp/tmp/sysroots/plnx_aarch64
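Before moving on to the application build, it can help to confirm that the variables named in this tutorial are actually set. The following is a minimal sketch, not part of the TRD sources: check_build_env is a hypothetical helper name, and it only checks that TRD_HOME, PETALINUX and SYSROOT are non-empty and that SYSROOT points to an existing directory.

```shell
# Hypothetical helper (not shipped with the TRD): verify the build environment.
check_build_env() {
    cbe_missing=0
    for cbe_name in TRD_HOME PETALINUX SYSROOT; do
        # Look up the value of the variable whose name is stored in $cbe_name.
        eval "cbe_value=\${$cbe_name:-}"
        if [ -z "$cbe_value" ]; then
            echo "error: $cbe_name is not set" >&2
            cbe_missing=1
        fi
    done
    # SYSROOT must also point to a directory that exists.
    if [ -n "${SYSROOT:-}" ] && [ ! -d "$SYSROOT" ]; then
        echo "error: SYSROOT directory not found: $SYSROOT" >&2
        cbe_missing=1
    fi
    # Returns 0 when everything is in place, 1 otherwise.
    return "$cbe_missing"
}
```

After sourcing the function, running check_build_env prints one line per problem and returns a non-zero status if anything is missing.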
Build BitStream and FFT Shared Object using SDSoC
Source the SDx tool-chain and execute the following commands:
$ cd $TRD_HOME/apu/swaccel_app
$ sdx -workspace . &
A welcome screen is displayed as shown in the figure below.
{Soft_Acc_17p2_apu1.png}
Create a new SDx Project (File > New > Xilinx SDx Project…).
{Soft_Acc_17p2_apu2.png}
Enter 'fft' as the project name and click Next.
{Soft_Acc_17p2_apu3.png}
Click 'Add Custom Platform', browse to the $TRD_HOME/apu/zcu102_fft directory and click OK. Select the newly added zcu102_fft (custom) platform for production silicon or ES2 from the list and click 'Next'.
{Soft_Acc_17p2_apu4.png}
Check the 'Shared Library' box and click 'Next'.
{Soft_Acc_17p2_apu5.png}
Select the 'FFT Library' template and click 'Finish'.
{Soft_Acc_17p2_apu6.png}
Change the 'Active build configuration' to Release in the SDx Project Settings window.
{Soft_Acc_17p2_apu7.png}
Right-click the fft project, select 'C/C++ Build Settings'. Navigate to the 'Build Artifacts' tab and add the output prefix 'lib'. Click OK.
{Soft_Acc_17p2_apu8.png}
Right-click the fft project and select 'Build Project'.
Copy the content of the generated sd_card folder to the images directory.
$ cp -r fft/Release/sd_card/* $TRD_HOME/images/
QT-application:
This tutorial shows how to build the Qt application.
Set up the Qt environment and generate a Makefile for the Qt project. Make sure the TRD_HOME, PETALINUX, and SYSROOT environment variables are set before running this step.
$ cd $TRD_HOME/apu/swaccel_app/swaccel_qt
$ source qmake_set_env.sh
$ qmake swaccel_qt.pro -r -spec linux-oe-g++
Create a new SDx workspace.
$ cd ..
$ sdx -workspace . &
Click File > Import > General > Existing Projects into Workspace. Browse to the current working directory and make sure the "swaccel_qt" project is selected. Click Finish.
{Soft_Acc_17p2_qt1.png}
Right-click the swaccel_qt project and click 'Build Project'.
{Soft_Acc_17p2_qt2.png}
Copy the generated swaccel_qt executable to the images directory.
$ cp swaccel_qt/swaccel_qt $TRD_HOME/images
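At this point the images directory should hold the artifacts copied in the steps above. As a quick sanity check, the sketch below looks for the three files this tutorial names explicitly (r5FFT.elf, image.ub and swaccel_qt). check_images is a hypothetical helper, and it deliberately does not check the contents of the sd_card folder, since those file names are not listed here.

```shell
# Hypothetical helper: confirm the files named in this tutorial were copied.
# $1 is the images directory; it defaults to $TRD_HOME/images.
check_images() {
    ci_dir=${1:-${TRD_HOME:-}/images}
    ci_status=0
    for ci_file in r5FFT.elf image.ub swaccel_qt; do
        if [ ! -f "$ci_dir/$ci_file" ]; then
            echo "missing: $ci_dir/$ci_file" >&2
            ci_status=1
        fi
    done
    # Returns 0 only when all three files are present.
    return "$ci_status"
}
```

Running check_images with no argument checks $TRD_HOME/images and prints one line for each file that is not there.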
You can now follow the Board Setup steps above to start the demo.
Support
To obtain technical support for this reference design, go to the:
Xilinx Answers Database to locate answers to known issues
Xilinx Community Forums to ask questions or discuss technical details and issues. Please make sure to browse the existing topics first before filing a new topic. If you do file a new topic, make sure it is filed in the sub-forum that best describes your issue or question e.g. Embedded Linux for any Linux related questions. Please include "ZCU102 Software Acceleration TRD" and the release version in the topic name along with a brief summary of the issue.

Zynq UltraScale+ MPSoC VCU 4k60 Design Example with HDMI Tx and Rx

...
This tutorial contains information about:
How to build all the required components based on the provided source files via detailed step-by-step tutorials.
...
use cases, and we have used the GStreamer application to create the pipeline and execute it accordingly.
Use Case 1: HDMI capture pipeline with VCU Encode and streaming
{Us_Case1.JPG}
...
The second ZCU106 board, which is identified by its IP address, runs a GStreamer pipeline that captures the encoded video stream from the network, decodes the encoded video packets, and displays the video on the HDMI monitor connected to the HDMI transmit interface. We use the GStreamer application to create and run this pipeline, following the steps-to-run instructions provided in a later section of this document.
Use Case 2: HDMI capture pipeline with VCU Encode and streaming in bidirectional mode
{design_use_case1.JPG}{UseCase2.JPG}
Overview:
...
HDMI transmit interface.
In the same way raw video captured by the HDMI Rx subsystem on the ZCU106 board2 is encoded and streamed out to zcu106 Board2 , displayed on the HDMI monitor connected to it.We
interface.This scenario happens vice versa
We
will be
Additional material that is not hosted on the tutorial:
Zynq UltraScale+ MPSoC VCU TRD user guide, UG1250: The UG provides the list of features, software architecture and hardware architecture.
...
Change directory to $Design_HOME/pl
To create the Vivado IPI project and invoke the GUI, run the following command.
...
vivado -source scripts/zcu106_4k_demo.tcl
On Windows 7:
Click Start > All Programs > Xilinx Design Tools > Vivado 2017.3 > Vivado 2017.3.
...
In the Tcl console type:
cd </path/to/downloaded/zip-file>/Design_files/pl
source scripts/zcu106_4k_demo.tcl
{hardwareflow1.png}
After executing the script, the Vivado IPI block design appears as shown in the figure below.
...
% petalinux-package --boot --bif=vcu.bif
Copy the generated boot image and Linux image to the SD card directory.
...
BOOT.BIN image.ub $Design_HOME/images/rev-x
Preparing the SD Cards:
Preparing SD card for Board1
...
Receives the video data from Board1, decodes and displays on 4K monitor connected to HDMI-TX on Board2.
The figure below shows the execution of use case 2.
Appendix A: Determine which COM
Note: Make sure that the ZCU106 board is powered on and the serial UART device USB cable is in place. This ensures that the USB-to-serial bridge is enumerated by the PC host.
Open your computer's Control Panel by clicking on Start > Control Panel.
...
Launch Tera Term and, for both boards, open the COM port that is associated with Silicon Labs Quad CP210x USB to UART Bridge: Interface 0 of the USB-to-serial bridge.
Set the COM port to 115200 baud, 8 data bits, no parity, 1 stop bit.
Appendix A: File Description in Design directory
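The steps above use Tera Term on Windows. If the host is a Linux PC instead, the same 115200 8N1 settings can be applied with stty. This is only a sketch under that assumption: serial_settings is a hypothetical helper, and /dev/ttyUSB0 in the usage comment is a placeholder, since the device name depends on how the USB-to-serial bridge enumerates on your machine.

```shell
# Hypothetical helper: print the stty arguments for the settings in this appendix.
serial_settings() {
    # 115200 baud, 8 data bits (cs8), no parity (-parenb), 1 stop bit (-cstopb)
    printf '%s' "115200 cs8 -parenb -cstopb"
}

# Usage on a Linux host (placeholder device name, not executed here):
#   stty -F /dev/ttyUSB0 $(serial_settings)
```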
Design_files.zip is extracted as
Design_files
prebuilt_binaries - contains ready to test images
Board1 - Images to be copied on Board 1 SD Card
Board2 - Image to be copied on Board 2 SD Card
vcu_petalinux_bsp - contains PetaLinux BSP files
pl - contains Hardware files

Zynq UltraScale MPSoC Software Acceleration TRD


Zynq UltraScale+ MPSoC VCU 4k60 Design Example with HDMI Tx and Rx

...
% petalinux-package --boot --bif=vcu.bif
Copy the generated boot image and Linux image to the SD card directory.
cp BOOT.BIN image.ub $Design_HOME/images/rev-x
Preparing the SD Cards:
Preparing SD card for Board1
...
{switch_Settings.png}
Connect 12V Power to the ZCU106 6-Pin Molex connector.
...
port of Nvidia shield.
Connect one end of the HDMI cable to the board’s HDMI-TX (top) port, and the other end to the HDMI port of the 4K monitor (the port should support HDCP 2.2).
Connect one end of the Ethernet cable to Board1’s J67 connector, and connect the other end to Board2’s J67 connector.
Open the Tera Term utility on the Windows machine and power on the Client board.
...
prebuilt_binaries - contains ready to test images
Board1 - Images to be copied on Board 1 SD Card
Board2 -- Image to
vcu_petalinux_bsp - contains PetaLinux BSP files
pl - contains Hardware files

Zynq UltraScale+ MPSoC VCU 4k60 Design Example with HDMI Tx and Rx


Note: This page is under construction. Expect modification !!!!
Document History :
Date

Zynq UltraScale+ MPSoC VCU 4k60 Design Example with HDMI Tx and Rx


...
is under construction !!!!
Document History :
Date

Zynq UltraScale+ MPSoC VCU 4k60 Design Example with HDMI Tx and Rx

...
The second ZCU106 board, which is identified by its IP address, runs a GStreamer pipeline that captures the encoded video stream from the network, decodes the encoded video packets, and displays the video on the HDMI monitor connected to the HDMI transmit interface. We use the GStreamer application to create and run this pipeline, following the steps-to-run instructions provided in a later section of this document.
Use Case 2: HDMI capture pipeline with VCU Encode and streaming in bidirectional mode
{UseCase2.JPG}{design_use_case1.JPG}
Overview:
In this use case we demonstrate video streaming using two ZCU106 boards. The raw video captured by the HDMI Rx subsystem on ZCU106 Board1 is encoded by the VCU block in H.265 format, packetized into an Ethernet RTP stream by the RTP stack, and sent to ZCU106 Board2. Board2, which is identified by its IP address, runs a GStreamer pipeline that captures the encoded video stream from the network, decodes the encoded video packets, and displays the video on the HDMI monitor connected to the HDMI transmit interface. This scenario happens vice versa
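To make the data flow concrete, one direction of this use case (capture, H.265 encode, RTP over Ethernet, decode, display) could be sketched as a pair of gst-launch-1.0 pipelines. This is an illustration only, not the pipeline shipped with the design: the element choices (v4l2src, omxh265enc, omxh265dec, kmssink), the /dev/video0 device, the IP address and the port are all assumptions, and the real design may need extra caps and element properties. The helper names sender_pipeline and receiver_pipeline are hypothetical.

```shell
# Hypothetical helpers: build pipeline descriptions for one streaming direction.
# The elements and the capture device below are assumptions, not taken from the design.

# $1: IP address of the receiving board, $2: UDP port
sender_pipeline() {
    printf '%s' "v4l2src device=/dev/video0 ! omxh265enc ! h265parse ! rtph265pay ! udpsink host=$1 port=$2"
}

# $1: UDP port to listen on
receiver_pipeline() {
    printf '%s' "udpsrc port=$1 caps=application/x-rtp,media=video,clock-rate=90000,encoding-name=H265 ! rtph265depay ! h265parse ! omxh265dec ! kmssink"
}

# Usage on the boards (placeholder address and port, not executed here):
#   Board1: gst-launch-1.0 $(sender_pipeline 192.168.1.20 5004)
#   Board2: gst-launch-1.0 $(receiver_pipeline 5004)
```

For the bidirectional case, each board would run both a sender and a receiver, using a different port for each direction.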
...
Launch Tera Term and, for both boards, open the COM port that is associated with Silicon Labs Quad CP210x USB to UART Bridge: Interface 0 of the USB-to-serial bridge.
Set the COM port to 115200 baud, 8 data bits, no parity, 1 stop bit.
Appendix B: File Description in
Design_files.zip is extracted as
Design_files

Zynq UltraScale MPSoC Software Acceleration TRD 2017.4


{under.jpg}
Revision History
...
complements the 2017.4 version of
Change Log:
...
versions to 2017.4
Remove hard-TPG from design
Use an SDSoC-based function instead of a Linux-based function to measure performance.
...
Various fixes and clean-up
Introduction
...
(TRD), version 2017.4. The page
About the TRD
The Software Acceleration TRD is an embedded signal processing application designed to showcase various features and capabilities of the Zynq UltraScale+ MPSoC ZU9EG device for the embedded domain. The TRD consists of two elements: the Zynq UltraScale+ MPSoC Processing System (PS) and a signal processing application implemented in Programmable Logic (PL). The MPSoC allows you to implement a signal processing algorithm that performs a Fast Fourier Transform (FFT) on samples, coming either from the Test Pattern Generator (TPG) in the Application Processing Unit (APU) or from System Monitoring (SYSMON) through an external channel, either as a software program running on the Zynq UltraScale+ MPSoC based PS or as a hardware accelerator inside the PL. The design has three accelerator cores generated using SDx for computing 4096, 16384, and 65536 point FFTs. The data transfers of the SDx accelerators are controlled by the APU. There is one accelerator (FFT IP from the Vivado IP catalog) for 4096 point FFT controlled by the Real-Time Processing Unit (RPU). The TRD demonstrates how to seamlessly switch between a software and a hardware implementation and how to evaluate the cost and benefit of each. The TRD also demonstrates the value of offloading computation-intensive tasks onto the PL, thereby freeing CPU resources for user-specific applications.
...
ZCU102 Evaluation Kit (Rev 1.0 with ES2 or Production silicon)
A Linux development PC with the following tools installed:
...
Design Suite 2017.4
Xilinx SDx 2017.4
PetaLinux 2017.4
Distributed version control system Git installed. For information, refer to the Xilinx Git wiki.
GNU make utility version 3.81 or higher.

Zynq UltraScale+ MPSoC VCU 4k60 Design Example with HDMI Tx and Rx

...
Rajesh Gugulothu
Initial Release
Files Provided
Design_files.zip
Archive file contains the Design_files directory.

Overview
The primary goal of this Tech Tip is to demonstrate the capabilities of the video codec unit (VCU) hard block present in Zynq UltraScale+ MPSoC EV devices. This Tech Tip uses the Vivado IP Integrator (IPI) flow for building the hardware design and the Xilinx Yocto PetaLinux flow for the software design. It uses Xilinx IPs and software drivers to demonstrate the capabilities of different components.