Need assistance with loopback modes

Jaco · Post by **Jaco** » Sat Sep 13, 2014 2:00 am

Hi,

I am having some trouble testing my transmitter using the internal loopback. My device configuration does work: I've verified it by running bladeRF-cli after my program has executed, and the loopback mode it reports matches the one I set. Just to be sure, I have also tried all of the available loopback modes.

The tx / rx procedure works as follows:

Configure TX / RX data streams
Fill a TX buffer with the waveform .bin file
Enable the RX module
Enable the TX module
Transmit the samples in the TX buffer
Disable the TX module
Receive samples until a flag is set (when the rx.bin file is done writing)
Disable the RX module

I have tested the buffer population and file i/o thoroughly, and can confirm that it is working as intended.

I'm using a script in MATLAB to plot both real and imaginary parts of the transmitted and received waveforms. The transmitted one looks like it should (linear FM), but the received signal seems to be noise only.

For interest's sake, my source code is available at https://github.com/git-strider/bladerf-radar; the TX/RX procedures are written in txrx.c.

So here are my questions:

My files are simply a list of IQ pairs, e.g. I1 Q1 I2 Q2 etc., is this correct?
Is the loopback mode even an appropriate method to test my transmit / receive routines?
Is the synchronous interface sufficient for what I'm trying to do?

Any input would be hugely appreciated.

Jaco.

Jaco · Post by **Jaco** » Sat Sep 13, 2014 2:14 am

For the sake of completeness, here are the images from the MATLAB script:

TX: [image: transmitted waveform plot]

RX: [image: received waveform plot]

jynik · Post by **jynik** » Sun Sep 14, 2014 1:29 pm

Hi Jaco,

Are you waiting a sufficient amount of time between transmitting the last sample and shutting off the TX module? Disabling the TX module occurs as quickly as it can -- it does not inherently flush all remaining samples, so it's possible that you shut the module off before the samples are actually transmitted.

Another possibility is that your bladerf_sync_rx() calls are simply "missing" the samples -- they hit the FPGA before the RX side is ready to buffer them. When using the sync interface, the worker thread that manages buffers doesn't launch until your first bladerf_sync_rx() call. (This is done to ensure that the underlying stream doesn't time out between you configuring it and making the first bladerf_sync_rx() call.)

You may be able to catch the samples if you use a fairly large timeout and number of buffers on the RX side, and perform a bladerf_sync_rx() call prior to your first bladerf_sync_tx() call. This will ensure the underlying RX worker is actively filling buffers with incoming data until you make your next bladerf_sync_rx() call to consume those samples.

However, you'll need to be careful in doing the above. If you're transmitting a fairly large set of samples, your RX buffers may overrun or you may time out. As such, I consider the above to be a bit of a kludgy solution.

What I would instead advise is that you kick off a thread to RX samples to a file before you even begin TX'ing. You'll probably wind up with a bunch of extra samples before and after your desired data, but the data should be easy to prune in MATLAB.

Your procedure would then become something like...

Enable the desired loopback mode
Configure the TX streams
Fill a TX buffer with the waveform .bin file
Launch your RX thread
Enable the TX module
Transmit the samples in the TX buffer
Wait some amount of time to ensure the samples are transmitted and then received
Disable the TX module
Signal the RX thread to cleanup and shut down
Join the RX thread

The RX thread might look like this...

Open a file to save samples to
Configure the RX stream
Enable the RX module
While we haven't been signaled to shut down...
- Receive samples
- Write them to a file
Close the file
Disable the RX module
Exit the thread's function
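A minimal sketch of the two procedures above, using POSIX threads. To keep it runnable without hardware, the device calls are replaced by comments and by `fake_rx()`, a hypothetical stand-in for bladerf_sync_rx(); `rx_ctx`, `wait_for_blocks()` and `run_capture()` are likewise illustrative names. The "wait some amount of time" step is expressed here as "wait until the RX thread has stored a few more blocks", which is an assumption rather than something the thread prescribes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SAMPLES 1024

/* State shared between the main (TX) thread and the RX thread. */
struct rx_ctx {
    pthread_mutex_t lock;
    pthread_cond_t  progress;  /* signalled after every received block */
    bool            shutdown;  /* set by the main thread to stop the RX loop */
    size_t          blocks;    /* number of blocks received so far */
};

/* Hypothetical stand-in for bladerf_sync_rx(): yields one block of zeros. */
static void fake_rx(int16_t *iq, size_t num_samples)
{
    memset(iq, 0, 2 * num_samples * sizeof(iq[0]));
}

static void *rx_thread(void *arg)
{
    struct rx_ctx *ctx = arg;
    int16_t iq[2 * BLOCK_SAMPLES];
    bool stop = false;

    /* Real code: open the file, configure the RX stream, enable RX. */
    while (!stop) {
        fake_rx(iq, BLOCK_SAMPLES);
        /* Real code: write the block to the file here. */
        pthread_mutex_lock(&ctx->lock);
        ctx->blocks++;
        stop = ctx->shutdown;
        pthread_cond_broadcast(&ctx->progress);
        pthread_mutex_unlock(&ctx->lock);
    }
    /* Real code: close the file, disable RX. */
    return NULL;
}

/* Block until the RX thread has stored at least `target` blocks. */
static void wait_for_blocks(struct rx_ctx *ctx, size_t target)
{
    pthread_mutex_lock(&ctx->lock);
    while (ctx->blocks < target)
        pthread_cond_wait(&ctx->progress, &ctx->lock);
    pthread_mutex_unlock(&ctx->lock);
}

/* Runs the whole sequence; returns the total number of RX blocks stored. */
size_t run_capture(size_t tail_blocks)
{
    struct rx_ctx ctx = { .shutdown = false, .blocks = 0 };
    pthread_t tid;
    size_t at_tx, total;
    int rc;

    pthread_mutex_init(&ctx.lock, NULL);
    pthread_cond_init(&ctx.progress, NULL);
    rc = pthread_create(&tid, NULL, rx_thread, &ctx);
    assert(rc == 0);
    (void)rc;

    wait_for_blocks(&ctx, 1);          /* RX is running before TX starts */
    /* Real code: enable TX and transmit the TX buffer here. */
    pthread_mutex_lock(&ctx.lock);
    at_tx = ctx.blocks;
    pthread_mutex_unlock(&ctx.lock);
    wait_for_blocks(&ctx, at_tx + tail_blocks);  /* let samples arrive */
    /* Real code: disable TX here. */

    pthread_mutex_lock(&ctx.lock);
    ctx.shutdown = true;               /* signal the RX thread to stop */
    pthread_mutex_unlock(&ctx.lock);
    pthread_join(tid, NULL);

    total = ctx.blocks;
    pthread_cond_destroy(&ctx.progress);
    pthread_mutex_destroy(&ctx.lock);
    return total;
}
```

Calling `run_capture(4)` keeps the receiver running for at least four blocks after the (stubbed) transmission, so the capture contains extra samples before and after the pulse that can be pruned afterwards.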

For completeness, I've tried to provide some useful answers to your specific questions:

Jaco wrote: My files are simply a list of IQ pairs, e.g. I1 Q1 I2 Q2 etc., is this correct?

Yes. For the SC16Q11 format, each component is a right-aligned, sign-extended, little-endian 12-bit value. The values [-2048, 2048) represent [-1.0, 1.0). Note that the upper bound is exclusive here! (Keep the positive values to 2047 or less.)
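As an illustration of that layout, here is a small sketch that converts floating-point I and Q values to clamped SC16Q11 components and interleaves them as I1 Q1 I2 Q2. The function names are hypothetical and not part of libbladeRF, and writing the resulting array straight to a file yields little-endian data only on a little-endian host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SC16Q11_MIN (-2048)
#define SC16Q11_MAX 2047

/* Scale one component from [-1.0, 1.0) to SC16Q11, rounding to nearest
 * and clamping anything outside [-2048, 2047]. */
int16_t to_sc16q11(float x)
{
    long v = (long)(x * 2048.0f + (x >= 0.0f ? 0.5f : -0.5f));

    if (v > SC16Q11_MAX)
        v = SC16Q11_MAX;
    if (v < SC16Q11_MIN)
        v = SC16Q11_MIN;
    return (int16_t)v;
}

/* Interleave n complex samples as I1 Q1 I2 Q2 ...; out holds 2 * n values. */
void pack_iq(const float *i, const float *q, size_t n, int16_t *out)
{
    assert(i != NULL && q != NULL && out != NULL);
    for (size_t k = 0; k < n; k++) {
        out[2 * k]     = to_sc16q11(i[k]);
        out[2 * k + 1] = to_sc16q11(q[k]);
    }
}
```

An input of exactly 1.0 would scale to 2048, so the clamp is what keeps it at 2047.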

Jaco wrote: Is the loopback mode even an appropriate method to test my transmit / receive routines?

For what you're describing, I think you should be able to make use of these.

The firmware loopback mode will loop back the exact data, and you'll generally want things running full duplex to keep samples moving, since the buffers in the FX3 are relatively small.

The other loopback modes pass your data through the analog paths on the LMS6002D, so this is probably what you want. Generally, the baseband loopback modes will be a little "cleaner," but I think you should be just fine with the RF loopback modes.

Jaco wrote: Is the synchronous interface sufficient for what I'm trying to do?

Nothing from your description is jumping out at me to suggest that it would be insufficient. This interface only becomes insufficient if you need very fine-grained control over buffer management, have very low-latency requirements, or are really customizing something on the FX3 or FPGA side of things.

Hope that helps!
- Jon

Jaco · Post by **Jaco** » Sun Sep 14, 2014 10:28 pm

Hi Jon,

Thanks for your reply, it's extremely insightful.

I've tried to increase the number of received samples, but I haven't considered that I actually start receiving after the samples have arrived from the TX module, causing me to miss them completely.

I'm already pushing the limits of my CPU and USB connection by transmitting a 20 MHz bandwidth signal, so I'll lower that to USB2 speeds and increase my pulse width (5 us in the image I posted, if I remember correctly), just to take some possible hardware lockups out of the equation. I also haven't considered the requirements for the number of transfers and the buffer sizes mentioned in the async documentation; I'll change the buffer sizes accordingly and see if there's a difference. I'll also include a routine to clamp samples to an upper bound of 2047, as you've mentioned, once they're read from the TX file.

I'm heading to the labs in the next few minutes, and I'll update this thread with findings and (hopefully) progress.

Thanks again, appreciate the input!

Jaco.

Jaco · Post by **Jaco** » Mon Sep 15, 2014 11:16 am

Just an update on my progress.

I've rewritten my TX and RX routines as threads, and I'm calling my TX and RX threads simultaneously, and I've followed your advice by beginning to receive and write samples before the transmitter is doing its job.

By increasing the number of transfers in the TX stream config to anything above 10 (strange number, I know), I can actually see my transmitted signal by using one of the baseband loopback modes. An example return looks as follows:

The number of pulses I can see in the above image is 10, one less than the number of repetitions in my stream config.

My stream settings are:
RX count: 4096
Num buffers: 24
RX block size: 1024
Num active transfers: 16
Buffer size: 1024
Num buffers: 24
Stream timeout: 100
TX repetitions: 11

However, I don't seem to understand how any of the above parameters work, since changing them around randomly affects my received signal in a manner which I can't attribute to anything - e.g. changing the number of TX repetitions to >10, it starts working randomly. My pulses are 100 samples long at the moment, therefore by setting the number of repetitions to 16 I'm expecting to have 1600 samples worth of actual data, which is not the case, for some reason there's a part of one pulse (one, with the current settings) being cut off.

To explain what I'm experiencing with reference to the above image: my transmitter sends data [x---x---x---x---x---x], but my receiver writes data [--x---x---x---x---x---x-], and the next time [-x---x---x---x---x---x--]. I hope that makes sense, because I can't seem to work out when exactly things are happening in the underlying streams. While I'm receiving the same number of pulses each time now, they're never in the same position. Also, introducing a delay in between transmissions does not seem to have any effect; I'll play around with it more to determine whether I'm calling it at the right location.

Are there any buffer settings that I can realistically change to accommodate a 20 MHz sampling rate? According to the async documentation, I would need buffer sizes > 30 MB to be safe against dropped samples, much larger than what the actual stream can be configured to.

Jaco.

EDIT: I've noticed that my init_module is never being called, which caused my sampling rate and frequency to be left at the default values of 1 MHz and 1 GHz, respectively. With my actual sampling rate (which is 10 MHz now), I can sort of produce results, but they're in no way repeatable. Making the TX buffer 1024 samples long and padding my signal with zeros until that buffer size is reached works okay-ish, with a consistent 412 spacing (1024/2 - 100 = 412) most of the time. The number of repetitions baffles me though, since, as I've mentioned, it yields nothing when it's set lower than 11, even if my RX thread is started before my TX thread and the TX thread is put to sleep for 500 us before it starts executing (using usleep()).

Jaco · Post by **Jaco** » Tue Sep 16, 2014 5:39 am

Another update.

Figured out that my TX buffer allocation isn't working as intended. Instead of assigning values to the buffer as [x00000], where x is the signal, 0 is padded zeros and the length is the default block size, it is assigning [xxxxxx], with the final pulse cut short. Each of my pulses, as I've mentioned, is 100 samples long and the default block size in my program is 1024, so 1024/100 = 10.24. That is why I don't see a received signal when the number of repetitions is < 11: the buffer is never filled up to 1024.
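A minimal sketch of the intended [x00000] layout, assuming SC16Q11 data where every complex sample occupies two int16_t values. `fill_tx_buffer()` is a hypothetical helper and is not taken from txrx.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy one pulse to the start of a TX buffer and zero the remainder.
 * Lengths are in complex samples (I and Q, two int16_t each).
 * Returns 0 on success, or -1 if the pulse does not fit in the buffer. */
int fill_tx_buffer(int16_t *tx, size_t tx_samples,
                   const int16_t *pulse, size_t pulse_samples)
{
    assert(tx != NULL && pulse != NULL);
    if (pulse_samples > tx_samples)
        return -1;

    memcpy(tx, pulse, 2 * pulse_samples * sizeof(tx[0]));
    memset(tx + 2 * pulse_samples, 0,
           2 * (tx_samples - pulse_samples) * sizeof(tx[0]));
    return 0;
}
```

With a 100-sample pulse and a 1024-sample buffer this leaves 924 zero samples after the pulse, so a single repetition already fills a whole buffer.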

Working on a solution now, will post further updates.

Jaco.

EDIT: The solution was pretty simple. I noticed that my sample buffer was being allocated according to the size of the RX blocks (which is not always >= the number of samples per buffer in my TX stream). Allocating the sample buffer according to the size of the TX buffer, and transmitting that many samples when calling bladerf_sync_tx(), solved it. There are only two issues remaining:

1. Is there any way that I can ensure that my TX / RX routines start on exactly the same time on each run? It seems that the block of received samples is in a different position each time.
2. Is there any method to schedule the TX and RX routines, e.g. TX one buffer length, RX for 20 buffers, etc. (half-duplex)? It's extremely important that I maintain synchronization between the TX and RX threads, otherwise I'll often receive a bunch of garbage.

Jaco · Post by **Jaco** » Sun Sep 21, 2014 4:24 am

Jaco wrote:
1. Is there any way that I can ensure that my TX / RX routines start on exactly the same time on each run? It seems that the block of received samples is in a different position each time.
2. Is there any method to schedule the TX and RX routines, e.g. TX one buffer length, RX for 20 buffers, etc. (half-duplex)? It's extremely important that I maintain synchronization between the TX and RX threads, otherwise I'll often receive a bunch of garbage.

To answer my own questions:

1. No. It doesn't seem possible to account for internal delays and the exact time that a certain call is executed is never the same between runs. One idea however is to look for the sample# which corresponds to the first sample's value in the TX buffer, and work from there, but the actual implementation will probably get really awkward.
2. Yes. I've altered the program to run a single thread and tx / rx samples in a half duplex fashion. This does however create difficulties in testing the actual system, since a printf won't do the job and it's not really practical for me to try and receive pulses 15 km+ away.
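One way to make the idea in item 1 less awkward is sketched below. Note that it swaps the single-sample match for a cross-correlation against the whole transmitted pulse, on the assumption that received values will not match the transmitted ones exactly after the analog path. `find_pulse_offset()` is a hypothetical helper that works on interleaved I/Q data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the offset (in complex samples) at which the cross-correlation of
 * rx with ref has the largest squared magnitude. Both arrays hold
 * interleaved I/Q values; rx_len and ref_len are in complex samples. */
size_t find_pulse_offset(const int16_t *rx, size_t rx_len,
                         const int16_t *ref, size_t ref_len)
{
    size_t best = 0;
    double best_mag = -1.0;

    assert(rx != NULL && ref != NULL);
    assert(ref_len > 0 && ref_len <= rx_len);

    for (size_t off = 0; off + ref_len <= rx_len; off++) {
        double re = 0.0, im = 0.0;

        for (size_t k = 0; k < ref_len; k++) {
            double ri = rx[2 * (off + k)];
            double rq = rx[2 * (off + k) + 1];
            double fi = ref[2 * k];
            double fq = ref[2 * k + 1];

            /* Accumulate rx * conj(ref). */
            re += ri * fi + rq * fq;
            im += rq * fi - ri * fq;
        }

        double mag = re * re + im * im;
        if (mag > best_mag) {
            best_mag = mag;
            best = off;
        }
    }
    return best;
}
```

The search is brute force, which should be acceptable for a 100-sample pulse and a few thousand received samples; an FFT-based correlation would be the usual choice for longer captures.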

I have another question though. Is the FX3 firmware enforcing the 1024-sample buffer size limitation on the TX module (i.e. the module does not TX until there are at least 1024 samples in the buffer)? It places a huge limitation on my system: at a 10 MHz sampling rate, 1024 samples equates to roughly 100 us, and a 100 us pulse corresponds to roughly 15 km of range before I can begin receiving values. Is there any way to change this size without altering the entire firmware? Even at the maximum sampling rate (tbd with my specific hardware, but according to the DAC specs it's 40 MHz) I would be limited to a minimum range of roughly 3840 m.
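The arithmetic behind these figures can be sketched as follows, assuming the usual two-way radar relation (range = c * t / 2) and that the receiver is blind for the duration of one full buffer. The function names are illustrative only.

```c
#include <assert.h>

#define SPEED_OF_LIGHT_M_S 299792458.0

/* Time taken to clock out `samples` samples at sample rate fs_hz. */
double buffer_duration_s(unsigned int samples, double fs_hz)
{
    assert(fs_hz > 0.0);
    return (double)samples / fs_hz;
}

/* Two-way range covered while one full buffer is being transmitted. */
double min_range_m(unsigned int samples, double fs_hz)
{
    return SPEED_OF_LIGHT_M_S * buffer_duration_s(samples, fs_hz) / 2.0;
}
```

For 1024 samples this gives about 15.3 km at 10 MHz (102.4 us) and about 3.84 km at 40 MHz (25.6 us).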

Jaco.

jynik · Post by **jynik** » Mon Sep 22, 2014 7:04 am

Hi Jaco,

Sorry for the delay -- been out of town and am now catching up on emails, forum posts, etc. Nice job on figuring out things in the meantime, and thank you very much for making your solutions known. (I love when folks help prevent this situation [comic image].)

I believe items 1 and 2 will be addressed to a significant degree by the timestamp support I'm currently working on in libbladeRF.

There have been some delays in this effort, but I'm looking to finish it up and get it into master this week, with an associated FPGA update. This code is currently being developed in a dev-issue_238 branch (link will break when the issue is closed). Be wary of using this however -- dev-* branches should be regarded as being rather transient -- we may rebase against master, squash commits, etc. in these branches; they're simply intended to provide transparency WRT ongoing work.

The metadata additions to the API include the ability to specify that you want to use the sync RX/TX interface with timestamp metadata. There is one sample counter shared between RX and TX, so they are inherently synchronized in that sense.

On the RX side, you can specify that you want to receive N samples at a specific timestamp in the future. The sync interface will throw away data until that point. Alternatively, you can specify that you want data "now" and get the associated timestamp for that data.

The TX side operates in bursts. When calling the sync_tx function, you set a flag to denote the start of the burst, and the timestamp value at which the samples should go out. When you set an "end of burst" flag, the remainder of the underlying buffer will be zeroed and sent immediately.

So for what you're doing -- you should be able to say, "send N samples out at time T", while your rx-side waits for M samples at time T+t. Does this sound like what you'll need?
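To make the "send N samples at time T, receive M samples at time T+t" idea concrete, here is a sketch that only computes the timestamps for each pulse of a pulse train. It does not call libbladeRF; `pulse_plan` and `plan_pulse()` are hypothetical names, and all values are counts of the shared sample counter described above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical schedule for one pulse, in units of the shared counter. */
struct pulse_plan {
    uint64_t tx_start;  /* T: timestamp at which the TX burst should start */
    uint64_t rx_start;  /* T + t: timestamp at which RX should start */
    uint32_t tx_count;  /* N: samples to transmit */
    uint32_t rx_count;  /* M: samples to receive */
};

/* Plan pulse number `index` of a train with a pulse repetition interval
 * of `pri` samples, whose first burst starts at timestamp `first_tx`. */
struct pulse_plan plan_pulse(uint64_t first_tx, uint32_t pri, uint32_t index,
                             uint32_t pulse_len, uint32_t rx_delay,
                             uint32_t rx_len)
{
    struct pulse_plan p;

    /* Both the pulse and the RX window must fit inside one interval. */
    assert(pulse_len <= pri);
    assert((uint64_t)rx_delay + rx_len <= pri);

    p.tx_start = first_tx + (uint64_t)pri * index;
    p.rx_start = p.tx_start + rx_delay;
    p.tx_count = pulse_len;
    p.rx_count = rx_len;
    return p;
}
```

Because both timestamps come from the same counter, the offset between the transmitted pulse and the start of the RX window would be the same for every pulse, which is what the position jitter described earlier in the thread is missing.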

Regarding the FX3 firmware -- yes, the 1024 is a hard-coded DMA transfer size, so you would have to make some firmware changes to DMA channel initialization and probably the GPIF configuration. The FPGA sends data to the FX3 in 256 or 512-byte chunks, FWIW. It is possible to change these items -- I know other folks have done it for the bladeRF, but I haven't myself. The FX3 is capable of DMA'ing shorter transfers -- I know the Sync Slave Fifo GPIF configuration (not used by the bladeRF FW) has a PKTEND signal that can be used to commit "short" buffers. Be aware that one has to find the right balance between low latency and high throughput when you start tweaking the DMA sizes.

Best regards,
Jon

Jaco · Post by **Jaco** » Mon Sep 22, 2014 1:20 pm

Hi Jon,

Thanks again for your response.

The proposed update with added timestamp metadata will be a huge boon for my project indeed. Part of what I want to do after transmitting is receive N samples, and discard the rest until the next pulse is transmitted.

The current version works on creating threads for TX and RX, where each TX pulse is the length of my PRI (in samples) long, with the actual signal contained in the first few samples. What I hope to achieve is by repeatedly transmitting this, my RX thread can run simultaneously and collect all received samples, but I would still know which blocks represent dead-time (where the TX pulse was being transmitted) and discard those when processing. Only problem with that (which I can foresee, at least) is if the sync_tx and sync_rx calls occur at different times, which would be a huge problem.

I'm working on implementing condition variables that halt the execution of a thread until a certain value is set in another thread. I would then wait for this condition in the RX thread just before sync_rx is called, and the condition will be set just before sync_tx is called. Whether this makes sense or not remains to be seen.
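A minimal sketch of such a gate with POSIX condition variables. `start_gate`, `gate_wait()`, `gate_open()` and `gate_demo()` are hypothetical names, and the sync_rx/sync_tx calls are only marked by comments.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct start_gate {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            open;
};

void gate_init(struct start_gate *g)
{
    assert(g != NULL);
    pthread_mutex_init(&g->lock, NULL);
    pthread_cond_init(&g->cond, NULL);
    g->open = false;
}

/* RX side: block until the TX side opens the gate. */
void gate_wait(struct start_gate *g)
{
    pthread_mutex_lock(&g->lock);
    while (!g->open)               /* the loop guards against spurious wakeups */
        pthread_cond_wait(&g->cond, &g->lock);
    pthread_mutex_unlock(&g->lock);
}

/* TX side: release every thread waiting on the gate. */
void gate_open(struct start_gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->open = true;
    pthread_cond_broadcast(&g->cond);
    pthread_mutex_unlock(&g->lock);
}

struct demo {
    struct start_gate gate;
    bool rx_started;
};

static void *rx_side(void *arg)
{
    struct demo *d = arg;

    gate_wait(&d->gate);
    /* The sync_rx call would go here. */
    d->rx_started = true;
    return NULL;
}

/* Returns true if the RX side ran after the gate was opened. */
bool gate_demo(void)
{
    struct demo d;
    pthread_t tid;

    gate_init(&d.gate);
    d.rx_started = false;
    if (pthread_create(&tid, NULL, rx_side, &d) != 0)
        return false;

    gate_open(&d.gate);            /* just before the sync_tx call */
    pthread_join(tid, NULL);
    return d.rx_started;
}
```

A gate like this only orders the two calls in software. How closely the sample streams line up still depends on thread scheduling and on the buffering underneath the sync interface.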

For now I'm fiddling with my implementation before going into the firmware, if it's absolutely necessary though I'll go there.

Thanks again, and good luck on the updates to libbladeRF, I'll be sure to check out that branch. I'll also post more updates as I progress.

Regards,
Jaco.

PS. loved that comic, been there so often!

Jaco · Post by **Jaco** » Tue Sep 23, 2014 12:19 pm

Another quick question: how do I transmit / receive data to and from the FPGA? I'm looking to implement a signal processing block (basically frequency domain filtering) that takes a signal, FFTs, filters, IFFTs, and sends it back. Is this realistic?

jynik · Post by **jynik** » Wed Sep 24, 2014 6:53 am

Hi Jaco,

At GRCon 2014, there was a lot of interest in getting a polyphase channelizer into the FPGA for applications like OpenMHz (which was easy to get running at the con with the bladeRF since it uses gr-osmosdr). So that's definitely something that'll be queued up in the "TO DO" list.

With that said, we're going to be thinking about a good way of exposing processed data back to libbladeRF. I think this is something worth giving some good foresight in order to yield a sane API, and thoughts, feedback, pull requests, etc. will be appreciated.

One thought was for user-specific data formats, we could add a BLADERF_FORMAT_RAW_DATA value to the bladerf_format enumeration. In this format, our "sample" would simply be one byte, and it'd be up to the API user to parse that data as they wish.

In doing this, you'll need to add a couple cases to some switch statements, such as here and here.

Also, for a first cut at this, it'd probably be easiest to ensure data from the FPGA is padded out to the 4096-byte DMA blocks (which is the size of 1024 SC16Q11 samples) so you don't have to touch some of the existing assert() and error checks, firmware, and GPIF configuration. In the long-run, I think we most certainly want to dive into these items, though.
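The padding rule amounts to rounding each payload up to a whole number of 4096-byte blocks. A small sketch, with `padded_len()` as a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 1024 SC16Q11 samples * 4 bytes per sample (two 16-bit components). */
#define DMA_BLOCK_BYTES ((size_t)4096)

/* Round a payload length up to a whole number of DMA blocks. */
size_t padded_len(size_t payload_bytes)
{
    assert(payload_bytes <= SIZE_MAX - (DMA_BLOCK_BYTES - 1));
    return (payload_bytes + DMA_BLOCK_BYTES - 1) / DMA_BLOCK_BYTES
           * DMA_BLOCK_BYTES;
}
```

A 5000-byte result, for example, would occupy two blocks (8192 bytes), with the trailing 3192 bytes treated as padding by whoever parses the raw data.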

Hope that helps!
Jon

Jaco · Post by **Jaco** » Fri Sep 26, 2014 3:04 am

Hi Jon,

The link to GRCon is useful indeed, I would definitely like to attend that at some point!

What you said makes sense, well, sort of. I'm still unsure how I'm actually going to get the signal on to the FPGA and get the transformed signal back. In other words, how do I access the embedded memory in the first place (since my sampled signal is stored on the PC and from what I understand there's no libbladeRF functionality that can do this at the moment)? I'm not sure whether it's what you've tried to explain, or if my brain is just having difficulty to understand how it works.

Regards,
Jaco

jynik · Post by **jynik** » Fri Sep 26, 2014 8:31 am

Hi Jaco,

There are probably numerous approaches here. I imagine that overhauling the FX3's GPIF configuration and FPGA<->FX3 interface could allow us to set up different USB endpoints for different functionalities, and move data from/to different FPGA modules. This is probably best in the long run, but the below might be a quick way to shoehorn in some additional functionality...

However, the current FX3 <-> FPGA interface simply moves the RX'd and TX'd samples to their respective destinations.

Therefore, my suggestion was that we could simply use the existing RX and TX functions to move data from/to the FPGA. By adding additional format values (e.g., BLADERF_FORMAT_RAW_DATA, BLADERF_FORMAT_FFT_4096), we could configure FPGA registers as needed by that format.

So for example -- if you added an FFT block in the FPGA, and all you wanted were the results of the FFT, you could specify the above format (that you'd add), and then just call bladerf_sync_rx() like you normally would. It'd be up to you to interpret the received FFT data correctly.

Hope that clarifies things a bit!
Jon

Jaco · Post by **Jaco** » Fri Sep 26, 2014 4:33 pm

Hi Jon,

That definitely clears it up! Thanks.

I'll see if I can figure something out with using the additional format as you've suggested.

Jaco.
