The 2015.06-rc1 release is available on GitHub.
It has taken a little longer than we hoped for this rc1 release, but there are some pretty fun and interesting updates included:
- A redesign of the NIOS II [1] code running on the FPGA
- An overall speed increase in retuning the LMS6002D
- Scheduled retune operations – tuning at a specified timestamp counter value
- “Quick retune” functionality – quickly restore previously identified tuning parameters
Be sure to check out the updated API docs for more details on the new frequency tuning options and improvements.
Over the summer, keep an eye on the nuand-papers repository, where we’ll be looking to upload documents associated with items we’re working on adding to the bladeRF examples directory. Currently, a draft of a paper on the design and implementation of an FRS transceiver using GNU Radio and the bladeRF is posted. We know some solid examples (beyond the simple test programs) are long overdue, and would appreciate any feedback, ideas, patches, or submissions to help move those along.
[1] The NIOS II is a soft-core processor running on the bladeRF’s FPGA. It’s used as a simple means to configure/control both peripherals on the bladeRF and modules in the FPGA’s programmable fabric.
NIOS II Code Redesign
A few months ago, Cameron Karlsson announced on our IRC channel (#bladeRF on Freenode) that he had redesigned the bladeRF NIOS II code for improved extensibility and readability.
Previously, the NIOS II code had used a single message format for communicating host-to-FPGA requests for control operations, such as accessing 8-bit LMS6002D and Si5338 registers. As other functionality was added, fitting 32-bit and 64-bit accesses into an interface designed for 8-bit register accesses became rather messy. Furthermore, there was a very limited number of ID bits to denote the target device to access. This did not leave much room for users to customize and expand on this code.
Cameron proposed and implemented lots of cleanup, which provided us with a vast amount of excellent feedback. Cameron also introduced the notion of having different “state machine” implementations that could handle different message formats. This would allow contributors to more easily introduce new functionality with its own message format. While the proposed abstractions made the code much easier to grok, we did see some slowdown in the time needed to complete register accesses.
After reviewing and benchmarking his code, we bounced some additional ideas around with Cameron for another iteration over the proposed changes. We then set to work on the design included in this release. You can find an overview of the new NIOS II code in the updated README.md.
We further expanded Cameron’s state machine idea into a “packet handler” interface that provides the ability to: handle a request, provide a response, and queue up deferred work. We then migrated all of the existing requests to the packet formats listed in the aforementioned README. Compatibility with the previous message structure has been maintained through a nios_pkt_legacy format, which ensures older libbladeRF versions continue to operate with newer FPGAs.
As we’ll see below, the notion of “deferred work” was driven by the desire to be able to schedule retune operations. In order to support deferred work, the polling UART peripheral used by the NIOS II was replaced with a custom interrupt-driven “command UART.” This implementation takes care of queuing up an entire request (dropping those with invalid “magic” byte values) before alerting the NIOS II code to the presence of a new request. This frees up the code to iterate over the packet handlers and perform any deferred work because it’s no longer polling the UART for every single byte of the request.
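To make the “packet handler” idea more concrete, here is a minimal sketch of what such an interface might look like. It is not the code from the release: the struct layout, function names, and magic byte values (0x41, 0x54) are all hypothetical, chosen only to illustrate the three responsibilities described above (handle a request, provide a response, perform deferred work). In the real design the command UART drops requests with invalid magic bytes before the NIOS II sees them; here the dispatcher simply reports them.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PKT_LEN 16 /* fixed request/response size mentioned in the post */

/* Hypothetical handler interface; names and magic values are illustrative. */
struct pkt_handler {
    uint8_t magic;                                   /* first byte of a request */
    void (*exec)(const uint8_t *req, uint8_t *resp); /* handle request, fill response */
    void (*do_work)(void);                           /* deferred work, or NULL */
};

unsigned int deferred_runs = 0; /* counts calls to retune_work() */

static void echo_exec(const uint8_t *req, uint8_t *resp)
{
    for (size_t i = 0; i < PKT_LEN; i++) {
        resp[i] = req[i];
    }
}

static void retune_exec(const uint8_t *req, uint8_t *resp)
{
    resp[0] = req[0];
    resp[1] = 1; /* made-up "request queued" flag */
}

static void retune_work(void)
{
    deferred_runs++; /* a real handler would service pending retunes here */
}

static const struct pkt_handler handlers[] = {
    { 0x41, echo_exec,   NULL        },
    { 0x54, retune_exec, retune_work },
};

#define NUM_HANDLERS (sizeof(handlers) / sizeof(handlers[0]))

/* Route one complete request to the handler that owns its magic byte.
 * Returns false if no handler claims the packet. */
bool dispatch(const uint8_t req[PKT_LEN], uint8_t resp[PKT_LEN])
{
    for (size_t i = 0; i < NUM_HANDLERS; i++) {
        if (handlers[i].magic == req[0]) {
            handlers[i].exec(req, resp);
            return true;
        }
    }
    return false;
}

/* Called from the main loop whenever no request is pending. */
void run_deferred_work(void)
{
    for (size_t i = 0; i < NUM_HANDLERS; i++) {
        if (handlers[i].do_work != NULL) {
            handlers[i].do_work();
        }
    }
}
```

Because the main loop only calls dispatch() once a full request has arrived, the rest of its time can be spent in run_deferred_work(), which is the property the interrupt-driven UART was introduced to provide.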
The new packet formats are designed to have a fixed length, and many of them only use a subset of the 16 bytes (required for the FX3 UART DMA transfer). Therefore, a future optimization we could make is to fire the “request available” interrupt when the necessary subset of bytes for a particular packet has been received. This would allow us to begin handling the request and even sending back the response before reception of all 16 bytes completes. (Patches welcome! 😀)
After making these changes and re-running a simple timing test, we found that our peripheral access times were not only back to the original values, but consistently a few tens of microseconds faster. Not too shabby!
Improving LMS6002D Tuning Times
We’ve had a large number of requests to be able to support faster re-tuning in order to allow people to write code for sweeping more than 28 MHz of spectrum and to implement simple forms of frequency hopping.
As of the previous (2015.02) release, we measured our bladerf_set_frequency() call to require ~11 ms on USB 3.0, and ~22 ms on USB 2.0. If you take a quick look at our overview of frequency tuning on the bladeRF, or a detailed review of sections 3.4 and 4.6 in the LMS6002D Programming and Calibration Guide, you’ll see a number of register settings are changed each time you retune the device. Many of those registers contain bits that control other features, necessitating read-modify-write (RMW) operations. By running bladeRF-cli -e 'set frequency 2.415G' -v verbose, you can get a sense of just how many register accesses are involved: it’s quite a few!
As indicated by the differences in times measured for USB 3.0 and USB 2.0, incurring USB overhead for every register access is rather significant.
Therefore, we sought to build the LMS6002D retuning code into the FPGA’s NIOS II.
First, the code used to retune the LMS6002D was split into two parts:
- A section that computes the register values required to tune the device to the specific frequency.
- The application of those register values and execution of the retune algorithm.
The lms.c code was updated such that it could be built for both libbladeRF and the NIOS II, and moved to the fpga_common directory. When configured for the FPGA-based tuning mode, the computation of register values is still done on the host. However, instead of issuing control transfers for every register access request, a single transfer containing a newly introduced nios_pkt_retune packet is issued. In response to this, the second set of operations listed above is performed entirely by the NIOS II code running in the FPGA.
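The two-part split can be sketched roughly as follows. This is not the contents of lms.c: the struct, function names, and the fake register file are hypothetical. It assumes a 38.4 MHz reference clock and the integer/fractional PLL divider relationship (a 9-bit NINT and a 23-bit NFRAC derived from the VCO frequency) as described in the LMS6002D Programming and Calibration Guide, and the byte packing and base address are illustrative; check the guide's register map before relying on any of it. VCO and band selection, along with the rest of the tuning algorithm, are omitted.

```c
#include <stdint.h>

#define REF_HZ 38400000ULL /* assumed PLL reference clock */

/* Hypothetical container for values computed ahead of time. */
struct tune_params {
    uint16_t nint;    /* integer part of the PLL divider (9 bits) */
    uint32_t nfrac;   /* fractional part of the PLL divider (23 bits) */
    uint8_t  vco_div; /* VCO-to-LO divide ratio: 2, 4, 8, or 16 */
};

/* Part 1: pure computation, no device access. Safe to run on the host. */
struct tune_params compute_tune_params(uint64_t lo_hz, uint8_t vco_div)
{
    struct tune_params p;
    const uint64_t vco_hz = lo_hz * vco_div;

    p.nint    = (uint16_t)(vco_hz / REF_HZ);
    p.nfrac   = (uint32_t)(((vco_hz % REF_HZ) << 23) / REF_HZ);
    p.vco_div = vco_div;
    return p;
}

/* Register write primitive: USB control transfer on the host, SPI on the NIOS II. */
typedef void (*reg_write_fn)(void *ctx, uint8_t addr, uint8_t value);

/* Part 2: apply precomputed values. Only the four divider bytes are written
 * here; a real implementation would also program vco_div and run the VCO
 * selection steps. Returns the number of register writes issued. */
unsigned int apply_tune_params(const struct tune_params *p, uint8_t base_addr,
                               reg_write_fn write_reg, void *ctx)
{
    const uint32_t word = ((uint32_t)p->nint << 23) | p->nfrac;
    unsigned int i;

    for (i = 0; i < 4; i++) {
        write_reg(ctx, (uint8_t)(base_addr + i), (uint8_t)(word >> (24 - 8 * i)));
    }
    return i;
}

/* Stand-in register file so the sketch can be exercised without hardware. */
struct fake_regs {
    uint8_t mem[256];
};

void fake_write(void *ctx, uint8_t addr, uint8_t value)
{
    ((struct fake_regs *)ctx)->mem[addr] = value;
}
```

The point of the split is that compute_tune_params() has no dependency on how registers are reached, so its result can be shipped to the FPGA in one packet while apply_tune_params() runs next to the SPI bus.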
Note that it is still possible to perform entirely “host-based” retuning via bladerf_set_tuning_mode() or the BLADERF_DEFAULT_TUNING_MODE environment variable.
Initially, we did not achieve the speedup we expected, considering the clock rate used on the LMS6002D SPI interface. Further investigation [2] showed that this was due to the amount of time between chip select assertion and the clocking of data, as well as a delay between successive bytes.
As shown in the following capture, a single LMS access (consisting of an address byte and a data byte) was taking ~9.6 μs. The time between the address byte and data byte was approximately 5.71 μs.
In short, register accesses using the generic NIOS II SPI controller were spending more time doing “nothing” than actually communicating data. Since we’re only using the SPI controller for LMS6002D accesses, Brian Glod switched this out with an LMS6002D-specific SPI controller. This reduces an LMS6002D register access to under 1 μs (zoomed in):
An astute reader might notice that the actual transaction, as measured by the duration of the clocks, appears longer. For this release, we reduced the LMS SPI clock from 40 MHz to 20 MHz, due to some timing constraints that were just barely failing at 40 MHz. We’ll be looking to address this and increase the clock back up to 40 MHz. Nonetheless, the new SPI controller allows bladerf_set_frequency() to be performed significantly faster:
- Host-only tuning on USB 3.0 dropped from 12 ms to 5 ms (2.4x speedup)
- The FPGA tuning mode reduces that 12 ms to…
- under 700 μs for the NIOS II/e core (17.1x speedup)
- under 450 μs for the NIOS II/f core (26.7x speedup)
Again, not too shabby — and we haven’t even gotten to the “quick retune” functionality yet! Although not listed above, the speedups on USB 2.0 are equally pleasing.
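As a quick sanity check on the figures above, the following sketch works through the arithmetic. It assumes a register access clocks exactly 16 bits (one address byte plus one data byte) and ignores any chip select setup time; the helper names are ours, not part of any bladeRF code.

```c
/* Time, in microseconds, needed to clock `bits` bits at `clk_mhz` MHz. */
double spi_clock_time_us(unsigned int bits, double clk_mhz)
{
    return (double)bits / clk_mhz;
}

/* Fraction of an access actually spent clocking data. */
double spi_busy_fraction(unsigned int bits, double clk_mhz, double access_us)
{
    return spi_clock_time_us(bits, clk_mhz) / access_us;
}

/* Ratio of the old duration to the new one. */
double speedup(double before_us, double after_us)
{
    return before_us / after_us;
}
```

With these helpers, spi_clock_time_us(16, 40.0) gives 0.4 μs, so the old ~9.6 μs access spent only about 4% of its time moving data. At the reduced 20 MHz clock the same 16 bits take 0.8 μs, which is consistent with the "under 1 μs" figure for the new controller. Likewise, speedup(12000, 700) and speedup(12000, 450) come out to roughly 17.1 and 26.7.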
In the near future, we’ll add support to this controller for accessing multiple registers within a single SEN window. Since the register values are not applied to the device until the rising edge of SEN, it would be much more desirable to apply PLL changes, which are spread across multiple contiguous registers, in a single operation. (This will alleviate some spurious emissions that can occur while retuning with the transmitter enabled.)
[2] Note: The easiest way to place oscilloscope or logic analyzer probes on the LMS6002D SPI interface’s SEN and CLK is using the AB16 and AA15 vias exposed on the bottom of the board.
Scheduled Retunes
One feature we’ve been wanting to introduce is the ability to schedule a retune to occur at a specified sample timestamp value. (For those that are not familiar with it, libbladeRF allows you to transmit and receive samples using some metadata.)
This ability to schedule retunes in relation to transmitted/received samples is intended to make sweeping or hopping very straightforward, once you have a sense of how long the underlying retuning operation takes to complete. In essence, you can:
- Schedule transmission or reception of samples at some timestamp value T via bladerf_sync_tx() or bladerf_sync_rx()
- Schedule a retune to occur some time R after T + num_samples via bladerf_schedule_retune()
- Schedule the next TX/RX operation to occur after R + retune_duration. The upper bound for retune_duration will depend on the frequencies one is changing from/to, so you should do some experimenting to ensure you have sufficient “wiggle room” for your particular application.
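The three steps above boil down to some simple timestamp arithmetic, sketched below. The struct and function names are hypothetical, and the sketch assumes the timestamp counter advances by one tick per sample, so durations in microseconds must be converted using the sample rate. The retune duration you plug in should come from your own measurements.

```c
#include <stdint.h>

/* Hypothetical plan for one burst followed by one frequency hop. */
struct hop_plan {
    uint64_t burst_start; /* T: timestamp of the first sample of the burst */
    uint64_t retune_at;   /* R: timestamp at which the retune should begin */
    uint64_t next_burst;  /* earliest timestamp for the next TX/RX operation */
};

/* Convert microseconds to timestamp ticks, rounding up so that the
 * resulting wait is never shorter than requested. */
uint64_t us_to_ticks(uint64_t us, uint64_t sample_rate_hz)
{
    return (us * sample_rate_hz + 999999ULL) / 1000000ULL;
}

/* Lay out T, R, and the next burst time.
 *
 * guard_ticks is extra slack between the last sample of the burst and the
 * retune; retune_us is a measured upper bound on the retune duration. */
struct hop_plan plan_hop(uint64_t t, uint64_t num_samples, uint64_t guard_ticks,
                         uint64_t retune_us, uint64_t sample_rate_hz)
{
    struct hop_plan plan;

    plan.burst_start = t;
    plan.retune_at   = t + num_samples + guard_ticks;
    plan.next_burst  = plan.retune_at + us_to_ticks(retune_us, sample_rate_hz);
    return plan;
}
```

For example, at 2 Msps with a 4096-sample burst starting at T = 1000000, 100 ticks of guard time, and a 700 μs retune budget, the retune would be scheduled at 1004196 and the next burst no earlier than 1005596. The retune_at value is what you would pass as the timestamp to bladerf_schedule_retune().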
To implement this functionality, the “Time Tamers” (timestamp counter modules) in the FPGA were separated and updated to provide programmable interrupt functionality. A timestamp field in the nios_pkt_retune format allows the host to specify the relative time at which the retune should take place. The NIOS II code uses the Time Tamer’s programmable interrupt to schedule and trigger a retune callback. In the event that the provided timestamp is in the past, the Time Tamers will immediately fire their interrupts.
Multiple retune requests (currently up to 16) may be queued up. This allows the host to schedule multiple retunes in advance rather than needing to quickly submit the retune request directly beforehand. These queued up operations can also be canceled from the host. The underlying retune queues are flushed when bladerf_open and bladerf_close are called in order to prevent the device from unexpectedly retuning on successive uses.
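A bounded queue with these properties could look something like the sketch below. This is an illustration rather than the NIOS II implementation: the type and function names are hypothetical, and it assumes requests are queued in ascending timestamp order. A request whose timestamp has already passed is treated as due right away, mirroring the immediate-interrupt behavior described above.

```c
#include <stdbool.h>
#include <stdint.h>

#define RETUNE_QUEUE_MAX 16 /* queue depth quoted in the post */

/* Hypothetical queued request. */
struct retune_entry {
    uint64_t timestamp;    /* when the retune should occur */
    uint32_t frequency_hz; /* target frequency */
};

/* Fixed-size ring buffer; no dynamic allocation. */
struct retune_queue {
    struct retune_entry entries[RETUNE_QUEUE_MAX];
    unsigned int head;  /* index of the oldest entry */
    unsigned int count; /* number of queued entries */
};

/* Discard everything, e.g. on device open/close or a host-side cancel.
 * Also serves as the initializer. */
void retune_queue_flush(struct retune_queue *q)
{
    q->head = 0;
    q->count = 0;
}

/* Add a request. Returns false if the queue is already full. */
bool retune_queue_push(struct retune_queue *q, uint64_t timestamp,
                       uint32_t frequency_hz)
{
    unsigned int tail;

    if (q->count == RETUNE_QUEUE_MAX) {
        return false;
    }
    tail = (q->head + q->count) % RETUNE_QUEUE_MAX;
    q->entries[tail].timestamp = timestamp;
    q->entries[tail].frequency_hz = frequency_hz;
    q->count++;
    return true;
}

/* Remove the oldest request if its time has come (timestamp <= now).
 * Returns false if the queue is empty or the oldest entry is not yet due. */
bool retune_queue_pop_due(struct retune_queue *q, uint64_t now,
                          struct retune_entry *out)
{
    if (q->count == 0 || q->entries[q->head].timestamp > now) {
        return false;
    }
    *out = q->entries[q->head];
    q->head = (q->head + 1) % RETUNE_QUEUE_MAX;
    q->count--;
    return true;
}
```

A fixed depth keeps memory use predictable on a small soft-core processor, at the cost of the host needing to handle a "queue full" failure when it schedules too far ahead.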
Some test code to exercise this functionality may be found in the libbladeRF_scheduled_retune directory.
“Quick Retune” functionality
It is possible to achieve even quicker retuning by simply saving the final results of the LMS6002D tuning algorithm and re-applying them to registers later. However, this speed increase comes with a trade-off in phase noise. The parameters used to tune the LMS6002D are subject to environmental change, so the register settings from prior results will not necessarily remain the “ideal” values for a particular frequency. Nonetheless, one could update these values by performing a full retune, and then fetching the quick retune parameters at regular intervals.
As you may have already noted, the bladerf_schedule_retune() function takes an optional pointer to a bladerf_quick_tune structure. The example below, included in the API docs, illustrates its usage:
int status;
unsigned int i, j;
const unsigned int frequencies[NUM_FREQUENCIES] = {
    902000000,
    903000000,
    904000000,
    925000000,
    926000000,
    927000000,
};
struct bladerf_quick_tune quick_tunes[NUM_FREQUENCIES];

/* Get our quick tune parameters for each frequency we'll be using */
for (i = 0; i < NUM_FREQUENCIES; i++) {
    status = bladerf_set_frequency(dev, module, frequencies[i]);
    if (status != 0) {
        fprintf(stderr, "Failed to set frequency to %u Hz: %s\n",
                frequencies[i], bladerf_strerror(status));
        return status;
    }

    status = bladerf_get_quick_tune(dev, module, &quick_tunes[i]);
    if (status != 0) {
        fprintf(stderr, "Failed to get quick tune for %u Hz: %s\n",
                frequencies[i], bladerf_strerror(status));
        return status;
    }
}

for (i = j = 0; i < ITERATIONS; i++) {
    /* Tune to the specified frequency immediately via BLADERF_RETUNE_NOW.
     *
     * Alternatively, this re-tune could be scheduled by providing a
     * timestamp counter value */
    status = bladerf_schedule_retune(dev, module, BLADERF_RETUNE_NOW, 0,
                                     &quick_tunes[j]);
    if (status != 0) {
        fprintf(stderr, "Failed to apply quick tune: %s\n",
                bladerf_strerror(status));
        return status;
    }

    j = (j + 1) % NUM_FREQUENCIES;

    /* ... Handle signals at current frequency ... */
}
Using this functionality reduces the original 12 ms host-based retune time (on USB 3.0) down to…
- Under 250 μs for the NIOS II/e core (48x speedup)
- Under 150 μs for the NIOS II/f core (80x speedup)
Definitely not too shabby, if you can handle the trade-offs! Here’s a quick capture of this quick retune functionality in action with some CPFSK bursts.
Thank you!
As always, a big thank you goes out to everyone who has provided feedback, bug reports, patches, and kind words. We really appreciate it, and it’s a pleasure to chat and collaborate with you all via IRC, the forums, and email. Your ongoing support is what makes this project so enjoyable to work on!
We look forward to pulling in some very cool contributions in the coming months, as well as developing some interesting examples and content. We hope you’ll join in the fun!
– Jon (jynik)