Did you know: Unaligned accesses in ARM

The x86 architecture has always supported unaligned accesses. In the ARM world, the first architecture to support unaligned accesses in hardware was ARMv6, which was implemented in the ARM11 cores from around 2002 onward. There is an excellent article at ARM Infocenter giving the technical details.


Support for unaligned accesses must be configured explicitly in an ARM core. This is done through the A bit (alignment check enable) of the SCTLR register: alignment checking is active while A is 1, so unaligned accesses are permitted only when the bit is 0. Even then, unaligned accesses are allowed only on Normal memory; accesses to the Device memory type are always checked and raise an alignment fault when misaligned.
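From the software side, it can help to see what an unaligned access looks like in C. The following is a minimal sketch, not code from the test described here: the helper names `load_u32_unaligned` and `store_u32_unaligned` are made up for illustration. Using `memcpy` keeps the access well-defined in C; on a core where unaligned accesses are permitted, compilers typically lower it to a single load or store.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address with no alignment guarantee.
 * (Hypothetical helper name, for illustration only.) */
static inline uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);   /* byte-wise copy semantics, no alignment needed */
    return v;
}

/* Write a 32-bit value to an address with no alignment guarantee. */
static inline void store_u32_unaligned(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

For example, `load_u32_unaligned(buf + 1)` on a byte buffer reads four bytes starting at an odd address. A plain pointer cast such as `*(uint32_t *)(buf + 1)` is what relies on the hardware support discussed above.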

I had to verify that a Cortex-A53 core (ARMv8) correctly implements the support for unaligned accesses. At first the task seemed very simple: only the SCTLR.A bit had to be configured, I thought. However, the hidden issue is that all memory is treated as the Device memory type by default, that is, as long as the MMU is disabled! The memory type is specified in the MMU page tables. Each block or page descriptor has a 3-bit field called Attribute Index (AttrIndx). This is an index into the Memory Attribute Indirection Register (MAIR), which holds eight 8-bit descriptors of the memory types used in the system (e.g. normal cacheable, normal non-cacheable, device, etc.). The operating system must know the system memory map and the caching requirements, so it can maintain virtual memory tables with the correct attributes.
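To make the indirection concrete, here is a minimal sketch of how a MAIR value could be composed. The three attribute encodings are the ones defined by the ARMv8-A architecture; the assignment of memory types to slots 0 to 2 and the helper names are assumptions of this example, not the layout used in my test.

```c
#include <stdint.h>

/* Attribute encodings defined by the ARMv8-A architecture for MAIR_ELx. */
#define MAIR_ATTR_DEVICE_nGnRnE  0x00u  /* Device, no gathering/reordering/early ack */
#define MAIR_ATTR_NORMAL_NC      0x44u  /* Normal, inner and outer non-cacheable     */
#define MAIR_ATTR_NORMAL_WB      0xFFu  /* Normal, inner and outer write-back        */

/* Slot assignment: an assumption of this sketch, any of the 8 slots would do. */
enum { ATTR_IDX_DEVICE = 0, ATTR_IDX_NORMAL_NC = 1, ATTR_IDX_NORMAL_WB = 2 };

/* Place one 8-bit attribute into slot idx (0..7) of the 64-bit register. */
static inline uint64_t mair_field(unsigned idx, uint8_t attr)
{
    return (uint64_t)attr << (8u * idx);
}

/* Compose the full register value; unused slots stay 0 (Device-nGnRnE). */
static inline uint64_t mair_value(void)
{
    return mair_field(ATTR_IDX_DEVICE,    MAIR_ATTR_DEVICE_nGnRnE)
         | mair_field(ATTR_IDX_NORMAL_NC, MAIR_ATTR_NORMAL_NC)
         | mair_field(ATTR_IDX_NORMAL_WB, MAIR_ATTR_NORMAL_WB);
}
```

On the target, the resulting value would be written to MAIR_EL1 (or the MAIR of whichever exception level owns the translation regime) before the MMU is enabled. A descriptor with AttrIndx = 2 then selects Normal write-back memory.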

In the end, to build my test case I had to set up page tables with a flat mapping of VA to PA, setting the memory types on RAM blocks as needed.
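A flat mapping of this kind can be sketched as follows, assuming a 4 KB translation granule and a level-1 table of 1 GB block descriptors. The memory map (which gigabytes are RAM), the attribute index values, and all function names are hypothetical; the TCR/TTBR setup and the required table alignment are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Stage-1 block descriptor fields (ARMv8-A, 4 KB granule). */
#define DESC_BLOCK       0x1ull                 /* bits[1:0] = 0b01: valid block   */
#define DESC_ATTRIDX(i)  ((uint64_t)(i) << 2)   /* AttrIndx[2:0]: index into MAIR  */
#define DESC_SH_INNER    (0x3ull << 8)          /* SH[1:0] = 0b11: inner shareable */
#define DESC_AF          (1ull << 10)           /* access flag, avoids AF faults   */
#define L1_BLOCK_SHIFT   30                     /* each level-1 block covers 1 GB  */

/* Assumed MAIR slots: 0 holds a Device type, 2 holds Normal write-back. */
enum { ATTR_IDX_DEVICE = 0, ATTR_IDX_NORMAL_WB = 2 };

/* Build one level-1 block descriptor mapping the 1 GB region containing pa. */
static uint64_t l1_block(uint64_t pa, unsigned attr_idx)
{
    uint64_t d = (pa & ~((1ull << L1_BLOCK_SHIFT) - 1))
               | DESC_AF | DESC_ATTRIDX(attr_idx) | DESC_BLOCK;
    if (attr_idx != ATTR_IDX_DEVICE)
        d |= DESC_SH_INNER;     /* shareability only set for Normal memory here */
    return d;
}

/* Fill a table so that VA == PA; entries [ram_first, ram_first + ram_count)
 * are marked Normal, everything else Device. */
static void build_flat_l1(uint64_t *table, size_t entries,
                          size_t ram_first, size_t ram_count)
{
    for (size_t i = 0; i < entries; i++) {
        int is_ram = (i >= ram_first) && (i < ram_first + ram_count);
        table[i] = l1_block((uint64_t)i << L1_BLOCK_SHIFT,
                            is_ram ? ATTR_IDX_NORMAL_WB : ATTR_IDX_DEVICE);
    }
}
```

With `build_flat_l1(table, 4, 2, 1)`, for instance, the third gigabyte (starting at 0x80000000) is mapped as Normal memory and the rest as Device, so only accesses into that gigabyte may be unaligned.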

Connecting MCU and FPGA at 100Mbit/s Using Ethernet RMII [Part 2]

This is Part 2 of a two-part series on Ethernet RMII. In Part 1 I described my hardware setup and basic Ethernet operation. In this second and final part I will describe the design of the specialized MAC cores I implemented on the FPGA, and present measurements of the throughput and latency the system can achieve.


Connecting MCU and FPGA at 100Mbit/s Using Ethernet RMII [Part 1]

This is Part 1 of the two-part series on Ethernet RMII. Part 2 is also available.

Imagine your application requires a non-standard peripheral controlled by an embedded processor. What options do you have? The peripheral can be implemented in an FPGA; depending on its complexity you can choose an optimal FPGA that fits your budget. Where does the processor go? There are three possibilities: (a) inside the FPGA as a soft core → this increases the cost of the FPGA (a larger device is needed) and complicates the HDL and software design. Or (b) inside the FPGA as a hard core → a nice compact solution, and quite feasible with heterogeneous FPGAs from Xilinx (Zynq) and Altera (SoC). But the cost of these modern devices could still be too high for price-sensitive applications. You must also fit both your software and your HDL into pre-engineered combinations of FPGA and ARM CPU sizes (perhaps a small Cortex-M core would suffice, but you must pay for gigahertz-class Cortex-A cores).

The third option (c) is to use a stand-alone MCU (perhaps not even an ARM) and a standard FPGA. How do you connect them? You are limited to the interfaces offered by the MCU. In modern low-end MCUs (by that I mean the smaller STM32Fxxx devices) you have I2C (400 kbit/s), UART (115 kbit/s), SPI (~10 Mbit/s) and Fast Ethernet (100 Mbit/s). So what about the Ethernet core in the MCU? Could it be used to interface with the FPGA? Sure it can!


A Fistful of Radios

During its pre-Christmas sale, Seeed Studio Bazaar offered these digital radio modules with the nRF24L01+ chip for only US$0.81 each. So I bought 10 of them outright 🙂

Fistful of radios

What would YOU suggest doing with them? Build a wireless flower life-support monitoring network? A mobile voice communications radio system? Retrofit them into talking toasters and robotic vacuum cleaners? Let me know!

Error: jtag status contains invalid mode value – communication failure = SOLVED!

This issue bugged me for a long time, and I finally solved it this evening. Debugging code on my PIP-Watch using my ST-LINK-v2 JTAG debugger was very painful because the debugger software (OpenOCD and GDB) kept failing randomly during debug sessions with a rather cryptic message:

Error: jtag status contains invalid mode value - communication failure
Polling target stm32f1x.cpu failed, GDB will be halted. Polling again in 100ms

I scratched my head, updated firmware in ST-Link, looked at JTAG/SWDIO signals using a scope… But nothing helped.

Finally I found this message in a discussion forum. The problem is the low-power mode! When the CPU core clock is halted, the debugger connection fails and the debug session stops.

I use a low-power mode to halt the CPU clock when the OS is idle: my vApplicationIdleHook() function calls the __WFI() (wait for interrupt) intrinsic.

The solution is either to entirely disable low-power modes, or allow low-power debugging in the DBGMCU_CR register:
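A minimal sketch of such a register write, assuming an STM32F1 device: the three bit positions are those of DBG_SLEEP, DBG_STOP and DBG_STANDBY in DBGMCU_CR, and the CMSIS device header provides masks under the same names. The helper `debug_keep_alive_in_low_power` is a made-up name for this example, and it takes the register address as a parameter so the logic does not depend on a particular header.

```c
#include <stdint.h>

/* DBGMCU_CR bits on STM32F1; the fallbacks below apply only when the
 * CMSIS device header (which defines the same names) is not included. */
#ifndef DBGMCU_CR_DBG_SLEEP
#define DBGMCU_CR_DBG_SLEEP    (1u << 0)  /* keep debug alive in Sleep mode   */
#define DBGMCU_CR_DBG_STOP     (1u << 1)  /* keep debug alive in Stop mode    */
#define DBGMCU_CR_DBG_STANDBY  (1u << 2)  /* keep debug alive in Standby mode */
#endif

/* Hypothetical helper: set the three low-power debug bits and leave
 * all other bits of the register unchanged. */
static inline void debug_keep_alive_in_low_power(volatile uint32_t *dbgmcu_cr)
{
    *dbgmcu_cr |= DBGMCU_CR_DBG_SLEEP
                | DBGMCU_CR_DBG_STOP
                | DBGMCU_CR_DBG_STANDBY;
}
```

On the target this would be called as `debug_keep_alive_in_low_power(&DBGMCU->CR);`, with `DBGMCU` coming from the device header.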


This code must be executed during CPU initialization before any low-power mode is first activated. After this, debugger connections will be kept even in sleep, standby and stop modes. The problem is gone!

Processor Low-power Optimizations in PIP-Watch

Processor Power

The PIP-Watch is a battery-powered device that will be continuously on, hence the average power consumption is one of the most important engineering aspects.

In this post I will go through two simple steps of optimizing CPU power: sleep modes and lowering the clock frequency. In a separate follow-up post we will look into the power consumption of the Bluetooth module.
