Addressing previous oversights

Introduction

Back in November 2024, as I closed my Blink them to death using Embedded Swift presentation at Pragma Conf, I promised to release the source code of the 3 projects I presented before the end of the year. I thought this would be an easy goal to achieve and that I’d finish way sooner.

Of course, the projects had only been prototypes so far, so I wanted to clean up the code and recheck that everything was working as expected. This ended up taking much longer than planned (plus, daily life got in the way, but that’s another story).

In the end, I pushed the code for the third project on the last day of 2024 (and I cheated a bit, as I removed some features compared to what I had in the prototypes). So all 3 projects are now available.

While doing this cleanup, I discovered some loose ends in how I had done things so far, including in some of the code from my previously published posts. In this post, I’ll review those oversights and provide an improved version of what I initially presented.

Memory safety is nice

Swift is a language that provides memory safety, and this is really nice: it catches a whole category of bugs at compile time. But with Embedded Swift, you sometimes need to perform some manual memory management and work without that safety net.

I was bitten by this issue when I hit a very strange bug while working on Swatak, where the LEDs seemed to light up in a completely random fashion. The root cause turned out to be memory corruption.

Let me walk you through a simplified reproduction of the problem.

Based on the Button code that I published in Creating a Swift type for button input on nRF52 - Part 2, here is a bit of code that uses a Context object to store some message and print it out on button press.

@main
struct Main {
  static func main() {
    let context: Context = Context(message: "Hello")
    
    let _ = Button<Context>(gpio: &button, context: context) { _, callback, _ in
      let context = Button<Context>.getContext(callback)
      print("Btn pressed >\(context.message)<")
    }

    while true {
      k_msleep(5000)
    }
  }
}

class Context {
  let message: String

  init(message: String) {
    self.message = message
  }
}

And here is the output of running this on an nRF52840DK board

*** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
*** Using Zephyr OS v3.6.99-100befc70c74 ***
Btn pressed >Hello<
Btn pressed ><

On the first button press, the message is properly printed out, but on the second one, nothing is printed out.

Something strange is going on with Context here. Let’s implement a deinitializer and print a message when it gets deallocated.

  deinit {
    print("Context deinit")
  }

Flashing this version to the board, we see that the message is printed out immediately, even before the button is pressed.

*** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
*** Using Zephyr OS v3.6.99-100befc70c74 ***
Context deinit

Let’s look at the Button code. Below is an extract, keeping only the code related to the context storage.

struct Button<T: AnyObject> {
  var context: T

  init(gpio: UnsafePointer<gpio_dt_spec>, context: T, handle: GpioCallbackHandler?) {
    self.context = context
    
    self.pin_cb_data.pointee.context = Unmanaged.passUnretained(context).toOpaque()

As we have a property in Button that references the context, the latter will be kept alive as long as the button exists.
In addition, we store a pointer to the context in the pin_cb_data structure, and there we use passUnretained so as not to keep an extra reference to the context.
Could it then be that the Button instance itself is getting deallocated?
Let’s add a deinitializer with a print statement to Button (changing it to a class for this test).

  deinit {
    print("Button deinit")
  }

Our output is now

*** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
*** Using Zephyr OS v3.6.99-100befc70c74 ***
Button deinit
Context deinit

And indeed, we can see that the Button instance is deallocated immediately, which in turn triggers the deallocation of the context.
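The same lifetime behaviour can be observed on a desktop with plain Swift, without any board. Below is a minimal sketch of it; Probe, EventLog and the two functions are hypothetical names made up for this illustration, not part of the Button code.

```swift
// Host-side sketch in plain Swift (no Zephyr involved).
// Probe and EventLog are hypothetical helpers for this illustration.

/// Collects events in order, so we can see when deinit runs.
final class EventLog {
    var events: [String] = []
}

final class Probe {
    let name: String
    let log: EventLog

    init(name: String, log: EventLog) {
        self.name = name
        self.log = log
    }

    deinit {
        log.events.append("\(name) deinit")
    }
}

/// Mirrors `let _ = Button(...)`: nothing owns the instance once the
/// statement has been evaluated, so it is released right away.
func discardImmediately(log: EventLog) {
    let _ = Probe(name: "discarded", log: log)
    log.events.append("after discard")
}

/// Keeps the instance alive at least until the closure has returned.
func keepAlive(log: EventLog) {
    let kept = Probe(name: "kept", log: log)
    withExtendedLifetime(kept) {
        log.events.append("while kept")
    }
}
```

In the first function the deinit event is logged before "after discard", which is the desktop equivalent of seeing "Button deinit" at boot. In the second one, withExtendedLifetime makes the intent to keep the instance alive explicit.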

Changing our main code to keep a reference to the button when it is created fixes the issue.

let btn1 = Button<Context>(gpio: &button, context: context) { _, callback, _ in
  let context = Button<Context>.getContext(callback)
  print("Btn pressed >\(context.message)<")
}

*** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
*** Using Zephyr OS v3.6.99-100befc70c74 ***
Btn pressed >Hello<
Btn pressed >Hello<

But this now generates a warning during compilation because the value is not used.

.../ButtonDeinitBug/Main.swift:6:9: warning: initialization of immutable value 'btn1' was never used; consider replacing with assignment to '_' or removing it
 4 |     let context: Context = Context(message: "Hello")
 5 |     
 6 |     let btn1 = Button<Context>(gpio: &button, context: context) { _, callback, _ in
   |         `- warning: initialization of immutable value 'btn1' was never used; consider replacing with assignment to '_' or removing it
 7 |       let context = Button<Context>.getContext(callback)
 8 |       print("Btn pressed >\(context.message)<")

AFAIK there is still no way to silence this single warning in Swift. However, it is possible to rewrite the code in such a way that we retain the button and avoid the warning.

    _ = Unmanaged.passRetained(Button<Context>(gpio: &button, context: context) { _, callback, _ in
      let context = Button<Context>.getContext(callback)
      print("Btn pressed >\(context.message)<")
    })

But this makes the code slightly less readable.
I leave it up to you to decide which is the lesser of the two evils.
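To make it clearer what passRetained does here, below is a small host-side sketch in plain Swift. Tracker and Handle are hypothetical types for this illustration. Note that in the snippet above the Unmanaged value is discarded, so the extra retain is never balanced and the button lives forever, which is what we want for a button that should work until the board is powered off.

```swift
// Host-side sketch in plain Swift. Tracker and Handle are hypothetical.

/// Counts how many Handle instances have been deallocated.
final class Tracker {
    var deinitCount = 0
}

final class Handle {
    let tracker: Tracker

    init(tracker: Tracker) {
        self.tracker = tracker
    }

    deinit {
        tracker.deinitCount += 1
    }
}

/// Returns the deinit count observed while the extra retain is held,
/// and the count observed after it has been released.
func retainedThenReleased(tracker: Tracker) -> (Int, Int) {
    // passRetained adds a +1 that nothing releases automatically.
    let unmanaged = Unmanaged.passRetained(Handle(tracker: tracker))
    let whileRetained = tracker.deinitCount

    // Balancing release: the instance is deallocated here.
    unmanaged.release()
    return (whileRetained, tracker.deinitCount)
}
```

If the release() call were removed, the Handle would simply never be deallocated, which is the behaviour the Button snippet relies on.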

Making it safe by default

We now understand why we saw undefined behaviour and how to prevent it. But the Button code is still unsafe, and we could easily fall into the same trap again later. We should instead design it so that users of this code end up Falling Into The Pit of Success, a concept I heard about in Alex Ozun’s talk at Swift Craft 2024 on Type-Driven Design with Swift.

One safe thing to do is to ensure that, if the button is not retained and gets deallocated, we no longer access the context data, i.e. we make sure its callback closure is not called anymore. And just as the nRF Connect SDK provides a gpio_add_callback function, there’s a corresponding gpio_remove_callback one that we can use here.

By the way, also notice that in our Button initializer, we allocated a pin_cb_data structure

self.pin_cb_data = UnsafeMutablePointer<extended_callback>.allocate(capacity: 1)

which we never deallocated.

Taking care of both of the above issues, we now add a deinitializer to our Button class.

  deinit {
    print("Button deinit")
    gpio_remove_callback(gpio.pointee.port, &pin_cb_data.pointee.callback)
    pin_cb_data.deallocate()
  }

Having reverted our main code to not retain the button and flashing the board, we observe the following output

*** Booting nRF Connect SDK v2.7.0-5cb85570ca43 ***
*** Using Zephyr OS v3.6.99-100befc70c74 ***
Button deinit
Context deinit

So the Button and the Context instances are deallocated at startup. But now, we can press the button as much as we want and nothing will happen. It might not do what we want, but the code is safe from accessing random data or crashing the board.
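The pattern used in that deinitializer (unregister first, then free the memory) can be tried out on a desktop. The sketch below uses a hypothetical CallbackRegistry as a stand-in for the driver’s callback list and a hypothetical FakeButton; neither exists in the SDK or in my projects, they only mimic the shape of the real code.

```swift
// Host-side sketch in plain Swift. CallbackRegistry stands in for the
// driver's callback list; both types are hypothetical.

final class CallbackRegistry {
    private var slots: [UnsafeMutablePointer<Int>] = []

    func add(_ slot: UnsafeMutablePointer<Int>) {
        slots.append(slot)
    }

    func remove(_ slot: UnsafeMutablePointer<Int>) {
        slots.removeAll { $0 == slot }
    }

    /// Simulates a button press: touches every registered slot and
    /// returns how many handlers were reached.
    func fire() -> Int {
        for slot in slots {
            slot.pointee += 1
        }
        return slots.count
    }
}

final class FakeButton {
    private let registry: CallbackRegistry
    private let slot: UnsafeMutablePointer<Int>

    init(registry: CallbackRegistry) {
        self.registry = registry
        // Manually allocated storage, like pin_cb_data in Button.
        slot = UnsafeMutablePointer<Int>.allocate(capacity: 1)
        slot.initialize(to: 0)
        registry.add(slot)
    }

    deinit {
        // Unregister before freeing, so the registry never holds a
        // pointer to deallocated memory.
        registry.remove(slot)
        slot.deinitialize(count: 1)
        slot.deallocate()
    }
}
```

With this in place, a FakeButton that is dropped right after creation leaves the registry empty, so fire() reaches no handler instead of writing through a dangling pointer.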

Class or struct

Our Button started as a struct and we have now made it a class, only because we needed a deinit method. An alternative would have been to make the struct non-copyable, as this also supports deinit. Which one makes the most sense?

If you consider the Button type as a representation of the physical button, I would argue that there should only be one user of this button at a time. This aligns nicely with the ownership semantics of non-copyable structs.

But if you consider the Button type as representing the handler associated with the button press event, it does not really hurt to have multiple references to it and a class would work fine.

At this point, I’m only raising the point so you can consider this aspect in your own design. I’ll address this topic in more detail in a future post.
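In the meantime, here is a rough host-side sketch of what the non-copyable variant could look like. It assumes a toolchain with support for non-copyable types (Swift 5.9 or later); ExclusiveButton and ReleaseCounter are hypothetical names, and this is not the design I ended up with, just an illustration of the language feature.

```swift
// Host-side sketch in plain Swift 5.9+. Both types are hypothetical.

/// Counts how many buttons have been torn down.
final class ReleaseCounter {
    var count = 0
}

struct ExclusiveButton: ~Copyable {
    let counter: ReleaseCounter

    init(counter: ReleaseCounter) {
        self.counter = counter
    }

    /// Borrows the button; ownership stays with the caller.
    func press() -> String {
        "pressed"
    }

    // Non-copyable structs can declare a deinit, like classes.
    deinit {
        counter.count += 1
    }
}

func useButton(counter: ReleaseCounter) -> String {
    let button = ExclusiveButton(counter: counter)
    // `let other = button` would move ownership to `other`;
    // using `button` after that line would not compile.
    return button.press()
}
```

The deinit runs exactly once, when the single owner goes out of scope, which matches the idea of one user per physical button.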

Better understanding the underlying SDK

In a previous post, Controlling an LED using Embedded Swift on nRF52, I created a Led struct with simple on, off and toggle methods, controlling one of the on-board LEDs on the nRF52840 DK board. As it is one of the default LEDs for the board, its low-level definition is part of the SDK (in a device tree configuration file).

For the Swatak project mentioned above, I slightly changed the code, so that I could have a state property to proxy for the LED status.

  var state: Bool {
    didSet {
      gpio_pin_set_dt(gpio, state ? 1 : 0)
    }
  }

I mostly kept the same initializer code as in the original example, except I added the initialization of the state variable.

init(gpio: UnsafePointer<gpio_dt_spec>, state: Bool = false) {
  self.gpio = gpio
  self.state = state

  gpio_pin_configure_dt(gpio, GPIO_OUTPUT | GPIO_OUTPUT_INIT_HIGH | GPIO_OUTPUT_INIT_LOGICAL)
}

And with the blinking LED example, I did not notice anything wrong.
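One detail of this code is worth keeping in mind: Swift does not call didSet for assignments made inside a type’s own initializer. So `self.state = state` in the initializer never reaches gpio_pin_set_dt, and the starting level of the pin depends only on the flags passed to gpio_pin_configure_dt. The host-side sketch below illustrates this with a hypothetical FakeLed that records writes instead of driving a pin.

```swift
// Host-side sketch in plain Swift. FakeLed is hypothetical and records
// the values that the real code would pass to gpio_pin_set_dt.

struct FakeLed {
    /// Every value that would have been written to the pin.
    private(set) var writes: [Int] = []

    var state: Bool {
        didSet {
            writes.append(state ? 1 : 0)
        }
    }

    init(state: Bool = false) {
        // Assignments in the type's own initializer bypass didSet,
        // so nothing is recorded here.
        self.state = state
    }
}
```

Creating a FakeLed with `state: true` leaves writes empty; only a later assignment such as `led.state = false` records a value.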

Invalid initial state of the physical LED

However, when I started using the code in Swatak, the LEDs were all on, even though the default state value is false.

In reusing old example code, I did not pay attention to the parameters of the gpio_pin_configure_dt() function. But of course, they matter.
Looking at the definition of the GPIO_OUTPUT_INIT_HIGH constant and its companion GPIO_OUTPUT_INIT_LOW provides some useful information.

/* Initializes output to a low state. */
#define GPIO_OUTPUT_INIT_LOW    (1U << 18)

/* Initializes output to a high state. */
#define GPIO_OUTPUT_INIT_HIGH   (1U << 19)

So, by always passing GPIO_OUTPUT_INIT_HIGH to the gpio_pin_configure_dt() call, the LED always started on, regardless of the initial state value.
The problem can be fixed by adapting the initializer and passing the proper constant based on the initial state.

init(gpio: UnsafePointer<gpio_dt_spec>, state: Bool = false) {
  self.gpio = gpio
  self.state = state

  gpio_pin_configure_dt(gpio, GPIO_OUTPUT
                       | (state ? GPIO_OUTPUT_INIT_HIGH : GPIO_OUTPUT_INIT_LOW)
                       | GPIO_OUTPUT_INIT_LOGICAL)
}
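The flag selection can be checked on a desktop with a small sketch. The two INIT values below are the ones quoted above; the values for GPIO_OUTPUT and GPIO_OUTPUT_INIT_LOGICAL are assumptions for this illustration, and the GpioFlag namespace and outputFlags function are hypothetical names.

```swift
// Host-side sketch in plain Swift. The constants mirror the C macros;
// `output` and `initLogical` use assumed values for this illustration.

enum GpioFlag {
    static let output: UInt32 = 1 << 17       // assumed value of GPIO_OUTPUT
    static let initLow: UInt32 = 1 << 18      // GPIO_OUTPUT_INIT_LOW
    static let initHigh: UInt32 = 1 << 19     // GPIO_OUTPUT_INIT_HIGH
    static let initLogical: UInt32 = 1 << 20  // assumed value of GPIO_OUTPUT_INIT_LOGICAL
}

/// Builds the configuration flags so that the initial pin level
/// follows the requested initial state.
func outputFlags(initiallyOn: Bool) -> UInt32 {
    GpioFlag.output
        | (initiallyOn ? GpioFlag.initHigh : GpioFlag.initLow)
        | GpioFlag.initLogical
}
```

Exactly one of the two INIT bits is set in the result, depending on the requested state, instead of INIT_HIGH being set unconditionally.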

When the physical and virtual world don’t agree

For both the TrafficLight and Swatak projects, I needed to use external GPIO pins to which I had attached my own LEDs. This meant using a custom device tree configuration (dts) file.

So I opened the original dts file for the board (boards/nordic/nrf52840dk/nrf52840dk_nrf52840.dts in the Zephyr SDK source) and looked up how it defined its LED.

led0: led_0 {
	gpios = <&gpio0 13 GPIO_ACTIVE_LOW>;
	label = "Green LED 0";
};

I copied that snippet to my own project and adapted it for the GPIO pin my LED is connected to.

greenled: green_led {
	gpios = <&gpio1 10 GPIO_ACTIVE_LOW>;
	label = "LED P1.10";
};

And the LED behavior I was seeing did not make any sense. After a moment, I understood that the LED visual state was the opposite of the actual state variable value. Why is that?

Let’s again turn to the constant definitions.

/** GPIO pin is active (has logical value '1') in low state. */
#define GPIO_ACTIVE_LOW         (1 << 0)
/** GPIO pin is active (has logical value '1') in high state. */
#define GPIO_ACTIVE_HIGH        (0 << 0)

The above definition, using the GPIO_ACTIVE_LOW constant, means that the pin is driven high (there’s voltage on it) when the logical value is 0, and driven low (no voltage) when the logical value is 1.

Looking at the nRF52840 DK Hardware user guide (always a good idea) and more specifically the Buttons and LEDs section, one can read

The LEDs are active low, meaning that writing a logical zero (0) to the output pin turns on the LED.

So the above configuration is appropriate for how the on-board LEDs are wired on the development board, but not for how my external LEDs are wired. Once I understood that, I could adapt the configuration.

greenled: green_led {
	gpios = <&gpio1 10 GPIO_ACTIVE_HIGH>;
	label = "LED P1.10";
};

The LEDs now behaved as expected.
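The relationship between the logical value, the polarity flag and what the LED actually does can be summed up in a few lines of plain Swift. This is only a model for reasoning about it: Polarity, physicalLevel and ledIsLit are hypothetical names, and the `litWhenPinHigh` parameter encodes an assumption about the wiring (true for an LED that lights when its pin is driven high, as I assume for my external LEDs, false for the on-board ones).

```swift
// Host-side sketch in plain Swift; a model of the polarity logic,
// not SDK code. All names are hypothetical.

enum Polarity {
    case activeHigh
    case activeLow
}

/// Returns true when the pin is physically driven high.
func physicalLevel(logical: Bool, polarity: Polarity) -> Bool {
    switch polarity {
    case .activeHigh:
        return logical
    case .activeLow:
        return !logical
    }
}

/// Returns true when the LED is visibly on, given how it is wired.
func ledIsLit(state: Bool, polarity: Polarity, litWhenPinHigh: Bool) -> Bool {
    physicalLevel(logical: state, polarity: polarity) == litWhenPinHigh
}
```

With `.activeLow` and `litWhenPinHigh: true`, the LED is lit exactly when state is false, which is the inverted behaviour I was seeing. Switching to `.activeHigh` makes the LED follow state.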

Info

Device tree configuration is a complex topic that goes far beyond the scope of this post.
I find the Practical Zephyr post series on the Memfault Interrupt blog a very detailed, yet approachable and very well written source of information on the topic.

Conclusion

Although the goal of Embedded Swift is to let us use a familiar language and easily transpose the work we’ve been doing on iOS, macOS or other platforms to the embedded world, it is not yet that straightforward.

At this stage, we still need to accept that the underlying system has its own rules, and take the time to understand them.

I’m hopeful that as the platform matures, the community will develop higher-level abstractions, allowing developers to distance themselves from some of the lower-level details. I’d like to believe that some of the articles I write and the code I publish help contribute to that evolution.