
Torizon OTA

 

Article updated at 26 Aug 2020





Warning: Torizon OTA is a project currently under development at Toradex Labs. It is still an experimental project in its early stages and is subject to change without notice. This might impact new releases and/or iterations.

Introduction

TorizonCore is built with OSTree and Aktualizr. The former is a shared library and suite of command line tools that combines a "git-like" model for committing and downloading bootable filesystem trees with a layer for deploying them and managing the bootloader configuration. The latter is a "daemon-like" open-source implementation of the Uptane SOTA standard that secures updates end to end.

OSTree and Aktualizr are complementary and together they form the foundation for OTA (over-the-air) update capabilities on the device.

The device portion of Torizon OTA reuses what the Linux microPlatform and meta-updater provide. You can find out more about the OTA strategy on the foundries.io Blog.

On the server side, Toradex is working on a cloud-based hosted option as well as an on-premise option to provide a complete OTA solution that works with TorizonCore. This is currently a work in progress; subscribe to developer website updates to keep track of it. Meanwhile, you can Update Your Device Using HERE OTA Connect for early testing.

This article complies with the Typographic Conventions for Torizon Documentation.

OSTree

OSTree has its own article. Please refer to OSTree for a brief overview and a demonstration of how to use it.

Uptane

Uptane is a de facto automotive SOTA standard, held by a non-profit consortium named Uptane Alliance under the IEEE/ISTO Federation. Its focus is to enable secure and resilient over-the-air software updates. It relies on multiple servers to provide security by validating data before a download starts, ensuring that even an offline attack that compromises a single server is not enough to compromise the security of the system. Uptane is an enhancement of TUF (The Update Framework), a security framework that is widely used to secure software and package updates on computers and smartphones. The motivations for extending TUF are described in detail on the Uptane Design page, and a helpful explanation of TUF is given in the documentation page Understand the Notary service architecture.
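The idea of requiring agreement between more than one server can be illustrated with a minimal Python sketch. It assumes two metadata sources (called director and image repository here) that each describe a target by length and SHA-256 hash. The dictionary layout and function names are hypothetical, and signature verification, which a real Uptane client must perform, is deliberately left out.

```python
import hashlib

def metadata_agrees(director_meta: dict, image_meta: dict) -> bool:
    """Check, before any download, that both sources describe the same target.

    Hypothetical layout: {"length": int, "sha256": hex string}.
    A real client would first verify the signatures on both metadata sets.
    """
    return (director_meta["length"] == image_meta["length"]
            and director_meta["sha256"] == image_meta["sha256"])

def payload_matches(payload: bytes, meta: dict) -> bool:
    """Check, after the download, that the payload matches the metadata."""
    return (len(payload) == meta["length"]
            and hashlib.sha256(payload).hexdigest() == meta["sha256"])

# Usage: one compromised source alone cannot get a payload accepted.
good = b"rootfs-v2"
honest = {"length": len(good), "sha256": hashlib.sha256(good).hexdigest()}
evil = b"malware"
forged = {"length": len(evil), "sha256": hashlib.sha256(evil).hexdigest()}

accepted = metadata_agrees(honest, honest) and payload_matches(good, honest)
rejected = metadata_agrees(forged, honest)  # sources disagree, no download
```

In this sketch a forged entry on one source is rejected before the download starts, because the second source still describes the genuine target.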

Aktualizr

Aktualizr is the client implementation of Uptane. It is written in C++ and its responsibility is to communicate with an Uptane-compatible server. It checks whether new updates are available, installs those updates on the system and reports the status to the server, while guaranteeing the integrity and confidentiality of OTA updates. Aktualizr handles Docker image updates seamlessly by using Docker Compose yml files.

Rollback

There can be cases where the system fails to boot, or the boot process is considered unsuccessful, due to either a kernel panic or a failure to start a critical user space application. Developers can handle such issues during development, but they become a serious problem when a deployed system is affected by a bad update. This can be avoided if the following mechanisms are present:

  • kernel reboots on panic or hung task
  • the system is able to detect a failure of any critical user space operation
  • capability to configure the bootloader to boot a usable image in case of any failure

Rollback and bootcount support of aktualizr

The automatic rollback feature relies on aktualizr’s rollback support as well. TorizonCore uses aktualizr with rollback_mode set to uboot_masked (see aktualizr client configuration options). This enables aktualizr’s U-Boot bootcount integration: after an update, aktualizr enables boot counting by setting the upgrade_available U-Boot environment variable to 1. In the error case, when the system reboots due to a kernel or service failure, aktualizr is not started. After three tries, U-Boot rolls back the system by setting the U-Boot environment variable rollback to 1. If the system has been rolled back, aktualizr does not mark the boot as successful (the system stays in rollback mode). If the system has booted successfully, upgrade_available and bootcount are set back to 0 to let the boot loader know that boot counting is no longer necessary.
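The aktualizr side of this handshake can be sketched as two small functions operating on a dictionary that stands in for the U-Boot environment. This is a simplified Python model for illustration only, not aktualizr's actual code; the function names are hypothetical, and only the variable names upgrade_available, bootcount and rollback come from the description above.

```python
def enable_boot_counting(env: dict) -> None:
    """Model of what happens right after an update has been installed."""
    env["upgrade_available"] = "1"

def mark_boot(env: dict) -> str:
    """Model of what happens once aktualizr is started after a boot.

    Returns a label describing the outcome (labels are illustrative).
    """
    if env.get("rollback") == "1":
        # Rolled back: the boot is not marked as successful,
        # so the system stays in rollback mode.
        return "rolled-back"
    # Successful boot: tell the boot loader that counting can stop.
    env["upgrade_available"] = "0"
    env["bootcount"] = "0"
    return "success"

# Usage: an update that boots fine on the first try.
env_ok = {"bootcount": "0"}
enable_boot_counting(env_ok)
env_ok["bootcount"] = "1"        # U-Boot counted one boot attempt
result_ok = mark_boot(env_ok)

# Usage: an update that was rolled back by U-Boot.
env_bad = {"upgrade_available": "1", "bootcount": "4", "rollback": "1"}
result_bad = mark_boot(env_bad)
```

In the failing case the variables are left untouched, which mirrors the statement that a rolled-back boot is not marked as successful.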

Note: When installing an update without aktualizr (e.g. using ostree admin directly), automatic rollback will not work. To use automatic rollback in a pure OSTree system, these steps need to be executed manually, as described in the OSTree article.

Implementation

The implementation makes use of the following:

  • OSTree
  • U-Boot’s bootcount support
  • Automatic reboot on kernel panic and service failure
  • aktualizr’s U-Boot bootcount integration

Figure: Torizon Update Workflow

TorizonCore’s OTA allows rolling back to the last installed update thanks to its OSTree-based root file system. It also allows keeping multiple deployments (kernel/initramfs/device-tree and the rootfs) on a system and having them bootable. The initial (factory) image has only a single deployment available, which is assumed to be a working deployment (no rollback can be done at this point). After the first update has been rolled out, there are two deployments on the system at all times. If a new deployment fails, the system automatically rolls back to the previous deployment.
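The deployment bookkeeping described above can be modelled in a few lines of Python. This is a toy model under stated assumptions: the class and method names are hypothetical, and swapping the two entries on rollback is a simplification of this sketch, not a description of how OSTree manages deployments internally.

```python
class Deployments:
    """Toy model: one factory deployment, then always two after an update."""

    def __init__(self, factory: str):
        self.current = factory
        self.previous = None      # nothing to roll back to on a factory image

    def update(self, new: str) -> None:
        # The running deployment becomes the fallback for the new one.
        self.previous, self.current = self.current, new

    def rollback(self) -> None:
        if self.previous is None:
            raise RuntimeError("factory image: no previous deployment")
        # Assumption of this sketch: the two deployments simply swap roles.
        self.current, self.previous = self.previous, self.current

# Usage
d = Deployments("factory")
d.update("v2")
d.rollback()          # v2 failed to boot, fall back to the factory image
```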

U-Boot bootcount

The U-Boot bootcount feature provides a simple boot counter and an alternative boot command. The alternative boot command is stored in altbootcmd and boots into the previous OSTree deployment. If boot counting is enabled and the boot counter exceeds a predefined limit, the alternative boot command is executed. The boot limit is defined by the bootlimit environment variable, which is set to 3 by default. The boot count is held in the U-Boot environment on the on-module eMMC/raw NAND flash. To avoid too much wear, boot counting is only enabled after an update, controlled by the upgrade_available environment variable.
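The decision U-Boot makes on each boot can be sketched in Python as follows. This is a simplified model, not U-Boot source code: the dictionary stands in for the U-Boot environment, the function name is hypothetical, and the increment-then-compare order is an assumption of the sketch.

```python
def select_boot_command(env: dict) -> str:
    """Return the name of the command that would be run for this boot."""
    if env.get("upgrade_available") != "1":
        # Counting is disabled: the environment is not written,
        # which avoids unnecessary flash wear.
        return "bootcmd"

    count = int(env.get("bootcount", "0")) + 1
    env["bootcount"] = str(count)
    limit = int(env.get("bootlimit", "3"))

    # Once the counter exceeds the limit, boot the previous deployment.
    return "altbootcmd" if count > limit else "bootcmd"

# Usage: an update that never boots successfully.
env = {"upgrade_available": "1", "bootlimit": "3"}
commands = [select_boot_command(env) for _ in range(4)]
# commands == ["bootcmd", "bootcmd", "bootcmd", "altbootcmd"]
```

With the default limit of 3, the new deployment gets three attempts before the alternative boot command is selected on the fourth boot.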

Automatic Reboot on Failure

The current implementation relies on software to reboot in case of a serious failure. The TorizonCore kernel has the CONFIG_PANIC_TIMEOUT configuration option enabled, which reboots the system automatically on a kernel panic. In user space, the default configuration assumes docker.service to be the crucial service. TorizonCore uses systemd’s FailureAction to tell systemd to reboot the system if starting the service fails.
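Both mechanisms can be made more concrete with a small Python sketch. The first function interprets the value of the kernel.panic sysctl (readable at /proc/sys/kernel/panic), whose default is set by CONFIG_PANIC_TIMEOUT: 0 means the kernel waits forever, any other value leads to a reboot. The second renders the text of a systemd drop-in that sets FailureAction in the [Unit] section. The function names are hypothetical, the list of accepted actions is only a subset of what systemd supports, and the output is an illustration, not the configuration TorizonCore actually ships.

```python
def panic_triggers_reboot(panic_sysctl: str) -> bool:
    """Interpret the kernel.panic value: 0 waits forever, non-zero reboots."""
    return int(panic_sysctl.strip()) != 0

def render_failure_dropin(action: str = "reboot") -> str:
    """Render a drop-in that asks systemd to act when the unit fails."""
    allowed = {"none", "reboot", "reboot-force", "reboot-immediate"}
    if action not in allowed:
        raise ValueError(f"unsupported FailureAction in this sketch: {action}")
    return f"[Unit]\nFailureAction={action}\n"

# Usage
reboots = panic_triggers_reboot("5\n")   # e.g. reboot 5 seconds after a panic
dropin = render_failure_dropin()
```

On a device, the current panic timeout could be inspected by passing the contents of /proc/sys/kernel/panic to the first function.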

Note: This setup relies on Linux' and systemd’s cooperation in the failure case. Since these two software projects are fairly well tested and the features relied upon are fairly small, the risk of the reboot failing is fairly small.

Boot Assessment - systemd

systemd offers automatic boot assessment for UEFI-based systems. Since Toradex modules do not use UEFI, the systemd boot assessment does not directly apply to them. However, we reuse some aspects of the automatic boot assessment. In particular, boot-complete.target is used as the synchronization point between the services that are required for the system boot to be considered successful and the service that marks the system boot as successful. By default, we order docker.service before boot-complete.target, and aktualizr.service after boot-complete.target.
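The effect of this ordering can be illustrated with a topological sort. The Python sketch below uses the standard library's graphlib module (Python 3.9 or later) and models only the two ordering relations mentioned above; it does not read real unit files, and the dictionary is an assumption made for illustration.

```python
from graphlib import TopologicalSorter

# Map each unit to the set of units that must be started before it.
ordered_after = {
    "boot-complete.target": {"docker.service"},
    "aktualizr.service": {"boot-complete.target"},
}

start_order = list(TopologicalSorter(ordered_after).static_order())
# start_order == ["docker.service", "boot-complete.target", "aktualizr.service"]
```

Because aktualizr.service comes last, it is only reached when docker.service has started, so a boot is only marked as successful after the crucial service is up.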