Flash diagnostics and health monitoring for NOR memory

In embedded systems, where failure is not an option, NOR flash devices storing boot code, firmware images, and critical application data are subject to gradual wear over their operational lifetime. That wear is not invisible; it’s reflected in internal device registers accessible at runtime without the need for external test equipment. Per-sector erase cycle counts, single-bit and double-bit error correcting code (ECC) event counters, and hardware-accelerated cyclic redundancy check (CRC) integrity results collectively form a health profile.

This profile covers user-defined address ranges and is updated continuously as the system operates. On certain device variants, an on-board temperature sensor provides confirmation that the device is running within its rated thermal envelope. These are not fault flags that fire after something has gone wrong. They are observable quantities whose value lies in being monitored over time.

The central premise of flash diagnostics is a shift from reactive fault handling to proactive health monitoring. Fault handlers consult device status only when an operation fails.

In contrast, diagnostics applications read the same registers on a schedule, build a time series, and watch for early indicators of wear. Early warning paves the way for preventative maintenance, fixing impending problems before they trigger failure.
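The scheduled-read pattern can be sketched in a few lines. This is a minimal illustration, not the SEMPER Diagnostics Library API: `read_sector` is a hypothetical callable standing in for whatever driver call returns a sector's cumulative erase count and single-bit ECC count, and `HealthSample`, `SectorHistory`, and `poll_once` are names invented for this example.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class HealthSample:
    timestamp: float      # seconds, from whatever clock the system uses
    erase_count: int      # cumulative P/E cycles reported for the sector
    ecc_single_bit: int   # cumulative corrected single-bit events


class SectorHistory:
    """Bounded per-sector time series; the oldest samples drop off first."""

    def __init__(self, max_samples: int = 1024) -> None:
        self._max_samples = max_samples
        self._series: Dict[int, Deque[HealthSample]] = {}

    def record(self, sector: int, sample: HealthSample) -> None:
        if sector not in self._series:
            self._series[sector] = deque(maxlen=self._max_samples)
        self._series[sector].append(sample)

    def series(self, sector: int) -> List[HealthSample]:
        return list(self._series.get(sector, ()))


def poll_once(
    read_sector: Callable[[int], Tuple[int, int]],  # hypothetical driver hook
    sectors: Iterable[int],
    history: SectorHistory,
    now: float,
) -> None:
    """Take one scheduled reading of every monitored sector."""
    for sector in sectors:
        erase_count, ecc_single_bit = read_sector(sector)
        history.record(sector, HealthSample(now, erase_count, ecc_single_bit))
```

A scheduler or periodic task would call `poll_once` at a fixed interval. Bounding each series keeps memory use predictable on small targets.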

Reading wear, ECC, CRC, and thermal trends over time

Program/erase cycle counts per sector are the most direct measure of wear. Flash arrays in real-world applications are not erased uniformly. Sectors holding frequently updated data, such as fault codes logged by an automotive ECU or a partition used for over-the-air updates, accumulate cycles at a much higher rate than sectors holding static firmware.

Some NOR flash memories, such as Infineon’s SEMPER NOR Flash, offer built-in wear leveling that distributes P/E cycles across the full address range. A diagnostics application that periodically tracks per-sector counts can identify uneven wear early and give the system the information it needs to act, whether by redistributing write activity or by flagging sectors approaching their service limit.
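One way to turn per-sector counts into something actionable is to compare each sector against the array's median and against the rated endurance. The sketch below is an assumption-laden illustration: `rated_cycles` must come from the device datasheet, and the `hot_ratio` and `limit_fraction` defaults are arbitrary placeholders, not recommended values.

```python
from statistics import median
from typing import Dict, List


def wear_report(
    erase_counts: Dict[int, int],   # sector index -> cumulative erase count
    rated_cycles: int,              # endurance figure taken from the datasheet
    hot_ratio: float = 4.0,         # placeholder: "hot" means >= 4x the median
    limit_fraction: float = 0.8,    # placeholder: flag at 80% of rated endurance
) -> Dict[str, object]:
    """Flag sectors that wear much faster than typical or near their limit."""
    if not erase_counts:
        return {"median": 0, "hot": [], "near_limit": []}

    typical = median(erase_counts.values())
    hot: List[int] = sorted(
        s for s, c in erase_counts.items()
        if typical > 0 and c >= hot_ratio * typical
    )
    near_limit: List[int] = sorted(
        s for s, c in erase_counts.items()
        if c >= limit_fraction * rated_cycles
    )
    return {"median": typical, "hot": hot, "near_limit": near_limit}
```

The median is used rather than the mean so that a handful of heavily cycled sectors does not pull the baseline up and hide itself.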

ECC event counts add a sensitivity that cycle counts alone cannot provide. Single-bit events are corrected transparently by on-chip logic and produce no visible effect on system operation, but their rate carries information about how individual cells are aging. A sector whose single-bit event rate begins to rise is showing early signs of cell wear, something the cycle count alone may not yet reflect.

When this trend is observed, rewriting the sector contents to restore cell charge state is one response the diagnostics system can initiate. To minimize the impact on system performance, the rewrite can be scheduled during low-activity periods. Whether and at what threshold to trigger a refresh is a configurable decision. Double-bit events represent a harder boundary: the device detects them but cannot correct them, and their occurrence is recorded with sector address and timestamp for subsequent analysis.
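A refresh decision of this kind reduces to a rate estimate plus a gate on system activity. The sketch below assumes the application already holds a list of timestamped cumulative single-bit counts for a sector. The function names and the idea of a single rate threshold are illustrative choices, and the threshold value itself is left to the integrator.

```python
from typing import List, Tuple


def ecc_rate_per_hour(samples: List[Tuple[float, int]]) -> float:
    """Average single-bit event rate over the window.

    samples: (timestamp_seconds, cumulative_single_bit_count), oldest first.
    """
    if len(samples) < 2:
        return 0.0
    (t0, c0), (t1, c1) = samples[0], samples[-1]
    if t1 <= t0:
        return 0.0
    return (c1 - c0) * 3600.0 / (t1 - t0)


def should_refresh(
    samples: List[Tuple[float, int]],
    rate_threshold_per_hour: float,  # configurable; no default is implied
    system_idle: bool,               # refresh only in low-activity periods
) -> bool:
    """True when the sector's ECC rate is high and the system can afford it."""
    return system_idle and ecc_rate_per_hour(samples) >= rate_threshold_per_hour
```

Using the first and last sample gives an average over the whole window. A production design might prefer a sliding window or a fitted slope to react faster to a recent rise.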

CRC integrity checks over defined address ranges complement the bit-level view ECC provides, catching consistency issues that fall outside the scope of individual ECC words. For example, CRC is often used to validate a full firmware image region after an OTA update completes. Thermal readings, where available, confirm whether the device has been operating within its rated temperature range. This data helps in evaluating whether observed ECC trends reflect normal aging or accelerated cell wear from sustained thermal stress.
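The post-update check can be illustrated in software. In the sketch below, the standard CRC-32 from Python's `zlib` module stands in for the device's hardware CRC engine, whose polynomial and command interface are not described here. `region_crc32` and `region_is_intact` are hypothetical helper names, and the image is assumed to be available as a byte buffer.

```python
import zlib


def region_crc32(image: bytes, start: int, length: int, chunk: int = 4096) -> int:
    """CRC-32 over image[start:start+length], computed chunk by chunk."""
    if start < 0 or length < 0 or start + length > len(image):
        raise ValueError("region lies outside the image")
    crc = 0
    end = start + length
    for offset in range(start, end, chunk):
        # zlib.crc32 accepts the running value, so chunks can be chained.
        crc = zlib.crc32(image[offset:min(offset + chunk, end)], crc)
    return crc & 0xFFFFFFFF


def region_is_intact(image: bytes, start: int, length: int, expected_crc: int) -> bool:
    """Compare the computed CRC with the value shipped alongside the update."""
    return region_crc32(image, start, length) == expected_crc
```

Chunked computation keeps the working buffer small, which matters when the region is a multi-megabyte firmware image read back from flash.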

Diagnostics across AUTOSAR, Linux, and bare metal

The same NOR flash device frequently appears in multiple ECU variants within a single vehicle platform, each running a different software environment. A diagnostics software module such as SEMPER Diagnostics Library can be configured to span this portfolio, covering AUTOSAR Classic and Adaptive, Linux, QNX, RTOS, and bare-metal environments without changing the underlying health monitoring logic. What differs between environments is only the integration surface.

In AUTOSAR, the diagnostics module fits as a complex device driver. Positioned above the memory hardware abstraction layer, it accesses device-specific commands and register reads that the standard flash driver interface does not expose, while making its outputs available to upper-layer software components through defined RTE ports.

Figure 1 This is how the SEMPER Diagnostics Library software architecture operates in an AUTOSAR environment. Source: Infineon

In a POSIX environment such as Linux or QNX, the same logic runs in user space and issues health queries through the IOCTL mechanism on an extended driver. Where the system is a heterogeneous SoC, a diagnostics agent in the real-time domain writes health query results to a shared memory region. A counterpart Linux user-space process then reads the results through a character device, packages them with device identification and timestamps, and routes them to a storage destination.
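The user-space half of that hand-off amounts to decoding a fixed-size record and tagging it. The record layout below is entirely hypothetical (five little-endian fields plus two pad bytes) and is not taken from any Infineon interface. It only shows how a reader process might unpack what the real-time agent wrote and attach a device ID and timestamp.

```python
import struct
import time
from typing import Dict, Optional, Union

# Hypothetical 20-byte record, little-endian:
#   u32 sector, u32 erase_count, u32 ecc_single_bit, u32 ecc_double_bit,
#   i16 temperature in tenths of a degree Celsius, 2 pad bytes
RECORD = struct.Struct("<IIIIhxx")


def unpack_record(raw: bytes) -> Dict[str, Union[int, float]]:
    """Decode one health record as written by the real-time agent."""
    sector, erase_count, ecc1, ecc2, temp_tenths = RECORD.unpack(raw)
    return {
        "sector": sector,
        "erase_count": erase_count,
        "ecc_single_bit": ecc1,
        "ecc_double_bit": ecc2,
        "temperature_c": temp_tenths / 10,
    }


def package(raw: bytes, device_id: str, now: Optional[float] = None) -> Dict[str, object]:
    """Attach device identification and a timestamp before routing."""
    record: Dict[str, object] = dict(unpack_record(raw))
    record["device_id"] = device_id
    record["timestamp"] = time.time() if now is None else now
    return record
```

In a real system, `raw` would come from a `read()` on the character device. A fixed-size, explicitly little-endian layout avoids alignment and endianness surprises between the two processing domains.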

Within Linux, the Memory Technology Device (MTD) subsystem is the integration point for the flash driver, and IOCTL commands on an extended MTD driver are the mechanism by which device-specific health metrics cross the user-space boundary without touching standard read/write paths. On bare-metal or RTOS systems, the library links directly with the memory driver and is scheduled by the task manager.

In the case of SEMPER NOR Flash, SEMPER Diagnostics Library provides the diagnostic data, and the user is free to log it to local flash, route it to the cloud, store it in an external database, or send it to any other destination that fits the system architecture. Similarly, fleet-connected deployments can route the same data off-device for population-level analysis. The underlying algorithms are identical across all environments; only the integration scaffolding differs.

Diagnostics library: Architecture and demo

Figure 2 Integration examples are shown for the SEMPER Diagnostics Library module across different software environments. Source: Infineon

Figure 3 The demo setup is running SEMPER Diagnostics Library on Linux (Raspberry Pi) while showing Erase Count and ECC Errors per sector. Source: Infineon

The SEMPER NOR Flash diagnostics software dashboard, shown below, visualizes per-sector erase counts and ECC counts in real time, along with device metadata—Device ID, Chip Size, Protocol, ECC State, Address Mode, Page Size—giving engineers a turnkey view of the flash health profile without requiring custom tooling.

Figure 4 The diagnostics software dashboard visualizes per-sector erase counts and ECC counts in real time. Source: Infineon

Fleet telemetry and predictive maintenance

Health metrics tagged with a unique device identifier and correlated with vehicle operating history become qualitatively more useful at scale. Patterns invisible at the level of a single device become apparent when data from a large population is examined holistically.

For example, a link between a specific duty cycle profile and accelerated sector wear may look like chance in a single device but emerge as a systematic pattern when considered in aggregate. This is the difference between diagnosing a device that has already failed and identifying a population that may fail while every unit in it is still functioning normally.
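A first pass at this kind of population-level comparison needs nothing more than grouping and a robust summary statistic. The sketch below assumes each fleet record carries a `profile` label and an `erase_rate_per_day` value. Both field names are invented for illustration, and a real analysis would add sample-size checks and proper statistics.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, List, Mapping


def median_wear_by_profile(fleet: Iterable[Mapping[str, object]]) -> Dict[str, float]:
    """Median daily erase rate per duty cycle profile across the fleet."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for unit in fleet:
        groups[str(unit["profile"])].append(float(unit["erase_rate_per_day"]))
    return {profile: median(rates) for profile, rates in groups.items()}
```

A profile whose median rate sits well above the others is a candidate for closer inspection, even if no individual unit in it has reported a fault.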

Estimating useful lifetime also benefits from the same accumulated data. A static model applying a single worst-case endurance figure will produce overly conservative estimates. SEMPER Diagnostics Library’s adaptive lifetime estimation concept goes further: observed erase count progression, ECC event rates, and thermal history enable a per-device estimate that reflects how that specific unit has been used. Fleet-level pattern recognition can refine the estimate further by identifying trajectories that have historically preceded reliability events.
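The gap between a static and a per-device estimate can be shown with simple arithmetic. The sketch below is not the library's estimation method. It is a plain linear extrapolation of erase count only, with `rated_cycles` and the worst-case rate supplied by the caller as assumptions, and it ignores the ECC and thermal inputs the adaptive concept also draws on.

```python
from typing import List, Optional, Tuple


def static_remaining_days(current_count: int, rated_cycles: int,
                          worst_case_cycles_per_day: float) -> float:
    """Static model: assume every unit wears at the worst-case rate."""
    return max(0.0, (rated_cycles - current_count) / worst_case_cycles_per_day)


def observed_remaining_days(samples: List[Tuple[float, int]],
                            rated_cycles: int) -> Optional[float]:
    """Per-device model: extrapolate this unit's own erase count history.

    samples: (day, cumulative_erase_count) for the most-worn sector,
    oldest first. Returns None when no wear rate can be derived yet.
    """
    if len(samples) < 2:
        return None
    (d0, c0), (d1, c1) = samples[0], samples[-1]
    if d1 <= d0 or c1 <= c0:
        return None
    rate = (c1 - c0) / (d1 - d0)          # observed cycles per day
    return max(0.0, (rated_cycles - c1) / rate)
```

With a history of 1,000 cycles on day 0 and 3,000 on day 100, and an assumed endurance of 100,000 cycles, the observed rate of 20 cycles per day gives 4,850 remaining days. A static model that assumes 200 cycles per day would report 485 days for the same unit.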

Act before wear

NOR flash devices save a continuous stream of health data in their internal registers, yet most systems discard it. Per-sector erase counts, ECC event trends, CRC integrity results, and thermal confirmation collectively describe how a device is aging under its actual operating conditions. The information is available at runtime, and no additional hardware is required to harvest it. The longer it is collected, the more valuable it becomes.

Working with hardware such as SEMPER NOR memory, a diagnostics framework such as SEMPER Diagnostics Library captures this data consistently across AUTOSAR, Linux, and bare-metal environments, routes it across processing domain boundaries, and makes it available for both on-device response and population-level analysis.

This gives engineers advance notice to act before wear affects system reliability. In applications where that lead time separates a scheduled maintenance event from an unplanned failure, the case for building it in from the start is clear.

Saurabh Tripathi is senior applications engineer at Infineon Technologies.

The post Flash diagnostics and health monitoring for NOR memory appeared first on EDN.