NVRAM Over off-chip SPI-NOR Flash

This was the first time in 12 years that binary search trees were needed in microcontroller programming.

In this text, we will talk about how you can build an efficient software implementation of a non-volatile Key-Value Map (ki) over a cheap SPI NOR Flash for microcontroller projects. The point is simple. Need NVRAM.

An abstract data structure is required. Key-value. The key is a 32-bit integer. As data – a binary array of arbitrary length.

I already had a text about a simple, compact, naive, slow O (n) on-chip NOR Flash NVRAM https://habr.com/ru/articles/706972/, which generally works fine when 16kByte is available …128kBytes. Now it was necessary to create a full-fledged fast O(Log2(n)) NVRAM for off-chip NOR Flash(s) with sizes of 4….16 MByte.

Requirements for a great NVRAM

1–The working principle of NVRAM. You give a virtual 32bit address, data, data size, write data. You turn off the power. You turn on the power. You give the virtual address, you receive the data and their size.

2–The previous data with the ADDR key should remain if the power is lost at the time of writing the next block.

3–Can store different size data. For example, 512 bytes or more in one entry.

4–Data must be retained when power is lost.

5–Knot must be inserted quickly

6–The node must be quickly located

7–The node must be removed quickly

8–There must be no empty spaces in memory. Records follow one after another, back to back. No alignment whatsoever.

9–Data must be equipped with CRC16. This will allow you to identify events such as power failure at the time of recording the node, and simply distinguish the frame from random numbers.

10–Data must be stored encrypted. This will protect the data if someone decides to glue the chip with orange Kapton tape and blow off the SPI-NOR Flash (a) chip with a hot air gun for later reading the dump (a) by another microcontroller and parsing the data types inside.

11–Data can be compressed. That is, to be compressed. This will allow more bytes of payload(s) to be loaded into NOR-Flash.

Features of working with SPI NOR flash

As a physical storage number, ordinary cheap SPI NOR Flash chips will act. For example MX25R6435F

In NOR Flash, when NOR Flash is written, the bits change according to the following law.

That is, a bit zero is considered a record. A separate zeroed bit cannot be cocked back to one. You have to erase the whole page. And this is about 4kByte.

In fact, the task of developing NVRAM is more difficult than it seems. Dozens of iterations of debugging and diagnostics will be required. Here you need effective methods for debugging software.

To debug a purely software component, you must first sprinkle the SPI-Nor Flash simulator on a PC. As an SPI-Flash, it is easy to define the process’s RAM array. All the logic of work must be debugged on NetTop(e) and then with confidence to build this code on the Tagret platform. Plus, this DeskTop utility will come in handy if you want to compile an image of the finished file system as a hex array, and then flash it into flash along with the firmware.

Also, such tasks as the development of NVRAM are perfectly covered by unit tests. This is where the TDD methodology fits in perfectly. Therefore, we will immediately prepare as many tests as possible. These same tests will act as documentation and braces for refactoring and will help you understand when it’s time to stop developing NVRAM code.

The main idea of implementing NVRAM on SPI NOR Flash

First, let’s define terminology.

Term	Term	Definition
header	header	Payload metadata: preamble, address, CRC, etc.
Payload	payload	The useful binary data itself that needs to be stored
frame	frame	Header followed by useful data
frame	skeleton	The structure of a binary search tree. Pointers to the right and left leaves of a binary tree.

What API do we want to get? Probably something simple like this:

#ifndef NVRAM_DRIVER_H
#define NVRAM_DRIVER_H

#include <stdbool.h>
#include <stdint.h>

#include "sw_nvram_types.h"

bool nvram_write(NvRamItem_t* Node, uint32_t address,const uint8_t* const data, uint32_t size);
bool nvram_read(NvRamItem_t* Node, uint32_t address, uint8_t * const data, uint32_t* const size);
bool nvram_delete(NvRamItem_t* Node, uint32_t address);

bool nvram_init(NvRamItem_t* Node);

#endif /*NVRAM_DRIVER_H*/

Now let’s define the frame structure, which we will store in non-volatile memory.

You can see the table enlarged here. https://docs.google.com/spreadsheets/d/1UmJThfkCRQpiLWmO9J47Z5gdim99YF0lR8Lr8KIgXAY/edit#gid=0
In the header, everything is only the most necessary: preamble (parameterizable), address, data, checksum, data type, timestamp, variable state, pointers to leaves. Only 36 bytes per header.

Fine. With the heading plus / minus decided. But how to store the frames themselves then?

New nodes are always written to the end of the NOR Flash memory page. Close, close to the last frame written. No alignment so as not to waste precious bytes.

Now the important point. The main idea of NVRAM is to use binary tree search (BST). A binary search tree, among other things, is good in this case because pointers to the right and left nodes in each node are written only once. This is well within the physical limitations of SPI-NOR Flash(a). The same 32-bit data address will act as a key. We do NVram. And RAM works on the principle: “you give the address, you get the data.”

Inserting data into NVRAM.

Before inserting, you need to check if there is already such data in NOR Flash (e). In this case, the CRC16 checksum in each Frame (e) will help us. There is no point in posting the same. This is contrary to the requirement to take care of memory wear. This technology is called lazy write (lazy notation).

This is necessary for uniform wear of NOR Flash chips. Inserting data into NVRAM is reduced to inserting into a binary tree. Before inserting, you need to check that the tree already has such an address. It can be with data of a different size and a different checksum. If yes, then it should be marked as obsolete. After that, just insert a new leaf into the tree and mark the inserted frame as new. It is green in the picture.

Here is an example of node organization

Tree structure visualization with nodes removed (grey)

here is an example of inserting a pair of UART-CLI frames with the nvrw command

Node search

Searching for a node is reduced to searching within a binary search tree. The 32-bit address of a variable in NVRAM acts as a key. The search time is limited by the maximum height of the binary tree.

Deleting a node

To delete an entry in NVRAM, you need to go through all nodes with the Addr address and reset to zero the bit that is responsible for the activity of the address. You should also reset the data that corresponds to this node. At the same time, the address itself and the pointers should be kept as is, so that there is a connection with those valid nodes that are larger and smaller than the remote node. The framework must be preserved.

NVRAM based on a binary search tree (BST) is good because it stores not only data, but also the history of working with data. When created, what was removed. All this will remain in the short term until the moment of overflow and page switching. You can also easily traverse all nodes in ascending or descending order. The tree is searched quickly.

Who erases NOR Flash?

All memory that is available to the NVRAM driver will be divided into 2 equal pages: active and passive. The active page is the one in which the variables will be written. The passive page is the temporary storage that will be needed when rebuilding the tree to clean up obsolete and removed headers.

How does the NVRAM driver know which page is active?

At the very beginning of the NVRAM pages there will be one double word (qword).

Interpretation	Meaning
Active page	0xFEFEFEF
passive page	0xFCFCFCFC
Just a piece of memory that hasn’t been used yet	0xFFFFFFFF

When power is applied, the NVRAM driver in the nvram_init () function will check these numbers and understand which page to work with today. From the first or from the second. Here is the LookUp decision table in the nvram_init() function

When the active page overflows, the driver cleans up the passive page and writes to the passive page all the most new variables from the active page. A new tree will be formed without deleted and obsolete frames.

Only the most recent entries of the binary tree are copied to the new page.

In this case, when recopying the NVRAM page, it is necessary to traverse the binary tree in the order level order. Thus, in the opposite page, a tree of the same height in relation to active nodes will be formed as in the original page. It is very important. Since if, when transferring frames, the original tree is bypassed in order in order, then not a tree but a long linked list will be formed in the new page and this will slow down the time of all operations.

Ideally, one should generally come up with an algorithm for the streaming formation of a balanced tree in a new page when switching pages, provided that leaf pointers cannot be rewritten, but this topic is very difficult, it will pull on a master’s study of some European university (sheep).

It turns out that the page erase operation is a rather rare operation that occurs only before half of the entire NOR-Flash is filled. This is what protects the memory from wear and tear.

At the same time, the number of page switches can also be stored right there in NVRAM in a variable with a certain reserved virtual address and predict that wonderful day when NOR Flash fails. For example, if there are 80k switches, then you can issue a warning to the UART console.

Software Dependencies

NVRAM cannot operate in a vacuum. NVRAM is an add-on for the driver of some NOR-Flash (a). Be it on or off chip. You also need a checksum calculation component, functions that return time (UTC or Up-Time), and classic code for servicing a binary search tree.

Conclusion

So, solely thanks to the software, from almost disposable cheap memory at a price of 20 RUR/MByte, we got a reusable expensive memory at a price of 2000 RUR/MByte (100 times more expensive)! Magic!

To work with NVRAM, you need to navigate the following acronyms

Acronym	Decryption
NVRAM	Non-volatile random-access memory
BST	Binary search tree
RAM	random access memory
API	Application Programming Interface
SPI	Serial Peripheral Interface
PC	personal computer
TDD	Test driven development
NOR	not or

Links

Website for visualizing the construction of abstract data structures
https://www.cs.usfca.edu/~galles/visualization/Algorithms.html

Simple NVRAM for on-chip NOR Flash https://habr.com/ru/articles/706972/

Graph rendering site https://dreampuf.github.io/GraphvizOnline

Node structure for NVRAM https://docs.google.com/spreadsheets/d/1UmJThfkCRQpiLWmO9J47Z5gdim99YF0lR8Lr8KIgXAY/edit#gid=0

https://habr.com/ru/articles/730232/

https://habr.com/ru/articles/731604/

Control questions:
1–What are binary tree traversals?

2–What is the algorithmic complexity of finding a node in a binary search tree?

3–What is the algorithmic complexity of inserting a node into a binary search tree?

4–How to debug a large piece of code, if there is no way (do not want to) walk through the code with a step-by-step debugger?

5–To page “A” arbitrary binary search tree. How to draw the equivalent balanced binary search tree on page “B”? Restrictions: The tree on page B must be drawn on the first try, in one go. The tree on page “A” cannot be changed. The RAM memory is 300 times smaller than the size of the tree on page “A”.