State of NAND MTD driver at start of project | Yaffs - A Flash File System for embedded use

Current state of NAND MTD

The current NAND MTD supports basic read/write functions. The code is hard-wired to support the NAND devices on a SPIA board. The ECC code supports 256 byte sectors.

There are 3 files:

nand.c: NAND support algorithms. Unfortunately there is no hardware abstraction (ie. the hardware is assumed to be the SPIA board), but this is relatively easy to add.

spia.c: this initialises the SPIA hardware to set up the addresses for the hardware access.

ecc.c: ECC for 256 byte pages.

What needs to be done at this level:

Hardware abstraction of nand.c. This allows this code to be used with different hardware configurations.
Support 512-byte pages. The larger memory sizes use 512-byte pages, and the memory sizes that use 256-byte pages are obsolete. For ECC, a 512-byte page is treated as two sub-pages of 256 bytes each. The Samsung ECC code is calculation based (ie. maybe slow). The Toshiba code is table based and probably faster. Check whether the two are equivalent.
Write a software emulation for a NAND device. The purpose of the software emulator is:
- Allow debugging without the need for hardware.
- NAND file systems need to be fault tolerant. Real hardware can't be forced to fail on demand to test out bad block detection/handling etc. A software emulator can be forced to fail.
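
The hardware abstraction and the fault-injecting emulator described above could share one interface. The following is a minimal sketch, not the existing driver code: `struct nand_hw_ops`, `struct nand_emu`, the `emu_*` functions and all the sizes are hypothetical names and values chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NAND_PAGE_SIZE   512  /* data bytes per page */
#define NAND_SPARE_SIZE  16   /* management bytes per page */
#define PAGES_PER_BLOCK  32   /* assumed erase-block size */
#define EMU_BLOCKS       8    /* tiny emulated device */
#define EMU_PAGES        (EMU_BLOCKS * PAGES_PER_BLOCK)

/* Hypothetical hardware interface: nand.c would call through these
 * pointers instead of assuming the SPIA board. */
struct nand_hw_ops {
    int (*read_page)(void *ctx, uint32_t page, uint8_t *data, uint8_t *spare);
    int (*write_page)(void *ctx, uint32_t page, const uint8_t *data,
                      const uint8_t *spare);
    int (*erase_block)(void *ctx, uint32_t block);
};

/* RAM-backed emulator state, including a simple fault-injection knob. */
struct nand_emu {
    uint8_t data[EMU_PAGES][NAND_PAGE_SIZE];
    uint8_t spare[EMU_PAGES][NAND_SPARE_SIZE];
    int32_t fail_block;  /* block whose writes and erases fail, -1 for none */
};

static void emu_init(struct nand_emu *e)
{
    assert(e != NULL);
    memset(e->data, 0xFF, sizeof e->data);   /* erased flash reads as 0xFF */
    memset(e->spare, 0xFF, sizeof e->spare);
    e->fail_block = -1;
}

static int emu_read_page(void *ctx, uint32_t page, uint8_t *data, uint8_t *spare)
{
    struct nand_emu *e = ctx;
    if (page >= EMU_PAGES)
        return -1;
    memcpy(data, e->data[page], NAND_PAGE_SIZE);
    memcpy(spare, e->spare[page], NAND_SPARE_SIZE);
    return 0;
}

static int emu_write_page(void *ctx, uint32_t page, const uint8_t *data,
                          const uint8_t *spare)
{
    struct nand_emu *e = ctx;
    size_t i;
    if (page >= EMU_PAGES)
        return -1;
    if ((int32_t)(page / PAGES_PER_BLOCK) == e->fail_block)
        return -1;                           /* injected write failure */
    /* Programming NAND can only clear bits (1 -> 0), so AND into place. */
    for (i = 0; i < NAND_PAGE_SIZE; i++)
        e->data[page][i] &= data[i];
    for (i = 0; i < NAND_SPARE_SIZE; i++)
        e->spare[page][i] &= spare[i];
    return 0;
}

static int emu_erase_block(void *ctx, uint32_t block)
{
    struct nand_emu *e = ctx;
    if (block >= EMU_BLOCKS)
        return -1;
    if ((int32_t)block == e->fail_block)
        return -1;                           /* injected erase failure */
    memset(e->data[block * PAGES_PER_BLOCK], 0xFF,
           PAGES_PER_BLOCK * NAND_PAGE_SIZE);
    memset(e->spare[block * PAGES_PER_BLOCK], 0xFF,
           PAGES_PER_BLOCK * NAND_SPARE_SIZE);
    return 0;
}

/* The emulator is just one implementation of the interface. */
static const struct nand_hw_ops emu_ops = {
    emu_read_page, emu_write_page, emu_erase_block
};
```

A test would set `fail_block` to some block number and then check that the layer above notices the failed write or erase and handles the block as bad. A SPIA implementation would fill in the same three pointers with real hardware accesses.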

Overall flash system diagram

A diagrammatic representation of support for SmartMedia, NAND and NOR as well as both FAT16 and JFFS2 might look like the above.

The NAND flash sub-system uses common layers for:

hardware access
ECC, bad block handling and low-level formatting (based on the SmartMedia low-level specs). By using the same low-level formatting and ECC as much as possible we gain some benefit in testing, reuse etc. The low-level formatting would differ in that the fields used to store the LBA, and the reserved fields, might be used for something else (journaling tags?).

Concerns with using JFFS2 on NAND

There are some reasons why JFFS2 and NAND might not work together that well.

Two particular concerns regarding booting:

NAND devices are typically much larger than NOR (eg. 128MB is currently a large NAND chip while 16MB would be considered a large NOR chip), thus the data structures associated with flash management are likely to be much larger. Also, JFFS2 scans the media on boot to build data structures representing the nodes in the system. This increases the boot time.
NAND is not random access but must be read serially. This means that the scanning routine can't just merrily skip through the flash; instead the whole flash must be read and ECC'd first. This slows scanning and hence booting.

Thus booting a 128MB NAND-based JFFS2 is likely to look something like this:

Read and ECC every last byte on the device, and digest the contents, even if we only need a few bytes per page for the journaling headers. At, say, 5MB/second this will take ~25 seconds.
Each node requires a 16-byte structure (struct jffs2_raw_node_ref). If nodes were 512 bytes each, then 128MB / 512 bytes = 262,144 nodes, and at 16 bytes per node that's 4MB of RAM!
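
The two figures above are simple arithmetic, and it can be handy to redo them for other device sizes. A small sketch, where the helper names `scan_seconds` and `node_ref_bytes` are made up for this example and the 5MB/second rate and 512-byte node size are the same guesses used above:

```c
#include <assert.h>
#include <stdint.h>

/* Seconds needed to read the whole device at a given throughput. */
static double scan_seconds(uint64_t device_bytes, uint64_t bytes_per_second)
{
    assert(bytes_per_second > 0);
    return (double)device_bytes / (double)bytes_per_second;
}

/* RAM needed for per-node bookkeeping: one reference structure per node. */
static uint64_t node_ref_bytes(uint64_t device_bytes, uint32_t node_bytes,
                               uint32_t ref_bytes)
{
    assert(node_bytes > 0);
    return (device_bytes / node_bytes) * ref_bytes;
}
```

For a 128MB device, `scan_seconds(128 << 20, 5 << 20)` gives 25.6 seconds and `node_ref_bytes(128 << 20, 512, 16)` gives 4,194,304 bytes, ie. 4MB.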

Both costs are probably unacceptable for the target product. We therefore need to run something better than this.

Two approaches spring to mind:

Just run FAT16, ie. just treat all NAND as SmartMedia. Benefit: one code base. Downside: reliability.
Use a journaling file system. I'm reluctant to store critical files in a FAT16 file system because it can be so unreliable, and I'm convinced that a journaling file system is superior. Maybe what we need is a lightweight journaling file system (using common code with JFFS), modified in such a way as to sidestep some of the nasty issues raised above.

What I'm working towards

Right now I'm investigating the idea of trying to do something more appropriate. My current avenue of exploration is to use a journaling approach as follows:

The journaling records are stored in fixed-size 512-byte pages.
The journaling tags are stored in the 16-byte-per-page "management space", semi-consistent with SmartMedia, using the 8 bytes of reserved space and LBA space to store journaling tags. ie something like:
20 bits of page id within inode
20 bits of inode id
9 bits length (bytes used in this page).
7 bits flags etc.
8 bits of hamming code on this area.
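
The five fields add up to 20 + 20 + 9 + 7 + 8 = 64 bits, which is exactly the 8 bytes available. One possible way to pack them is sketched below. The struct and function names, the field order and the byte order are all assumptions made for the example; nothing here is a settled on-flash format. The Hamming code is stored as given rather than computed, and since 9 bits only hold 0..511, how a completely full 512-byte page is encoded is left open.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of the journaling tags. */
struct journal_tag {
    uint32_t page_id;   /* 20 bits: where the page fits into the file */
    uint32_t inode_id;  /* 20 bits */
    uint16_t length;    /*  9 bits: bytes used in this page */
    uint8_t  flags;     /*  7 bits */
    uint8_t  ecc;       /*  8 bits: Hamming code on this area (not computed here) */
};

/* Pack into 8 bytes, most significant byte first (assumed order). */
static void tag_pack(const struct journal_tag *t, uint8_t out[8])
{
    uint64_t w = 0;
    int i;
    assert(t != NULL && out != NULL);
    w |= (uint64_t)(t->page_id  & 0xFFFFF) << 44;  /* bits 63..44 */
    w |= (uint64_t)(t->inode_id & 0xFFFFF) << 24;  /* bits 43..24 */
    w |= (uint64_t)(t->length   & 0x1FF)   << 15;  /* bits 23..15 */
    w |= (uint64_t)(t->flags    & 0x7F)    << 8;   /* bits 14..8  */
    w |= (uint64_t)t->ecc;                         /* bits  7..0  */
    for (i = 0; i < 8; i++)
        out[i] = (uint8_t)(w >> (56 - 8 * i));
}

/* Reverse of tag_pack. */
static void tag_unpack(const uint8_t in[8], struct journal_tag *t)
{
    uint64_t w = 0;
    int i;
    assert(in != NULL && t != NULL);
    for (i = 0; i < 8; i++)
        w = (w << 8) | in[i];
    t->page_id  = (uint32_t)(w >> 44) & 0xFFFFF;
    t->inode_id = (uint32_t)(w >> 24) & 0xFFFFF;
    t->length   = (uint16_t)((w >> 15) & 0x1FF);
    t->flags    = (uint8_t)((w >> 8) & 0x7F);
    t->ecc      = (uint8_t)(w & 0xFF);
}
```

Shifting and masking by hand, rather than using C bit-fields, keeps the layout independent of compiler and endianness, which matters for anything written to flash.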

Each page id identifies where the page fits into the file. When a page is replaced (ie file data is overwritten) a new page with the same page id is written and the old one is wiped out. During garbage collection, the pages containing valid data are copied and the wiped out data is discarded.
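
The replace-then-wipe behaviour and the garbage collection step can be modelled in a few lines. This is only a toy model under assumed names and sizes (`flash_model`, `store_page`, `collect_block`, 4 blocks of 4 pages); it tracks page states only and holds no real data.

```c
#include <assert.h>
#include <stdint.h>

#define GC_PAGES_PER_BLOCK 4
#define GC_BLOCKS          4
#define GC_PAGES           (GC_PAGES_PER_BLOCK * GC_BLOCKS)

enum page_state { PAGE_FREE = 0, PAGE_VALID, PAGE_WIPED };

struct page_slot {
    enum page_state state;
    uint32_t inode_id;
    uint32_t page_id;
};

struct flash_model {
    struct page_slot pages[GC_PAGES];  /* zero-initialised means all free */
};

/* Index of the valid page holding (inode_id, page_id), or -1 if none. */
static int find_valid(const struct flash_model *f, uint32_t inode_id,
                      uint32_t page_id)
{
    int i;
    for (i = 0; i < GC_PAGES; i++) {
        const struct page_slot *p = &f->pages[i];
        if (p->state == PAGE_VALID && p->inode_id == inode_id &&
            p->page_id == page_id)
            return i;
    }
    return -1;
}

/* First free page outside skip_block (-1 allows any block), or -1. */
static int alloc_free(const struct flash_model *f, int skip_block)
{
    int i;
    for (i = 0; i < GC_PAGES; i++)
        if (f->pages[i].state == PAGE_FREE &&
            i / GC_PAGES_PER_BLOCK != skip_block)
            return i;
    return -1;
}

/* Write or overwrite a page: write the new copy, then wipe the old one. */
static int store_page(struct flash_model *f, uint32_t inode_id,
                      uint32_t page_id)
{
    int old = find_valid(f, inode_id, page_id);
    int fresh = alloc_free(f, -1);
    if (fresh < 0)
        return -1;                       /* no free page left */
    f->pages[fresh].state = PAGE_VALID;
    f->pages[fresh].inode_id = inode_id;
    f->pages[fresh].page_id = page_id;
    if (old >= 0)
        f->pages[old].state = PAGE_WIPED;
    return fresh;
}

/* Copy valid pages out of a block, then erase it. Returns pages copied,
 * or -1 if there was nowhere to copy to. */
static int collect_block(struct flash_model *f, int block)
{
    int first = block * GC_PAGES_PER_BLOCK;
    int copied = 0, i;
    assert(block >= 0 && block < GC_BLOCKS);
    for (i = first; i < first + GC_PAGES_PER_BLOCK; i++) {
        int dst;
        if (f->pages[i].state != PAGE_VALID)
            continue;                    /* wiped and free pages are dropped */
        dst = alloc_free(f, block);
        if (dst < 0)
            return -1;
        f->pages[dst] = f->pages[i];
        f->pages[i].state = PAGE_WIPED;
        copied++;
    }
    for (i = first; i < first + GC_PAGES_PER_BLOCK; i++)
        f->pages[i].state = PAGE_FREE;   /* block erase */
    return copied;
}
```

Writing the new page before wiping the old one means that an interruption between the two steps leaves two copies rather than none; a real implementation would still need a rule for choosing between them at scan time.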

The benefits of this approach are:

NAND-friendly since it works within the write limitations of NAND.
Less stuff to read to reconstruct file structures (only read management data).

Down sides:

No compression. But is that an issue, considering that the target data is already compressed and NAND is far less precious than NOR (thanks to higher densities and lower price)?

Still need to see if a new data structure is going to reduce the overhead of jffs2_raw_node_ref somewhat.