Re: [Yaffs] Bad Block definition

Author: Charles Manning
Date:  
To: Qi Wang 王起 (qiwang)
CC: yaffs@lists.aleph1.co.uk
Subject: Re: [Yaffs] Bad Block definition
On Fri, May 9, 2014 at 2:06 PM, Qi Wang 王起 (qiwang) wrote:

> Hi Charles,
>
> The best way to do this might require information that is not available via
> standard mtd drivers.
>
> So that means we cannot do any optimization for YAFFS2 at the current stage?
>


That is correct if they are using the standard mtd interface.



> Is it possible to add REFRESH_NEEDED to YAFFS2, with some comments noting
> that YAFFS2 does not implement it itself, but that customers can modify it
> to suit their specific driver and NAND flash?
>
> Or could we give users a configuration option in the Kconfig file that lets
> them select whether a block with bit flips is recorded and marked as a bad block?
>
> Can you share your ideas? I can make a patch and submit it to you.
>


There are some people who completely replace the mtd layer with custom NAND
drivers. Some of that is because the layering in mtd can make some
operations slower than they need to be.

If people are prepared to write their own drivers then they can make the
information available that would be needed by this logic.


>
>
> Frankly speaking, many Micron customers run into this bad block definition
> issue, as bit flips are very common in MLC NAND. The issue can be worse in
> devices that are seldom powered down, such as set-top boxes. So I think it is
> necessary to let users know they can improve/optimize this point. What
> do you think?
>


What I know some people do is use a different threshold.
For example, they will say 0..2 flips = OK, 3..n = FIXED.

What is generally observed is that the distribution of flips on non-bad
blocks tends to be low.
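
As a rough illustration of the threshold idea, here is a minimal sketch of how a custom driver that knows the corrected bit flip count might map it onto the levels discussed in this thread. Everything in it is hypothetical: the enum, `struct flip_policy`, `classify_flips()` and `policy_from_cap()` are made-up names and are not part of Yaffs or mtd. The one-third / two-thirds rule is the suggestion quoted further down, and the right numbers will depend on the flash part.

```c
/* Hypothetical levels, named after the ones proposed in this thread.
 * These are NOT the existing Yaffs ECC result values. */
enum flip_class {
    FLIP_NO_ERROR,
    FLIP_REFRESH_NEEDED,
    FLIP_FIXED_BUT_SUSPECT,
    FLIP_UNFIXED
};

/* Hypothetical per-device policy chosen by the driver writer/integrator. */
struct flip_policy {
    unsigned refresh_min;   /* fewest corrected flips that trigger a refresh */
    unsigned suspect_min;   /* fewest corrected flips that make the block suspect */
};

/* Derive thresholds from the ECC capability: "exceeds one third" means
 * refresh, "exceeds two thirds" means suspect (integer arithmetic). */
struct flip_policy policy_from_cap(unsigned max_ecc_cap)
{
    struct flip_policy p;

    p.refresh_min = max_ecc_cap / 3 + 1;
    p.suspect_min = (2 * max_ecc_cap) / 3 + 1;
    return p;
}

/* Classify one page read. 'uncorrectable' is assumed to come from the
 * ECC engine; 'flips' is the number of bits it corrected. */
enum flip_class classify_flips(const struct flip_policy *p,
                               unsigned flips, int uncorrectable)
{
    if (uncorrectable)
        return FLIP_UNFIXED;
    if (flips >= p->suspect_min)
        return FLIP_FIXED_BUT_SUSPECT;
    if (flips >= p->refresh_min)
        return FLIP_REFRESH_NEEDED;
    return FLIP_NO_ERROR;
}
```

With a policy of `{ 3, 3 }` this collapses to the two-level 0..2 = OK, 3..n = FIXED scheme above, and `{ 2, 4 }` gives the 0 or 1 / 2 or 3 / 4 or more split. None of this helps with standard mtd drivers that only report that a correction happened.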




> Thanks
>
>
>
> *From:* Charles Manning [mailto:cdhmanning@gmail.com]
> *Sent:* Friday, May 09, 2014 9:43 AM
> *To:* Qi Wang 王起 (qiwang)
>
> *Cc:*
> *Subject:* Re: [Yaffs] Bad Block definition
>
>
>
>
>
>
>
> On Fri, May 9, 2014 at 1:30 PM, Qi Wang 王起 (qiwang) wrote:
>
> Hi Charles,
>
> Yes, I totally agree with you. But one question is how to distinguish REFRESH_NEEDED
> and FIXED_BUT_SUSPECT.
>
>
>
>
>
> I think a good method is to distinguish by bit flip level. For example,
> define max_ecc_cap as the maximum ECC capability of a NAND flash. If the
> bit flips in a block exceed one third of max_ecc_cap, consider the block
> REFRESH_NEEDED and refresh it. If the bit flips in a block exceed
> two thirds of max_ecc_cap, consider the block FIXED_BUT_SUSPECT.
> Normally the bit flip count increases by one or a few bits at a time, so
> a block is refreshed when it reaches one third of max_ecc_cap and
> should never get to two thirds of max_ecc_cap. If a block does reach two
> thirds of max_ecc_cap, we can consider it suspect.
>
> But at present the MTD layer read function only returns -EUCLEAN to YAFFS2,
> so YAFFS2 cannot tell how many bit flips occurred.
>
> So can you share your opinion on how to distinguish REFRESH_NEEDED
> and FIXED_BUT_SUSPECT?
>
>
>
> It will be up to the driver/integrator to determine this. There is no way
> Yaffs can figure out the policy because the right thing to do will depend
> on the flash and what information is exposed.
>
> Some flash parts (& controllers) provide a lot of information (eg. number
> of flipped bits). In that case you can make some determination in software
> (eg. 0 or 1 = OK, 2 or 3 = REFRESH NEEDED, 4 or more = FIXED BUT SUSPECT).
>
> On other flash types there is no such info (or there are not enough bits
> to do anything useful). In that case the driver writer can choose.
>
> The best way to do this might require information that is not available via
> standard mtd drivers.
>
>
>
> Thanks
>
>
>
>
>
> *From:* Charles Manning [mailto:cdhmanning@gmail.com]
>
> *Sent:* Friday, May 09, 2014 7:45 AM
> *To:* Qi Wang 王起 (qiwang)
> *Cc:*
> *Subject:* Re: [Yaffs] Bad Block definition
>
>
>
> This issue has been discussed a few times and raises some interesting
> issues.
>
> Yes, some flash parts say to consider bit flips as not bad - just refresh
> - but the other side to this is that if blocks experience MANY bit flips
> they are maybe going bad and we want to catch the problem before data is
> lost.
>
> It might be time to introduce a third level of error. ie something like:
>
> * NO_ERROR : No problems.
>
> * REFRESH_NEEDED: (new) Refresh block, don't worry about it going bad.
>
> * FIXED_BUT_SUSPECT: Treated same as FIXED is now, retire the block if it
> does this again.
>
> * UNFIXED
>
> -- Charles
>
>
>
> On Thu, May 8, 2014 at 5:30 PM, Qi Wang 王起 (qiwang) wrote:
>
> Hi,
>
> I see a bad block definition issue in YAFFS2.
>
> In the YAFFS2 read function, when a page has bit flips more than 3 times, the
> block containing this page is marked as a bad block, as in the read function below:
>
> int yaffs_rd_chunk_tags_nand(struct yaffs_dev *dev, int nand_chunk,
>                              u8 *buffer, struct yaffs_ext_tags *tags)
> {
>         int result;
>         struct yaffs_ext_tags local_tags;
>         int flash_chunk = apply_chunk_offset(dev, nand_chunk);
>
>         dev->n_page_reads++;
>
>         /* If there are no tags provided use local tags. */
>         if (!tags)
>                 tags = &local_tags;
>
>         result = dev->tagger.read_chunk_tags_fn(dev, flash_chunk,
>                                                 buffer, tags);
>         if (tags && tags->ecc_result > YAFFS_ECC_RESULT_NO_ERROR) {
>
>                 struct yaffs_block_info *bi;
>                 bi = yaffs_get_block_info(dev,
>                                           nand_chunk /
>                                           dev->param.chunks_per_block);
>                 yaffs_handle_chunk_error(dev, bi);
>         }
>         return result;
> }
>
>
>
> But actually, in NAND flash, only program and erase errors should cause a
> block to be marked bad. Bit flips happen easily after a page has been read
> many times.
>
> If a system uses YAFFS2 and is never powered down, the user will see a
> lot of bad blocks after it has run for a while, but these blocks are not
> really bad blocks.
>
> How about just refreshing the block when a bit flip occurs, without recording
> the bit flip count and marking it as a bad block?
>
> Thanks
>
>
>
>
>
>
>
> Best Regards,
>
> Qi Wang 王起
>
> EBU APAC SE
>
> Tel: 86-021-38997158
>
> Mobile: 86-15201958202
>
> Email:
>
> Address: No 601 Fasai Rd, Pudong, Shanghai, China, 200131
>
>
> _______________________________________________
> yaffs mailing list
>
> http://lists.aleph1.co.uk/cgi-bin/mailman/listinfo/yaffs
>
>
>
>
>