MSC exception

Hi,

Our product uses the nRF52840 as the main chip. We have encountered a strange problem where two folders have the same name and a file name contains illegal characters.

This happens very rarely, so we have not been able to find the root cause so far. As far as we know, the file system does not allow two folders with the same name or a file name with illegal characters. What are the possible causes of this problem?

The normal directory structure is shown below.

-----------------------------------------------------------------------------------------

I would like to provide more details on the MSC issue below.

I have attached the nrf_block_dev_qspi.c that I used; you can diff it against the original file in the SDK.

There are two major modifications.

1. The first addresses the issue reported at this link: https://devzone.nordicsemi.com/f/nordic-q-a/36654/usbd_msd-disk-initialization-fails-in-usb-unplug-with-sdk15-0

The difference is that I added a timeout to the wait_for_idle() function, because I found that it would sometimes hang in this function.
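A bounded wait of this kind might look like the sketch below. It is not the code from the attached file: `dev_work_t`, `wait_for_idle_bounded` and `max_polls` are illustrative names, and the state field is a stand-in for whatever state the block device driver actually tracks. It is written so that it builds on a host for testing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver state; the real driver keeps
 * its own state inside the block device's work structure. */
typedef enum { DEV_STATE_IDLE, DEV_STATE_BUSY } dev_state_t;

typedef struct {
    volatile dev_state_t state; /* updated from the driver's event handler */
} dev_work_t;

/* Poll until the device is idle or max_polls checks have been made.
 * Returns true if idle was reached, false on timeout. */
static bool wait_for_idle_bounded(dev_work_t *p_work, uint32_t max_polls)
{
    assert(p_work != NULL);
    for (uint32_t i = 0; i < max_polls; i++) {
        if (p_work->state == DEV_STATE_IDLE) {
            return true;
        }
        /* On target, sleep or delay briefly here between checks. */
    }
    /* One last check so that max_polls == 0 still reports an idle device. */
    return p_work->state == DEV_STATE_IDLE;
}
```

The caller has to act on a `false` return, for example by reporting an error instead of continuing. Carrying on with init, uninit or a new transfer while the QSPI is still busy could itself be a source of errors.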

2. I added debug messages for troubleshooting.

We sometimes got the debug log (please see debug.log). Although everything looks good otherwise, I think this log output is unexpected.

By the way, there were no debug logs when the issue (two folders with the same name and a file name with illegal characters) occurred.

  • Hi,

    Filenames are stored on the QSPI memory just like file contents, so it is possible that there is a remaining bug in the block_dev library other than what was fixed in the earlier thread. It is not a known issue, though, nor can we pinpoint it at this point.

    Can you say more about how early it occurs, and have you found a way to reliably reproduce it?

  • Hi Einar,

    I have not found a way to reliably reproduce it so far, and it happens very rarely. It may have happened before this thread, but I'm not 100% sure.
    I believe that there is a bug in the block_dev library, because I am sometimes able to capture the debug logs.
    By the way, the latest SDK does not fix the issue mentioned in this thread.

  • We are wondering whether it is possible to rule out the MSC class as responsible for the issue. In your test, where you sometimes manage to get corrupt data, could you make some adjustments in order to narrow it down:

    • prepare/modify a FW that reproduces the issue
    • connect the DK to the PC via the JLink USB port
    • flash the FW
    • let the FW run until the issue occurs, so the FW needs some way of detecting when the corruption happens. Never connect the nRF52840 USBD port to the PC, so that the MSC class never drives the block_dev.

    Is that possible?
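As a rough idea of what such a validation step could check, the sketch below flags a directory listing that contains an illegal name or two entries with the same name, which are the two symptoms reported in this thread. It is an assumption about how the check could be done, not something taken from the SDK: the function names are made up, and on target the names would have to come from the file system, for example by iterating the root directory with FatFs `f_readdir()`.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the name is non-empty and contains no control characters and
 * none of the characters that are never allowed in a file name. */
static bool name_is_legal(const char *name)
{
    if (name[0] == '\0') {
        return false;
    }
    for (const unsigned char *p = (const unsigned char *)name; *p; p++) {
        if (*p < 0x20 || strchr("\\/:*?\"<>|", *p) != NULL) {
            return false;
        }
    }
    return true;
}

/* Case-insensitive comparison, since FAT treats names that differ
 * only in case as the same name. */
static bool names_equal_nocase(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) {
            return false;
        }
        a++;
        b++;
    }
    return *a == *b;
}

/* Returns the index of the first entry that is illegal or that repeats
 * an earlier entry, or -1 if the listing looks sane. */
static int find_bad_entry(const char *const names[], size_t count)
{
    assert(names != NULL);
    for (size_t i = 0; i < count; i++) {
        if (!name_is_legal(names[i])) {
            return (int)i;
        }
        for (size_t j = 0; j < i; j++) {
            if (names_equal_nocase(names[i], names[j])) {
                return (int)i;
            }
        }
    }
    return -1;
}
```

Running a check like this after every write cycle and logging the cycle count on the first failure would show whether corruption can occur without the USBD port ever being connected.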

  • Yes, that is possible.

  • That is good, thank you. That will be very useful input for the team looking into this.

  • Hi Einar,

    I have found a way to reproduce this issue. I tested it with the project below, which I attached before.

    Please replace the nrf_block_dev_qspi.c in the SDK with mine. I use RTT for logging output and tested on nRF5_SDK_17.0.2.

    2626.Project.zip

    1. Press button 3 to format the mass storage.

    2. Press button 1 to create an ACTIVI folder and a TOTALS folder.

    3. Connect the USB to the PC and create a new file as below.

    4. Disconnect the USB.

    5. Connect the USB to the PC again. Windows will then pop up a message as below.

    6. Click this window; another window then pops up as below.

    7. Click "Scan and fix"; the next window pops up.

    8. Click "Repair drive" and wait for the repair to complete.

    9. Create a new file again, and then repeat steps 4 through 9.

    10. After repeating this one or several times, you will find that the file system is corrupted: the Totals folder has changed into a file.

  • Hi Jason,

    I am sorry for the late reply. I have been out of office.

    I see the same as you. I have only been able to reproduce it when following these instructions exactly: writing, then removing the USB cable without ejecting. If I eject first, no error happens. I did not do extensive testing today, but I did not see corruption when using the attached patch for nrf_block_dev_qspi.c instead of yours. There were still errors from Windows, so apparently there was an issue, but no corruption was seen and no issues were logged. (With your patched file I got a lot of "Cannot uninit because QSPI is busy" and "Cannot init because QSPI is busy", but this is addressed by the diff below; see lines 438-445.)

    I did extensive testing before the summer with several thousand write cycles, where the computer wrote a file, ejected, disconnected USB, reconnected, verified, and so on. This did not fail using the nrf_block_dev_qspi.c from the diff in this post. So it seems to me that the only way this can be reproduced after applying the currently known fixes is by writing from a computer and not ejecting before removing the drive.

  • Hi Einar,

    "but no corruption was seen and no issues logged. (with your patched file I got a lot of 'Cannot uninit because QSPI is busy' and 'Cannot init because QSPI is busy', but this is addressed by the diff below; see line 438-445.)"

    Actually, we have already applied this patch in our product code; please see what I wrote before:

    There are two major modifications.

    1. The first addresses the issue reported at this link: https://devzone.nordicsemi.com/f/nordic-q-a/36654/usbd_msd-disk-initialization-fails-in-usb-unplug-with-sdk15-0

    The difference is that I added a timeout to the wait_for_idle() function, because I found that it would sometimes hang in this function.

    I can still reproduce this issue with the patch, using the method I described before:

    I have found a way to reproduce this issue. I tested it with the project below, which I attached before.

    However, the main.c file needs to be modified to create more folders in the root directory.

    --- D:/Project/Nordic SDK/nRF5_SDK_17.0.2_d674dde/examples/peripheral/usbd_msc/main.c
    +++ D:/Project/Nordic SDK/nRF5_SDK_17.0.2_d674dde/examples/peripheral/usbd_msc/main.c
    @@ -505 +505 @@ const char *dir_list[] = {
    -    "Activi", "Totals", /*"Tracks", "Workout",*/
    +    "Activi", "Totals", "Tracks", "Work", "Sport", "sys"

    So there is still an unexpected write operation from the PC. Whether we see corruption depends on whether there is data at the address of the unexpected write.
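One way to judge where a stray write lands is to classify the written sector against the volume layout. The sketch below assumes a FAT12/FAT16 volume and sector numbers counted from the start of the volume (if the disk carries a partition table, the partition offset has to be subtracted first). The struct and function names are made up for illustration; the fields are the standard BIOS Parameter Block values from the boot sector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Subset of the FAT12/16 BIOS Parameter Block needed for the layout. */
typedef struct {
    uint16_t bytes_per_sector; /* BPB_BytsPerSec */
    uint16_t reserved_sectors; /* BPB_RsvdSecCnt */
    uint8_t  num_fats;         /* BPB_NumFATs */
    uint16_t root_entries;     /* BPB_RootEntCnt */
    uint16_t sectors_per_fat;  /* BPB_FATSz16 */
} fat16_bpb_t;

typedef enum {
    REGION_RESERVED, /* boot sector and other reserved sectors */
    REGION_FAT,      /* allocation tables */
    REGION_ROOT_DIR, /* fixed-size root directory (folder and file names) */
    REGION_DATA      /* file contents and subdirectories */
} fat_region_t;

/* Map a volume-relative sector number to the region it belongs to. */
static fat_region_t classify_sector(const fat16_bpb_t *bpb, uint32_t sector)
{
    assert(bpb != NULL && bpb->bytes_per_sector != 0);

    uint32_t fat_start  = bpb->reserved_sectors;
    uint32_t root_start = fat_start +
                          (uint32_t)bpb->num_fats * bpb->sectors_per_fat;
    /* Each root directory entry is 32 bytes; round up to whole sectors. */
    uint32_t root_sectors =
        ((uint32_t)bpb->root_entries * 32u + bpb->bytes_per_sector - 1u) /
        bpb->bytes_per_sector;
    uint32_t data_start = root_start + root_sectors;

    if (sector < fat_start)  return REGION_RESERVED;
    if (sector < root_start) return REGION_FAT;
    if (sector < data_start) return REGION_ROOT_DIR;
    return REGION_DATA;
}
```

Logging the region for each write request received over MSC would show whether the unexpected write hits the root directory, which would fit a folder entry turning into a file, or only unused data sectors, where it would go unnoticed.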
