HARD FAULT during xTaskResumeAll after ending a DFU session and disabling the Softdevice

Hi all,

I am working on an nRF52840 chip with SDK 17.0.2.

Our application runs smoothly, and we want to add the capability of upgrading another nRF52840 chip using the Nordic DFU service.

To do that, we turn off our application's RADIO, suspend most of our FreeRTOS tasks, and then call vTaskSuspendAll to suspend the scheduler.

Then we enable the SoftDevice (as part of ble_stack_init) and send the image to the remote nRF52840.

This works well.

When we finish the DFU process, we call nrf_sdh_disable_request and wait until we know that the SoftDevice is disabled.

Then we resume our tasks and try to resume the scheduler by calling xTaskResumeAll();

The problem is that we get the following hard fault:

<error> hardfault: HARD FAULT at 0x00029350
<error> hardfault: R0: 0x00000A85 R1: 0x08F38168 R2: 0x00684088 R3: 0x0000000B
<error> hardfault: R12: 0x2000FE40 LR: 0x0002AEB7 PSR: 0x21000200
<error> hardfault: Cause: Data bus error (return address in the stack frame is not related to the instruction that caused the error).
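
A note on the "Cause" line: its wording matches the Cortex-M4 description of the IMPRECISERR flag in the bus-fault part of SCB->CFSR, so the reported address (0x00029350) is not necessarily the instruction that made the bad access. Below is a minimal sketch of classifying those flags. The helper name bus_fault_kind and the enum are hypothetical, introduced only for illustration; the bit positions are the Cortex-M4 ones.

```c
#include <stdint.h>

/* Bus-fault flags inside the Cortex-M4 Configurable Fault Status Register. */
#define CFSR_IBUSERR     (1u << 8)   /* instruction bus error */
#define CFSR_PRECISERR   (1u << 9)   /* precise data bus error */
#define CFSR_IMPRECISERR (1u << 10)  /* imprecise data bus error */

/* Hypothetical result type, not part of the nRF5 SDK or CMSIS. */
typedef enum {
    BUS_FAULT_NONE,
    BUS_FAULT_INSTRUCTION,
    BUS_FAULT_PRECISE,
    BUS_FAULT_IMPRECISE
} bus_fault_kind_t;

/* Hypothetical helper: classify a raw CFSR value by its bus-fault flags. */
static bus_fault_kind_t bus_fault_kind(uint32_t cfsr)
{
    if (cfsr & CFSR_IMPRECISERR) {
        return BUS_FAULT_IMPRECISE;   /* stacked PC may be unrelated to the faulting access */
    }
    if (cfsr & CFSR_PRECISERR) {
        return BUS_FAULT_PRECISE;     /* stacked PC points at the faulting instruction */
    }
    if (cfsr & CFSR_IBUSERR) {
        return BUS_FAULT_INSTRUCTION;
    }
    return BUS_FAULT_NONE;
}
```

On the target this would be called as bus_fault_kind(SCB->CFSR) from the fault handler. While debugging, setting the DISDEFWBUF bit in the Cortex-M4 Auxiliary Control Register is a common way to make such faults precise, at the cost of some performance.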

The call stack is: (attached screenshot)

What am I doing wrong?

Thanks in advance for any assistance,

Rafalino

  • Hello Edvin,

    As you can see from the screenshot that I attached (which shows the call stack), the fault comes from list.c, and it was called from timers.c:

    /* The timer is in a list, remove it. */
    ( void ) uxListRemove( &( pxTimer->xTimerListItem ) );

    I suspect some timer was not stopped/cleared by the SoftDevice, but I can't figure out which one.

    Rafalino

  • Hi Ori, 

    Thanks for your patience.
    It looks to me like pxTimer might be pointing to an invalid timer instance in the uxListRemove call here.
    Could you add some logging in the timer code to print the pxTimer instances?

    This might not be directly related to DFU, but rather to the way you are suspending and resuming all tasks. My best guess right now is that the timer instance, or the list instance within the timer, is somehow invalid or corrupted. If that is the case, we need to find out how and when that happened.
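
    Alongside the logging, a cheap plausibility check on the pointer can help catch the bad instance before it is dereferenced. The following is only a sketch: ptr_plausible is a hypothetical helper (not a FreeRTOS or SDK function), and it assumes the timer objects live in the nRF52840's 256 KB data RAM starting at 0x20000000 and are word-aligned.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: nRF52840 data RAM, 256 KB starting at 0x20000000. */
#define RAM_START 0x20000000u
#define RAM_END   0x20040000u   /* one past the last RAM byte */

/* Hypothetical helper: true if an object of `size` bytes at `addr`
 * is word-aligned and lies entirely inside data RAM. */
static bool ptr_plausible(uint32_t addr, uint32_t size)
{
    if ((addr & 0x3u) != 0u) {
        return false;               /* not 4-byte aligned */
    }
    if (addr < RAM_START || addr >= RAM_END) {
        return false;               /* starts outside RAM */
    }
    if (size > RAM_END - addr) {
        return false;               /* runs past the end of RAM */
    }
    return true;
}
```

    In prvProcessReceivedCommands this could be called as ptr_plausible((uint32_t)pxTimer, sizeof(*pxTimer)) just before uxListRemove, logging and halting when it returns false so the offending command can be inspected in the debugger.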

  • Please make sure that you are handling all the return values from the API correctly, especially from app_timer_stop.

  • Hello Susheel,

    I am checking the return values using APP_ERROR_CHECK(ERR_CODE)

    from: ...nRF5_SDK\nRF5_SDK_17.0.2_d674dde\components\libraries\util\app_error.h

    But I don't see anything unusual.

  • Hello Susheel,

    I have added prints in timers.c:

    NRF_LOG_INFO(" pxNewTimer=0x%p , pxTimer->pvTimerID=0x%p  ", pxNewTimer, pxNewTimer->pvTimerID);

    at the end of xTimerCreate

    NRF_LOG_INFO(" pxTimer=0x%p , pxTimer->pvTimerID=0x%p  ", pxTimer, pxTimer->pvTimerID);

    in prvProcessReceivedCommands, just before ( void ) uxListRemove( &( pxTimer->xTimerListItem ) );

     This is what we see:

     <info> app: pxNewTimer=0x0x2000CE40 , pxTimer->pvTimerID=0x0x200169C8
     <info> app: pxNewTimer=0x0x2000D0C0 , pxTimer->pvTimerID=0x0x200168F0
     <info> app: pxNewTimer=0x0x2000D0F0 , pxTimer->pvTimerID=0x0x2002D310
     <info> app: pxNewTimer=0x0x2000D120 , pxTimer->pvTimerID=0x0x2002D330


     <info> app: pxTimer=0x0x2000D120 , pxTimer->pvTimerID=0x0x2002D330
     <info> app: pxTimer=0x0x2000D120 , pxTimer->pvTimerID=0x0x2002D330
     <info> app: pxTimer=0x0x2000D0F0 , pxTimer->pvTimerID=0x0x2002D310

    Just before the hard fault:
     <info> app: pxTimer=0x0x00000A81 , pxTimer->pvTimerID=0x0x1E200000

    We see that all the created timers are at RAM addresses between 0x2000CE40 and 0x2000D120,

    while the timer that we are asked to remove is at the strange address 0x00000A81.
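
    One observation that may help: R0 in the hard fault dump is 0x00000A85, which is the bad pxTimer value plus 4. The sketch below mirrors the start of the FreeRTOS timer structure to show where that offset would come from. It is an assumption-laden illustration: the mirror types are hypothetical, pointers are modelled as uint32_t so that the offsets match a 32-bit target on any host, and the field order (name pointer first, then the list item) should be checked against the timers.c and list.h actually in the build, with list data-integrity check bytes disabled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of a FreeRTOS list item on a 32-bit target. */
typedef struct {
    uint32_t xItemValue;
    uint32_t pxNext;
    uint32_t pxPrevious;
    uint32_t pvOwner;
    uint32_t pvContainer;
} list_item_mirror_t;

/* Hypothetical mirror of the leading fields of the timer control block.
 * Fields after pvTimerID are omitted. */
typedef struct {
    uint32_t           pcTimerName;
    list_item_mirror_t xTimerListItem;
    uint32_t           xTimerPeriodInTicks;
    uint32_t           uxAutoReload;
    uint32_t           pvTimerID;
} timer_mirror_t;

/* Address that &( pxTimer->xTimerListItem ) would evaluate to
 * for a given pxTimer value, under the layout assumed above. */
static uint32_t list_item_address(uint32_t px_timer)
{
    return px_timer + (uint32_t)offsetof(timer_mirror_t, xTimerListItem);
}
```

    If the layout assumption holds, list_item_address(0x00000A81) gives 0x00000A85, the same value as R0. That would be consistent with the fault being raised while uxListRemove works on the list item of the bad pointer, although it does not by itself show where the bad pointer came from.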

    What could cause this corruption?
