Hard fault from nrf_log_frontend_dequeue()

Hi,

I'm getting a hard fault that I really need some help to debug. My call stack is shown below and includes nrf_log_frontend_dequeue() as well as a custom service function which calls sd_ble_gatts_hvx().

I checked the CFSR upon the fault following this guide and the only error flag set is for the following:

"IACCVIOL - Indicates that an attempt to execute an instruction triggered an MPU or Execute Never (XN) fault."
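For anyone checking the same register, here is a minimal host-side sketch of classifying a raw CFSR value. The bit positions follow the ARMv7-M layout (MMFSR in bits 0-7, BFSR in bits 8-15, UFSR in bits 16-31). The helper names `cfsr_fault_class` and `cfsr_only_iaccviol` are made up for illustration and are not part of the SDK.

```c
#include <stdbool.h>
#include <stdint.h>

/* ARMv7-M Configurable Fault Status Register layout. */
#define CFSR_IACCVIOL    (1u << 0)      /* MemManage: instruction access violation */
#define CFSR_MMFSR_MASK  0x000000FFu    /* MemManage fault status bits */
#define CFSR_BFSR_MASK   0x0000FF00u    /* BusFault status bits */
#define CFSR_UFSR_MASK   0xFFFF0000u    /* UsageFault status bits */

typedef enum { FAULT_NONE, FAULT_MEMMANAGE, FAULT_BUS, FAULT_USAGE } fault_class_t;

/* Hypothetical helper: report the first fault class with any flag set. */
fault_class_t cfsr_fault_class(uint32_t cfsr)
{
    if (cfsr & CFSR_MMFSR_MASK) return FAULT_MEMMANAGE;
    if (cfsr & CFSR_BFSR_MASK)  return FAULT_BUS;
    if (cfsr & CFSR_UFSR_MASK)  return FAULT_USAGE;
    return FAULT_NONE;
}

/* Hypothetical helper: true when IACCVIOL is the only flag set, as in this report. */
bool cfsr_only_iaccviol(uint32_t cfsr)
{
    return cfsr == CFSR_IACCVIOL;
}
```

On the target, the value would typically be read as `SCB->CFSR` (CMSIS) inside the fault handler before anything clears it.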

This error occurs in two cases:

(1) immediately upon connection with the peripheral device if (strangely) the SAADC acquisition time is set to 1us or 5us (e.g. NRF_SAADC_ACQTIME_1US)

(2) rarely/sporadically if SAADC acquisition time is set to anything >5us

One connection is that my custom function ble_sws_meas_send is called by the SAADC callback. But the frequency of SAADC reads is set independently of acquisition time, so I don't know why this would have an effect. 

Thanks in advance!

  • Thanks for the comment, but as I'm using SDK 17.0.2 (>15.2.0) I think I have an unrelated bug.

    I am using a static SAADC double buffer, shown below. Should it not be static? (I did try removing the static keyword and got the same error.)

    static nrf_saadc_value_t     m_fr_buffer_pool[2][CHEM_BUFFER_NUM];

    The buffer is initialized as follows:

    err_code = nrf_drv_saadc_buffer_convert(m_fr_buffer_pool[0], CHEM_BUFFER_NUM);
    APP_ERROR_CHECK(err_code);

    err_code = nrf_drv_saadc_buffer_convert(m_fr_buffer_pool[1], CHEM_BUFFER_NUM);
    APP_ERROR_CHECK(err_code);

    As for the specific code running during the fault: when it breaks, the call stack points to code for 3 unknown functions, and the most recent is at 0xA60. (This differs from what I described in my original post, but there have been some smaller unrelated changes to the codebase since, and it still faults immediately upon Bluetooth connection as before.)

    4770 bx lr
    4B01 ldr r3, [pc, #4]          <- 0xA60
    681B ldr r3, [r3]
    68DB ldr r3, [r3, #12]
    4718 bx r3
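Read literally, the four instructions from 0xA60 load a pointer from the literal pool, dereference it to get a table base, fetch the word at byte offset 12 from that table, and branch to it. The sketch below models that chain on a host with a simulated memory array; `resolve_branch_target` and the memory layout are hypothetical. Note that byte offset 12 of a Cortex-M vector table is the HardFault entry, so if the table here is a vector table (an assumption, plausible because an address this low would normally sit in the MBR region under the standard SoftDevice layout), this would be an interrupt-forwarding stub rather than the faulting code itself.

```c
#include <stdint.h>

/* Simulated memory: mem[i] models the 32-bit word at byte address 4*i. */
static uint32_t read_word(const uint32_t *mem, uint32_t byte_addr)
{
    return mem[byte_addr / 4u];
}

/* Hypothetical model of: ldr r3,[pc,#4]; ldr r3,[r3]; ldr r3,[r3,#12]; bx r3 */
uint32_t resolve_branch_target(const uint32_t *mem, uint32_t literal_addr)
{
    uint32_t r3 = read_word(mem, literal_addr); /* ldr r3, [pc, #4]  : literal pool word */
    r3 = read_word(mem, r3);                    /* ldr r3, [r3]      : table base */
    r3 = read_word(mem, r3 + 12u);              /* ldr r3, [r3, #12] : entry at offset 12 */
    return r3;                                  /* bx r3 would jump here */
}
```

If any link in that chain holds a stale or invalid value, the final branch can land in non-executable memory, which would be consistent with an IACCVIOL.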

    Thanks.

    Noelle

  • Ah, but SDK 17.0.2 doesn't handle the reported issue well as it uses a blocking mechanism (CRITICAL_REGION_ENTER()) which stalls a fast interrupt when called from interrupt context in (say) SAADC or BLE; this is perhaps why using a short ADC sample time makes it worse.

  • It looks like the code you provided in the other question will not work in SDK 17.0.2, e.g. log_data_t no longer has the log_is_busy member.

  • Yes it does work in 17.0.2; you have to add the extra (new) field:

    /**
     * @brief An internal control block of the logger
     *
     * @note Circular buffer is using never cleared indexes and a mask. It means
     * that logger may break when indexes overflow. However, it is quite unlikely.
     * With rate of 1000 log entries with 2 parameters per second such situation
     * would happen after 12 days.
     */
    typedef struct
    {
        uint32_t                  wr_idx;          // Current write index (never reset)
        uint32_t                  rd_idx;          // Current read index  (never reset)
        uint32_t                  mask;            // Size of buffer (must be power of 2) presented as mask
        uint32_t                  buffer[NRF_LOG_BUF_WORDS];
        nrf_log_timestamp_func_t  timestamp_func;  // A pointer to function that returns timestamp
        nrf_log_backend_t const * p_backend_head;
        nrf_atomic_flag_t         log_skipping;
        nrf_atomic_flag_t         log_skipped;
        nrf_atomic_flag_t         log_is_busy;     // This flag replaces the blocking use of critical region
        nrf_atomic_u32_t          log_dropped_cnt;
        bool                      autoflush;
    } log_data_t;
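How the patched logger actually uses log_is_busy is not shown in this thread, so the following is only a sketch of the general idea: a non-blocking try-lock built on an atomic flag, in place of a critical region that masks interrupts. It uses C11 `atomic_flag` as a host-side stand-in for `nrf_atomic_flag_t`; the names `log_try_lock`, `log_unlock`, and `dequeue_one` are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for the log_is_busy member of log_data_t. */
static atomic_flag m_log_is_busy = ATOMIC_FLAG_INIT;

/* Try to claim the logger; returns false immediately if someone else holds it. */
bool log_try_lock(void)
{
    return !atomic_flag_test_and_set(&m_log_is_busy);
}

/* Release the logger. */
void log_unlock(void)
{
    atomic_flag_clear(&m_log_is_busy);
}

/* Hypothetical dequeue step: skip the work instead of blocking when busy. */
bool dequeue_one(unsigned *p_processed)
{
    if (!log_try_lock())
    {
        return false;       /* busy: the caller can retry later, interrupts stay enabled */
    }
    (*p_processed)++;       /* placeholder for processing one log entry */
    log_unlock();
    return true;
}
```

The point of this design is that an interrupt handler which finds the flag set returns at once instead of waiting, so a fast SAADC or BLE interrupt is never stalled by the logger.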
    
