LWM2M Carrier Library and date_time library hard fault

I'm trying to integrate my application with the LWM2M Carrier library, and I'm getting a hard fault when calling date_time_update_async(). This happens even if I call the function after I receive the LWM2M_CARRIER_EVENT_LTE_READY event. Is there anything obvious I'm missing?

  • Hello, 

    Can you please provide more details on this issue? What version of nRF Connect SDK and modem FW are you running? We will need more information to reproduce it on our side.

    Kind regards,
    Øyvind

  • Sorry about that, I’m using NCS v1.7.0 on a custom board with MFW 1.3.0. Not sure about the version of LWM2M Carrier library, I guess whatever is included in NCS v1.7.0 by default

  • [00:01:15.316,497] <err> os: Exception occurred in Secure State
    [00:01:15.316,497] <err> os: ***** HARD FAULT *****
    [00:01:15.316,497] <err> os: Fault escalation (see below)
    [00:01:15.316,497] <err> os: ***** BUS FAULT *****
    [00:01:15.316,528] <err> os: Precise data bus error
    [00:01:15.316,528] <err> os: BFAR Address: 0x50008158
    [00:01:15.316,528] <err> os: r0/a1: 0x2002a4a0 r1/a2: 0x00000000 r2/a3: 0x80850000
    [00:01:15.316,528] <err> os: r3/a4: 0x0005ca5b r12/ip: 0x00036d04 r14/lr: 0x61060000
    [00:01:15.316,558] <err> os: xpsr: 0x40028000
    [00:01:15.316,558] <err> os: s[ 0]: 0x00000000 s[ 1]: 0x00000000 s[ 2]: 0x00000000 s[ 3]: 0x00000000
    [00:01:15.316,589] <err> os: s[ 4]: 0x00000000 s[ 5]: 0x00000000 s[ 6]: 0x00000000 s[ 7]: 0xffffffff
    [00:01:15.316,589] <err> os: s[ 8]: 0x00000000 s[ 9]: 0x00000003 s[10]: 0x00000000 s[11]: 0x00000000
    [00:01:15.316,589] <err> os: s[12]: 0x00000000 s[13]: 0x00000000 s[14]: 0x00000000 s[15]: 0x00000000
    [00:01:15.316,589] <err> os: fpscr: 0x00000000
    [00:01:15.316,589] <err> os: Faulting instruction address (r15/pc): 0x00000000
    [00:01:15.316,619] <err> os: >>> ZEPHYR FATAL ERROR 0: CPU exception on CPU 0
    [00:01:15.316,619] <err> os: Current thread: 0x200223f8 (time_thread)
    [00:01:15.839,660] <err> fatal_error: Resetting system

    This is the hard fault. I had put some of my own prints in the date_time library functions to try to trace it down. It seems like this is the critical section:

    static void new_date_time_get(void)
    {
    	int err;
    
    	while (true) {
    		k_sem_take(&time_fetch_sem, K_FOREVER);
    
    		LOG_DBG("Updating date time UTC...");
    		err = current_time_check();
    		if (err == 0) {
    			LOG_DBG("Time successfully obtained");
    			initial_valid_time = true;
    			date_time_notify_event(&evt);
    			continue;
    		}
    
    		LOG_DBG("Current time not valid");
    

    and is failing after it receives the semaphore from the date_time_update_async function here:

    int date_time_update_async(date_time_evt_handler_t evt_handler)
    {
    	if (evt_handler) {
    		app_evt_handler = evt_handler;
    	} else if (app_evt_handler == NULL) {
    		LOG_DBG("No handler registered");
    	}
    
    	k_sem_give(&time_fetch_sem);
    
    	return 0;
    }

    The relevant variables in prj.conf are:

    # Date Time library
    CONFIG_DATE_TIME=y
    CONFIG_DATE_TIME_NTP=n
    CONFIG_DATE_TIME_UPDATE_INTERVAL_SECONDS=0


    I tried non-zero values for CONFIG_DATE_TIME_UPDATE_INTERVAL_SECONDS. I also used to have CONFIG_DATE_TIME_NTP enabled, but disabled it to try the other option. The fault occurs in all scenarios. I'm scratching my head a bit, since this call worked fine before integrating LWM2M Carrier. Could it simply be a stack overflow?

  • Are you able to test with the nRF9160: LwM2M carrier sample on your side, i.e. adding date_time to the sample alone? That worked for me.

    esisk said:
    [00:01:15.316,497] <err> os: Exception occurred in Secure State

    What board are you building for? The exception appears to occur in the secure state, not the non-secure one. Can you confirm?


  • When I get to work today I can test on the bare-bones sample. I always build my project for <custom_board>ns. I can always double check though. Would you mind adding a call to date_time_update_async(handler) in your test? I saw your log say DATE_TIME_NOT_OBTAINED which was before the LWM2M library had connected to the network, but I’d like to see what happens if you call the function I mentioned after the LTE link is ready. 

  • esisk said:
    Would you mind adding a call to date_time_update_async(handler) in your test?

    Sure. Initially I called the function from main(). I have now moved it to the LWM2M_CARRIER_EVENT_LTE_READY case. This yields the following output:

    2022-05-23T13:09:01.749Z DEBUG modem << *** Booting Zephyr OS build v2.6.99-ncs1 ***
    2022-05-23T13:09:01.755Z DEBUG modem << LWM2M Carrier library sample.
    2022-05-23T13:09:01.962Z DEBUG modem << LWM2M_CARRIER_EVENT_MODEM_INIT
    2022-05-23T13:09:01.990Z DEBUG modem << Certificate found, tag 411: match
    2022-05-23T13:09:02.020Z DEBUG modem << Certificate found, tag 412: match
    2022-05-23T13:09:02.036Z DEBUG modem << LWM2M_CARRIER_EVENT_CONNECTING
    2022-05-23T13:09:04.400Z DEBUG modem << LWM2M_CARRIER_EVENT_CONNECTED
    2022-05-23T13:09:05.442Z DEBUG modem << LWM2M_CARRIER_EVENT_LTE_READY
    2022-05-23T13:09:05.450Z DEBUG modem << DATE_TIME_OBTAINED_MODEM

    Make sure that your application waits for the LWM2M_CARRIER_EVENT_LTE_READY event, because the LwM2M carrier library needs full control before this event, e.g. to bootstrap with the Verizon server.

    See LwM2M carrier library - application integration for more information

  • What is the stack size of the date_time thread set to? Could you try to increase this to 1280 i.e. CONFIG_DATE_TIME_THREAD_STACK_SIZE=1280, if not already done. 
    Ref. this Git pull request: github.com/.../da8d172763815205ac279dff9e91177cd05945ae

  • Thanks for doing that. Just checked the stack size, and if it's unset in prj.conf, CONFIG_DATE_TIME_THREAD_SIZE=1024 is generated in autoconf.h. Yep, I'm aware of waiting to use the link until the LWM2M_CARRIER_EVENT_LTE_READY event is received. I'll try the stack size fix.
    Edit: I'm having trouble setting CONFIG_DATE_TIME_THREAD_STACK_SIZE in NCS 1.7.0. I looked it up on the Kconfig reference and couldn't find it
    Edit 2: After looking some more it's called CONFIG_DATE_TIME_THREAD_SIZE, ignore me, ha
    Edit 3: Increased the stack size to 1500 and still getting the error. I double checked it's nothing to do with our board and tried the same thing with the lwm2m_carrier sample. Worked fine
  • esisk said:
    tried the same thing with the lwm2m_carrier sample. Worked fine

    Ok, so you tested the lwm2m_carrier sample with date_time and it worked. Then it sounds like something else. Are you able to share your whole project with me? If so, I will convert this ticket to private. Will it run on a standard nRF9160DK?

  • I actually found my problem. I crawled through my git history and found that along with the changes I made to add the LWM2M Carrier library, I had also changed the MAIN_THREAD_STACK_SIZE from 4096 to 1024. This was the change that introduced the bug, but it was so slight it made me think it had something to do with the Carrier library. Thanks for your help, Øyvind. I hope this helps someone else
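
    For anyone landing here with the same symptom, here is a minimal prj.conf sketch of the fix described above. It assumes that MAIN_THREAD_STACK_SIZE refers to the standard Zephyr symbol CONFIG_MAIN_STACK_SIZE. The thread analyzer options are an optional addition and were not used in this thread; to my knowledge they periodically print per-thread stack usage, which can make an undersized stack easier to spot. The value 4096 is the one that worked before the change, and the 60-second interval is arbitrary.

    ```
    # Restore the main thread stack size that worked before the change
    CONFIG_MAIN_STACK_SIZE=4096

    # Optional (assumption, not from this thread): report stack usage per thread
    CONFIG_THREAD_NAME=y
    CONFIG_THREAD_ANALYZER=y
    CONFIG_THREAD_ANALYZER_USE_PRINTK=y
    CONFIG_THREAD_ANALYZER_AUTO=y
    CONFIG_THREAD_ANALYZER_AUTO_INTERVAL=60
    ```

    If the analyzer output shows a thread close to 100% usage, that thread's stack size option is probably the one to raise. Check the Kconfig reference for your NCS version first, since symbol names can differ between releases (as with CONFIG_DATE_TIME_THREAD_SIZE above).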
