During the stability test, the sensors failed to publish status messages

Hi,

We set up a mesh network with 26 sensor devices and configured all 26 devices into one group. The mesh gateway device sends a group message to these 26 devices every five seconds. After receiving the group message, each sensor publishes its status to the group.

After running this stability test for about 2 to 3 hours, we found that one or two of the 26 sensors are no longer able to publish their status, and the following error appears in the log:

"Unable to publish status message, status: 4"
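
For reference, status 4 in the nRF5 SDK error numbering is NRF_ERROR_NO_MEM. Below is a minimal sketch of a lookup helper that makes such log lines easier to read. The constant values are copied from the SDK's nrf_error.h; `status_name` is a hypothetical helper for illustration and is not part of the SDK.

```c
#include <stdint.h>
#include <string.h>

/* Values mirrored from the nRF5 SDK's nrf_error.h (base number 0). */
#define NRF_SUCCESS              0u
#define NRF_ERROR_NO_MEM         4u
#define NRF_ERROR_INVALID_PARAM  7u
#define NRF_ERROR_INVALID_STATE  8u
#define NRF_ERROR_BUSY           17u

/* Hypothetical helper: map a status code to a readable name for logging. */
static const char * status_name(uint32_t status)
{
    switch (status)
    {
        case NRF_SUCCESS:             return "NRF_SUCCESS";
        case NRF_ERROR_NO_MEM:        return "NRF_ERROR_NO_MEM";
        case NRF_ERROR_INVALID_PARAM: return "NRF_ERROR_INVALID_PARAM";
        case NRF_ERROR_INVALID_STATE: return "NRF_ERROR_INVALID_STATE";
        case NRF_ERROR_BUSY:          return "NRF_ERROR_BUSY";
        default:                      return "UNKNOWN";
    }
}
```

With this helper, the log line could print `status_name(status)` next to the numeric value.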

  • Hi,

    What SDK are you running, and what version? Are you running an unmodified example?

    From the mesh gateway device

    What is this gateway? Can you elaborate?

    After received the group message, each sensor will publish its status to these group. 

    What do you mean by "these group"?

    Are you doing provisioning using our nRF Mesh app?

    Also, is it the same devices that keep running into this issue or is it random? Are you testing on a DK?

  • Hi Mttrinh,

    In our system, we have two types of hardware: sensors and gateways. All the hardware uses the nRF52840 with the V5.0 Mesh SDK. The sensor device runs a modified genericONOFF model.

    In the test, we provisioned 25 sensors and 1 gateway in one mesh network, and the 25 sensors are configured into a group named A. The gateway then repeats this pattern: wait 5 seconds and send a group genericON message to group A, then 5 seconds later send a group genericOFF message to group A.

    Once a sensor device receives the group message, it publishes an ON/OFF message to the gateway.

    The issue is random. In our test it takes about 6 hours to appear, and it affects about 2 of the 25 sensors.

  • Hi Mttrinh,

    After further debugging, we found that NRF_ERROR_NO_MEM is returned from packet_tx(). Once this happens, the sensor device is never able to publish any message again. The condition lasts for hours, until the system is rebooted by a software watchdog reset.

    The sensor publishes two kinds of messages: 1. light on/off, 2. PIR for motion detection.

     

  • 00> [381548:access.c:1200] access_model_publish
    00> [381548:access.c:445] packet_alloc_and_tx
    00> [381548:mesh_mem_stdlib.c:64] mesh_mem_alloc,p_men:0x20007708,size:4
    00> [381548:mesh_mem_stdlib.c:74] total_size: 238
    00> [381548:access.c:339] packet_tx
    00> [381549:access.c:429] packet_tx,status:4

    Hi Mttrinh,

    The mesh stack gets into a state where every publish message from the application layer is rejected due to lack of memory, and it never recovers by itself.

    Could you please take a look at the log and advise further? Thanks.

  • Hi,

    Sorry for the late reply.

    Can you debug the code and go through step by step inside packet_tx and see which function is returning this error code? 

    I suspect it is nrf_mesh_packet_send(). If so, then NRF_ERROR_NO_MEM means a packet buffer could not be allocated for the packet. The application should try to send the packet again at a later point.
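
The retry suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: `retry_ctx_t`, `retry_action_t` and `retry_on_status` are hypothetical names and not part of the Mesh SDK, and the error values are mirrored from nrf_error.h. The helper does not block. It only decides whether to retry and after what delay, so that the application can schedule the next publish attempt from a timer. The number of retries is bounded so that a node which stays out of buffers can escalate (log, reset, and so on) instead of retrying forever.

```c
#include <stdint.h>

/* Values mirrored from the nRF5 SDK's nrf_error.h. */
#define NRF_SUCCESS       0u
#define NRF_ERROR_NO_MEM  4u

/* Hypothetical retry bookkeeping kept by the application. */
typedef struct
{
    uint32_t attempts;       /* consecutive NO_MEM failures so far */
    uint32_t max_attempts;   /* retries allowed before giving up */
    uint32_t base_delay_ms;  /* delay before the first retry */
    uint32_t max_delay_ms;   /* upper bound for the backoff delay */
} retry_ctx_t;

typedef enum
{
    RETRY_DONE,     /* success or a non-retryable error: nothing to schedule */
    RETRY_LATER,    /* schedule another publish after *p_delay_ms */
    RETRY_GIVE_UP   /* retry budget used up: let the application escalate */
} retry_action_t;

/* Decide what to do with the status returned by a publish attempt. */
static retry_action_t retry_on_status(retry_ctx_t * p_ctx,
                                      uint32_t      status,
                                      uint32_t *    p_delay_ms)
{
    *p_delay_ms = 0u;

    if (status != NRF_ERROR_NO_MEM)
    {
        /* Only buffer exhaustion is treated as transient here. */
        p_ctx->attempts = 0u;
        return RETRY_DONE;
    }

    if (p_ctx->attempts >= p_ctx->max_attempts)
    {
        return RETRY_GIVE_UP;
    }

    /* Exponential backoff: base * 2^attempts, capped at max_delay_ms. */
    uint32_t delay = p_ctx->base_delay_ms;
    for (uint32_t i = 0u; i < p_ctx->attempts; i++)
    {
        if (delay > p_ctx->max_delay_ms / 2u)
        {
            delay = p_ctx->max_delay_ms;
            break;
        }
        delay *= 2u;
    }
    if (delay > p_ctx->max_delay_ms)
    {
        delay = p_ctx->max_delay_ms;
    }

    p_ctx->attempts++;
    *p_delay_ms = delay;
    return RETRY_LATER;
}
```

In an application, the status passed in would be the return value of the publish call, and RETRY_LATER would start a one-shot timer whose handler publishes again.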

  • Hi Mttrinh,

    Yes, it fails in nrf_mesh_packet_send because a buffer cannot be allocated for the packet. When the application re-sends at a later point, it sometimes recovers, but sometimes it never recovers: every time the application sends a message, it reports NRF_ERROR_NO_MEM.

    Do you have any suggestion? Thanks
