Hello!

Reaching out in parallel to David Barlia who I’m working with on this project.
He sent an email requesting feedback on the same subject, but I’m also posting here with additional context for his questions in an effort to troubleshoot our issue.

Our device/application is meant to travel to various locations, and thus likely through diverse cellular conditions, so it needs to be able to regain its LTE connection after it’s lost so that we can send messages to nrfcloud.com.

Our application is partially based upon the cloud-client sample, so has the following prj.conf option:
CONFIG_NRF_CLOUD_CONNECTION_POLL_THREAD=y
This seems to prompt the code to poll for nrfcloud.com every 30 seconds whenever a connection is lost, but in this initial sample it doesn’t seem to attempt to regain the LTE connection itself. We’re simulating the LTE disconnect/reconnect event by pausing and then resuming our SIM in hologram.io, which appears to be a valid method for testing this issue.

Pressing RST to reboot the device works to trigger the reconnection to LTE/nrfcloud.com but this will not be possible when the device is deployed.

So I’ve tried a few different approaches to manually request the LTE connection to reconnect:
In the nrfcloud polling function
static void connect_work_fn(struct k_work *work)
I tried manually rebooting the modem with the following:

	lte_error = lte_lc_deinit();
	if (lte_error) {
		LOG_INF("Failed to power down modem.");
	} else {
		LOG_INF("Modem powered off.");
	}

	LOG_INF("Attempting to Reconnect LTE.");
	lte_error = lte_lc_init_and_connect();

This appears to work when the disconnect is shorter than the timeout, but when the disconnect is longer than that, the nrfcloud.com polling appears to stop, and the device seems unable to reconnect thereafter.

I tried using the non-blocking alternative to the reinitialization:
lte_lc_init_and_connect_async(lte_handler)
This doesn’t stop the nrfcloud polling, but it never seems to regain the LTE connection.

We’re working with the 1.4.1 version of the toolkit as suggested by you and the thing-plus documentation, though we’ve noticed that there’s a flag in the prj.conf for 1.9.1 toolkit which allows you to set the LTE reconnection delay in seconds. This doesn’t appear to be an option in the 1.4.1 toolkit however.

I’ve found sparse information about the reconnection process in this forum, though there are a few hints in the Nordic Semiconductor forum that suggest the preferred method for reconnecting LTE is to poll the LTE state in the main thread with:
lte_lc_nw_reg_status_get()
then reboot the modem similar to how I have been. These posts are quite old, however, and my experiments with this function have been unsuccessful so far, so I prefer to rely on the status responses from the lte_handler.
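
As a rough illustration of that polling approach, here is a minimal, self-contained sketch in C of the decision logic only. Everything in it is hypothetical: `enum reg_status`, `struct reconnect_monitor`, and `monitor_poll` are made-up names for illustration, and the threshold is arbitrary. In a real build the status would come from lte_lc_nw_reg_status_get() and the status values from lte_lc.h rather than being defined locally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the registration states reported by the modem.
 * In firmware these would come from lte_lc.h, not be defined here. */
enum reg_status {
	REG_NOT_REGISTERED,
	REG_SEARCHING,
	REG_REGISTERED_HOME,
	REG_REGISTERED_ROAMING,
};

/* Hypothetical bookkeeping for the polling loop. */
struct reconnect_monitor {
	int offline_polls;     /* consecutive polls without registration */
	int max_offline_polls; /* polls to tolerate before restarting the modem */
};

static bool is_registered(enum reg_status status)
{
	return status == REG_REGISTERED_HOME || status == REG_REGISTERED_ROAMING;
}

/* Feed one polled status into the monitor.
 * Returns true when the caller should restart the modem. */
static bool monitor_poll(struct reconnect_monitor *m, enum reg_status status)
{
	assert(m != NULL && m->max_offline_polls > 0);

	if (is_registered(status)) {
		m->offline_polls = 0;
		return false;
	}

	m->offline_polls++;
	if (m->offline_polls >= m->max_offline_polls) {
		/* Start counting again after the restart request. */
		m->offline_polls = 0;
		return true;
	}
	return false;
}
```

The main thread would call monitor_poll() once per polling interval and, when it returns true, run whatever modem restart sequence the application uses.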

I’m curious if you can offer any feedback about how we might resolve our LTE reconnection issue.
Should we try to upgrade to a newer version of the toolkit? Is there a different method of power cycling the modem that we should be using? Is there a function that we can call to just retrigger the device to reboot completely from software to simulate the RST button to regain connection?

Thanks again for your help and feedback.

    occurrentarts

    Thanks for the questions!

    These functions are particularly useful for enabling and disabling the modem:

    /** @brief Function for sending the modem to offline mode
     *
     * @return Zero on success or (negative) error code otherwise.
     */
    int lte_lc_offline(void);
    
    /** @brief Function for sending the modem to power off mode
     *
     * @return Zero on success or (negative) error code otherwise.
     */
    int lte_lc_power_off(void);

    Which really use lte_lc_func_mode_set underneath. This only enables and disables your modem/cellular connection though.

    For the MQTT client, you need to make sure you disconnect properly and then reconnect. You’ll need to use the appropriate API depending on what client you’re using. (I think David mentioned AWS?)

    The 1.7.x branch in NFED is better supported and I’ll probably upgrade all samples to 1.8.x or even 1.9.x in the future.

    If you go to nfed/samples/tracker you can see what I did there. I completely disconnect and turn the modem off between transmissions.

    I hope this helps!

      Thanks so much for your feedback.
      Trying these two functions turned off the modem effectively.
      But so far I haven’t had success reconnecting LTE.

      To re-enable the modem, after turning it off per your suggested functions, I’ve tried:
      -lte_lc_normal(); then lte_lc_connect();
      -lte_lc_connect(); only
      -lte_lc_connect_async(lte_handler);
      -lte_lc_init_and_connect_async(lte_handler);
      -lte_lc_init_and_connect();
      all with no luck reconnecting after the SIM is paused and then unpaused.

      Can you share the recommended process for re-enabling the modem after turning it off in 1.4.1?

      We’re actually not using the MQTT client but rather nrfcloud.com for our cloud provider.
      The AWS server we have communicates with the nrfcloud.com api.

      I looked at your nfed/samples/tracker code to copy your implementation. I tried calling:
      lte_lc_func_mode_set(LTE_LC_FUNC_MODE_DEACTIVATE_LTE);
      then
      lte_lc_func_mode_set(LTE_LC_FUNC_MODE_ACTIVATE_LTE);
      but unfortunately it seems 1.4.1 doesn’t support “lte_lc_func_mode_set”, so it won’t build.
      http://developer.nordicsemi.com/nRF_Connect_SDK/doc/1.4.1/nrf/include/modem/lte_lc.html

      I suppose my next course of action is to attempt to upgrade to 1.7.x per your suggestion unless you can recommend anything else regarding reconnecting LTE in 1.4.1?
      Thanks again.

        Can you share the recommended process for re-enabling the modem after turning it off in 1.4.1?

        Have you looked at the source I provided? Should be very similar. What version of the MFW are you using? I would recommend upgrading to 1.3.x if you can, since the Thing Plus boards have the newer silicon. I’d also recommend upgrading to at least 1.7.x of NCS, since that’s when MFW 1.3.x was introduced.

        When you re-enable the modem, as long as your SIM is valid and enabled it should reconnect. You may be putting the modem in a funky state if you’re disabling service for that SIM. Worst case you should introduce a connection timeout delay which will reboot your device in the case you don’t get a valid LTE connection within a certain interval. (60 seconds is good but it really depends on your application)
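
A minimal sketch of that timeout logic in plain C, under stated assumptions. The names `lte_watchdog`, `lte_watchdog_init`, `lte_watchdog_update`, and `lte_watchdog_should_reboot` are hypothetical, and the 60-second value is only the example figure mentioned above. On the device, the timestamps would presumably come from k_uptime_get(), and a true result would lead to sys_reboot(SYS_REBOOT_COLD), which needs CONFIG_REBOOT=y in prj.conf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Example interval only; tune it for the application. */
#define LTE_CONNECT_TIMEOUT_MS (60 * 1000)

/* Hypothetical state for tracking how long LTE has been down. */
struct lte_watchdog {
	bool connected;
	int64_t disconnected_since_ms;
};

/* Call once at boot: the device starts out disconnected. */
static void lte_watchdog_init(struct lte_watchdog *w, int64_t now_ms)
{
	assert(w != NULL);
	w->connected = false;
	w->disconnected_since_ms = now_ms;
}

/* Call from the LTE event handler whenever the link state changes. */
static void lte_watchdog_update(struct lte_watchdog *w, bool connected,
				int64_t now_ms)
{
	assert(w != NULL);
	if (w->connected && !connected) {
		/* The link just dropped: start the timeout from here. */
		w->disconnected_since_ms = now_ms;
	}
	w->connected = connected;
}

/* Call periodically. Returns true once LTE has been down for the whole
 * timeout interval; firmware would then trigger the reboot. */
static bool lte_watchdog_should_reboot(const struct lte_watchdog *w,
				       int64_t now_ms)
{
	assert(w != NULL);
	return !w->connected &&
	       (now_ms - w->disconnected_since_ms) >= LTE_CONNECT_TIMEOUT_MS;
}
```

Keeping the decision separate from the reboot call makes the timing easy to check off-target before wiring it into the LTE handler.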

        occurrentarts We’re actually not using the MQTT client but rather nrfcloud.com for our cloud provider.
        The AWS server we have communicates with the nrfcloud.com api.

        This isn’t clear here. If you’re using AWS, you’re using the underlying MQTT client. If you’re using nrfcloud you’re also using the MQTT client. Both clients are layers built on the default Zephyr MQTT implementation.

        Hope that helps!

          For what it’s worth, here’s an extract of the log, while the device is trying to reconnect…

          2022-02-25T22:29:54.527Z DEBUG modem << I: CLOUD_EVT_CONNECTING
          2022-02-25T22:29:54.538Z DEBUG modem << I: CLOUD_EVT_CONNECTING
          2022-02-25T22:29:54.556Z DEBUG modem << I: Put Modem Offline
          2022-02-25T22:29:54.573Z DEBUG modem << I: Powered off Modem
          2022-02-25T22:29:54.574Z DEBUG modem << I: Attempting to Reconnect Modem.
          2022-02-25T22:29:54.610Z DEBUG modem << I: Failed to connect
          2022-02-25T22:30:24.522Z DEBUG modem << I: Next connection retry in 30 seconds

            jaredwolff

            “Have you looked at the source I provided? Should be very similar. ”

            are you referring to these functions?

            • int lte_lc_offline(void);
            • int lte_lc_power_off(void);

            yes i have them integrated into my code, but they don’t appear to re-enable the LTE modem, only shut it down, correct?
            currently after calling these functions I am calling:

            lte_lc_connect_async(lte_handler);
            i might have actually gotten a single reconnect after a 20 or so minute wait, but haven’t been able to reconnect a second time to confirm it works multiple times.

            or when you ask if i looked at the source provided
            are you referring to the nfed/samples/tracker code?

            yes i looked at this as well but noticed that
            “lte_lc_func_mode_set” doesn’t appear to be a valid function in 1.4.1 so I wasn’t able to build.

            I’m using the default firmware version which came with the device. I’m not sure how to upgrade the firmware; does that require a J-Link? Yes, I can attempt an upgrade to 1.7.x as you suggest.

            “When you re-enable the modem, as long as your SIM is valid and enabled it should reconnect.”
            What is your recommended function to re-enable the modem?
            among the other options i tried:
            lte_lc_normal();
            before calling
            lte_lc_connect_async
            to no avail.

            “You may be putting the modem in a funky state if you’re disabling service for that SIM.”
            this is very possible, pausing and restarting via hologram has gotten quite slow, and getting an initial LTE connection after flashing is starting to take some time.
            can you recommend another way to test LTE connection loss and subsequent reconnection?

            “Worst case you should introduce a connection timeout delay which will reboot your device in the case you don’t get a valid LTE connection within a certain interval.”
            Would love to try this! I do see in your tracker example that you call sys_reboot to trigger a reboot, but even after including <power/reboot.h> I get build errors suggesting that sys_reboot isn’t a function available in 1.4.1.
            Is there any other suggested method to trigger the reboot?

            ::UPDATE
            I was able to trigger reboot with sys_reboot after adding CONFIG_REBOOT=y to my prj.conf
            this has helped accelerate LTE reconnection and currently works with multiple disconnects in a row!
            If the signal dropout is particularly long it doesn’t seem like it’s able to get an initial connection, however.
            still need to troubleshoot this.::

            “This isn’t clear here. If you’re using AWS, you’re using the underlying MQTT client. If you’re using nrfcloud you’re also using the MQTT client. Both clients are layers built on the default Zephyr MQTT implementation.”
            understood! perhaps you’re right. i didn’t develop the AWS server communication with nrfcloud.com; i just know that the code is based upon ncs/1.4.1/nrf/samples/nrf9160/cloud_client rather than something clearly related to MQTT such as mqtt_simple. i don’t see any reference to MQTT in the code, but I believe it’s there behind the scenes as you describe.

            Thanks again for your feedback and guidance!

              @jaredwolff

              Just wanted to follow up to mention that using sys_reboot approach we were able to get our device to consistently re-connect to LTE after dropouts. Thanks again for your help.

                occurrentarts excellent!! When you have time I would upgrade to the latest MFW and NCS so you can avoid rebooting the device.

                Also, you only need to run the connect() call once in your application. Turning the modem on into “normal” mode should make it connect. (That’s what I’m doing in the aforementioned sample)

                Hi Jared,

                Brian (a.k.a. occurrentarts) has just rewritten the code to work with the 1.7.1 SDK and the latest MFW, which appears to be working well. Meanwhile, I’m having a problem applying the MFW firmware update and could use your assistance to figure it out.

                Here’s what shows up in the terminal when I attempt to apply the update:

                PS D:\David\Wildlife\DeterGents\ZephyrToolsRepo\nfed\samples\mfw_update> python3 update_modem.py mfw_nrf9160_1.3.1.zip com3 1000000
                # modem firmware upgrade over serial port example started.
                [HighLevel] Creating new probe
                [HighLevel] Initialize new probe.
                [Probes.com3] [ModemUARTDFUProbe] Dll directory is C:\Users\barli\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pynrfjprog\lib_x64.
                [Probes.com3] [ModemUARTDFUProbe] Find and connect to dfu dll
                [Probes.com3] [ModemUARTDFUProbe] Using DFU dll at C:\Users\barli\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pynrfjprog\lib_x64\NRFDFU.dll.
                [Probes.com3] [ModemUARTDFUProbe] Load library at C:\Users\barli\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pynrfjprog\lib_x64\NRFDFU.dll.
                [Probes.com3] [ModemUARTDFUProbe] Library loaded, loading member functions.
                [Probes.com3] [ModemUARTDFUProbe] Member functions succesfully loaded.
                [Probes.com3] [ModemUARTDFU-com3] Initialize new probe.
                [Probes.com3] [ModemUARTDFU-com3] Successfully opened port: com3@1000000,flow_control:none,parity:none.
                [Probes.com3] [ModemUARTDFU-com3] {
                    "duration": 903,
                    "error_code": "Ok",
                    "operation": "open_uart",
                    "outcome": "success",
                    "progress_percentage": 100
                }
                [HighLevel] Probe initialization complete!
                [Probes.com3] [ModemUARTDFU-com3] Check if provided file exists
                [Probes.com3] [ModemUARTDFU-com3] Check if provided file can be read
                [Probes.com3] [ModemUARTDFU-com3] Programming bootloader
                [Probes.com3] [ModemUARTDFU-com3] Programming modem bootloader 72B3D7C.ipc_dfu.signed_1.1.0.ihex.
                [Probes.com3] [ModemUARTDFU-com3] Extracting 13416 bytes from 72B3D7C.ipc_dfu.signed_1.1.0.ihex.
                [Probes.com3] [ModemUARTDFU-com3] {
                    "duration": 0,
                    "message": "Calculating image size",
                    "operation": "upload_image",
                    "progress_percentage": 5
                }
                [Probes.com3] [ModemUARTDFU-com3] {
                    "duration": 0,
                    "message": "Uploading image to device",
                    "operation": "upload_image",
                    "progress_percentage": 10
                }
                [Probes.com3] [ModemUARTDFU-com3] {
                    "duration": 30004,
                    "error_code": "Timeout",
                    "message": "Image upload failed. Bad response from device",
                    "operation": "upload_image",
                    "outcome": "fail",
                    "progress_percentage": 100
                }
                [Probes.com3] [ModemUARTDFU-com3] Error during image file upload. Upload returned an error.
                [Probes.com3] [ModemUARTDFU-com3] Failed to program bootloader file
                [Probes.com3] [ModemUARTDFUProbe] Failed to program DFU package
                [HighLevel] Failed programming the device.
                [Probes.com3] b'An error was reported by NRFJPROG DLL: -220 TIME_OUT. \n[Probes.com3] [ModemUARTDFU-com3] Error during image file upload. Upload returned an error.\n\textra: [Probes.com3] [ModemUARTDFU-com3] Failed to program bootloader file\n\textra: [Probes.com3] [ModemUARTDFUProbe] Failed to program DFU package\n\textra: [HighLevel] Failed programming the device.'
                [Probes.com3] [ModemUARTDFUProbe] Uninitializing ModemUARTDFU probe at serial port com3.
                [Probes.com3] [ModemUARTDFU-com3] Sending device reset request
                [Probes.com3] [ModemUARTDFU-com3] Sending reset request to device.
                [Probes.com3] [ModemUARTDFU-com3] Closing connection to mcuboot device
                [Probes.com3] [ModemUARTDFU-com3] serial port com3 closed.
                [Probes.com3] [ModemUARTDFU-com3] {
                    "duration": 1,
                    "error_code": "Ok",
                    "operation": "close_uart",
                    "outcome": "success",
                    "progress_percentage": 100
                }
                [HighLevel] Done.
                [HighLevel] Closing and freeing sub dlls.
                Traceback (most recent call last):
                  File "D:\David\Wildlife\DeterGents\ZephyrToolsRepo\nfed\samples\mfw_update\update_modem.py", line 43, in <module>
                    run(args.uart, args.firmware, args.baudrate)
                  File "D:\David\Wildlife\DeterGents\ZephyrToolsRepo\nfed\samples\mfw_update\update_modem.py", line 26, in run
                    modem_dfu_probe.program(modem_firmware_zip)
                  File "C:\Users\barli\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pynrfjprog\HighLevel.py", line 388, in program
                    raise APIError(result, error_data=self.get_errors(), log=self._logger.error)
                pynrfjprog.APIError.APIError: An error was reported by NRFJPROG DLL: -220 TIME_OUT.
                [Probes.com3] [ModemUARTDFU-com3] Error during image file upload. Upload returned an error.
                        extra: [Probes.com3] [ModemUARTDFU-com3] Failed to program bootloader file
                        extra: [Probes.com3] [ModemUARTDFUProbe] Failed to program DFU package
                        extra: [HighLevel] Failed programming the device.

                Thanks,
                David

                  Barliesque you can actually use nrfjprog to apply modem firmware updates now.

                  nrfjprog --program mfw.zip

                  You do need a programmer that supports mfw updates though. I know the nRF9160DK works.

                  You can get the latest nrfjprog from Nordic here: https://www.nordicsemi.com/Products/Development-tools/nrf-command-line-tools/download

                    I don’t understand why I would need to get a programmer, given Brian was able to apply the firmware update without it – I’m just trying to follow the same process he’s already had success with.

                      I have your SDK tools installed in VS Code, with the following:

                      nrfjprog version: 10.15.0 external
                      JLinkARM.dll version: 6.88a

                        Ahh I saw nrfjprog DLL in the output so I assumed you were trying to use a programmer. My bad.

                        As for the error, it’s stating a timeout. So, I would make sure that

                        1. The mfw_update sample is programmed to the device
                        2. That you’re choosing the correct serial port
                        3. Also to be doubly sure make sure you install the python requirements:
                        pip3 install -r requirements.txt

                        Run that within the mfw_update sample folder.

                        Hope that helps

                          Also what type of system are you running this on? Windows?

                            PS D:\David\Wildlife\DeterGents\ZephyrToolsRepo\nfed\samples\mfw_update> pip3 install -r requirements.txt
                            Requirement already satisfied: pynrfjprog in c:\users\barli.zephyrtools\env\lib\site-packages (from -r requirements.txt (line 1)) (10.15.2)
                            Requirement already satisfied: future in c:\users\barli.zephyrtools\env\lib\site-packages (from pynrfjprog->-r requirements.txt (line 1)) (0.18.2)

                            I am running on Windows, yes.

                              I’ve also verified the device is on COM3

                                The mfw_update sample is programmed to the device

                                Can you confirm Barliesque?

                                Make sure your device is not in bootloader when running the script. Also make sure you don’t have any open serial connections to COM3.

                                  Yes, mfw_update is programmed to the device. Using LTE Link Monitor, all AT commands fail – not sure if that’s expected for mfw_update. Pressing the RST button doesn’t cause any messages to appear there either. Before attempting the upload, I’ve made sure to close down anything that could have an open connection to COM3. The blue light is unlit, so not in bootloader mode.

                                    Are you able to run at_client on that board? (And try again with AT commands?) The mfw_update firmware will not respond to AT commands.
