Seven new tests in tests/unit/test_connection_manager.py covering all
four G3 findings:
- test_g3_tcp_connect_returns_plain_string (F01): CONNECTED event
payload contains a plain string, not an asyncio.Future.
- test_g3_reconnect_loop_does_not_compound (F03): after max_attempts
failures, exactly that many connect() calls are made — no fan-out.
- test_g3_disconnect_cancels_reconnect_loop (F03): disconnect()
mid-loop cancels the single task cleanly.
- test_g3_reconnect_callback_called_after_reconnect (F02): callback
is invoked after a successful reconnect.
- test_g3_reconnect_callback_failure_does_not_crash_loop (F02):
callback exception is logged, reconnect still succeeds.
- test_g3_connect_none_is_soft_failure (N11): connect() returning
None does not set _is_connected or emit CONNECTED.
- test_g3_no_reconnect_callback_is_noop (N11/F02): no callback
provided — reconnect works, backwards-compatible.
Refs: Forensics report findings F01, F02, F03, N11
ConnectionManager._attempt_reconnect called self.connection.connect()
directly, bypassing MeshCore.connect() which runs send_appstart().
Firmware requires CMD_APP_START after every transport-level connection
to initialize the session. Without it, the reconnected transport has
no active session — sends go unanswered, tcp_no_response fires after
5 attempts, handle_disconnect re-enters _attempt_reconnect, and the
reconnect storm begins.
Fix: add an optional reconnect_callback parameter to
ConnectionManager.__init__. MeshCore passes self._on_reconnect which
calls send_appstart() after the transport reconnects. The callback
is invoked inside _attempt_reconnect immediately after a successful
connect(), before the CONNECTED event is emitted. Callback failures
are logged as warnings but do not break the reconnect — the transport
is up regardless. Default None keeps the API backwards-compatible
for direct ConnectionManager users.
Refs: Forensics report finding F02
_attempt_reconnect previously tail-recursed via asyncio.create_task,
re-assigning self._reconnect_task from inside the running coroutine.
This orphaned the current task — disconnect() cancelled only the
newest pointer, leaving previous-generation attempts in flight. Those
orphaned tasks could set _is_connected = True after the caller thought
the session was closed.
Fix: replace with a single iterative while loop that holds one
persistent task for the entire reconnect session. The task is created
once in handle_disconnect and torn down only on success, max attempts
exhausted, or disconnect() cancellation.
Refs: Forensics report finding F03
The three transports had inconsistent connect() return contracts:
TCP returned an asyncio.Future (fixed by F01), serial returned
self.port (always truthy), BLE returned self.address or None.
ConnectionManager's success check `if result is not None:` was
tautological for TCP and serial. With F01 fixed, the check is now
meaningful for all three.
Add a comprehensive docstring to ConnectionProtocol documenting the
contract: return truthy on success (included in CONNECTED event
payload), return None for soft failure (retry), or raise for hard
failure (also retry, logged). Also import Awaitable for the F02
reconnect_callback type hint that follows.
Refs: Forensics report finding N11
TCPConnection.connect() returned a resolved asyncio.Future wrapping
self.host instead of the plain string. ConnectionManager put this
Future directly into the CONNECTED event payload, which crashed any
downstream serializer (e.g. HA recorder's sanitize_event_data) that
tried to walk the payload dict. BLE and serial already return plain
strings. Fix: delete the Future creation and return self.host
directly.
Refs: Forensics report finding F01
- Update mock dispatcher to use subscribe-before-send pattern matching
the rewritten CommandHandler.send() method
- Use 32-byte pubkeys in tests for commands that now require
prefix_length=32 (login, logout, statusreq, reset_path, share/export/remove contact)
- Fix send_trace test path format to match flags=1 (2-byte path hashes)
- Update LPP current test to expect signed wrap for values > 32.767
- Fix BinaryReqType import (moved from meshcore.parsing to meshcore.packets)
- Fix register_binary_request call signature (added pubkey_prefix param)
- Update timeout test to expect 'no_event_received' instead of 'timeout'
Add warnings to send_login, send_statusreq, send_telemetry_req, and
send_path_discovery pointing users to their _sync counterparts. The
fire-and-forget versions bypass the mesh request lock and can cause
silent response drops due to firmware clearPendingReqs() behavior.
The companion firmware can only track one outstanding mesh request at a
time — clearPendingReqs() zeros all pending response flags before each
outgoing mesh request. Overlapping mesh commands cause silent response
drops.
Adds _mesh_request_lock to CommandHandlerBase and wraps all _sync
methods with it. Also adds send_login_sync and send_path_discovery_sync
for complete round-trip serialization of those commands.
Local commands (get_bat, get_channel, set_time, send_msg, etc.) are
unaffected — they don't trigger clearPendingReqs() on the firmware.