Merge remote-tracking branch 'origin/rfsim-deadlock-avoidance' into integration_2025_w06 (!3246)
Deadlock avoidance in rfsimulator

This change introduces a countermeasure for a deadlock in rfsimulator. The deadlock occurs when all entities are waiting for new data to arrive. It shows up with two or more clients, when a new client connects.

I think the issue is caused by the ordering of fullwrite calls, which results in out-of-order delivery of packets and eventually in packets being trashed on the receiving side. Out-of-order delivery warnings are printed just before the system deadlocks, but I have not found a better solution so far.

The workaround ensures the server never locks up permanently: after 10 tries, it ignores a client's failure to write on time. This was tested locally with both the UE as server and the gNB as server and works correctly. When the deadlock is detected, the added log is printed several times, the deadlock clears, and the system returns to normal.

gdb output of the executables during the deadlock:

UE:
    $7 = {conn_sock = 98, lastReceivedTS = 3226163740, headerMode = true, trashingPacket = false, th = {size = 13184, nbAnt = 1, timestamp = 3226150556, option_value = 0, option_flag = 0}, transferPtr = 0x7f6a500018a8 "\200\063", remainToTransfer = 24, circularBufEnd = 0x7f6a503b3ac0 "", circularBuf = 0x7f6a501f1ac0, channel_model = 0x0}
    (gdb) p t->buf[5]
    $8 = {conn_sock = 97, lastReceivedTS = 0, headerMode = true, trashingPacket = false, th = {size = 0, nbAnt = 0, timestamp = 0, option_value = 0, option_flag = 0}, transferPtr = 0x7f6a50001900 "", remainToTransfer = 24, circularBufEnd = 0x7f6a50575ad0 "", circularBuf = 0x7f6a503b3ad0, channel_model = 0x0}
    nextRxTimestamp 3225937740
    nsamps = 30720

gNB 1:
    (gdb) p t->buf[0]
    $4 = {conn_sock = 95, lastReceivedTS = 3226026876, headerMode = true, trashingPacket = false, th = {size = 1, nbAnt = 1, timestamp = 3226026875, option_value = 0, option_flag = 0}, transferPtr = 0x7f8dfc003ab8 "\001", remainToTransfer = 24, circularBufEnd = 0x7f8e1c3ff010 "", circularBuf = 0x7f8e1c23d010, channel_model = 0x0}
    nextRxTimestamp 3225996956

gNB 2:
    lastReceivedTS = 3226026875
    $2 = {conn_sock = 95, lastReceivedTS = 3226026875, headerMode = true, trashingPacket = false, th = {size = 1, nbAnt = 1, timestamp = 3226026875, option_value = 0, option_flag = 0}, transferPtr = 0x744898003ab8 "\001", remainToTransfer = 24, circularBufEnd = 0x7448bc2e7010 "", circularBuf = 0x7448bc125010, channel_model = 0x0}
    nextRxTimestamp 3226026875

As you can see, all executables are in the have_to_wait state.