Compare commits

...

60 Commits

Author SHA1 Message Date
de3f5c9677 fix: 前端时钟校准问题 2026-04-18 17:01:13 +08:00
ae2f1c3156 fix: 前后端时钟问题 2026-04-18 16:07:20 +08:00
212459a8e4 feat: 增加日志模块 2026-04-18 12:52:39 +08:00
b700dab484 fix: 传输方向反了 2026-04-17 22:01:04 +08:00
57d79a2759 fix: 传输方向反了! 2026-04-17 17:17:59 +08:00
1c845ba51e fix: C端日志采样、每条 KCP 连接每秒 1 次 periodic snapshot 2026-04-16 11:32:48 +08:00
d64329214d fix: 机器人摄像头端口号 2026-04-15 17:33:49 +08:00
7d6b7da157 fix: 临时放宽排空速度 2026-04-15 17:02:08 +08:00
df9a56af53 fix: 临时放宽排空速度 2026-04-15 16:48:12 +08:00
fd34330081 fix: 摄像头端口号固定 2026-04-15 16:35:28 +08:00
9b705dd8f8 fix: 摄像头端口号固定 2026-04-15 16:32:03 +08:00
c6484e847e chore: jazzy 2026-04-15 13:28:43 +08:00
fab2559980 chore: video26 2026-04-15 13:27:55 +08:00
9c2df9d674 fix: 删除视频侧 server_idle_reconnect_ms 误判检查 2026-04-14 22:44:36 +08:00
bd0a282344 fix: 删除视频侧 server_idle_reconnect_ms 误判检查 2026-04-14 22:31:33 +08:00
aa6235de5a fix: blitz-run-context.service 失败 2026-04-14 21:26:56 +08:00
1c2cf157d2 fix: blitz-run-context.service 失败 2026-04-14 21:13:27 +08:00
e895cdc9de feat: 日志增强功能 2026-04-14 20:52:41 +08:00
nnbcccscdscdsc
579e67a3db fix:5G不通走其他网络,5G通了走5G 2026-04-14 17:37:44 +08:00
nnbcccscdscdsc
ebb047c7b5 feat:新增gps监控服务(断开重连) 2026-04-14 15:16:11 +08:00
bb3e7b2989 fix: 视频程序也需要有 stale-session 检测 2026-04-14 13:22:19 +08:00
6ccd9e9fa1 test: b_side进程断开现象 2026-04-14 00:34:56 +08:00
71c026ccf3 fix: ip route在接口变化后也要更新,执行route repair 2026-04-14 00:15:25 +08:00
4805cc772d Merge branch 'c' of https://106.52.207.92:9103/limingjie/OmniSocketGo into c 2026-04-13 23:46:20 +08:00
3bbeaab0c3 fix: 修复模拟接口变更名字后(可能是掉网卡也可能不是),重恢复 2026-04-13 23:46:19 +08:00
nnbcccscdscdsc
a3d8835074 Merge branch 'c' of https://106.52.207.92:9103/limingjie/OmniSocketGo into c 2026-04-13 22:33:40 +08:00
nnbcccscdscdsc
947ecb2a2b fix: GPS采集端修复 2026-04-13 22:33:20 +08:00
25c68530ba feat: 自启动与自恢复机制 2026-04-13 21:55:40 +08:00
nnbcccscdscdsc
2f507a7546 fix:修改默认网路只有5G模组问题 2026-04-13 17:22:18 +08:00
nnbcccscdscdsc
7dc47d310d feat:5g代码迁移&修改配置文件 2026-04-13 15:55:25 +08:00
d819f9ca4d fix: KCP日志在A端记录 2026-04-11 20:43:14 +08:00
9009107a64 fix: 摄像头亮度默认夜间 2026-04-11 20:00:57 +08:00
f00d6661c0 fix: 保留eno3,服务之间不再相互依赖 2026-04-11 19:10:31 +08:00
8b6a243a0d Merge branch 'c' of https://106.52.207.92:9103/limingjie/OmniSocketGo into c 2026-04-11 17:20:52 +08:00
7622360a0e fix: 5G自动拨号、软件时钟同步、机器人端控制程序自启。 2026-04-11 17:20:52 +08:00
nnbcccscdscdsc
980d3b45e1 fix:更新视频帧尾部携带数据size 2026-04-11 16:41:31 +08:00
nnbcccscdscdsc
7ca136870d fix:更新gps数据处理 2026-04-11 16:33:26 +08:00
09dd9e24c0 Merge branch 'c' of https://106.52.207.92:9103/limingjie/OmniSocketGo into c 2026-04-11 16:06:52 +08:00
14ce3d4e1d feat: 5G自动拨号、软件时钟同步、机器人端控制程序自启。 2026-04-11 16:06:51 +08:00
nnbcccscdscdsc
757e6da2b2 fix: 对接GPS数据 2026-04-11 12:15:27 +08:00
84e0cc54d2 fix: 断联视频堆积问题与控制命令失效问题 2026-04-11 03:55:19 +08:00
Mock
6f727dbe57 fix: 程序断掉后超时清理连接 2026-04-10 13:21:24 +08:00
Mock
6cedf859db fix: 当前的 relay 实现原来只记了一个“最后发包的下游客户端地址” 2026-04-10 12:13:19 +08:00
Mock
6c5d410bdc fix: 修复启动bash路径 2026-04-10 11:56:04 +08:00
Mock
251d69c4ff fix: 修复启动bash路径 2026-04-10 11:55:03 +08:00
Mock
40cd68db3d feat: 修复启动bash路径、将丢失连接时的视频稍微不堆积 2026-04-10 11:49:38 +08:00
Mock
f443934ee4 Merge branch 'c' of https://106.52.207.92:9103/limingjie/OmniSocketGo into c 2026-04-10 11:11:05 +08:00
Mock
79dba2a664 feat: 长保持连接,控制端可重启 2026-04-10 11:11:03 +08:00
6529f0f048 feat: 脚本执行权限 2026-04-10 10:24:01 +08:00
Mock
2033db7268 check: 查看机器人速度topic信息 2026-04-09 22:40:01 +08:00
Mock
bc18f9a27b check: 查看机器人速度topic信息 2026-04-09 22:28:30 +08:00
Mock
d76230b9b0 feat: 启动命令不用环境变量 2026-04-09 18:13:03 +08:00
Mock
11e67282c7 feat: 增加链路统计信息,两个链路分别显示在前端,D向A汇报D与B的信息 2026-04-09 13:38:10 +08:00
nnbcccscdscdsc
e72f7f3fd9 feat: pipeline_gps (暂未改动) 2006-08-23 15:56:50 +08:00
70e835ed49 feat: 视频与控制程序合并 2026-04-04 23:25:43 +08:00
9ffc36f50d feat: 基于Python ROS2的控制程序 2026-04-03 20:00:33 +08:00
6ece408d9f chore: Update .gitignore to ignore all build directories 2026-04-03 17:12:49 +08:00
8a1baa64c0 feat: C控制程序对接KCP 2026-04-03 12:04:39 +08:00
628583f79d Merge branch 'c' of https://106.52.207.92:9103/limingjie/OmniSocketGo into c 2026-04-02 16:59:09 +08:00
7b4a508c46 fix: 编译视频程序使警告消失 2026-04-02 16:59:08 +08:00
110 changed files with 15631 additions and 824 deletions

View File

@@ -11,7 +11,8 @@
"Bash(git pull:*)",
"Bash(wc:*)",
"Bash(python3 -c \"import dis, marshal, types; f = open\\(''''C:/Users/64187/Desktop/Workspace/OmniSocketGo/__pycache__/omnisocket_video_sender.cpython-312.pyc'''',''''rb''''\\); f.read\\(16\\); code=marshal.load\\(f\\); dis.dis\\(code\\)\")",
"Bash(gh pr:*)"
"Bash(gh pr:*)",
"Bash(python3 -c ':*)"
]
}
}

0
.codex Normal file
View File

12
.gitignore vendored
View File

@@ -17,4 +17,14 @@ c/bin
/python/build
/python/omnisocket.egg-info
*.so*
*.so*
/.venv
**/build/
ros-control-py/install
ros-control-py/log
scripts/boot/modem_network_info.json
logs/

54
AGENT.md Normal file
View File

@@ -0,0 +1,54 @@
# 通信场景说明
## 系统拓扑
- 系统中有 `A/B/C/D` 四个终端。
- `A``C` 在同一侧,`B``D` 在同一侧。
- 实际运行时不是一条端到端 KCP而是两条独立 KCP 链路:
- `B <-> D`
- `D <-> A`
- `C` 只负责 UDP 端口转发。
- 整体目标是让 `A``B` 双向通信:
- `A -> B` 发送控制命令
- `B -> A` 发送视频和其他反馈数据
## 网络情况
- 主要瓶颈在 `B` 侧 5G 链路,`A` 侧是有线网络,不是主要约束。
- `B <-> D` 的时延抖动较大:
- 常见约 `15-60 ms`
- 最差约 `80-90 ms`
- 带宽是明显非对称的:
- `B -> D``8 Mbps`,偏低且不稳定
- `D -> B``27-50 Mbps`,更高但也有波动
- 工程含义:
- 控制流必须优先保障
- 视频流必须限速并具备自适应能力
- 不能默认把所有流合成一个共享传输就一定更优
## 三个项目的角色
- `OmniSocketGo`
- 底层传输项目
- 使用 C 编写
- 提供 OmniSocket/KCP 传输能力,以及视频发送等原生传输程序
- `robot-command-center`
- A 侧上层应用
- 主要负责接收视频帧,并提供监控/展示能力
- 当前通过本机 `A-side OmniDaemon` 使用传输能力
- `tienkung-szu`
- 上层控制项目
- 同时包含 A 侧控制发送和 B 侧控制执行逻辑
- A 侧典型入口是键盘/Xbox sender
- B 侧典型入口是 `rl_control_node.py` / `rl_control_node_sim.py`,负责接收控制并驱动机器人行为
## 当前架构方向
- `A` 侧通过 `A-side OmniDaemon` 统一管理本机的控制发送和视频接收。
- `B` 侧通过 `B-side OmniDaemon` 统一管理本机的控制接收和视频发送。
- 核心原则:
- 控制流与视频流逻辑分离
- 每台主机通过本地 daemon 统一持有传输会话
- 后续围绕真实网络瓶颈做 telemetry、调度和速率自适应
## 代码原则
- 一定要精简,不能写过于复杂和冗余的代码,因为必须确保各个环节可控,在低延迟的要求下,需要细化各个环节的消耗
## SSH原则
- 如果你需要SSH到某个远程端执行操作一定要确保不做删除文件操作确保留有记录

54
CLAUDE.md Normal file
View File

@@ -0,0 +1,54 @@
# 通信场景说明
## 系统拓扑
- 系统中有 `A/B/C/D` 四个终端。
- `A``C` 在同一侧,`B``D` 在同一侧。
- 实际运行时不是一条端到端 KCP而是两条独立 KCP 链路:
- `B <-> D`
- `D <-> A`
- `C` 只负责 UDP 端口转发。
- 整体目标是让 `A``B` 双向通信:
- `A -> B` 发送控制命令
- `B -> A` 发送视频和其他反馈数据
## 网络情况
- 主要瓶颈在 `B` 侧 5G 链路,`A` 侧是有线网络,不是主要约束。
- `B <-> D` 的时延抖动较大:
- 常见约 `15-60 ms`
- 最差约 `80-90 ms`
- 带宽是明显非对称的:
- `B -> D``8 Mbps`,偏低且不稳定
- `D -> B``27-50 Mbps`,更高但也有波动
- 工程含义:
- 控制流必须优先保障
- 视频流必须限速并具备自适应能力
- 不能默认把所有流合成一个共享传输就一定更优
## 三个项目的角色
- `OmniSocketGo`
- 底层传输项目
- 使用 C 编写
- 提供 OmniSocket/KCP 传输能力,以及视频发送等原生传输程序
- `robot-command-center`
- A 侧上层应用
- 主要负责接收视频帧,并提供监控/展示能力
- 当前通过本机 `A-side OmniDaemon` 使用传输能力
- `tienkung-szu`
- 上层控制项目
- 同时包含 A 侧控制发送和 B 侧控制执行逻辑
- A 侧典型入口是键盘/Xbox sender
- B 侧典型入口是 `rl_control_node.py` / `rl_control_node_sim.py`,负责接收控制并驱动机器人行为
## 当前架构方向
- `A` 侧通过 `A-side OmniDaemon` 统一管理本机的控制发送和视频接收。
- `B` 侧通过 `B-side OmniDaemon` 统一管理本机的控制接收和视频发送。
- 核心原则:
- 控制流与视频流逻辑分离
- 每台主机通过本地 daemon 统一持有传输会话
- 后续围绕真实网络瓶颈做 telemetry、调度和速率自适应
## 代码原则
- 一定要精简,不能写过于复杂和冗余的代码,因为必须确保各个环节可控,在低延迟的要求下,需要细化各个环节的消耗
## SSH原则
- 如果你需要SSH到某个远程端执行操作一定要确保不做删除文件操作确保留有记录

View File

@@ -4,6 +4,10 @@ CPPFLAGS ?= -Iinclude -Ithird_party/cjson -Ithird_party/kcp
LDFLAGS ?= -pthread
PYTHON ?= python3
ifeq ($(QUIET_FFMPEG_LOGS),1)
CFLAGS += -DQUIET_FFMPEG_LOGS
endif
BIN_DIR := bin
SRC_DIR := src
CMD_DIR := cmd
@@ -37,8 +41,9 @@ TARGETS := \
$(BIN_DIR)/kcpping
CAMERA_VIDEO_SENDER := $(BIN_DIR)/camera_video_sender
CAMERA_VIDEO_SENDER_SRCS := \
$(CMD_DIR)/v1_camera_pipeline_ifdef.c \
FFMPEG_PIPELINE_COMMON_SRCS := \
$(SRC_DIR)/video_pipeline.c \
$(SRC_DIR)/gps_buffer.c \
$(SRC_DIR)/omni_common.c \
$(SRC_DIR)/protocol.c \
$(SRC_DIR)/latencylog.c \
@@ -50,6 +55,15 @@ CAMERA_VIDEO_SENDER_SRCS := \
third_party/cjson/cJSON.c \
third_party/kcp/ikcp.c
CAMERA_VIDEO_SENDER_SRCS := \
$(CMD_DIR)/v1_camera_pipeline_ifdef.c \
$(FFMPEG_PIPELINE_COMMON_SRCS)
B_SIDE_OMNID := $(BIN_DIR)/b_side_omnid
B_SIDE_OMNID_SRCS := \
$(CMD_DIR)/b_side_omnid.c \
$(FFMPEG_PIPELINE_COMMON_SRCS)
all: $(TARGETS)
$(BIN_DIR):
@@ -77,10 +91,15 @@ $(BIN_DIR)/kcpping: $(CMD_DIR)/kcpping.c $(COMMON_SRCS) | $(BIN_DIR)
$(CC) $(CFLAGS) $(CPPFLAGS) -o $@ $^ $(LDFLAGS)
$(CAMERA_VIDEO_SENDER): $(CAMERA_VIDEO_SENDER_SRCS) | $(BIN_DIR)
$(CC) $(CFLAGS) $(CPPFLAGS) $$(pkg-config --cflags libavformat libavcodec libavutil libswscale) -o $@ $^ $(LDFLAGS) $$(pkg-config --libs libavformat libavcodec libavutil libswscale)
$(CC) $(CFLAGS) $(CPPFLAGS) $$(pkg-config --cflags libavformat libavcodec libavutil libswscale) -o $@ $^ $(LDFLAGS) $$(pkg-config --libs libavformat libavcodec libavutil libswscale) -lm
camera_video_sender: $(CAMERA_VIDEO_SENDER)
$(B_SIDE_OMNID): $(B_SIDE_OMNID_SRCS) | $(BIN_DIR)
$(CC) $(CFLAGS) $(CPPFLAGS) $$(pkg-config --cflags libavformat libavcodec libavutil libswscale) -o $@ $^ $(LDFLAGS) $$(pkg-config --libs libavformat libavcodec libavutil libswscale) -lm
b_side_omnid: $(B_SIDE_OMNID)
clean:
rm -rf $(BIN_DIR)
@@ -90,4 +109,4 @@ python-ext:
python-install:
cd python && $(PYTHON) -m pip install -e .
.PHONY: all clean python-ext python-install camera_video_sender
.PHONY: all clean python-ext python-install camera_video_sender b_side_omnid

View File

@@ -33,10 +33,14 @@ Server `D` runs the KCP hub on `0.0.0.0:10909`:
```bash
./bin/kcpserver -listen 0.0.0.0:10909 \
-kcp-ts-debug-log logs/d-kcp-ts.jsonl \
-kcp-session-stats-log logs/d-kcp-stats.jsonl
-telemetry-peer peer-a-telemetry \
-telemetry-interval 1000ms \
-kcp-session-stats-log logs/d-kcp-stats.jsonl \
-kcp-session-stats-interval 1000ms
```
For multi-hour runs, keep `-latency-log` and `-kcp-ts-debug-log` off unless you are collecting a short repro trace.
Relay `C` runs a raw UDP forwarder to `D`:
```bash

1283
cmd/b_side_omnid.c Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -262,9 +262,9 @@ int main(int argc, char **argv) {
goto cleanup;
}
if (relay_via[0] != '\0') {
fprintf(stderr, "opened KCP session as %s; logical server=%s, actual dial target=%s via relay; register not yet confirmed\n", kcp_client_id(client), server_addr, actual_dial_target);
fprintf(stderr, "opened KCP session as %s; logical server=%s, actual dial target=%s via relay; registration confirmed\n", kcp_client_id(client), server_addr, actual_dial_target);
} else {
fprintf(stderr, "opened KCP session as %s; logical server=%s, actual dial target=%s; register not yet confirmed\n", kcp_client_id(client), server_addr, actual_dial_target);
fprintf(stderr, "opened KCP session as %s; logical server=%s, actual dial target=%s; registration confirmed\n", kcp_client_id(client), server_addr, actual_dial_target);
}
receive_ctx.client = client;

View File

@@ -6,6 +6,7 @@ static void kcpserver_usage(FILE *out) {
fprintf(out, "usage: kcpserver [-mode hub|relay] [-listen addr] [-bind-device dev]\n");
fprintf(out, " [-latency-log path] [-kcp-ts-debug-log path]\n");
fprintf(out, " [-kcp-session-stats-log path] [-kcp-session-stats-interval 100ms]\n");
fprintf(out, " [-telemetry-peer peer-id] [-telemetry-interval 500ms]\n");
fprintf(out, " [-relay-remote addr] [-relay-listen addr] [-relay-peer addr]\n");
}
@@ -17,10 +18,13 @@ int main(int argc, char **argv) {
const char *packet_log_path = "";
const char *stats_log_path = "";
const char *stats_interval_raw = "";
const char *telemetry_peer_id = "";
const char *telemetry_interval_raw = "";
const char *relay_listen_alias = "";
const char *relay_remote_addr = "";
const char *relay_peer_alias = "";
int stats_interval_ms = KCP_DEFAULT_STATS_INTERVAL_MS;
int telemetry_interval_ms = 500;
int i;
int rc = 1;
@@ -84,6 +88,20 @@ int main(int argc, char **argv) {
stats_interval_raw = value;
continue;
}
if ((handled = cli_parse_value_flag(argc, argv, &i, argv[i], "-telemetry-peer", &value)) < 0) {
fprintf(stderr, "kcpserver: flag -telemetry-peer requires a value\n");
return 1;
} else if (handled) {
telemetry_peer_id = value;
continue;
}
if ((handled = cli_parse_value_flag(argc, argv, &i, argv[i], "-telemetry-interval", &value)) < 0) {
fprintf(stderr, "kcpserver: flag -telemetry-interval requires a value\n");
return 1;
} else if (handled) {
telemetry_interval_raw = value;
continue;
}
if ((handled = cli_parse_value_flag(argc, argv, &i, argv[i], "-relay-listen", &value)) < 0) {
fprintf(stderr, "kcpserver: flag -relay-listen requires a value\n");
return 1;
@@ -118,6 +136,10 @@ int main(int argc, char **argv) {
fprintf(stderr, "kcpserver: invalid -kcp-session-stats-interval value %s\n", stats_interval_raw);
return 1;
}
if (omni_parse_duration_ms(telemetry_interval_raw, 500, &telemetry_interval_ms) != 0) {
fprintf(stderr, "kcpserver: invalid -telemetry-interval value %s\n", telemetry_interval_raw);
return 1;
}
if (relay_peer_alias[0] != '\0' && relay_remote_addr[0] != '\0' && strcmp(relay_peer_alias, relay_remote_addr) != 0) {
fprintf(stderr, "kcpserver: flags -relay-remote and -relay-peer must match when both are set\n");
@@ -178,6 +200,10 @@ int main(int argc, char **argv) {
fprintf(stderr, "kcpserver: create hub failed\n");
goto cleanup;
}
if (telemetry_peer_id[0] != '\0' && kcp_hub_set_telemetry(hub, telemetry_peer_id, telemetry_interval_ms) != 0) {
fprintf(stderr, "kcpserver: configure telemetry peer %s failed\n", telemetry_peer_id);
goto cleanup;
}
fprintf(stderr, "kcp hub listening on %s\n", listen_addr);
if (kcp_hub_serve_listener(hub, listener) != 0) {
fprintf(stderr, "kcpserver: serve listener failed\n");
@@ -188,6 +214,10 @@ int main(int argc, char **argv) {
}
if (strcmp(mode, "relay") == 0) {
if (telemetry_peer_id[0] != '\0') {
fprintf(stderr, "kcpserver: flag -telemetry-peer may only be used in hub mode\n");
return 1;
}
if (bind_device[0] != '\0') {
fprintf(stderr, "kcpserver: flag -bind-device is not supported in relay mode\n");
return 1;

View File

@@ -1,687 +1,35 @@
// camera_pipeline_ifdef_fixed.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <errno.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/time.h>
#include <linux/videodev2.h>
#include <time.h>
// FFmpeg头文件 - 使用纯C包含
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libavutil/avutil.h>
#include <libavutil/imgutils.h>
#include <libavutil/opt.h>
#include <libswscale/swscale.h>
#include "video_pipeline.h"
#include "peer_kcp_client.h"
int main(void) {
video_pipeline_config_t config;
video_pipeline_stats_t stats;
// ==========================================
// 1. 配置区域:在这里开启或关闭时间打印
// ==========================================
#define DEBUG_TIMING // 注释掉这一行,所有时间打印和计算都会消失
// 定义打印宏
#ifdef DEBUG_TIMING
#define PRINT_TIME(fmt, ...) printf(fmt, ##__VA_ARGS__)
#else
#define PRINT_TIME(fmt, ...) // 什么都不做,编译器会优化掉
#endif
#define WIDTH 1280
#define HEIGHT 720
#define OUTPUT_WIDTH 640
#define OUTPUT_HEIGHT 360
#define NUM_BUFFERS 4
#define CLEAR(x) memset(&(x), 0, sizeof(x))
#define VIDEO_SERVER_ADDR_ENV "OMNI_VIDEO_SERVER_ADDR"
#define VIDEO_RELAY_ADDR_ENV "OMNI_VIDEO_RELAY_VIA"
#define VIDEO_BIND_IP_ENV "OMNI_VIDEO_BIND_IP"
#define VIDEO_BIND_DEVICE_ENV "OMNI_VIDEO_BIND_DEVICE"
#define VIDEO_PEER_ID_ENV "OMNI_VIDEO_PEER_ID"
#define VIDEO_TARGET_PEER_ENV "OMNI_VIDEO_TARGET_PEER"
#define VIDEO_DEFAULT_PEER_ID "peer-b-video"
#define VIDEO_DEFAULT_TARGET_PEER "peer-a-video"
typedef struct
{
kcp_client_t *client;
char target_peer[OMNI_MAX_PEER_ID];
} VideoSender;
static int video_sender_init(VideoSender *sender);
static int video_sender_send_packet(VideoSender *sender, const AVPacket *encoded_pkt, uint64_t timestamp);
static void video_sender_close(VideoSender *sender);
typedef struct
{
void *start;
size_t length;
} Buffer;
double get_time_ms()
{
struct timespec ts;
clock_gettime(CLOCK_MONOTONIC, &ts);
return ts.tv_sec * 1000.0 + ts.tv_nsec / 1000000.0;
}
int64_t get_realtime_ms()
{
struct timespec ts;
clock_gettime(CLOCK_REALTIME, &ts);
return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}
static const char *env_or_default(const char *name, const char *fallback)
{
const char *value = getenv(name);
if (value != NULL && value[0] != '\0')
return value;
return fallback;
}
static int video_sender_init(VideoSender *sender)
{
const char *server_addr = getenv(VIDEO_SERVER_ADDR_ENV);
const char *relay_addr = env_or_default(VIDEO_RELAY_ADDR_ENV, "");
const char *bind_ip = env_or_default(VIDEO_BIND_IP_ENV, "");
const char *bind_device = env_or_default(VIDEO_BIND_DEVICE_ENV, "");
const char *peer_id = env_or_default(VIDEO_PEER_ID_ENV, VIDEO_DEFAULT_PEER_ID);
const char *target_peer = env_or_default(VIDEO_TARGET_PEER_ENV, VIDEO_DEFAULT_TARGET_PEER);
kcp_conn_options_t options;
if (sender == NULL)
{
errno = EINVAL;
return -1;
video_pipeline_config_init(&config);
video_pipeline_config_load_env(&config);
if (getenv("OMNI_VIDEO_DEBUG_TIMING") == NULL) {
config.enable_timing_logs = 1;
}
if (server_addr == NULL || server_addr[0] == '\0')
{
errno = EINVAL;
fprintf(stderr, "%s is required\n", VIDEO_SERVER_ADDR_ENV);
return -1;
if (video_pipeline_stats_init(&stats) != 0) {
perror("video_pipeline_stats_init");
return 1;
}
memset(sender, 0, sizeof(*sender));
snprintf(sender->target_peer, sizeof(sender->target_peer), "%s", target_peer);
for (;;) {
int rc = video_pipeline_run(&config, &stats, NULL);
kcp_conn_options_set_video_defaults(&options);
sender->client = kcp_client_dial_with_options(
server_addr,
relay_addr,
peer_id,
bind_ip,
bind_device,
&options,
NULL,
NULL,
NULL,
KCP_DEFAULT_STATS_INTERVAL_MS);
if (sender->client == NULL)
return -1;
fprintf(stderr, "Video sender connected as %s -> %s\n",
kcp_client_id(sender->client), sender->target_peer);
return 0;
}
static int video_sender_send_packet(VideoSender *sender, const AVPacket *encoded_pkt, uint64_t timestamp)
{
uint8_t *senddata;
size_t payload_len;
int rc;
if (sender == NULL || sender->client == NULL || encoded_pkt == NULL)
{
errno = EINVAL;
return -1;
}
if (encoded_pkt->size < 0)
{
errno = EINVAL;
return -1;
}
payload_len = (size_t)encoded_pkt->size + sizeof(timestamp);
senddata = (uint8_t *)malloc(payload_len);
if (senddata == NULL)
{
errno = ENOMEM;
return -1;
}
memcpy(senddata, encoded_pkt->data, (size_t)encoded_pkt->size);
memcpy(senddata + encoded_pkt->size, &timestamp, sizeof(timestamp));
rc = kcp_client_send_binary(
sender->client,
sender->target_peer,
senddata,
payload_len);
free(senddata);
return rc;
}
static void video_sender_close(VideoSender *sender)
{
if (sender == NULL || sender->client == NULL)
return;
kcp_client_close(sender->client);
kcp_client_free(sender->client);
sender->client = NULL;
}
int open_v4l2_device(const char *device)
{
int fd = open(device, O_RDWR | O_NONBLOCK);
if (fd < 0)
{
perror("open device");
return -1;
}
return fd;
}
int init_v4l2_device(int fd)
{
struct v4l2_format fmt = {0};
fmt.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
fmt.fmt.pix.width = WIDTH;
fmt.fmt.pix.height = HEIGHT;
fmt.fmt.pix.pixelformat = V4L2_PIX_FMT_MJPEG;
fmt.fmt.pix.field = V4L2_FIELD_NONE;
if (ioctl(fd, VIDIOC_S_FMT, &fmt) < 0)
{
perror("VIDIOC_S_FMT");
return -1;
}
PRINT_TIME("Set format: %dx%d MJPEG\n", WIDTH, HEIGHT);
return 0;
}
int init_mmap(int fd, Buffer **buffers, int *num_buffers)
{
struct v4l2_requestbuffers req = {0};
req.count = NUM_BUFFERS;
req.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
req.memory = V4L2_MEMORY_MMAP;
if (ioctl(fd, VIDIOC_REQBUFS, &req) < 0)
{
perror("VIDIOC_REQBUFS");
return -1;
}
*num_buffers = req.count;
*buffers = (Buffer *)calloc(req.count, sizeof(Buffer));
for (int i = 0; i < req.count; i++)
{
struct v4l2_buffer buf = {0};
buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
buf.memory = V4L2_MEMORY_MMAP;
buf.index = i;
if (ioctl(fd, VIDIOC_QUERYBUF, &buf) < 0)
{
perror("VIDIOC_QUERYBUF");
return -1;
if (rc == 0) {
break;
}
(*buffers)[i].length = buf.length;
(*buffers)[i].start = mmap(NULL, buf.length,
PROT_READ | PROT_WRITE,
MAP_SHARED, fd, buf.m.offset);
if ((*buffers)[i].start == MAP_FAILED)
{
perror("mmap");
return -1;
}
}
return 0;
}
AVCodecContext *create_mjpeg_decoder()
{
const AVCodec *decoder = avcodec_find_decoder(AV_CODEC_ID_MJPEG);
if (!decoder)
{
fprintf(stderr, "MJPEG decoder not found\n");
return NULL;
}
AVCodecContext *ctx = avcodec_alloc_context3(decoder);
ctx->width = WIDTH;
ctx->height = HEIGHT;
ctx->pix_fmt = AV_PIX_FMT_YUVJ420P; // 使用YUVJ420P
ctx->color_range = AVCOL_RANGE_JPEG; // JPEG范围
ctx->thread_count = 1;
AVDictionary *opts = NULL;
av_dict_set(&opts, "flags2", "+fast", 0);
if (avcodec_open2(ctx, decoder, &opts) < 0)
{
avcodec_free_context(&ctx);
av_dict_free(&opts);
return NULL;
}
av_dict_free(&opts);
printf("Decoder created with format: YUVJ420P\n");
return ctx;
}
AVCodecContext *create_mjpeg_encoder()
{
const AVCodec *encoder = avcodec_find_encoder(AV_CODEC_ID_MJPEG);
if (!encoder)
{
fprintf(stderr, "MJPEG encoder not found\n");
return NULL;
}
AVCodecContext *ctx = avcodec_alloc_context3(encoder);
ctx->width = OUTPUT_WIDTH;
ctx->height = OUTPUT_HEIGHT;
ctx->pix_fmt = AV_PIX_FMT_YUVJ420P; // 使用YUVJ420P
ctx->time_base = (AVRational){1, 30};
ctx->qmin = 8;
ctx->qmax = 31;
ctx->flags |= AV_CODEC_FLAG_QSCALE;
ctx->global_quality = FF_QP2LAMBDA * 5;
AVDictionary *opts = NULL;
av_dict_set(&opts, "huffman", "default", 0);
if (avcodec_open2(ctx, encoder, &opts) < 0)
{
avcodec_free_context(&ctx);
av_dict_free(&opts);
return NULL;
}
av_dict_free(&opts);
printf("Encoder created with format: YUVJ420P\n");
return ctx;
}
int decode_mjpeg_frame(AVCodecContext *decoder, const uint8_t *data, int size, AVFrame **frame)
{
if (frame == NULL)
return -1;
*frame = NULL;
AVPacket *pkt = av_packet_alloc();
if (!pkt)
return -1;
pkt->data = (uint8_t *)data;
pkt->size = size;
int ret = avcodec_send_packet(decoder, pkt);
if (ret < 0)
{
av_packet_free(&pkt);
return -1;
}
*frame = av_frame_alloc();
if (!*frame)
{
av_packet_free(&pkt);
return -1;
}
ret = avcodec_receive_frame(decoder, *frame);
av_packet_free(&pkt);
if (ret < 0)
av_frame_free(frame);
return ret;
}
int scale_frame(AVFrame *src, AVFrame **dst)
{
// 简单缩放,不设置复杂的色彩空间参数
struct SwsContext *sws_ctx = sws_getContext(
src->width, src->height, src->format,
OUTPUT_WIDTH, OUTPUT_HEIGHT, AV_PIX_FMT_YUVJ420P,
SWS_BILINEAR, NULL, NULL, NULL);
if (!sws_ctx)
{
fprintf(stderr, "Failed to create sws context\n");
return -1;
}
*dst = av_frame_alloc();
if (!*dst)
{
sws_freeContext(sws_ctx);
return -1;
}
(*dst)->width = OUTPUT_WIDTH;
(*dst)->height = OUTPUT_HEIGHT;
(*dst)->format = AV_PIX_FMT_YUVJ420P;
if (av_frame_get_buffer(*dst, 0) < 0)
{
fprintf(stderr, "Failed to allocate frame buffer\n");
av_frame_free(dst);
sws_freeContext(sws_ctx);
return -1;
}
int ret = sws_scale(sws_ctx, (const uint8_t *const *)src->data, src->linesize,
0, src->height, (*dst)->data, (*dst)->linesize);
sws_freeContext(sws_ctx);
if (ret < 0)
{
fprintf(stderr, "sws_scale failed\n");
av_frame_free(dst);
return -1;
}
return 0;
}
int encode_frame(AVCodecContext *encoder, AVFrame *frame, AVPacket **pkt)
{
if (pkt == NULL)
return -1;
*pkt = NULL;
*pkt = av_packet_alloc();
if (!*pkt)
return -1;
int ret = avcodec_send_frame(encoder, frame);
if (ret < 0)
{
av_packet_free(pkt);
return -1;
}
ret = avcodec_receive_packet(encoder, *pkt);
if (ret < 0)
av_packet_free(pkt);
return ret;
}
int main()
{
VideoSender sender;
memset(&sender, 0, sizeof(sender));
PRINT_TIME("=== V4L2 Direct Capture + FFmpeg Processing ===\n");
// 1. Open V4L2 device
int fd = open_v4l2_device("/dev/video0");
if (fd < 0)
{
fprintf(stderr, "Failed to open camera device\n");
return 1;
}
// 2. Initialize V4L2
if (init_v4l2_device(fd) < 0)
{
close(fd);
return 1;
}
// 3. Setup MMAP buffers
Buffer *buffers = NULL;
int num_buffers = 0;
if (init_mmap(fd, &buffers, &num_buffers) < 0)
{
close(fd);
return 1;
}
// 4. Create FFmpeg codecs
AVCodecContext *decoder = create_mjpeg_decoder();
AVCodecContext *encoder = create_mjpeg_encoder();
if (!decoder || !encoder)
{
fprintf(stderr, "Failed to create codecs\n");
close(fd);
return 1;
}
if (video_sender_init(&sender) < 0)
{
perror("video_sender_init");
video_sender_close(&sender);
avcodec_free_context(&encoder);
avcodec_free_context(&decoder);
close(fd);
return 1;
}
// 5. Queue buffers
for (int i = 0; i < num_buffers; i++)
{
struct v4l2_buffer buf = {0};
buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
buf.memory = V4L2_MEMORY_MMAP;
buf.index = i;
if (ioctl(fd, VIDIOC_QBUF, &buf) < 0)
{
perror("VIDIOC_QBUF");
close(fd);
if (rc != VIDEO_PIPELINE_RUN_RETRY_IMMEDIATE) {
perror("video_pipeline_run");
video_pipeline_stats_destroy(&stats);
return 1;
}
}
// 6. Start streaming
enum v4l2_buf_type type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
if (ioctl(fd, VIDIOC_STREAMON, &type) < 0)
{
perror("VIDIOC_STREAMON");
close(fd);
return 1;
}
// 7. Benchmark
// 使用宏控制打印表头
PRINT_TIME("\nRunning benchmark (100 frames)...\n");
PRINT_TIME("Frame | Capture | Decode | Scale | Encode | Total | Size | Marker\n");
PRINT_TIME("------|---------|--------|-------|--------|-------|------|--------\n");
for (int i = 0; i < 10000; i++)
{
// 只有在开启 DEBUG_TIMING 时才声明这些时间变量
#ifdef DEBUG_TIMING
double total_start = get_time_ms();
double capture_start, capture_end;
double decode_start, decode_end;
double scale_start, scale_end;
double encode_start, encode_end;
#endif
// Wait for frame
fd_set fds;
FD_ZERO(&fds);
FD_SET(fd, &fds);
struct timeval tv = {2, 0};
int r = select(fd + 1, &fds, NULL, NULL, &tv);
if (r <= 0)
{
PRINT_TIME("Timeout waiting for frame\n");
break;
}
#ifdef DEBUG_TIMING
capture_start = get_time_ms();
#endif
// Dequeue buffer
struct v4l2_buffer buf = {0};
buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
buf.memory = V4L2_MEMORY_MMAP;
if (ioctl(fd, VIDIOC_DQBUF, &buf) < 0)
{
perror("VIDIOC_DQBUF");
break;
}
#ifdef DEBUG_TIMING
capture_end = get_time_ms();
#endif
// Decode
#ifdef DEBUG_TIMING
decode_start = get_time_ms();
#endif
AVFrame *decoded_frame = NULL;
int ret = decode_mjpeg_frame(decoder,
(uint8_t *)buffers[buf.index].start, buf.bytesused, &decoded_frame);
#ifdef DEBUG_TIMING
decode_end = get_time_ms();
#endif
if (ret < 0 || !decoded_frame)
{
PRINT_TIME("Frame %d: Decode failed\n", i + 1);
ioctl(fd, VIDIOC_QBUF, &buf);
continue;
}
// Scale
#ifdef DEBUG_TIMING
scale_start = get_time_ms();
#endif
AVFrame *scaled_frame = NULL;
if (scale_frame(decoded_frame, &scaled_frame) < 0)
{
PRINT_TIME("Frame %d: Scale failed\n", i + 1);
av_frame_free(&decoded_frame);
ioctl(fd, VIDIOC_QBUF, &buf);
continue;
}
#ifdef DEBUG_TIMING
scale_end = get_time_ms();
#endif
// Encode
#ifdef DEBUG_TIMING
encode_start = get_time_ms();
#endif
AVPacket *encoded_pkt = NULL;
if (encode_frame(encoder, scaled_frame, &encoded_pkt) < 0)
{
PRINT_TIME("Frame %d: Encode failed\n", i + 1);
}
#ifdef DEBUG_TIMING
if (encoded_pkt && i % 50 == 0)
{
char filename[100];
sprintf(filename, "frame_%04d.jpg", i + 1);
FILE *f = fopen(filename, "wb");
if (f)
{
fwrite(encoded_pkt->data, 1, encoded_pkt->size, f);
fclose(f);
PRINT_TIME("Saved as %s\n", filename);
}
}
#endif
#ifdef DEBUG_TIMING
encode_end = get_time_ms();
double total_end = get_time_ms();
#endif
// 打印结果
#ifdef DEBUG_TIMING
PRINT_TIME("%5d | %7.1f | %6.1f | %5.1f | %6.1f | %5.1f | %4d KB | 0x%02x\n",
i + 1,
capture_end - capture_start,
decode_end - decode_start,
scale_end - scale_start,
encode_end - encode_start,
total_end - total_start,
encoded_pkt ? encoded_pkt->size / 1024 : 0,
encoded_pkt && encoded_pkt->size > 1 ? encoded_pkt->data[1] : 0);
#else
// 如果不开启宏,也打印一些基本信息
printf("Frame %d processed\n", i + 1);
#endif
int64_t unixtime_ms;
if (encoded_pkt)
{
unixtime_ms = get_realtime_ms();
}
if (encoded_pkt && video_sender_send_packet(&sender, encoded_pkt, unixtime_ms) != 0)
{
perror("video_sender_send_packet");
av_frame_free(&decoded_frame);
av_frame_free(&scaled_frame);
av_packet_free(&encoded_pkt);
ioctl(fd, VIDIOC_QBUF, &buf);
break;
}
// Cleanup
av_frame_free(&decoded_frame);
av_frame_free(&scaled_frame);
if (encoded_pkt)
av_packet_free(&encoded_pkt);
// Requeue buffer
if (ioctl(fd, VIDIOC_QBUF, &buf) < 0)
{
perror("VIDIOC_QBUF");
break;
}
}
// 8. Stop streaming
ioctl(fd, VIDIOC_STREAMOFF, &type);
// 9. Cleanup
for (int i = 0; i < num_buffers; i++)
{
if (buffers[i].start != MAP_FAILED)
{
munmap(buffers[i].start, buffers[i].length);
}
}
free(buffers);
video_sender_close(&sender);
avcodec_free_context(&encoder);
avcodec_free_context(&decoder);
close(fd);
video_pipeline_stats_destroy(&stats);
return 0;
}
// gcc -o v1_camera_pipeline_ifdef v1_camera_pipeline_ifdef.c $(pkg-config --cflags --libs libavformat libavcodec libavutil libswscale)

View File

@@ -0,0 +1,7 @@
#ifndef OMNI_CONTROL_PROTOCOL_H
#define OMNI_CONTROL_PROTOCOL_H
#define OMNI_CONTROL_PACKET_FLOATS 6
#define OMNI_CONTROL_PACKET_SIZE (OMNI_CONTROL_PACKET_FLOATS * sizeof(float))
#endif

16
include/gps_buffer.h Normal file
View File

@@ -0,0 +1,16 @@
#ifndef GPS_BUFFER_H
#define GPS_BUFFER_H
#include <stdint.h>
typedef struct gps_video_sample {
double latitude;
double longitude;
} gps_video_sample_t;
gps_video_sample_t get_latest_gps_for_video(void);
int gps_buffer_init(const char* host);
void gps_buffer_cleanup(void);
#endif

View File

@@ -24,8 +24,22 @@ typedef struct kcp_session_stats_record {
uint32_t rto_ms;
int has_srtt_ms;
int32_t srtt_ms;
int has_min_srtt_ms;
int32_t min_srtt_ms;
int has_srttvar_ms;
int32_t srttvar_ms;
int has_last_feedback_age_ms;
uint32_t last_feedback_age_ms;
int has_snd_wnd;
uint32_t snd_wnd;
int has_rmt_wnd;
uint32_t rmt_wnd;
int has_inflight;
uint32_t inflight;
int has_window_limit;
uint32_t window_limit;
int has_window_pressure_pct;
double window_pressure_pct;
int has_bytes_sent;
uint64_t bytes_sent;
int has_bytes_received;

View File

@@ -3,6 +3,7 @@
#include <errno.h>
#include <inttypes.h>
#include <limits.h>
#include <pthread.h>
#include <stdarg.h>
#include <stdbool.h>
@@ -31,6 +32,15 @@
typedef struct omni_file_logger {
FILE *file;
pthread_mutex_t mutex;
char path[PATH_MAX];
size_t current_bytes;
size_t buffered_bytes;
size_t flush_bytes;
size_t max_bytes;
int flush_interval_ms;
int max_files;
int immediate_flush;
uint64_t last_flush_monotonic_ms;
} omni_file_logger_t;
int64_t omni_now_unix_nano(void);
@@ -61,6 +71,7 @@ double omni_duration_ms_to_ns(double ms);
const char *omni_path_base_name(const char *path);
void omni_file_logger_init(omni_file_logger_t *logger, FILE *file);
void omni_file_logger_init_path(omni_file_logger_t *logger, FILE *file, const char *path, int immediate_flush);
void omni_file_logger_destroy(omni_file_logger_t *logger);
int omni_file_logger_write_line(omni_file_logger_t *logger, const char *line);

View File

@@ -16,17 +16,26 @@ typedef struct kcp_client_recv_meta {
char file_name[OMNI_MAX_FILE_NAME];
size_t body_len;
} kcp_client_recv_meta_t;
typedef struct kcp_client_state {
int connected;
int registered;
uint32_t server_idle_ms;
char last_server_error[256];
} kcp_client_state_t;
kcp_client_t *kcp_client_dial_with_options(const char *server_addr, const char *dial_addr, const char *peer_id, const char *bind_ip, const char *bind_device, const kcp_conn_options_t *options, latency_logger_t *logger, kcp_packet_debug_logger_t *packet_logger, kcp_session_stats_logger_t *stats_logger, int stats_interval_ms);
kcp_client_t *kcp_client_dial(const char *server_addr, const char *dial_addr, const char *peer_id, const char *bind_ip, const char *bind_device, latency_logger_t *logger, kcp_packet_debug_logger_t *packet_logger, kcp_session_stats_logger_t *stats_logger, int stats_interval_ms);
const char *kcp_client_id(const kcp_client_t *client);
int kcp_client_send_text(kcp_client_t *client, const char *to, const char *text);
int kcp_client_send_binary(kcp_client_t *client, const char *to, const void *data, size_t data_len);
int kcp_client_send_binary_with_id(kcp_client_t *client, const char *to, const void *data, size_t data_len, uint64_t *out_id);
int kcp_client_send_file_path(kcp_client_t *client, const char *to, const char *path);
int kcp_client_receive_timed(kcp_client_t *client, message_t *out_msg, int timeout_ms);
int kcp_client_receive(kcp_client_t *client, message_t *out_msg);
int kcp_client_receive_binary_into(kcp_client_t *client, void *buffer, size_t buffer_len, kcp_client_recv_meta_t *out_meta, int timeout_ms);
int kcp_client_persist_message(kcp_client_t *client, const message_t *msg, const char *inbox_dir, char *out_path, size_t out_path_len);
void kcp_client_state_snapshot(kcp_client_t *client, kcp_client_state_t *out_state);
void kcp_client_runtime_stats_snapshot(kcp_client_t *client, kcp_runtime_stats_t *out_stats);
int kcp_client_close(kcp_client_t *client);
void kcp_client_free(kcp_client_t *client);

View File

@@ -8,12 +8,24 @@ extern "C" {
#endif
typedef struct udp_client udp_client_t;
typedef struct udp_client_recv_meta {
message_type_t type;
uint64_t id;
char from[OMNI_MAX_PEER_ID];
char to[OMNI_MAX_PEER_ID];
char file_name[OMNI_MAX_FILE_NAME];
size_t body_len;
} udp_client_recv_meta_t;
udp_client_t *udp_client_dial_with_options(const char *server_addr, const char *peer_id, const char *bind_ip, const char *bind_device, latency_logger_t *logger, tx_timestamp_debug_logger_t *debug_logger, int enable_timestamping);
udp_client_t *udp_client_dial(const char *server_addr, const char *peer_id, const char *bind_ip, latency_logger_t *logger, tx_timestamp_debug_logger_t *debug_logger, int enable_timestamping);
const char *udp_client_id(const udp_client_t *client);
int udp_client_send_text(udp_client_t *client, const char *to, const char *text);
int udp_client_send_binary(udp_client_t *client, const char *to, const void *data, size_t data_len);
int udp_client_send_file_path(udp_client_t *client, const char *to, const char *path);
int udp_client_receive_timed(udp_client_t *client, message_t *out_msg, int timeout_ms);
int udp_client_receive(udp_client_t *client, message_t *out_msg);
int udp_client_receive_into(udp_client_t *client, void *buffer, size_t buffer_len, udp_client_recv_meta_t *out_meta, int timeout_ms);
int udp_client_persist_message(udp_client_t *client, const message_t *msg, const char *inbox_dir, char *out_path, size_t out_path_len);
int udp_client_close(udp_client_t *client);
void udp_client_free(udp_client_t *client);

View File

@@ -14,6 +14,7 @@ int kcp_hub_serve_listener(kcp_hub_t *hub, kcp_listener_t *listener);
int kcp_hub_serve_session(kcp_hub_t *hub, kcp_conn_t *conn);
int kcp_hub_set_relay(kcp_hub_t *hub, int relay_fd, const struct sockaddr *peer_addr, socklen_t peer_addr_len, int learn_peer);
int kcp_hub_set_telemetry(kcp_hub_t *hub, const char *peer_id, int interval_ms);
int kcp_hub_serve_relay(kcp_hub_t *hub);
int kcp_hub_close(kcp_hub_t *hub);

View File

@@ -34,6 +34,14 @@ extern "C" {
#define KCP_VIDEO_RCV_WND 256
#define KCP_VIDEO_MTU 1400
#define KCP_TELEMETRY_NODELAY 0
#define KCP_TELEMETRY_INTERVAL_MS 50
#define KCP_TELEMETRY_RESEND 0
#define KCP_TELEMETRY_NC 0
#define KCP_TELEMETRY_SND_WND 64
#define KCP_TELEMETRY_RCV_WND 64
#define KCP_TELEMETRY_MTU 1400
#define KCP_NODELAY KCP_DEFAULT_NODELAY
#define KCP_INTERVAL KCP_DEFAULT_INTERVAL_MS
#define KCP_RESEND KCP_DEFAULT_RESEND
@@ -43,6 +51,29 @@ extern "C" {
typedef struct kcp_conn kcp_conn_t;
typedef struct kcp_listener kcp_listener_t;
typedef struct kcp_runtime_stats {
int connected;
uint32_t conv;
uint32_t rto_ms;
int32_t srtt_ms;
int32_t min_srtt_ms;
int32_t srttvar_ms;
uint32_t last_feedback_age_ms;
uint32_t snd_wnd;
uint32_t rmt_wnd;
uint32_t inflight;
uint32_t window_limit;
double window_pressure_pct;
uint32_t snd_queue;
uint32_t rcv_queue;
uint32_t snd_buffer;
uint64_t out_segs_total;
uint64_t retrans_total;
uint64_t fast_retrans_total;
uint64_t lost_total;
uint64_t repeat_total;
uint32_t xmit_total;
} kcp_runtime_stats_t;
typedef struct kcp_conn_options {
int nodelay;
int interval_ms;
@@ -56,6 +87,7 @@ typedef struct kcp_conn_options {
void kcp_conn_options_init(kcp_conn_options_t *options);
void kcp_conn_options_set_control_defaults(kcp_conn_options_t *options);
void kcp_conn_options_set_video_defaults(kcp_conn_options_t *options);
void kcp_conn_options_set_telemetry_defaults(kcp_conn_options_t *options);
kcp_conn_t *kcp_conn_dial_with_options(const char *server_addr, const char *bind_ip, const char *bind_device, const kcp_conn_options_t *options, kcp_packet_debug_logger_t *packet_logger, latency_logger_t *logger, const char *node_role, const char *node_id, kcp_session_stats_logger_t *stats_logger, int stats_interval_ms);
kcp_conn_t *kcp_conn_dial(const char *server_addr, const char *bind_ip, const char *bind_device, kcp_packet_debug_logger_t *packet_logger, latency_logger_t *logger, const char *node_role, const char *node_id, kcp_session_stats_logger_t *stats_logger, int stats_interval_ms);
@@ -68,6 +100,8 @@ int kcp_conn_close(kcp_conn_t *conn);
void kcp_conn_free(kcp_conn_t *conn);
uint32_t kcp_conn_conv(const kcp_conn_t *conn);
int kcp_conn_local_addr(const kcp_conn_t *conn, struct sockaddr_storage *addr, socklen_t *addr_len);
int kcp_conn_remote_addr(const kcp_conn_t *conn, struct sockaddr_storage *addr, socklen_t *addr_len);
void kcp_conn_runtime_stats_snapshot(kcp_conn_t *conn, kcp_runtime_stats_t *out_stats);
kcp_listener_t *kcp_listener_listen(const char *listen_addr, const char *bind_device, kcp_packet_debug_logger_t *packet_logger, const char *node_role, const char *node_id);
kcp_conn_t *kcp_listener_accept(kcp_listener_t *listener);

96
include/video_pipeline.h Normal file
View File

@@ -0,0 +1,96 @@
#ifndef OMNI_VIDEO_PIPELINE_H
#define OMNI_VIDEO_PIPELINE_H
#include <pthread.h>
#include <signal.h>
#include <stdint.h>
#include "gps_buffer.h"
#include "omni_common.h"
#include "peer_kcp_client.h"
#ifdef __cplusplus
extern "C" {
#endif
#if defined(__GNUC__)
typedef struct __attribute__((packed)) video_pipeline_packet_metadata {
#else
typedef struct video_pipeline_packet_metadata {
#endif
uint64_t timestamp_ms;
double latitude;
double longitude;
uint32_t capture_to_send_ms;
} video_pipeline_packet_metadata_t;
typedef struct video_stage_logger {
omni_file_logger_t file_logger;
int enabled;
uint64_t sample_mod;
} video_stage_logger_t;
typedef void (*video_pipeline_progress_fn)(void *context);
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
_Static_assert(sizeof(video_pipeline_packet_metadata_t) == 28, "video trailer metadata must be 28 bytes");
#endif
typedef struct video_pipeline_config {
const char *camera_device;
const char *server_addr;
const char *relay_via;
const char *bind_ip;
const char *bind_device;
const char *peer_id;
const char *target_peer;
int capture_width;
int capture_height;
int output_width;
int output_height;
int max_frames;
int enable_timing_logs;
int soft_backpressure_segments;
int hard_backpressure_segments;
int hard_backpressure_hold_ms;
int frame_stall_reconnect_ms;
kcp_session_stats_logger_t *stats_logger;
video_stage_logger_t *stage_logger;
int stats_interval_ms;
video_pipeline_progress_fn progress_callback;
void *progress_context;
} video_pipeline_config_t;
typedef struct video_pipeline_stats {
pthread_mutex_t mutex;
uint64_t frames_sent;
uint64_t bytes_sent;
uint64_t send_errors;
uint64_t backpressure_drops;
uint64_t backlog_resets;
uint64_t last_frame_bytes;
uint32_t last_backlog_segments;
uint32_t last_capture_to_send_ms;
double avg_capture_to_send_ms;
int connected;
char last_error[256];
char last_backlog_reason[128];
kcp_runtime_stats_t transport;
} video_pipeline_stats_t;
#define VIDEO_PIPELINE_RUN_RETRY_IMMEDIATE 2
void video_pipeline_config_init(video_pipeline_config_t *config);
void video_pipeline_config_load_env(video_pipeline_config_t *config);
int video_pipeline_stats_init(video_pipeline_stats_t *stats);
void video_pipeline_stats_destroy(video_pipeline_stats_t *stats);
void video_pipeline_stats_snapshot(video_pipeline_stats_t *stats, video_pipeline_stats_t *out_stats);
video_stage_logger_t *video_stage_logger_open_jsonl(const char *path, uint64_t sample_mod);
void video_stage_logger_close(video_stage_logger_t *logger);
int video_pipeline_run(const video_pipeline_config_t *config, video_pipeline_stats_t *stats, volatile sig_atomic_t *stop_requested);
#ifdef __cplusplus
}
#endif
#endif

View File

@@ -6,6 +6,7 @@ try:
MSG_TYPE_REGISTER,
MSG_TYPE_TEXT,
Session,
UdpSession,
)
except ImportError as exc:
raise ImportError(
@@ -32,8 +33,19 @@ VIDEO_DEFAULTS = {
"mtu": 1400,
}
TELEMETRY_DEFAULTS = {
"nodelay": 0,
"interval_ms": 50,
"resend": 0,
"nc": 0,
"sndwnd": 64,
"rcvwnd": 64,
"mtu": 1400,
}
__all__ = [
"CONTROL_DEFAULTS",
"TELEMETRY_DEFAULTS",
"VIDEO_DEFAULTS",
"MSG_TYPE_BINARY",
"MSG_TYPE_ERROR",
@@ -41,4 +53,5 @@ __all__ = [
"MSG_TYPE_REGISTER",
"MSG_TYPE_TEXT",
"Session",
"UdpSession",
]

View File

@@ -8,6 +8,11 @@ typedef struct PyOmniSession {
omnisocket_session_t session;
} PyOmniSession;
typedef struct PyOmniUdpSession {
PyObject_HEAD
omnisocket_udp_session_t session;
} PyOmniUdpSession;
PyDoc_STRVAR(
PyOmniSession_recv_doc,
"recv(timeout_ms=-1) -> (from_peer, msg_type, payload) | None"
@@ -22,6 +27,121 @@ PyDoc_STRVAR(
"current frame has already been consumed and is lost."
);
static PyObject *build_recv_result(const message_t *msg) {
PyObject *body = NULL;
PyObject *result = NULL;
body = PyBytes_FromStringAndSize((const char *) msg->body, (Py_ssize_t) msg->body_len);
if (body == NULL) {
return NULL;
}
result = Py_BuildValue("(siO)", msg->from, (int) msg->type, body);
Py_DECREF(body);
return result;
}
static PyObject *build_recv_meta_dict(
const char *from_peer,
const char *to_peer,
const char *file_name,
int msg_type,
unsigned long long message_id,
unsigned long long body_len
) {
return Py_BuildValue(
"{s:s,s:s,s:s,s:i,s:K,s:K}",
"from",
from_peer,
"to",
to_peer,
"file_name",
file_name,
"msg_type",
msg_type,
"message_id",
message_id,
"body_len",
body_len
);
}
static PyObject *build_stats_dict(const omnisocket_session_stats_t *stats) {
return Py_BuildValue(
"{s:K,s:K,s:K,s:K,s:K,s:K,s:K,s:i,s:i,s:s}",
"send_calls",
(unsigned long long) stats->send_calls,
"send_bytes",
(unsigned long long) stats->send_bytes,
"send_errors",
(unsigned long long) stats->send_errors,
"recv_calls",
(unsigned long long) stats->recv_calls,
"recv_bytes",
(unsigned long long) stats->recv_bytes,
"recv_timeouts",
(unsigned long long) stats->recv_timeouts,
"recv_errors",
(unsigned long long) stats->recv_errors,
"connected",
stats->connected,
"registered",
stats->registered,
"last_server_error",
stats->last_server_error
);
}
static PyObject *build_kcp_stats_dict(const omnisocket_session_kcp_stats_t *stats) {
PyObject *dict = PyDict_New();
PyObject *value = NULL;
if (dict == NULL) {
return NULL;
}
#define SET_KCP_STAT(key, expr) \
do { \
value = (expr); \
if (value == NULL) { \
Py_DECREF(dict); \
return NULL; \
} \
if (PyDict_SetItemString(dict, (key), value) != 0) { \
Py_DECREF(value); \
Py_DECREF(dict); \
return NULL; \
} \
Py_DECREF(value); \
value = NULL; \
} while (0)
SET_KCP_STAT("connected", PyLong_FromLong(stats->connected));
SET_KCP_STAT("conv", PyLong_FromUnsignedLong(stats->conv));
SET_KCP_STAT("rto_ms", PyLong_FromUnsignedLong(stats->rto_ms));
SET_KCP_STAT("srtt_ms", PyLong_FromLong(stats->srtt_ms));
SET_KCP_STAT("min_srtt_ms", PyLong_FromLong(stats->min_srtt_ms));
SET_KCP_STAT("srttvar_ms", PyLong_FromLong(stats->srttvar_ms));
SET_KCP_STAT("last_feedback_age_ms", PyLong_FromUnsignedLong(stats->last_feedback_age_ms));
SET_KCP_STAT("snd_wnd", PyLong_FromUnsignedLong(stats->snd_wnd));
SET_KCP_STAT("rmt_wnd", PyLong_FromUnsignedLong(stats->rmt_wnd));
SET_KCP_STAT("inflight", PyLong_FromUnsignedLong(stats->inflight));
SET_KCP_STAT("window_limit", PyLong_FromUnsignedLong(stats->window_limit));
SET_KCP_STAT("window_pressure_pct", PyFloat_FromDouble(stats->window_pressure_pct));
SET_KCP_STAT("snd_queue", PyLong_FromUnsignedLong(stats->snd_queue));
SET_KCP_STAT("rcv_queue", PyLong_FromUnsignedLong(stats->rcv_queue));
SET_KCP_STAT("snd_buffer", PyLong_FromUnsignedLong(stats->snd_buffer));
SET_KCP_STAT("out_segs_total", PyLong_FromUnsignedLongLong((unsigned long long) stats->out_segs_total));
SET_KCP_STAT("retrans_total", PyLong_FromUnsignedLongLong((unsigned long long) stats->retrans_total));
SET_KCP_STAT("fast_retrans_total", PyLong_FromUnsignedLongLong((unsigned long long) stats->fast_retrans_total));
SET_KCP_STAT("lost_total", PyLong_FromUnsignedLongLong((unsigned long long) stats->lost_total));
SET_KCP_STAT("repeat_total", PyLong_FromUnsignedLongLong((unsigned long long) stats->repeat_total));
SET_KCP_STAT("xmit_total", PyLong_FromUnsignedLong(stats->xmit_total));
#undef SET_KCP_STAT
return dict;
}
static PyObject *PyOmniSession_new(PyTypeObject *type, PyObject *args, PyObject *kwargs) {
PyOmniSession *self;
(void) args;
@@ -161,11 +281,33 @@ static PyObject *PyOmniSession_send(PyOmniSession *self, PyObject *args, PyObjec
Py_RETURN_NONE;
}
static PyObject *PyOmniSession_send_with_id(PyOmniSession *self, PyObject *args, PyObject *kwargs) {
const char *to;
Py_buffer payload;
int rc;
uint64_t message_id = 0;
static char *kwlist[] = {"to", "data", NULL};
memset(&payload, 0, sizeof(payload));
if (!PyArg_ParseTupleAndKeywords(args, kwargs, "sy*", kwlist, &to, &payload)) {
return NULL;
}
Py_BEGIN_ALLOW_THREADS
rc = omnisocket_session_send_with_id(&self->session, to, payload.buf, (size_t) payload.len, &message_id);
Py_END_ALLOW_THREADS
PyBuffer_Release(&payload);
if (rc != 0) {
return PyErr_SetFromErrno(PyExc_OSError);
}
return PyLong_FromUnsignedLongLong((unsigned long long) message_id);
}
static PyObject *PyOmniSession_recv(PyOmniSession *self, PyObject *args, PyObject *kwargs) {
int timeout_ms = -1;
int rc;
message_t msg;
PyObject *body = NULL;
PyObject *result = NULL;
static char *kwlist[] = {"timeout_ms", NULL};
@@ -187,13 +329,7 @@ static PyObject *PyOmniSession_recv(PyOmniSession *self, PyObject *args, PyObjec
return PyErr_SetFromErrno(PyExc_OSError);
}
body = PyBytes_FromStringAndSize((const char *) msg.body, (Py_ssize_t) msg.body_len);
if (body == NULL) {
protocol_message_clear(&msg);
return NULL;
}
result = Py_BuildValue("(siO)", msg.from, (int) msg.type, body);
Py_DECREF(body);
result = build_recv_result(&msg);
protocol_message_clear(&msg);
return result;
}
@@ -237,19 +373,12 @@ static PyObject *PyOmniSession_recv_into(PyOmniSession *self, PyObject *args, Py
return PyErr_SetFromErrno(PyExc_OSError);
}
result = Py_BuildValue(
"{s:s,s:s,s:s,s:i,s:K,s:K}",
"from",
result = build_recv_meta_dict(
meta.from,
"to",
meta.to,
"file_name",
meta.file_name,
"msg_type",
(int) meta.type,
"message_id",
(unsigned long long) meta.id,
"body_len",
(unsigned long long) meta.body_len
);
return result;
@@ -260,34 +389,26 @@ static PyObject *PyOmniSession_stats(PyOmniSession *self, PyObject *Py_UNUSED(ig
memset(&stats, 0, sizeof(stats));
omnisocket_session_stats_snapshot(&self->session, &stats);
return Py_BuildValue(
"{s:K,s:K,s:K,s:K,s:K,s:K,s:K,s:i}",
"send_calls",
(unsigned long long) stats.send_calls,
"send_bytes",
(unsigned long long) stats.send_bytes,
"send_errors",
(unsigned long long) stats.send_errors,
"recv_calls",
(unsigned long long) stats.recv_calls,
"recv_bytes",
(unsigned long long) stats.recv_bytes,
"recv_timeouts",
(unsigned long long) stats.recv_timeouts,
"recv_errors",
(unsigned long long) stats.recv_errors,
"connected",
stats.connected
);
return build_stats_dict(&stats);
}
static PyObject *PyOmniSession_kcp_stats(PyOmniSession *self, PyObject *Py_UNUSED(ignored)) {
omnisocket_session_kcp_stats_t stats;
memset(&stats, 0, sizeof(stats));
omnisocket_session_kcp_stats_snapshot(&self->session, &stats);
return build_kcp_stats_dict(&stats);
}
static PyMethodDef PyOmniSession_methods[] = {
{"connect", (PyCFunction) PyOmniSession_connect, METH_VARARGS | METH_KEYWORDS, NULL},
{"close", (PyCFunction) PyOmniSession_close, METH_NOARGS, NULL},
{"send", (PyCFunction) PyOmniSession_send, METH_VARARGS | METH_KEYWORDS, NULL},
{"send_with_id", (PyCFunction) PyOmniSession_send_with_id, METH_VARARGS | METH_KEYWORDS, NULL},
{"recv", (PyCFunction) PyOmniSession_recv, METH_VARARGS | METH_KEYWORDS, PyOmniSession_recv_doc},
{"recv_into", (PyCFunction) PyOmniSession_recv_into, METH_VARARGS | METH_KEYWORDS, PyOmniSession_recv_into_doc},
{"stats", (PyCFunction) PyOmniSession_stats, METH_NOARGS, NULL},
{"kcp_stats", (PyCFunction) PyOmniSession_kcp_stats, METH_NOARGS, NULL},
{NULL, NULL, 0, NULL}
};
@@ -295,6 +416,211 @@ static PyTypeObject PyOmniSessionType = {
PyVarObject_HEAD_INIT(NULL, 0)
};
static PyObject *PyOmniUdpSession_new(PyTypeObject *type, PyObject *args, PyObject *kwargs) {
PyOmniUdpSession *self;
(void) args;
(void) kwargs;
self = (PyOmniUdpSession *) type->tp_alloc(type, 0);
if (self == NULL) {
return NULL;
}
if (omnisocket_udp_session_init(&self->session) != 0) {
type->tp_free((PyObject *) self);
return PyErr_SetFromErrno(PyExc_OSError);
}
return (PyObject *) self;
}
static void PyOmniUdpSession_dealloc(PyOmniUdpSession *self) {
omnisocket_udp_session_destroy(&self->session);
Py_TYPE(self)->tp_free((PyObject *) self);
}
static PyObject *PyOmniUdpSession_connect(PyOmniUdpSession *self, PyObject *args, PyObject *kwargs) {
const char *server_addr;
const char *peer_id;
const char *bind_ip = "";
const char *bind_device = "";
int enable_timestamping = 0;
int rc;
static char *kwlist[] = {
"server_addr",
"peer_id",
"bind_ip",
"bind_device",
"enable_timestamping",
NULL
};
if (!PyArg_ParseTupleAndKeywords(
args,
kwargs,
"ss|ssi",
kwlist,
&server_addr,
&peer_id,
&bind_ip,
&bind_device,
&enable_timestamping)) {
return NULL;
}
Py_BEGIN_ALLOW_THREADS
rc = omnisocket_udp_session_connect(
&self->session,
server_addr,
peer_id,
bind_ip,
bind_device,
enable_timestamping
);
Py_END_ALLOW_THREADS
if (rc != 0) {
return PyErr_SetFromErrno(PyExc_OSError);
}
Py_RETURN_NONE;
}
static PyObject *PyOmniUdpSession_close(PyOmniUdpSession *self, PyObject *Py_UNUSED(ignored)) {
int rc;
Py_BEGIN_ALLOW_THREADS
rc = omnisocket_udp_session_close(&self->session);
Py_END_ALLOW_THREADS
if (rc != 0) {
return PyErr_SetFromErrno(PyExc_OSError);
}
Py_RETURN_NONE;
}
static PyObject *PyOmniUdpSession_send(PyOmniUdpSession *self, PyObject *args, PyObject *kwargs) {
const char *to;
Py_buffer payload;
int rc;
static char *kwlist[] = {"to", "data", NULL};
memset(&payload, 0, sizeof(payload));
if (!PyArg_ParseTupleAndKeywords(args, kwargs, "sy*", kwlist, &to, &payload)) {
return NULL;
}
Py_BEGIN_ALLOW_THREADS
rc = omnisocket_udp_session_send(&self->session, to, payload.buf, (size_t) payload.len);
Py_END_ALLOW_THREADS
PyBuffer_Release(&payload);
if (rc != 0) {
return PyErr_SetFromErrno(PyExc_OSError);
}
Py_RETURN_NONE;
}
static PyObject *PyOmniUdpSession_recv(PyOmniUdpSession *self, PyObject *args, PyObject *kwargs) {
int timeout_ms = -1;
int rc;
message_t msg;
PyObject *result = NULL;
static char *kwlist[] = {"timeout_ms", NULL};
if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|i", kwlist, &timeout_ms)) {
return NULL;
}
protocol_message_init(&msg);
Py_BEGIN_ALLOW_THREADS
rc = omnisocket_udp_session_recv(&self->session, &msg, timeout_ms);
Py_END_ALLOW_THREADS
if (rc == 1) {
protocol_message_clear(&msg);
Py_RETURN_NONE;
}
if (rc != 0) {
protocol_message_clear(&msg);
return PyErr_SetFromErrno(PyExc_OSError);
}
result = build_recv_result(&msg);
protocol_message_clear(&msg);
return result;
}
static PyObject *PyOmniUdpSession_recv_into(PyOmniUdpSession *self, PyObject *args, PyObject *kwargs) {
PyObject *buffer_obj;
Py_buffer view;
int timeout_ms = -1;
int rc;
udp_client_recv_meta_t meta;
PyObject *result = NULL;
static char *kwlist[] = {"buffer", "timeout_ms", NULL};
memset(&view, 0, sizeof(view));
memset(&meta, 0, sizeof(meta));
if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|i", kwlist, &buffer_obj, &timeout_ms)) {
return NULL;
}
if (PyObject_GetBuffer(buffer_obj, &view, PyBUF_WRITABLE) != 0) {
return NULL;
}
Py_BEGIN_ALLOW_THREADS
rc = omnisocket_udp_session_recv_into(&self->session, view.buf, (size_t) view.len, &meta, timeout_ms);
Py_END_ALLOW_THREADS
PyBuffer_Release(&view);
if (rc == 1) {
Py_RETURN_NONE;
}
if (rc == 2) {
PyErr_Format(
PyExc_BufferError,
"buffer too small: need %zu bytes; current frame was already consumed and dropped",
meta.body_len
);
return NULL;
}
if (rc != 0) {
return PyErr_SetFromErrno(PyExc_OSError);
}
result = build_recv_meta_dict(
meta.from,
meta.to,
meta.file_name,
(int) meta.type,
(unsigned long long) meta.id,
(unsigned long long) meta.body_len
);
return result;
}
static PyObject *PyOmniUdpSession_stats(PyOmniUdpSession *self, PyObject *Py_UNUSED(ignored)) {
omnisocket_session_stats_t stats;
memset(&stats, 0, sizeof(stats));
omnisocket_udp_session_stats_snapshot(&self->session, &stats);
return build_stats_dict(&stats);
}
static PyMethodDef PyOmniUdpSession_methods[] = {
{"connect", (PyCFunction) PyOmniUdpSession_connect, METH_VARARGS | METH_KEYWORDS, NULL},
{"close", (PyCFunction) PyOmniUdpSession_close, METH_NOARGS, NULL},
{"send", (PyCFunction) PyOmniUdpSession_send, METH_VARARGS | METH_KEYWORDS, NULL},
{"recv", (PyCFunction) PyOmniUdpSession_recv, METH_VARARGS | METH_KEYWORDS, PyOmniSession_recv_doc},
{"recv_into", (PyCFunction) PyOmniUdpSession_recv_into, METH_VARARGS | METH_KEYWORDS, PyOmniSession_recv_into_doc},
{"stats", (PyCFunction) PyOmniUdpSession_stats, METH_NOARGS, NULL},
{NULL, NULL, 0, NULL}
};
static PyTypeObject PyOmniUdpSessionType = {
PyVarObject_HEAD_INIT(NULL, 0)
};
static PyModuleDef omnisocket_module = {
PyModuleDef_HEAD_INIT,
.m_name = "_omnisocket",
@@ -315,6 +641,17 @@ PyMODINIT_FUNC PyInit__omnisocket(void) {
return NULL;
}
PyOmniUdpSessionType.tp_name = "omnisocket.UdpSession";
PyOmniUdpSessionType.tp_basicsize = sizeof(PyOmniUdpSession);
PyOmniUdpSessionType.tp_flags = Py_TPFLAGS_DEFAULT;
PyOmniUdpSessionType.tp_new = PyOmniUdpSession_new;
PyOmniUdpSessionType.tp_dealloc = (destructor) PyOmniUdpSession_dealloc;
PyOmniUdpSessionType.tp_methods = PyOmniUdpSession_methods;
if (PyType_Ready(&PyOmniUdpSessionType) < 0) {
return NULL;
}
module = PyModule_Create(&omnisocket_module);
if (module == NULL) {
return NULL;
@@ -327,6 +664,13 @@ PyMODINIT_FUNC PyInit__omnisocket(void) {
return NULL;
}
Py_INCREF(&PyOmniUdpSessionType);
if (PyModule_AddObject(module, "UdpSession", (PyObject *) &PyOmniUdpSessionType) != 0) {
Py_DECREF(&PyOmniUdpSessionType);
Py_DECREF(module);
return NULL;
}
if (PyModule_AddIntConstant(module, "MSG_TYPE_TEXT", MSG_TYPE_TEXT) != 0 ||
PyModule_AddIntConstant(module, "MSG_TYPE_FILE", MSG_TYPE_FILE) != 0 ||
PyModule_AddIntConstant(module, "MSG_TYPE_REGISTER", MSG_TYPE_REGISTER) != 0 ||

View File

@@ -1,5 +1,33 @@
#include "omnisocket_client.h"
static void omnisocket_session_sync_client_state_locked(omnisocket_session_t *session, kcp_client_t *client) {
kcp_client_state_t client_state;
if (session == NULL) {
return;
}
memset(&client_state, 0, sizeof(client_state));
if (client != NULL) {
kcp_client_state_snapshot(client, &client_state);
}
session->stats.connected = client_state.connected;
session->stats.registered = client_state.registered;
snprintf(
session->stats.last_server_error,
sizeof(session->stats.last_server_error),
"%s",
client_state.last_server_error
);
}
static void omnisocket_session_mark_disconnected_locked(omnisocket_session_t *session) {
if (session == NULL) {
return;
}
session->stats.connected = 0;
session->stats.registered = 0;
}
int omnisocket_session_init(omnisocket_session_t *session) {
int rc;
@@ -97,7 +125,7 @@ int omnisocket_session_connect(
return -1;
}
session->client = client;
session->stats.connected = 1;
omnisocket_session_sync_client_state_locked(session, client);
pthread_mutex_unlock(&session->mutex);
return 0;
}
@@ -119,7 +147,7 @@ int omnisocket_session_close(omnisocket_session_t *session) {
session->closing = 1;
session->client = NULL;
}
session->stats.connected = 0;
omnisocket_session_mark_disconnected_locked(session);
pthread_mutex_unlock(&session->mutex);
if (client != NULL) {
@@ -139,6 +167,16 @@ int omnisocket_session_close(omnisocket_session_t *session) {
}
int omnisocket_session_send(omnisocket_session_t *session, const char *to, const void *data, size_t data_len) {
return omnisocket_session_send_with_id(session, to, data, data_len, NULL);
}
int omnisocket_session_send_with_id(
omnisocket_session_t *session,
const char *to,
const void *data,
size_t data_len,
uint64_t *out_message_id
) {
kcp_client_t *client;
int rc;
@@ -150,7 +188,7 @@ int omnisocket_session_send(omnisocket_session_t *session, const char *to, const
if (omnisocket_session_begin_client_op(session, &client) != 0) {
return -1;
}
rc = kcp_client_send_binary(client, to, data, data_len);
rc = kcp_client_send_binary_with_id(client, to, data, data_len, out_message_id);
pthread_mutex_lock(&session->mutex);
if (rc == 0) {
session->stats.send_calls += 1;
@@ -158,6 +196,7 @@ int omnisocket_session_send(omnisocket_session_t *session, const char *to, const
} else {
session->stats.send_errors += 1;
}
omnisocket_session_sync_client_state_locked(session, client);
if (session->active_ops > 0) {
session->active_ops -= 1;
}
@@ -190,6 +229,7 @@ int omnisocket_session_recv(omnisocket_session_t *session, message_t *out_msg, i
} else {
session->stats.recv_errors += 1;
}
omnisocket_session_sync_client_state_locked(session, client);
if (session->active_ops > 0) {
session->active_ops -= 1;
}
@@ -228,6 +268,7 @@ int omnisocket_session_recv_into(
} else {
session->stats.recv_errors += 1;
}
omnisocket_session_sync_client_state_locked(session, client);
if (session->active_ops > 0) {
session->active_ops -= 1;
}
@@ -246,3 +287,286 @@ void omnisocket_session_stats_snapshot(omnisocket_session_t *session, omnisocket
*out_stats = session->stats;
pthread_mutex_unlock(&session->mutex);
}
void omnisocket_session_kcp_stats_snapshot(omnisocket_session_t *session, omnisocket_session_kcp_stats_t *out_stats) {
kcp_runtime_stats_t runtime_stats;
if (session == NULL || out_stats == NULL) {
return;
}
memset(&runtime_stats, 0, sizeof(runtime_stats));
pthread_mutex_lock(&session->mutex);
if (session->client != NULL) {
kcp_client_runtime_stats_snapshot(session->client, &runtime_stats);
}
pthread_mutex_unlock(&session->mutex);
memset(out_stats, 0, sizeof(*out_stats));
out_stats->connected = runtime_stats.connected;
out_stats->conv = runtime_stats.conv;
out_stats->rto_ms = runtime_stats.rto_ms;
out_stats->srtt_ms = runtime_stats.srtt_ms;
out_stats->min_srtt_ms = runtime_stats.min_srtt_ms;
out_stats->srttvar_ms = runtime_stats.srttvar_ms;
out_stats->last_feedback_age_ms = runtime_stats.last_feedback_age_ms;
out_stats->snd_wnd = runtime_stats.snd_wnd;
out_stats->rmt_wnd = runtime_stats.rmt_wnd;
out_stats->inflight = runtime_stats.inflight;
out_stats->window_limit = runtime_stats.window_limit;
out_stats->window_pressure_pct = runtime_stats.window_pressure_pct;
out_stats->snd_queue = runtime_stats.snd_queue;
out_stats->rcv_queue = runtime_stats.rcv_queue;
out_stats->snd_buffer = runtime_stats.snd_buffer;
out_stats->out_segs_total = runtime_stats.out_segs_total;
out_stats->retrans_total = runtime_stats.retrans_total;
out_stats->fast_retrans_total = runtime_stats.fast_retrans_total;
out_stats->lost_total = runtime_stats.lost_total;
out_stats->repeat_total = runtime_stats.repeat_total;
out_stats->xmit_total = runtime_stats.xmit_total;
}
int omnisocket_udp_session_init(omnisocket_udp_session_t *session) {
int rc;
if (session == NULL) {
errno = EINVAL;
return -1;
}
memset(session, 0, sizeof(*session));
rc = pthread_mutex_init(&session->mutex, NULL);
if (rc != 0) {
errno = rc;
return -1;
}
rc = pthread_cond_init(&session->idle_cond, NULL);
if (rc != 0) {
pthread_mutex_destroy(&session->mutex);
errno = rc;
return -1;
}
return 0;
}
void omnisocket_udp_session_destroy(omnisocket_udp_session_t *session) {
if (session == NULL) {
return;
}
(void) omnisocket_udp_session_close(session);
pthread_cond_destroy(&session->idle_cond);
pthread_mutex_destroy(&session->mutex);
}
static int omnisocket_udp_session_begin_client_op(omnisocket_udp_session_t *session, udp_client_t **out_client) {
if (session == NULL || out_client == NULL) {
errno = EINVAL;
return -1;
}
pthread_mutex_lock(&session->mutex);
if (session->closing) {
pthread_mutex_unlock(&session->mutex);
errno = ECANCELED;
return -1;
}
if (session->client == NULL) {
pthread_mutex_unlock(&session->mutex);
errno = ENOTCONN;
return -1;
}
*out_client = session->client;
session->active_ops += 1;
pthread_mutex_unlock(&session->mutex);
return 0;
}
int omnisocket_udp_session_connect(
omnisocket_udp_session_t *session,
const char *server_addr,
const char *peer_id,
const char *bind_ip,
const char *bind_device,
int enable_timestamping
) {
udp_client_t *client;
if (session == NULL || server_addr == NULL || peer_id == NULL) {
errno = EINVAL;
return -1;
}
pthread_mutex_lock(&session->mutex);
while (session->closing) {
pthread_cond_wait(&session->idle_cond, &session->mutex);
}
if (session->client != NULL) {
pthread_mutex_unlock(&session->mutex);
errno = EISCONN;
return -1;
}
client = udp_client_dial_with_options(
server_addr,
peer_id,
bind_ip,
bind_device,
NULL,
NULL,
enable_timestamping
);
if (client == NULL) {
pthread_mutex_unlock(&session->mutex);
return -1;
}
session->client = client;
session->stats.connected = 1;
session->stats.registered = 1;
session->stats.last_server_error[0] = '\0';
pthread_mutex_unlock(&session->mutex);
return 0;
}
int omnisocket_udp_session_close(omnisocket_udp_session_t *session) {
udp_client_t *client;
if (session == NULL) {
errno = EINVAL;
return -1;
}
pthread_mutex_lock(&session->mutex);
while (session->closing) {
pthread_cond_wait(&session->idle_cond, &session->mutex);
}
client = session->client;
if (client != NULL) {
session->closing = 1;
session->client = NULL;
}
session->stats.connected = 0;
session->stats.registered = 0;
pthread_mutex_unlock(&session->mutex);
if (client != NULL) {
udp_client_close(client);
pthread_mutex_lock(&session->mutex);
while (session->active_ops > 0) {
pthread_cond_wait(&session->idle_cond, &session->mutex);
}
pthread_mutex_unlock(&session->mutex);
udp_client_free(client);
pthread_mutex_lock(&session->mutex);
session->closing = 0;
pthread_cond_broadcast(&session->idle_cond);
pthread_mutex_unlock(&session->mutex);
}
return 0;
}
int omnisocket_udp_session_send(omnisocket_udp_session_t *session, const char *to, const void *data, size_t data_len) {
udp_client_t *client;
int rc;
if (session == NULL || to == NULL || (data == NULL && data_len > 0)) {
errno = EINVAL;
return -1;
}
if (omnisocket_udp_session_begin_client_op(session, &client) != 0) {
return -1;
}
rc = udp_client_send_binary(client, to, data, data_len);
pthread_mutex_lock(&session->mutex);
if (rc == 0) {
session->stats.send_calls += 1;
session->stats.send_bytes += (uint64_t) data_len;
} else {
session->stats.send_errors += 1;
}
if (session->active_ops > 0) {
session->active_ops -= 1;
}
if (session->closing && session->active_ops == 0) {
pthread_cond_broadcast(&session->idle_cond);
}
pthread_mutex_unlock(&session->mutex);
return rc;
}
int omnisocket_udp_session_recv(omnisocket_udp_session_t *session, message_t *out_msg, int timeout_ms) {
udp_client_t *client;
int rc;
if (session == NULL || out_msg == NULL) {
errno = EINVAL;
return -1;
}
if (omnisocket_udp_session_begin_client_op(session, &client) != 0) {
return -1;
}
rc = udp_client_receive_timed(client, out_msg, timeout_ms);
pthread_mutex_lock(&session->mutex);
if (rc == 0) {
session->stats.recv_calls += 1;
session->stats.recv_bytes += (uint64_t) out_msg->body_len;
} else if (rc == 1) {
session->stats.recv_timeouts += 1;
} else {
session->stats.recv_errors += 1;
}
if (session->active_ops > 0) {
session->active_ops -= 1;
}
if (session->closing && session->active_ops == 0) {
pthread_cond_broadcast(&session->idle_cond);
}
pthread_mutex_unlock(&session->mutex);
return rc;
}
int omnisocket_udp_session_recv_into(
omnisocket_udp_session_t *session,
void *buffer,
size_t buffer_len,
udp_client_recv_meta_t *out_meta,
int timeout_ms
) {
udp_client_t *client;
int rc;
if (session == NULL || out_meta == NULL || (buffer == NULL && buffer_len > 0)) {
errno = EINVAL;
return -1;
}
if (omnisocket_udp_session_begin_client_op(session, &client) != 0) {
return -1;
}
rc = udp_client_receive_into(client, buffer, buffer_len, out_meta, timeout_ms);
pthread_mutex_lock(&session->mutex);
if (rc == 0) {
session->stats.recv_calls += 1;
session->stats.recv_bytes += (uint64_t) out_meta->body_len;
} else if (rc == 1) {
session->stats.recv_timeouts += 1;
} else {
session->stats.recv_errors += 1;
}
if (session->active_ops > 0) {
session->active_ops -= 1;
}
if (session->closing && session->active_ops == 0) {
pthread_cond_broadcast(&session->idle_cond);
}
pthread_mutex_unlock(&session->mutex);
return rc;
}
void omnisocket_udp_session_stats_snapshot(omnisocket_udp_session_t *session, omnisocket_session_stats_t *out_stats) {
if (session == NULL || out_stats == NULL) {
return;
}
pthread_mutex_lock(&session->mutex);
*out_stats = session->stats;
pthread_mutex_unlock(&session->mutex);
}

View File

@@ -2,6 +2,7 @@
#define OMNISOCKET_PY_CLIENT_H
#include "peer_kcp_client.h"
#include "peer_udp_client.h"
typedef struct omnisocket_session_stats {
uint64_t send_calls;
@@ -12,8 +13,34 @@ typedef struct omnisocket_session_stats {
uint64_t recv_timeouts;
uint64_t recv_errors;
int connected;
int registered;
char last_server_error[256];
} omnisocket_session_stats_t;
typedef struct omnisocket_session_kcp_stats {
int connected;
uint32_t conv;
uint32_t rto_ms;
int32_t srtt_ms;
int32_t min_srtt_ms;
int32_t srttvar_ms;
uint32_t last_feedback_age_ms;
uint32_t snd_wnd;
uint32_t rmt_wnd;
uint32_t inflight;
uint32_t window_limit;
double window_pressure_pct;
uint32_t snd_queue;
uint32_t rcv_queue;
uint32_t snd_buffer;
uint64_t out_segs_total;
uint64_t retrans_total;
uint64_t fast_retrans_total;
uint64_t lost_total;
uint64_t repeat_total;
uint32_t xmit_total;
} omnisocket_session_kcp_stats_t;
typedef struct omnisocket_session {
pthread_mutex_t mutex;
pthread_cond_t idle_cond;
@@ -23,6 +50,15 @@ typedef struct omnisocket_session {
omnisocket_session_stats_t stats;
} omnisocket_session_t;
typedef struct omnisocket_udp_session {
pthread_mutex_t mutex;
pthread_cond_t idle_cond;
udp_client_t *client;
size_t active_ops;
int closing;
omnisocket_session_stats_t stats;
} omnisocket_udp_session_t;
int omnisocket_session_init(omnisocket_session_t *session);
void omnisocket_session_destroy(omnisocket_session_t *session);
@@ -38,6 +74,13 @@ int omnisocket_session_connect(
);
int omnisocket_session_close(omnisocket_session_t *session);
int omnisocket_session_send(omnisocket_session_t *session, const char *to, const void *data, size_t data_len);
int omnisocket_session_send_with_id(
omnisocket_session_t *session,
const char *to,
const void *data,
size_t data_len,
uint64_t *out_message_id
);
int omnisocket_session_recv(omnisocket_session_t *session, message_t *out_msg, int timeout_ms);
int omnisocket_session_recv_into(
omnisocket_session_t *session,
@@ -47,5 +90,29 @@ int omnisocket_session_recv_into(
int timeout_ms
);
void omnisocket_session_stats_snapshot(omnisocket_session_t *session, omnisocket_session_stats_t *out_stats);
void omnisocket_session_kcp_stats_snapshot(omnisocket_session_t *session, omnisocket_session_kcp_stats_t *out_stats);
int omnisocket_udp_session_init(omnisocket_udp_session_t *session);
void omnisocket_udp_session_destroy(omnisocket_udp_session_t *session);
int omnisocket_udp_session_connect(
omnisocket_udp_session_t *session,
const char *server_addr,
const char *peer_id,
const char *bind_ip,
const char *bind_device,
int enable_timestamping
);
int omnisocket_udp_session_close(omnisocket_udp_session_t *session);
int omnisocket_udp_session_send(omnisocket_udp_session_t *session, const char *to, const void *data, size_t data_len);
int omnisocket_udp_session_recv(omnisocket_udp_session_t *session, message_t *out_msg, int timeout_ms);
int omnisocket_udp_session_recv_into(
omnisocket_udp_session_t *session,
void *buffer,
size_t buffer_len,
udp_client_recv_meta_t *out_meta,
int timeout_ms
);
void omnisocket_udp_session_stats_snapshot(omnisocket_udp_session_t *session, omnisocket_session_stats_t *out_stats);
#endif

View File

@@ -0,0 +1,318 @@
from __future__ import annotations
from contextlib import contextmanager
from pathlib import Path
import socket
import subprocess
import sys
import threading
import time
import pytest
pytestmark = pytest.mark.skipif(sys.platform != 'linux', reason='Linux-only OmniSocket extension')
ROOT = Path(__file__).resolve().parents[2]
PYTHON_ROOT = ROOT / 'python'
if str(PYTHON_ROOT) not in sys.path:
sys.path.insert(0, str(PYTHON_ROOT))
omnisocket = pytest.importorskip('omnisocket')
CONTROL_DEFAULTS = omnisocket.CONTROL_DEFAULTS
MSG_TYPE_BINARY = omnisocket.MSG_TYPE_BINARY
Session = omnisocket.Session
UdpSession = omnisocket.UdpSession
def _reserve_port() -> int:
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
sock.bind(('127.0.0.1', 0))
return int(sock.getsockname()[1])
@contextmanager
def _run_server(binary_name: str, listen_addr: str):
binary = ROOT / 'bin' / binary_name
if not binary.exists():
pytest.skip(f'{binary} is not built')
process = subprocess.Popen(
[str(binary), '-listen', listen_addr],
cwd=str(ROOT),
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
)
try:
time.sleep(0.2)
yield process
finally:
process.terminate()
try:
process.wait(timeout=2.0)
except subprocess.TimeoutExpired:
process.kill()
process.wait(timeout=2.0)
@contextmanager
def _run_relay(listen_addr: str, remote_addr: str):
binary = ROOT / 'bin' / 'kcpserver'
if not binary.exists():
pytest.skip(f'{binary} is not built')
process = subprocess.Popen(
[str(binary), '-mode', 'relay', '-listen', listen_addr, '-relay-remote', remote_addr],
cwd=str(ROOT),
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
)
try:
time.sleep(0.2)
yield process
finally:
process.terminate()
try:
process.wait(timeout=2.0)
except subprocess.TimeoutExpired:
process.kill()
process.wait(timeout=2.0)
def _connect_with_retry(session_cls, *, transport: str, server_addr: str, peer_id: str, relay_via: str = ''):
deadline = time.monotonic() + 3.0
last_error: Exception | None = None
while time.monotonic() < deadline:
session = session_cls()
try:
kwargs: dict[str, object] = {
'server_addr': server_addr,
'peer_id': peer_id,
}
if transport == 'kcp':
kwargs.update(CONTROL_DEFAULTS)
if relay_via:
kwargs['relay_via'] = relay_via
else:
kwargs['enable_timestamping'] = False
session.connect(**kwargs)
return session
except OSError as exc:
last_error = exc
time.sleep(0.1)
raise AssertionError(f'failed to connect {peer_id} to {server_addr}: {last_error}')
@pytest.mark.parametrize(
('transport', 'binary_name', 'session_cls'),
[
('udp', 'udpserver', UdpSession),
('kcp', 'kcpserver', Session),
],
)
def test_control_sessions_smoke(transport: str, binary_name: str, session_cls) -> None:
port = _reserve_port()
listen_addr = f'127.0.0.1:{port}'
sender_id = f'pytest-{transport}-sender'
receiver_id = f'pytest-{transport}-receiver'
with _run_server(binary_name, listen_addr):
sender = _connect_with_retry(session_cls, transport=transport, server_addr=listen_addr, peer_id=sender_id)
receiver = _connect_with_retry(session_cls, transport=transport, server_addr=listen_addr, peer_id=receiver_id)
try:
assert receiver.recv(timeout_ms=20) is None
payload = b'control-packet-1'
sender.send(to=receiver_id, data=payload)
from_peer, msg_type, recv_payload = receiver.recv(timeout_ms=1000)
assert from_peer == sender_id
assert msg_type == MSG_TYPE_BINARY
assert recv_payload == payload
payload2 = b'control-packet-2'
sender.send(to=receiver_id, data=payload2)
recv_buffer = bytearray(128)
meta = receiver.recv_into(buffer=recv_buffer, timeout_ms=1000)
assert meta is not None
assert meta['from'] == sender_id
assert meta['msg_type'] == MSG_TYPE_BINARY
assert meta['body_len'] == len(payload2)
assert bytes(recv_buffer[: meta['body_len']]) == payload2
sender_stats = sender.stats()
receiver_stats = receiver.stats()
assert sender_stats['connected'] == 1
assert receiver_stats['connected'] == 1
assert sender_stats['registered'] == 1
assert receiver_stats['registered'] == 1
assert sender_stats['send_calls'] >= 2
assert receiver_stats['recv_calls'] >= 2
if transport == 'kcp':
sender_kcp_stats = sender.kcp_stats()
receiver_kcp_stats = receiver.kcp_stats()
assert sender_kcp_stats['connected'] == 1
assert receiver_kcp_stats['connected'] == 1
assert 'srtt_ms' in sender_kcp_stats
assert 'snd_queue' in receiver_kcp_stats
finally:
sender.close()
receiver.close()
def test_kcp_duplicate_peer_new_instance_wins() -> None:
port = _reserve_port()
listen_addr = f'127.0.0.1:{port}'
shared_peer_id = 'pytest-kcp-shared-peer'
sender_id = 'pytest-kcp-unique-sender'
with _run_server('kcpserver', listen_addr):
original = _connect_with_retry(Session, transport='kcp', server_addr=listen_addr, peer_id=shared_peer_id)
sender = _connect_with_retry(Session, transport='kcp', server_addr=listen_addr, peer_id=sender_id)
replacement = None
try:
replacement = _connect_with_retry(Session, transport='kcp', server_addr=listen_addr, peer_id=shared_peer_id)
replacement_stats = replacement.stats()
assert replacement_stats['connected'] == 1
assert replacement_stats['registered'] == 1
with pytest.raises(OSError):
original.recv(timeout_ms=1000)
payload = b'registered-replacement'
sender.send(to=shared_peer_id, data=payload)
from_peer, msg_type, recv_payload = replacement.recv(timeout_ms=1000)
assert from_peer == sender_id
assert msg_type == MSG_TYPE_BINARY
assert recv_payload == payload
finally:
original.close()
sender.close()
if replacement is not None:
replacement.close()
def test_kcp_idle_video_peers_survive_without_receive_loop() -> None:
port = _reserve_port()
listen_addr = f'127.0.0.1:{port}'
sender_id = 'peer-b-video'
receiver_id = 'pytest-kcp-video-idle-receiver'
with _run_server('kcpserver', listen_addr):
sender = _connect_with_retry(Session, transport='kcp', server_addr=listen_addr, peer_id=sender_id)
receiver = _connect_with_retry(Session, transport='kcp', server_addr=listen_addr, peer_id=receiver_id)
try:
time.sleep(5.0)
payload = b'idle-video-session-still-alive'
sender.send(to=receiver_id, data=payload)
from_peer, msg_type, recv_payload = receiver.recv(timeout_ms=1000)
assert from_peer == sender_id
assert msg_type == MSG_TYPE_BINARY
assert recv_payload == payload
finally:
sender.close()
receiver.close()
def test_kcp_peer_a_video_stale_receiver_is_evicted() -> None:
port = _reserve_port()
listen_addr = f'127.0.0.1:{port}'
receiver_id = 'peer-a-video'
with _run_server('kcpserver', listen_addr):
receiver = _connect_with_retry(Session, transport='kcp', server_addr=listen_addr, peer_id=receiver_id)
try:
time.sleep(5.0)
with pytest.raises(OSError):
receiver.recv(timeout_ms=1000)
finally:
receiver.close()
def test_kcp_relay_routes_multiple_sessions_by_conv() -> None:
hub_port = _reserve_port()
relay_port = _reserve_port()
hub_addr = f'127.0.0.1:{hub_port}'
relay_addr = f'127.0.0.1:{relay_port}'
with _run_server('kcpserver', hub_addr):
with _run_relay(relay_addr, hub_addr):
sender = _connect_with_retry(Session, transport='kcp', server_addr=hub_addr, peer_id='pytest-relay-sender', relay_via=relay_addr)
receiver = _connect_with_retry(Session, transport='kcp', server_addr=hub_addr, peer_id='pytest-relay-receiver', relay_via=relay_addr)
chatter = _connect_with_retry(Session, transport='kcp', server_addr=hub_addr, peer_id='pytest-relay-chatter', relay_via=relay_addr)
try:
chatter.send(to='pytest-relay-sender', data=b'chatter-primes-last-client')
from_peer, msg_type, recv_payload = sender.recv(timeout_ms=1000)
assert from_peer == 'pytest-relay-chatter'
assert msg_type == MSG_TYPE_BINARY
assert recv_payload == b'chatter-primes-last-client'
sender.send(to='pytest-relay-receiver', data=b'relay-video-frame')
from_peer, msg_type, recv_payload = receiver.recv(timeout_ms=1000)
assert from_peer == 'pytest-relay-sender'
assert msg_type == MSG_TYPE_BINARY
assert recv_payload == b'relay-video-frame'
finally:
sender.close()
receiver.close()
chatter.close()
def test_udp_session_close_interrupts_blocking_recv() -> None:
port = _reserve_port()
listen_addr = f'127.0.0.1:{port}'
receiver_id = 'pytest-udp-blocking-recv'
with _run_server('udpserver', listen_addr):
receiver = _connect_with_retry(
UdpSession,
transport='udp',
server_addr=listen_addr,
peer_id=receiver_id,
)
recv_error: list[BaseException] = []
close_error: list[BaseException] = []
recv_started = threading.Event()
recv_done = threading.Event()
close_done = threading.Event()
def recv_worker() -> None:
recv_started.set()
try:
receiver.recv()
except BaseException as exc: # pragma: no cover - assertion is on thread completion
recv_error.append(exc)
finally:
recv_done.set()
def close_worker() -> None:
try:
receiver.close()
except BaseException as exc: # pragma: no cover - assertion is on thread completion
close_error.append(exc)
finally:
close_done.set()
recv_thread = threading.Thread(target=recv_worker, daemon=True)
recv_thread.start()
assert recv_started.wait(timeout=1.0)
time.sleep(0.05)
close_thread = threading.Thread(target=close_worker, daemon=True)
close_thread.start()
assert close_done.wait(timeout=1.0), 'UdpSession.close() blocked while recv() was waiting'
assert recv_done.wait(timeout=1.0), 'UdpSession.recv() stayed blocked after close()'
assert not close_thread.is_alive()
assert not recv_thread.is_alive()
assert not close_error
assert not recv_error or isinstance(recv_error[0], OSError)

34
ros-control-c/Makefile Normal file
View File

@@ -0,0 +1,34 @@
CC = gcc
CFLAGS = -std=c11 -Wall -Wextra -O2 -pthread -D_GNU_SOURCE -I../include -I../third_party/cjson -I../third_party/kcp -I./common
LDFLAGS = -pthread -lm
OMNI_SRCS = \
../src/omni_common.c \
../src/protocol.c \
../src/latencylog.c \
../src/kcp_packet_debug.c \
../src/kcp_session_stats.c \
../src/linux_timestamping.c \
../src/transport_kcp.c \
../src/peer_kcp_client.c \
../third_party/cjson/cJSON.c \
../third_party/kcp/ikcp.c
BUILDDIR = build
TARGETS = $(BUILDDIR)/keyboard_controller $(BUILDDIR)/gamepad_controller
.PHONY: all clean
all: $(TARGETS)
$(BUILDDIR)/keyboard_controller: remote/keyboard_controller.c common/protocol.h common/teleop_transport.h common/teleop_transport.c $(OMNI_SRCS)
@mkdir -p $(BUILDDIR)
$(CC) $(CFLAGS) -o $@ remote/keyboard_controller.c common/teleop_transport.c $(OMNI_SRCS) $(LDFLAGS)
$(BUILDDIR)/gamepad_controller: remote/gamepad_controller.c common/protocol.h common/teleop_transport.h common/teleop_transport.c $(OMNI_SRCS)
@mkdir -p $(BUILDDIR)
$(CC) $(CFLAGS) -o $@ remote/gamepad_controller.c common/teleop_transport.c $(OMNI_SRCS) $(LDFLAGS)
clean:
rm -rf $(BUILDDIR)

76
ros-control-c/README.md Normal file
View File

@@ -0,0 +1,76 @@
# ros-control-c
`ros-control-c` keeps the original 24-byte `twist_cmd_t` control payload and now supports two runtime transports:
- `udp` (default): unchanged from the original implementation
- `kcp`: sent through OmniSocket using `MSG_TYPE_BINARY`
Note:
- This README documents the `ros-control-c` path only.
- `ros-control-py` now uses OmniSocket for both `transport:=udp` and `transport:=kcp`; its `udp` mode is no longer raw socket UDP.
## Build
On Linux:
```bash
make -C ros-control-c
```
If the robot-side Python bridge will use KCP, build and install the OmniSocket Python extension from the repo root first:
```bash
make python-ext
make python-install
```
## UDP Mode
Sender:
```bash
./ros-control-c/build/keyboard_controller -i 192.168.1.100 -p 9870
./ros-control-c/build/gamepad_controller -i 192.168.1.100 -p 9870
```
Robot bridge:
```bash
python3 ros-control-c/robot/udp_ros_bridge.py
```
## KCP Mode
Start the existing OmniSocket KCP hub from the repo root:
```bash
./bin/kcpserver -listen :9002 -telemetry-peer peer-a-telemetry
```
Sender:
```bash
./ros-control-c/build/keyboard_controller -t kcp -s 192.168.1.50:9002 -I ros-keyboard-ctrl -T ros-bridge-ctrl
./ros-control-c/build/gamepad_controller -t kcp -s 192.168.1.50:9002 -I ros-gamepad-ctrl -T ros-bridge-ctrl
```
If a relay is needed, add `-r <relay_addr>` to the controller command.
Robot bridge:
```bash
python3 ros-control-c/robot/udp_ros_bridge.py --ros-args \
-p transport:=kcp \
-p kcp_server:=192.168.1.50:9002 \
-p peer_id:=ros-bridge-ctrl
```
Optional sender filtering:
```bash
python3 ros-control-c/robot/udp_ros_bridge.py --ros-args \
-p transport:=kcp \
-p peer_id:=ros-bridge-ctrl \
-p expected_sender:=ros-keyboard-ctrl
```

View File

@@ -0,0 +1,221 @@
Robot Remote Control via UDP — Implementation Plan
Context
The robot subscribes to /hric/robot/cmd_vel with geometry_msgs/msg/TwistStamped (frame_id: pelvis). Standard ROS2 teleop tools (teleop_twist_keyboard, teleop_twist_joy) publish
plain Twist, not TwistStamped, so they won't work directly. We build custom keyboard and gamepad controllers in C (zero external dependencies, Linux-only) communicating over UDP
to a robot-side ROS2 bridge.
How to Make the Robot Move
Publish TwistStamped to /hric/robot/cmd_vel continuously (~20 Hz):
┌─────────────────────┬─────────────────┐
│ Field │ Effect │
├─────────────────────┼─────────────────┤
│ twist.linear.x > 0 │ Walk forward │
├─────────────────────┼─────────────────┤
│ twist.linear.x < 0 │ Walk backward │
├─────────────────────┼─────────────────┤
│ twist.linear.y > 0 │ Strafe left │
├─────────────────────┼─────────────────┤
│ twist.linear.y < 0 │ Strafe right │
├─────────────────────┼─────────────────┤
│ twist.angular.z > 0 │ Turn left (CCW) │
├─────────────────────┼─────────────────┤
│ twist.angular.z < 0 │ Turn right (CW) │
├─────────────────────┼─────────────────┤
│ All zeros │ Stop │
└─────────────────────┴─────────────────┘
Header must have frame_id = "pelvis" and current ROS timestamp.
---
Architecture
[PC: Keyboard/Gamepad (C)] --UDP binary struct--> [Robot: Bridge (Python/rclpy)] --> /hric/robot/cmd_vel
---
Project Structure
ros-control/
├── topic_example.yaml # (existing)
├── Makefile # Build both C programs
├── common/
│ └── protocol.h # Shared UDP protocol (binary struct)
├── remote/
│ ├── keyboard_controller.c # Keyboard teleop (C, termios)
│ └── gamepad_controller.c # Gamepad teleop (C, Linux joystick API)
└── robot/
└── udp_ros_bridge.py # UDP → ROS2 TwistStamped (Python/rclpy)
---
UDP Protocol (common/protocol.h)
Binary packed struct — 24 bytes, no parsing overhead:
#pragma pack(push, 1)
typedef struct {
float lx, ly, lz; // linear velocity (m/s)
float ax, ay, az; // angular velocity (rad/s)
} twist_cmd_t;
#pragma pack(pop)
#define DEFAULT_PORT 9870
#define DEFAULT_IP "127.0.0.1"
On Python side, decode with struct.unpack('<6f', data).
---
Program 1: Keyboard Controller (remote/keyboard_controller.c)
Dependencies: None (POSIX + termios only)
Technical approach:
- termios.h: Set terminal to raw mode (~ICANON, ~ECHO, VMIN=0, VTIME=1)
- select() with 50ms timeout for non-blocking key detection
- Arrow keys: detect ESC sequence (\x1B[A/B/C/D)
- UDP send via standard socket() / sendto()
Key mapping:
┌────────┬───────────────────────────────────┐
│ Key │ Action │
├────────┼───────────────────────────────────┤
│ W / ↑ │ Forward (+linear.x) │
├────────┼───────────────────────────────────┤
│ S / ↓ │ Backward (-linear.x) │
├────────┼───────────────────────────────────┤
│ A / ← │ Turn left (+angular.z) │
├────────┼───────────────────────────────────┤
│ D / → │ Turn right (-angular.z) │
├────────┼───────────────────────────────────┤
│ Q │ Strafe left (+linear.y) │
├────────┼───────────────────────────────────┤
│ E │ Strafe right (-linear.y) │
├────────┼───────────────────────────────────┤
│ Space │ Emergency stop (all zeros) │
├────────┼───────────────────────────────────┤
│ [ / ] │ Decrease / increase linear speed │
├────────┼───────────────────────────────────┤
│ - / = │ Decrease / increase angular speed │
├────────┼───────────────────────────────────┤
│ Ctrl+C │ Quit (restore terminal) │
└────────┴───────────────────────────────────┘
Behavior:
- 20 Hz send loop in main thread
- On key press: set velocity to ±max_speed
- On no key (select timeout): gradually decay velocity to zero OR send zero immediately (configurable)
- Print current velocity and speed settings to terminal (refresh in-place with \r)
- signal(SIGINT) handler to restore terminal settings before exit
- CLI args: -i <ip>, -p <port>, -l <max_linear>, -a <max_angular>
---
Program 2: Gamepad Controller (remote/gamepad_controller.c)
Dependencies: None (Linux joystick API only: linux/joystick.h)
Technical approach:
- Open /dev/input/js0 (configurable) with O_RDONLY | O_NONBLOCK
- Read struct js_event (8 bytes: __u32 time, __s16 value, __u8 type, __u8 number)
- Event types: JS_EVENT_AXIS (0x02), JS_EVENT_BUTTON (0x01)
- select() for multiplexing joystick read + periodic UDP send
Xbox controller axis mapping (xpad driver):
┌────────┬───────────────┬───────────────────────────────────┐
│ Axis # │ Physical │ Mapping │
├────────┼───────────────┼───────────────────────────────────┤
│ 0 │ Left stick X │ linear.y (strafe) │
├────────┼───────────────┼───────────────────────────────────┤
│ 1 │ Left stick Y │ linear.x (forward/back, inverted) │
├────────┼───────────────┼───────────────────────────────────┤
│ 3 │ Right stick X │ angular.z (turn) │
└────────┴───────────────┴───────────────────────────────────┘
Button mapping:
┌──────────┬──────────┬────────────────┐
│ Button # │ Physical │ Action │
├──────────┼──────────┼────────────────┤
│ 0 │ A │ Emergency stop │
├──────────┼──────────┼────────────────┤
│ 1 │ B │ Quit │
└──────────┴──────────┴────────────────┘
Behavior:
- Axis values: raw range [-32767, 32767] → normalized to [-1.0, 1.0] → scaled by max_speed
- Deadzone: |normalized| < 0.1 → treat as 0 (configurable)
- 20 Hz UDP send loop
- Print gamepad name (via JSIOCGNAME ioctl), axes, and current velocities
- Auto-detect controller disconnect / reconnect
- CLI args: -i <ip>, -p <port>, -d <device>, -l <max_linear>, -a <max_angular>, -z <deadzone>
---
Program 3: UDP-to-ROS2 Bridge (robot/udp_ros_bridge.py)
Dependencies: rclpy, geometry_msgs (standard ROS2)
Behavior:
- ROS2 node: udp_teleop_bridge
- Bind UDP on 0.0.0.0:9870
- Receive 24-byte struct → struct.unpack('<6f', data) → build TwistStamped
- Set header.stamp = current ROS time, header.frame_id = 'pelvis'
- Publish to /hric/robot/cmd_vel at received rate
- Watchdog: if no packet for 0.5s, publish zero velocity (safety stop)
- UDP recv in separate threading.Thread, ROS2 spin() in main thread
- ROS2 parameters: udp_port (int), topic (string), frame_id (string), timeout (float)
---
Build System (Makefile)
CC = gcc
CFLAGS = -Wall -Wextra -O2 -I./common
LDFLAGS = -lm
all: build/keyboard_controller build/gamepad_controller
build/keyboard_controller: remote/keyboard_controller.c common/protocol.h
@mkdir -p build
$(CC) $(CFLAGS) -o $@ $< $(LDFLAGS)
build/gamepad_controller: remote/gamepad_controller.c common/protocol.h
@mkdir -p build
$(CC) $(CFLAGS) -o $@ $< $(LDFLAGS)
clean:
rm -rf build
---
Files to Create (5 total)
┌─────┬──────────────────────────────┬────────┬───────────────────────────────────┐
│ # │ File │ Lang │ Purpose │
├─────┼──────────────────────────────┼────────┼───────────────────────────────────┤
│ 1 │ common/protocol.h │ C │ UDP protocol: struct + constants │
├─────┼──────────────────────────────┼────────┼───────────────────────────────────┤
│ 2 │ remote/keyboard_controller.c │ C │ Keyboard → UDP (termios, select) │
├─────┼──────────────────────────────┼────────┼───────────────────────────────────┤
│ 3 │ remote/gamepad_controller.c │ C │ Gamepad → UDP (linux/joystick.h) │
├─────┼──────────────────────────────┼────────┼───────────────────────────────────┤
│ 4 │ robot/udp_ros_bridge.py │ Python │ UDP → ROS2 TwistStamped publisher │
├─────┼──────────────────────────────┼────────┼───────────────────────────────────┤
│ 5 │ Makefile │ Make │ Build system │
└─────┴──────────────────────────────┴────────┴───────────────────────────────────┘
---
Verification
1. Build: make — should compile without warnings
2. Keyboard test: Run build/keyboard_controller -i 127.0.0.1, use a simple Python UDP listener to verify packets:
import socket, struct
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.bind(('0.0.0.0', 9870))
while True:
data, _ = s.recvfrom(24)
print(struct.unpack('<6f', data))
3. Gamepad test: Connect Xbox controller, run build/gamepad_controller, verify stick input produces correct UDP packets
4. Bridge test: Run udp_ros_bridge.py, then ros2 topic echo /hric/robot/cmd_vel to verify TwistStamped messages
5. Safety: Stop controller, confirm bridge sends zero velocity after 0.5s timeout
6. End-to-end: Controller → Bridge → robot moves

View File

@@ -0,0 +1,26 @@
#ifndef PROTOCOL_H
#define PROTOCOL_H
#include <stdint.h>
#define DEFAULT_PORT 9870
#define DEFAULT_IP "127.0.0.1"
#define SEND_RATE_HZ 20
#define SEND_INTERVAL_US (1000000 / SEND_RATE_HZ)
#pragma pack(push, 1)
typedef struct {
float lx, ly, lz; /* linear velocity (m/s) */
float ax, ay, az; /* angular velocity (rad/s) */
} twist_cmd_t;
#pragma pack(pop)
#define TWIST_CMD_SIZE sizeof(twist_cmd_t) /* 24 bytes */
static inline void twist_cmd_zero(twist_cmd_t *cmd)
{
cmd->lx = cmd->ly = cmd->lz = 0.0f;
cmd->ax = cmd->ay = cmd->az = 0.0f;
}
#endif /* PROTOCOL_H */

View File

@@ -0,0 +1,300 @@
#include "teleop_transport.h"
#include <arpa/inet.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>
static void teleop_transport_clear(teleop_transport_t *transport)
{
if (transport == NULL) {
return;
}
memset(transport, 0, sizeof(*transport));
transport->mode = TELEOP_TRANSPORT_MODE_UDP;
transport->udp_fd = -1;
}
int teleop_transport_parse_mode(const char *raw, teleop_transport_mode_t *out_mode)
{
if (raw == NULL || out_mode == NULL) {
errno = EINVAL;
return -1;
}
if (strcmp(raw, "udp") == 0) {
*out_mode = TELEOP_TRANSPORT_MODE_UDP;
return 0;
}
if (strcmp(raw, "kcp") == 0) {
*out_mode = TELEOP_TRANSPORT_MODE_KCP;
return 0;
}
errno = EINVAL;
return -1;
}
const char *teleop_transport_mode_name(teleop_transport_mode_t mode)
{
return mode == TELEOP_TRANSPORT_MODE_KCP ? "kcp" : "udp";
}
static void teleop_transport_log_incoming(const message_t *msg)
{
if (msg == NULL) {
return;
}
switch (msg->type) {
case MSG_TYPE_ERROR:
fprintf(stderr,
"teleop transport: server error from %s to %s: %.*s\n",
msg->from,
msg->to,
(int)msg->body_len,
msg->body == NULL ? "" : (const char *)msg->body);
break;
case MSG_TYPE_TEXT:
fprintf(stderr,
"teleop transport: dropped unexpected text from %s to %s: %.*s\n",
msg->from,
msg->to,
(int)msg->body_len,
msg->body == NULL ? "" : (const char *)msg->body);
break;
case MSG_TYPE_BINARY:
fprintf(stderr,
"teleop transport: dropped unexpected binary payload from %s to %s (%lu bytes)\n",
msg->from,
msg->to,
(unsigned long)msg->body_len);
break;
case MSG_TYPE_FILE:
fprintf(stderr,
"teleop transport: dropped unexpected file from %s to %s: %s (%lu bytes)\n",
msg->from,
msg->to,
msg->file_name,
(unsigned long)msg->body_len);
break;
case MSG_TYPE_REGISTER:
fprintf(stderr,
"teleop transport: dropped unexpected register message from %s to %s\n",
msg->from,
msg->to);
break;
default:
fprintf(stderr,
"teleop transport: dropped unexpected message type %s from %s\n",
protocol_message_type_name(msg->type),
msg->from);
break;
}
}
static void *teleop_transport_kcp_recv_thread_main(void *arg)
{
teleop_transport_t *transport = (teleop_transport_t *)arg;
for (;;) {
message_t msg;
int rc;
if (transport->stop_requested) {
return NULL;
}
protocol_message_init(&msg);
rc = kcp_client_receive_timed(transport->kcp_client, &msg, 100);
if (rc == 1) {
protocol_message_clear(&msg);
continue;
}
if (rc != 0) {
protocol_message_clear(&msg);
if (!transport->stop_requested) {
int saved_errno = errno;
fprintf(stderr,
"teleop transport: KCP receive loop stopped: %s (errno=%d)\n",
saved_errno != 0 ? strerror(saved_errno) : "unknown error",
saved_errno);
}
return NULL;
}
teleop_transport_log_incoming(&msg);
protocol_message_clear(&msg);
}
}
static int teleop_transport_open_udp(teleop_transport_t *transport, const teleop_transport_config_t *config)
{
int sockfd;
if (transport == NULL || config == NULL || config->udp_ip == NULL) {
errno = EINVAL;
return -1;
}
sockfd = socket(AF_INET, SOCK_DGRAM, 0);
if (sockfd < 0) {
perror("socket");
return -1;
}
memset(&transport->udp_dest, 0, sizeof(transport->udp_dest));
transport->udp_dest.sin_family = AF_INET;
transport->udp_dest.sin_port = htons(config->udp_port);
if (inet_pton(AF_INET, config->udp_ip, &transport->udp_dest.sin_addr) <= 0) {
fprintf(stderr, "Invalid IP: %s\n", config->udp_ip);
close(sockfd);
errno = EINVAL;
return -1;
}
transport->udp_fd = sockfd;
return 0;
}
static int teleop_transport_open_kcp(teleop_transport_t *transport, const teleop_transport_config_t *config)
{
kcp_conn_options_t options;
const char *relay_via;
if (transport == NULL || config == NULL ||
config->server_addr == NULL || config->peer_id == NULL || config->target_peer == NULL) {
errno = EINVAL;
return -1;
}
kcp_conn_options_set_control_defaults(&options);
relay_via = (config->relay_via != NULL && config->relay_via[0] != '\0') ? config->relay_via : NULL;
transport->kcp_client = kcp_client_dial_with_options(
config->server_addr,
relay_via,
config->peer_id,
"",
"",
&options,
NULL,
NULL,
NULL,
KCP_DEFAULT_STATS_INTERVAL_MS
);
if (transport->kcp_client == NULL) {
int saved_errno = errno;
fprintf(stderr,
"teleop transport: failed to open KCP session as %s via %s%s%s: %s (errno=%d)\n",
config->peer_id,
config->server_addr,
relay_via != NULL ? ", relay=" : "",
relay_via != NULL ? relay_via : "",
saved_errno != 0 ? strerror(saved_errno) : "unknown error",
saved_errno);
errno = saved_errno;
return -1;
}
{
int thread_rc = pthread_create(&transport->recv_thread, NULL, teleop_transport_kcp_recv_thread_main, transport);
if (thread_rc != 0) {
fprintf(stderr,
"teleop transport: failed to start KCP receive thread: %s (errno=%d)\n",
strerror(thread_rc),
thread_rc);
kcp_client_close(transport->kcp_client);
kcp_client_free(transport->kcp_client);
transport->kcp_client = NULL;
errno = thread_rc;
return -1;
}
}
transport->recv_thread_started = 1;
return 0;
}
int teleop_transport_open(teleop_transport_t *transport, const teleop_transport_config_t *config)
{
if (transport == NULL || config == NULL) {
errno = EINVAL;
return -1;
}
teleop_transport_clear(transport);
transport->mode = config->mode;
snprintf(transport->server_addr, sizeof(transport->server_addr), "%s",
config->server_addr == NULL ? "" : config->server_addr);
snprintf(transport->relay_via, sizeof(transport->relay_via), "%s",
config->relay_via == NULL ? "" : config->relay_via);
snprintf(transport->peer_id, sizeof(transport->peer_id), "%s",
config->peer_id == NULL ? "" : config->peer_id);
snprintf(transport->target_peer, sizeof(transport->target_peer), "%s",
config->target_peer == NULL ? "" : config->target_peer);
if (config->mode == TELEOP_TRANSPORT_MODE_KCP) {
return teleop_transport_open_kcp(transport, config);
}
return teleop_transport_open_udp(transport, config);
}
int teleop_transport_send_twist(teleop_transport_t *transport, const twist_cmd_t *cmd)
{
if (transport == NULL || cmd == NULL) {
errno = EINVAL;
return -1;
}
if (transport->mode == TELEOP_TRANSPORT_MODE_KCP) {
if (kcp_client_send_binary(transport->kcp_client, transport->target_peer, cmd, TWIST_CMD_SIZE) != 0) {
int saved_errno = errno;
fprintf(stderr,
"teleop transport: failed to send KCP payload to %s: %s (errno=%d)\n",
transport->target_peer,
saved_errno != 0 ? strerror(saved_errno) : "unknown error",
saved_errno);
errno = saved_errno;
return -1;
}
return 0;
}
{
ssize_t sent = sendto(transport->udp_fd, cmd, TWIST_CMD_SIZE, 0,
(const struct sockaddr *)&transport->udp_dest, sizeof(transport->udp_dest));
if (sent < 0) {
perror("sendto");
return -1;
}
if ((size_t)sent != TWIST_CMD_SIZE) {
fprintf(stderr, "sendto: short send (%zd/%zu)\n", sent, (size_t)TWIST_CMD_SIZE);
errno = EIO;
return -1;
}
}
return 0;
}
void teleop_transport_close(teleop_transport_t *transport)
{
if (transport == NULL) {
return;
}
transport->stop_requested = 1;
if (transport->kcp_client != NULL) {
kcp_client_close(transport->kcp_client);
}
if (transport->recv_thread_started) {
pthread_join(transport->recv_thread, NULL);
transport->recv_thread_started = 0;
}
if (transport->kcp_client != NULL) {
kcp_client_free(transport->kcp_client);
transport->kcp_client = NULL;
}
if (transport->udp_fd >= 0) {
close(transport->udp_fd);
transport->udp_fd = -1;
}
}

View File

@@ -0,0 +1,59 @@
#ifndef TELEOP_TRANSPORT_H
#define TELEOP_TRANSPORT_H
#include <netinet/in.h>
#include <pthread.h>
#include "protocol.h"
#include "peer_kcp_client.h"
#ifdef __cplusplus
extern "C" {
#endif
#define DEFAULT_KCP_SERVER_ADDR "127.0.0.1:9002"
#define DEFAULT_KCP_KEYBOARD_PEER_ID "ros-keyboard-ctrl"
#define DEFAULT_KCP_GAMEPAD_PEER_ID "ros-gamepad-ctrl"
#define DEFAULT_KCP_TARGET_PEER_ID "ros-bridge-ctrl"
typedef enum teleop_transport_mode {
TELEOP_TRANSPORT_MODE_UDP = 0,
TELEOP_TRANSPORT_MODE_KCP = 1
} teleop_transport_mode_t;
typedef struct teleop_transport_config {
teleop_transport_mode_t mode;
const char *udp_ip;
int udp_port;
const char *server_addr;
const char *relay_via;
const char *peer_id;
const char *target_peer;
} teleop_transport_config_t;
typedef struct teleop_transport {
teleop_transport_mode_t mode;
int udp_fd;
struct sockaddr_in udp_dest;
kcp_client_t *kcp_client;
pthread_t recv_thread;
int recv_thread_started;
volatile int stop_requested;
char server_addr[OMNI_MAX_ADDR_TEXT];
char relay_via[OMNI_MAX_ADDR_TEXT];
char peer_id[OMNI_MAX_PEER_ID];
char target_peer[OMNI_MAX_PEER_ID];
} teleop_transport_t;
int teleop_transport_parse_mode(const char *raw, teleop_transport_mode_t *out_mode);
const char *teleop_transport_mode_name(teleop_transport_mode_t mode);
int teleop_transport_open(teleop_transport_t *transport, const teleop_transport_config_t *config);
int teleop_transport_send_twist(teleop_transport_t *transport, const twist_cmd_t *cmd);
void teleop_transport_close(teleop_transport_t *transport);
#ifdef __cplusplus
}
#endif
#endif /* TELEOP_TRANSPORT_H */

View File

@@ -0,0 +1,293 @@
/*
* gamepad_controller.c — Gamepad/joystick teleop over UDP or KCP
*
* Uses the Linux joystick API (/dev/input/js*).
* Zero external dependencies.
*
* Xbox controller mapping (xpad driver):
* Left stick Y (axis 1) → linear.x (forward/back, inverted)
* Left stick X (axis 0) → linear.y (strafe)
* Right stick X (axis 3) → angular.z (turn)
* Button A (0) → emergency stop
* Button B (1) → quit
*
* Build: gcc -Wall -O2 -I../common -o gamepad_controller gamepad_controller.c -lm
* Usage: ./gamepad_controller [-i IP] [-p PORT] [-d /dev/input/js0]
* [-l MAX_LIN] [-a MAX_ANG] [-z DEADZONE]
* [-t udp|kcp] [-s SERVER] [-r RELAY]
* [-I PEER_ID] [-T TARGET_PEER]
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <signal.h>
#include <math.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/ioctl.h>
#include <getopt.h>
#include <linux/joystick.h>
#include "../common/protocol.h"
#include "../common/teleop_transport.h"
/* ── config ─────────────────────────────────────────────────────────── */
#define MAX_AXES 16
#define MAX_BUTTONS 16
#define JS_AXIS_MAX 32767.0f
/* Xbox mapping indices */
#define AXIS_LX 0 /* left stick X → strafe */
#define AXIS_LY 1 /* left stick Y → fwd/back (inverted) */
#define AXIS_RX 3 /* right stick X → turn */
#define BTN_STOP 0 /* A → emergency stop */
#define BTN_QUIT 1 /* B → quit */
static volatile sig_atomic_t g_running = 1;
static void sigint_handler(int sig) { (void)sig; g_running = 0; }
static int parse_port(const char *text, int *port_out)
{
char *end = NULL;
long value = strtol(text, &end, 10);
if (end == text || *end != '\0' || value < 1 || value > 65535)
return -1;
*port_out = (int)value;
return 0;
}
/* ── apply deadzone ─────────────────────────────────────────────────── */
static float apply_deadzone(float v, float dz)
{
if (fabsf(v) < dz) return 0.0f;
/* rescale so the output starts from 0 just outside the deadzone */
float sign = (v > 0) ? 1.0f : -1.0f;
return sign * (fabsf(v) - dz) / (1.0f - dz);
}
/* ── usage ──────────────────────────────────────────────────────────── */
static void usage(const char *prog)
{
fprintf(stderr,
"Usage: %s [options]\n"
" -i IP target IP (default %s)\n"
" -p PORT target port (default %d)\n"
" -d DEVICE joystick device (default /dev/input/js0)\n"
" -l SPEED max linear m/s (default 0.5)\n"
" -a SPEED max angular rad/s (default 0.5)\n"
" -z DZ deadzone 0<=DZ<1 (default 0.1)\n"
" -t MODE transport mode udp|kcp (default udp)\n"
" -s ADDR KCP server addr (default %s)\n"
" -r ADDR KCP relay addr (default none)\n"
" -I ID local KCP peer id (default %s)\n"
" -T ID target KCP peer id (default %s)\n"
" -h show help\n",
prog, DEFAULT_IP, DEFAULT_PORT,
DEFAULT_KCP_SERVER_ADDR,
DEFAULT_KCP_GAMEPAD_PEER_ID,
DEFAULT_KCP_TARGET_PEER_ID);
}
/* ──────────────────────────────────────────────────────────────────── */
int main(int argc, char *argv[])
{
char ip[64] = DEFAULT_IP;
int port = DEFAULT_PORT;
char device[128] = "/dev/input/js0";
char kcp_server[OMNI_MAX_ADDR_TEXT] = DEFAULT_KCP_SERVER_ADDR;
char kcp_relay[OMNI_MAX_ADDR_TEXT] = "";
char peer_id[OMNI_MAX_PEER_ID] = DEFAULT_KCP_GAMEPAD_PEER_ID;
char target_peer[OMNI_MAX_PEER_ID] = DEFAULT_KCP_TARGET_PEER_ID;
float max_lin = 0.5f;
float max_ang = 0.5f;
float deadzone = 0.1f;
teleop_transport_mode_t transport_mode = TELEOP_TRANSPORT_MODE_UDP;
teleop_transport_t transport;
teleop_transport_config_t transport_config;
int opt;
while ((opt = getopt(argc, argv, "i:p:d:l:a:z:t:s:r:I:T:h")) != -1) {
switch (opt) {
case 'i': strncpy(ip, optarg, sizeof(ip)-1); ip[sizeof(ip)-1] = '\0'; break;
case 'p':
if (parse_port(optarg, &port) != 0) {
fprintf(stderr, "Invalid port: %s (expected 1-65535)\n", optarg);
return 1;
}
break;
case 'd': strncpy(device, optarg, sizeof(device)-1); device[sizeof(device)-1] = '\0'; break;
case 'l': max_lin = strtof(optarg, NULL); break;
case 'a': max_ang = strtof(optarg, NULL); break;
case 'z': deadzone = strtof(optarg, NULL); break;
case 't':
if (teleop_transport_parse_mode(optarg, &transport_mode) != 0) {
fprintf(stderr, "Invalid transport mode: %s (expected udp or kcp)\n", optarg);
return 1;
}
break;
case 's': strncpy(kcp_server, optarg, sizeof(kcp_server)-1); kcp_server[sizeof(kcp_server)-1] = '\0'; break;
case 'r': strncpy(kcp_relay, optarg, sizeof(kcp_relay)-1); kcp_relay[sizeof(kcp_relay)-1] = '\0'; break;
case 'I': strncpy(peer_id, optarg, sizeof(peer_id)-1); peer_id[sizeof(peer_id)-1] = '\0'; break;
case 'T': strncpy(target_peer, optarg, sizeof(target_peer)-1); target_peer[sizeof(target_peer)-1] = '\0'; break;
default: usage(argv[0]); return (opt == 'h') ? 0 : 1;
}
}
if (deadzone < 0.0f || deadzone >= 1.0f) {
fprintf(stderr, "Invalid deadzone %.3f: expected 0 <= dz < 1\n", deadzone);
return 1;
}
signal(SIGINT, sigint_handler);
/* ── open joystick ───────────────────────────────────────────── */
int jsfd = open(device, O_RDONLY | O_NONBLOCK);
if (jsfd < 0) {
fprintf(stderr, "Cannot open %s: %s\n"
" Hint: connect Xbox controller, check 'ls /dev/input/js*'\n",
device, strerror(errno));
return 1;
}
char js_name[128] = "Unknown";
ioctl(jsfd, JSIOCGNAME(sizeof(js_name)), js_name);
int num_axes = 0, num_buttons = 0;
ioctl(jsfd, JSIOCGAXES, &num_axes);
ioctl(jsfd, JSIOCGBUTTONS, &num_buttons);
printf("========================================\n");
printf(" Gamepad Teleop Controller\n");
printf("========================================\n");
printf(" Device : %s\n", device);
printf(" Name : %s\n", js_name);
printf(" Axes : %d Buttons: %d\n", num_axes, num_buttons);
printf(" Transport: %s\n", teleop_transport_mode_name(transport_mode));
if (transport_mode == TELEOP_TRANSPORT_MODE_KCP) {
printf(" KCP server: %s\n", kcp_server);
if (kcp_relay[0] != '\0')
printf(" Relay via : %s\n", kcp_relay);
printf(" Peer ID : %s -> %s\n", peer_id, target_peer);
} else {
printf(" Target : %s:%d\n", ip, port);
}
printf(" Linear : %.2f m/s Angular: %.2f rad/s\n", max_lin, max_ang);
printf(" Deadzone: %.2f\n", deadzone);
printf("----------------------------------------\n");
printf(" Left stick → forward/back + strafe\n");
printf(" Right stick → turn\n");
printf(" A button → emergency stop\n");
printf(" B button → quit\n");
printf("========================================\n\n");
memset(&transport_config, 0, sizeof(transport_config));
transport_config.mode = transport_mode;
transport_config.udp_ip = ip;
transport_config.udp_port = port;
transport_config.server_addr = kcp_server;
transport_config.relay_via = kcp_relay;
transport_config.peer_id = peer_id;
transport_config.target_peer = target_peer;
if (teleop_transport_open(&transport, &transport_config) != 0) {
close(jsfd);
return 1;
}
/* ── state ───────────────────────────────────────────────────── */
float axes[MAX_AXES];
int buttons[MAX_BUTTONS];
memset(axes, 0, sizeof(axes));
memset(buttons, 0, sizeof(buttons));
twist_cmd_t cmd;
twist_cmd_zero(&cmd);
struct timeval last_send;
gettimeofday(&last_send, NULL);
int e_stop = 0;
/* ── main loop ───────────────────────────────────────────────── */
while (g_running) {
/* read all pending joystick events */
struct js_event ev;
while (read(jsfd, &ev, sizeof(ev)) == sizeof(ev)) {
ev.type &= ~JS_EVENT_INIT; /* strip init flag */
if (ev.type == JS_EVENT_AXIS && ev.number < MAX_AXES) {
axes[ev.number] = (float)ev.value / JS_AXIS_MAX;
} else if (ev.type == JS_EVENT_BUTTON && ev.number < MAX_BUTTONS) {
buttons[ev.number] = ev.value;
if (ev.number == BTN_QUIT && ev.value) {
g_running = 0;
break;
}
if (ev.number == BTN_STOP && ev.value) {
e_stop = !e_stop;
if (e_stop)
printf("\r ** EMERGENCY STOP ** ");
else
printf("\r ** E-STOP released ** ");
fflush(stdout);
}
}
}
/* EAGAIN is expected in non-blocking mode */
if (errno != EAGAIN && errno != 0) {
perror("read joystick");
break;
}
errno = 0;
/* map axes → twist (skip if e-stopped) */
if (e_stop) {
twist_cmd_zero(&cmd);
} else {
float lx_raw = apply_deadzone(-axes[AXIS_LY], deadzone); /* Y inverted */
float ly_raw = apply_deadzone(-axes[AXIS_LX], deadzone);
float az_raw = apply_deadzone(-axes[AXIS_RX], deadzone);
cmd.lx = lx_raw * max_lin;
cmd.ly = ly_raw * max_lin;
cmd.lz = 0.0f;
cmd.ax = 0.0f;
cmd.ay = 0.0f;
cmd.az = az_raw * max_ang;
}
/* rate-limit sending */
struct timeval now;
gettimeofday(&now, NULL);
long elapsed = (now.tv_sec - last_send.tv_sec) * 1000000
+ (now.tv_usec - last_send.tv_usec);
if (elapsed < SEND_INTERVAL_US) {
usleep(5000); /* 5 ms sleep to avoid busy-spin */
continue;
}
last_send = now;
teleop_transport_send_twist(&transport, &cmd);
printf("\r cmd: lx=%+.2f ly=%+.2f az=%+.2f | raw: LY=%+.2f LX=%+.2f RX=%+.2f ",
cmd.lx, cmd.ly, cmd.az,
axes[AXIS_LY], axes[AXIS_LX], axes[AXIS_RX]);
fflush(stdout);
}
/* send final stop */
twist_cmd_zero(&cmd);
teleop_transport_send_twist(&transport, &cmd);
close(jsfd);
teleop_transport_close(&transport);
printf("\nStopped.\n");
return 0;
}

View File

@@ -0,0 +1,361 @@
/*
* keyboard_controller.c - Keyboard teleop over UDP or KCP
*
* Keys:
* W/Up forward S/Down backward
* A/Left turn left D/Right turn right
* Q strafe left E strafe right
* Space stop
* [ / ] linear speed down/up
* - / = angular speed down/up
* Ctrl-C quit
*
* Build: gcc -Wall -O2 -I../common -o keyboard_controller keyboard_controller.c
* Usage: ./keyboard_controller [-i IP] [-p PORT] [-l MAX_LIN] [-a MAX_ANG]
* [-t udp|kcp] [-s SERVER] [-r RELAY]
* [-I PEER_ID] [-T TARGET_PEER]
*/
#include <getopt.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>
#include <termios.h>
#include <unistd.h>
#include "../common/protocol.h"
#include "../common/teleop_transport.h"
/*
* Terminals do not provide key-release events, so keep the last motion command
* alive briefly to bridge the initial auto-repeat delay while a key is held.
*/
#define KEY_HOLD_TIMEOUT_US 500000L
static struct termios g_orig_termios;
static volatile sig_atomic_t g_running = 1;
static long elapsed_us(const struct timeval *start, const struct timeval *end)
{
return (end->tv_sec - start->tv_sec) * 1000000L
+ (end->tv_usec - start->tv_usec);
}
static int parse_port(const char *text, int *port_out)
{
char *end = NULL;
long value = strtol(text, &end, 10);
if (end == text || *end != '\0' || value < 1 || value > 65535)
return -1;
*port_out = (int)value;
return 0;
}
static void restore_terminal(void)
{
tcsetattr(STDIN_FILENO, TCSANOW, &g_orig_termios);
printf("\n\033[?25h");
fflush(stdout);
}
static void sigint_handler(int sig)
{
(void)sig;
g_running = 0;
}
static void set_raw_mode(void)
{
struct termios raw;
tcgetattr(STDIN_FILENO, &g_orig_termios);
atexit(restore_terminal);
raw = g_orig_termios;
raw.c_lflag &= ~(ICANON | ECHO | ISIG);
raw.c_cc[VMIN] = 0;
raw.c_cc[VTIME] = 0;
tcsetattr(STDIN_FILENO, TCSANOW, &raw);
}
static int read_key(long timeout_us)
{
fd_set fds;
struct timeval tv;
unsigned char c;
FD_ZERO(&fds);
FD_SET(STDIN_FILENO, &fds);
tv.tv_sec = timeout_us / 1000000L;
tv.tv_usec = timeout_us % 1000000L;
if (select(STDIN_FILENO + 1, &fds, NULL, NULL, &tv) <= 0)
return -1;
if (read(STDIN_FILENO, &c, 1) != 1)
return -1;
if (c == 0x1B) {
unsigned char seq[2];
tv.tv_sec = 0;
tv.tv_usec = 20000;
FD_ZERO(&fds);
FD_SET(STDIN_FILENO, &fds);
if (select(STDIN_FILENO + 1, &fds, NULL, NULL, &tv) <= 0)
return 0x1B;
if (read(STDIN_FILENO, &seq[0], 1) != 1)
return 0x1B;
FD_ZERO(&fds);
FD_SET(STDIN_FILENO, &fds);
tv.tv_sec = 0;
tv.tv_usec = 20000;
if (select(STDIN_FILENO + 1, &fds, NULL, NULL, &tv) <= 0)
return 0x1B;
if (read(STDIN_FILENO, &seq[1], 1) != 1)
return 0x1B;
if (seq[0] == '[') {
switch (seq[1]) {
case 'A': return 'W';
case 'B': return 'S';
case 'D': return 'A';
case 'C': return 'D';
default: break;
}
}
return 0x1B;
}
if (c >= 'a' && c <= 'z')
c = (unsigned char)(c - ('a' - 'A'));
return c;
}
static void print_banner(void)
{
printf("\033[2J\033[H");
printf("========================================\n");
printf(" Keyboard Teleop Controller\n");
printf("========================================\n");
printf(" W/Up : forward S/Down : back\n");
printf(" A/Left : turn left D/Right: turn right\n");
printf(" Q : strafe left E : strafe right\n");
printf(" Space : stop\n");
printf(" [ / ] : linear speed -/+\n");
printf(" - / = : angular speed -/+\n");
printf(" Ctrl-C : quit\n");
printf("========================================\n\n");
}
static void usage(const char *prog)
{
fprintf(stderr,
"Usage: %s [options]\n"
" -i IP target IP (default %s)\n"
" -p PORT target port (default %d)\n"
" -l SPEED max linear speed m/s (default 0.5)\n"
" -a SPEED max angular speed rad/s (default 0.5)\n"
" -t MODE transport mode udp|kcp (default udp)\n"
" -s ADDR KCP server addr (default %s)\n"
" -r ADDR KCP relay addr (default none)\n"
" -I ID local KCP peer id (default %s)\n"
" -T ID target KCP peer id (default %s)\n"
" -h show help\n",
prog, DEFAULT_IP, DEFAULT_PORT,
DEFAULT_KCP_SERVER_ADDR,
DEFAULT_KCP_KEYBOARD_PEER_ID,
DEFAULT_KCP_TARGET_PEER_ID);
}
int main(int argc, char *argv[])
{
char ip[64] = DEFAULT_IP;
int port = DEFAULT_PORT;
char kcp_server[OMNI_MAX_ADDR_TEXT] = DEFAULT_KCP_SERVER_ADDR;
char kcp_relay[OMNI_MAX_ADDR_TEXT] = "";
char peer_id[OMNI_MAX_PEER_ID] = DEFAULT_KCP_KEYBOARD_PEER_ID;
char target_peer[OMNI_MAX_PEER_ID] = DEFAULT_KCP_TARGET_PEER_ID;
float max_lin = 0.5f;
float max_ang = 0.5f;
const float speed_step = 0.1f;
teleop_transport_mode_t transport_mode = TELEOP_TRANSPORT_MODE_UDP;
teleop_transport_t transport;
teleop_transport_config_t transport_config;
int opt;
while ((opt = getopt(argc, argv, "i:p:l:a:t:s:r:I:T:h")) != -1) {
switch (opt) {
case 'i':
strncpy(ip, optarg, sizeof(ip) - 1);
ip[sizeof(ip) - 1] = '\0';
break;
case 'p':
if (parse_port(optarg, &port) != 0) {
fprintf(stderr, "Invalid port: %s (expected 1-65535)\n", optarg);
return 1;
}
break;
case 'l':
max_lin = strtof(optarg, NULL);
break;
case 'a':
max_ang = strtof(optarg, NULL);
break;
case 't':
if (teleop_transport_parse_mode(optarg, &transport_mode) != 0) {
fprintf(stderr, "Invalid transport mode: %s (expected udp or kcp)\n", optarg);
return 1;
}
break;
case 's':
strncpy(kcp_server, optarg, sizeof(kcp_server) - 1);
kcp_server[sizeof(kcp_server) - 1] = '\0';
break;
case 'r':
strncpy(kcp_relay, optarg, sizeof(kcp_relay) - 1);
kcp_relay[sizeof(kcp_relay) - 1] = '\0';
break;
case 'I':
strncpy(peer_id, optarg, sizeof(peer_id) - 1);
peer_id[sizeof(peer_id) - 1] = '\0';
break;
case 'T':
strncpy(target_peer, optarg, sizeof(target_peer) - 1);
target_peer[sizeof(target_peer) - 1] = '\0';
break;
default:
usage(argv[0]);
return (opt == 'h') ? 0 : 1;
}
}
memset(&transport_config, 0, sizeof(transport_config));
transport_config.mode = transport_mode;
transport_config.udp_ip = ip;
transport_config.udp_port = port;
transport_config.server_addr = kcp_server;
transport_config.relay_via = kcp_relay;
transport_config.peer_id = peer_id;
transport_config.target_peer = target_peer;
if (teleop_transport_open(&transport, &transport_config) != 0) {
return 1;
}
set_raw_mode();
signal(SIGINT, sigint_handler);
print_banner();
printf(" Transport: %s\n", teleop_transport_mode_name(transport_mode));
if (transport_mode == TELEOP_TRANSPORT_MODE_KCP) {
printf(" KCP server: %s\n", kcp_server);
if (kcp_relay[0] != '\0')
printf(" Relay via : %s\n", kcp_relay);
printf(" Peer ID : %s -> %s\n", peer_id, target_peer);
} else {
printf(" Target: %s:%d\n", ip, port);
}
printf(" Linear: %.2f m/s Angular: %.2f rad/s\n\n", max_lin, max_ang);
printf("\033[?25l");
twist_cmd_t cmd;
twist_cmd_zero(&cmd);
struct timeval last_send;
struct timeval last_motion_key;
gettimeofday(&last_send, NULL);
last_motion_key = last_send;
while (g_running) {
int key = read_key(SEND_INTERVAL_US);
if (key >= 0) {
twist_cmd_zero(&cmd);
switch (key) {
case 'W':
cmd.lx = max_lin;
gettimeofday(&last_motion_key, NULL);
break;
case 'S':
cmd.lx = -max_lin;
gettimeofday(&last_motion_key, NULL);
break;
case 'A':
cmd.az = max_ang;
gettimeofday(&last_motion_key, NULL);
break;
case 'D':
cmd.az = -max_ang;
gettimeofday(&last_motion_key, NULL);
break;
case 'Q':
cmd.ly = max_lin;
gettimeofday(&last_motion_key, NULL);
break;
case 'E':
cmd.ly = -max_lin;
gettimeofday(&last_motion_key, NULL);
break;
case ' ':
break;
case ']':
max_lin += speed_step;
printf("\r Linear speed: %.2f m/s ", max_lin);
fflush(stdout);
continue;
case '[':
max_lin = (max_lin > speed_step) ? max_lin - speed_step : speed_step;
printf("\r Linear speed: %.2f m/s ", max_lin);
fflush(stdout);
continue;
case '=':
max_ang += speed_step;
printf("\r Angular speed: %.2f rad/s ", max_ang);
fflush(stdout);
continue;
case '-':
max_ang = (max_ang > speed_step) ? max_ang - speed_step : speed_step;
printf("\r Angular speed: %.2f rad/s ", max_ang);
fflush(stdout);
continue;
case 0x03:
g_running = 0;
continue;
default:
continue;
}
} else {
struct timeval now;
gettimeofday(&now, NULL);
if (elapsed_us(&last_motion_key, &now) > KEY_HOLD_TIMEOUT_US)
twist_cmd_zero(&cmd);
}
{
struct timeval now;
long elapsed;
gettimeofday(&now, NULL);
elapsed = elapsed_us(&last_send, &now);
if (elapsed < SEND_INTERVAL_US)
continue;
last_send = now;
}
teleop_transport_send_twist(&transport, &cmd);
printf("\r cmd: lx=%+.2f ly=%+.2f az=%+.2f | lin=%.2f ang=%.2f ",
cmd.lx, cmd.ly, cmd.az, max_lin, max_ang);
fflush(stdout);
}
twist_cmd_zero(&cmd);
teleop_transport_send_twist(&transport, &cmd);
teleop_transport_close(&transport);
printf("\nStopped.\n");
return 0;
}

View File

@@ -0,0 +1,247 @@
#!/usr/bin/env python3
"""
udp_ros_bridge.py — UDP/KCP → ROS2 TwistStamped bridge
Receives 24-byte binary twist commands from keyboard/gamepad controllers
via UDP or OmniSocket/KCP and publishes geometry_msgs/msg/TwistStamped to
/hric/robot/cmd_vel.
Usage:
ros2 run <your_pkg> udp_ros_bridge (if installed as a ROS2 package)
python3 udp_ros_bridge.py (standalone)
ROS2 parameters:
transport (string) — udp or kcp (default udp)
udp_port (int) — UDP listen port (default 9870)
kcp_server (string) — KCP hub addr (default 127.0.0.1:9002)
kcp_relay_via (string) — optional relay addr (default "")
peer_id (string) — local KCP peer id (default ros-bridge-ctrl)
expected_sender (string) — optional sender filter (default "")
topic (string) — publish topic (default /hric/robot/cmd_vel)
frame_id (string) — TwistStamped frame_id (default pelvis)
timeout (float) — watchdog timeout seconds (default 0.5)
"""
from pathlib import Path
import struct
import socket
import sys
import threading
import time
import rclpy
from rclpy.node import Node
from geometry_msgs.msg import TwistStamped
TWIST_CMD_FMT = '<6f' # 6 little-endian floats, 24 bytes
TWIST_CMD_SIZE = struct.calcsize(TWIST_CMD_FMT)
def _load_omnisocket():
try:
from omnisocket import CONTROL_DEFAULTS, MSG_TYPE_BINARY, MSG_TYPE_ERROR, Session
return CONTROL_DEFAULTS, MSG_TYPE_BINARY, MSG_TYPE_ERROR, Session
except ImportError:
root = Path(__file__).resolve().parents[2]
python_dir = root / 'python'
if str(python_dir) not in sys.path:
sys.path.insert(0, str(python_dir))
from omnisocket import CONTROL_DEFAULTS, MSG_TYPE_BINARY, MSG_TYPE_ERROR, Session
return CONTROL_DEFAULTS, MSG_TYPE_BINARY, MSG_TYPE_ERROR, Session
class UdpTeleopBridge(Node):
def __init__(self):
super().__init__('udp_teleop_bridge')
# declare parameters
self.declare_parameter('transport', 'udp')
self.declare_parameter('udp_port', 9870)
self.declare_parameter('kcp_server', '127.0.0.1:9002')
self.declare_parameter('kcp_relay_via', '')
self.declare_parameter('peer_id', 'ros-bridge-ctrl')
self.declare_parameter('expected_sender', '')
self.declare_parameter('topic', '/hric/robot/cmd_vel')
self.declare_parameter('frame_id', 'pelvis')
self.declare_parameter('timeout', 0.5)
self._transport = str(self.get_parameter('transport').value).strip().lower()
self._port = self.get_parameter('udp_port').value
self._kcp_server = str(self.get_parameter('kcp_server').value)
self._kcp_relay_via = str(self.get_parameter('kcp_relay_via').value)
self._peer_id = str(self.get_parameter('peer_id').value)
self._expected_sender = str(self.get_parameter('expected_sender').value)
self._topic = self.get_parameter('topic').value
self._frame_id = self.get_parameter('frame_id').value
self._timeout = self.get_parameter('timeout').value
if self._transport not in ('udp', 'kcp'):
raise ValueError(f"Unsupported transport '{self._transport}', expected 'udp' or 'kcp'")
# publisher
self._pub = self.create_publisher(TwistStamped, self._topic, 10)
# watchdog timer
self._last_recv = time.monotonic()
self._lock = threading.Lock()
self._latest_cmd = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
self._timer = self.create_timer(1.0 / 20.0, self._timer_cb)
self._sock = None
self._session = None
self._msg_type_binary = None
self._msg_type_error = None
self._closing = False
if self._transport == 'kcp':
control_defaults, self._msg_type_binary, self._msg_type_error, session_cls = _load_omnisocket()
self._session = session_cls()
self._session.connect(
server_addr=self._kcp_server,
peer_id=self._peer_id,
relay_via=self._kcp_relay_via,
**control_defaults,
)
else:
self._sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
self._sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
self._sock.bind(('0.0.0.0', self._port))
self._sock.settimeout(0.1)
# receive thread
recv_target = self._recv_loop_kcp if self._transport == 'kcp' else self._recv_loop_udp
self._recv_thread = threading.Thread(target=recv_target, daemon=True)
self._recv_thread.start()
if self._transport == 'kcp':
self.get_logger().info(
f'Bridge ready — KCP {self._kcp_server} as {self._peer_id}{self._topic} '
f'(frame_id={self._frame_id}, timeout={self._timeout}s)'
)
else:
self.get_logger().info(
f'Bridge ready — UDP 0.0.0.0:{self._port}{self._topic} '
f'(frame_id={self._frame_id}, timeout={self._timeout}s)'
)
def _recv_loop_udp(self):
"""Background thread: receive UDP packets and update latest command."""
while rclpy.ok():
try:
data, addr = self._sock.recvfrom(TWIST_CMD_SIZE + 64)
except socket.timeout:
continue
except OSError:
break
if len(data) != TWIST_CMD_SIZE:
self.get_logger().warn(
f'Packet has invalid size {len(data)} bytes from {addr}, '
f'expected {TWIST_CMD_SIZE}'
)
continue
values = struct.unpack(TWIST_CMD_FMT, data)
with self._lock:
self._latest_cmd = values
self._last_recv = time.monotonic()
def _recv_loop_kcp(self):
"""Background thread: receive KCP packets and update latest command."""
while rclpy.ok():
try:
result = self._session.recv(timeout_ms=100)
except OSError as exc:
if not self._closing:
self.get_logger().error(f'KCP receive failed: {exc}')
break
if result is None:
continue
from_peer, msg_type, payload = result
if msg_type == self._msg_type_error:
self.get_logger().error(
f'KCP server error from {from_peer}: {payload.decode("utf-8", errors="replace")}'
)
continue
if self._expected_sender and from_peer != self._expected_sender:
self.get_logger().warn(
f'Ignoring KCP packet from unexpected sender {from_peer}, '
f'expected {self._expected_sender}'
)
continue
if msg_type != self._msg_type_binary:
self.get_logger().warn(
f'Ignoring non-binary KCP message type {msg_type} from {from_peer}'
)
continue
if len(payload) != TWIST_CMD_SIZE:
self.get_logger().warn(
f'KCP payload has invalid size {len(payload)} bytes from {from_peer}, '
f'expected {TWIST_CMD_SIZE}'
)
continue
values = struct.unpack(TWIST_CMD_FMT, payload)
with self._lock:
self._latest_cmd = values
self._last_recv = time.monotonic()
def _timer_cb(self):
"""20 Hz: publish TwistStamped from latest received command."""
with self._lock:
elapsed = time.monotonic() - self._last_recv
if elapsed > self._timeout:
lx, ly, lz, ax, ay, az = 0.0, 0.0, 0.0, 0.0, 0.0, 0.0
else:
lx, ly, lz, ax, ay, az = self._latest_cmd
msg = TwistStamped()
msg.header.stamp = self.get_clock().now().to_msg()
msg.header.frame_id = self._frame_id
msg.twist.linear.x = float(lx)
msg.twist.linear.y = float(ly)
msg.twist.linear.z = float(lz)
msg.twist.angular.x = float(ax)
msg.twist.angular.y = float(ay)
msg.twist.angular.z = float(az)
self._pub.publish(msg)
def destroy_node(self):
self._closing = True
if self._sock is not None:
self._sock.close()
self._sock = None
if self._session is not None:
try:
self._session.close()
except OSError as exc:
self.get_logger().warn(f'Closing KCP session failed: {exc}')
self._session = None
if hasattr(self, '_recv_thread') and self._recv_thread.is_alive():
self._recv_thread.join(timeout=0.2)
super().destroy_node()
def main(args=None):
rclpy.init(args=args)
node = None
try:
node = UdpTeleopBridge()
rclpy.spin(node)
except KeyboardInterrupt:
pass
finally:
if node is not None:
node.destroy_node()
rclpy.shutdown()
if __name__ == '__main__':
main()

257
ros-control-py/README.md Normal file
View File

@@ -0,0 +1,257 @@
# ROS2 Teleop over OmniSocket UDP/KCP
`ros-control-py/udp_teleop_bridge` 现在把 teleop 控制流统一接到 OmniSocket peer 传输上。
- `transport:=udp` 表示 OmniSocket UDP`udpserver/udppeer` 的消息协议传输
- `transport:=kcp` 表示 OmniSocket KCP`kcpserver/kcppeer` 的消息协议传输
- 不再使用原来的裸 `socket.sendto()/recvfrom()` UDP 路径
机器人最终接收的话题保持不变:
- topic: `/hric/robot/cmd_vel`
- type: `geometry_msgs/msg/TwistStamped`
- frame_id: `pelvis`
控制负载也保持不变:
- fixed payload: 24-byte little-endian `<6f>`
- order: `lx, ly, lz, ax, ay, az`
## 目录
- `udp_teleop_bridge/udp_teleop_bridge/cmd_vel_udp_sender.py`: 订阅 `TwistStamped`,经 OmniSocket 发送 24 字节控制包
- `udp_teleop_bridge/udp_teleop_bridge/udp_cmd_vel_receiver.py`: 从 OmniSocket 接收控制包,补时间戳并发布到机器人 ROS2 topic
- `udp_teleop_bridge/udp_teleop_bridge/omni_transport.py`: 统一封装 OmniSocket UDP/KCP session
- `udp_teleop_bridge/config/xbox_twist_joy.yaml`: Xbox 手柄映射
- `udp_teleop_bridge/launch/*.launch.py`: Linux 启动入口
## Linux 构建
先安装 ROS 2 官方 teleop 依赖:
```bash
sudo apt install ros-${ROS_DISTRO}-joy ros-${ROS_DISTRO}-teleop-twist-joy ros-${ROS_DISTRO}-teleop-twist-keyboard
```
再构建并安装 OmniSocket Python 扩展:
```bash
make python-ext
make python-install
```
最后构建 ROS 包:
```bash
colcon build --packages-select udp_teleop_bridge
source install/setup.bash
```
如果 `omnisocket` 没有安装到当前 ROS Python 环境sender/receiver 会直接报错退出。
## 先验证机器人控制语义
在机器人本机先直接低速发布 `/hric/robot/cmd_vel`,确认 `linear.x``linear.y``angular.z` 的物理方向符合预期:
```bash
ros2 topic pub /hric/robot/cmd_vel geometry_msgs/msg/TwistStamped \
"{header: {frame_id: pelvis}, twist: {linear: {x: 0.10, y: 0.0, z: 0.0}, angular: {x: 0.0, y: 0.0, z: 0.0}}}" \
-r 20
```
```bash
ros2 topic pub /hric/robot/cmd_vel geometry_msgs/msg/TwistStamped \
"{header: {frame_id: pelvis}, twist: {linear: {x: 0.0, y: 0.10, z: 0.0}, angular: {x: 0.0, y: 0.0, z: 0.0}}}" \
-r 20
```
```bash
ros2 topic pub /hric/robot/cmd_vel geometry_msgs/msg/TwistStamped \
"{header: {frame_id: pelvis}, twist: {linear: {x: 0.0, y: 0.0, z: 0.0}, angular: {x: 0.0, y: 0.0, z: 0.30}}}" \
-r 20
```
停止:
```bash
ros2 topic pub --once /hric/robot/cmd_vel geometry_msgs/msg/TwistStamped \
"{header: {frame_id: pelvis}, twist: {linear: {x: 0.0, y: 0.0, z: 0.0}, angular: {x: 0.0, y: 0.0, z: 0.0}}}"
```
## 启动 OmniSocket Hub
OmniSocket UDP:
```bash
./bin/udpserver -listen :9001
```
OmniSocket KCP:
```bash
./bin/kcpserver -listen :9002 -telemetry-peer peer-a-telemetry
```
`server_addr` 不传时,节点会按 `transport` 自动选择默认值:
- `udp` -> `127.0.0.1:9001`
- `kcp` -> `127.0.0.1:9002`
`relay_via` 只在 `transport:=kcp` 时生效。
## 机器人端运行
UDP:
```bash
ros2 launch udp_teleop_bridge robot_udp_receiver.launch.py \
transport:=udp \
server_addr:=127.0.0.1:9001 \
peer_id:=ros-bridge-ctrl \
output_topic:=/hric/robot/cmd_vel \
frame_id:=pelvis \
watchdog_timeout:=0.5
```
KCP:
```bash
ros2 launch udp_teleop_bridge robot_udp_receiver.launch.py \
transport:=kcp \
server_addr:=127.0.0.1:9002 \
peer_id:=ros-bridge-ctrl \
output_topic:=/hric/robot/cmd_vel \
frame_id:=pelvis \
watchdog_timeout:=0.5
```
如果只允许某个 sender 控制,可以加:
```bash
expected_sender:=ros-keyboard-ctrl
```
Local daemon handoff via Unix datagram:
```bash
ros2 launch udp_teleop_bridge robot_udp_receiver.launch.py \
transport:=unix_dgram \
local_socket_path:=/tmp/omnisocket-b-side-cmd.sock \
output_topic:=/hric/robot/cmd_vel \
frame_id:=pelvis \
watchdog_timeout:=0.5
```
## 控制端键盘运行
终端 A启动 sender:
```bash
ros2 launch udp_teleop_bridge keyboard_sender.launch.py \
transport:=udp \
server_addr:=127.0.0.1:9001 \
peer_id:=ros-keyboard-ctrl \
target_peer:=ros-bridge-ctrl
```
如果走 KCP:
```bash
ros2 launch udp_teleop_bridge keyboard_sender.launch.py \
transport:=kcp \
server_addr:=127.0.0.1:9002 \
peer_id:=ros-keyboard-ctrl \
target_peer:=ros-bridge-ctrl
```
终端 B启动官方键盘 teleop:
```bash
ros2 run teleop_twist_keyboard teleop_twist_keyboard --ros-args \
--remap cmd_vel:=/teleop/cmd_vel \
-p stamped:=true \
-p frame_id:=pelvis \
-p speed:=0.20 \
-p turn:=0.60
```
键盘默认键位(`teleop_twist_keyboard`,建议使用 US 键盘布局):
- `i`: 前进(`linear.x > 0`
- `,`: 后退(`linear.x < 0`
- `j`: 左转(`angular.z > 0`
- `l`: 右转(`angular.z < 0`
- `Shift + J`: 左平移(`linear.y > 0`
- `Shift + L`: 右平移(`linear.y < 0`
- `u` / `o` / `m` / `.`: 组合前进或后退加转向
- `k` 或其他未映射按键: 停止
- `q` / `z`: 整体速度增加 / 降低 10%
- `w` / `x`: 仅线速度增加 / 降低 10%
- `e` / `c`: 仅角速度增加 / 降低 10%
- `Ctrl-C`: 退出键盘 teleop
## 控制端 Xbox 手柄运行
UDP:
```bash
ros2 launch udp_teleop_bridge xbox_to_udp.launch.py \
transport:=udp \
server_addr:=127.0.0.1:9001 \
peer_id:=ros-gamepad-ctrl \
target_peer:=ros-bridge-ctrl \
joy_dev:=/dev/input/js0 \
frame_id:=pelvis
```
KCP:
```bash
ros2 launch udp_teleop_bridge xbox_to_udp.launch.py \
transport:=kcp \
server_addr:=127.0.0.1:9002 \
peer_id:=ros-gamepad-ctrl \
target_peer:=ros-bridge-ctrl \
joy_dev:=/dev/input/js0 \
frame_id:=pelvis
```
当前默认手柄映射:
- 左摇杆上下 -> `linear.x`
- 左摇杆左右 -> `linear.y`
- 右摇杆左右 -> `angular.z`
- `RB` 按住才允许运动
- `LB` 为 turbo
手柄实际操控含义(基于 `config/xbox_twist_joy.yaml` 的 Xbox 默认映射):
- 左摇杆向前 / 向后: 前进 / 后退
- 左摇杆向左 / 向右: 左平移 / 右平移
- 右摇杆向左 / 向右: 左转 / 右转
- 按住 `RB`: 以常速启用运动输出
- 同时按住 `LB` + `RB`: 启用 turbo更高的线速度和角速度
- 松开 `RB` 或将摇杆回中: 输出回到零速
## 数据流
键盘链路:
```text
teleop_twist_keyboard -> /teleop/cmd_vel (TwistStamped) -> cmd_vel_udp_sender -> OmniSocket UDP/KCP -> udp_cmd_vel_receiver -> /hric/robot/cmd_vel
```
手柄链路:
```text
joy_node -> teleop_twist_joy -> /teleop/cmd_vel (TwistStamped) -> cmd_vel_udp_sender -> OmniSocket UDP/KCP -> udp_cmd_vel_receiver -> /hric/robot/cmd_vel
```
## 安全行为
- sender 默认按 20 Hz 重发最新命令
- sender 输入超时后会改发零速
- sender 退出时会主动发送数个零速控制包
- receiver 超时后会在 ROS 主线程发布零速 stop
- receiver 只接受 `MSG_TYPE_BINARY` 且长度为 24 字节的负载
- 非预期 sender、非 binary 消息、错误长度消息都会被丢弃并记录日志

View File

@@ -0,0 +1,153 @@
## ROS2 Teleop over OmniSocket UDP/KCP
这个文档对应 `ros-control-py/udp_teleop_bridge` 的当前实现。
核心变化:
- `transport:=udp` 现在表示 OmniSocket UDP
- `transport:=kcp` 表示 OmniSocket KCP
- 不再使用原来的裸 `socket` UDP 实现
控制接口保持不变:
- topic: `/hric/robot/cmd_vel`
- type: `geometry_msgs/msg/TwistStamped`
- frame_id: `pelvis`
- payload: fixed 24-byte little-endian `<6f>`
负载顺序:
`lx, ly, lz, ax, ay, az`
### 构建顺序
```bash
make python-ext
make python-install
```
```bash
colcon build --packages-select udp_teleop_bridge
source install/setup.bash
```
### 启动 Hub
OmniSocket UDP:
```bash
./bin/udpserver -listen :9001
```
OmniSocket KCP:
```bash
./bin/kcpserver -listen :9002 -telemetry-peer peer-a-telemetry
```
### 机器人端 Receiver
```bash
ros2 launch udp_teleop_bridge robot_udp_receiver.launch.py \
transport:=udp \
server_addr:=127.0.0.1:9001 \
peer_id:=ros-bridge-ctrl \
output_topic:=/hric/robot/cmd_vel \
frame_id:=pelvis \
watchdog_timeout:=0.5
```
KCP 只需把 `transport``server_addr` 改成:
```bash
transport:=kcp server_addr:=127.0.0.1:9002
```
如果控制命令来自本机 `b_side_omnid`,可以改为:
```bash
transport:=unix_dgram local_socket_path:=/tmp/omnisocket-b-side-cmd.sock
```
只接受指定 sender:
```bash
expected_sender:=ros-keyboard-ctrl
```
### 键盘 Sender
```bash
ros2 launch udp_teleop_bridge keyboard_sender.launch.py \
transport:=udp \
server_addr:=127.0.0.1:9001 \
peer_id:=ros-keyboard-ctrl \
target_peer:=ros-bridge-ctrl
```
```bash
ros2 run teleop_twist_keyboard teleop_twist_keyboard --ros-args \
--remap cmd_vel:=/teleop/cmd_vel \
-p stamped:=true \
-p frame_id:=pelvis \
-p speed:=0.20 \
-p turn:=0.60
```
### Xbox Sender
```bash
ros2 launch udp_teleop_bridge xbox_to_udp.launch.py \
transport:=udp \
server_addr:=127.0.0.1:9001 \
peer_id:=ros-gamepad-ctrl \
target_peer:=ros-bridge-ctrl \
joy_dev:=/dev/input/js0 \
frame_id:=pelvis
```
### 参数语义
- sender:
- `transport`
- `server_addr`
- `relay_via`
- `peer_id`
- `target_peer`
- `input_topic`
- `send_rate_hz`
- `input_timeout`
- receiver:
- `transport`
- `server_addr`
- `relay_via`
- `peer_id`
- `expected_sender`
- `output_topic`
- `frame_id`
- `watchdog_timeout`
- `publish_rate_hz`
`server_addr` 省略时,会按 transport 自动选择:
- `udp` -> `127.0.0.1:9001`
- `kcp` -> `127.0.0.1:9002`
### 数据流
```text
teleop_twist_keyboard / teleop_twist_joy
-> /teleop/cmd_vel (TwistStamped)
-> cmd_vel_udp_sender
-> OmniSocket UDP/KCP binary message
-> udp_cmd_vel_receiver
-> /hric/robot/cmd_vel
```
### 安全与约束
- sender 默认 20 Hz 重发
- sender 输入超时后改发零速
- receiver watchdog 超时后发零速 stop
- receiver 只接受 24 字节 binary 负载
- `relay_via` 只在 KCP 模式有效

View File

@@ -0,0 +1,32 @@
/**:
ros__parameters:
require_enable_button: true
enable_button: 5
enable_turbo_button: 4
axis_linear:
x: 1
y: 0
z: -1
scale_linear:
x: -0.30
y: -0.25
z: 0.0
scale_linear_turbo:
x: -0.60
y: -0.45
z: 0.0
axis_angular:
yaw: 3
pitch: -1
roll: -1
scale_angular:
yaw: -0.80
pitch: 0.0
roll: 0.0
scale_angular_turbo:
yaw: -1.20
pitch: 0.0
roll: 0.0
inverted_reverse: false
publish_stamped_twist: true
frame: pelvis

View File

@@ -0,0 +1,34 @@
from launch import LaunchDescription
from launch.actions import DeclareLaunchArgument
from launch.substitutions import LaunchConfiguration
from launch_ros.actions import Node
from launch_ros.parameter_descriptions import ParameterValue
def generate_launch_description() -> LaunchDescription:
return LaunchDescription([
DeclareLaunchArgument('transport', default_value='udp'),
DeclareLaunchArgument('server_addr', default_value=''),
DeclareLaunchArgument('relay_via', default_value=''),
DeclareLaunchArgument('peer_id', default_value='ros-keyboard-ctrl'),
DeclareLaunchArgument('target_peer', default_value='ros-bridge-ctrl'),
DeclareLaunchArgument('input_topic', default_value='/teleop/cmd_vel'),
DeclareLaunchArgument('send_rate_hz', default_value='20.0'),
DeclareLaunchArgument('input_timeout', default_value='0.75'),
Node(
package='udp_teleop_bridge',
executable='cmd_vel_udp_sender',
name='cmd_vel_udp_sender',
output='screen',
parameters=[{
'transport': LaunchConfiguration('transport'),
'server_addr': LaunchConfiguration('server_addr'),
'relay_via': LaunchConfiguration('relay_via'),
'peer_id': LaunchConfiguration('peer_id'),
'target_peer': LaunchConfiguration('target_peer'),
'input_topic': LaunchConfiguration('input_topic'),
'send_rate_hz': ParameterValue(LaunchConfiguration('send_rate_hz'), value_type=float),
'input_timeout': ParameterValue(LaunchConfiguration('input_timeout'), value_type=float),
}],
),
])

View File

@@ -0,0 +1,38 @@
from launch import LaunchDescription
from launch.actions import DeclareLaunchArgument
from launch.substitutions import LaunchConfiguration
from launch_ros.actions import Node
from launch_ros.parameter_descriptions import ParameterValue
def generate_launch_description() -> LaunchDescription:
return LaunchDescription([
DeclareLaunchArgument('transport', default_value='udp'),
DeclareLaunchArgument('server_addr', default_value=''),
DeclareLaunchArgument('relay_via', default_value=''),
DeclareLaunchArgument('peer_id', default_value='ros-bridge-ctrl'),
DeclareLaunchArgument('expected_sender', default_value=''),
DeclareLaunchArgument('local_socket_path', default_value='/tmp/omnisocket-b-side-cmd.sock'),
DeclareLaunchArgument('output_topic', default_value='/hric/robot/cmd_vel'),
DeclareLaunchArgument('frame_id', default_value='pelvis'),
DeclareLaunchArgument('watchdog_timeout', default_value='0.5'),
DeclareLaunchArgument('publish_rate_hz', default_value='100.0'),
Node(
package='udp_teleop_bridge',
executable='udp_cmd_vel_receiver',
name='udp_cmd_vel_receiver',
output='screen',
parameters=[{
'transport': LaunchConfiguration('transport'),
'server_addr': LaunchConfiguration('server_addr'),
'relay_via': LaunchConfiguration('relay_via'),
'peer_id': LaunchConfiguration('peer_id'),
'expected_sender': LaunchConfiguration('expected_sender'),
'local_socket_path': LaunchConfiguration('local_socket_path'),
'output_topic': LaunchConfiguration('output_topic'),
'frame_id': LaunchConfiguration('frame_id'),
'watchdog_timeout': ParameterValue(LaunchConfiguration('watchdog_timeout'), value_type=float),
'publish_rate_hz': ParameterValue(LaunchConfiguration('publish_rate_hz'), value_type=float),
}],
),
])

View File

@@ -0,0 +1,74 @@
from launch import LaunchDescription
from launch.actions import DeclareLaunchArgument
from launch.substitutions import LaunchConfiguration, PathJoinSubstitution
from launch_ros.actions import Node
from launch_ros.parameter_descriptions import ParameterValue
from launch_ros.substitutions import FindPackageShare
def generate_launch_description() -> LaunchDescription:
teleop_config = PathJoinSubstitution([
FindPackageShare('udp_teleop_bridge'),
'config',
'xbox_twist_joy.yaml',
])
teleop_topic = LaunchConfiguration('teleop_topic')
return LaunchDescription([
DeclareLaunchArgument('transport', default_value='udp'),
DeclareLaunchArgument('server_addr', default_value=''),
DeclareLaunchArgument('relay_via', default_value=''),
DeclareLaunchArgument('peer_id', default_value='ros-gamepad-ctrl'),
DeclareLaunchArgument('target_peer', default_value='ros-bridge-ctrl'),
DeclareLaunchArgument('joy_dev', default_value='/dev/input/js0'),
DeclareLaunchArgument('deadzone', default_value='0.10'),
DeclareLaunchArgument('autorepeat_rate', default_value='20.0'),
DeclareLaunchArgument('frame_id', default_value='pelvis'),
DeclareLaunchArgument('teleop_topic', default_value='/teleop/cmd_vel'),
DeclareLaunchArgument('send_rate_hz', default_value='20.0'),
DeclareLaunchArgument('input_timeout', default_value='0.30'),
Node(
package='joy',
executable='joy_node',
name='joy_node',
output='screen',
parameters=[{
'dev': LaunchConfiguration('joy_dev'),
'deadzone': ParameterValue(LaunchConfiguration('deadzone'), value_type=float),
'autorepeat_rate': ParameterValue(LaunchConfiguration('autorepeat_rate'), value_type=float),
}],
),
Node(
package='teleop_twist_joy',
executable='teleop_node',
name='teleop_twist_joy',
output='screen',
parameters=[
teleop_config,
{
'publish_stamped_twist': True,
'frame': LaunchConfiguration('frame_id'),
},
],
remappings=[
('cmd_vel', teleop_topic),
],
),
Node(
package='udp_teleop_bridge',
executable='cmd_vel_udp_sender',
name='cmd_vel_udp_sender',
output='screen',
parameters=[{
'transport': LaunchConfiguration('transport'),
'server_addr': LaunchConfiguration('server_addr'),
'relay_via': LaunchConfiguration('relay_via'),
'peer_id': LaunchConfiguration('peer_id'),
'target_peer': LaunchConfiguration('target_peer'),
'input_topic': teleop_topic,
'send_rate_hz': ParameterValue(LaunchConfiguration('send_rate_hz'), value_type=float),
'input_timeout': ParameterValue(LaunchConfiguration('input_timeout'), value_type=float),
}],
),
])

View File

@@ -0,0 +1,25 @@
<?xml version="1.0"?>
<package format="3">
<name>udp_teleop_bridge</name>
<version>0.1.0</version>
<description>ROS 2 OmniSocket UDP/KCP bridge for teleop TwistStamped commands.</description>
<maintainer email="codex@example.com">Codex</maintainer>
<license>MIT</license>
<buildtool_depend>ament_python</buildtool_depend>
<exec_depend>ament_index_python</exec_depend>
<exec_depend>geometry_msgs</exec_depend>
<exec_depend>joy</exec_depend>
<exec_depend>launch</exec_depend>
<exec_depend>launch_ros</exec_depend>
<exec_depend>rclpy</exec_depend>
<exec_depend>rosidl_runtime_py</exec_depend>
<exec_depend>teleop_twist_joy</exec_depend>
<exec_depend>teleop_twist_keyboard</exec_depend>
<export>
<build_type>ament_python</build_type>
</export>
</package>

View File

@@ -0,0 +1 @@
udp_teleop_bridge

View File

@@ -0,0 +1,5 @@
[develop]
script_dir=$base/lib/udp_teleop_bridge
[install]
install_scripts=$base/lib/udp_teleop_bridge

View File

@@ -0,0 +1,34 @@
from setuptools import find_packages, setup
package_name = 'udp_teleop_bridge'
setup(
name=package_name,
version='0.1.0',
packages=find_packages(exclude=['test']),
data_files=[
('share/ament_index/resource_index/packages', [f'resource/{package_name}']),
(f'share/{package_name}', ['package.xml']),
(f'share/{package_name}/launch', [
'launch/keyboard_sender.launch.py',
'launch/robot_udp_receiver.launch.py',
'launch/xbox_to_udp.launch.py',
]),
(f'share/{package_name}/config', ['config/xbox_twist_joy.yaml']),
],
install_requires=['setuptools'],
zip_safe=True,
maintainer='Codex',
maintainer_email='codex@example.com',
description='ROS 2 OmniSocket UDP/KCP bridge for teleop TwistStamped commands.',
license='MIT',
entry_points={
'console_scripts': [
'cmd_vel_udp_sender = udp_teleop_bridge.cmd_vel_udp_sender:main',
'udp_cmd_vel_receiver = udp_teleop_bridge.udp_cmd_vel_receiver:main',
'topic_status_reader = udp_teleop_bridge.topic_status_reader:main',
],
},
)

View File

@@ -0,0 +1,54 @@
from pathlib import Path
import sys
import pytest
ROOT = Path(__file__).resolve().parents[1]
if str(ROOT) not in sys.path:
sys.path.insert(0, str(ROOT))
from udp_teleop_bridge.protocol import ( # noqa: E402
PACKET_SIZE,
default_server_addr_for_transport,
normalize_command,
normalize_transport,
pack_command,
unpack_command,
)
def test_pack_unpack_round_trip() -> None:
command = (0.1, -0.2, 0.3, -0.4, 0.5, -0.6)
payload = pack_command(command)
assert len(payload) == PACKET_SIZE
assert unpack_command(payload) == pytest.approx(command)
@pytest.mark.parametrize('value', [float('nan'), float('inf'), float('-inf')])
def test_normalize_command_rejects_non_finite_values(value: float) -> None:
with pytest.raises(ValueError, match='non-finite'):
normalize_command((0.0, 0.0, value, 0.0, 0.0, 0.0))
def test_unpack_command_rejects_wrong_length() -> None:
with pytest.raises(ValueError, match='Expected'):
unpack_command(b'\x00' * (PACKET_SIZE - 1))
@pytest.mark.parametrize(
('transport', 'expected'),
[
('udp', '127.0.0.1:9001'),
('kcp', '127.0.0.1:9002'),
],
)
def test_default_server_addr_for_transport(transport: str, expected: str) -> None:
assert default_server_addr_for_transport(transport) == expected
def test_normalize_transport_rejects_unknown_value() -> None:
with pytest.raises(ValueError, match='Unsupported transport'):
normalize_transport('sctp')

View File

@@ -0,0 +1 @@
"""OmniSocket teleop bridge package."""

View File

@@ -0,0 +1,207 @@
"""ROS 2 node that forwards TwistStamped teleop commands over OmniSocket."""
from __future__ import annotations
import threading
import time
from typing import Dict, Optional, Tuple
import rclpy
from geometry_msgs.msg import TwistStamped
from rclpy.node import Node
from .omni_transport import MSG_TYPE_ERROR, OmniTransport
from .protocol import (
DEFAULT_EXIT_ZERO_PACKETS,
DEFAULT_INPUT_TIMEOUT,
DEFAULT_INPUT_TOPIC,
DEFAULT_KEYBOARD_PEER_ID,
DEFAULT_QUEUE_DEPTH,
DEFAULT_SEND_RATE_HZ,
DEFAULT_TARGET_PEER,
DEFAULT_TRANSPORT,
ZERO_COMMAND,
pack_command,
)
CommandTuple = Tuple[float, float, float, float, float, float]
class CmdVelUdpSender(Node):
"""Forward TwistStamped messages to a remote OmniSocket peer."""
def __init__(self) -> None:
super().__init__('cmd_vel_udp_sender')
self.declare_parameter('transport', DEFAULT_TRANSPORT)
self.declare_parameter('server_addr', '')
self.declare_parameter('relay_via', '')
self.declare_parameter('peer_id', DEFAULT_KEYBOARD_PEER_ID)
self.declare_parameter('target_peer', DEFAULT_TARGET_PEER)
self.declare_parameter('input_topic', DEFAULT_INPUT_TOPIC)
self.declare_parameter('send_rate_hz', DEFAULT_SEND_RATE_HZ)
self.declare_parameter('input_timeout', DEFAULT_INPUT_TIMEOUT)
self.declare_parameter('queue_depth', DEFAULT_QUEUE_DEPTH)
self.declare_parameter('exit_zero_packets', DEFAULT_EXIT_ZERO_PACKETS)
self._transport_name = str(self.get_parameter('transport').value)
self._server_addr = str(self.get_parameter('server_addr').value)
self._relay_via = str(self.get_parameter('relay_via').value)
self._peer_id = str(self.get_parameter('peer_id').value)
self._target_peer = str(self.get_parameter('target_peer').value).strip()
self._input_topic = str(self.get_parameter('input_topic').value)
self._send_rate_hz = float(self.get_parameter('send_rate_hz').value)
self._input_timeout = float(self.get_parameter('input_timeout').value)
self._queue_depth = int(self.get_parameter('queue_depth').value)
self._exit_zero_packets = int(self.get_parameter('exit_zero_packets').value)
if self._send_rate_hz <= 0.0:
raise ValueError('send_rate_hz must be > 0')
if self._input_timeout < 0.0:
raise ValueError('input_timeout must be >= 0')
if self._queue_depth <= 0:
raise ValueError('queue_depth must be > 0')
if not self._target_peer:
raise ValueError('target_peer must not be empty')
self._transport = OmniTransport(
transport=self._transport_name,
server_addr=self._server_addr,
relay_via=self._relay_via,
peer_id=self._peer_id,
)
self._last_log_times: Dict[str, float] = {}
self._latest_command: CommandTuple = ZERO_COMMAND
self._last_input_monotonic: Optional[float] = None
self._last_sent_command: Optional[CommandTuple] = None
self._closing = threading.Event()
self.create_subscription(
TwistStamped,
self._input_topic,
self._handle_twist,
self._queue_depth,
)
self.create_timer(1.0 / self._send_rate_hz, self._send_latest_command)
self._drain_thread = threading.Thread(target=self._drain_incoming, daemon=True)
self._drain_thread.start()
self.get_logger().info(
'Forwarding TwistStamped from %s via %s://%s as %s -> %s at %.1f Hz '
'(input timeout %.2f s)'
% (
self._input_topic,
self._transport.transport,
self._transport.server_addr,
self._peer_id,
self._target_peer,
self._send_rate_hz,
self._input_timeout,
)
)
def _should_log(self, key: str, throttle_sec: float) -> bool:
now = time.monotonic()
previous = self._last_log_times.get(key)
if previous is None or (now - previous) >= throttle_sec:
self._last_log_times[key] = now
return True
return False
def _handle_twist(self, msg: TwistStamped) -> None:
self._latest_command = (
float(msg.twist.linear.x),
float(msg.twist.linear.y),
float(msg.twist.linear.z),
float(msg.twist.angular.x),
float(msg.twist.angular.y),
float(msg.twist.angular.z),
)
self._last_input_monotonic = time.monotonic()
def _command_for_current_tick(self) -> CommandTuple:
if self._last_input_monotonic is None:
return ZERO_COMMAND
if self._input_timeout == 0.0:
return self._latest_command
age = time.monotonic() - self._last_input_monotonic
if age > self._input_timeout:
return ZERO_COMMAND
return self._latest_command
def _send_command(self, command: CommandTuple) -> None:
payload = pack_command(command)
try:
self._transport.send(to=self._target_peer, data=payload)
self._last_sent_command = command
except OSError as exc:
if self._should_log('send_error', 2.0):
self.get_logger().error(f'OmniSocket send failed: {exc}')
def _send_latest_command(self) -> None:
self._send_command(self._command_for_current_tick())
def _log_inbound_message(self, from_peer: str, msg_type: int, payload: bytes) -> None:
if msg_type == MSG_TYPE_ERROR:
if self._should_log('server_error', 1.0):
text = payload.decode('utf-8', errors='replace')
self.get_logger().error(f'OmniSocket server error from {from_peer}: {text}')
return
if self._should_log('unexpected_inbound', 2.0):
self.get_logger().warning(
'Ignoring unexpected inbound message type %d from %s (%d bytes)'
% (msg_type, from_peer, len(payload))
)
def _drain_incoming(self) -> None:
while not self._closing.is_set() and rclpy.ok():
try:
result = self._transport.recv(timeout_ms=100)
except OSError as exc:
if not self._closing.is_set() and self._should_log('drain_error', 2.0):
self.get_logger().error(f'OmniSocket receive loop stopped: {exc}')
return
if result is None:
continue
from_peer, msg_type, payload = result
self._log_inbound_message(from_peer, msg_type, payload)
def send_zero_burst(self) -> None:
"""Best-effort stop command sent during shutdown."""
for _ in range(max(1, self._exit_zero_packets)):
self._send_command(ZERO_COMMAND)
time.sleep(0.02)
def close(self) -> None:
self._closing.set()
if hasattr(self, '_transport') and self._transport is not None:
try:
self._transport.close()
except OSError as exc:
if self._should_log('close_error', 2.0):
self.get_logger().warning(f'Closing OmniSocket transport failed: {exc}')
self._transport = None
if hasattr(self, '_drain_thread') and self._drain_thread.is_alive():
self._drain_thread.join(timeout=0.5)
def destroy_node(self) -> bool:
self.close()
return super().destroy_node()
def main(args: Optional[list[str]] = None) -> None:
rclpy.init(args=args)
node = CmdVelUdpSender()
try:
rclpy.spin(node)
except KeyboardInterrupt:
pass
finally:
node.send_zero_burst()
node.destroy_node()
rclpy.shutdown()

View File

@@ -0,0 +1,101 @@
"""Helpers for working with OmniSocket transport sessions."""
from __future__ import annotations
from .protocol import default_server_addr_for_transport, normalize_transport
try:
from omnisocket import (
CONTROL_DEFAULTS,
MSG_TYPE_BINARY,
MSG_TYPE_ERROR,
Session,
UdpSession,
)
except ImportError as exc: # pragma: no cover - depends on external build/install
raise RuntimeError(
'omnisocket is not installed for this Python environment; run '
'`make python-ext && make python-install` on a Linux host first'
) from exc
def _normalize_optional(value: object) -> str:
return str(value).strip()
class OmniTransport:
"""Small wrapper that normalizes OmniSocket UDP/KCP session setup."""
def __init__(
self,
*,
transport: object,
server_addr: object,
peer_id: object,
relay_via: object = '',
bind_ip: object = '',
bind_device: object = '',
enable_timestamping: bool = False,
) -> None:
self.transport = normalize_transport(transport)
self.server_addr = _normalize_optional(server_addr) or default_server_addr_for_transport(self.transport)
self.peer_id = _normalize_optional(peer_id)
self.relay_via = _normalize_optional(relay_via)
self.bind_ip = _normalize_optional(bind_ip)
self.bind_device = _normalize_optional(bind_device)
if not self.peer_id:
raise ValueError('peer_id must not be empty')
session_cls = Session if self.transport == 'kcp' else UdpSession
self._session = session_cls()
connect_kwargs: dict[str, object] = {
'server_addr': self.server_addr,
'peer_id': self.peer_id,
}
if self.bind_ip:
connect_kwargs['bind_ip'] = self.bind_ip
if self.bind_device:
connect_kwargs['bind_device'] = self.bind_device
if self.transport == 'kcp':
if self.relay_via:
connect_kwargs['relay_via'] = self.relay_via
connect_kwargs.update(CONTROL_DEFAULTS)
else:
connect_kwargs['enable_timestamping'] = bool(enable_timestamping)
self._session.connect(**connect_kwargs)
def send(self, *, to: str, data: bytes) -> None:
self._session.send(to=to, data=data)
def send_with_id(self, *, to: str, data: bytes) -> int:
if not hasattr(self._session, 'send_with_id'):
self._session.send(to=to, data=data)
raise RuntimeError('send_with_id is not available on this omnisocket build')
return int(self._session.send_with_id(to=to, data=data))
def recv(self, *, timeout_ms: int = -1):
return self._session.recv(timeout_ms=timeout_ms)
def recv_into(self, *, buffer, timeout_ms: int = -1):
return self._session.recv_into(buffer=buffer, timeout_ms=timeout_ms)
def close(self) -> None:
self._session.close()
def stats(self) -> dict[str, int]:
return self._session.stats()
__all__ = [
'CONTROL_DEFAULTS',
'MSG_TYPE_BINARY',
'MSG_TYPE_ERROR',
'OmniTransport',
'Session',
'UdpSession',
]

View File

@@ -0,0 +1,74 @@
"""Shared teleop protocol helpers and transport defaults."""
from __future__ import annotations
import math
import struct
from typing import Iterable, Tuple
COMMAND_STRUCT = struct.Struct('<6f')
PACKET_SIZE = COMMAND_STRUCT.size
SUPPORTED_TRANSPORTS = ('udp', 'kcp')
DEFAULT_TRANSPORT = 'udp'
DEFAULT_OMNI_UDP_SERVER_ADDR = '127.0.0.1:9001'
DEFAULT_OMNI_KCP_SERVER_ADDR = '127.0.0.1:9002'
DEFAULT_KEYBOARD_PEER_ID = 'ros-keyboard-ctrl'
DEFAULT_GAMEPAD_PEER_ID = 'ros-gamepad-ctrl'
DEFAULT_BRIDGE_PEER_ID = 'ros-bridge-ctrl'
DEFAULT_TARGET_PEER = DEFAULT_BRIDGE_PEER_ID
DEFAULT_FRAME_ID = 'pelvis'
DEFAULT_INPUT_TOPIC = '/teleop/cmd_vel'
DEFAULT_OUTPUT_TOPIC = '/hric/robot/cmd_vel'
DEFAULT_SEND_RATE_HZ = 20.0
DEFAULT_INPUT_TIMEOUT = 0.75
DEFAULT_WATCHDOG_TIMEOUT = 0.5
DEFAULT_PUBLISH_RATE_HZ = 100.0
DEFAULT_QUEUE_DEPTH = 10
DEFAULT_EXIT_ZERO_PACKETS = 3
DEFAULT_RECV_BUFFER_BYTES = 2048
ZERO_COMMAND = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
def normalize_transport(value: object) -> str:
"""Return a supported transport name."""
transport = str(value).strip().lower()
if transport not in SUPPORTED_TRANSPORTS:
supported = ', '.join(SUPPORTED_TRANSPORTS)
raise ValueError(f"Unsupported transport '{transport}', expected one of: {supported}")
return transport
def default_server_addr_for_transport(transport: str) -> str:
"""Return the default OmniSocket server for the chosen transport."""
transport = normalize_transport(transport)
if transport == 'udp':
return DEFAULT_OMNI_UDP_SERVER_ADDR
return DEFAULT_OMNI_KCP_SERVER_ADDR
def normalize_command(values: Iterable[float]) -> Tuple[float, float, float, float, float, float]:
"""Return a finite six-float command tuple."""
command = tuple(float(value) for value in values)
if len(command) != 6:
raise ValueError(f'Expected 6 command values, got {len(command)}')
if any(not math.isfinite(value) for value in command):
raise ValueError('Command contains a non-finite value')
return command
def pack_command(values: Iterable[float]) -> bytes:
"""Pack six floats into the wire format."""
return COMMAND_STRUCT.pack(*normalize_command(values))
def unpack_command(payload: bytes) -> Tuple[float, float, float, float, float, float]:
"""Decode a control packet into a six-float command tuple."""
if len(payload) != PACKET_SIZE:
raise ValueError(f'Expected {PACKET_SIZE} bytes, got {len(payload)}')
return normalize_command(COMMAND_STRUCT.unpack(payload))

View File

@@ -0,0 +1,122 @@
"""Subscribe to a ROS 2 topic with runtime type discovery and print messages."""
from __future__ import annotations
import json
import time
from typing import Any
import rclpy
from rclpy.node import Node
from rosidl_runtime_py.convert import message_to_ordereddict
from rosidl_runtime_py.utilities import get_message
WAIT_LOG_INTERVAL_SEC = 5.0
class TopicStatusReader(Node):
"""Wait for a topic to appear, subscribe to it, and print each message."""
def __init__(self) -> None:
super().__init__('topic_status_reader')
self.declare_parameter('topic', '/hric/robot/cmd_vel_status')
self.declare_parameter('qos_depth', 10)
self.declare_parameter('poll_interval_sec', 0.5)
self._topic = str(self.get_parameter('topic').value).strip()
self._qos_depth = int(self.get_parameter('qos_depth').value)
self._poll_interval_sec = float(self.get_parameter('poll_interval_sec').value)
if not self._topic:
raise ValueError('topic must not be empty')
if self._qos_depth <= 0:
raise ValueError('qos_depth must be > 0')
if self._poll_interval_sec <= 0.0:
raise ValueError('poll_interval_sec must be > 0')
self._topic_type: str | None = None
self._subscription = None
self._message_count = 0
self._last_wait_log_monotonic = 0.0
self._poll_timer = self.create_timer(self._poll_interval_sec, self._ensure_subscription)
self._ensure_subscription()
def _discover_topic_types(self) -> list[str]:
for topic_name, topic_types in self.get_topic_names_and_types():
if topic_name == self._topic:
return list(topic_types)
return []
def _log_waiting(self) -> None:
now = time.monotonic()
if (now - self._last_wait_log_monotonic) < WAIT_LOG_INTERVAL_SEC:
return
self._last_wait_log_monotonic = now
self.get_logger().info(f'Waiting for topic {self._topic} to appear...')
def _ensure_subscription(self) -> None:
if self._subscription is not None:
return
topic_types = self._discover_topic_types()
if not topic_types:
self._log_waiting()
return
if len(topic_types) > 1:
joined = ', '.join(topic_types)
self.get_logger().warning(
f'Topic {self._topic} reports multiple types ({joined}); using {topic_types[0]}'
)
self._topic_type = topic_types[0]
try:
message_type = get_message(self._topic_type)
except Exception as exc:
self.get_logger().error(
f'Failed to import message type {self._topic_type} for {self._topic}: {exc}'
)
return
self._subscription = self.create_subscription(
message_type,
self._topic,
self._handle_message,
self._qos_depth,
)
self._poll_timer.cancel()
self.get_logger().info(
f'Subscribed to {self._topic} with type {self._topic_type} (qos_depth={self._qos_depth})'
)
def _format_message(self, msg: Any) -> str:
try:
payload = message_to_ordereddict(msg)
except Exception:
return str(msg)
return json.dumps(payload, ensure_ascii=False, indent=2)
def _handle_message(self, msg: Any) -> None:
self._message_count += 1
received_at = time.strftime('%Y-%m-%d %H:%M:%S')
topic_type = self._topic_type or type(msg).__name__
rendered = self._format_message(msg)
print(
f'[{received_at}] #{self._message_count} {self._topic} ({topic_type})\n{rendered}\n',
flush=True,
)
def main(args: list[str] | None = None) -> None:
rclpy.init(args=args)
node = TopicStatusReader()
try:
rclpy.spin(node)
except KeyboardInterrupt:
pass
finally:
node.destroy_node()
rclpy.shutdown()

View File

@@ -0,0 +1,480 @@
"""ROS 2 node that receives OmniSocket teleop packets and republishes TwistStamped."""
from __future__ import annotations
import json
import os
import socket
import threading
import time
from typing import Dict, Optional, Tuple
import rclpy
from geometry_msgs.msg import TwistStamped
from rclpy.node import Node
from .protocol import (
DEFAULT_BRIDGE_PEER_ID,
DEFAULT_FRAME_ID,
DEFAULT_OUTPUT_TOPIC,
DEFAULT_PUBLISH_RATE_HZ,
DEFAULT_QUEUE_DEPTH,
DEFAULT_RECV_BUFFER_BYTES,
DEFAULT_TRANSPORT,
DEFAULT_WATCHDOG_TIMEOUT,
PACKET_SIZE,
ZERO_COMMAND,
unpack_command,
)
CommandTuple = Tuple[float, float, float, float, float, float]
class UdpCmdVelReceiver(Node):
"""Publish TwistStamped commands from the OmniSocket control wire format."""
def __init__(self) -> None:
super().__init__('udp_cmd_vel_receiver')
self.declare_parameter('transport', DEFAULT_TRANSPORT)
self.declare_parameter('server_addr', '')
self.declare_parameter('relay_via', '')
self.declare_parameter('peer_id', DEFAULT_BRIDGE_PEER_ID)
self.declare_parameter('expected_sender', '')
self.declare_parameter('local_socket_path', '/tmp/omnisocket-b-side-cmd.sock')
self.declare_parameter('output_topic', DEFAULT_OUTPUT_TOPIC)
self.declare_parameter('frame_id', DEFAULT_FRAME_ID)
self.declare_parameter('watchdog_timeout', DEFAULT_WATCHDOG_TIMEOUT)
self.declare_parameter('publish_rate_hz', DEFAULT_PUBLISH_RATE_HZ)
self.declare_parameter('queue_depth', DEFAULT_QUEUE_DEPTH)
self._transport_name = str(self.get_parameter('transport').value)
self._server_addr = str(self.get_parameter('server_addr').value)
self._relay_via = str(self.get_parameter('relay_via').value)
self._peer_id = str(self.get_parameter('peer_id').value)
self._expected_sender = str(self.get_parameter('expected_sender').value).strip()
self._local_socket_path = str(self.get_parameter('local_socket_path').value).strip()
self._output_topic = str(self.get_parameter('output_topic').value)
self._frame_id = str(self.get_parameter('frame_id').value)
self._watchdog_timeout = float(self.get_parameter('watchdog_timeout').value)
self._publish_rate_hz = float(self.get_parameter('publish_rate_hz').value)
self._queue_depth = int(self.get_parameter('queue_depth').value)
if self._transport_name not in ('udp', 'kcp', 'unix_dgram'):
raise ValueError("transport must be one of: udp, kcp, unix_dgram")
if self._watchdog_timeout <= 0.0:
raise ValueError('watchdog_timeout must be > 0')
if self._publish_rate_hz <= 0.0:
raise ValueError('publish_rate_hz must be > 0')
if self._queue_depth <= 0:
raise ValueError('queue_depth must be > 0')
self._publisher = self.create_publisher(TwistStamped, self._output_topic, self._queue_depth)
self._transport = None
self._unix_socket: socket.socket | None = None
self._msg_type_binary = 0
self._msg_type_error = 0
if self._transport_name == 'unix_dgram':
self._setup_unix_socket()
else:
from .omni_transport import MSG_TYPE_BINARY, MSG_TYPE_ERROR, OmniTransport
self._msg_type_binary = MSG_TYPE_BINARY
self._msg_type_error = MSG_TYPE_ERROR
self._transport = self._create_transport()
self._lock = threading.Lock()
self._last_log_times: Dict[str, float] = {}
self._latest_command: CommandTuple = ZERO_COMMAND
self._last_packet_monotonic: Optional[float] = None
self._last_published_command: CommandTuple = ZERO_COMMAND
self._closing = threading.Event()
self._recv_buffer = bytearray(DEFAULT_RECV_BUFFER_BYTES)
self._runtime_dir = os.getenv('BLITZ_RUNTIME_DIR', '/run/blitz-robot').strip() or '/run/blitz-robot'
self._status_path = os.path.join(self._runtime_dir, 'ros-receiver.status.json')
self._transport_reconnect_count = 0
self._recv_thread_heartbeat_epoch_ms = self._now_epoch_ms()
self._runtime_last_error = ''
self.create_timer(1.0 / self._publish_rate_hz, self._publish_tick)
self.create_timer(1.0, self._write_status_tick)
recv_target = self._recv_loop_unix_dgram if self._transport_name == 'unix_dgram' else self._recv_loop
self._recv_thread = threading.Thread(target=recv_target, daemon=True)
self._recv_thread.start()
if self._transport_name == 'unix_dgram':
self.get_logger().info(
'Receiving teleop commands via unix_dgram://%s and publishing TwistStamped to %s '
'at %.1f Hz (frame_id=%s, watchdog %.2f s)'
% (
self._local_socket_path,
self._output_topic,
self._publish_rate_hz,
self._frame_id,
self._watchdog_timeout,
)
)
else:
assert self._transport is not None
self.get_logger().info(
'Receiving teleop commands via %s://%s as %s and publishing TwistStamped to %s '
'at %.1f Hz (frame_id=%s, watchdog %.2f s)'
% (
self._transport.transport,
self._transport.server_addr,
self._peer_id,
self._output_topic,
self._publish_rate_hz,
self._frame_id,
self._watchdog_timeout,
)
)
def _setup_unix_socket(self) -> None:
if not self._local_socket_path:
raise ValueError('local_socket_path must not be empty for unix_dgram transport')
socket_dir = os.path.dirname(self._local_socket_path)
if socket_dir:
os.makedirs(socket_dir, exist_ok=True)
if os.path.exists(self._local_socket_path):
self.get_logger().warning(
'Removing existing unix datagram socket path before bind: %s'
% self._local_socket_path
)
try:
os.unlink(self._local_socket_path)
except FileNotFoundError:
pass
self._unix_socket = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
self._unix_socket.bind(self._local_socket_path)
self._unix_socket.settimeout(0.1)
def _close_unix_socket(self) -> None:
if self._unix_socket is not None:
try:
self._unix_socket.close()
except OSError:
pass
self._unix_socket = None
def _create_transport(self):
from .omni_transport import OmniTransport
return OmniTransport(
transport=self._transport_name,
server_addr=self._server_addr,
relay_via=self._relay_via,
peer_id=self._peer_id,
)
def _reconnect_transport(self) -> bool:
while not self._closing.is_set() and rclpy.ok():
current_transport = self._transport
if current_transport is not None:
try:
current_transport.close()
except OSError:
pass
try:
self._transport = self._create_transport()
self._transport_reconnect_count += 1
self._set_runtime_last_error('')
if self._should_log('transport_reconnected', 1.0):
self.get_logger().info(
'Reconnected OmniSocket transport %s://%s as %s'
% (self._transport_name, self._server_addr, self._peer_id)
)
return True
except OSError as exc:
self._transport = None
self._set_runtime_last_error(str(exc))
if self._should_log('transport_reconnect_error', 2.0):
self.get_logger().error(f'Failed to reconnect OmniSocket transport: {exc}')
time.sleep(0.5)
return False
def _rebind_unix_socket(self) -> bool:
while not self._closing.is_set() and rclpy.ok():
self._close_unix_socket()
try:
self._setup_unix_socket()
self._transport_reconnect_count += 1
self._set_runtime_last_error('')
if self._should_log('unix_rebound', 1.0):
self.get_logger().info(f'Rebound unix datagram socket at {self._local_socket_path}')
return True
except OSError as exc:
self._set_runtime_last_error(str(exc))
if self._should_log('unix_rebind_error', 2.0):
self.get_logger().error(f'Failed to rebind unix datagram socket: {exc}')
time.sleep(0.5)
return False
def _should_log(self, key: str, throttle_sec: float) -> bool:
now = time.monotonic()
previous = self._last_log_times.get(key)
if previous is None or (now - previous) >= throttle_sec:
self._last_log_times[key] = now
return True
return False
def _now_epoch_ms(self) -> int:
return time.time_ns() // 1_000_000
def _update_recv_heartbeat(self) -> None:
with self._lock:
self._recv_thread_heartbeat_epoch_ms = self._now_epoch_ms()
def _last_packet_age_ms(self) -> int | None:
with self._lock:
last_packet_monotonic = self._last_packet_monotonic
if last_packet_monotonic is None:
return None
return max(0, int((time.monotonic() - last_packet_monotonic) * 1000.0))
def _socket_bound(self) -> bool:
if self._transport_name == 'unix_dgram':
return self._unix_socket is not None and os.path.exists(self._local_socket_path)
return self._transport is not None
def _set_runtime_last_error(self, message: str) -> None:
self._runtime_last_error = message
def _status_payload(self) -> dict[str, object]:
with self._lock:
recv_thread_heartbeat_epoch_ms = self._recv_thread_heartbeat_epoch_ms
return {
'updated_at_epoch_ms': self._now_epoch_ms(),
'pid': os.getpid(),
'recv_thread_heartbeat_epoch_ms': recv_thread_heartbeat_epoch_ms,
'transport': self._transport_name,
'local_socket_path': self._local_socket_path,
'socket_bound': self._socket_bound(),
'transport_reconnect_count': self._transport_reconnect_count,
'last_packet_age_ms': self._last_packet_age_ms(),
'last_error': self._runtime_last_error,
}
def _write_status_tick(self) -> None:
payload = self._status_payload()
if self._transport_name == 'unix_dgram':
if self._unix_socket is None:
payload['last_error'] = self._runtime_last_error or 'unix datagram socket is not bound'
else:
if self._transport is None:
payload['last_error'] = self._runtime_last_error or 'OmniSocket transport is not connected'
try:
os.makedirs(self._runtime_dir, exist_ok=True)
temp_path = f'{self._status_path}.tmp.{os.getpid()}'
with open(temp_path, 'w', encoding='utf-8') as handle:
json.dump(payload, handle, ensure_ascii=True, separators=(',', ':'))
os.replace(temp_path, self._status_path)
except OSError as exc:
if self._should_log('status_write_error', 5.0):
self.get_logger().warning(f'Failed to write receiver status file: {exc}')
def _publish_command(self, command: CommandTuple) -> None:
msg = TwistStamped()
msg.header.stamp = self.get_clock().now().to_msg()
msg.header.frame_id = self._frame_id
msg.twist.linear.x = command[0]
msg.twist.linear.y = command[1]
msg.twist.linear.z = command[2]
msg.twist.angular.x = command[3]
msg.twist.angular.y = command[4]
msg.twist.angular.z = command[5]
self._publisher.publish(msg)
self._last_published_command = command
def _handle_error_message(self, from_peer: str, body_len: int) -> None:
if self._should_log('server_error', 1.0):
text = bytes(self._recv_buffer[:body_len]).decode('utf-8', errors='replace')
self.get_logger().error(f'OmniSocket server error from {from_peer}: {text}')
def _recv_loop(self) -> None:
while not self._closing.is_set() and rclpy.ok():
self._update_recv_heartbeat()
try:
assert self._transport is not None
meta = self._transport.recv_into(buffer=self._recv_buffer, timeout_ms=100)
except BufferError as exc:
self._set_runtime_last_error(str(exc))
if self._should_log('buffer_error', 2.0):
self.get_logger().warning(f'Dropped oversized OmniSocket frame: {exc}')
continue
except OSError as exc:
self._set_runtime_last_error(str(exc))
if not self._closing.is_set() and self._should_log('recv_error', 2.0):
self.get_logger().error(f'OmniSocket receive loop stopped: {exc}')
if not self._reconnect_transport():
return
continue
self._update_recv_heartbeat()
if meta is None:
continue
self._set_runtime_last_error('')
from_peer = str(meta['from'])
msg_type = int(meta['msg_type'])
body_len = int(meta['body_len'])
if msg_type == self._msg_type_error:
self._set_runtime_last_error(f'server error message from {from_peer}')
self._handle_error_message(from_peer, body_len)
continue
if self._expected_sender and from_peer != self._expected_sender:
self._set_runtime_last_error(f'unexpected sender {from_peer}')
if self._should_log('unexpected_sender', 2.0):
self.get_logger().warning(
'Ignoring message from unexpected sender %s (expected %s)'
% (from_peer, self._expected_sender)
)
continue
if msg_type != self._msg_type_binary:
self._set_runtime_last_error(f'unexpected message type {msg_type}')
if self._should_log('unexpected_type', 2.0):
self.get_logger().warning(
'Ignoring unexpected message type %d from %s (%d bytes)'
% (msg_type, from_peer, body_len)
)
continue
if body_len != PACKET_SIZE:
self._set_runtime_last_error(f'invalid payload size {body_len}')
if self._should_log('packet_size', 2.0):
self.get_logger().warning(
'Dropped binary payload from %s with invalid size %d (expected %d)'
% (from_peer, body_len, PACKET_SIZE)
)
continue
try:
command = unpack_command(self._recv_buffer[:PACKET_SIZE])
except ValueError as exc:
self._set_runtime_last_error(str(exc))
if self._should_log('decode_error', 2.0):
self.get_logger().warning(f'Dropped malformed command payload: {exc}')
continue
with self._lock:
self._latest_command = command
self._last_packet_monotonic = time.monotonic()
self._set_runtime_last_error('')
def _recv_loop_unix_dgram(self) -> None:
assert self._unix_socket is not None
while not self._closing.is_set() and rclpy.ok():
self._update_recv_heartbeat()
try:
payload = self._unix_socket.recv(DEFAULT_RECV_BUFFER_BYTES)
except socket.timeout:
if not os.path.exists(self._local_socket_path):
self._set_runtime_last_error('unix datagram socket path disappeared')
if self._should_log('unix_socket_missing', 2.0):
self.get_logger().warning(
f'Unix datagram socket path disappeared, rebinding {self._local_socket_path}'
)
if not self._rebind_unix_socket():
return
continue
except OSError as exc:
self._set_runtime_last_error(str(exc))
if not self._closing.is_set() and self._should_log('unix_recv_error', 2.0):
self.get_logger().error(f'Unix datagram receive loop stopped: {exc}')
if not self._rebind_unix_socket():
return
continue
self._update_recv_heartbeat()
if len(payload) != PACKET_SIZE:
self._set_runtime_last_error(f'invalid unix datagram payload size {len(payload)}')
if self._should_log('unix_packet_size', 2.0):
self.get_logger().warning(
'Dropped unix datagram payload with invalid size %d (expected %d)'
% (len(payload), PACKET_SIZE)
)
continue
try:
command = unpack_command(payload)
except ValueError as exc:
self._set_runtime_last_error(str(exc))
if self._should_log('unix_decode_error', 2.0):
self.get_logger().warning(f'Dropped malformed unix datagram payload: {exc}')
continue
with self._lock:
self._latest_command = command
self._last_packet_monotonic = time.monotonic()
self._set_runtime_last_error('')
def _command_for_publish_tick(self) -> tuple[CommandTuple, Optional[float], bool]:
with self._lock:
latest_command = self._latest_command
last_packet_monotonic = self._last_packet_monotonic
if last_packet_monotonic is None:
return ZERO_COMMAND, None, False
age = time.monotonic() - last_packet_monotonic
if age > self._watchdog_timeout:
return ZERO_COMMAND, age, True
return latest_command, age, False
def _publish_tick(self) -> None:
publish_command, age, timed_out = self._command_for_publish_tick()
if timed_out and self._last_published_command != ZERO_COMMAND:
if self._should_log('watchdog_stop', 2.0):
self.get_logger().warning(
'Command stream timed out after %.2f s, publishing zero velocity stop'
% age
)
self._publish_command(publish_command)
def close(self) -> None:
self._closing.set()
if hasattr(self, '_transport') and self._transport is not None:
try:
self._transport.close()
except OSError as exc:
if self._should_log('close_error', 2.0):
self.get_logger().warning(f'Closing OmniSocket transport failed: {exc}')
self._transport = None
if self._unix_socket is not None:
try:
self._close_unix_socket()
except OSError as exc:
if self._should_log('unix_close_error', 2.0):
self.get_logger().warning(f'Closing unix socket failed: {exc}')
try:
os.unlink(self._local_socket_path)
except FileNotFoundError:
pass
if hasattr(self, '_recv_thread') and self._recv_thread.is_alive():
self._recv_thread.join(timeout=0.5)
def destroy_node(self) -> bool:
self.close()
return super().destroy_node()
def main(args: Optional[list[str]] = None) -> None:
rclpy.init(args=args)
node = UdpCmdVelReceiver()
try:
rclpy.spin(node)
except KeyboardInterrupt:
pass
finally:
node.destroy_node()
rclpy.shutdown()

View File

@@ -56,6 +56,7 @@ if [ ! -x ./bin/kcpserver ]; then
exit 1
fi
setsid ./bin/kcpserver -listen 0.0.0.0:10909 \
-telemetry-peer peer-a-telemetry \
-kcp-ts-debug-log logs/d-kcp-ts.jsonl \
-kcp-session-stats-log logs/d-kcp-stats.jsonl > server_console.log 2>&1 </dev/null &
echo "server D launched (pid=$!)"

View File

@@ -56,6 +56,7 @@ if [ ! -x ./bin/kcpserver ]; then
exit 1
fi
setsid ./bin/kcpserver -listen 0.0.0.0:10909 \
-telemetry-peer peer-a-telemetry \
-kcp-ts-debug-log logs/d-kcp-ts.jsonl \
-kcp-session-stats-log logs/d-kcp-stats.jsonl > server_console.log 2>&1 </dev/null &
echo "server D launched (pid=$!)"

180
scripts/boot/5g-dial.sh Normal file
View File

@@ -0,0 +1,180 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck disable=SC1091
source "${SCRIPT_DIR}/common.sh"
STEP="5g-dial"
append_route_targets() {
local raw_list="$1"
local target
if [[ -z "${raw_list}" ]]; then
return 0
fi
for target in ${raw_list//,/ }; do
if [[ -z "${target}" ]]; then
continue
fi
dial_cmd+=(--route-target "${target}")
done
}
read_detected_interface() {
local info_json="$1"
if [[ ! -f "${info_json}" ]]; then
return 1
fi
python3 -c 'import json, sys; print((json.load(open(sys.argv[1], encoding="utf-8")).get("interface") or "").strip())' "${info_json}"
}
disable_interfaces() {
local raw_list="$1"
local iface
local nmcli_available=0
if [[ -z "${raw_list}" ]]; then
return 0
fi
if command -v nmcli >/dev/null 2>&1; then
nmcli_available=1
fi
for iface in ${raw_list//,/ }; do
if [[ -z "${iface}" ]]; then
continue
fi
blitz_log "${STEP}" "disable-interface" "start" "iface=${iface}" 0
if [[ "${nmcli_available}" -eq 1 ]]; then
nmcli device disconnect "${iface}" >/dev/null 2>&1 || true
fi
if ip link show dev "${iface}" >/dev/null 2>&1; then
if ip link set dev "${iface}" down; then
blitz_log "${STEP}" "disable-interface" "success" "iface=${iface}" 0
else
rc=$?
blitz_log "${STEP}" "disable-interface" "failure" "iface=${iface}" "${rc}"
return "${rc}"
fi
else
blitz_log "${STEP}" "disable-interface" "success" "iface=${iface} not present, skipping" 0
fi
done
}
wait_for_serial() {
local serial_port="$1"
local timeout_sec="$2"
local waited=0
while (( waited < timeout_sec )); do
if [[ -e "${serial_port}" ]]; then
blitz_log "${STEP}" "wait-serial" "success" "serial_port=${serial_port} waited_sec=${waited}" 0
return 0
fi
if (( waited == 0 || waited % 5 == 0 )); then
blitz_log "${STEP}" "wait-serial" "waiting" "serial_port=${serial_port} waited_sec=${waited}" 0
fi
sleep 1
waited=$(( waited + 1 ))
done
blitz_log "${STEP}" "wait-serial" "failure" "serial_port=${serial_port} timeout_sec=${timeout_sec}" 1
return 1
}
wait_for_route() {
local target_ip="$1"
local timeout_sec="$2"
local expected_interface="${3:-}"
local waited=0
local route_output
while (( waited < timeout_sec )); do
route_output="$(blitz_route_ready "${target_ip}" "${expected_interface}" || true)"
if [[ -n "${route_output}" ]]; then
blitz_log "${STEP}" "route-check" "success" "target_ip=${target_ip} interface=${expected_interface:-auto} route=${route_output}" 0
return 0
fi
if (( waited == 0 || waited % 5 == 0 )); then
blitz_log "${STEP}" "route-check" "waiting" "target_ip=${target_ip} interface=${expected_interface:-auto} waited_sec=${waited}" 0
fi
sleep 1
waited=$(( waited + 1 ))
done
blitz_log "${STEP}" "route-check" "failure" "target_ip=${target_ip} interface=${expected_interface:-auto} timeout_sec=${timeout_sec}" 1
return 1
}
blitz_load_boot_env
blitz_require_root "${STEP}"
blitz_require_command ip "${STEP}"
blitz_require_command python3 "${STEP}"
blitz_require_file "${BLITZ_5G_DIAL_DIR}/rndis_dial.py" "${STEP}"
if [[ -z "${BLITZ_TIME_SERVER_IP}" ]]; then
blitz_log "${STEP}" "precheck" "failure" "BLITZ_TIME_SERVER_IP is empty and no fallback could be derived" 1
exit 1
fi
disable_interfaces "${BLITZ_5G_DISABLE_INTERFACES:-}"
if [[ -n "${BLITZ_5G_INTERFACE:-}" ]]; then
route_output="$(blitz_route_ready "${BLITZ_TIME_SERVER_IP}" "${BLITZ_5G_INTERFACE}" || true)"
if [[ -n "${route_output}" ]]; then
blitz_log "${STEP}" "dial" "already_up" "target_ip=${BLITZ_TIME_SERVER_IP} interface=${BLITZ_5G_INTERFACE} route=${route_output}" 0
exit 0
fi
else
blitz_log "${STEP}" "route-check" "info" "BLITZ_5G_INTERFACE is empty, skipping pre-dial route shortcut and using auto-detect mode" 0
fi
wait_for_serial "${BLITZ_5G_SERIAL_PORT}" "${BLITZ_5G_SERIAL_WAIT_SEC}"
dial_cmd=(
python3
rndis_dial.py
--serial-port "${BLITZ_5G_SERIAL_PORT}"
--modem-subnet "${BLITZ_5G_MODEM_SUBNET}"
)
if [[ -n "${BLITZ_5G_INTERFACE:-}" ]]; then
dial_cmd+=(--interface "${BLITZ_5G_INTERFACE}")
fi
case "${BLITZ_5G_SKIP_DHCP:-0}" in
1|true|TRUE|yes|YES)
dial_cmd+=(--skip-dhcp)
;;
esac
case "${BLITZ_5G_REMOVE_DEFAULT_ROUTE:-1}" in
1|true|TRUE|yes|YES)
dial_cmd+=(--remove-default-route --gateway "${BLITZ_5G_GATEWAY}" --route-target "${BLITZ_TIME_SERVER_IP}")
append_route_targets "${BLITZ_5G_ROUTE_TARGETS:-}"
;;
esac
pushd "${BLITZ_5G_DIAL_DIR}" >/dev/null
blitz_run "${STEP}" "dial" "${dial_cmd[@]}"
popd >/dev/null
resolved_interface="${BLITZ_5G_INTERFACE:-}"
if [[ -z "${resolved_interface}" ]]; then
resolved_interface="$(read_detected_interface "${BLITZ_5G_INFO_JSON}" || true)"
if [[ -n "${resolved_interface}" ]]; then
blitz_log "${STEP}" "resolve-interface" "success" "resolved interface from ${BLITZ_5G_INFO_JSON}: ${resolved_interface}" 0
else
blitz_log "${STEP}" "resolve-interface" "failure" "failed to read detected interface from ${BLITZ_5G_INFO_JSON}" 1
fi
fi
if [[ -n "${resolved_interface}" ]]; then
wait_for_route "${BLITZ_TIME_SERVER_IP}" "${BLITZ_5G_ROUTE_WAIT_SEC}" "${resolved_interface}"
blitz_log "${STEP}" "complete" "success" "5G dial completed and route is ready on ${resolved_interface}" 0
else
blitz_log "${STEP}" "complete" "success" "5G dial completed but route wait was skipped because no interface could be resolved; refer to rndis_dial.py logs" 0
fi

219
scripts/boot/README.md Normal file
View File

@@ -0,0 +1,219 @@
# Robot B-Side Boot Chain
This directory contains the robot-side boot and recovery scripts.
Normal usage is:
```bash
sudo bash scripts/boot/install-systemd.sh
sudo systemctl start blitz-robot.target
```
After installation, `blitz-robot.target` is enabled and will start automatically on reboot.
To stop the chain now and disable boot-time autostart for future reboots:
```bash
sudo bash scripts/boot/disable-systemd.sh
```
## Current Startup Order
The current cold-start chain is:
1. `blitz-boot-gate.service`
2. `blitz-5g-dial.service`
3. `blitz-ros-receiver.service`
4. `blitz-b-side-omnid.service`
5. `blitz-watchdog.service`
There is no longer any automatic time-sync step in the boot chain.
## What Each Script Does
- `robot-boot.env`: default boot configuration
- `robot-boot.env.local`: machine-local overrides
- `common.sh`: shared env loading, logging, and helper functions
- `boot-gate.sh`: fixed startup delay gate
- `5g-dial.sh`: brings up the 5G modem path and verifies routing
- `start-ros-receiver-service.sh`: boot wrapper for ROS receiver
- `wait-for-unix-socket.sh`: waits for the ROS receiver unix socket
- `start-b-side-omnid-service.sh`: boot wrapper for `b_side_omnid`
- `blitz-watchdog.sh`: runtime health watchdog and recovery orchestrator
- `blitz-fault-inject.sh`: fault injection entrypoint
- `install-systemd.sh`: installs systemd units into `/etc/systemd/system`
- `disable-systemd.sh`: stops the boot chain and disables autostart
## Important Configuration
Most machine-specific overrides should go into:
```text
scripts/boot/robot-boot.env.local
```
Typical settings:
```bash
BLITZ_BOOT_DELAY_SEC="30"
BLITZ_LOG_FILE="/var/log/blitz-robot/startup.log"
BLITZ_RUNTIME_DIR="/run/blitz-robot"
BLITZ_5G_DIAL_DIR="${OMNISOCKETGO_ROOT}/scripts/boot"
BLITZ_5G_SERIAL_PORT="/dev/ttyUSB2"
BLITZ_5G_INTERFACE=""
BLITZ_5G_MODEM_SUBNET="192.168.224.0/22"
BLITZ_5G_GATEWAY="192.168.225.1"
BLITZ_5G_REMOVE_DEFAULT_ROUTE="1"
BLITZ_5G_ROUTE_TARGETS="106.55.173.235"
BLITZ_5G_INFO_JSON="${OMNISOCKETGO_ROOT}/scripts/boot/modem_network_info.json"
BLITZ_TIME_SERVER_IP="81.70.156.140"
BLITZ_ROS_USER="nvidia"
BLITZ_ROS_SOCKET_WAIT_SEC="20"
BLITZ_WATCHDOG_INTERVAL_SEC="5"
BLITZ_HEALTH_STALE_SEC="15"
BLITZ_OMNID_THREAD_HEARTBEAT_TIMEOUT_SEC="15"
BLITZ_NETWORK_FAIL_THRESHOLD="3"
BLITZ_NETWORK_RECOVERY_COOLDOWN_SEC="30"
BLITZ_GPS_MONITOR_ENABLED="1"
BLITZ_GPS_DEVICE_GLOB="/dev/ttyCH341USB*"
BLITZ_GPS_CHECK_INTERVAL_SEC="10"
BLITZ_GPS_RESTART_UNITS="gpsd.socket gpsd.service"
BLITZ_WATCHDOG_ALLOW_FAULT_INJECTION="0"
```
`BLITZ_TIME_SERVER_IP` is still used, but only as the 5G route/ping health-check target. It is no longer used for automatic clock synchronization.
If `BLITZ_TIME_SERVER_IP` is left empty, the scripts fall back to the host part of `ROBOT_SIDE_OMNISOCKET_SERVER_ADDR`.
## Install Or Upgrade
Run:
```bash
sudo bash scripts/boot/install-systemd.sh
sudo systemctl daemon-reload
sudo systemctl restart blitz-robot.target
```
`install-systemd.sh` will also remove any old `blitz-time-sync.service` unit left over from earlier versions.
## Disable Autostart
To stop the currently running services and disable autostart for future reboots:
```bash
sudo bash scripts/boot/disable-systemd.sh
```
To re-enable later:
```bash
sudo bash scripts/boot/install-systemd.sh
sudo systemctl start blitz-robot.target
```
## Logs
All boot-chain and watchdog logs are appended to:
```text
/var/log/blitz-robot/startup.log
```
Follow the log live:
```bash
sudo tail -f /var/log/blitz-robot/startup.log
```
Check service state:
```bash
sudo systemctl status blitz-robot.target
sudo systemctl status blitz-5g-dial.service
sudo systemctl status blitz-ros-receiver.service
sudo systemctl status blitz-b-side-omnid.service
sudo systemctl status blitz-watchdog.service
```
Check systemd journal:
```bash
sudo journalctl -u blitz-robot.target -u blitz-5g-dial.service \
-u blitz-ros-receiver.service -u blitz-b-side-omnid.service \
-u blitz-watchdog.service -f
```
## Runtime Status Files
The runtime status directory is:
```text
/run/blitz-robot
```
Key files:
- `b-side-omnid.status.json`
- `ros-receiver.status.json`
- `watchdog.status.json`
`watchdog.status.json` now also records `gps_ok` and `gps_device_present` so you can quickly tell whether the GPS USB serial node is currently visible and whether the last `gpsd` reconnect attempt succeeded.
Pretty-print them:
```bash
sudo python3 -m json.tool /run/blitz-robot/watchdog.status.json
sudo python3 -m json.tool /run/blitz-robot/b-side-omnid.status.json
sudo python3 -m json.tool /run/blitz-robot/ros-receiver.status.json
```
## Fault Injection
Available test commands:
```bash
sudo bash scripts/boot/blitz-fault-inject.sh bside-crash
sudo bash scripts/boot/blitz-fault-inject.sh bside-process-freeze
sudo bash scripts/boot/blitz-fault-inject.sh bside-video-thread-stall
sudo bash scripts/boot/blitz-fault-inject.sh bside-control-thread-stall
sudo bash scripts/boot/blitz-fault-inject.sh ros-crash
sudo bash scripts/boot/blitz-fault-inject.sh ros-freeze
```
For synthetic network fault injection, first enable it in `robot-boot.env.local`:
```bash
BLITZ_WATCHDOG_ALLOW_FAULT_INJECTION="1"
```
Then restart watchdog and inject:
```bash
sudo systemctl restart blitz-watchdog.service
sudo bash scripts/boot/blitz-fault-inject.sh network-down on
sudo bash scripts/boot/blitz-fault-inject.sh network-down off
```
## Recovery Behavior Summary
- If `b_side_omnid` dies or its status file goes stale, watchdog first tries a targeted `b_side` restart.
- If ROS receiver dies, loses its socket, or its heartbeat goes stale, watchdog performs an ordered full restart:
- stop `b_side`
- restart ROS receiver
- wait for unix socket
- start `b_side`
- If network checks fail repeatedly, watchdog stops `b_side`, runs `5g-dial.sh`, waits for route recovery, and then restores services.
- While 5G is healthy, watchdog keeps every host route listed by `BLITZ_TIME_SERVER_IP` and `BLITZ_5G_ROUTE_TARGETS` pinned to the resolved 5G interface. When 5G becomes unhealthy, watchdog deletes those host routes so traffic can fall back to the remaining default network path. If that fallback path is still reachable, watchdog keeps `b_side_omnid` running instead of treating it as a full network outage.
- Whenever watchdog changes or restores those host routes, it logs `route-path` lines for each target so you can see which interface Linux currently chooses for `81.70.156.140`, `106.55.173.235`, and any other configured 5G-pinned target.
- If GPS monitoring is enabled, watchdog checks `BLITZ_GPS_DEVICE_GLOB` every `BLITZ_GPS_CHECK_INTERVAL_SEC` seconds. When the GPS serial device disappears and later reappears, watchdog restarts the units in `BLITZ_GPS_RESTART_UNITS` so `gpsd` can bind to the new device node again.
- Camera disappearance is logged as degraded state. Reappearance triggers a `b_side` restart after the device is stable.
## Notes
- `time-sync.sh` and `blitz-time-sync.service` are intentionally removed from the automatic boot path.
- `b_side_omnid` must already be built before boot-time startup.
- `bin/b_side_omnid` missing, ROS env missing, or modem script missing will all show up in `startup.log`.

View File

@@ -0,0 +1,137 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck disable=SC1091
source "${SCRIPT_DIR}/common.sh"
STEP="5g-link-logger"
resolve_target_ip() {
if [[ -n "${BLITZ_TIME_SERVER_IP:-}" ]]; then
printf '%s\n' "${BLITZ_TIME_SERVER_IP}"
return 0
fi
for candidate in ${BLITZ_5G_ROUTE_TARGETS//,/ }; do
if [[ -n "${candidate}" ]]; then
printf '%s\n' "${candidate}"
return 0
fi
done
return 1
}
emit_sample_json() {
local interface_name="${1:-}"
local target_ip="${2:-}"
python3 - "${interface_name}" "${target_ip}" <<'PY'
import json
import subprocess
import sys
import time
interface_name = sys.argv[1]
target_ip = sys.argv[2]
payload = {
"ts_unix_ms": time.time_ns() // 1_000_000,
"interface": interface_name,
"target_ip": target_ip,
"link_present": False,
"route_output": "",
"route_ok": False,
"probe_ok": False,
"ping_rtt_ms": None,
"rx_bytes": 0,
"tx_bytes": 0,
"rx_packets": 0,
"tx_packets": 0,
"rx_errors": 0,
"tx_errors": 0,
"rx_drops": 0,
"tx_drops": 0,
}
if interface_name:
try:
output = subprocess.check_output(
["ip", "-j", "-s", "link", "show", "dev", interface_name],
text=True,
stderr=subprocess.DEVNULL,
)
stats = json.loads(output)
if stats:
item = stats[0]
payload["link_present"] = True
rx = item.get("stats64", {}).get("rx", {})
tx = item.get("stats64", {}).get("tx", {})
if not rx and not tx:
rx = item.get("stats", {}).get("rx", {})
tx = item.get("stats", {}).get("tx", {})
payload["rx_bytes"] = int(rx.get("bytes") or 0)
payload["tx_bytes"] = int(tx.get("bytes") or 0)
payload["rx_packets"] = int(rx.get("packets") or 0)
payload["tx_packets"] = int(tx.get("packets") or 0)
payload["rx_errors"] = int(rx.get("errors") or 0)
payload["tx_errors"] = int(tx.get("errors") or 0)
payload["rx_drops"] = int(rx.get("dropped") or 0)
payload["tx_drops"] = int(tx.get("dropped") or 0)
except Exception:
pass
if target_ip:
try:
route = subprocess.check_output(
["ip", "route", "get", target_ip],
text=True,
stderr=subprocess.STDOUT,
).strip()
payload["route_output"] = route.splitlines()[0] if route else ""
payload["route_ok"] = bool(payload["route_output"]) and (
not interface_name or f" dev {interface_name}" in payload["route_output"]
)
except Exception as exc:
payload["route_output"] = str(exc)
ping_cmd = ["ping", "-c", "1", "-W", "2", target_ip]
if interface_name:
ping_cmd[1:1] = ["-I", interface_name]
ping = subprocess.run(ping_cmd, capture_output=True, text=True)
payload["probe_ok"] = ping.returncode == 0
output = (ping.stdout or "") + "\n" + (ping.stderr or "")
for token in output.replace("\n", " ").split():
if token.startswith("time="):
value = token.split("=", 1)[1].rstrip("ms")
try:
payload["ping_rtt_ms"] = float(value)
except ValueError:
pass
break
print(json.dumps(payload, separators=(",", ":"), ensure_ascii=False))
PY
}
if [[ "${OMNI_BOOT_MODE:-0}" == "1" ]]; then
blitz_load_boot_env
blitz_require_run_context
fi
if [[ -z "${BLITZ_RUN_DIR:-}" && -f "${BLITZ_RUN_CONTEXT_FILE:-}" ]]; then
blitz_load_run_context_env || true
fi
blitz_ensure_instance_id
export BLITZ_5G_LINK_LOG_PATH="${BLITZ_5G_LINK_LOG_PATH:-${BLITZ_RUN_DIR}/b-5g-link-quality.${BLITZ_INSTANCE_ID}.jsonl}"
target_ip="$(resolve_target_ip || true)"
blitz_log "${STEP}" "start" "start" "path=${BLITZ_5G_LINK_LOG_PATH} interval_sec=${BLITZ_5G_LINK_LOG_INTERVAL_SEC}" 0
while true; do
interface_name="$(blitz_resolve_5g_interface || true)"
line="$(emit_sample_json "${interface_name}" "${target_ip}")"
blitz_jsonl_append_line "${BLITZ_5G_LINK_LOG_PATH}" "${line}"
sleep "${BLITZ_5G_LINK_LOG_INTERVAL_SEC}"
done

View File

@@ -0,0 +1,139 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck disable=SC1091
source "${SCRIPT_DIR}/common.sh"
STEP="fault-inject"
B_SIDE_SERVICE="blitz-b-side-omnid.service"
ROS_SERVICE="blitz-ros-receiver.service"
main_pid_for_service() {
local service_name="$1"
systemctl show --property MainPID --value "${service_name}"
}
wait_for_service_pid_change() {
local service_name="$1"
local previous_pid="$2"
local timeout_sec="${3:-10}"
local waited=0
local current_pid=""
while (( waited < timeout_sec )); do
current_pid="$(main_pid_for_service "${service_name}")"
if [[ -n "${current_pid}" && "${current_pid}" != "0" && "${current_pid}" != "${previous_pid}" ]]; then
printf '%s\n' "${current_pid}"
return 0
fi
sleep 1
waited=$(( waited + 1 ))
done
return 1
}
require_running_pid() {
local service_name="$1"
local pid
pid="$(main_pid_for_service "${service_name}")"
if [[ -z "${pid}" || "${pid}" == "0" ]]; then
blitz_log "${STEP}" "lookup-pid" "failure" "service=${service_name}" 1
exit 1
fi
printf '%s\n' "${pid}"
}
write_fault_flag() {
local flag_name="$1"
local flag_path="${BLITZ_RUNTIME_DIR}/${flag_name}"
printf '%s\n' "$(date +%s)" > "${flag_path}"
blitz_log "${STEP}" "flag-on" "success" "path=${flag_path}" 0
}
clear_fault_flag() {
local flag_name="$1"
local flag_path="${BLITZ_RUNTIME_DIR}/${flag_name}"
rm -f "${flag_path}"
blitz_log "${STEP}" "flag-off" "success" "path=${flag_path}" 0
}
blitz_load_boot_env
blitz_require_root "${STEP}"
blitz_prepare_runtime_dir
case "${1:-}" in
bside-crash)
target_pid="$(require_running_pid "${B_SIDE_SERVICE}")"
blitz_log "${STEP}" "bside-crash" "start" "service=${B_SIDE_SERVICE} pid=${target_pid}" 0
kill -9 "${target_pid}"
if restarted_pid="$(wait_for_service_pid_change "${B_SIDE_SERVICE}" "${target_pid}")"; then
blitz_log "${STEP}" "bside-crash" "success" "old_pid=${target_pid} new_pid=${restarted_pid}" 0
else
blitz_log "${STEP}" "bside-crash" "failure" "old_pid=${target_pid} restart_not_observed_within=10s" 1
exit 1
fi
;;
bside-process-freeze)
target_pid="$(require_running_pid "${B_SIDE_SERVICE}")"
blitz_log "${STEP}" "bside-process-freeze" "start" "service=${B_SIDE_SERVICE} pid=${target_pid}" 0
kill -STOP "${target_pid}"
blitz_log "${STEP}" "bside-process-freeze" "success" "service=${B_SIDE_SERVICE} pid=${target_pid}" 0
;;
bside-video-thread-stall)
write_fault_flag "fault-injection-bside-video-thread-stall"
;;
bside-control-thread-stall)
write_fault_flag "fault-injection-bside-control-thread-stall"
;;
ros-crash)
target_pid="$(require_running_pid "${ROS_SERVICE}")"
blitz_log "${STEP}" "ros-crash" "start" "service=${ROS_SERVICE} pid=${target_pid}" 0
kill -9 "${target_pid}"
if restarted_pid="$(wait_for_service_pid_change "${ROS_SERVICE}" "${target_pid}")"; then
blitz_log "${STEP}" "ros-crash" "success" "old_pid=${target_pid} new_pid=${restarted_pid}" 0
else
blitz_log "${STEP}" "ros-crash" "failure" "old_pid=${target_pid} restart_not_observed_within=10s" 1
exit 1
fi
;;
ros-freeze)
target_pid="$(require_running_pid "${ROS_SERVICE}")"
blitz_log "${STEP}" "ros-freeze" "start" "service=${ROS_SERVICE} pid=${target_pid}" 0
kill -STOP "${target_pid}"
blitz_log "${STEP}" "ros-freeze" "success" "service=${ROS_SERVICE} pid=${target_pid}" 0
;;
network-down)
if [[ "${BLITZ_WATCHDOG_ALLOW_FAULT_INJECTION}" != "1" ]]; then
blitz_log "${STEP}" "network-down" "failure" "set BLITZ_WATCHDOG_ALLOW_FAULT_INJECTION=1 first" 1
exit 1
fi
case "${2:-}" in
on)
write_fault_flag "fault-injection-network-down"
;;
off)
clear_fault_flag "fault-injection-network-down"
;;
*)
echo "usage: $0 network-down on|off" >&2
exit 2
;;
esac
;;
*)
cat <<'EOF'
usage:
blitz-fault-inject.sh bside-crash
blitz-fault-inject.sh bside-process-freeze
blitz-fault-inject.sh bside-video-thread-stall
blitz-fault-inject.sh bside-control-thread-stall
blitz-fault-inject.sh ros-crash
blitz-fault-inject.sh ros-freeze
blitz-fault-inject.sh network-down on|off
EOF
exit 2
;;
esac

View File

@@ -0,0 +1,50 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck disable=SC1091
source "${SCRIPT_DIR}/common.sh"
STEP="incident-launch"
incident_id=""
args=()
timeout_bin=""
while (($# > 0)); do
case "$1" in
--incident-id)
incident_id="${2:-}"
shift 2
;;
*)
args+=("$1")
shift
;;
esac
done
blitz_load_boot_env
blitz_require_root "${STEP}"
blitz_require_command systemd-run "${STEP}"
blitz_require_command timeout "${STEP}"
timeout_bin="$(command -v timeout)"
if [[ -z "${incident_id}" ]]; then
incident_id="$(blitz_new_incident_id)"
fi
unit_name="blitz-incident-${incident_id//[^A-Za-z0-9_.-]/-}"
systemd-run \
--quiet \
--collect \
--unit "${unit_name}" \
--property=Type=oneshot \
--property="StandardOutput=append:${BLITZ_LOG_FILE}" \
--property="StandardError=append:${BLITZ_LOG_FILE}" \
"${timeout_bin}" "${BLITZ_INCIDENT_TOTAL_TIMEOUT_SEC}s" \
/bin/bash "${SCRIPT_DIR}/blitz-incident-capture.sh" \
--incident-id "${incident_id}" \
"${args[@]}"
printf '%s\n' "${incident_id}"

View File

@@ -0,0 +1,131 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck disable=SC1091
source "${SCRIPT_DIR}/common.sh"
STEP="incident-capture"
incident_id=""
incident_source=""
incident_reason=""
incident_unit=""
incident_result=""
incident_exit_status=""
run_capture() {
local output_path="$1"
shift
if command -v timeout >/dev/null 2>&1; then
timeout "${BLITZ_INCIDENT_COMMAND_TIMEOUT_SEC}s" "$@" > "${output_path}" 2>&1 || true
else
"$@" > "${output_path}" 2>&1 || true
fi
}
while (($# > 0)); do
case "$1" in
--incident-id)
incident_id="${2:-}"
shift 2
;;
--source)
incident_source="${2:-}"
shift 2
;;
--reason)
incident_reason="${2:-}"
shift 2
;;
--unit)
incident_unit="${2:-}"
shift 2
;;
--result)
incident_result="${2:-}"
shift 2
;;
--exit-status)
incident_exit_status="${2:-}"
shift 2
;;
*)
blitz_log "${STEP}" "parse-arg" "failure" "unknown argument: $1" 2
exit 2
;;
esac
done
if [[ -n "${incident_result}" && "${incident_result}" == "success" ]]; then
exit 0
fi
blitz_load_boot_env
blitz_load_run_context_env || true
blitz_prepare_runtime_dir
blitz_prepare_run_root
if [[ -z "${incident_id}" ]]; then
incident_id="$(blitz_new_incident_id)"
fi
incident_dir="${BLITZ_RUN_ROOT}/incidents/${incident_id}"
mkdir -p "${incident_dir}"
python3 - "${incident_dir}/incident.json" "${incident_id}" "${BLITZ_RUN_ID:-}" "${incident_source}" "${incident_reason}" "${incident_unit}" "${incident_result}" "${incident_exit_status}" "${BLITZ_RUN_DIR:-}" "${HOSTNAME:-$(hostname)}" <<'PY'
import json
import sys
import time
path, incident_id, run_id, source, reason, unit, result, exit_status, run_dir, hostname = sys.argv[1:10]
payload = {
"incident_id": incident_id,
"run_id": run_id,
"source": source,
"fault_reason": reason,
"unit": unit,
"service_result": result,
"exit_status": exit_status,
"run_dir": run_dir,
"hostname": hostname,
"captured_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
}
with open(path, "w", encoding="utf-8") as handle:
json.dump(payload, handle, ensure_ascii=False, indent=2, sort_keys=True)
PY
for status_file in \
"${BLITZ_RUNTIME_DIR}/watchdog.status.json" \
"${BLITZ_RUNTIME_DIR}/b-side-omnid.status.json" \
"${BLITZ_RUNTIME_DIR}/ros-receiver.status.json"
do
if [[ -f "${status_file}" ]]; then
cp -f "${status_file}" "${incident_dir}/$(basename "${status_file}")"
fi
done
if [[ -f "${BLITZ_LOG_FILE}" ]]; then
tail -n 400 "${BLITZ_LOG_FILE}" > "${incident_dir}/startup.log.tail"
fi
run_capture "${incident_dir}/systemctl-status.txt" \
systemctl status blitz-robot.target blitz-run-context.service blitz-5g-dial.service blitz-5g-link-logger.service blitz-ros-receiver.service blitz-b-side-omnid.service blitz-watchdog.service
run_capture "${incident_dir}/journal.txt" \
journalctl --no-pager --since "5 minutes ago" -u blitz-run-context.service -u blitz-5g-dial.service -u blitz-5g-link-logger.service -u blitz-ros-receiver.service -u blitz-b-side-omnid.service -u blitz-watchdog.service
run_capture "${incident_dir}/ip-addr.txt" ip addr
run_capture "${incident_dir}/ip-route.txt" ip route
run_capture "${incident_dir}/ss-uapn.txt" ss -uapn
run_capture "${incident_dir}/ss-xlp.txt" ss -xlp
if [[ -f "${BLITZ_5G_INFO_JSON:-}" ]]; then
cp -f "${BLITZ_5G_INFO_JSON}" "${incident_dir}/$(basename "${BLITZ_5G_INFO_JSON}")"
fi
if [[ -n "${BLITZ_RUN_DIR:-}" && -d "${BLITZ_RUN_DIR}" ]]; then
while IFS= read -r -d '' jsonl; do
tail -n 200 "${jsonl}" > "${incident_dir}/tail-$(basename "${jsonl}")"
done < <(find "${BLITZ_RUN_DIR}" -maxdepth 1 -type f -name '*.jsonl' -print0 2>/dev/null)
fi
blitz_log "${STEP}" "complete" "success" "incident_id=${incident_id} path=${incident_dir}" 0

View File

@@ -0,0 +1,22 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck disable=SC1091
source "${SCRIPT_DIR}/common.sh"
STEP="run-context"
on_error() {
local rc="$?"
blitz_log "${STEP}" "error" "failure" "line=${1:-unknown} cmd=${BASH_COMMAND:-unknown}" "${rc}"
exit "${rc}"
}
trap 'on_error "${LINENO}"' ERR
blitz_load_boot_env
blitz_require_root "${STEP}"
blitz_require_command python3 "${STEP}"
blitz_init_run_context
blitz_log "${STEP}" "complete" "success" "run_id=${BLITZ_RUN_ID} run_dir=${BLITZ_RUN_DIR}" 0

View File

@@ -0,0 +1,971 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck disable=SC1091
source "${SCRIPT_DIR}/common.sh"
STEP="watchdog"
B_SIDE_SERVICE="blitz-b-side-omnid.service"
ROS_SERVICE="blitz-ros-receiver.service"
B_SIDE_STATUS_FILE=""
ROS_STATUS_FILE=""
WATCHDOG_STATUS_FILE=""
NETWORK_FAULT_FILE=""
WATCHDOG_EVENT_LOG=""
WATCHDOG_SAMPLE_LOG=""
WATCHDOG_EVENT_LOG_FAILURE_REPORTED=0
WATCHDOG_SAMPLE_LOG_FAILURE_REPORTED=0
CAMERA_MISSING_PREV=0
CAMERA_RECOVERY_STABLE_COUNT=0
NETWORK_FAIL_COUNT=0
NETWORK_COOLDOWN_UNTIL=0
BACKOFF_UNTIL=0
LAST_ACTION="none"
LAST_ACTION_EPOCH_MS=0
FULL_RESTART_WINDOW_START=0
FULL_RESTART_WINDOW_COUNT=0
NETWORK_LAST_INTERFACE=""
NETWORK_ROUTE_INTERFACE_LAST_KNOWN=""
NETWORK_PRIMARY_LAST_RETRY_SEC=0
GPS_LAST_CHECK_SEC=0
GPS_DEVICE_PRESENT_PREV=-1
GPS_DEVICE_PRESENT_STATE=1
GPS_STACK_ACTIVE_STATE=1
LAST_REPORTED_FAULT_REASON=""
LAST_REPORTED_RECOVERY_STATE=""
declare -A TARGETED_RESTART_WINDOW_START=()
declare -A TARGETED_RESTART_WINDOW_COUNT=()
now_epoch_sec() {
date +%s
}
now_epoch_ms() {
date +%s%3N
}
service_is_active() {
systemctl is-active --quiet "$1"
}
gps_monitor_enabled() {
[[ "${BLITZ_GPS_MONITOR_ENABLED:-0}" == "1" ]]
}
gps_stack_active() {
local units=()
local unit
read -r -a units <<< "${BLITZ_GPS_RESTART_UNITS:-}"
if (( ${#units[@]} == 0 )); then
return 1
fi
for unit in "${units[@]}"; do
if service_is_active "${unit}"; then
return 0
fi
done
return 1
}
restart_gps_stack() {
local reason="$1"
local devices="$2"
local units=()
local rc
read -r -a units <<< "${BLITZ_GPS_RESTART_UNITS:-}"
if (( ${#units[@]} == 0 )); then
GPS_STACK_ACTIVE_STATE=0
blitz_log "${STEP}" "gps-reconnect" "failure" "reason=${reason} devices=${devices} units=empty" 1
return 1
fi
set_last_action "gps-reconnect"
blitz_log "${STEP}" "gps-reconnect" "start" "reason=${reason} devices=${devices} units=${BLITZ_GPS_RESTART_UNITS}" 0
if systemctl restart "${units[@]}"; then
GPS_STACK_ACTIVE_STATE=1
blitz_log "${STEP}" "gps-reconnect" "success" "reason=${reason} devices=${devices} units=${BLITZ_GPS_RESTART_UNITS}" 0
return 0
fi
rc=$?
GPS_STACK_ACTIVE_STATE=0
blitz_log "${STEP}" "gps-reconnect" "failure" "reason=${reason} devices=${devices} units=${BLITZ_GPS_RESTART_UNITS}" "${rc}"
return "${rc}"
}
check_gps_health() {
local now_sec="$1"
local check_interval_sec="${BLITZ_GPS_CHECK_INTERVAL_SEC:-10}"
local device_glob="${BLITZ_GPS_DEVICE_GLOB:-}"
local previous_present="${GPS_DEVICE_PRESENT_PREV}"
local recovery_reason=""
local device_summary=""
local -a devices=()
if ! gps_monitor_enabled; then
GPS_DEVICE_PRESENT_STATE=1
GPS_STACK_ACTIVE_STATE=1
return 0
fi
if (( check_interval_sec < 1 )); then
check_interval_sec=1
fi
if (( GPS_LAST_CHECK_SEC != 0 && now_sec - GPS_LAST_CHECK_SEC < check_interval_sec )); then
if (( GPS_DEVICE_PRESENT_STATE == 1 && GPS_STACK_ACTIVE_STATE == 1 )); then
return 0
fi
return 1
fi
GPS_LAST_CHECK_SEC="${now_sec}"
mapfile -t devices < <(compgen -G "${device_glob}" || true)
if (( ${#devices[@]} == 0 )); then
GPS_DEVICE_PRESENT_STATE=0
GPS_STACK_ACTIVE_STATE=0
if (( previous_present != 0 )); then
blitz_log "${STEP}" "gps-device-check" "failure" "state=missing glob=${device_glob}" 1
fi
GPS_DEVICE_PRESENT_PREV=0
return 1
fi
device_summary="$(IFS=,; printf '%s' "${devices[*]}")"
GPS_DEVICE_PRESENT_STATE=1
GPS_DEVICE_PRESENT_PREV=1
if (( previous_present == 0 )); then
blitz_log "${STEP}" "gps-device-check" "success" "state=reappeared devices=${device_summary}" 0
recovery_reason="device-reappeared"
elif ! gps_stack_active; then
recovery_reason="gpsd-inactive"
fi
if [[ -n "${recovery_reason}" ]]; then
if restart_gps_stack "${recovery_reason}" "${device_summary}"; then
return 0
fi
return 1
fi
GPS_STACK_ACTIVE_STATE=1
return 0
}
status_file_fresh() {
local path="$1"
local max_age_sec="$2"
local now_sec
local mtime_sec
if [[ ! -f "${path}" ]]; then
return 1
fi
now_sec="$(now_epoch_sec)"
mtime_sec="$(stat -c %Y "${path}" 2>/dev/null || echo 0)"
(( now_sec - mtime_sec <= max_age_sec ))
}
ros_receiver_status_fresh() {
local path="$1"
local max_age_sec="$2"
local now_epoch_ms_value
now_epoch_ms_value="$(now_epoch_ms)"
python3 - "${path}" "${now_epoch_ms_value}" "${max_age_sec}" <<'PY'
import json
import sys
path = sys.argv[1]
now_epoch_ms = int(sys.argv[2])
max_age_ms = int(sys.argv[3]) * 1000
try:
with open(path, "r", encoding="utf-8") as handle:
payload = json.load(handle)
except Exception:
raise SystemExit(1)
heartbeat_ms = int(payload.get("recv_thread_heartbeat_epoch_ms") or 0)
socket_bound = bool(payload.get("socket_bound"))
if heartbeat_ms <= 0 or not socket_bound:
raise SystemExit(1)
raise SystemExit(0 if now_epoch_ms - heartbeat_ms <= max_age_ms else 1)
PY
}
ros_receiver_healthy() {
local max_age_sec="$1"
service_is_active "${ROS_SERVICE}" \
&& [[ -S "${ROBOT_RECEIVER_LOCAL_SOCKET_PATH}" ]] \
&& status_file_fresh "${ROS_STATUS_FILE}" "${max_age_sec}" \
&& ros_receiver_status_fresh "${ROS_STATUS_FILE}" "${max_age_sec}"
}
write_watchdog_status() {
local fault_reason="$1"
local recovery_state="$2"
local network_ok="$3"
local camera_ok="$4"
local ros_ok="$5"
local bside_ok="$6"
local gps_ok="$7"
local gps_device_present="$8"
local tmp_file
tmp_file="${WATCHDOG_STATUS_FILE}.tmp.$$"
cat > "${tmp_file}" <<EOF
{
"updated_at_epoch_ms": $(now_epoch_ms),
"fault_reason": "${fault_reason}",
"recovery_state": "${recovery_state}",
"network_ok": ${network_ok},
"camera_ok": ${camera_ok},
"ros_ok": ${ros_ok},
"bside_ok": ${bside_ok},
"gps_ok": ${gps_ok},
"gps_device_present": ${gps_device_present},
"network_fail_count": ${NETWORK_FAIL_COUNT},
"targeted_restart_count": $(targeted_restart_total),
"full_restart_count": ${FULL_RESTART_WINDOW_COUNT},
"last_action": "${LAST_ACTION}",
"last_action_epoch_ms": ${LAST_ACTION_EPOCH_MS}
}
EOF
mv -f "${tmp_file}" "${WATCHDOG_STATUS_FILE}"
}
watchdog_emit_json() {
local record_type="$1"
local action="$2"
local fault_reason="$3"
local recovery_state="$4"
local detail="$5"
local incident_id="${6:-}"
local network_ok="${7:-1}"
local camera_ok="${8:-1}"
local ros_ok="${9:-1}"
local bside_ok="${10:-1}"
local gps_ok="${11:-1}"
local gps_device_present="${12:-1}"
python3 - "${record_type}" "${action}" "${fault_reason}" "${recovery_state}" "${detail}" "${incident_id}" "${network_ok}" "${camera_ok}" "${ros_ok}" "${bside_ok}" "${gps_ok}" "${gps_device_present}" "${LAST_ACTION}" "${LAST_ACTION_EPOCH_MS}" "${NETWORK_FAIL_COUNT}" "$(targeted_restart_total)" "${FULL_RESTART_WINDOW_COUNT}" <<'PY'
import json
import sys
import time
record_type, action, fault_reason, recovery_state, detail, incident_id, network_ok, camera_ok, ros_ok, bside_ok, gps_ok, gps_device_present, last_action, last_action_epoch_ms, network_fail_count, targeted_restart_count, full_restart_count = sys.argv[1:18]
payload = {
"ts_unix_ms": time.time_ns() // 1_000_000,
"record_type": record_type,
"action": action,
"fault_reason": fault_reason,
"recovery_state": recovery_state,
"detail": detail,
"incident_id": incident_id or None,
"network_ok": network_ok == "1",
"camera_ok": camera_ok == "1",
"ros_ok": ros_ok == "1",
"bside_ok": bside_ok == "1",
"gps_ok": gps_ok == "1",
"gps_device_present": gps_device_present == "1",
"network_fail_count": int(network_fail_count),
"targeted_restart_count": int(targeted_restart_count),
"full_restart_count": int(full_restart_count),
"last_action": last_action,
"last_action_epoch_ms": int(last_action_epoch_ms or 0),
}
print(json.dumps(payload, separators=(",", ":"), ensure_ascii=False))
PY
}
watchdog_append_event() {
local line=""
[[ -n "${WATCHDOG_EVENT_LOG}" ]] || return 0
if ! line="$(watchdog_emit_json "$@" 2>&1)"; then
if (( WATCHDOG_EVENT_LOG_FAILURE_REPORTED == 0 )); then
blitz_log "${STEP}" "watchdog-event-log" "failure" "path=${WATCHDOG_EVENT_LOG} detail=${line}" 0 || true
WATCHDOG_EVENT_LOG_FAILURE_REPORTED=1
fi
return 0
fi
if ! blitz_jsonl_append_line "${WATCHDOG_EVENT_LOG}" "${line}"; then
if (( WATCHDOG_EVENT_LOG_FAILURE_REPORTED == 0 )); then
blitz_log "${STEP}" "watchdog-event-log" "failure" "path=${WATCHDOG_EVENT_LOG} detail=append-failed" 0 || true
WATCHDOG_EVENT_LOG_FAILURE_REPORTED=1
fi
return 0
fi
WATCHDOG_EVENT_LOG_FAILURE_REPORTED=0
}
watchdog_append_sample() {
local line=""
[[ -n "${WATCHDOG_SAMPLE_LOG}" ]] || return 0
if ! line="$(watchdog_emit_json "$@" 2>&1)"; then
if (( WATCHDOG_SAMPLE_LOG_FAILURE_REPORTED == 0 )); then
blitz_log "${STEP}" "watchdog-sample-log" "failure" "path=${WATCHDOG_SAMPLE_LOG} detail=${line}" 0 || true
WATCHDOG_SAMPLE_LOG_FAILURE_REPORTED=1
fi
return 0
fi
if ! blitz_jsonl_append_line "${WATCHDOG_SAMPLE_LOG}" "${line}"; then
if (( WATCHDOG_SAMPLE_LOG_FAILURE_REPORTED == 0 )); then
blitz_log "${STEP}" "watchdog-sample-log" "failure" "path=${WATCHDOG_SAMPLE_LOG} detail=append-failed" 0 || true
WATCHDOG_SAMPLE_LOG_FAILURE_REPORTED=1
fi
return 0
fi
WATCHDOG_SAMPLE_LOG_FAILURE_REPORTED=0
}
watchdog_record_state_transition() {
local fault_reason="$1"
local recovery_state="$2"
if [[ "${fault_reason}" == "${LAST_REPORTED_FAULT_REASON}" && "${recovery_state}" == "${LAST_REPORTED_RECOVERY_STATE}" ]]; then
return 0
fi
watchdog_append_event "event" "state-transition" "${fault_reason}" "${recovery_state}" "" ""
LAST_REPORTED_FAULT_REASON="${fault_reason}"
LAST_REPORTED_RECOVERY_STATE="${recovery_state}"
}
watchdog_launch_incident() {
local reason="$1"
local unit_name="$2"
blitz_launch_incident_capture \
--source watchdog \
--reason "${reason}" \
--unit "${unit_name}" \
--result failure \
--exit-status 1 2>/dev/null || true
}
set_last_action() {
LAST_ACTION="$1"
LAST_ACTION_EPOCH_MS="$(now_epoch_ms)"
}
targeted_restart_total() {
local total=0
local key
for key in "${!TARGETED_RESTART_WINDOW_COUNT[@]}"; do
total=$(( total + TARGETED_RESTART_WINDOW_COUNT["${key}"] ))
done
printf '%s\n' "${total}"
}
register_targeted_restart() {
local fault_key="$1"
local now_sec
local window_start
local count
now_sec="$(now_epoch_sec)"
window_start="${TARGETED_RESTART_WINDOW_START["${fault_key}"]:-0}"
count="${TARGETED_RESTART_WINDOW_COUNT["${fault_key}"]:-0}"
if (( window_start == 0 || now_sec - window_start > 60 )); then
window_start="${now_sec}"
count=1
else
count=$(( count + 1 ))
fi
TARGETED_RESTART_WINDOW_START["${fault_key}"]="${window_start}"
TARGETED_RESTART_WINDOW_COUNT["${fault_key}"]="${count}"
(( count >= 2 ))
}
record_full_restart() {
local now_sec
now_sec="$(now_epoch_sec)"
if (( FULL_RESTART_WINDOW_START == 0 || now_sec - FULL_RESTART_WINDOW_START > 600 )); then
FULL_RESTART_WINDOW_START="${now_sec}"
FULL_RESTART_WINDOW_COUNT=1
else
FULL_RESTART_WINDOW_COUNT=$(( FULL_RESTART_WINDOW_COUNT + 1 ))
fi
if (( FULL_RESTART_WINDOW_COUNT >= 3 )); then
BACKOFF_UNTIL=$(( now_sec + 60 ))
watchdog_append_event "event" "backoff-enter" "backoff" "backoff" "full_restart_count=${FULL_RESTART_WINDOW_COUNT}" ""
fi
}
restart_bside_targeted() {
local fault_key="$1"
local reason="$2"
local rc
local incident_id=""
if register_targeted_restart "${fault_key}"; then
blitz_log "${STEP}" "escalate-full-restart" "start" "reason=${reason}" 0
watchdog_append_event "event" "escalate-full-restart" "${reason}-escalated" "recovering" "fault_key=${fault_key}" ""
full_restart_stack "${reason}-escalated"
return 0
fi
incident_id="$(watchdog_launch_incident "${reason}" "${B_SIDE_SERVICE}")"
set_last_action "restart-bside"
RECOVERY_ACTION_TAKEN=1
blitz_log "${STEP}" "restart-bside" "start" "reason=${reason}" 0
watchdog_append_event "event" "restart-bside-start" "${reason}" "recovering" "fault_key=${fault_key}" "${incident_id}"
if systemctl restart "${B_SIDE_SERVICE}"; then
blitz_log "${STEP}" "restart-bside" "success" "reason=${reason}" 0
watchdog_append_event "event" "restart-bside-success" "${reason}" "recovering" "fault_key=${fault_key}" "${incident_id}"
return 0
fi
rc=$?
blitz_log "${STEP}" "restart-bside" "failure" "reason=${reason}" "${rc}"
watchdog_append_event "event" "restart-bside-failure" "${reason}" "recovering" "fault_key=${fault_key} rc=${rc}" "${incident_id}"
return "${rc}"
}
full_restart_stack() {
local reason="$1"
local rc
local incident_id=""
incident_id="$(watchdog_launch_incident "${reason}" "blitz-robot.target")"
set_last_action "full-restart"
RECOVERY_ACTION_TAKEN=1
recovery_state="recovering"
fault_reason="${reason}"
blitz_log "${STEP}" "full-restart-stop-bside" "start" "reason=${reason}" 0
watchdog_append_event "event" "full-restart-start" "${reason}" "recovering" "" "${incident_id}"
systemctl stop "${B_SIDE_SERVICE}" || true
if systemctl restart "${ROS_SERVICE}"; then
blitz_log "${STEP}" "full-restart-restart-ros" "success" "reason=${reason}" 0
else
rc=$?
blitz_log "${STEP}" "full-restart-restart-ros" "failure" "reason=${reason}" "${rc}"
record_full_restart
return "${rc}"
fi
if bash "${BOOT_SCRIPT_DIR}/wait-for-unix-socket.sh" --step "${STEP}" --timeout "${BLITZ_ROS_SOCKET_WAIT_SEC}"; then
:
else
rc=$?
blitz_log "${STEP}" "full-restart-wait-socket" "failure" "reason=${reason}" "${rc}"
record_full_restart
return "${rc}"
fi
if systemctl start "${B_SIDE_SERVICE}"; then
blitz_log "${STEP}" "full-restart-start-bside" "success" "reason=${reason}" 0
else
rc=$?
blitz_log "${STEP}" "full-restart-start-bside" "failure" "reason=${reason}" "${rc}"
watchdog_append_event "event" "full-restart-failure" "${reason}" "recovering" "stage=start-bside rc=${rc}" "${incident_id}"
record_full_restart
return "${rc}"
fi
watchdog_append_event "event" "full-restart-success" "${reason}" "recovering" "" "${incident_id}"
record_full_restart
}
network_fault_injected() {
[[ "${BLITZ_WATCHDOG_ALLOW_FAULT_INJECTION}" == "1" && -f "${NETWORK_FAULT_FILE}" ]]
}
resolve_network_interface() {
NETWORK_LAST_INTERFACE="$(blitz_resolve_5g_interface || true)"
if [[ -n "${NETWORK_LAST_INTERFACE}" ]]; then
NETWORK_ROUTE_INTERFACE_LAST_KNOWN="${NETWORK_LAST_INTERFACE}"
return 0
fi
return 1
}
network_route_targets() {
local target
if [[ -n "${BLITZ_TIME_SERVER_IP:-}" ]]; then
printf '%s\n' "${BLITZ_TIME_SERVER_IP}"
fi
for target in ${BLITZ_5G_ROUTE_TARGETS//,/ }; do
if [[ -n "${target}" && "${target}" != "${BLITZ_TIME_SERVER_IP:-}" ]]; then
printf '%s\n' "${target}"
fi
done
}
log_target_route_paths() {
local action="$1"
local target
local route_output
while IFS= read -r target; do
[[ -n "${target}" ]] || continue
route_output="$(ip route get "${target}" 2>&1 | head -n 1 || true)"
if [[ -z "${route_output}" ]]; then
route_output="unresolved"
fi
blitz_log "${STEP}" "route-path" "info" "action=${action} target=${target} route=${route_output}" 0
done < <(network_route_targets)
}
route_output_uses_interface() {
local route_output="$1"
local interface_name="$2"
[[ -n "${interface_name}" ]] || return 1
[[ "${route_output}" == *" dev ${interface_name} "* || "${route_output}" == *" dev ${interface_name}" ]]
}
route_output_uses_gateway() {
local route_output="$1"
local gateway="$2"
[[ -n "${gateway}" ]] || return 1
[[ "${route_output}" == *"via ${gateway}"* ]]
}
route_is_desired_target_route() {
local route_output="$1"
local interface_name="$2"
local gateway="$3"
route_output_uses_interface "${route_output}" "${interface_name}" \
&& route_output_uses_gateway "${route_output}" "${gateway}"
}
route_is_managed_5g_route() {
local route_output="$1"
local interface_name="${2:-}"
local gateway="${3:-}"
if route_output_uses_interface "${route_output}" "${interface_name}"; then
return 0
fi
if route_output_uses_gateway "${route_output}" "${gateway}"; then
return 0
fi
if route_output_uses_gateway "${route_output}" "${BLITZ_5G_GATEWAY:-}"; then
return 0
fi
return 1
}
resolve_route_cleanup_interface() {
local interface_name=""
local info_json="${BLITZ_5G_INFO_JSON:-}"
if [[ -n "${NETWORK_LAST_INTERFACE}" ]]; then
printf '%s\n' "${NETWORK_LAST_INTERFACE}"
return 0
fi
if [[ -n "${NETWORK_ROUTE_INTERFACE_LAST_KNOWN}" ]]; then
printf '%s\n' "${NETWORK_ROUTE_INTERFACE_LAST_KNOWN}"
return 0
fi
interface_name="$(blitz_read_5g_info_interface "${info_json}" || true)"
if [[ -n "${interface_name}" ]]; then
printf '%s\n' "${interface_name}"
return 0
fi
return 1
}
resolve_network_gateway() {
local interface_name="$1"
local default_route
local gateway=""
local tokens=()
local index
default_route="$(ip -o route show default dev "${interface_name}" 2>/dev/null | head -n 1 || true)"
if [[ -n "${default_route}" ]]; then
read -r -a tokens <<< "${default_route}"
for (( index=0; index<${#tokens[@]}-1; index++ )); do
if [[ "${tokens[index]}" == "via" ]]; then
gateway="${tokens[index + 1]}"
break
fi
done
fi
if [[ -n "${gateway}" ]]; then
printf '%s\n' "${gateway}"
return 0
fi
if [[ -n "${BLITZ_5G_GATEWAY:-}" ]]; then
printf '%s\n' "${BLITZ_5G_GATEWAY}"
return 0
fi
return 1
}
sync_target_routes_to_5g() {
local interface_name="$1"
local gateway="${2:-}"
local route_output=""
local updated=0
local target
local rc
if [[ -z "${interface_name}" ]]; then
return 1
fi
if [[ -z "${gateway}" ]]; then
gateway="$(resolve_network_gateway "${interface_name}" || true)"
fi
if [[ -z "${gateway}" ]]; then
blitz_log "${STEP}" "route-sync-gateway" "failure" "interface=${interface_name}" 1
return 1
fi
while IFS= read -r target; do
[[ -n "${target}" ]] || continue
route_output="$(ip route show "${target}/32" 2>/dev/null | head -n 1 || true)"
if [[ -n "${route_output}" ]] && route_is_desired_target_route "${route_output}" "${interface_name}" "${gateway}"; then
continue
fi
if ip route replace "${target}/32" via "${gateway}" dev "${interface_name}"; then
updated=1
blitz_log "${STEP}" "route-sync-target" "success" "target=${target} interface=${interface_name} gateway=${gateway}" 0
else
rc=$?
blitz_log "${STEP}" "route-sync-target" "failure" "target=${target} interface=${interface_name} gateway=${gateway}" "${rc}"
return "${rc}"
fi
done < <(network_route_targets)
if (( updated == 1 )); then
NETWORK_ROUTE_INTERFACE_LAST_KNOWN="${interface_name}"
log_target_route_paths "sync-to-5g"
fi
return 0
}
clear_target_routes_from_5g() {
local interface_name="${1:-}"
local gateway="${2:-}"
local route_output=""
local target
local removed_any=0
local rc
if [[ -z "${interface_name}" ]]; then
interface_name="$(resolve_route_cleanup_interface || true)"
fi
if [[ -z "${gateway}" && -n "${interface_name}" ]]; then
gateway="$(resolve_network_gateway "${interface_name}" || true)"
fi
if [[ -z "${gateway}" ]]; then
gateway="${BLITZ_5G_GATEWAY:-}"
fi
while IFS= read -r target; do
[[ -n "${target}" ]] || continue
route_output="$(ip route show "${target}/32" 2>/dev/null | head -n 1 || true)"
if [[ -z "${route_output}" ]] || ! route_is_managed_5g_route "${route_output}" "${interface_name}" "${gateway}"; then
continue
fi
if ip route del "${target}/32"; then
removed_any=1
blitz_log "${STEP}" "route-clear-target" "success" "target=${target} interface=${interface_name:-unknown} gateway=${gateway:-unknown}" 0
else
rc=$?
blitz_log "${STEP}" "route-clear-target" "failure" "target=${target} interface=${interface_name:-unknown} gateway=${gateway:-unknown}" "${rc}"
return "${rc}"
fi
done < <(network_route_targets)
if (( removed_any == 1 )); then
blitz_log "${STEP}" "route-clear" "success" "interface=${interface_name:-unknown} gateway=${gateway:-unknown}" 0
log_target_route_paths "clear-from-5g"
fi
return 0
}
repair_network_routes() {
local interface_name="$1"
local gateway=""
local route_output
if [[ -z "${interface_name}" ]]; then
return 1
fi
gateway="$(resolve_network_gateway "${interface_name}" || true)"
if [[ -z "${gateway}" ]]; then
blitz_log "${STEP}" "route-repair-gateway" "failure" "interface=${interface_name}" 1
return 1
fi
if ! sync_target_routes_to_5g "${interface_name}" "${gateway}"; then
clear_target_routes_from_5g "${interface_name}" "${gateway}" || true
return 1
fi
route_output="$(blitz_route_ready "${BLITZ_TIME_SERVER_IP}" "${interface_name}" || true)"
if [[ -z "${route_output}" ]]; then
clear_target_routes_from_5g "${interface_name}" "${gateway}" || true
blitz_log "${STEP}" "route-repair-postcheck" "failure" "interface=${interface_name} gateway=${gateway}" 1
return 1
fi
if ! ping -I "${interface_name}" -c 1 -W 2 "${BLITZ_TIME_SERVER_IP}" >/dev/null 2>&1; then
clear_target_routes_from_5g "${interface_name}" "${gateway}" || true
blitz_log "${STEP}" "route-repair-probe" "failure" "interface=${interface_name} target=${BLITZ_TIME_SERVER_IP}" 1
return 1
fi
blitz_log "${STEP}" "route-repair-postcheck" "success" "interface=${interface_name} gateway=${gateway} route=${route_output}" 0
return 0
}
network_is_healthy() {
local route_output
NETWORK_LAST_INTERFACE=""
if network_fault_injected; then
return 1
fi
if ! resolve_network_interface; then
return 1
fi
route_output="$(blitz_route_ready "${BLITZ_TIME_SERVER_IP}" "${NETWORK_LAST_INTERFACE}" || true)"
if [[ -z "${route_output}" ]]; then
return 1
fi
ping -I "${NETWORK_LAST_INTERFACE}" -c 1 -W 2 "${BLITZ_TIME_SERVER_IP}" >/dev/null 2>&1
}
fallback_network_is_healthy() {
local route_output
if [[ -z "${BLITZ_TIME_SERVER_IP:-}" ]]; then
return 1
fi
route_output="$(blitz_route_ready "${BLITZ_TIME_SERVER_IP}" || true)"
if [[ -z "${route_output}" ]]; then
return 1
fi
ping -c 1 -W 2 "${BLITZ_TIME_SERVER_IP}" >/dev/null 2>&1
}
wait_for_network_recovery() {
local timeout_sec="$1"
local waited=0
while (( waited < timeout_sec )); do
if network_is_healthy; then
blitz_log "${STEP}" "network-postcheck" "success" "interface=${NETWORK_LAST_INTERFACE} waited_sec=${waited}" 0
return 0
fi
if (( waited == 0 || waited % 5 == 0 )); then
blitz_log "${STEP}" "network-postcheck" "waiting" "interface=${NETWORK_LAST_INTERFACE:-unresolved} waited_sec=${waited}" 0
fi
sleep 1
waited=$(( waited + 1 ))
done
blitz_log "${STEP}" "network-postcheck" "failure" "interface=${NETWORK_LAST_INTERFACE:-unresolved} timeout_sec=${timeout_sec}" 1
return 1
}
perform_network_recovery() {
local rc=0
local incident_id=""
if resolve_network_interface && repair_network_routes "${NETWORK_LAST_INTERFACE}"; then
set_last_action "route-repair"
RECOVERY_ACTION_TAKEN=1
NETWORK_COOLDOWN_UNTIL=$(( $(now_epoch_sec) + BLITZ_NETWORK_RECOVERY_COOLDOWN_SEC ))
NETWORK_FAIL_COUNT=0
blitz_log "${STEP}" "network-recovery" "success" "mode=route-repair interface=${NETWORK_LAST_INTERFACE}" 0
watchdog_append_event "event" "route-repair-success" "network_or_robot_unreachable" "recovering" "interface=${NETWORK_LAST_INTERFACE}" ""
return 0
fi
incident_id="$(watchdog_launch_incident "network-recovery" "blitz-5g-dial.service")"
set_last_action "network-recovery"
RECOVERY_ACTION_TAKEN=1
blitz_log "${STEP}" "network-recovery" "start" "fail_count=${NETWORK_FAIL_COUNT}" 0
watchdog_append_event "event" "network-recovery-start" "network_or_robot_unreachable" "recovering" "fail_count=${NETWORK_FAIL_COUNT}" "${incident_id}"
systemctl stop "${B_SIDE_SERVICE}" || true
if bash "${BOOT_SCRIPT_DIR}/5g-dial.sh"; then
:
else
rc=$?
blitz_log "${STEP}" "network-redial" "failure" "fail_count=${NETWORK_FAIL_COUNT} script=${BOOT_SCRIPT_DIR}/5g-dial.sh" "${rc}"
watchdog_append_event "event" "network-recovery-failure" "network_or_robot_unreachable" "recovering" "stage=redial rc=${rc}" "${incident_id}"
return "${rc}"
fi
if wait_for_network_recovery "${BLITZ_5G_ROUTE_WAIT_SEC}"; then
:
else
rc=$?
blitz_log "${STEP}" "network-recovery" "failure" "fail_count=${NETWORK_FAIL_COUNT} interface=${NETWORK_LAST_INTERFACE:-unresolved}" "${rc}"
watchdog_append_event "event" "network-recovery-failure" "network_or_robot_unreachable" "recovering" "stage=postcheck rc=${rc}" "${incident_id}"
return "${rc}"
fi
NETWORK_COOLDOWN_UNTIL=$(( $(now_epoch_sec) + BLITZ_NETWORK_RECOVERY_COOLDOWN_SEC ))
NETWORK_FAIL_COUNT=0
watchdog_append_event "event" "network-recovery-success" "network_or_robot_unreachable" "recovering" "interface=${NETWORK_LAST_INTERFACE:-unresolved}" "${incident_id}"
if ros_receiver_healthy "${BLITZ_HEALTH_STALE_SEC}"; then
restart_bside_targeted "network" "network-recovered"
return 0
fi
full_restart_stack "network-recovered-ros-unhealthy"
return 0
}
blitz_load_boot_env
blitz_require_root "${STEP}"
blitz_require_command systemctl "${STEP}"
blitz_require_command stat "${STEP}"
blitz_require_command ping "${STEP}"
blitz_require_command python3 "${STEP}"
blitz_prepare_runtime_dir
blitz_require_run_context
B_SIDE_STATUS_FILE="${BLITZ_RUNTIME_DIR}/b-side-omnid.status.json"
ROS_STATUS_FILE="${BLITZ_RUNTIME_DIR}/ros-receiver.status.json"
WATCHDOG_STATUS_FILE="${BLITZ_RUNTIME_DIR}/watchdog.status.json"
NETWORK_FAULT_FILE="${BLITZ_RUNTIME_DIR}/fault-injection-network-down"
WATCHDOG_EVENT_LOG="${BLITZ_RUN_DIR}/watchdog-events.jsonl"
WATCHDOG_SAMPLE_LOG="${BLITZ_RUN_DIR}/watchdog-samples.jsonl"
while true; do
fault_reason="none"
recovery_state="ok"
network_ok=1
camera_ok=1
ros_ok=1
bside_ok=1
gps_ok=1
gps_device_present=1
RECOVERY_ACTION_TAKEN=0
now_sec="$(now_epoch_sec)"
if gps_monitor_enabled; then
gps_device_present="${GPS_DEVICE_PRESENT_STATE}"
if (( GPS_DEVICE_PRESENT_STATE == 0 || GPS_STACK_ACTIVE_STATE == 0 )); then
gps_ok=0
fi
fi
if (( BACKOFF_UNTIL > now_sec )); then
fault_reason="backoff"
recovery_state="backoff"
watchdog_record_state_transition "${fault_reason}" "${recovery_state}"
write_watchdog_status "${fault_reason}" "${recovery_state}" 0 0 0 0 "${gps_ok}" "${gps_device_present}"
watchdog_append_sample "sample" "loop" "${fault_reason}" "${recovery_state}" "" "" 0 0 0 0 "${gps_ok}" "${gps_device_present}"
sleep "${BLITZ_WATCHDOG_INTERVAL_SEC}"
continue
fi
if (( NETWORK_COOLDOWN_UNTIL > now_sec )); then
recovery_state="recovering"
elif ! network_is_healthy; then
clear_target_routes_from_5g || true
if fallback_network_is_healthy; then
NETWORK_FAIL_COUNT=0
fault_reason="network_fallback_active"
recovery_state="degraded"
blitz_log "${STEP}" "network-check" "fallback" "interface=${NETWORK_LAST_INTERFACE:-unresolved} target=${BLITZ_TIME_SERVER_IP}" 0
if (( NETWORK_PRIMARY_LAST_RETRY_SEC == 0 || now_sec - NETWORK_PRIMARY_LAST_RETRY_SEC >= 10 )); then
NETWORK_PRIMARY_LAST_RETRY_SEC="${now_sec}"
if resolve_network_interface && repair_network_routes "${NETWORK_LAST_INTERFACE}"; then
NETWORK_PRIMARY_LAST_RETRY_SEC=0
fault_reason="none"
recovery_state="ok"
blitz_log "${STEP}" "network-check" "primary-restored" "interface=${NETWORK_LAST_INTERFACE} target=${BLITZ_TIME_SERVER_IP}" 0
log_target_route_paths "primary-restored"
fi
fi
else
network_ok=0
NETWORK_FAIL_COUNT=$(( NETWORK_FAIL_COUNT + 1 ))
fault_reason="network_or_robot_unreachable"
recovery_state="recovering"
blitz_log "${STEP}" "network-check" "failure" "count=${NETWORK_FAIL_COUNT} interface=${NETWORK_LAST_INTERFACE:-unresolved}" 1
if (( NETWORK_FAIL_COUNT >= BLITZ_NETWORK_FAIL_THRESHOLD )); then
perform_network_recovery || true
fi
fi
else
NETWORK_PRIMARY_LAST_RETRY_SEC=0
NETWORK_FAIL_COUNT=0
sync_target_routes_to_5g "${NETWORK_LAST_INTERFACE}" || true
fi
if check_gps_health "${now_sec}"; then
gps_ok=1
else
gps_ok=0
gps_device_present="${GPS_DEVICE_PRESENT_STATE}"
if [[ "${fault_reason}" == "none" ]]; then
if (( GPS_DEVICE_PRESENT_STATE == 0 )); then
fault_reason="gps_device_missing"
else
fault_reason="gps_reconnect_failed"
fi
recovery_state="degraded"
fi
fi
gps_device_present="${GPS_DEVICE_PRESENT_STATE}"
if [[ ! -e "${OMNI_CAMERA_DEVICE}" ]]; then
camera_ok=0
fault_reason="camera_missing"
recovery_state="degraded"
CAMERA_MISSING_PREV=1
CAMERA_RECOVERY_STABLE_COUNT=0
elif (( RECOVERY_ACTION_TAKEN == 0 && CAMERA_MISSING_PREV == 1 )); then
CAMERA_RECOVERY_STABLE_COUNT=$(( CAMERA_RECOVERY_STABLE_COUNT + 1 ))
recovery_state="recovering"
fault_reason="camera_recovered"
if (( CAMERA_RECOVERY_STABLE_COUNT >= 2 )); then
restart_bside_targeted "camera" "camera-reappeared" || true
CAMERA_MISSING_PREV=0
CAMERA_RECOVERY_STABLE_COUNT=0
fi
else
CAMERA_RECOVERY_STABLE_COUNT=0
fi
if (( RECOVERY_ACTION_TAKEN == 0 )) && { ! service_is_active "${B_SIDE_SERVICE}" || ! status_file_fresh "${B_SIDE_STATUS_FILE}" "${BLITZ_HEALTH_STALE_SEC}"; }; then
bside_ok=0
fault_reason="bside_status_stale"
recovery_state="recovering"
restart_bside_targeted "bside" "bside-unhealthy" || true
fi
if (( RECOVERY_ACTION_TAKEN == 0 )) && ! ros_receiver_healthy "${BLITZ_HEALTH_STALE_SEC}"; then
ros_ok=0
fault_reason="ros_receiver_unhealthy"
recovery_state="recovering"
full_restart_stack "ros-unhealthy" || true
fi
watchdog_record_state_transition "${fault_reason}" "${recovery_state}"
write_watchdog_status "${fault_reason}" "${recovery_state}" "${network_ok}" "${camera_ok}" "${ros_ok}" "${bside_ok}" "${gps_ok}" "${gps_device_present}"
watchdog_append_sample "sample" "loop" "${fault_reason}" "${recovery_state}" "" "" "${network_ok}" "${camera_ok}" "${ros_ok}" "${bside_ok}" "${gps_ok}" "${gps_device_present}"
sleep "${BLITZ_WATCHDOG_INTERVAL_SEC}"
done

15
scripts/boot/boot-gate.sh Normal file
View File

@@ -0,0 +1,15 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck disable=SC1091
source "${SCRIPT_DIR}/common.sh"
STEP="boot-gate"
blitz_load_boot_env
blitz_log "${STEP}" "start" "start" "delay_sec=${BLITZ_BOOT_DELAY_SEC}" 0
blitz_log "${STEP}" "delay" "start" "sleep ${BLITZ_BOOT_DELAY_SEC}s before starting Blitz services" 0
sleep "${BLITZ_BOOT_DELAY_SEC}"
blitz_log "${STEP}" "delay" "success" "boot gate released after ${BLITZ_BOOT_DELAY_SEC}s" 0

661
scripts/boot/common.sh Normal file
View File

@@ -0,0 +1,661 @@
#!/usr/bin/env bash
set -euo pipefail
BOOT_SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
DEV_SCRIPT_DIR="$(cd "${BOOT_SCRIPT_DIR}/../dev" && pwd)"
source_with_nounset_off() {
set +u
# shellcheck disable=SC1090
source "$1"
set -u
}
blitz_host_from_addr() {
local value="${1:-}"
if [[ -z "${value}" ]]; then
return 1
fi
if [[ "${value}" == \[*\]:* ]]; then
value="${value#\[}"
printf '%s\n' "${value%%]:*}"
return 0
fi
printf '%s\n' "${value%%:*}"
}
blitz_load_boot_env() {
local env_file
local default_time_server
local dev_run_root
local dev_runtime_dir
if [[ "${BLITZ_BOOT_ENV_LOADED:-0}" == "1" ]]; then
return 0
fi
export BLITZ_BOOT_LOADING_ENV="1"
# shellcheck disable=SC1091
source "${DEV_SCRIPT_DIR}/load-env.sh"
unset BLITZ_BOOT_LOADING_ENV
for env_file in \
"${BOOT_SCRIPT_DIR}/robot-boot.env" \
"${BOOT_SCRIPT_DIR}/robot-boot.env.local"
do
if [[ -f "${env_file}" ]]; then
set -a
# shellcheck disable=SC1090
source "${env_file}"
set +a
fi
done
if declare -F normalize_loaded_env_vars >/dev/null 2>&1; then
normalize_loaded_env_vars
fi
dev_run_root="${OMNISOCKETGO_ROOT}/logs"
dev_runtime_dir="${dev_run_root}/runtime"
if [[ -z "${BLITZ_RUN_ROOT:-}" || "${BLITZ_RUN_ROOT}" == "${dev_run_root}" ]]; then
export BLITZ_RUN_ROOT="/var/log/blitz-robot"
fi
if [[ -z "${BLITZ_RUNTIME_DIR:-}" || "${BLITZ_RUNTIME_DIR}" == "${dev_runtime_dir}" ]]; then
export BLITZ_RUNTIME_DIR="/run/blitz-robot"
fi
if [[ -z "${BLITZ_RUN_CONTEXT_FILE:-}" || "${BLITZ_RUN_CONTEXT_FILE}" == "${dev_runtime_dir}/run-context.env" ]]; then
export BLITZ_RUN_CONTEXT_FILE="${BLITZ_RUNTIME_DIR}/run-context.env"
fi
if [[ -z "${BLITZ_RUN_ID_FILE:-}" || "${BLITZ_RUN_ID_FILE}" == "${dev_runtime_dir}/run-id" ]]; then
export BLITZ_RUN_ID_FILE="${BLITZ_RUNTIME_DIR}/run-id"
fi
if [[ -z "${BLITZ_CURRENT_RUN_LINK:-}" || "${BLITZ_CURRENT_RUN_LINK}" == "${dev_run_root}/current" ]]; then
export BLITZ_CURRENT_RUN_LINK="${BLITZ_RUN_ROOT}/current"
fi
default_time_server="$(blitz_host_from_addr "${ROBOT_SIDE_OMNISOCKET_SERVER_ADDR:-}" || true)"
export BLITZ_BOOT_DELAY_SEC="${BLITZ_BOOT_DELAY_SEC:-30}"
export BLITZ_RUN_ROOT="${BLITZ_RUN_ROOT:-/var/log/blitz-robot}"
export BLITZ_LOG_FILE="${BLITZ_LOG_FILE:-/var/log/blitz-robot/startup.log}"
export BLITZ_RUNTIME_DIR="${BLITZ_RUNTIME_DIR:-/run/blitz-robot}"
export BLITZ_RUN_CONTEXT_FILE="${BLITZ_RUN_CONTEXT_FILE:-${BLITZ_RUNTIME_DIR}/run-context.env}"
export BLITZ_RUN_ID_FILE="${BLITZ_RUN_ID_FILE:-${BLITZ_RUNTIME_DIR}/run-id}"
export BLITZ_CURRENT_RUN_LINK="${BLITZ_CURRENT_RUN_LINK:-${BLITZ_RUN_ROOT}/current}"
export BLITZ_5G_DIAL_DIR="${BLITZ_5G_DIAL_DIR:-${BOOT_SCRIPT_DIR}}"
export BLITZ_5G_SERIAL_PORT="${BLITZ_5G_SERIAL_PORT:-/dev/ttyUSB7}"
export BLITZ_5G_INTERFACE="${BLITZ_5G_INTERFACE:-}"
export BLITZ_5G_MODEM_SUBNET="${BLITZ_5G_MODEM_SUBNET:-192.168.224.0/22}"
export BLITZ_5G_GATEWAY="${BLITZ_5G_GATEWAY:-192.168.225.1}"
export BLITZ_5G_SKIP_DHCP="${BLITZ_5G_SKIP_DHCP:-0}"
export BLITZ_5G_REMOVE_DEFAULT_ROUTE="${BLITZ_5G_REMOVE_DEFAULT_ROUTE:-1}"
export BLITZ_5G_ROUTE_TARGETS="${BLITZ_5G_ROUTE_TARGETS:-106.55.173.235}"
export BLITZ_5G_INFO_JSON="${BLITZ_5G_INFO_JSON:-${BLITZ_5G_DIAL_DIR}/modem_network_info.json}"
export BLITZ_5G_DISABLE_INTERFACES="${BLITZ_5G_DISABLE_INTERFACES:-}"
export BLITZ_5G_SERIAL_WAIT_SEC="${BLITZ_5G_SERIAL_WAIT_SEC:-60}"
export BLITZ_5G_ROUTE_WAIT_SEC="${BLITZ_5G_ROUTE_WAIT_SEC:-30}"
export BLITZ_TIME_SERVER_IP="${BLITZ_TIME_SERVER_IP:-${default_time_server}}"
export BLITZ_ROS_USER="${BLITZ_ROS_USER:-nvidia}"
export BLITZ_ROS_SOCKET_WAIT_SEC="${BLITZ_ROS_SOCKET_WAIT_SEC:-20}"
export BLITZ_WATCHDOG_INTERVAL_SEC="${BLITZ_WATCHDOG_INTERVAL_SEC:-5}"
export BLITZ_HEALTH_STALE_SEC="${BLITZ_HEALTH_STALE_SEC:-15}"
export BLITZ_OMNID_THREAD_HEARTBEAT_TIMEOUT_SEC="${BLITZ_OMNID_THREAD_HEARTBEAT_TIMEOUT_SEC:-15}"
export BLITZ_KCP_STATS_INTERVAL_MS="${BLITZ_KCP_STATS_INTERVAL_MS:-1000}"
export BLITZ_CONTROL_LATENCY_LOG_ENABLED="${BLITZ_CONTROL_LATENCY_LOG_ENABLED:-1}"
export BLITZ_CONTROL_LATENCY_LOG_SAMPLE_MOD="${BLITZ_CONTROL_LATENCY_LOG_SAMPLE_MOD:-100}"
export BLITZ_5G_LINK_LOG_INTERVAL_SEC="${BLITZ_5G_LINK_LOG_INTERVAL_SEC:-5}"
export BLITZ_JSONL_FLUSH_INTERVAL_MS="${BLITZ_JSONL_FLUSH_INTERVAL_MS:-1000}"
export BLITZ_JSONL_FLUSH_BYTES="${BLITZ_JSONL_FLUSH_BYTES:-262144}"
export BLITZ_JSONL_ROTATE_BYTES="${BLITZ_JSONL_ROTATE_BYTES:-134217728}"
export BLITZ_JSONL_ROTATE_FILES="${BLITZ_JSONL_ROTATE_FILES:-8}"
export BLITZ_INCIDENT_COMMAND_TIMEOUT_SEC="${BLITZ_INCIDENT_COMMAND_TIMEOUT_SEC:-5}"
export BLITZ_INCIDENT_TOTAL_TIMEOUT_SEC="${BLITZ_INCIDENT_TOTAL_TIMEOUT_SEC:-30}"
export BLITZ_NETWORK_FAIL_THRESHOLD="${BLITZ_NETWORK_FAIL_THRESHOLD:-3}"
export BLITZ_NETWORK_RECOVERY_COOLDOWN_SEC="${BLITZ_NETWORK_RECOVERY_COOLDOWN_SEC:-30}"
export BLITZ_GPS_MONITOR_ENABLED="${BLITZ_GPS_MONITOR_ENABLED:-1}"
export BLITZ_GPS_DEVICE_GLOB="${BLITZ_GPS_DEVICE_GLOB:-/dev/ttyCH341USB*}"
export BLITZ_GPS_CHECK_INTERVAL_SEC="${BLITZ_GPS_CHECK_INTERVAL_SEC:-10}"
export BLITZ_GPS_RESTART_UNITS="${BLITZ_GPS_RESTART_UNITS:-gpsd.socket gpsd.service}"
export BLITZ_WATCHDOG_ALLOW_FAULT_INJECTION="${BLITZ_WATCHDOG_ALLOW_FAULT_INJECTION:-0}"
export BLITZ_BOOT_ENV_LOADED="1"
}
blitz_timestamp() {
date '+%Y-%m-%d %H:%M:%S%z'
}
blitz_sanitize_detail() {
local detail="${1:-}"
detail="${detail//$'\n'/ ; }"
detail="${detail//$'\r'/ }"
printf '%s' "${detail}"
}
blitz_log() {
local step="${1:-unknown-step}"
local action="${2:-unknown-action}"
local result="${3:-info}"
local details="${4:-}"
local exit_code="${5:-0}"
printf '%s | %s | %s | %s | %s | %s\n' \
"$(blitz_timestamp)" \
"${step}" \
"${action}" \
"${result}" \
"$(blitz_sanitize_detail "${details}")" \
"${exit_code}"
}
blitz_join_cmd() {
local cmd=()
local arg
for arg in "$@"; do
cmd+=("$(printf '%q' "${arg}")")
done
printf '%s' "${cmd[*]}"
}
blitz_require_command() {
local command_name="$1"
local step="${2:-precheck}"
if command -v "${command_name}" >/dev/null 2>&1; then
blitz_log "${step}" "require-command" "success" "command=${command_name}" 0
return 0
fi
blitz_log "${step}" "require-command" "failure" "missing command: ${command_name}" 127
return 127
}
blitz_require_file() {
local path="$1"
local step="${2:-precheck}"
if [[ -f "${path}" ]]; then
blitz_log "${step}" "require-file" "success" "path=${path}" 0
return 0
fi
blitz_log "${step}" "require-file" "failure" "missing file: ${path}" 1
return 1
}
blitz_require_executable() {
local path="$1"
local step="${2:-precheck}"
if [[ -x "${path}" ]]; then
blitz_log "${step}" "require-executable" "success" "path=${path}" 0
return 0
fi
blitz_log "${step}" "require-executable" "failure" "missing executable: ${path}" 1
return 1
}
blitz_require_root() {
local step="${1:-precheck}"
if [[ "${EUID}" -eq 0 ]]; then
blitz_log "${step}" "require-root" "success" "uid=${EUID}" 0
return 0
fi
blitz_log "${step}" "require-root" "failure" "root privileges are required" 1
return 1
}
blitz_run() {
local step="$1"
local action="$2"
local rc
shift 2
blitz_log "${step}" "${action}" "start" "$(blitz_join_cmd "$@")" 0
if "$@"; then
blitz_log "${step}" "${action}" "success" "$(blitz_join_cmd "$@")" 0
return 0
else
rc=$?
fi
blitz_log "${step}" "${action}" "failure" "$(blitz_join_cmd "$@")" "${rc}"
return "${rc}"
}
blitz_route_ready() {
local target_ip="$1"
local expected_interface="${2:-}"
local route_output
route_output="$(ip route get "${target_ip}" 2>&1 || true)"
if [[ -z "${route_output}" ]]; then
return 1
fi
if [[ "${route_output}" == *"unreachable"* || "${route_output}" == *"prohibit"* ]]; then
return 1
fi
if [[ -n "${expected_interface}" && "${route_output}" != *" dev ${expected_interface} "* && "${route_output}" != *" dev ${expected_interface}" ]]; then
return 1
fi
printf '%s\n' "${route_output}"
return 0
}
blitz_interface_exists() {
local interface_name="${1:-}"
if [[ -z "${interface_name}" ]]; then
return 1
fi
ip link show dev "${interface_name}" >/dev/null 2>&1
}
blitz_read_5g_info_interface() {
local info_json="$1"
if [[ -z "${info_json}" || ! -f "${info_json}" ]]; then
return 1
fi
python3 - "${info_json}" <<'PY'
import json
import sys
path = sys.argv[1]
try:
with open(path, "r", encoding="utf-8") as handle:
payload = json.load(handle)
except Exception:
raise SystemExit(1)
interface = str(payload.get("interface") or "").strip()
if not interface:
raise SystemExit(1)
print(interface)
PY
}
blitz_detect_5g_interface_from_subnet() {
local modem_subnet="${1:-${BLITZ_5G_MODEM_SUBNET:-}}"
if [[ -z "${modem_subnet}" ]]; then
return 1
fi
python3 - "${modem_subnet}" <<'PY'
import ipaddress
import json
import subprocess
import sys
subnet = ipaddress.ip_network(sys.argv[1], strict=False)
skip = {"lo", "docker0", "l4tbr0"}
def priority(name: str) -> tuple[int, str]:
if name.startswith("enx"):
return (0, name)
if name.startswith("wwan"):
return (1, name)
if name.startswith("usb"):
return (2, name)
if name.startswith("eth"):
return (3, name)
return (9, name)
try:
output = subprocess.check_output(["ip", "-j", "-4", "addr", "show"], text=True)
payload = json.loads(output)
except Exception:
raise SystemExit(1)
candidates = []
for item in payload:
ifname = str(item.get("ifname") or "").strip()
if not ifname or ifname in skip:
continue
for addr in item.get("addr_info") or []:
if addr.get("family") != "inet":
continue
local = addr.get("local")
prefixlen = addr.get("prefixlen")
if not local or prefixlen is None:
continue
try:
iface = ipaddress.ip_interface(f"{local}/{prefixlen}")
except ValueError:
continue
if iface.ip in subnet:
candidates.append((priority(ifname), ifname))
break
if not candidates:
raise SystemExit(1)
candidates.sort(key=lambda item: item[0])
print(candidates[0][1])
PY
}
blitz_refresh_5g_info_json() {
local interface_name="$1"
local info_json="${2:-${BLITZ_5G_INFO_JSON:-}}"
if [[ -z "${interface_name}" || -z "${info_json}" ]]; then
return 1
fi
python3 - "${interface_name}" "${info_json}" <<'PY'
import json
import os
import subprocess
import sys
interface_name = sys.argv[1]
path = sys.argv[2]
try:
output = subprocess.check_output(["ip", "-j", "addr", "show", "dev", interface_name], text=True)
payload = json.loads(output)
except Exception:
raise SystemExit(1)
if not payload:
raise SystemExit(1)
item = payload[0]
ipv4 = []
ipv6 = []
for addr in item.get("addr_info") or []:
local = addr.get("local")
prefixlen = addr.get("prefixlen")
family = addr.get("family")
if not local or prefixlen is None:
continue
entry = f"{local}/{prefixlen}"
if family == "inet":
ipv4.append(entry)
elif family == "inet6":
ipv6.append(entry)
data = {
"interface": interface_name,
"ipv4": ipv4,
"ipv6": ipv6,
}
parent = os.path.dirname(path)
if parent:
os.makedirs(parent, exist_ok=True)
temp_path = f"{path}.tmp.{os.getpid()}"
with open(temp_path, "w", encoding="utf-8") as handle:
json.dump(data, handle, ensure_ascii=False, indent=2)
os.replace(temp_path, path)
PY
}
blitz_resolve_5g_interface() {
local explicit_interface="${BLITZ_5G_INTERFACE:-}"
local info_json="${BLITZ_5G_INFO_JSON:-}"
local recorded_interface=""
local detected_interface=""
if [[ -n "${explicit_interface}" ]]; then
if blitz_interface_exists "${explicit_interface}"; then
printf '%s\n' "${explicit_interface}"
return 0
fi
return 1
fi
recorded_interface="$(blitz_read_5g_info_interface "${info_json}" || true)"
if [[ -n "${recorded_interface}" ]] && blitz_interface_exists "${recorded_interface}"; then
printf '%s\n' "${recorded_interface}"
return 0
fi
detected_interface="$(blitz_detect_5g_interface_from_subnet || true)"
if [[ -n "${detected_interface}" ]]; then
if [[ "${detected_interface}" != "${recorded_interface}" ]]; then
blitz_refresh_5g_info_json "${detected_interface}" "${info_json}" >/dev/null 2>&1 || true
fi
printf '%s\n' "${detected_interface}"
return 0
fi
return 1
}
blitz_prepare_runtime_dir() {
local runtime_dir
blitz_load_boot_env
runtime_dir="${BLITZ_RUNTIME_DIR}"
mkdir -p "${runtime_dir}"
if [[ "${EUID}" -eq 0 ]]; then
chown "root:${BLITZ_ROS_USER}" "${runtime_dir}"
chmod 0775 "${runtime_dir}"
else
chmod 0775 "${runtime_dir}" 2>/dev/null || true
fi
blitz_log "runtime-dir" "prepare" "success" "path=${runtime_dir}" 0
}
blitz_prepare_run_root() {
local run_root
local run_dir
local incidents_dir
blitz_load_boot_env
run_root="${BLITZ_RUN_ROOT}"
run_dir="${run_root}/runs"
incidents_dir="${run_root}/incidents"
mkdir -p "${run_dir}" "${incidents_dir}"
if [[ "${EUID}" -eq 0 ]]; then
chown -R "root:${BLITZ_ROS_USER}" "${run_root}" 2>/dev/null || true
chmod 0775 "${run_root}" "${run_dir}" "${incidents_dir}" 2>/dev/null || true
fi
}
blitz_load_run_context_env() {
local context_file="${1:-${BLITZ_RUN_CONTEXT_FILE:-}}"
if [[ -z "${context_file}" || ! -f "${context_file}" ]]; then
return 1
fi
set -a
# shellcheck disable=SC1090
source "${context_file}"
set +a
return 0
}
blitz_read_run_id() {
local run_id_file="${BLITZ_RUN_ID_FILE:-}"
if [[ -z "${run_id_file}" || ! -f "${run_id_file}" ]]; then
return 1
fi
tr -d '\r\n' < "${run_id_file}"
}
blitz_utc_compact_timestamp() {
date -u '+%Y%m%dT%H%M%SZ'
}
blitz_new_run_id() {
printf '%s\n' "$(blitz_utc_compact_timestamp)"
}
blitz_new_incident_id() {
local prefix="${1:-incident}"
printf '%s-%s-%d\n' "${prefix}" "$(blitz_utc_compact_timestamp)" "$$"
}
blitz_new_instance_id() {
printf '%s-%d\n' "$(blitz_utc_compact_timestamp)" "$$"
}
blitz_git_commit() {
git -C "${OMNISOCKETGO_ROOT}" rev-parse HEAD 2>/dev/null || true
}
blitz_git_dirty_flag() {
if git -C "${OMNISOCKETGO_ROOT}" diff --quiet --ignore-submodules=dirty >/dev/null 2>&1; then
printf '0\n'
return 0
fi
printf '1\n'
}
blitz_write_run_context() {
local run_id="$1"
local run_dir="$2"
local boot_id="$3"
local context_file="${BLITZ_RUN_CONTEXT_FILE}"
local id_file="${BLITZ_RUN_ID_FILE}"
local temp_context
local temp_info
local commit_hash
local dirty_flag
local started_at
commit_hash="$(blitz_git_commit)"
dirty_flag="$(blitz_git_dirty_flag)"
started_at="$(date -u '+%Y-%m-%dT%H:%M:%SZ')"
temp_context="${context_file}.tmp.$$"
temp_info="${run_dir}/run-info.json.tmp.$$"
mkdir -p "${run_dir}"
printf '%s\n' "${run_id}" > "${id_file}"
cat > "${temp_context}" <<EOF
BLITZ_RUN_ID=${run_id}
BLITZ_RUN_DIR=${run_dir}
BLITZ_BOOT_ID=${boot_id}
BLITZ_RUN_ROOT=${BLITZ_RUN_ROOT}
EOF
mv -f "${temp_context}" "${context_file}"
python3 - "${temp_info}" "${run_id}" "${run_dir}" "${boot_id}" "${started_at}" "${commit_hash}" "${dirty_flag}" "${HOSTNAME:-$(hostname)}" <<'PY'
import json
import os
import sys
path, run_id, run_dir, boot_id, started_at, commit_hash, dirty_flag, hostname = sys.argv[1:9]
payload = {
"run_id": run_id,
"run_dir": run_dir,
"boot_id": boot_id,
"started_at": started_at,
"hostname": hostname,
"git_commit": commit_hash,
"git_dirty": dirty_flag == "1",
"env": {
key: os.environ.get(key, "")
for key in sorted(os.environ)
if key.startswith(("BLITZ_", "OMNI_", "ROBOT_RECEIVER_"))
},
}
with open(path, "w", encoding="utf-8") as handle:
json.dump(payload, handle, ensure_ascii=False, indent=2, sort_keys=True)
PY
mv -f "${temp_info}" "${run_dir}/run-info.json"
ln -sfn "${run_dir}" "${BLITZ_CURRENT_RUN_LINK}"
}
blitz_init_run_context() {
local run_id
local boot_id
local run_dir
blitz_load_boot_env
blitz_prepare_runtime_dir
blitz_prepare_run_root
run_id="$(blitz_new_run_id)"
boot_id="$(cat /proc/sys/kernel/random/boot_id 2>/dev/null || blitz_new_run_id)"
run_dir="${BLITZ_RUN_ROOT}/runs/${run_id}"
export BLITZ_RUN_ID="${run_id}"
export BLITZ_RUN_DIR="${run_dir}"
export BLITZ_BOOT_ID="${boot_id}"
blitz_write_run_context "${run_id}" "${run_dir}" "${boot_id}"
blitz_log "run-context" "init" "success" "run_id=${run_id} run_dir=${run_dir}" 0
}
blitz_require_run_context() {
blitz_load_boot_env
if blitz_load_run_context_env; then
return 0
fi
blitz_log "run-context" "load" "failure" "missing ${BLITZ_RUN_CONTEXT_FILE}" 1
return 1
}
blitz_ensure_instance_id() {
if [[ -n "${BLITZ_INSTANCE_ID:-}" ]]; then
return 0
fi
export BLITZ_INSTANCE_ID="$(blitz_new_instance_id)"
}
blitz_jsonl_rotate_if_needed() {
local path="$1"
local max_bytes="${2:-${BLITZ_JSONL_ROTATE_BYTES:-0}}"
local max_files="${3:-${BLITZ_JSONL_ROTATE_FILES:-0}}"
local size=0
local index
if [[ -z "${path}" || ! -f "${path}" ]]; then
return 0
fi
if (( max_bytes <= 0 || max_files <= 0 )); then
return 0
fi
size="$(stat -c %s "${path}" 2>/dev/null || echo 0)"
if (( size < max_bytes )); then
return 0
fi
for (( index=max_files; index>=1; index-- )); do
if [[ "${index}" -eq "${max_files}" ]]; then
rm -f "${path}.${index}"
fi
if [[ -f "${path}.${index}" ]]; then
mv -f "${path}.${index}" "${path}.$(( index + 1 ))"
fi
done
mv -f "${path}" "${path}.1"
}
blitz_jsonl_append_line() {
local path="$1"
local line="$2"
mkdir -p "$(dirname "${path}")"
blitz_jsonl_rotate_if_needed "${path}"
printf '%s\n' "${line}" >> "${path}"
}
blitz_launch_incident_capture() {
local launch_script="${BOOT_SCRIPT_DIR}/blitz-incident-capture-launch.sh"
if [[ ! -f "${launch_script}" ]]; then
return 1
fi
/bin/bash "${launch_script}" "$@" >/dev/null 2>&1 || return 1
}

View File

@@ -0,0 +1,53 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck disable=SC1091
source "${SCRIPT_DIR}/common.sh"
STEP="disable"
SYSTEMD_DEST_DIR="/etc/systemd/system"
UNITS=(
"blitz-watchdog.service"
"blitz-5g-link-logger.service"
"blitz-b-side-omnid.service"
"blitz-ros-receiver.service"
"blitz-5g-dial.service"
"blitz-run-context.service"
"blitz-boot-gate.service"
"blitz-robot.target"
)
stop_unit_if_present() {
local unit_name="$1"
local unit_path="${SYSTEMD_DEST_DIR}/${unit_name}"
if [[ ! -f "${unit_path}" ]]; then
return 0
fi
blitz_run "${STEP}" "stop-unit" systemctl stop "${unit_name}" || true
}
disable_unit_if_present() {
local unit_name="$1"
local unit_path="${SYSTEMD_DEST_DIR}/${unit_name}"
if [[ ! -f "${unit_path}" ]]; then
return 0
fi
blitz_run "${STEP}" "disable-unit" systemctl disable "${unit_name}" || true
}
blitz_load_boot_env
blitz_require_root "${STEP}"
blitz_require_command systemctl "${STEP}"
for unit_name in "${UNITS[@]}"; do
stop_unit_if_present "${unit_name}"
done
for unit_name in "${UNITS[@]}"; do
disable_unit_if_present "${unit_name}"
done
blitz_log "${STEP}" "complete" "success" "boot chain stopped and disabled; next reboot will not auto-start blitz services" 0

View File

@@ -0,0 +1,71 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck disable=SC1091
source "${SCRIPT_DIR}/common.sh"
SYSTEMD_TEMPLATE_DIR="${SCRIPT_DIR}/systemd"
SYSTEMD_DEST_DIR="/etc/systemd/system"
render_template() {
local template_path="$1"
local output_path="$2"
sed \
-e "s|@OMNISOCKETGO_ROOT@|${OMNISOCKETGO_ROOT}|g" \
-e "s|@BLITZ_LOG_FILE@|${BLITZ_LOG_FILE}|g" \
-e "s|@BLITZ_ROS_USER@|${BLITZ_ROS_USER}|g" \
"${template_path}" > "${output_path}"
}
install_unit() {
local template_name="$1"
local temp_output
temp_output="$(mktemp)"
render_template "${SYSTEMD_TEMPLATE_DIR}/${template_name}" "${temp_output}"
install -m 0644 "${temp_output}" "${SYSTEMD_DEST_DIR}/${template_name%.in}"
rm -f "${temp_output}"
blitz_log "install" "install-unit" "success" "unit=${SYSTEMD_DEST_DIR}/${template_name%.in}" 0
}
remove_unit_if_present() {
local unit_name="$1"
local unit_path="${SYSTEMD_DEST_DIR}/${unit_name}"
if [[ ! -f "${unit_path}" ]]; then
return 0
fi
systemctl disable --now "${unit_name}" >/dev/null 2>&1 || true
rm -f "${unit_path}"
blitz_log "install" "remove-unit" "success" "unit=${unit_path}" 0
}
blitz_load_boot_env
blitz_require_root "install"
blitz_require_command install "install"
blitz_require_command systemctl "install"
mkdir -p "${SYSTEMD_DEST_DIR}"
install -d -m 0755 "$(dirname "${BLITZ_LOG_FILE}")"
touch "${BLITZ_LOG_FILE}"
chmod 0644 "${BLITZ_LOG_FILE}"
blitz_log "install" "prepare-log-file" "success" "log_file=${BLITZ_LOG_FILE}" 0
blitz_prepare_runtime_dir
blitz_prepare_run_root
install_unit "blitz-boot-gate.service.in"
install_unit "blitz-run-context.service.in"
install_unit "blitz-5g-dial.service.in"
install_unit "blitz-5g-link-logger.service.in"
install_unit "blitz-ros-receiver.service.in"
install_unit "blitz-b-side-omnid.service.in"
install_unit "blitz-watchdog.service.in"
install_unit "blitz-robot.target.in"
remove_unit_if_present "blitz-time-sync.service"
blitz_run "install" "daemon-reload" systemctl daemon-reload
blitz_run "install" "enable-target" systemctl enable blitz-robot.target
blitz_log "install" "complete" "success" "run systemctl start blitz-robot.target to launch immediately" 0

View File

@@ -0,0 +1,9 @@
{
"interface": "enxb8f72c9e179a",
"ipv4": [
"192.168.225.160/22"
],
"ipv6": [
"fe80::52ae:a1c8:a9bb:a9a8/64"
]
}

View File

@@ -0,0 +1,12 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck disable=SC1091
source "${SCRIPT_DIR}/common.sh"
STEP="runtime-dir"
blitz_load_boot_env
blitz_prepare_runtime_dir
blitz_log "${STEP}" "complete" "success" "runtime_dir=${BLITZ_RUNTIME_DIR}" 0

854
scripts/boot/rndis_dial.py Normal file
View File

@@ -0,0 +1,854 @@
#!/usr/bin/env python3
"""RM520N-GL RNDIS 自动拨号脚本。
流程:
1. 检测 USB 设备是否存在
2. 打开 AT 口并检查 SIM 状态
3. 配置 RNDIS 模式: AT+QCFG="usbnet",3
4. 重启模块: AT+CFUN=1,1
5. 等待模块重新枚举并识别 5G 网卡
6. 如果网卡还没有 IPv4, 自动尝试 DHCP
用法:
sudo python3 rndis_dial.py
sudo python3 rndis_dial.py --serial-port /dev/ttyUSB7
sudo python3 rndis_dial.py --interface eth0 #指定网口
"""
from __future__ import annotations
import argparse
import errno
import ipaddress
import json
import os
import select
import shlex
import shutil
import subprocess
import sys
import termios
import time
import tty
USB_ID = "2c7c:0801"
DEFAULT_SERIAL_PORT = "/dev/ttyUSB7" #串口设备节点
DEFAULT_BAUD_RATE = 115200
CHECK_INTERVAL = 2
SERIAL_READ_TIMEOUT = 0.2
SERIAL_POLL_INTERVAL = 0.1
SERIAL_SETTLE_DELAY = 0.3
AT_SYNC_RETRIES = 3
AT_SYNC_TIMEOUT = 2.5
# 示例地址 192.168.225.38/22 所在网段。
# 拨号成功后会用这个网段来最终确认哪个接口是 5G 模组。
DEFAULT_MODEM_SUBNET = "192.168.224.0/22"
DEFAULT_MODEM_GATEWAY = "192.168.225.1"
DEFAULT_PUBLIC_TARGETS = ("81.70.156.140", "106.55.173.235")
DEFAULT_INFO_JSON = "modem_network_info.json"
SKIP_INTERFACES = {"lo", "docker0", "l4tbr0"}
BAUD_RATE_MAP = {
9600: termios.B9600,
19200: termios.B19200,
38400: termios.B38400,
57600: termios.B57600,
115200: termios.B115200,
}
def run_cmd(cmd, timeout=30, check=False):
print(f"[CMD] {format_shell_cmd(cmd)}")
result = subprocess.run(
cmd,
capture_output=True,
text=True,
timeout=timeout,
check=False,
)
output = (result.stdout or "") + (result.stderr or "")
if check and result.returncode != 0:
raise RuntimeError(f"命令执行失败: {' '.join(cmd)}\n{output.strip()}")
return result.returncode, output.strip()
def format_shell_cmd(cmd):
"""把命令参数格式化成可直接阅读的 shell 形式。"""
return " ".join(shlex.quote(part) for part in cmd)
def parse_ipv4_address(value):
try:
return str(ipaddress.IPv4Address(value))
except ipaddress.AddressValueError as exc:
raise argparse.ArgumentTypeError(f"无效的 IPv4 地址: {value}") from exc
def dedupe_keep_order(values):
seen = set()
result = []
for value in values:
if value in seen:
continue
seen.add(value)
result.append(value)
return result
def require_root():
if os.geteuid() != 0:
print("[FAIL] 请使用 sudo 运行此脚本")
sys.exit(1)
def require_commands():
missing = [cmd for cmd in ("lsusb", "ip") if shutil.which(cmd) is None]
if missing:
print(f"[FAIL] 缺少系统命令: {', '.join(missing)}")
sys.exit(1)
def usb_device_present():
# 1. 第一次检测 lsusb确认模块已经被系统识别。
"""通过 lsusb 检查模块是否已经被系统识别。"""
code, output = run_cmd(["lsusb"], timeout=10)
if code != 0:
return False, output
for line in output.splitlines():
if USB_ID in line:
return True, line.strip()
return False, output
def wait_for_usb_device(expected_present, timeout):
"""等待模块 USB 设备下线或重新上线。"""
deadline = time.time() + timeout
last_seen = ""
while time.time() < deadline:
present, detail = usb_device_present()
last_seen = detail
if present == expected_present:
return True, detail
time.sleep(CHECK_INTERVAL)
return False, last_seen
def wait_for_path(path, timeout):
"""等待串口节点或其他路径重新出现。"""
deadline = time.time() + timeout
while time.time() < deadline:
if os.path.exists(path):
return True
time.sleep(1)
return False
def normalize_serial_output(text):
"""整理串口原始输出,便于后续匹配关键字。"""
cleaned = text.replace("\r", "\n")
return "\n".join(line for line in cleaned.splitlines() if line.strip()).strip()
def serial_response_complete(text):
if not text:
return False
for line in reversed(text.splitlines()):
stripped = line.strip()
if stripped == "OK":
return True
if "ERROR" in stripped:
return True
return False
class RawSerialSession:
"""使用 Python 标准库直接控制 Linux 串口,尽量贴近 stty/raw 行为。"""
def __init__(self, port, baudrate):
if baudrate not in BAUD_RATE_MAP:
raise RuntimeError(f"不支持的波特率: {baudrate}")
self.port = port
self.fd = None
self._original_attrs = None
try:
self.fd = os.open(port, os.O_RDWR | os.O_NOCTTY | os.O_NONBLOCK)
self._original_attrs = termios.tcgetattr(self.fd)
tty.setraw(self.fd, when=termios.TCSANOW)
attrs = termios.tcgetattr(self.fd)
attrs[0] = 0
attrs[1] = 0
attrs[2] &= ~(termios.PARENB | termios.CSTOPB | termios.CSIZE)
attrs[2] |= termios.CS8 | termios.CLOCAL | termios.CREAD
attrs[3] = 0
attrs[4] = BAUD_RATE_MAP[baudrate]
attrs[5] = BAUD_RATE_MAP[baudrate]
attrs[6][termios.VMIN] = 0
attrs[6][termios.VTIME] = 0
termios.tcsetattr(self.fd, termios.TCSANOW, attrs)
termios.tcflush(self.fd, termios.TCIOFLUSH)
except OSError as exc:
self.close()
raise RuntimeError(f"无法打开串口 {port}: {exc}") from exc
@property
def is_open(self):
return self.fd is not None
def reset_input_buffer(self):
if self.fd is not None:
termios.tcflush(self.fd, termios.TCIFLUSH)
def reset_output_buffer(self):
if self.fd is not None:
termios.tcflush(self.fd, termios.TCOFLUSH)
def write(self, data):
if self.fd is None:
raise OSError("串口未打开")
sent = 0
while sent < len(data):
try:
written = os.write(self.fd, data[sent:])
except BlockingIOError:
time.sleep(SERIAL_POLL_INTERVAL)
continue
if written <= 0:
raise OSError("串口写入返回 0 字节")
sent += written
def flush(self):
if self.fd is not None:
termios.tcdrain(self.fd)
def read_chunk(self, timeout, size=4096):
if self.fd is None:
return b""
ready, _, _ = select.select([self.fd], [], [], timeout)
if not ready:
return b""
try:
return os.read(self.fd, size)
except BlockingIOError:
return b""
def close(self):
if self.fd is None:
return
fd = self.fd
self.fd = None
if self._original_attrs is not None:
try:
termios.tcsetattr(fd, termios.TCSANOW, self._original_attrs)
except termios.error:
pass
os.close(fd)
def read_serial_output(session, timeout, allow_disconnect=False):
"""在给定时间窗口内读取 AT 响应,直到出现结束标记或超时。"""
deadline = time.time() + timeout
chunks = []
saw_terminal_line = False
last_data_time = None
while time.time() < deadline:
try:
chunk = session.read_chunk(timeout=min(SERIAL_READ_TIMEOUT, max(deadline - time.time(), 0)))
except OSError as exc:
if allow_disconnect and exc.errno in (errno.EIO, errno.ENODEV, errno.EBADF):
break
raise RuntimeError(f"读取串口响应失败: {exc}") from exc
if chunk:
chunks.append(chunk.decode(errors="ignore"))
last_data_time = time.time()
current_text = normalize_serial_output("".join(chunks))
if serial_response_complete(current_text):
saw_terminal_line = True
continue
if saw_terminal_line and last_data_time is not None and time.time() - last_data_time >= SERIAL_SETTLE_DELAY:
break
time.sleep(SERIAL_POLL_INTERVAL)
return normalize_serial_output("".join(chunks))
def open_serial_session(port):
"""打开 AT 串口会话,后续在同一连接里顺序发送多条命令。"""
ser = RawSerialSession(port=port, baudrate=DEFAULT_BAUD_RATE)
time.sleep(0.2)
ser.reset_input_buffer()
ser.reset_output_buffer()
return ser
def execute_serial_step(ser, command, expect=None, timeout=3, allow_disconnect=False):
"""在当前串口会话里发送一条 AT 命令并校验响应。"""
print(f"[AT] {command}")
try:
ser.reset_input_buffer()
ser.write((command + "\r").encode())
ser.flush()
except OSError as exc:
raise RuntimeError(f"AT 命令 `{command}` 发送失败: {exc}") from exc
response = read_serial_output(ser, timeout=timeout, allow_disconnect=allow_disconnect)
if response:
print(response)
else:
print("(无响应)")
if "ERROR" in response:
raise RuntimeError(f"AT 命令 `{command}` 执行失败: {response}")
if expect and expect not in response and not allow_disconnect:
raise RuntimeError(f"AT 命令 `{command}` 响应异常: {response or '空响应'}")
return response
def synchronize_at_channel(ser):
"""某些模组 AT 口在刚打开时需要先用 AT 做一次预热。"""
last_error = None
for attempt in range(1, AT_SYNC_RETRIES + 1):
try:
print(f"[INFO] 预热 AT 通道,第 {attempt}")
response = execute_serial_step(ser, "AT", expect="OK", timeout=AT_SYNC_TIMEOUT)
if "OK" in response:
return
except RuntimeError as exc:
last_error = exc
time.sleep(0.5)
if last_error is not None:
raise RuntimeError(
"AT 通道预热失败,请确认串口是否是 AT 命令口,例如 /dev/ttyUSB2"
) from last_error
raise RuntimeError("AT 通道预热失败")
def run_serial_steps(port, steps):
"""在同一个串口会话里顺序执行多条 AT 命令。"""
ser = None
try:
ser = open_serial_session(port)
synchronize_at_channel(ser)
for step in steps:
execute_serial_step(
ser,
step["command"],
expect=step.get("expect"),
timeout=step.get("timeout", 3),
allow_disconnect=step.get("allow_disconnect", False),
)
finally:
if ser is not None and ser.is_open:
ser.close()
def configure_rndis(port):
# 2. 用 Python 串口库在同一会话里顺序执行拨号相关 AT 命令。
"""切换到 RNDIS 模式并触发模块重启。"""
if not wait_for_path(port, timeout=30):
raise RuntimeError(f"串口不存在: {port}")
print(f"[OK] 串口已打开: {port}")
run_serial_steps(
port,
[
{"command": "AT+CPIN?", "expect": "READY", "timeout": 4},
{"command": 'AT+QCFG="usbnet",3', "expect": "OK", "timeout": 5},
{"command": "AT+CFUN=1,1", "timeout": 4, "allow_disconnect": True},
],
)
def get_interfaces():
"""列出当前系统中的接口,过滤明显无关的本地接口。"""
interfaces = []
try:
for name in os.listdir("/sys/class/net"):
if name in SKIP_INTERFACES or is_usb_gadget(name):
continue
interfaces.append(name)
except FileNotFoundError:
return []
return sorted(interfaces)
def is_usb_gadget(iface):
"""过滤 Jetson 自己暴露出去的 gadget 网卡。"""
sysfs_path = f"/sys/class/net/{iface}"
if not os.path.exists(sysfs_path):
return False
return "/gadget/" in os.path.realpath(sysfs_path)
def is_usb_network_interface(iface):
"""判断接口是否来自 USB 设备。"""
device_path = f"/sys/class/net/{iface}/device"
if not os.path.exists(device_path):
return False
real_path = os.path.realpath(device_path)
return "/usb" in real_path
def get_ipv4_addrs():
"""返回所有接口的 IPv4/CIDR 信息。"""
code, output = run_cmd(["ip", "-o", "-4", "addr", "show"], timeout=10)
if code != 0:
return {}
ipv4_addrs = {}
for line in output.splitlines():
parts = line.split()
if len(parts) >= 4:
iface = parts[1]
ipv4_addrs.setdefault(iface, []).append(parts[3])
return ipv4_addrs
def get_ipv6_addrs():
"""返回所有接口的 IPv6/CIDR 信息。"""
code, output = run_cmd(["ip", "-o", "-6", "addr", "show"], timeout=10)
if code != 0:
return {}
ipv6_addrs = {}
for line in output.splitlines():
parts = line.split()
if len(parts) >= 4:
iface = parts[1]
ipv6_addrs.setdefault(iface, []).append(parts[3])
return ipv6_addrs
def interface_priority(iface):
if iface.startswith("wwan"):
return 0
if iface.startswith("enx"):
return 1
if iface.startswith("usb"):
return 2
return 10
def list_usb_network_candidates(explicit_iface=None):
"""列出拨号前可尝试的 USB 网卡候选项。
这里不靠固定网口名确认 5G 模组,只是在还没有 IP 的时候先缩小范围。
真正确认模组接口,会在 DHCP 之后根据 IP 网段判断。
"""
candidates = []
for iface in get_interfaces():
if explicit_iface and iface != explicit_iface:
continue
if not is_usb_network_interface(iface):
continue
candidates.append((interface_priority(iface), iface))
if not candidates:
return []
candidates.sort()
return [iface for _, iface in candidates]
def ip_in_subnet(ip_cidr, subnet):
"""判断接口地址是否落在指定网段内。"""
try:
return ipaddress.ip_interface(ip_cidr).ip in ipaddress.ip_network(subnet, strict=False)
except ValueError:
return False
def find_interface_by_subnet(modem_subnet, explicit_iface=None):
"""拨号成功后,通过 IP 网段确认 5G 模组网卡。"""
candidates = []
for iface, addrs in get_ipv4_addrs().items():
if iface in SKIP_INTERFACES or is_usb_gadget(iface):
continue
if not is_usb_network_interface(iface):
continue
if explicit_iface and iface != explicit_iface:
continue
matched_addrs = [addr for addr in addrs if ip_in_subnet(addr, modem_subnet)]
if matched_addrs:
candidates.append((interface_priority(iface), iface, matched_addrs))
if not candidates:
return None, []
candidates.sort()
_, iface, matched_addrs = candidates[0]
return iface, matched_addrs
def wait_for_usb_candidates(explicit_iface=None, timeout=90):
"""等待模块枚举出 USB 网卡候选项。"""
deadline = time.time() + timeout
while time.time() < deadline:
candidates = list_usb_network_candidates(explicit_iface=explicit_iface)
if candidates:
return candidates
time.sleep(CHECK_INTERVAL)
return []
def bring_interface_up(iface):
code, output = run_cmd(["ip", "link", "set", "dev", iface, "up"], timeout=10)
if code != 0:
raise RuntimeError(f"拉起网卡失败: {iface}\n{output}")
def renew_dhcp(iface):
dhclient = shutil.which("dhclient")
udhcpc = shutil.which("udhcpc")
if dhclient:
print(f"[INFO] 使用 dhclient 为 {iface} 获取 IP")
code, output = run_cmd(["dhclient", "-1", "-v", iface], timeout=45)
return code == 0, output
if udhcpc:
print(f"[INFO] 使用 udhcpc 为 {iface} 获取 IP")
code, output = run_cmd(["udhcpc", "-n", "-q", "-i", iface], timeout=45)
return code == 0, output
return False, "系统中未找到 dhclient 或 udhcpc"
def get_default_routes(iface):
code, output = run_cmd(["ip", "-o", "route", "show", "default", "dev", iface], timeout=10)
if code != 0:
return []
return [line.strip() for line in output.splitlines() if line.strip()]
def resolve_gateway(iface, fallback_gateway):
for route in get_default_routes(iface):
tokens = route.split()
for index, token in enumerate(tokens[:-1]):
if token == "via":
gateway = tokens[index + 1]
print(f"[INFO] 从默认路由检测到 {iface} 网关: {gateway}")
return gateway
print(f"[INFO] 未从默认路由检测到 {iface} 网关,回退到 {fallback_gateway}")
return fallback_gateway
def delete_default_routes(iface):
removed = 0
while True:
routes = get_default_routes(iface)
if not routes:
return removed
deleted_this_round = False
for route in routes:
cmd = ["ip", "route", "del", *route.split()]
code, output = run_cmd(cmd, timeout=10)
if code != 0:
code, output = run_cmd(["ip", "route", "del", "default", "dev", iface], timeout=10)
if code != 0:
raise RuntimeError(f"删除默认路由失败: {iface}\n{output}")
removed += 1
deleted_this_round = True
if not deleted_this_round:
raise RuntimeError(f"未能删除 {iface} 的默认路由")
def install_host_routes(iface, gateway, targets):
for target in dedupe_keep_order(targets):
cmd = ["ip", "route", "replace", f"{target}/32", "via", gateway, "dev", iface]
code, output = run_cmd(cmd, timeout=10)
if code != 0:
raise RuntimeError(f"添加主机路由失败: {target} via {gateway} dev {iface}\n{output}")
print(f"[OK] 已添加主机路由: {target}/32 via {gateway} dev {iface}")
def enforce_route_policy(iface, fallback_gateway, route_targets):
gateway = resolve_gateway(iface, fallback_gateway)
removed = delete_default_routes(iface)
print(f"[OK] 已删除 {iface} 上的 {removed} 条默认路由")
if route_targets:
install_host_routes(iface, gateway, route_targets)
else:
print(f"[WARN] {iface} 未配置任何主机路由目标5G 将不再承载公网流量")
def ensure_ipv4(iface):
"""为指定接口申请 IPv4 地址。"""
ipv4_addrs = get_ipv4_addrs().get(iface, [])
if ipv4_addrs:
return ipv4_addrs
bring_interface_up(iface)
ok, output = renew_dhcp(iface)
if output:
print(output)
if not ok:
return []
return get_ipv4_addrs().get(iface, [])
def acquire_modem_interface(modem_subnet, explicit_iface=None):
"""通过 DHCP + IP 网段识别真正的模组接口。"""
iface, matched_addrs = find_interface_by_subnet(
modem_subnet,
explicit_iface=explicit_iface,
)
if iface:
return iface, matched_addrs
candidates = list_usb_network_candidates(explicit_iface=explicit_iface)
if not candidates:
raise RuntimeError("未找到可尝试 DHCP 的 USB 网卡候选项")
print(f"[INFO] 当前 USB 网卡候选项: {', '.join(candidates)}")
for iface in candidates:
print(f"[INFO] 尝试为 {iface} 获取 IPv4")
ensure_ipv4(iface)
matched_iface, matched_addrs = find_interface_by_subnet(
modem_subnet,
explicit_iface=explicit_iface,
)
if matched_iface:
return matched_iface, matched_addrs
return None, []
def print_interface_status(iface):
# 3. 拨号成功后,打印 ip/ifconfig确认模组网口和地址。
print(f"[OK] 检测到 5G 网卡: {iface}")
code, output = run_cmd(["ip", "-4", "addr", "show", "dev", iface], timeout=10)
if code == 0 and output:
print(output)
if shutil.which("ifconfig"):
code, ifconfig_output = run_cmd(["ifconfig", iface], timeout=10)
if code == 0 and ifconfig_output:
print("\n===== ifconfig =====")
print(ifconfig_output)
def save_interface_info(iface, output_file=DEFAULT_INFO_JSON):
"""把网口名称、IPv4、IPv6 保存到 JSON 文件。"""
data = {
"interface": iface,
"ipv4": get_ipv4_addrs().get(iface, []),
"ipv6": get_ipv6_addrs().get(iface, []),
}
with open(output_file, "w", encoding="utf-8") as json_file:
json.dump(data, json_file, ensure_ascii=False, indent=2)
print(f"[OK] 网口信息已保存到 {output_file}")
def ping_target(iface, target, count=3, timeout=15):
"""通过指定网口 ping 一个目标。"""
code, output = run_cmd(
["ping", "-I", iface, "-c", str(count), "-W", "3", target],
timeout=timeout,
)
return code == 0, output
def print_ping_summary(output):
"""只打印 ping 的关键结果。"""
for line in output.splitlines():
if "packets transmitted" in line or "rtt " in line or "Destination " in line:
print(line)
def verify_connectivity(iface, gateway=DEFAULT_MODEM_GATEWAY, targets=DEFAULT_PUBLIC_TARGETS, retry_interval=3, max_wait=45):
# 4. 最后先 ping 模组网关,再重试公网连通性。
"""先测模组网关,再轮询公网目标地址。"""
ok, output = ping_target(iface, gateway, count=3, timeout=15)
if ok:
print(f"[OK] {iface} 可到达模组网关 {gateway}")
print_ping_summary(output)
else:
print(f"[WARN] {iface} 无法到达模组网关 {gateway}")
if output:
print(output)
return False
deadline = time.time() + max_wait
attempt = 1
while True:
for target in targets:
ok, output = ping_target(iface, target, count=3, timeout=15)
if ok:
print(f"[OK] {iface} 可通过 {target}")
print_ping_summary(output)
return True
print(f"[WARN] 第 {attempt} 次 Ping {target} 失败")
if output:
print_ping_summary(output)
if time.time() >= deadline:
print(f"[WARN] {iface}{max_wait} 秒内仍无法连通 {', '.join(targets)}")
return False
attempt += 1
time.sleep(retry_interval)
def ping_via_interface(iface, targets=DEFAULT_PUBLIC_TARGETS):
"""保留原调用点,内部走完整连通性检查。"""
return verify_connectivity(iface, targets=targets)
def parse_args():
parser = argparse.ArgumentParser(description="RM520N-GL RNDIS 自动拨号脚本")
parser.add_argument(
"--serial-port",
default=DEFAULT_SERIAL_PORT,
help=f"AT 串口路径,默认 {DEFAULT_SERIAL_PORT}",
)
parser.add_argument(
"--interface",
help="指定期望的 5G 网卡名,例如 eth0",
)
parser.add_argument(
"--modem-subnet",
default=DEFAULT_MODEM_SUBNET,
help=f"拨号成功后用于识别模组接口的 IPv4 网段,默认 {DEFAULT_MODEM_SUBNET}",
)
parser.add_argument(
"--gateway",
type=parse_ipv4_address,
default=DEFAULT_MODEM_GATEWAY,
help=f"5G 模组网关地址,默认 {DEFAULT_MODEM_GATEWAY}",
)
parser.add_argument(
"--skip-dhcp",
action="store_true",
help="只等待 USB 网卡出现,不主动申请 IPv4",
)
parser.add_argument(
"--remove-default-route",
action="store_true",
help="拨号成功后删除 5G 接口上的默认路由,只保留显式主机路由",
)
parser.add_argument(
"--route-target",
action="append",
default=[],
type=parse_ipv4_address,
help="拨号完成后通过 5G 接口保留的 IPv4 主机路由目标,可重复传入",
)
return parser.parse_args()
def main():
args = parse_args()
require_root()
require_commands()
print("===== RM520N-GL RNDIS 自动拨号 =====")
print(f"[INFO] 目标模组网段: {args.modem_subnet}")
#1.检测 lsusb确认是否识别到模块
present, detail = usb_device_present()
if not present:
print(f"[FAIL] 未检测到模块 USB 设备 {USB_ID}")
if detail:
print(detail)
sys.exit(1)
print(f"[OK] 检测到 USB 设备: {detail}")
print(f"[INFO] 使用 AT 口: {args.serial_port}")
#2.进行 Python 串口拨号
try:
configure_rndis(args.serial_port)
print("[INFO] 已发送 AT+CFUN=1,1等待模块重启")
disappeared, _ = wait_for_usb_device(expected_present=False, timeout=25)
if disappeared:
print("[OK] 模块已下线,继续等待重新枚举")
else:
print("[WARN] 未观察到模块下线,继续等待重新枚举")
reappeared, detail = wait_for_usb_device(expected_present=True, timeout=90)
if not reappeared:
print(f"[FAIL] 模块重启后未重新枚举: {USB_ID}")
sys.exit(1)
print(f"[OK] 模块已重新枚举: {detail}")
candidates = wait_for_usb_candidates(explicit_iface=args.interface, timeout=90)
if not candidates:
print("[FAIL] 未检测到 5G 模组枚举出的 USB 网卡")
sys.exit(1)
if args.skip_dhcp:
print(f"[INFO] 当前 USB 网卡候选项: {', '.join(candidates)}")
iface, ipv4_addrs = find_interface_by_subnet(
args.modem_subnet,
explicit_iface=args.interface,
)
if not iface:
print(f"[WARN] 当前还没有接口拿到目标网段 {args.modem_subnet} 的地址")
sys.exit(1)
else:
iface, ipv4_addrs = acquire_modem_interface(
args.modem_subnet,
explicit_iface=args.interface,
)
if not iface:
print(f"[FAIL] 未找到落在目标网段 {args.modem_subnet} 内的模组接口")
sys.exit(1)
print_interface_status(iface)
if ipv4_addrs:
for addr in ipv4_addrs:
print(f"[OK] {iface} 已获取 IPv4: {addr}")
save_interface_info(iface)
route_targets = dedupe_keep_order(args.route_target)
if args.remove_default_route:
enforce_route_policy(iface, args.gateway, route_targets)
connectivity_targets = route_targets or list(DEFAULT_PUBLIC_TARGETS)
ping_via_interface(iface, targets=connectivity_targets)
print(f"[DONE] RNDIS 拨号完成,可执行: sudo python3 speed_test.py {iface}")
return
print(f"[WARN] {iface} 已出现,但还没有 IPv4 地址")
print(f"[INFO] 可手动检查: ip addr show {iface}")
sys.exit(1)
except (RuntimeError, subprocess.TimeoutExpired) as exc:
print(f"[FAIL] {exc}")
sys.exit(1)
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,60 @@
# Boot-time settings for the robot-side autostart chain.
# Override machine-specific values in robot-boot.env.local.
BLITZ_BOOT_DELAY_SEC="30"
BLITZ_RUN_ROOT="/var/log/blitz-robot"
BLITZ_LOG_FILE="/var/log/blitz-robot/startup.log"
BLITZ_RUNTIME_DIR="/run/blitz-robot"
BLITZ_RUN_CONTEXT_FILE="${BLITZ_RUNTIME_DIR}/run-context.env"
BLITZ_RUN_ID_FILE="${BLITZ_RUNTIME_DIR}/run-id"
BLITZ_CURRENT_RUN_LINK="${BLITZ_RUN_ROOT}/current"
BLITZ_5G_DIAL_DIR="${OMNISOCKETGO_ROOT}/scripts/boot"
BLITZ_5G_SERIAL_PORT="/dev/ttyUSB2"
BLITZ_5G_INTERFACE=""
BLITZ_5G_MODEM_SUBNET="192.168.224.0/22"
BLITZ_5G_GATEWAY="192.168.225.1"
BLITZ_5G_SKIP_DHCP="0"
BLITZ_5G_REMOVE_DEFAULT_ROUTE="1"
BLITZ_5G_ROUTE_TARGETS="106.55.173.235"
BLITZ_5G_INFO_JSON="${OMNISOCKETGO_ROOT}/scripts/boot/modem_network_info.json"
BLITZ_5G_SERIAL_WAIT_SEC="60"
BLITZ_5G_ROUTE_WAIT_SEC="30"
# Leave empty to fall back to the host part of ROBOT_SIDE_OMNISOCKET_SERVER_ADDR.
BLITZ_TIME_SERVER_IP="81.70.156.140"
BLITZ_ROS_USER="nvidia"
BLITZ_ROS_SOCKET_WAIT_SEC="20"
BLITZ_WATCHDOG_INTERVAL_SEC="5"
BLITZ_HEALTH_STALE_SEC="15"
BLITZ_OMNID_THREAD_HEARTBEAT_TIMEOUT_SEC="15"
BLITZ_KCP_STATS_INTERVAL_MS="1000"
BLITZ_CONTROL_LATENCY_LOG_ENABLED="1"
BLITZ_CONTROL_LATENCY_LOG_SAMPLE_MOD="100"
BLITZ_CONTROL_ACK_SAMPLE_MOD="10"
BLITZ_VIDEO_STAGE_LOG_ENABLED="1"
BLITZ_VIDEO_STAGE_LOG_SAMPLE_MOD="10"
BLITZ_5G_LINK_LOG_INTERVAL_SEC="5"
BLITZ_JSONL_FLUSH_INTERVAL_MS="1000"
BLITZ_JSONL_FLUSH_BYTES="262144"
BLITZ_JSONL_ROTATE_BYTES="134217728"
BLITZ_JSONL_ROTATE_FILES="8"
# Log one normal relay packet out of every N packets. Drop events still log immediately.
OMNI_RELAY_PACKET_LOG_SAMPLE_EVERY="200"
BLITZ_INCIDENT_COMMAND_TIMEOUT_SEC="5"
BLITZ_INCIDENT_TOTAL_TIMEOUT_SEC="30"
BLITZ_NETWORK_FAIL_THRESHOLD="3"
BLITZ_NETWORK_RECOVERY_COOLDOWN_SEC="30"
BLITZ_GPS_MONITOR_ENABLED="1"
BLITZ_GPS_DEVICE_GLOB="/dev/ttyCH341USB*"
BLITZ_GPS_CHECK_INTERVAL_SEC="10"
BLITZ_GPS_RESTART_UNITS="gpsd.socket gpsd.service"
BLITZ_WATCHDOG_ALLOW_FAULT_INJECTION="0"
OMNI_CAMERA_DEVICE="/dev/v4l/by-path/platform-a80aa10000.usb-usb-0:3.2:1.4-video-index0"
# Boot units run b_side_omnid as root directly, so nested sudo must stay off.
B_SIDE_OMNID_USE_SUDO="0"
OMNI_CONTROL_ACK_PEER_ID="peer-b-ctrl-ack"
OMNI_CONTROL_ACK_TARGET_PEER="peer-a-ctrl-ack"

View File

@@ -0,0 +1,18 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck disable=SC1091
source "${SCRIPT_DIR}/common.sh"
STEP="5g-link-logger-service"
blitz_load_boot_env
blitz_require_run_context
export OMNI_BOOT_MODE="1"
export BLITZ_INSTANCE_ID="${BLITZ_INSTANCE_ID:-$(blitz_new_instance_id)}"
export BLITZ_5G_LINK_LOG_PATH="${BLITZ_5G_LINK_LOG_PATH:-${BLITZ_RUN_DIR}/b-5g-link-quality.${BLITZ_INSTANCE_ID}.jsonl}"
blitz_log "${STEP}" "start" "start" "exec bash ${OMNISOCKETGO_ROOT}/scripts/boot/blitz-5g-link-logger.sh" 0
exec bash "${OMNISOCKETGO_ROOT}/scripts/boot/blitz-5g-link-logger.sh"

View File

@@ -0,0 +1,19 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck disable=SC1091
source "${SCRIPT_DIR}/common.sh"
STEP="b-side-omnid"
blitz_load_boot_env
blitz_require_run_context
blitz_require_executable "${OMNISOCKETGO_ROOT}/bin/b_side_omnid" "${STEP}"
export OMNI_BOOT_MODE="1"
export B_SIDE_OMNID_USE_SUDO="0"
blitz_log "${STEP}" "start" "start" "exec bash ${OMNISOCKETGO_ROOT}/scripts/dev/start-b-side-omnid.sh" 0
exec bash "${OMNISOCKETGO_ROOT}/scripts/dev/start-b-side-omnid.sh"

View File

@@ -0,0 +1,18 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck disable=SC1091
source "${SCRIPT_DIR}/common.sh"
STEP="ros-receiver"
blitz_load_boot_env
blitz_require_run_context
blitz_require_file "/opt/ros/${ROS_DISTRO}/setup.bash" "${STEP}"
blitz_require_file "${ROS_CONTROL_PY_DIR}/install/setup.bash" "${STEP}"
export OMNI_BOOT_MODE="1"
blitz_log "${STEP}" "start" "start" "exec bash ${OMNISOCKETGO_ROOT}/scripts/dev/start-ros-receiver.sh" 0
exec bash "${OMNISOCKETGO_ROOT}/scripts/dev/start-ros-receiver.sh"

View File

@@ -0,0 +1,15 @@
[Unit]
Description=Blitz robot 5G dial
PartOf=blitz-robot.target
After=blitz-run-context.service
Requires=blitz-run-context.service
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/bash @OMNISOCKETGO_ROOT@/scripts/boot/5g-dial.sh
StandardOutput=append:@BLITZ_LOG_FILE@
StandardError=append:@BLITZ_LOG_FILE@
[Install]
WantedBy=blitz-robot.target

View File

@@ -0,0 +1,19 @@
[Unit]
Description=Blitz robot 5G link logger
PartOf=blitz-robot.target
After=blitz-run-context.service blitz-5g-dial.service
Requires=blitz-run-context.service
Wants=blitz-run-context.service blitz-5g-dial.service
[Service]
Type=simple
EnvironmentFile=-/run/blitz-robot/run-context.env
ExecStartPre=/bin/bash @OMNISOCKETGO_ROOT@/scripts/boot/prepare-runtime-dir.sh
ExecStart=/bin/bash @OMNISOCKETGO_ROOT@/scripts/boot/start-5g-link-logger-service.sh
Restart=always
RestartSec=5
StandardOutput=append:@BLITZ_LOG_FILE@
StandardError=append:@BLITZ_LOG_FILE@
[Install]
WantedBy=blitz-robot.target

View File

@@ -0,0 +1,20 @@
[Unit]
Description=Blitz robot b-side omnid
PartOf=blitz-robot.target
After=blitz-run-context.service blitz-5g-dial.service blitz-ros-receiver.service
Requires=blitz-run-context.service
Wants=blitz-run-context.service blitz-5g-dial.service blitz-ros-receiver.service
[Service]
Type=simple
EnvironmentFile=-/run/blitz-robot/run-context.env
ExecStartPre=/bin/bash @OMNISOCKETGO_ROOT@/scripts/boot/prepare-runtime-dir.sh
ExecStart=/bin/bash @OMNISOCKETGO_ROOT@/scripts/boot/start-b-side-omnid-service.sh
ExecStopPost=/bin/bash -lc 'if [[ "${SERVICE_RESULT:-success}" != "success" ]]; then exec /bin/bash "@OMNISOCKETGO_ROOT@/scripts/boot/blitz-incident-capture-launch.sh" --source exec-stop-post --unit "%n" --result "${SERVICE_RESULT:-}" --exit-status "${EXIT_STATUS:-}" --reason b-side-service-exit; fi'
Restart=always
RestartSec=2
StandardOutput=append:@BLITZ_LOG_FILE@
StandardError=append:@BLITZ_LOG_FILE@
[Install]
WantedBy=blitz-robot.target

View File

@@ -0,0 +1,15 @@
[Unit]
Description=Blitz robot boot gate
PartOf=blitz-robot.target
After=multi-user.target network-online.target
Wants=network-online.target
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/bash @OMNISOCKETGO_ROOT@/scripts/boot/boot-gate.sh
StandardOutput=append:@BLITZ_LOG_FILE@
StandardError=append:@BLITZ_LOG_FILE@
[Install]
WantedBy=blitz-robot.target

View File

@@ -0,0 +1,13 @@
[Unit]
Description=Blitz robot boot chain
Wants=blitz-boot-gate.service
Wants=blitz-run-context.service
Wants=blitz-5g-dial.service
Wants=blitz-5g-link-logger.service
Wants=blitz-ros-receiver.service
Wants=blitz-b-side-omnid.service
Wants=blitz-watchdog.service
After=multi-user.target
[Install]
WantedBy=multi-user.target

View File

@@ -0,0 +1,23 @@
[Unit]
Description=Blitz robot ROS receiver
PartOf=blitz-robot.target
After=blitz-run-context.service blitz-5g-dial.service
Requires=blitz-run-context.service
Wants=blitz-run-context.service blitz-5g-dial.service
[Service]
Type=simple
User=@BLITZ_ROS_USER@
PermissionsStartOnly=true
EnvironmentFile=-/run/blitz-robot/run-context.env
ExecStartPre=/bin/bash @OMNISOCKETGO_ROOT@/scripts/boot/prepare-runtime-dir.sh
ExecStart=/bin/bash @OMNISOCKETGO_ROOT@/scripts/boot/start-ros-receiver-service.sh
ExecStartPost=/bin/bash @OMNISOCKETGO_ROOT@/scripts/boot/wait-for-unix-socket.sh --step ros-receiver
ExecStopPost=/bin/bash -lc 'if [[ "${SERVICE_RESULT:-success}" != "success" ]]; then exec /bin/bash "@OMNISOCKETGO_ROOT@/scripts/boot/blitz-incident-capture-launch.sh" --source exec-stop-post --unit "%n" --result "${SERVICE_RESULT:-}" --exit-status "${EXIT_STATUS:-}" --reason ros-service-exit; fi'
Restart=always
RestartSec=2
StandardOutput=append:@BLITZ_LOG_FILE@
StandardError=append:@BLITZ_LOG_FILE@
[Install]
WantedBy=blitz-robot.target

View File

@@ -0,0 +1,15 @@
[Unit]
Description=Blitz robot run context
PartOf=blitz-robot.target
After=blitz-boot-gate.service
Requires=blitz-boot-gate.service
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/bash @OMNISOCKETGO_ROOT@/scripts/boot/blitz-run-context.sh
StandardOutput=append:@BLITZ_LOG_FILE@
StandardError=append:@BLITZ_LOG_FILE@
[Install]
WantedBy=blitz-robot.target

View File

@@ -0,0 +1,19 @@
[Unit]
Description=Blitz robot health watchdog
PartOf=blitz-robot.target
After=blitz-run-context.service blitz-b-side-omnid.service blitz-ros-receiver.service
Requires=blitz-run-context.service
Wants=blitz-run-context.service blitz-b-side-omnid.service blitz-ros-receiver.service
[Service]
Type=simple
EnvironmentFile=-/run/blitz-robot/run-context.env
ExecStartPre=/bin/bash @OMNISOCKETGO_ROOT@/scripts/boot/prepare-runtime-dir.sh
ExecStart=/bin/bash @OMNISOCKETGO_ROOT@/scripts/boot/blitz-watchdog.sh
Restart=always
RestartSec=5
StandardOutput=append:@BLITZ_LOG_FILE@
StandardError=append:@BLITZ_LOG_FILE@
[Install]
WantedBy=blitz-robot.target

View File

@@ -0,0 +1,49 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck disable=SC1091
source "${SCRIPT_DIR}/common.sh"
STEP="ros-receiver"
SOCKET_PATH=""
TIMEOUT_SEC=""
while [[ $# -gt 0 ]]; do
case "$1" in
--path)
SOCKET_PATH="$2"
shift 2
;;
--timeout)
TIMEOUT_SEC="$2"
shift 2
;;
--step)
STEP="$2"
shift 2
;;
*)
blitz_log "${STEP}" "wait-socket-arg" "failure" "unknown argument: $1" 2
exit 2
;;
esac
done
blitz_load_boot_env
SOCKET_PATH="${SOCKET_PATH:-${ROBOT_RECEIVER_LOCAL_SOCKET_PATH}}"
TIMEOUT_SEC="${TIMEOUT_SEC:-${BLITZ_ROS_SOCKET_WAIT_SEC}}"
blitz_log "${STEP}" "wait-socket" "start" "path=${SOCKET_PATH} timeout_sec=${TIMEOUT_SEC}" 0
for (( waited=0; waited< TIMEOUT_SEC; waited++ )); do
if [[ -S "${SOCKET_PATH}" ]]; then
blitz_log "${STEP}" "wait-socket" "success" "path=${SOCKET_PATH} waited_sec=${waited}" 0
exit 0
fi
sleep 1
done
blitz_log "${STEP}" "wait-socket" "failure" "path=${SOCKET_PATH} timeout_sec=${TIMEOUT_SEC}" 1
exit 1

156
scripts/dev/README.md Normal file
View File

@@ -0,0 +1,156 @@
# Dev Startup Scripts
This directory lives inside the `OmniSocketGo` repo and acts as the main launch entry for the whole local setup.
Default layout:
```text
~/Documents/
OmniSocketGo/
scripts/dev/
robot-command-center/
```
The scripts assume:
- `OmniSocketGo` is the current repo
- `robot-command-center` is a sibling directory next to it
If your `robot-command-center` is elsewhere, set `ROBOT_COMMAND_CENTER_ROOT` in `robot-remote.env.local`.
`start-backend.sh` and `start-frontend.sh` need that repo; `start-ros-receiver.sh` and `start-b-side-omnid.sh` do not.
## Files
- `robot-remote.env`: shared defaults for backend, frontend, ROS, and `b_side_omnid`
- `robot-remote.env.local`: optional local override file loaded after `robot-remote.env`
- `load-env.sh`: loads the shared environment into the current shell
- `apply-camera-controls.sh`: applies the camera preset before `b_side_omnid` starts
- `start-backend.sh`: starts Django ASGI with `uvicorn`
- `log-network-summary.py`: polls the backend `network/latest` API and appends compact JSONL snapshots
- `start-frontend.sh`: starts the Vite dev server
- `start-ros-receiver.sh`: starts the ROS2 `udp_teleop_bridge` receiver
- `start-b-side-omnid.sh`: applies camera controls, then starts `./bin/b_side_omnid` and uses `sudo -E` by default
- `start-dev-tmux.sh`: optional one-command `tmux` launcher for all four processes
## Usage
Run these from the `OmniSocketGo` repo root:
```bash
bash scripts/dev/start-backend.sh
bash scripts/dev/start-frontend.sh
bash scripts/dev/start-ros-receiver.sh
bash scripts/dev/start-b-side-omnid.sh
```
If you prefer one command and use `tmux`:
```bash
bash scripts/dev/start-dev-tmux.sh
```
If you only want the shared environment for manual commands:
```bash
source scripts/dev/load-env.sh
```
When you launch via `start-*.sh`, you do not need to manually `export` the variables from
`robot-remote.env` or `robot-remote.env.local`. `load-env.sh` loads those files with `set -a`,
so the variables are exported automatically for the child process. Manual `export` is only needed
if you bypass these scripts and start binaries directly from a clean shell.
## Customizing
Edit `scripts/dev/robot-remote.env` for shared changes such as:
- `ROBOT_COMMAND_CENTER_ROOT`
- `CONTROL_SIDE_OMNISOCKET_SERVER_ADDR`
- `CONTROL_SIDE_OMNISOCKET_RELAY_VIA`
- `ROBOT_SIDE_OMNISOCKET_SERVER_ADDR`
- `ROBOT_SIDE_OMNISOCKET_RELAY_VIA`
- `VITE_API_BASE_URL`
- `OMNI_CAMERA_DEVICE`
- `OMNI_CAMERA_PROFILE`
- `OMNI_CAMERA_BRIGHTNESS`
- `OMNI_CAMERA_CUSTOM_CTRL`
- `OMNI_CAMERA_VERIFY`
- `OMNI_VIDEO_PEER_ID`
- `OMNI_CONTROL_PEER_ID`
- `OMNI_VIDEO_SOFT_BACKPRESSURE_SEGMENTS`
- `OMNI_VIDEO_HARD_BACKPRESSURE_SEGMENTS`
- `OMNI_VIDEO_HARD_BACKPRESSURE_HOLD_MS`
- `OMNI_CONTROL_SERVER_IDLE_RECONNECT_MS`
- `OMNI_VIDEO_MAX_FRAME_AGE_MS`
- `OMNISOCKET_TELEMETRY_PEER_ID`
- `OMNISOCKET_TELEMETRY_INTERVAL_MS`
- `OMNISOCKET_TELEMETRY_STALE_AFTER_MS`
- `OMNI_NETWORK_SUMMARY_LOG_ENABLED`
- `OMNI_NETWORK_SUMMARY_LOG_PATH`
- `OMNI_NETWORK_SUMMARY_LOG_INTERVAL_MS`
Camera presets use `v4l2-ctl` from `v4l-utils` on the robot side.
Role mapping:
- `start-backend.sh` uses the `CONTROL_SIDE_*` address pair
- `start-b-side-omnid.sh` uses the `ROBOT_SIDE_*` address pair
- `start-b-side-omnid.sh` also applies the `OMNI_CAMERA_*` preset before the daemon opens the camera
- `start-ros-receiver.sh` defaults to the robot-side address pair, but with `transport=unix_dgram` it usually does not need the server address
New repair knobs:
- `OMNI_VIDEO_SOFT_BACKPRESSURE_SEGMENTS`, `OMNI_VIDEO_HARD_BACKPRESSURE_SEGMENTS`, and `OMNI_VIDEO_HARD_BACKPRESSURE_HOLD_MS` are used by `b_side_omnid`
- `OMNI_CONTROL_SERVER_IDLE_RECONNECT_MS` is used by `b_side_omnid`
- `OMNI_VIDEO_MAX_FRAME_AGE_MS` is used by `start-backend.sh` on the A-side backend, not by `b_side_omnid`
- `OMNISOCKET_TELEMETRY_INTERVAL_MS` and `OMNISOCKET_TELEMETRY_STALE_AFTER_MS` tune the backend's D-side telemetry freshness window
- `OMNI_NETWORK_SUMMARY_LOG_*` controls the A-side JSONL summary logger that polls `GET /api/network/latest/`
Default long-run network logging:
- A-side starts a compact JSONL logger by default at `${OMNISOCKETGO_ROOT}/logs/a-network-summary.jsonl`
- The default A-side polling interval is `2000 ms`
- For D-side long runs, prefer:
```bash
./bin/kcpserver -listen 0.0.0.0:10909 \
-telemetry-peer peer-a-telemetry \
-telemetry-interval 1000ms \
-kcp-session-stats-log logs/d-kcp-stats.jsonl \
-kcp-session-stats-interval 1000ms
```
- Keep `-latency-log` and `-kcp-ts-debug-log` off by default for multi-hour runs
- Do not continuously redirect relay `C` stderr to a file unless you are reproducing a short issue window
Put machine-specific overrides into `scripts/dev/robot-remote.env.local`. Example:
```bash
ROBOT_COMMAND_CENTER_ROOT="$HOME/Documents/robot-command-center"
OMNI_CAMERA_DEVICE="/dev/video30"
B_SIDE_OMNID_USE_SUDO="0"
OMNI_NETWORK_SUMMARY_LOG_INTERVAL_MS="5000"
```
Default camera behavior is the `night` preset:
```bash
OMNI_CAMERA_PROFILE="night"
# Optional per-machine tweak:
OMNI_CAMERA_BRIGHTNESS="8"
```
To switch to a daytime preset with brightness only:
```bash
OMNI_CAMERA_PROFILE="day"
OMNI_CAMERA_BRIGHTNESS="8"
```
To send the raw `v4l2-ctl --set-ctrl=...` payload yourself:
```bash
OMNI_CAMERA_PROFILE="custom"
OMNI_CAMERA_CUSTOM_CTRL="brightness=8,auto_exposure=1,exposure_time_absolute=800,gain=64"
OMNI_CAMERA_VERIFY="1"
```

View File

@@ -0,0 +1,292 @@
#!/usr/bin/env python3
from __future__ import annotations
import argparse
from datetime import datetime, timezone
import html
import json
from pathlib import Path
from typing import Any
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(description="Aggregate run logs into control/video latency estimate outputs.")
parser.add_argument("--run-dir", required=True, help="Run directory containing JSONL logs.")
parser.add_argument("--output-dir", help="Output directory. Defaults to --run-dir.")
return parser.parse_args()
def iter_jsonl(path: Path) -> list[dict[str, Any]]:
records: list[dict[str, Any]] = []
if not path.exists():
return records
with path.open("r", encoding="utf-8") as handle:
for raw_line in handle:
line = raw_line.strip()
if not line:
continue
try:
payload = json.loads(line)
except json.JSONDecodeError:
continue
if isinstance(payload, dict):
records.append(payload)
return records
def load_glob_jsonl(run_dir: Path, pattern: str) -> list[dict[str, Any]]:
records: list[dict[str, Any]] = []
for path in sorted(run_dir.glob(pattern)):
records.extend(iter_jsonl(path))
return records
def write_jsonl(path: Path, records: list[dict[str, Any]]) -> None:
with path.open("w", encoding="utf-8") as handle:
for record in records:
handle.write(json.dumps(record, ensure_ascii=False, separators=(",", ":")))
handle.write("\n")
def parse_unix_ms(value: Any) -> int | None:
if value is None:
return None
if isinstance(value, (int, float)):
return int(value)
text = str(value).strip()
if not text:
return None
if text.endswith("Z"):
text = f"{text[:-1]}+00:00"
try:
return int(datetime.fromisoformat(text).astimezone(timezone.utc).timestamp() * 1000)
except ValueError:
return None
def flatten_net_epoch(samples: list[dict[str, Any]]) -> list[dict[str, Any]]:
flattened: list[dict[str, Any]] = []
for sample in samples:
links = sample.get("links") or {}
a_to_d = (links.get("a_to_d") or {}).get("sessions") or {}
d_to_b = (links.get("d_to_b") or {}).get("sessions") or {}
a_control = (a_to_d.get("control") or {}).get("kcp") or {}
d_control = (d_to_b.get("control") or {}).get("kcp") or {}
a_video = (a_to_d.get("video") or {}).get("kcp") or {}
d_video = (d_to_b.get("video") or {}).get("kcp") or {}
flattened.append(
{
"updated_at": sample.get("updated_at"),
"a_to_d_control_srtt_ms": a_control.get("srtt_ms"),
"a_to_d_control_min_srtt_ms": a_control.get("min_srtt_ms"),
"d_to_b_control_srtt_ms": d_control.get("srtt_ms"),
"d_to_b_control_min_srtt_ms": d_control.get("min_srtt_ms"),
"a_to_d_video_srtt_ms": a_video.get("srtt_ms"),
"a_to_d_video_min_srtt_ms": a_video.get("min_srtt_ms"),
"d_to_b_video_srtt_ms": d_video.get("srtt_ms"),
"d_to_b_video_min_srtt_ms": d_video.get("min_srtt_ms"),
"a_to_d_control_feedback_age_ms": a_control.get("last_feedback_age_ms"),
"d_to_b_control_feedback_age_ms": d_control.get("last_feedback_age_ms"),
"a_to_d_video_feedback_age_ms": a_video.get("last_feedback_age_ms"),
"d_to_b_video_feedback_age_ms": d_video.get("last_feedback_age_ms"),
"a_to_d_control_retrans_delta": ((a_to_d.get("control") or {}).get("trend") or {}).get("retrans_delta"),
"d_to_b_control_retrans_delta": ((d_to_b.get("control") or {}).get("trend") or {}).get("retrans_delta"),
"a_to_d_video_retrans_delta": ((a_to_d.get("video") or {}).get("trend") or {}).get("retrans_delta"),
"d_to_b_video_retrans_delta": ((d_to_b.get("video") or {}).get("trend") or {}).get("retrans_delta"),
"a_to_d_video_window_pressure_pct": a_video.get("window_pressure_pct"),
"d_to_b_video_window_pressure_pct": d_video.get("window_pressure_pct"),
"robot_health": sample.get("robot_health"),
}
)
return flattened
def aggregate_control_estimates(
network_samples: list[dict[str, Any]],
control_events: list[dict[str, Any]],
control_acks: list[dict[str, Any]],
) -> list[dict[str, Any]]:
if control_acks:
return control_acks
fallback: list[dict[str, Any]] = []
for sample in network_samples:
estimate = sample.get("latency_estimate") or {}
fallback.append(
{
"updated_at": sample.get("updated_at"),
"estimate_method": "srtt_fallback",
"control_loop_rtt_ms": estimate.get("control_loop_rtt_ms"),
"control_to_persist_est_ms": estimate.get("control_to_persist_est_ms"),
"control_oneway_srtt_est_ms": estimate.get("control_oneway_srtt_est_ms"),
"control_oneway_bestcase_est_ms": estimate.get("control_oneway_bestcase_est_ms"),
"source_event_count": len(control_events),
}
)
return fallback
def aggregate_video_estimates(
network_samples: list[dict[str, Any]],
frame_recv_records: list[dict[str, Any]],
display_probe_records: list[dict[str, Any]],
) -> list[dict[str, Any]]:
network_timeline = sorted(
(
(updated_at_ms, sample.get("latency_estimate") or {})
for sample in network_samples
for updated_at_ms in [parse_unix_ms(sample.get("updated_at"))]
if updated_at_ms is not None
),
key=lambda item: item[0],
)
probes_by_seq = {
int(record["frame_seq"]): record
for record in display_probe_records
if record.get("frame_seq") is not None
}
estimates: list[dict[str, Any]] = []
timeline_index = 0
for record in frame_recv_records:
frame_seq = record.get("frame_seq")
if frame_seq is None:
continue
probe = probes_by_seq.get(int(frame_seq))
backend_received_unix_ns = record.get("backend_received_unix_ns")
backend_received_unix_ms = None
try:
if backend_received_unix_ns is not None:
backend_received_unix_ms = int(int(backend_received_unix_ns) / 1_000_000)
except (TypeError, ValueError):
backend_received_unix_ms = None
latency_estimate: dict[str, Any] = {}
if backend_received_unix_ms is not None and network_timeline:
while timeline_index + 1 < len(network_timeline) and network_timeline[timeline_index + 1][0] <= backend_received_unix_ms:
timeline_index += 1
if network_timeline[timeline_index][0] <= backend_received_unix_ms:
latency_estimate = network_timeline[timeline_index][1]
network_oneway = latency_estimate.get("video_network_oneway_est_ms")
capture_to_send = record.get("b_side_capture_to_send_ms")
partial_est = None
if capture_to_send is not None or network_oneway is not None:
partial_est = round(float(capture_to_send or 0.0) + float(network_oneway or 0.0), 3)
request_to_paint_ms = None
if probe is not None and probe.get("request_to_paint_ms") is not None:
request_to_paint_ms = round(float(probe["request_to_paint_ms"]), 3)
elif probe is not None and probe.get("request_started_unix_ms") is not None and probe.get("paint_unix_ms") is not None:
request_to_paint_ms = round(float(probe["paint_unix_ms"]) - float(probe["request_started_unix_ms"]), 3)
video_e2e_est_ms = round(partial_est + request_to_paint_ms, 3) if partial_est is not None and request_to_paint_ms is not None else None
estimates.append(
{
"frame_seq": frame_seq,
"backend_received_unix_ns": record.get("backend_received_unix_ns"),
"frame_hash": record.get("frame_hash"),
"estimate_method": "capture_to_send+srtt/2+request_to_paint" if video_e2e_est_ms is not None else "capture_to_send+srtt/2",
"video_network_oneway_est_ms": network_oneway,
"b_side_capture_to_send_ms": capture_to_send,
"request_to_paint_ms": request_to_paint_ms,
"response_to_paint_ms": probe.get("response_to_paint_ms") if probe is not None else None,
"backend_to_request_ms": probe.get("backend_to_request_ms") if probe is not None else None,
"backend_to_request_ms_raw": probe.get("backend_to_request_ms_raw") if probe is not None else None,
"backend_to_paint_ms": probe.get("backend_to_paint_ms") if probe is not None else None,
"backend_to_paint_ms_raw": probe.get("backend_to_paint_ms_raw") if probe is not None else None,
"browser_backend_clock_offset_ms": probe.get("browser_backend_clock_offset_ms") if probe is not None else None,
"browser_backend_clock_rtt_ms": probe.get("browser_backend_clock_rtt_ms") if probe is not None else None,
"video_partial_est_ms": partial_est,
"video_e2e_est_ms": video_e2e_est_ms,
"sequence_gap": record.get("sequence_gap"),
"repeat_flag": record.get("repeat_flag"),
"sender_clock_delta_ms_raw": record.get("sender_clock_delta_ms_raw"),
}
)
return estimates
def write_html_summary(
path: Path,
*,
net_epochs: list[dict[str, Any]],
control_estimates: list[dict[str, Any]],
video_estimates: list[dict[str, Any]],
) -> None:
latest_control = control_estimates[-1] if control_estimates else {}
latest_video = video_estimates[-1] if video_estimates else {}
latest_net = net_epochs[-1] if net_epochs else {}
html_text = f"""<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Latency Estimates</title>
<style>
body {{ font-family: Arial, sans-serif; margin: 24px; background: #0b1020; color: #eef2ff; }}
.grid {{ display: grid; grid-template-columns: repeat(3, minmax(0, 1fr)); gap: 16px; }}
.card {{ border: 1px solid #334155; border-radius: 8px; padding: 16px; background: #111827; }}
h1, h2 {{ margin-top: 0; }}
p {{ margin: 6px 0; line-height: 1.5; }}
code {{ color: #93c5fd; }}
</style>
</head>
<body>
<h1>Latency Estimates</h1>
<div class="grid">
<section class="card">
<h2>Control</h2>
<p><strong>loop RTT:</strong> {html.escape(str(latest_control.get("control_loop_rtt_ms")))}</p>
<p><strong>to persist:</strong> {html.escape(str(latest_control.get("control_to_persist_est_ms")))}</p>
<p><strong>method:</strong> {html.escape(str(latest_control.get("estimate_method")))}</p>
<p><strong>samples:</strong> {len(control_estimates)}</p>
</section>
<section class="card">
<h2>Video</h2>
<p><strong>network one-way:</strong> {html.escape(str(latest_video.get("video_network_oneway_est_ms")))}</p>
<p><strong>partial:</strong> {html.escape(str(latest_video.get("video_partial_est_ms")))}</p>
<p><strong>end-to-end:</strong> {html.escape(str(latest_video.get("video_e2e_est_ms")))}</p>
<p><strong>samples:</strong> {len(video_estimates)}</p>
</section>
<section class="card">
<h2>Net Epoch</h2>
<p><strong>a→d control srtt:</strong> {html.escape(str(latest_net.get("a_to_d_control_srtt_ms")))}</p>
<p><strong>d→b control srtt:</strong> {html.escape(str(latest_net.get("d_to_b_control_srtt_ms")))}</p>
<p><strong>a→d video srtt:</strong> {html.escape(str(latest_net.get("a_to_d_video_srtt_ms")))}</p>
<p><strong>d→b video srtt:</strong> {html.escape(str(latest_net.get("d_to_b_video_srtt_ms")))}</p>
</section>
</div>
</body>
</html>
"""
path.write_text(html_text, encoding="utf-8")
def main() -> int:
args = parse_args()
run_dir = Path(args.run_dir).resolve()
output_dir = Path(args.output_dir).resolve() if args.output_dir else run_dir
output_dir.mkdir(parents=True, exist_ok=True)
network_samples = load_glob_jsonl(run_dir, "a-network-summary.*.jsonl")
control_events = load_glob_jsonl(run_dir, "a-control-events.*.jsonl")
control_acks = load_glob_jsonl(run_dir, "a-control-acks.*.jsonl")
frame_recv_records = load_glob_jsonl(run_dir, "a-video-frame-recv.*.jsonl")
display_probe_records = load_glob_jsonl(run_dir, "a-video-display-probe.*.jsonl")
net_epochs = flatten_net_epoch(network_samples)
control_estimates = aggregate_control_estimates(network_samples, control_events, control_acks)
video_estimates = aggregate_video_estimates(network_samples, frame_recv_records, display_probe_records)
write_jsonl(output_dir / "net-epoch-summary.jsonl", net_epochs)
write_jsonl(output_dir / "control-latency-estimates.jsonl", control_estimates)
write_jsonl(output_dir / "video-latency-estimates.jsonl", video_estimates)
write_html_summary(
output_dir / "latency-estimates.html",
net_epochs=net_epochs,
control_estimates=control_estimates,
video_estimates=video_estimates,
)
return 0
if __name__ == "__main__":
raise SystemExit(main())

View File

@@ -0,0 +1,111 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck disable=SC1091
source "${SCRIPT_DIR}/load-env.sh"
camera_device="${OMNI_CAMERA_DEVICE}"
camera_profile="${OMNI_CAMERA_PROFILE}"
camera_brightness="${OMNI_CAMERA_BRIGHTNESS}"
camera_custom_ctrl="${OMNI_CAMERA_CUSTOM_CTRL}"
camera_verify="${OMNI_CAMERA_VERIFY}"
is_truthy() {
case "${1:-0}" in
1|true|TRUE|yes|YES|on|ON)
return 0
;;
*)
return 1
;;
esac
}
require_v4l2_ctl() {
if command -v v4l2-ctl >/dev/null 2>&1; then
return 0
fi
echo "Missing required command: v4l2-ctl. Install v4l-utils on the robot side before starting b_side_omnid." >&2
exit 1
}
run_v4l2_ctl() {
v4l2-ctl -d "${camera_device}" "$@"
}
set_ctrl() {
local ctrl="$1"
echo "[camera-controls] set ${camera_device} ${ctrl}"
run_v4l2_ctl "--set-ctrl=${ctrl}"
}
verify_ctrl() {
local ctrl="$1"
echo "[camera-controls] verify ${camera_device} ${ctrl}"
run_v4l2_ctl "--get-ctrl=${ctrl}"
}
needs_v4l2_ctl=0
case "${camera_profile}" in
night)
needs_v4l2_ctl=1
;;
day)
if [[ -n "${camera_brightness}" ]]; then
needs_v4l2_ctl=1
fi
;;
custom)
if [[ -z "${camera_custom_ctrl}" ]]; then
echo "OMNI_CAMERA_CUSTOM_CTRL must be non-empty when OMNI_CAMERA_PROFILE=custom." >&2
exit 1
fi
needs_v4l2_ctl=1
;;
*)
echo "Unsupported OMNI_CAMERA_PROFILE: ${camera_profile}. Expected one of: night, day, custom." >&2
exit 1
;;
esac
if is_truthy "${camera_verify}"; then
needs_v4l2_ctl=1
fi
if [[ "${needs_v4l2_ctl}" == "0" ]]; then
echo "[camera-controls] profile=${camera_profile}; no camera controls requested"
exit 0
fi
require_v4l2_ctl
case "${camera_profile}" in
night)
set_ctrl "auto_exposure=1"
set_ctrl "exposure_time_absolute=800"
set_ctrl "gain=64"
if [[ -n "${camera_brightness}" ]]; then
set_ctrl "brightness=${camera_brightness}"
fi
;;
day)
if [[ -n "${camera_brightness}" ]]; then
set_ctrl "brightness=${camera_brightness}"
fi
;;
custom)
set_ctrl "${camera_custom_ctrl}"
;;
esac
if is_truthy "${camera_verify}"; then
verify_ctrl "auto_exposure"
verify_ctrl "exposure_time_absolute"
verify_ctrl "gain"
verify_ctrl "brightness"
fi

324
scripts/dev/load-env.sh Normal file
View File

@@ -0,0 +1,324 @@
#!/usr/bin/env bash
set -euo pipefail
LOAD_ENV_SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
DEFAULT_OMNISOCKETGO_ROOT="$(cd "${LOAD_ENV_SCRIPT_DIR}/../.." && pwd)"
die() {
echo "$*" >&2
return 1 2>/dev/null || exit 1
}
normalize_loaded_env_vars() {
local var_name
local value
for var_name in $(compgen -A variable); do
case "${var_name}" in
BACKEND_*|BLITZ_*|B_SIDE_*|CONTROL_*|FRONTEND_*|OMNI_*|PYTHON3_BIN|PYTHON_VENV_PATH|ROBOT_*|ROS_DISTRO|VITE_*)
value="${!var_name}"
if [[ "${value}" == *$'\r' ]]; then
printf -v "${var_name}" '%s' "${value%$'\r'}"
export "${var_name}"
fi
;;
esac
done
}
is_omnisocketgo_root() {
local dir="$1"
[[ -f "${dir}/Makefile" && -f "${dir}/cmd/b_side_omnid.c" && -d "${dir}/ros-control-py" ]]
}
is_robot_command_center_root() {
local dir="$1"
[[ -f "${dir}/backend/config/asgi.py" && -f "${dir}/frontend/package.json" ]]
}
require_robot_command_center_root() {
if ! is_robot_command_center_root "${ROBOT_COMMAND_CENTER_ROOT}"; then
die "ROBOT_COMMAND_CENTER_ROOT must point to the robot-command-center repo root. Current value: ${ROBOT_COMMAND_CENTER_ROOT}. Set it in ${LOAD_ENV_SCRIPT_DIR}/robot-remote.env.local if needed."
fi
}
export OMNISOCKETGO_ROOT="${OMNISOCKETGO_ROOT:-${DEFAULT_OMNISOCKETGO_ROOT}}"
omni_camera_device_was_set=0
omni_camera_profile_was_set=0
omni_camera_brightness_was_set=0
omni_camera_custom_ctrl_was_set=0
omni_camera_verify_was_set=0
if [[ "${OMNI_CAMERA_DEVICE+x}" == "x" ]]; then
omni_camera_device_was_set=1
preserved_omni_camera_device="${OMNI_CAMERA_DEVICE}"
fi
if [[ "${OMNI_CAMERA_PROFILE+x}" == "x" ]]; then
omni_camera_profile_was_set=1
preserved_omni_camera_profile="${OMNI_CAMERA_PROFILE}"
fi
if [[ "${OMNI_CAMERA_BRIGHTNESS+x}" == "x" ]]; then
omni_camera_brightness_was_set=1
preserved_omni_camera_brightness="${OMNI_CAMERA_BRIGHTNESS}"
fi
if [[ "${OMNI_CAMERA_CUSTOM_CTRL+x}" == "x" ]]; then
omni_camera_custom_ctrl_was_set=1
preserved_omni_camera_custom_ctrl="${OMNI_CAMERA_CUSTOM_CTRL}"
fi
if [[ "${OMNI_CAMERA_VERIFY+x}" == "x" ]]; then
omni_camera_verify_was_set=1
preserved_omni_camera_verify="${OMNI_CAMERA_VERIFY}"
fi
ENV_FILES=(
"${LOAD_ENV_SCRIPT_DIR}/robot-remote.env"
"${LOAD_ENV_SCRIPT_DIR}/robot-remote.env.local"
)
for env_file in "${ENV_FILES[@]}"; do
if [[ -f "${env_file}" ]]; then
set -a
# shellcheck disable=SC1090
source "${env_file}"
set +a
fi
done
normalize_loaded_env_vars
if [[ "${omni_camera_device_was_set}" == "1" ]]; then
export OMNI_CAMERA_DEVICE="${preserved_omni_camera_device}"
fi
if [[ "${omni_camera_profile_was_set}" == "1" ]]; then
export OMNI_CAMERA_PROFILE="${preserved_omni_camera_profile}"
fi
if [[ "${omni_camera_brightness_was_set}" == "1" ]]; then
export OMNI_CAMERA_BRIGHTNESS="${preserved_omni_camera_brightness}"
fi
if [[ "${omni_camera_custom_ctrl_was_set}" == "1" ]]; then
export OMNI_CAMERA_CUSTOM_CTRL="${preserved_omni_camera_custom_ctrl}"
fi
if [[ "${omni_camera_verify_was_set}" == "1" ]]; then
export OMNI_CAMERA_VERIFY="${preserved_omni_camera_verify}"
fi
export OMNISOCKETGO_ROOT="${OMNISOCKETGO_ROOT:-${DEFAULT_OMNISOCKETGO_ROOT}}"
export ROBOT_COMMAND_CENTER_ROOT="${ROBOT_COMMAND_CENTER_ROOT:-$(dirname "${OMNISOCKETGO_ROOT}")/robot-command-center}"
if ! is_omnisocketgo_root "${OMNISOCKETGO_ROOT}"; then
die "OMNISOCKETGO_ROOT must point to the OmniSocketGo repo root. Current value: ${OMNISOCKETGO_ROOT}"
fi
export BACKEND_DIR="${BACKEND_DIR:-${ROBOT_COMMAND_CENTER_ROOT}/backend}"
export FRONTEND_DIR="${FRONTEND_DIR:-${ROBOT_COMMAND_CENTER_ROOT}/frontend}"
export ROS_CONTROL_PY_DIR="${ROS_CONTROL_PY_DIR:-${OMNISOCKETGO_ROOT}/ros-control-py}"
export PYTHON3_BIN="${PYTHON3_BIN:-python3}"
export PYTHON_VENV_PATH="${PYTHON_VENV_PATH:-${OMNISOCKETGO_ROOT}/.venv}"
export BACKEND_HOST="${BACKEND_HOST:-0.0.0.0}"
export BACKEND_PORT="${BACKEND_PORT:-8001}"
export FRONTEND_HOST="${FRONTEND_HOST:-0.0.0.0}"
export FRONTEND_PORT="${FRONTEND_PORT:-5173}"
export OMNISOCKET_TELEMETRY_PEER_ID="${OMNISOCKET_TELEMETRY_PEER_ID:-peer-a-telemetry}"
export OMNISOCKET_TELEMETRY_INTERVAL_MS="${OMNISOCKET_TELEMETRY_INTERVAL_MS:-1000}"
export OMNISOCKET_TELEMETRY_STALE_AFTER_MS="${OMNISOCKET_TELEMETRY_STALE_AFTER_MS:-3000}"
export OMNI_NETWORK_SUMMARY_LOG_ENABLED="${OMNI_NETWORK_SUMMARY_LOG_ENABLED:-1}"
export OMNI_NETWORK_SUMMARY_LOG_PATH="${OMNI_NETWORK_SUMMARY_LOG_PATH:-${OMNISOCKETGO_ROOT}/logs/a-network-summary.jsonl}"
export OMNI_NETWORK_SUMMARY_LOG_INTERVAL_MS="${OMNI_NETWORK_SUMMARY_LOG_INTERVAL_MS:-1000}"
export OMNI_NETWORK_SUMMARY_LOG_REQUEST_TIMEOUT_SEC="${OMNI_NETWORK_SUMMARY_LOG_REQUEST_TIMEOUT_SEC:-3}"
export CONTROL_SIDE_OMNISOCKET_SERVER_ADDR="${CONTROL_SIDE_OMNISOCKET_SERVER_ADDR:-}"
export CONTROL_SIDE_OMNISOCKET_RELAY_VIA="${CONTROL_SIDE_OMNISOCKET_RELAY_VIA:-}"
export ROBOT_SIDE_OMNISOCKET_SERVER_ADDR="${ROBOT_SIDE_OMNISOCKET_SERVER_ADDR:-}"
export ROBOT_SIDE_OMNISOCKET_RELAY_VIA="${ROBOT_SIDE_OMNISOCKET_RELAY_VIA:-}"
export ROS_DISTRO="${ROS_DISTRO:-jazzy}"
export ROBOT_RECEIVER_TRANSPORT="${ROBOT_RECEIVER_TRANSPORT:-unix_dgram}"
export ROBOT_RECEIVER_SERVER_ADDR="${ROBOT_RECEIVER_SERVER_ADDR:-${ROBOT_SIDE_OMNISOCKET_SERVER_ADDR:-}}"
export ROBOT_RECEIVER_RELAY_VIA="${ROBOT_RECEIVER_RELAY_VIA:-${ROBOT_SIDE_OMNISOCKET_RELAY_VIA:-}}"
export ROBOT_RECEIVER_PEER_ID="${ROBOT_RECEIVER_PEER_ID:-ros-bridge-ctrl}"
export ROBOT_RECEIVER_EXPECTED_SENDER="${ROBOT_RECEIVER_EXPECTED_SENDER:-}"
export ROBOT_RECEIVER_LOCAL_SOCKET_PATH="${ROBOT_RECEIVER_LOCAL_SOCKET_PATH:-/tmp/omnisocket-b-side-cmd.sock}"
export ROBOT_RECEIVER_OUTPUT_TOPIC="${ROBOT_RECEIVER_OUTPUT_TOPIC:-/hric/robot/cmd_vel}"
export ROBOT_RECEIVER_FRAME_ID="${ROBOT_RECEIVER_FRAME_ID:-pelvis}"
export ROBOT_RECEIVER_WATCHDOG_TIMEOUT="${ROBOT_RECEIVER_WATCHDOG_TIMEOUT:-0.5}"
export ROBOT_RECEIVER_PUBLISH_RATE_HZ="${ROBOT_RECEIVER_PUBLISH_RATE_HZ:-100.0}"
export OMNI_CAMERA_DEVICE="${OMNI_CAMERA_DEVICE:-/dev/video0}"
export OMNI_CAMERA_PROFILE="${OMNI_CAMERA_PROFILE:-night}"
export OMNI_CAMERA_BRIGHTNESS="${OMNI_CAMERA_BRIGHTNESS:-}"
export OMNI_CAMERA_CUSTOM_CTRL="${OMNI_CAMERA_CUSTOM_CTRL:-}"
export OMNI_CAMERA_VERIFY="${OMNI_CAMERA_VERIFY:-0}"
export OMNI_GPSD_HOST="${OMNI_GPSD_HOST:-127.0.0.1}"
export OMNI_VIDEO_SERVER_ADDR="${OMNI_VIDEO_SERVER_ADDR:-${ROBOT_SIDE_OMNISOCKET_SERVER_ADDR:-}}"
export OMNI_VIDEO_RELAY_VIA="${OMNI_VIDEO_RELAY_VIA:-${ROBOT_SIDE_OMNISOCKET_RELAY_VIA:-}}"
export OMNI_CONTROL_SERVER_ADDR="${OMNI_CONTROL_SERVER_ADDR:-${ROBOT_SIDE_OMNISOCKET_SERVER_ADDR:-}}"
export OMNI_CONTROL_RELAY_VIA="${OMNI_CONTROL_RELAY_VIA:-${ROBOT_SIDE_OMNISOCKET_RELAY_VIA:-}}"
export OMNI_CONTROL_UNIX_SOCKET_PATH="${OMNI_CONTROL_UNIX_SOCKET_PATH:-${ROBOT_RECEIVER_LOCAL_SOCKET_PATH}}"
export OMNI_CONTROL_ACK_PEER_ID="${OMNI_CONTROL_ACK_PEER_ID:-peer-b-ctrl-ack}"
export OMNI_CONTROL_ACK_TARGET_PEER="${OMNI_CONTROL_ACK_TARGET_PEER:-peer-a-ctrl-ack}"
export B_SIDE_OMNID_USE_SUDO="${B_SIDE_OMNID_USE_SUDO:-1}"
export BLITZ_RUNTIME_DIR="${BLITZ_RUNTIME_DIR:-${OMNISOCKETGO_ROOT}/logs/runtime}"
export BLITZ_RUN_ROOT="${BLITZ_RUN_ROOT:-${OMNISOCKETGO_ROOT}/logs}"
export BLITZ_RUN_CONTEXT_FILE="${BLITZ_RUN_CONTEXT_FILE:-${BLITZ_RUNTIME_DIR}/run-context.env}"
export BLITZ_RUN_ID_FILE="${BLITZ_RUN_ID_FILE:-${BLITZ_RUNTIME_DIR}/run-id}"
export BLITZ_CURRENT_RUN_LINK="${BLITZ_CURRENT_RUN_LINK:-${BLITZ_RUN_ROOT}/current}"
export BLITZ_5G_INTERFACE="${BLITZ_5G_INTERFACE:-}"
export BLITZ_5G_MODEM_SUBNET="${BLITZ_5G_MODEM_SUBNET:-192.168.224.0/22}"
export BLITZ_5G_GATEWAY="${BLITZ_5G_GATEWAY:-192.168.225.1}"
export BLITZ_5G_ROUTE_TARGETS="${BLITZ_5G_ROUTE_TARGETS:-106.55.173.235}"
export BLITZ_5G_INFO_JSON="${BLITZ_5G_INFO_JSON:-${OMNISOCKETGO_ROOT}/scripts/boot/modem_network_info.json}"
export BLITZ_TIME_SERVER_IP="${BLITZ_TIME_SERVER_IP:-}"
export BLITZ_KCP_STATS_INTERVAL_MS="${BLITZ_KCP_STATS_INTERVAL_MS:-1000}"
export BLITZ_CONTROL_LATENCY_LOG_ENABLED="${BLITZ_CONTROL_LATENCY_LOG_ENABLED:-1}"
export BLITZ_CONTROL_LATENCY_LOG_SAMPLE_MOD="${BLITZ_CONTROL_LATENCY_LOG_SAMPLE_MOD:-100}"
export BLITZ_CONTROL_ACK_SAMPLE_MOD="${BLITZ_CONTROL_ACK_SAMPLE_MOD:-10}"
export BLITZ_VIDEO_STAGE_LOG_ENABLED="${BLITZ_VIDEO_STAGE_LOG_ENABLED:-1}"
export BLITZ_VIDEO_STAGE_LOG_SAMPLE_MOD="${BLITZ_VIDEO_STAGE_LOG_SAMPLE_MOD:-10}"
export BLITZ_5G_LINK_LOG_INTERVAL_SEC="${BLITZ_5G_LINK_LOG_INTERVAL_SEC:-5}"
export BLITZ_JSONL_FLUSH_INTERVAL_MS="${BLITZ_JSONL_FLUSH_INTERVAL_MS:-1000}"
export BLITZ_JSONL_FLUSH_BYTES="${BLITZ_JSONL_FLUSH_BYTES:-262144}"
export BLITZ_JSONL_ROTATE_BYTES="${BLITZ_JSONL_ROTATE_BYTES:-134217728}"
export BLITZ_JSONL_ROTATE_FILES="${BLITZ_JSONL_ROTATE_FILES:-8}"
blitz_dev_utc_compact_timestamp() {
date -u '+%Y%m%dT%H%M%SZ'
}
blitz_dev_git_commit() {
git -C "${OMNISOCKETGO_ROOT}" rev-parse HEAD 2>/dev/null || true
}
blitz_dev_git_dirty_flag() {
if git -C "${OMNISOCKETGO_ROOT}" diff --quiet --ignore-submodules=dirty >/dev/null 2>&1; then
printf '0\n'
return 0
fi
printf '1\n'
}
blitz_dev_prepare_dirs() {
mkdir -p "${BLITZ_RUNTIME_DIR}" "${BLITZ_RUN_ROOT}/runs" "${BLITZ_RUN_ROOT}/incidents"
}
blitz_dev_write_run_info() {
local run_dir="$1"
local run_id="$2"
local boot_id="$3"
local tmp_info="${run_dir}/run-info.json.tmp.$$"
local started_at
local commit_hash
local dirty_flag
started_at="$(date -u '+%Y-%m-%dT%H:%M:%SZ')"
commit_hash="$(blitz_dev_git_commit)"
dirty_flag="$(blitz_dev_git_dirty_flag)"
python3 - "${tmp_info}" "${run_id}" "${run_dir}" "${boot_id}" "${started_at}" "${commit_hash}" "${dirty_flag}" "${HOSTNAME:-$(hostname)}" <<'PY'
import json
import os
import sys
path, run_id, run_dir, boot_id, started_at, commit_hash, dirty_flag, hostname = sys.argv[1:9]
payload = {
"run_id": run_id,
"run_dir": run_dir,
"boot_id": boot_id,
"started_at": started_at,
"hostname": hostname,
"git_commit": commit_hash,
"git_dirty": dirty_flag == "1",
"env": {
key: os.environ.get(key, "")
for key in sorted(os.environ)
if key.startswith(("BLITZ_", "OMNI_", "ROBOT_RECEIVER_"))
},
}
with open(path, "w", encoding="utf-8") as handle:
json.dump(payload, handle, ensure_ascii=False, indent=2, sort_keys=True)
PY
mv -f "${tmp_info}" "${run_dir}/run-info.json"
}
blitz_dev_init_run_context() {
local run_id="${1:-$(blitz_dev_utc_compact_timestamp)}"
local boot_id="dev-$(blitz_dev_utc_compact_timestamp)"
local run_dir="${BLITZ_RUN_ROOT}/runs/${run_id}"
local tmp_context="${BLITZ_RUN_CONTEXT_FILE}.tmp.$$"
blitz_dev_prepare_dirs
mkdir -p "${run_dir}"
export BLITZ_RUN_ID="${run_id}"
export BLITZ_RUN_DIR="${run_dir}"
export BLITZ_BOOT_ID="${boot_id}"
printf '%s\n' "${run_id}" > "${BLITZ_RUN_ID_FILE}"
cat > "${tmp_context}" <<EOF
BLITZ_RUN_ID=${run_id}
BLITZ_RUN_DIR=${run_dir}
BLITZ_BOOT_ID=${boot_id}
BLITZ_RUN_ROOT=${BLITZ_RUN_ROOT}
EOF
mv -f "${tmp_context}" "${BLITZ_RUN_CONTEXT_FILE}"
ln -sfn "${run_dir}" "${BLITZ_CURRENT_RUN_LINK}"
blitz_dev_write_run_info "${run_dir}" "${run_id}" "${boot_id}"
}
blitz_dev_load_run_context() {
if [[ ! -f "${BLITZ_RUN_CONTEXT_FILE}" ]]; then
return 1
fi
set -a
# shellcheck disable=SC1090
source "${BLITZ_RUN_CONTEXT_FILE}"
set +a
}
blitz_dev_ensure_run_context() {
if blitz_dev_load_run_context; then
return 0
fi
blitz_dev_init_run_context
}
blitz_dev_reset_run_context() {
rm -f "${BLITZ_RUN_CONTEXT_FILE}" "${BLITZ_RUN_ID_FILE}"
blitz_dev_init_run_context
}
blitz_dev_init_instance_context() {
if [[ -z "${BLITZ_INSTANCE_ID:-}" ]]; then
export BLITZ_INSTANCE_ID="$(blitz_dev_utc_compact_timestamp)-$$"
fi
}
blitz_dev_component_log_path() {
local stem="$1"
printf '%s/%s.%s.jsonl\n' "${BLITZ_RUN_DIR}" "${stem}" "${BLITZ_INSTANCE_ID}"
}
blitz_dev_prepare_backend_logging_env() {
blitz_dev_init_instance_context
if [[ "${OMNI_NETWORK_SUMMARY_LOG_PATH}" == "${OMNISOCKETGO_ROOT}/logs/a-network-summary.jsonl" ]]; then
export OMNI_NETWORK_SUMMARY_LOG_PATH
OMNI_NETWORK_SUMMARY_LOG_PATH="$(blitz_dev_component_log_path "a-network-summary")"
fi
export BLITZ_A_CONTROL_EVENTS_LOG_PATH="${BLITZ_A_CONTROL_EVENTS_LOG_PATH:-$(blitz_dev_component_log_path "a-control-events")}"
export BLITZ_A_CONTROL_ACKS_LOG_PATH="${BLITZ_A_CONTROL_ACKS_LOG_PATH:-$(blitz_dev_component_log_path "a-control-acks")}"
export BLITZ_A_VIDEO_FRAME_RECV_LOG_PATH="${BLITZ_A_VIDEO_FRAME_RECV_LOG_PATH:-$(blitz_dev_component_log_path "a-video-frame-recv")}"
export BLITZ_A_VIDEO_DISPLAY_PROBE_LOG_PATH="${BLITZ_A_VIDEO_DISPLAY_PROBE_LOG_PATH:-$(blitz_dev_component_log_path "a-video-display-probe")}"
}
blitz_dev_prepare_bside_logging_env() {
blitz_dev_init_instance_context
export BLITZ_KCP_STATS_LOG_PATH="${BLITZ_KCP_STATS_LOG_PATH:-$(blitz_dev_component_log_path "b-kcp-session-stats")}"
export BLITZ_CONTROL_LATENCY_LOG_PATH="${BLITZ_CONTROL_LATENCY_LOG_PATH:-$(blitz_dev_component_log_path "b-control-latency")}"
export BLITZ_VIDEO_STAGE_LOG_PATH="${BLITZ_VIDEO_STAGE_LOG_PATH:-$(blitz_dev_component_log_path "b-video-frame-stages")}"
}
blitz_dev_prepare_5g_logging_env() {
blitz_dev_init_instance_context
export BLITZ_5G_LINK_LOG_PATH="${BLITZ_5G_LINK_LOG_PATH:-$(blitz_dev_component_log_path "b-5g-link-quality")}"
}
if [[ "${BLITZ_SKIP_DEV_RUN_CONTEXT_INIT:-0}" != "1" && "${BLITZ_BOOT_LOADING_ENV:-0}" != "1" && "${OMNI_BOOT_MODE:-0}" != "1" ]]; then
blitz_dev_ensure_run_context
elif [[ -f "${BLITZ_RUN_CONTEXT_FILE}" ]]; then
blitz_dev_load_run_context || true
fi

View File

@@ -0,0 +1,100 @@
#!/usr/bin/env python3
from __future__ import annotations
import argparse
import json
import signal
import sys
import time
import urllib.error
import urllib.request
from pathlib import Path
STOP_REQUESTED = False
def handle_signal(signum: int, frame: object) -> None:
del signum, frame
global STOP_REQUESTED
STOP_REQUESTED = True
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(description="Poll /api/network/latest/ and append JSONL snapshots.")
parser.add_argument("--url", required=True, help="HTTP endpoint that returns the network summary JSON.")
parser.add_argument("--output", required=True, help="Output JSONL path.")
parser.add_argument(
"--interval-ms",
type=int,
default=2000,
help="Polling interval in milliseconds. Default: 2000.",
)
parser.add_argument(
"--request-timeout-sec",
type=float,
default=3.0,
help="Single request timeout in seconds. Default: 3.0.",
)
return parser.parse_args()
def sleep_with_stop(seconds: float) -> None:
deadline = time.monotonic() + max(0.0, seconds)
while not STOP_REQUESTED:
remaining = deadline - time.monotonic()
if remaining <= 0.0:
return
time.sleep(min(remaining, 0.2))
def fetch_json(url: str, timeout_sec: float) -> str:
request = urllib.request.Request(
url,
headers={
"Accept": "application/json",
"Cache-Control": "no-cache",
},
method="GET",
)
with urllib.request.urlopen(request, timeout=timeout_sec) as response:
charset = response.headers.get_content_charset("utf-8")
payload = response.read().decode(charset)
parsed = json.loads(payload)
return json.dumps(parsed, separators=(",", ":"), ensure_ascii=False)
def main() -> int:
args = parse_args()
interval_sec = max(args.interval_ms, 200) / 1000.0
output_path = Path(args.output)
last_error_log_monotonic = 0.0
signal.signal(signal.SIGINT, handle_signal)
signal.signal(signal.SIGTERM, handle_signal)
output_path.parent.mkdir(parents=True, exist_ok=True)
with output_path.open("a", encoding="utf-8") as output_file:
while not STOP_REQUESTED:
started = time.monotonic()
try:
line = fetch_json(args.url, args.request_timeout_sec)
except (TimeoutError, urllib.error.URLError, urllib.error.HTTPError, json.JSONDecodeError) as error:
now = time.monotonic()
if now - last_error_log_monotonic >= 10.0:
print(f"[network-summary] poll failed: {error}", file=sys.stderr)
last_error_log_monotonic = now
else:
output_file.write(line)
output_file.write("\n")
output_file.flush()
elapsed = time.monotonic() - started
sleep_with_stop(max(0.0, interval_sec - elapsed))
return 0
if __name__ == "__main__":
raise SystemExit(main())

View File

@@ -0,0 +1,10 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
export BLITZ_SKIP_DEV_RUN_CONTEXT_INIT="1"
# shellcheck disable=SC1091
source "${SCRIPT_DIR}/load-env.sh"
blitz_dev_reset_run_context
printf 'run_id=%s\nrun_dir=%s\n' "${BLITZ_RUN_ID}" "${BLITZ_RUN_DIR}"

View File

@@ -0,0 +1,76 @@
# Optional absolute path override for the companion repo.
# By default the scripts assume:
# OmniSocketGo -> current repo
# robot-command-center -> sibling directory next to OmniSocketGo
# Example:
# ROBOT_COMMAND_CENTER_ROOT="$HOME/Documents/robot-command-center"
CONTROL_SIDE_OMNISOCKET_SERVER_ADDR="81.70.156.140:10909" # D
CONTROL_SIDE_OMNISOCKET_RELAY_VIA="106.55.173.235:10909" # C
ROBOT_SIDE_OMNISOCKET_SERVER_ADDR="81.70.156.140:10909" # D
ROBOT_SIDE_OMNISOCKET_RELAY_VIA="81.70.156.140:10909" # 直连 D
# Log one normal relay packet out of every N packets. Drop events still log immediately.
OMNI_RELAY_PACKET_LOG_SAMPLE_EVERY="200"
CONTROL_WS_ALLOWED_ORIGINS="http://127.0.0.1:5173,http://localhost:5173"
VITE_API_BASE_URL="http://127.0.0.1:8001"
PYTHON3_BIN="python3"
PYTHON_VENV_PATH="${OMNISOCKETGO_ROOT}/.venv"
BACKEND_HOST="0.0.0.0"
BACKEND_PORT="8001"
OMNISOCKET_TELEMETRY_PEER_ID="peer-a-telemetry"
OMNISOCKET_TELEMETRY_INTERVAL_MS="1000"
OMNISOCKET_TELEMETRY_STALE_AFTER_MS="3000"
OMNI_NETWORK_SUMMARY_LOG_ENABLED="1"
OMNI_NETWORK_SUMMARY_LOG_PATH="${OMNISOCKETGO_ROOT}/logs/a-network-summary.jsonl"
OMNI_NETWORK_SUMMARY_LOG_INTERVAL_MS="1000"
OMNI_NETWORK_SUMMARY_LOG_REQUEST_TIMEOUT_SEC="3"
FRONTEND_HOST="0.0.0.0"
FRONTEND_PORT="5173"
ROS_DISTRO="jazzy"
ROBOT_RECEIVER_TRANSPORT="unix_dgram"
ROBOT_RECEIVER_SERVER_ADDR="${ROBOT_SIDE_OMNISOCKET_SERVER_ADDR}"
ROBOT_RECEIVER_RELAY_VIA="${ROBOT_SIDE_OMNISOCKET_RELAY_VIA}"
ROBOT_RECEIVER_PEER_ID="ros-bridge-ctrl"
ROBOT_RECEIVER_EXPECTED_SENDER=""
ROBOT_RECEIVER_LOCAL_SOCKET_PATH="/tmp/omnisocket-b-side-cmd.sock"
ROBOT_RECEIVER_OUTPUT_TOPIC="/hric/robot/cmd_vel"
ROBOT_RECEIVER_FRAME_ID="pelvis"
ROBOT_RECEIVER_WATCHDOG_TIMEOUT="0.5"
ROBOT_RECEIVER_PUBLISH_RATE_HZ="100.0"
OMNI_VIDEO_PEER_ID="peer-b-video"
OMNI_VIDEO_TARGET_PEER="peer-a-video"
OMNI_GPSD_HOST="127.0.0.1"
OMNI_CAMERA_DEVICE="/dev/v4l/by-path/platform-a80aa10000.usb-usb-0:3.2:1.4-video-index0"
OMNI_CAMERA_PROFILE="day"
OMNI_CAMERA_BRIGHTNESS=""
OMNI_CAMERA_CUSTOM_CTRL=""
OMNI_CAMERA_VERIFY="0"
OMNI_VIDEO_SERVER_ADDR="${ROBOT_SIDE_OMNISOCKET_SERVER_ADDR}"
OMNI_VIDEO_RELAY_VIA="${ROBOT_SIDE_OMNISOCKET_RELAY_VIA}"
OMNI_VIDEO_SOFT_BACKPRESSURE_SEGMENTS="256"
OMNI_VIDEO_HARD_BACKPRESSURE_SEGMENTS="1024"
OMNI_VIDEO_HARD_BACKPRESSURE_HOLD_MS="5000"
OMNI_VIDEO_FRAME_STALL_RECONNECT_MS="30000"
OMNI_CONTROL_PEER_ID="peer-b-ctrl"
OMNI_CONTROL_EXPECTED_SENDER="peer-a-ctrl"
OMNI_CONTROL_SERVER_ADDR="${ROBOT_SIDE_OMNISOCKET_SERVER_ADDR}"
OMNI_CONTROL_RELAY_VIA="${ROBOT_SIDE_OMNISOCKET_RELAY_VIA}"
OMNI_CONTROL_UNIX_SOCKET_PATH="${ROBOT_RECEIVER_LOCAL_SOCKET_PATH}"
OMNI_CONTROL_ACK_PEER_ID="peer-b-ctrl-ack"
OMNI_CONTROL_ACK_TARGET_PEER="peer-a-ctrl-ack"
BLITZ_CONTROL_ACK_SAMPLE_MOD="10"
BLITZ_VIDEO_STAGE_LOG_ENABLED="1"
BLITZ_VIDEO_STAGE_LOG_SAMPLE_MOD="10"
OMNI_CONTROL_SERVER_IDLE_RECONNECT_MS="30000"
# A-side backend video freshness guard. Used by scripts/dev/start-backend.sh.
OMNI_VIDEO_MAX_FRAME_AGE_MS="1000"
B_SIDE_OMNID_USE_SUDO="1"

View File

@@ -0,0 +1,9 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck disable=SC1091
source "${SCRIPT_DIR}/load-env.sh"
blitz_dev_prepare_5g_logging_env
exec bash "${OMNISOCKETGO_ROOT}/scripts/boot/blitz-5g-link-logger.sh"

View File

@@ -0,0 +1,55 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck disable=SC1091
source "${SCRIPT_DIR}/load-env.sh"
blitz_dev_prepare_bside_logging_env
cd "${OMNISOCKETGO_ROOT}"
export OMNISOCKET_SERVER_ADDR="${ROBOT_SIDE_OMNISOCKET_SERVER_ADDR}"
export OMNISOCKET_RELAY_VIA="${ROBOT_SIDE_OMNISOCKET_RELAY_VIA}"
export OMNI_VIDEO_SERVER_ADDR="${OMNI_VIDEO_SERVER_ADDR}"
export OMNI_VIDEO_RELAY_VIA="${OMNI_VIDEO_RELAY_VIA}"
export OMNI_CONTROL_SERVER_ADDR="${OMNI_CONTROL_SERVER_ADDR}"
export OMNI_CONTROL_RELAY_VIA="${OMNI_CONTROL_RELAY_VIA}"
logger_pid=""
cleanup() {
if [[ -n "${logger_pid}" ]]; then
kill "${logger_pid}" 2>/dev/null || true
wait "${logger_pid}" 2>/dev/null || true
fi
}
start_5g_link_logger_if_needed() {
if [[ "${OMNI_BOOT_MODE:-0}" == "1" ]]; then
return 0
fi
bash "${SCRIPT_DIR}/start-5g-link-logger.sh" &
logger_pid=$!
echo "[start-b-side-omnid] 5G link logger -> ${BLITZ_5G_LINK_LOG_PATH:-unset}" >&2
}
if [[ ! -x "./bin/b_side_omnid" ]]; then
if [[ "${OMNI_BOOT_MODE:-0}" == "1" ]]; then
echo "Missing ./bin/b_side_omnid in boot mode; build it before enabling the autostart service." >&2
exit 1
fi
make b_side_omnid
fi
launch_b_side_omnid() {
trap cleanup EXIT INT TERM
start_5g_link_logger_if_needed
bash "${SCRIPT_DIR}/apply-camera-controls.sh"
./bin/b_side_omnid
}
if [[ "${B_SIDE_OMNID_USE_SUDO}" == "1" && "${EUID}" -ne 0 ]]; then
exec sudo -E bash -lc 'cd "$1" && export B_SIDE_OMNID_USE_SUDO=0 && exec bash "$2"' _ "${OMNISOCKETGO_ROOT}" "${SCRIPT_DIR}/start-b-side-omnid.sh"
fi
launch_b_side_omnid

54
scripts/dev/start-backend.sh Executable file
View File

@@ -0,0 +1,54 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck disable=SC1091
source "${SCRIPT_DIR}/load-env.sh"
require_robot_command_center_root
blitz_dev_prepare_backend_logging_env
if [[ ! -d "${PYTHON_VENV_PATH}" ]]; then
"${PYTHON3_BIN}" -m venv "${PYTHON_VENV_PATH}"
fi
# shellcheck disable=SC1091
source "${PYTHON_VENV_PATH}/bin/activate"
cd "${BACKEND_DIR}"
export OMNISOCKET_SERVER_ADDR="${CONTROL_SIDE_OMNISOCKET_SERVER_ADDR}"
export OMNISOCKET_RELAY_VIA="${CONTROL_SIDE_OMNISOCKET_RELAY_VIA}"
logger_pid=""
cleanup() {
if [[ -n "${logger_pid}" ]]; then
kill "${logger_pid}" 2>/dev/null || true
wait "${logger_pid}" 2>/dev/null || true
fi
}
start_network_summary_logger() {
local logger_url
local logger_dir
if [[ "${OMNI_NETWORK_SUMMARY_LOG_ENABLED}" != "1" ]]; then
return
fi
logger_url="http://127.0.0.1:${BACKEND_PORT}/api/network/latest/"
logger_dir="$(dirname "${OMNI_NETWORK_SUMMARY_LOG_PATH}")"
mkdir -p "${logger_dir}"
python "${SCRIPT_DIR}/log-network-summary.py" \
--url "${logger_url}" \
--output "${OMNI_NETWORK_SUMMARY_LOG_PATH}" \
--interval-ms "${OMNI_NETWORK_SUMMARY_LOG_INTERVAL_MS}" \
--request-timeout-sec "${OMNI_NETWORK_SUMMARY_LOG_REQUEST_TIMEOUT_SEC}" &
logger_pid=$!
echo "[start-backend] network summary logger -> ${OMNI_NETWORK_SUMMARY_LOG_PATH} (${OMNI_NETWORK_SUMMARY_LOG_INTERVAL_MS} ms)" >&2
}
trap cleanup EXIT INT TERM
start_network_summary_logger
python -m uvicorn config.asgi:application --host "${BACKEND_HOST}" --port "${BACKEND_PORT}"

21
scripts/dev/start-dev-tmux.sh Executable file
View File

@@ -0,0 +1,21 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SESSION_NAME="${1:-robot-remote}"
if ! command -v tmux >/dev/null 2>&1; then
echo "tmux is required for this launcher" >&2
exit 1
fi
if tmux has-session -t "${SESSION_NAME}" 2>/dev/null; then
exec tmux attach -t "${SESSION_NAME}"
fi
tmux new-session -d -s "${SESSION_NAME}" -n backend "bash -lc '${SCRIPT_DIR}/start-backend.sh'"
tmux new-window -t "${SESSION_NAME}:" -n frontend "bash -lc '${SCRIPT_DIR}/start-frontend.sh'"
tmux new-window -t "${SESSION_NAME}:" -n ros "bash -lc '${SCRIPT_DIR}/start-ros-receiver.sh'"
tmux new-window -t "${SESSION_NAME}:" -n b-side "bash -lc '${SCRIPT_DIR}/start-b-side-omnid.sh'"
exec tmux attach -t "${SESSION_NAME}"

10
scripts/dev/start-frontend.sh Executable file
View File

@@ -0,0 +1,10 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck disable=SC1091
source "${SCRIPT_DIR}/load-env.sh"
require_robot_command_center_root
cd "${FRONTEND_DIR}"
exec npm run dev -- --host "${FRONTEND_HOST}" --port "${FRONTEND_PORT}"

View File

@@ -0,0 +1,50 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source_with_nounset_off() {
set +u
# shellcheck disable=SC1090
source "$1"
set -u
}
# shellcheck disable=SC1091
source "${SCRIPT_DIR}/load-env.sh"
if [[ ! -f "/opt/ros/${ROS_DISTRO}/setup.bash" ]]; then
echo "Missing ROS distro setup: /opt/ros/${ROS_DISTRO}/setup.bash" >&2
exit 1
fi
source_with_nounset_off "/opt/ros/${ROS_DISTRO}/setup.bash"
cd "${ROS_CONTROL_PY_DIR}"
if [[ ! -f "install/setup.bash" ]]; then
echo "Missing ROS workspace setup: ${ROS_CONTROL_PY_DIR}/install/setup.bash" >&2
exit 1
fi
source_with_nounset_off "install/setup.bash"
launch_args=(
"transport:=${ROBOT_RECEIVER_TRANSPORT}"
"peer_id:=${ROBOT_RECEIVER_PEER_ID}"
"local_socket_path:=${ROBOT_RECEIVER_LOCAL_SOCKET_PATH}"
"output_topic:=${ROBOT_RECEIVER_OUTPUT_TOPIC}"
"frame_id:=${ROBOT_RECEIVER_FRAME_ID}"
"watchdog_timeout:=${ROBOT_RECEIVER_WATCHDOG_TIMEOUT}"
"publish_rate_hz:=${ROBOT_RECEIVER_PUBLISH_RATE_HZ}"
)
if [[ -n "${ROBOT_RECEIVER_SERVER_ADDR}" ]]; then
launch_args+=("server_addr:=${ROBOT_RECEIVER_SERVER_ADDR}")
fi
if [[ -n "${ROBOT_RECEIVER_RELAY_VIA}" ]]; then
launch_args+=("relay_via:=${ROBOT_RECEIVER_RELAY_VIA}")
fi
if [[ -n "${ROBOT_RECEIVER_EXPECTED_SENDER}" ]]; then
launch_args+=("expected_sender:=${ROBOT_RECEIVER_EXPECTED_SENDER}")
fi
exec ros2 launch udp_teleop_bridge robot_udp_receiver.launch.py "${launch_args[@]}"

333
src/gps_buffer.c Normal file
View File

@@ -0,0 +1,333 @@
#include "gps_buffer.h"
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netdb.h>
#include <math.h>
#include <errno.h> // 确保包含 errno
// 全局共享变量
static gps_video_sample_t g_current_gps_data = {0.0, 0.0};
static volatile int g_running = 0;
static pthread_t g_gps_thread;
static pthread_mutex_t g_gps_mutex = PTHREAD_MUTEX_INITIALIZER;
static double normalize_coordinate(double coordinate) {
return round(coordinate * 1000000.0) / 1000000.0;
}
static void store_gps(double latitude, double longitude) {
pthread_mutex_lock(&g_gps_mutex);
g_current_gps_data.latitude = normalize_coordinate(latitude);
g_current_gps_data.longitude = normalize_coordinate(longitude);
pthread_mutex_unlock(&g_gps_mutex);
}
static void clear_gps(void) {
pthread_mutex_lock(&g_gps_mutex);
g_current_gps_data.latitude = 0.0;
g_current_gps_data.longitude = 0.0;
pthread_mutex_unlock(&g_gps_mutex);
}
static gps_video_sample_t load_gps(void) {
gps_video_sample_t sample;
pthread_mutex_lock(&g_gps_mutex);
sample = g_current_gps_data;
pthread_mutex_unlock(&g_gps_mutex);
return sample;
}
static void gps_sleep_before_retry(void) {
int retry_ms = 1000;
int step_ms = 100;
int elapsed_ms = 0;
while (g_running && elapsed_ms < retry_ms) {
usleep((useconds_t) step_ms * 1000U);
elapsed_ms += step_ms;
}
}
// 将经纬度规范化为 double保留 6 位小数。
static int normalize_gps(double latitude, double longitude, gps_video_sample_t* sample) {
if (!isfinite(latitude) || !isfinite(longitude)) {
return -1;
}
// 过滤掉 0,0 这种无效坐标
if (fabs(latitude) < 1e-6 && fabs(longitude) < 1e-6) {
return -1;
}
if (sample == NULL) {
return -1;
}
sample->latitude = normalize_coordinate(latitude);
sample->longitude = normalize_coordinate(longitude);
return 0;
}
// =================================================================
// 以下是借鉴 gps_parse.c 实现的底层解析函数
// =================================================================
// 1. 辅助函数:在 JSON 字符串中查找键对应的值的起始位置
static const char* find_json_value(const char* json, const char* key) {
char pattern[64];
int written;
const char* position;
if (json == NULL || key == NULL) return NULL;
// 构建搜索模式: "key":
written = snprintf(pattern, sizeof(pattern), "\"%s\":", key);
if (written < 0 || (size_t)written >= sizeof(pattern)) {
return NULL;
}
position = strstr(json, pattern);
if (position == NULL) {
return NULL;
}
// 跳过 "key":
position += written;
// 跳过可能存在的空格
while (*position == ' ' || *position == '\t') {
position++;
}
return position;
}
// 2. 解析函数:从 JSON 字符串中提取 Double 类型的值
static int json_extract_double(const char* json, const char* key, double* value) {
const char* position;
char* endptr = NULL;
double parsed;
position = find_json_value(json, key);
if (position == NULL) {
return 0; // 键不存在
}
// 确保当前位置是数字或负号
if (*position != '-' && !(*position >= '0' && *position <= '9')) {
return 0;
}
// 重置 errno 以检测错误
errno = 0;
parsed = strtod(position, &endptr);
// 检查转换是否成功
if (errno != 0 || endptr == position || !isfinite(parsed)) {
return 0;
}
*value = parsed;
return 1;
}
// 3. 解析函数:从 JSON 字符串中提取 Int 类型的值
static int json_extract_int(const char* json, const char* key, int* value) {
double dval;
if (json_extract_double(json, key, &dval)) {
*value = (int)dval;
return 1;
}
return 0;
}
// 4. 检查是否为 TPV (定位数据) 包
static int is_tpv_class(const char* json) {
char class_buf[32] = {0};
const char* pos = find_json_value(json, "class");
if (pos == NULL || *pos != '"') return 0;
// 简单提取 class 的值 (TPV/SKY/DEVICES)
sscanf(pos, "\"%31[^\"]\"", class_buf);
return (strcmp(class_buf, "TPV") == 0);
}
// =================================================================
// 后台线程函数:负责连接 gpsd 并更新全局变量
// =================================================================
void* gps_update_thread(void* arg) {
const char* host = (const char*)arg;
const char* gpsd_host = (host != NULL && host[0] != '\0') ? host : "127.0.0.1";
while (g_running) {
int sockfd = -1;
struct addrinfo hints;
struct addrinfo *res = NULL;
struct addrinfo *rp = NULL;
int s;
char buffer[4096];
size_t offset = 0;
// 1. 解析地址并连接 gpsd (默认端口 2947)
memset(&hints, 0, sizeof(hints));
hints.ai_family = AF_UNSPEC; // 兼容 IPv4/IPv6
hints.ai_socktype = SOCK_STREAM;
s = getaddrinfo(gpsd_host, "2947", &hints, &res);
if (s != 0) {
fprintf(stderr, "GPS线程: 解析 gpsd 地址失败 %s:2947: %s\n", gpsd_host, gai_strerror(s));
gps_sleep_before_retry();
continue;
}
// 尝试连接每一个解析出来的地址
for (rp = res; rp != NULL; rp = rp->ai_next) {
sockfd = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
if (sockfd == -1) {
continue;
}
if (connect(sockfd, rp->ai_addr, rp->ai_addrlen) != -1) {
break;
}
close(sockfd);
sockfd = -1;
}
freeaddrinfo(res);
if (sockfd < 0) {
fprintf(stderr, "GPS线程: 无法连接到 %s:29471 秒后重试\n", gpsd_host);
gps_sleep_before_retry();
continue;
}
printf("GPS线程: 已连接到 gpsd %s\n", gpsd_host);
// 2. 发送 WATCH 命令,开启 JSON 流
{
const char* watch_cmd = "?WATCH={\"enable\":true,\"json\":true};\n";
if (send(sockfd, watch_cmd, strlen(watch_cmd), 0) < 0) {
perror("GPS线程: 发送 WATCH 命令失败");
close(sockfd);
gps_sleep_before_retry();
continue;
}
}
// 3. 主循环:读取并解析数据流
// 注意gpsd 数据是以 \n 结尾的,不能直接用固定长度 recv
while (g_running) {
ssize_t len = recv(sockfd, buffer + offset, sizeof(buffer) - 1 - offset, 0);
if (len <= 0) {
break;
}
offset += (size_t) len;
buffer[offset] = '\0'; // 确保字符串结束
// 查找换行符 \n因为一条完整的 JSON 消息以 \n 结尾
char* start = buffer;
char* end;
while ((end = memchr(start, '\n', (buffer + offset) - start)) != NULL) {
*end = '\0'; // 临时截断,形成独立字符串
// --- 核心解析逻辑 ---
// 1. 检查是否为 TPV 数据包
if (is_tpv_class(start)) {
double lat = 0.0;
double lon = 0.0;
int mode = 0;
int has_fix = 0;
// 2. 提取定位模式 (mode: 1=无定位, 2=2D, 3=3D)
if (json_extract_int(start, "mode", &mode)) {
has_fix = (mode >= 2);
}
// 3. 如果有定位,提取经纬度
if (has_fix) {
int got_lat = json_extract_double(start, "lat", &lat);
int got_lon = json_extract_double(start, "lon", &lon);
if (got_lat && got_lon) {
gps_video_sample_t sample;
// 4. 更新全局共享变量,使用 double 直接携带经纬度。
if (normalize_gps(lat, lon, &sample) == 0) {
store_gps(sample.latitude, sample.longitude);
}
// 调试:取消注释可查看实时经纬度
// printf("更新GPS: lat=%.6f, lon=%.6f\n", lat, lon);
}
}
// 如果无定位,这里不操作,保持上一次的有效值
}
// --- 解析结束 ---
// 移动指针到下一条消息
start = end + 1;
}
// 处理完所有完整消息后,将剩余未处理的数据移到缓冲区头部
if (start < buffer + offset) {
size_t remaining = (size_t) ((buffer + offset) - start);
memmove(buffer, start, remaining);
offset = remaining;
} else {
offset = 0; // 缓冲区已清空
}
}
close(sockfd);
if (g_running) {
fprintf(stderr, "GPS线程: 连接断开1 秒后重连...\n");
gps_sleep_before_retry();
}
}
return NULL;
}
// =================================================================
// 接口函数实现
// =================================================================
gps_video_sample_t get_latest_gps_for_video(void) {
return load_gps();
}
int gps_buffer_init(const char* host) {
if (g_running) return 0;
g_running = 1;
clear_gps();
pthread_attr_t attr;
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
// 创建后台线程
if (pthread_create(&g_gps_thread, &attr, gps_update_thread, (void*)host) != 0) {
g_running = 0;
pthread_attr_destroy(&attr); // 清理属性
perror("无法创建 GPS 线程");
return -1;
}
pthread_attr_destroy(&attr); // 清理属性
return 0;
}
void gps_buffer_cleanup(void) {
g_running = 0;
// 等待线程结束
usleep(10000); // 等待 100ms 让后台线程有机会处理退出标志
}
//gcc main.c video_pipeline_run.c gps_buffer.c -lpthread -lm -o my_app 请确保在编译命令中链接 pthread 和 m (math) 库

View File

@@ -18,7 +18,7 @@ kcp_packet_debug_logger_t *kcp_packet_debug_open_jsonl(const char *path) {
fclose(file);
return NULL;
}
omni_file_logger_init(&logger->file_logger, file);
omni_file_logger_init_path(&logger->file_logger, file, path, 0);
logger->enabled = 1;
return logger;
}

View File

@@ -73,7 +73,7 @@ kcp_session_stats_logger_t *kcp_session_stats_open_jsonl(const char *path) {
fclose(file);
return NULL;
}
omni_file_logger_init(&logger->file_logger, file);
omni_file_logger_init_path(&logger->file_logger, file, path, 0);
logger->enabled = 1;
return logger;
}
@@ -156,10 +156,38 @@ int kcp_session_stats_log(kcp_session_stats_logger_t *logger, const kcp_session_
kcp_session_stats_appendf(&line, &line_len, ",\"srtt_ms\":%d", record->srtt_ms) != 0) {
goto cleanup;
}
if (record->has_min_srtt_ms &&
kcp_session_stats_appendf(&line, &line_len, ",\"min_srtt_ms\":%d", record->min_srtt_ms) != 0) {
goto cleanup;
}
if (record->has_srttvar_ms &&
kcp_session_stats_appendf(&line, &line_len, ",\"srttvar_ms\":%d", record->srttvar_ms) != 0) {
goto cleanup;
}
if (record->has_last_feedback_age_ms &&
kcp_session_stats_appendf(&line, &line_len, ",\"last_feedback_age_ms\":%u", record->last_feedback_age_ms) != 0) {
goto cleanup;
}
if (record->has_snd_wnd &&
kcp_session_stats_appendf(&line, &line_len, ",\"snd_wnd\":%u", record->snd_wnd) != 0) {
goto cleanup;
}
if (record->has_rmt_wnd &&
kcp_session_stats_appendf(&line, &line_len, ",\"rmt_wnd\":%u", record->rmt_wnd) != 0) {
goto cleanup;
}
if (record->has_inflight &&
kcp_session_stats_appendf(&line, &line_len, ",\"inflight\":%u", record->inflight) != 0) {
goto cleanup;
}
if (record->has_window_limit &&
kcp_session_stats_appendf(&line, &line_len, ",\"window_limit\":%u", record->window_limit) != 0) {
goto cleanup;
}
if (record->has_window_pressure_pct &&
kcp_session_stats_appendf(&line, &line_len, ",\"window_pressure_pct\":%.3f", record->window_pressure_pct) != 0) {
goto cleanup;
}
if (record->has_bytes_sent &&
kcp_session_stats_appendf(&line, &line_len, ",\"bytes_sent\":%" PRIu64, record->bytes_sent) != 0) {
goto cleanup;

View File

@@ -32,7 +32,7 @@ latency_logger_t *latencylog_open_jsonl(const char *path) {
fclose(file);
return NULL;
}
omni_file_logger_init(&logger->file_logger, file);
omni_file_logger_init_path(&logger->file_logger, file, path, 0);
logger->enabled = 1;
return logger;
}

View File

@@ -544,9 +544,217 @@ const char *omni_path_base_name(const char *path) {
return slash == NULL ? path : slash + 1;
}
static uint64_t omni_now_monotonic_ms64(void) {
struct timespec ts;
clock_gettime(CLOCK_MONOTONIC, &ts);
return (uint64_t) ts.tv_sec * 1000ULL + (uint64_t) (ts.tv_nsec / 1000000L);
}
static int omni_positive_int_env(const char *name, int default_value) {
const char *raw = getenv(name);
long parsed;
char *endptr = NULL;
if (raw == NULL || raw[0] == '\0') {
return default_value;
}
parsed = strtol(raw, &endptr, 10);
if (endptr == raw || *endptr != '\0' || parsed <= 0) {
return default_value;
}
return (int) parsed;
}
static size_t omni_positive_size_env(const char *name, size_t default_value) {
const char *raw = getenv(name);
unsigned long long parsed;
char *endptr = NULL;
if (raw == NULL || raw[0] == '\0') {
return default_value;
}
parsed = strtoull(raw, &endptr, 10);
if (endptr == raw || *endptr != '\0' || parsed == 0ULL) {
return default_value;
}
return (size_t) parsed;
}
static int omni_file_logger_flush_locked(omni_file_logger_t *logger, uint64_t now_ms) {
if (logger == NULL || logger->file == NULL) {
errno = EINVAL;
return -1;
}
if (fflush(logger->file) != 0) {
return -1;
}
logger->buffered_bytes = 0U;
logger->last_flush_monotonic_ms = now_ms;
return 0;
}
static int omni_build_rotated_path(char *buffer, size_t buffer_len, const char *path, int suffix) {
size_t path_len;
int written;
if (buffer == NULL || buffer_len == 0U || path == NULL || path[0] == '\0') {
errno = EINVAL;
return -1;
}
path_len = strlen(path);
if (path_len + 16U >= buffer_len) {
errno = ENAMETOOLONG;
return -1;
}
memcpy(buffer, path, path_len);
written = snprintf(buffer + path_len, buffer_len - path_len, ".%d", suffix);
if (written < 0 || (size_t) written >= buffer_len - path_len) {
errno = ENAMETOOLONG;
return -1;
}
return 0;
}
static int omni_file_logger_reopen_append_locked(omni_file_logger_t *logger) {
struct stat st;
FILE *file;
if (logger == NULL || logger->path[0] == '\0') {
errno = EINVAL;
return -1;
}
file = fopen(logger->path, "ab");
if (file == NULL) {
return -1;
}
logger->file = file;
logger->current_bytes = 0U;
if (stat(logger->path, &st) == 0) {
logger->current_bytes = (size_t) st.st_size;
}
logger->buffered_bytes = 0U;
logger->last_flush_monotonic_ms = omni_now_monotonic_ms64();
return 0;
}
static int omni_file_logger_recover_after_rotate_locked(omni_file_logger_t *logger, const char *rotated_current_path) {
int reopen_errno;
if (omni_file_logger_reopen_append_locked(logger) == 0) {
return 0;
}
reopen_errno = errno;
if (rotated_current_path != NULL && rotated_current_path[0] != '\0') {
if (rename(rotated_current_path, logger->path) == 0) {
if (omni_file_logger_reopen_append_locked(logger) == 0) {
return 0;
}
}
}
errno = reopen_errno;
return -1;
}
static int omni_file_logger_rotate_locked(omni_file_logger_t *logger) {
int index;
int saved_errno = 0;
int should_recover = 0;
char rotated_current_path[PATH_MAX];
char from_path[PATH_MAX];
char to_path[PATH_MAX];
if (logger == NULL || logger->path[0] == '\0' || logger->max_bytes == 0U || logger->max_files <= 0) {
return 0;
}
rotated_current_path[0] = '\0';
if (logger->file != NULL) {
if (omni_file_logger_flush_locked(logger, omni_now_monotonic_ms64()) != 0) {
return -1;
}
should_recover = 1;
if (fclose(logger->file) != 0) {
logger->file = NULL;
saved_errno = errno;
goto recover;
}
logger->file = NULL;
}
if (omni_build_rotated_path(from_path, sizeof(from_path), logger->path, logger->max_files) != 0) {
saved_errno = errno;
goto recover;
}
unlink(from_path);
for (index = logger->max_files - 1; index >= 1; --index) {
if (omni_build_rotated_path(from_path, sizeof(from_path), logger->path, index) != 0 ||
omni_build_rotated_path(to_path, sizeof(to_path), logger->path, index + 1) != 0) {
saved_errno = errno;
goto recover;
}
if (rename(from_path, to_path) != 0 && errno != ENOENT) {
saved_errno = errno;
goto recover;
}
}
if (omni_build_rotated_path(to_path, sizeof(to_path), logger->path, 1) != 0) {
saved_errno = errno;
goto recover;
}
if (rename(logger->path, to_path) != 0 && errno != ENOENT) {
saved_errno = errno;
goto recover;
}
snprintf(rotated_current_path, sizeof(rotated_current_path), "%s", to_path);
if (omni_file_logger_reopen_append_locked(logger) != 0) {
saved_errno = errno;
goto recover;
}
return 0;
recover:
if (should_recover) {
int recover_errno = saved_errno != 0 ? saved_errno : errno;
if (omni_file_logger_recover_after_rotate_locked(logger, rotated_current_path) == 0) {
errno = recover_errno;
} else if (saved_errno != 0) {
errno = saved_errno;
}
} else if (saved_errno != 0) {
errno = saved_errno;
}
return -1;
}
void omni_file_logger_init(omni_file_logger_t *logger, FILE *file) {
memset(logger, 0, sizeof(*logger));
logger->file = file;
pthread_mutex_init(&logger->mutex, NULL);
logger->flush_bytes = 1U;
logger->flush_interval_ms = 0;
logger->immediate_flush = 1;
logger->last_flush_monotonic_ms = omni_now_monotonic_ms64();
}
void omni_file_logger_init_path(omni_file_logger_t *logger, FILE *file, const char *path, int immediate_flush) {
struct stat st;
omni_file_logger_init(logger, file);
if (path != NULL && path[0] != '\0') {
snprintf(logger->path, sizeof(logger->path), "%s", path);
if (stat(path, &st) == 0) {
logger->current_bytes = (size_t) st.st_size;
}
}
logger->flush_bytes = omni_positive_size_env("BLITZ_JSONL_FLUSH_BYTES", 262144U);
logger->flush_interval_ms = omni_positive_int_env("BLITZ_JSONL_FLUSH_INTERVAL_MS", 1000);
logger->max_bytes = omni_positive_size_env("BLITZ_JSONL_ROTATE_BYTES", 134217728U);
logger->max_files = omni_positive_int_env("BLITZ_JSONL_ROTATE_FILES", 8);
logger->immediate_flush = immediate_flush != 0;
}
void omni_file_logger_destroy(omni_file_logger_t *logger) {
@@ -555,13 +763,32 @@ void omni_file_logger_destroy(omni_file_logger_t *logger) {
int omni_file_logger_write_line(omni_file_logger_t *logger, const char *line) {
int rc = 0;
size_t line_len;
uint64_t now_ms;
if (logger == NULL || logger->file == NULL || line == NULL) {
errno = EINVAL;
return -1;
}
line_len = strlen(line) + 1U;
now_ms = omni_now_monotonic_ms64();
pthread_mutex_lock(&logger->mutex);
if (fputs(line, logger->file) == EOF || fputc('\n', logger->file) == EOF || fflush(logger->file) != 0) {
if (fputs(line, logger->file) == EOF || fputc('\n', logger->file) == EOF) {
rc = -1;
} else {
logger->current_bytes += line_len;
logger->buffered_bytes += line_len;
if (logger->immediate_flush ||
logger->buffered_bytes >= logger->flush_bytes ||
(logger->flush_interval_ms > 0 && now_ms - logger->last_flush_monotonic_ms >= (uint64_t) logger->flush_interval_ms)) {
if (omni_file_logger_flush_locked(logger, now_ms) != 0) {
rc = -1;
}
}
if (rc == 0 && logger->max_bytes > 0U && logger->current_bytes >= logger->max_bytes) {
if (omni_file_logger_rotate_locked(logger) != 0) {
rc = -1;
}
}
}
pthread_mutex_unlock(&logger->mutex);
return rc;

View File

@@ -1,24 +1,343 @@
#include "peer_kcp_client.h"
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define KCP_CLIENT_REGISTER_TIMEOUT_MS 3000
#define KCP_CLIENT_CTRL_REGISTER_OK "{\"type\":\"server_register_ok\"}"
#define KCP_CLIENT_CTRL_PEER_REPLACED "{\"type\":\"server_peer_replaced\",\"reason\":\"new_instance_wins\"}"
#define KCP_CLIENT_CTRL_HEARTBEAT "{\"type\":\"server_heartbeat\"}"
#define KCP_CLIENT_CTRL_HEARTBEAT_ACK "{\"type\":\"server_heartbeat_ack\"}"
struct kcp_client {
char id[OMNI_MAX_PEER_ID];
char server_addr[OMNI_MAX_ADDR_TEXT];
kcp_conn_t *conn;
latency_logger_t *logger;
pthread_mutex_t id_mu;
pthread_mutex_t state_mu;
uint64_t next_message_id;
int registered;
uint32_t last_server_activity_ms;
char last_server_error[256];
};
static int kcp_client_next_message_id(kcp_client_t *client, uint64_t *out_id) {
pthread_mutex_lock(&client->id_mu);
if (client == NULL || out_id == NULL) {
errno = EINVAL;
return -1;
}
pthread_mutex_lock(&client->state_mu);
*out_id = ++client->next_message_id;
pthread_mutex_unlock(&client->id_mu);
pthread_mutex_unlock(&client->state_mu);
return 0;
}
static void kcp_client_set_registered(kcp_client_t *client, int registered) {
if (client == NULL) {
return;
}
pthread_mutex_lock(&client->state_mu);
client->registered = registered != 0;
pthread_mutex_unlock(&client->state_mu);
}
static void kcp_client_touch_server_activity(kcp_client_t *client) {
if (client == NULL) {
return;
}
pthread_mutex_lock(&client->state_mu);
client->last_server_activity_ms = omni_now_millis32();
pthread_mutex_unlock(&client->state_mu);
}
static void kcp_client_set_last_server_error(kcp_client_t *client, const char *message) {
if (client == NULL) {
return;
}
pthread_mutex_lock(&client->state_mu);
snprintf(client->last_server_error, sizeof(client->last_server_error), "%s", message == NULL ? "" : message);
pthread_mutex_unlock(&client->state_mu);
}
static void kcp_client_clear_last_server_error(kcp_client_t *client) {
kcp_client_set_last_server_error(client, "");
}
static int kcp_client_server_error_invalidates_registration(const char *message) {
if (message == NULL || message[0] == '\0') {
return 0;
}
return strstr(message, "not registered") != NULL
|| strstr(message, "first message must be register") != NULL
|| strstr(message, "peer replaced") != NULL
|| strstr(message, "timed out waiting for server_register_ok") != NULL;
}
static int kcp_client_is_registered(kcp_client_t *client) {
int registered;
if (client == NULL) {
return 0;
}
pthread_mutex_lock(&client->state_mu);
registered = client->registered;
pthread_mutex_unlock(&client->state_mu);
return registered;
}
static int kcp_client_text_body_equals(const message_t *msg, const char *payload) {
size_t expected_len;
if (msg == NULL || payload == NULL || msg->body == NULL) {
return 0;
}
expected_len = strlen(payload);
return msg->body_len == expected_len && memcmp(msg->body, payload, expected_len) == 0;
}
static void kcp_client_copy_server_error_body(const message_t *msg, char *buffer, size_t buffer_len) {
size_t copy_len;
if (buffer == NULL || buffer_len == 0) {
return;
}
buffer[0] = '\0';
if (msg == NULL || msg->body == NULL || msg->body_len == 0) {
return;
}
copy_len = msg->body_len < (buffer_len - 1U) ? msg->body_len : (buffer_len - 1U);
memcpy(buffer, msg->body, copy_len);
buffer[copy_len] = '\0';
}
static int kcp_client_registration_errno_from_message(const char *message) {
if (message == NULL || message[0] == '\0') {
return ECONNREFUSED;
}
if (strstr(message, "duplicate peer id") != NULL) {
return EEXIST;
}
if (strstr(message, "first message must be register") != NULL) {
return EPROTO;
}
return ECONNREFUSED;
}
static int kcp_client_send_text_internal(kcp_client_t *client, const char *to, const char *text, int log_business_event) {
message_t msg;
uint64_t id;
if (client == NULL || to == NULL || text == NULL || client->conn == NULL) {
errno = EINVAL;
return -1;
}
protocol_message_init(&msg);
if (kcp_client_next_message_id(client, &id) != 0) {
return -1;
}
msg.type = MSG_TYPE_TEXT;
msg.id = id;
snprintf(msg.from, sizeof(msg.from), "%s", client->id);
snprintf(msg.to, sizeof(msg.to), "%s", to);
msg.body = (uint8_t *) omni_strdup(text);
if (msg.body == NULL) {
return -1;
}
msg.body_len = strlen((const char *) msg.body);
if (log_business_event) {
latencylog_log_message_event(client->logger, OMNI_NODE_ROLE_PEER, client->id, EVENT_A_APP_PREP_BEGIN, &msg);
}
if (kcp_conn_send(client->conn, &msg) != 0) {
protocol_message_clear(&msg);
return -1;
}
protocol_message_clear(&msg);
return 0;
}
static int kcp_client_send_business_preflight(kcp_client_t *client) {
if (client == NULL || client->conn == NULL) {
errno = ENOTCONN;
return -1;
}
if (!kcp_client_is_registered(client)) {
errno = ENOTCONN;
return -1;
}
return 0;
}
static int kcp_client_handle_reserved_server_message(kcp_client_t *client, const message_t *msg) {
if (client == NULL || msg == NULL) {
errno = EINVAL;
return -1;
}
if (msg->type != MSG_TYPE_TEXT || strcmp(msg->from, SERVER_PEER_ID) != 0) {
return 0;
}
kcp_client_touch_server_activity(client);
if (kcp_client_text_body_equals(msg, KCP_CLIENT_CTRL_REGISTER_OK)) {
kcp_client_set_registered(client, 1);
kcp_client_clear_last_server_error(client);
return 1;
}
if (kcp_client_text_body_equals(msg, KCP_CLIENT_CTRL_HEARTBEAT)) {
if (kcp_client_send_text_internal(client, SERVER_PEER_ID, KCP_CLIENT_CTRL_HEARTBEAT_ACK, 0) != 0) {
kcp_client_set_registered(client, 0);
kcp_client_set_last_server_error(client, "failed to acknowledge server heartbeat");
(void) kcp_conn_close(client->conn);
return -1;
}
return 1;
}
if (kcp_client_text_body_equals(msg, KCP_CLIENT_CTRL_HEARTBEAT_ACK)) {
return 1;
}
if (kcp_client_text_body_equals(msg, KCP_CLIENT_CTRL_PEER_REPLACED)) {
kcp_client_set_registered(client, 0);
kcp_client_set_last_server_error(client, "server peer replaced this session");
(void) kcp_conn_close(client->conn);
errno = ECONNRESET;
return -1;
}
return 0;
}
static int kcp_client_remaining_timeout_ms(int original_timeout_ms, uint32_t start_ms) {
uint32_t elapsed_ms;
if (original_timeout_ms < 0) {
return -1;
}
elapsed_ms = omni_now_millis32() - start_ms;
if (elapsed_ms >= (uint32_t) original_timeout_ms) {
return 0;
}
return original_timeout_ms - (int) elapsed_ms;
}
static int kcp_client_wait_for_register_ok(kcp_client_t *client) {
uint32_t start_ms;
if (client == NULL || client->conn == NULL) {
errno = EINVAL;
return -1;
}
start_ms = omni_now_millis32();
for (;;) {
message_t msg;
int rc;
int remaining_timeout_ms = kcp_client_remaining_timeout_ms(KCP_CLIENT_REGISTER_TIMEOUT_MS, start_ms);
if (remaining_timeout_ms <= 0) {
kcp_client_set_registered(client, 0);
kcp_client_set_last_server_error(client, "timed out waiting for server_register_ok");
(void) kcp_conn_close(client->conn);
errno = ETIMEDOUT;
return -1;
}
protocol_message_init(&msg);
rc = kcp_conn_receive_timed(client->conn, &msg, remaining_timeout_ms);
if (rc == 1) {
protocol_message_clear(&msg);
kcp_client_set_registered(client, 0);
kcp_client_set_last_server_error(client, "timed out waiting for server_register_ok");
(void) kcp_conn_close(client->conn);
errno = ETIMEDOUT;
return -1;
}
if (rc != 0) {
protocol_message_clear(&msg);
kcp_client_set_registered(client, 0);
return -1;
}
if (msg.type == MSG_TYPE_ERROR && strcmp(msg.from, SERVER_PEER_ID) == 0) {
char error_text[256];
kcp_client_copy_server_error_body(&msg, error_text, sizeof(error_text));
kcp_client_touch_server_activity(client);
kcp_client_set_registered(client, 0);
kcp_client_set_last_server_error(client, error_text);
protocol_message_clear(&msg);
(void) kcp_conn_close(client->conn);
errno = kcp_client_registration_errno_from_message(error_text);
return -1;
}
rc = kcp_client_handle_reserved_server_message(client, &msg);
protocol_message_clear(&msg);
if (rc < 0) {
return -1;
}
if (rc > 0 && kcp_client_is_registered(client)) {
return 0;
}
kcp_client_set_registered(client, 0);
kcp_client_set_last_server_error(client, "unexpected message while waiting for server_register_ok");
(void) kcp_conn_close(client->conn);
errno = EPROTO;
return -1;
}
}
static int kcp_client_receive_business_timed(kcp_client_t *client, message_t *out_msg, int timeout_ms) {
uint32_t start_ms;
if (client == NULL || out_msg == NULL || client->conn == NULL) {
errno = EINVAL;
return -1;
}
start_ms = omni_now_millis32();
protocol_message_init(out_msg);
for (;;) {
int rc;
int reserved_rc;
int effective_timeout_ms = timeout_ms < 0 ? -1 : kcp_client_remaining_timeout_ms(timeout_ms, start_ms);
if (timeout_ms >= 0 && effective_timeout_ms <= 0) {
return 1;
}
protocol_message_clear(out_msg);
rc = kcp_conn_receive_timed(client->conn, out_msg, effective_timeout_ms);
if (rc != 0) {
if (rc != 1) {
kcp_client_set_registered(client, 0);
}
return rc;
}
if (strcmp(out_msg->from, SERVER_PEER_ID) == 0) {
kcp_client_touch_server_activity(client);
}
reserved_rc = kcp_client_handle_reserved_server_message(client, out_msg);
if (reserved_rc < 0) {
protocol_message_clear(out_msg);
return -1;
}
if (reserved_rc > 0) {
protocol_message_clear(out_msg);
continue;
}
if (out_msg->type == MSG_TYPE_ERROR && strcmp(out_msg->from, SERVER_PEER_ID) == 0) {
char error_text[256];
kcp_client_copy_server_error_body(out_msg, error_text, sizeof(error_text));
kcp_client_set_last_server_error(client, error_text);
if (kcp_client_server_error_invalidates_registration(error_text)) {
kcp_client_set_registered(client, 0);
}
}
latencylog_log_message_event(client->logger, OMNI_NODE_ROLE_PEER, client->id, EVENT_B_APP_RECV, out_msg);
return 0;
}
}
static int kcp_client_persist_message_to_disk(const message_t *msg, const char *inbox_dir, char *out_path, size_t out_path_len) {
char path[512];
@@ -107,7 +426,8 @@ kcp_client_t *kcp_client_dial_with_options(const char *server_addr, const char *
}
snprintf(client->id, sizeof(client->id), "%s", peer_id);
snprintf(client->server_addr, sizeof(client->server_addr), "%s", server_addr == NULL ? "" : server_addr);
pthread_mutex_init(&client->id_mu, NULL);
pthread_mutex_init(&client->state_mu, NULL);
client->last_server_activity_ms = omni_now_millis32();
client->logger = logger;
client->conn = kcp_conn_dial_with_options(actual_dial_addr, bind_ip, bind_device, options, packet_logger, logger, OMNI_NODE_ROLE_PEER, peer_id, stats_logger, stats_interval_ms);
if (client->conn == NULL) {
@@ -128,6 +448,12 @@ kcp_client_t *kcp_client_dial_with_options(const char *server_addr, const char *
errno = saved_errno;
return NULL;
}
if (kcp_client_wait_for_register_ok(client) != 0) {
saved_errno = errno;
kcp_client_free(client);
errno = saved_errno;
return NULL;
}
return client;
}
@@ -140,34 +466,27 @@ const char *kcp_client_id(const kcp_client_t *client) {
}
int kcp_client_send_text(kcp_client_t *client, const char *to, const char *text) {
message_t msg;
uint64_t id;
if (client == NULL || to == NULL || text == NULL) {
errno = EINVAL;
return -1;
}
protocol_message_init(&msg);
kcp_client_next_message_id(client, &id);
msg.type = MSG_TYPE_TEXT;
msg.id = id;
snprintf(msg.from, sizeof(msg.from), "%s", client->id);
snprintf(msg.to, sizeof(msg.to), "%s", to);
msg.body = (uint8_t *) omni_strdup(text);
if (msg.body == NULL) {
if (kcp_client_send_business_preflight(client) != 0) {
return -1;
}
msg.body_len = strlen((const char *) msg.body);
latencylog_log_message_event(client->logger, OMNI_NODE_ROLE_PEER, client->id, EVENT_A_APP_PREP_BEGIN, &msg);
if (kcp_conn_send(client->conn, &msg) != 0) {
protocol_message_clear(&msg);
return -1;
}
protocol_message_clear(&msg);
return 0;
return kcp_client_send_text_internal(client, to, text, 1);
}
int kcp_client_send_binary(kcp_client_t *client, const char *to, const void *data, size_t data_len) {
return kcp_client_send_binary_with_id(client, to, data, data_len, NULL);
}
int kcp_client_send_binary_with_id(
kcp_client_t *client,
const char *to,
const void *data,
size_t data_len,
uint64_t *out_id
) {
message_t msg;
uint64_t id;
@@ -175,8 +494,13 @@ int kcp_client_send_binary(kcp_client_t *client, const char *to, const void *dat
errno = EINVAL;
return -1;
}
if (kcp_client_send_business_preflight(client) != 0) {
return -1;
}
protocol_message_init(&msg);
kcp_client_next_message_id(client, &id);
if (kcp_client_next_message_id(client, &id) != 0) {
return -1;
}
msg.type = MSG_TYPE_BINARY;
msg.id = id;
snprintf(msg.from, sizeof(msg.from), "%s", client->id);
@@ -194,6 +518,9 @@ int kcp_client_send_binary(kcp_client_t *client, const char *to, const void *dat
protocol_message_clear(&msg);
return -1;
}
if (out_id != NULL) {
*out_id = id;
}
protocol_message_clear(&msg);
return 0;
}
@@ -209,11 +536,17 @@ int kcp_client_send_file_path(kcp_client_t *client, const char *to, const char *
errno = EINVAL;
return -1;
}
if (kcp_client_send_business_preflight(client) != 0) {
return -1;
}
if (omni_read_file(path, &body, &body_len) != 0) {
return -1;
}
protocol_message_init(&msg);
kcp_client_next_message_id(client, &id);
if (kcp_client_next_message_id(client, &id) != 0) {
free(body);
return -1;
}
msg.type = MSG_TYPE_FILE;
msg.id = id;
snprintf(msg.from, sizeof(msg.from), "%s", client->id);
@@ -231,18 +564,11 @@ int kcp_client_send_file_path(kcp_client_t *client, const char *to, const char *
}
int kcp_client_receive_timed(kcp_client_t *client, message_t *out_msg, int timeout_ms) {
int rc;
if (client == NULL || out_msg == NULL) {
errno = EINVAL;
return -1;
}
rc = kcp_conn_receive_timed(client->conn, out_msg, timeout_ms);
if (rc != 0) {
return rc;
}
latencylog_log_message_event(client->logger, OMNI_NODE_ROLE_PEER, client->id, EVENT_B_APP_RECV, out_msg);
return 0;
return kcp_client_receive_business_timed(client, out_msg, timeout_ms);
}
int kcp_client_receive(kcp_client_t *client, message_t *out_msg) {
@@ -264,6 +590,7 @@ int kcp_client_receive_binary_into(kcp_client_t *client, void *buffer, size_t bu
protocol_message_init(&msg);
rc = kcp_client_receive_timed(client, &msg, timeout_ms);
if (rc != 0) {
protocol_message_clear(&msg);
return rc;
}
@@ -294,8 +621,48 @@ int kcp_client_persist_message(kcp_client_t *client, const message_t *msg, const
return 0;
}
void kcp_client_state_snapshot(kcp_client_t *client, kcp_client_state_t *out_state) {
kcp_runtime_stats_t runtime_stats;
if (out_state == NULL) {
return;
}
memset(out_state, 0, sizeof(*out_state));
if (client == NULL) {
return;
}
memset(&runtime_stats, 0, sizeof(runtime_stats));
if (client->conn != NULL) {
kcp_conn_runtime_stats_snapshot(client->conn, &runtime_stats);
out_state->connected = runtime_stats.connected;
}
pthread_mutex_lock(&client->state_mu);
out_state->registered = client->registered;
out_state->server_idle_ms = client->last_server_activity_ms == 0
? 0
: (omni_now_millis32() - client->last_server_activity_ms);
snprintf(out_state->last_server_error, sizeof(out_state->last_server_error), "%s", client->last_server_error);
pthread_mutex_unlock(&client->state_mu);
}
void kcp_client_runtime_stats_snapshot(kcp_client_t *client, kcp_runtime_stats_t *out_stats) {
if (out_stats == NULL) {
return;
}
memset(out_stats, 0, sizeof(*out_stats));
if (client == NULL || client->conn == NULL) {
return;
}
kcp_conn_runtime_stats_snapshot(client->conn, out_stats);
}
int kcp_client_close(kcp_client_t *client) {
return client == NULL ? 0 : kcp_conn_close(client->conn);
if (client == NULL) {
return 0;
}
kcp_client_set_registered(client, 0);
return kcp_conn_close(client->conn);
}
void kcp_client_free(kcp_client_t *client) {
@@ -303,6 +670,6 @@ void kcp_client_free(kcp_client_t *client) {
return;
}
kcp_conn_free(client->conn);
pthread_mutex_destroy(&client->id_mu);
pthread_mutex_destroy(&client->state_mu);
free(client);
}

Some files were not shown because too many files have changed in this diff Show More