About chek-ego-miner

Crowdsource EGO robot data capture, contribution, and public-safe edge-host bring-up.

Published by chekdata

CHEK EGO Miner

Capture first-person EGO data with a phone and computer, contribute sessions, and browse reusable datasets.

Start Here

Download the iOS app: TestFlight
Choose your hardware: Hardware Guide
Check the current public roadmap: TODO
See validation status: Public Validation Matrix
Understand the repo boundary: Public and Private Runtime Boundary
Understand the business goals, shared contract, and multi-phone ownership rule between the two repos: Repo Business Contract
Get step-by-step help:
Browse and download contributed datasets:
- EGO Dataset Portal

What You Can Do

start with one phone and one computer
add a stereo camera when you want better spatial cues
move to a dedicated edge setup for higher-throughput capture
use an AI assistant for guided install and troubleshooting
contribute sessions and explore downloadable datasets

Public-First Scope

This repository is the public-first entry point for contributors. It should give people a clear install path, working operator surfaces, agent-guided troubleshooting, and a usable frontend without forcing them to understand the full internal runtime topology first.

It is the path for people who want to assemble their own edge machine or run the stack from a computer install, instead of depending on a factory-integrated device workflow.

As the project evolves, this repo should keep that public-first experience while sharing common building blocks with the factory edge engineering line. The goal is to avoid long-term duplicate runtime or module implementations, while still supporting a genuinely different installation and hardware path.

Capability Lanes

From the public product point of view, chek-ego-miner should expose usable entry points for SLAM, VLM, and time-sync so that a self-assembled edge host or a computer install can actually run those capabilities.

That does not mean all three should become long-term ego-miner-only module implementations:

SLAM
- today is still more tightly coupled to the factory-integrated edge line, especially around sensing bring-up, calibration, replay, training gates, and engineering observability
- chek-ego-miner should expose the public install and operator path for it, but should not fork a second long-lived core SLAM stack
VLM
- must remain directly usable from chek-ego-miner, including model fetch, sidecar startup, service wiring, and public diagnostics
- the underlying runtime behavior should converge with the factory edge line instead of drifting into two different VLM implementations
time-sync
- is a shared capture-quality capability needed by both product lines
- chek-ego-miner should surface install, validation, and operator feedback, while deeper factory calibration and engineering observability can stay in chek-edge-runtime

The rule going forward is simple: if two files express the same capability in modules/, profiles/, services/, install backends, or shared UI panels, they should be deduped into shared building blocks, templates, or versioned assets instead of being maintained as two drifting copies.
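
One way to spot deduplication candidates is to compare same-named files across two local checkouts. The sketch below is a hypothetical helper, not a tool shipped by either repository. It assumes that files expressing the same capability share a relative path under the directories named above, which may not always hold, and it only covers three of those directories by default.

```python
import hashlib
from pathlib import Path

# Directory names taken from the rule above; adjust to the real repo layout.
SHARED_DIRS = ("modules", "profiles", "services")


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_shared_files(repo_a: Path, repo_b: Path, dirs=SHARED_DIRS) -> dict:
    """Map each relative path present in both repos to 'identical' or 'drifted'."""
    result = {}
    for name in dirs:
        root_a = repo_a / name
        if not root_a.is_dir():
            continue
        for file_a in sorted(root_a.rglob("*")):
            if not file_a.is_file():
                continue
            rel = file_a.relative_to(repo_a)
            file_b = repo_b / rel
            if file_b.is_file():
                same = file_digest(file_a) == file_digest(file_b)
                result[rel.as_posix()] = "identical" if same else "drifted"
    return result
```

Both outcomes are worth reviewing: "identical" files are straightforward candidates for a shared asset, while "drifted" files show where two copies have already diverged.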

System View

flowchart LR
  Phone["iPhone + CHEK App"] --> Host["Computer or Edge Host"]
  Camera["Your Camera or Stereo Camera"] --> Host
  Agent["Codex / Claude / OpenClaw"] --> Host
  Host --> Upload["Upload EGO Sessions"]
  Upload --> Portal["Dataset Portal"]
  Upload --> Rewards["Token Rewards"]

Choose a Setup

| Tier | Setup | Who it is for |
| --- | --- | --- |
| `Lite` | computer + your own camera | fastest way to start |
| `Stereo` | computer + stereo camera | better spatial quality |
| `Pro` | edge machine + stereo camera | dedicated capture and higher throughput |

You will also need a first-person phone mount. See Hardware Guide for buying criteria, setup tradeoffs, search keywords, and direct purchase examples, including China marketplace links.

Get Step-by-Step Help

If you want guided setup instead of reading long docs, start with:

AGENTS.md
one of the ready-to-use prompts:

Recommended flow:

Tell the assistant which hardware tier you have.
Share your OS and what is already installed.
Ask for one step at a time.
Keep hardware checks, app install, and camera validation in the flow.

Before You Install

Before a longer install session, run the lightweight host self-check:

python3 scripts/check_host_basics.py

If you plan to share your own fork or public changes, run:

./scripts/scan_public_safety.sh .

Or use the CLI:

./cli/chek-ego-miner doctor
./cli/chek-ego-miner camera-probe
./cli/chek-ego-miner readiness --tier lite
./cli/chek-ego-miner readiness --tier pro
./cli/chek-ego-miner public-e2e --tier lite

Use ./cli/chek-ego-miner camera-probe --capture-smoke when you need to distinguish "camera is listed by the OS" from "the current terminal session can open the camera and read a frame".

public-e2e is the single public summary command. It reports host OS, hardware tier, camera readiness, VLM policy, local capture result, and upload policy. It does not upload by default.

Lite Setup on Linux or macOS

If you want the quickest supported setup path, start here:

./cli/chek-ego-miner install \
  --profile basic \
  --apply \
  --system-install \
  --enable-services

python3 -m pip install --user --break-system-packages -r scripts/edge_phone_vision_requirements.txt
./cli/chek-ego-miner fetch-phone-vision-models --json
./scripts/start_edge_phone_vision_service.sh

./cli/chek-ego-miner basic-e2e \
  --edge-base-url http://127.0.0.1:8080 \
  --edge-token chek-ego-miner-local-token \
  --trip-id trip-public-basic-e2e \
  --session-id sess-public-basic-e2e \
  --output-dir ./artifacts/basic-e2e \
  --json

Or run the same local capture flow through the public summary command:

./cli/chek-ego-miner public-e2e \
  --tier lite \
  --run-basic-e2e \
  --edge-base-url http://127.0.0.1:8080 \
  --edge-token chek-ego-miner-local-token \
  --trip-id trip-public-basic-e2e \
  --session-id sess-public-basic-e2e \
  --json

If Homebrew-managed macOS python3 blocks pip install --user, install the same requirements into a compatible interpreter such as python3.10; the start script will auto-select it when available.

After the basic flow finishes, you should see:

ok: true
validation.ok: true
validation.score_percent: 100.0
public_download/demo_capture_bundle.json in your output directory
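
If you want to check these results in a script instead of by eye, a minimal sketch follows. It assumes the dotted names above map to nested keys in the `--json` output (for example, `validation.ok` being `ok` inside a `validation` object). That layout is an assumption, not a documented schema, and both helper names are hypothetical.

```python
import json
from pathlib import Path


def basic_e2e_passed(report: dict) -> bool:
    """Return True when the report matches the success criteria listed above."""
    validation = report.get("validation") or {}
    return (
        report.get("ok") is True
        and validation.get("ok") is True
        and validation.get("score_percent") == 100.0
    )


def demo_bundle_exists(output_dir: str) -> bool:
    """Check for public_download/demo_capture_bundle.json under the output directory."""
    bundle = Path(output_dir) / "public_download" / "demo_capture_bundle.json"
    return bundle.is_file()


# Hand-written report for illustration; a real one would come from the CLI output.
sample = json.loads('{"ok": true, "validation": {"ok": true, "score_percent": 100.0}}')
print(basic_e2e_passed(sample))
```

With the commands above, the output directory would be `./artifacts/basic-e2e`.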

Notes:

This path is intended for Linux x86_64 and macOS arm64 basic hosts.
On macOS, install --system-install --enable-services stages the runtime under ~/.chek-edge/runtime/macos/basic.
time_sync_samples can stay empty on the single-phone basic path.

Training Threshold Validation

Raw upload/download success is not the same as training readiness. To check a downloaded session bundle against the public SLAM + time-sync candidate gate:

python3 scripts/generate_slam_time_sync_benchmark.py \
  --bundle /path/to/raw_bundle.tar.gz \
  --tier pro \
  --output /tmp/slam_time_sync_benchmark.json \
  --json

python3 scripts/validate_training_thresholds.py \
  --bundle /path/to/raw_bundle.tar.gz \
  --tier pro \
  --slam-benchmark-report /tmp/slam_time_sync_benchmark.json \
  --json

The validator checks:

VLM events, segments, fallback usage, and latency
time-sync sample count, accepted mapping ratio, per-source RTT, and offset span
phone pose, stereo pose, Wi-Fi pose, and fisheye track completeness
whether a SLAM benchmark report with drift, reprojection, pose-graph, and body-tracking metrics exists and passes the candidate budgets

The benchmark generator only emits metrics that can be computed from data present in the bundle. From the current raw bundle it can compute stereo reprojection error and body-tracking coverage. Trajectory drift and pose-graph residual remain explicit blockers until the bundle includes ground truth, loop-closure evidence, or a SLAM optimizer report.

The validator returns exit code 0 only when training_ready=true; incomplete or candidate-only bundles return exit code 2 and list the blocking checks. Raw clock_offset_ns values can cross clock domains, so the validator does not treat the absolute offset as the sync error. Instead it evaluates the offset span and stability within each source kind.
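
To illustrate why a per-source span is more meaningful than the absolute offset, here is a small sketch. It is not the validator's implementation: the `source_kind` field name and the sample values are hypothetical, and only `clock_offset_ns` comes from this document.

```python
from collections import defaultdict


def offset_span_by_source(samples):
    """Return {source_kind: max(offset) - min(offset)} in nanoseconds."""
    offsets = defaultdict(list)
    for sample in samples:
        offsets[sample["source_kind"]].append(sample["clock_offset_ns"])
    return {kind: max(values) - min(values) for kind, values in offsets.items()}


# Hypothetical samples: the two sources sit in different clock domains, so their
# absolute offsets differ by orders of magnitude, yet each one is stable.
samples = [
    {"source_kind": "phone", "clock_offset_ns": 1_700_000_000_000_000_100},
    {"source_kind": "phone", "clock_offset_ns": 1_700_000_000_000_000_400},
    {"source_kind": "stereo", "clock_offset_ns": 5_000},
    {"source_kind": "stereo", "clock_offset_ns": 5_250},
]
print(offset_span_by_source(samples))  # {'phone': 300, 'stereo': 250}
```

Comparing the raw offsets across sources would suggest a huge sync error, while the spans show that both sources drift by only a few hundred nanoseconds.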

It intentionally separates:

signal_candidate_ready: the bundle has enough live signals for a candidate review
training_ready: the bundle passes frozen thresholds and has required SLAM benchmark metrics

Until configs/slam_time_sync_training_v1.json is frozen and real benchmark metrics are present, the tool will refuse to claim final training readiness.
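
For automation such as a CI step, the exit codes described above can be handled as in the following sketch. It uses only the flags shown in this section and should be run from the repository root. The function names are hypothetical, and any exit code other than 0 or 2 is treated as unexpected because this document does not describe it.

```python
import subprocess
import sys

EXIT_TRAINING_READY = 0  # documented: returned only when training_ready=true
EXIT_NOT_READY = 2       # documented: incomplete or candidate-only bundles


def build_validator_command(bundle, tier, benchmark_report):
    """Build the argv list using only the flags shown in this section."""
    return [
        sys.executable, "scripts/validate_training_thresholds.py",
        "--bundle", bundle,
        "--tier", tier,
        "--slam-benchmark-report", benchmark_report,
        "--json",
    ]


def describe_exit_code(returncode):
    """Translate the validator's exit code into a short label."""
    if returncode == EXIT_TRAINING_READY:
        return "training_ready"
    if returncode == EXIT_NOT_READY:
        return "blocked"
    return "unexpected"  # any other code is not described in this document


def run_validator(bundle, tier, benchmark_report):
    """Run the validator and return (label, captured stdout)."""
    cmd = build_validator_command(bundle, tier, benchmark_report)
    completed = subprocess.run(cmd, capture_output=True, text=True)
    return describe_exit_code(completed.returncode), completed.stdout
```

A "blocked" result is expected for candidate-only bundles, so a pipeline may want to log the blocking checks from stdout instead of treating that result as a crash.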

Pro Setup on Jetson

If you want the full Pro runtime path on Jetson, bootstrap the machine first. This brings in:

stereo calibration
the Wi-Fi sensing model and sensing-server
edge-orchestrator, ruview-leap-bridge, and ruview-unitree-bridge binaries
RuView/ui-react/dist
an existing Jetson GPU VLM environment plus SmolVLM model cache

./cli/chek-ego-miner jetson-professional-bootstrap -- --force
./cli/chek-ego-miner install \
  --profile professional \
  --apply \
  --system-install \
  --runtime-edge-root "$PWD"

If you only want the Jetson VLM path, use the bundled sidecar and model fetch flow:

./cli/chek-ego-miner install \
  --profile professional \
  --apply \
  --system-install \
  --enable-services

python3 -m pip install --user -r scripts/edge_vlm_requirements.txt
./cli/chek-ego-miner fetch-vlm-models --json
./cli/chek-ego-miner vlm-start

If the target Jetson already has a working GPU VLM environment and local model cache, you can wire only those VLM assets and enable the sidecar through systemd-user:

./cli/chek-ego-miner jetson-vlm-bootstrap -- --force
./cli/chek-ego-miner service-install \
  --profile professional \
  --service chek-edge-vlm-sidecar \
  --enable \
  --runtime-edge-root "$PWD"

Notes:

fetch-vlm-models downloads the core Hugging Face files needed by transformers.
Default model files are stored under model-candidates/huggingface/.
A successful Jetson bring-up should look like:
- ./cli/chek-ego-miner readiness --tier pro reports the host is ready
- required services reach active
- /health, /association/hint, /api/v1/stream/status, and /infer return live responses on the host
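
The endpoint check in the last item can be scripted roughly as below. This is a sketch under stated assumptions: this document does not say which service or port serves each path, nor which HTTP method each one expects, so the base URL is left to the caller and every path is probed with a plain GET. The function names are hypothetical.

```python
import urllib.error
import urllib.request

# Paths listed above. Which service and port serves each one is not stated here.
PATHS = ("/health", "/association/hint", "/api/v1/stream/status", "/infer")


def http_status(url, timeout=3.0):
    """Return the HTTP status code for a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, just not with a 2xx status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, DNS failure, or timeout


def probe(base_url, paths=PATHS, fetch=http_status):
    """Map each path to the status returned by fetch(base_url + path)."""
    base = base_url.rstrip("/")
    return {path: fetch(base + path) for path in paths}
```

A call such as `probe("http://127.0.0.1:8080")` reuses the base URL from the Lite examples, which may not match your Jetson setup. A `None` value means nothing answered at that address. A non-2xx code still shows that something is listening, for example if `/infer` only accepts POST requests.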

Dataset Portal

Search and download contributed data from:

https://www.chekkk.com/humanoid/ego-dataset

What You Can Do Today

onboard a new capture setup
choose hardware and accessories
use prompts for guided setup
run the Lite/basic path on Linux or macOS
bring up the Pro Jetson VLM and service path
learn how contribution, rewards, and dataset discovery work

Docs

Contributing

See CONTRIBUTING.md.

Security

See SECURITY.md.

License

See LICENSE.
