@rvagg
Contributor

@rvagg rvagg commented Jan 17, 2026

Opening this as a draft because I'm still testing it, but it does work as-is for now.

I'm testing FilOzone/synapse-sdk#544 & filecoin-project/curio#864, which will end up being default behaviour, so we need to account for it here. The challenge I'm solving here is that I need the Curio nodes to be able to talk to each other, the host to be able to talk to the Curio nodes, and the SP registry to provide a serviceUrl that works both inside and outside the containers. Unfortunately, I don't think there's a clean Docker way to achieve this that works across Linux and macOS, so instead I'm going with the default Docker internal host name, host.docker.internal, and requiring that the host also know that name for its own localhost. This is a solution I've found various people using online to achieve something similar.

foc-devnet will now check that you have that set up before proceeding and tell you how to fix it if it's needed.

Note also we're doing a default of 2 Curio nodes here to simulate the durability requirements that we need to achieve, so this will also be standard going forward.

@FilOzzy FilOzzy added this to FOC Jan 17, 2026
@github-project-automation github-project-automation bot moved this to 📌 Triage in FOC Jan 17, 2026
- Preserves Portainer for persistent access
- Clears run ID

### Pausing and Resuming (Resource Saving)
Contributor Author

I'm learning that this might not be all that good. Curio doesn't seem to like sleeping: its harmony task heartbeat seems to get out of whack and it needs a restart. I might have to remove these instructions, or at least provide Curio restart instructions.

Contributor Author

I'm going to remove this section; I haven't been able to make pause & resume work well for me. I think I'd prefer a full start/stop that doesn't go through the whole deployment cycle.

@BigLep BigLep moved this from 📌 Triage to ⌨️ In Progress in FOC Jan 25, 2026
@BigLep BigLep linked an issue Jan 25, 2026 that may be closed by this pull request
@redpanda-f redpanda-f marked this pull request as ready for review February 2, 2026 08:05
Copilot AI review requested due to automatic review settings February 2, 2026 08:05
Contributor

Copilot AI left a comment

Pull request overview

This PR enables SP-to-SP (Service Provider to Service Provider) data fetching by using host.docker.internal as a hostname that resolves to localhost from both the host machine and inside Docker containers. The changes ensure that service URLs registered in the SP registry work across different network contexts. Additionally, the default number of Curio service providers is increased from 1 to 2 to better simulate durability requirements.

Changes:

  • Increases default SP count from 1 to 2 to simulate durability requirements
  • Adds a pre-startup check that validates host.docker.internal resolves to localhost, with clear error messages and setup instructions
  • Updates service URLs to use host.docker.internal instead of localhost for cross-container compatibility
  • Adds CURIO_PULL_ALLOW_INSECURE=1 environment variable to allow HTTP connections in devnet
  • Adds --add-host=host.docker.internal:host-gateway to Curio containers for proper hostname resolution

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/config.rs | Updates default approved_pdp_sp_count and active_pdp_sp_count from 1 to 2 |
| src/commands/start/mod.rs | Adds check_host_docker_internal function to validate hostname resolution before startup |
| src/commands/start/pdp_service_provider/pdp_service_provider_step.rs | Changes service URL from localhost to host.docker.internal for cross-container compatibility |
| src/commands/start/curio/db_setup.rs | Adds CURIO_PULL_ALLOW_INSECURE environment variable for insecure HTTP sources |
| src/commands/start/curio/daemon.rs | Adds --add-host Docker flag to enable host.docker.internal resolution in containers |
| README.md | Documents the host.docker.internal prerequisite with setup instructions |
| README_ADVANCED.md | Adds detailed Yugabyte database query examples and clarifies multi-SP setup |

  for (sp_index, sp_address, sp_eth_address, pdp_port, should_approve) in sp_data {
-     let service_url = format!("http://localhost:{}", pdp_port);
+     // Use host.docker.internal so the URL works from both host and containers
+     let service_url = format!("http://host.docker.internal:{}", pdp_port);

Copilot AI Feb 2, 2026

The string "host.docker.internal" is repeated multiple times throughout the codebase (in this file, pdp_service_provider_step.rs, and daemon.rs). According to the coding guidelines, magic names should be defined as constants. Consider defining a constant like DOCKER_HOST_INTERNAL at the module or crate level to avoid repetition and make future changes easier.

Copilot generated this review using guidance from repository custom instructions.
"--network".to_string(),
pdp_miner_network_name(run_id, sp_index),
// Enable host.docker.internal for SP-to-SP fetch (resolves to host gateway)
"--add-host=host.docker.internal:host-gateway".to_string(),

Copilot AI Feb 2, 2026

The Docker argument "--add-host=host.docker.internal:host-gateway" contains the magic name "host.docker.internal" which is used in multiple places. According to the coding guidelines, all magic names should be constants. Consider defining a constant for this Docker argument or for the hostname component to ensure consistency across the codebase.

Copilot generated this review using guidance from repository custom instructions.
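
One way to address the two comments above is a single constant for the hostname, with the service URL and the `--add-host` argument both derived from it. This is a minimal sketch, not code from the PR: `DOCKER_HOST_INTERNAL` is the name Copilot suggests, while `DOCKER_HOST_GATEWAY`, `service_url`, and `add_host_arg` are hypothetical names chosen for illustration.

```rust
/// Hostname that should reach the host from the host itself and from containers.
const DOCKER_HOST_INTERNAL: &str = "host.docker.internal";

/// Docker's special `--add-host` value that maps a name to the host gateway.
const DOCKER_HOST_GATEWAY: &str = "host-gateway";

/// Build the URL an SP registers in the SP registry (hypothetical helper).
fn service_url(pdp_port: u16) -> String {
    format!("http://{}:{}", DOCKER_HOST_INTERNAL, pdp_port)
}

/// Build the `--add-host` argument for `docker run` (hypothetical helper).
fn add_host_arg() -> String {
    format!("--add-host={}:{}", DOCKER_HOST_INTERNAL, DOCKER_HOST_GATEWAY)
}
```

With this in place, `service_url(pdp_port)` would replace the inline `format!` in pdp_service_provider_step.rs and `add_host_arg()` would replace the string literal in daemon.rs, so a future hostname change touches one line.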
Comment on lines +50 to +103
fn check_host_docker_internal() -> Result<(), Box<dyn std::error::Error>> {
    info!("Checking host.docker.internal resolution...");

    // Try to resolve host.docker.internal:80 (port doesn't matter, just need DNS resolution)
    match "host.docker.internal:80".to_socket_addrs() {
        Ok(mut addrs) => {
            // Check if any resolved address is 127.0.0.1
            let is_localhost = addrs.any(|addr| addr.ip().is_loopback());

            if is_localhost {
                info!("✓ host.docker.internal resolves to localhost");
                Ok(())
            } else {
                error!("════════════════════════════════════════════════════════════════════");
                error!("ERROR: host.docker.internal does not resolve to localhost (127.0.0.1)");
                error!("════════════════════════════════════════════════════════════════════");
                error!("");
                error!("SP-to-SP fetch requires host.docker.internal to resolve to 127.0.0.1");
                error!("so that registered SP URLs work from both host and containers.");
                error!("");
                error!("To fix this, add the following line to /etc/hosts:");
                error!("");
                error!(" 127.0.0.1 host.docker.internal");
                error!("");
                error!("You can do this with:");
                error!(" echo '127.0.0.1 host.docker.internal' | sudo tee -a /etc/hosts");
                error!("");
                error!("════════════════════════════════════════════════════════════════════");
                Err("host.docker.internal must resolve to 127.0.0.1".into())
            }
        }
        Err(_) => {
            error!("════════════════════════════════════════════════════════════════════");
            error!("ERROR: host.docker.internal does not resolve");
            error!("════════════════════════════════════════════════════════════════════");
            error!("");
            error!("SP-to-SP fetch requires host.docker.internal to resolve to 127.0.0.1");
            error!("so that registered SP URLs work from both host and containers.");
            error!("");
            error!("Add the following line to /etc/hosts:");
            error!("");
            error!(" 127.0.0.1 host.docker.internal");
            error!("");
            error!("You can do this with:");
            error!(" echo '127.0.0.1 host.docker.internal' | sudo tee -a /etc/hosts");
            error!("");
            error!("For GitHub Actions, add this step before running foc-devnet:");
            error!(" - run: echo '127.0.0.1 host.docker.internal' | sudo tee -a /etc/hosts");
            error!("");
            error!("════════════════════════════════════════════════════════════════════");
            Err("host.docker.internal must be resolvable".into())
        }
    }
}

Copilot AI Feb 2, 2026

The check_host_docker_internal function exceeds the 15-line limit specified in the coding guidelines. This 53-line function should be decomposed into smaller functions. Consider extracting the error message generation into a separate function and the resolution check logic into another function.

Copilot generated this review using guidance from repository custom instructions.
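
The decomposition suggested above could look roughly like the following. It is a sketch under assumptions, not the merged code: the helper names (`resolves_to_loopback`, `fix_instructions`, `classify`) and the `CHECK_TARGET` constant are hypothetical, `eprintln!` stands in for the `error!` logging macro so the block is self-contained, and the error type is simplified to `String`.

```rust
use std::io;
use std::net::ToSocketAddrs;

/// Host and port to resolve; the port is irrelevant to name resolution.
const CHECK_TARGET: &str = "host.docker.internal:80";

/// Returns true if any address that `target` resolves to is a loopback address.
fn resolves_to_loopback(target: &str) -> io::Result<bool> {
    Ok(target.to_socket_addrs()?.any(|addr| addr.ip().is_loopback()))
}

/// Fix instructions shared by both failure cases.
fn fix_instructions() -> Vec<String> {
    vec![
        "Add the following line to /etc/hosts:".to_string(),
        "  127.0.0.1 host.docker.internal".to_string(),
        "You can do this with:".to_string(),
        "  echo '127.0.0.1 host.docker.internal' | sudo tee -a /etc/hosts".to_string(),
    ]
}

/// Maps the resolution outcome to the two error cases of the original function.
fn classify(resolved: io::Result<bool>) -> Result<(), String> {
    match resolved {
        Ok(true) => Ok(()),
        Ok(false) => Err("host.docker.internal must resolve to 127.0.0.1".to_string()),
        Err(_) => Err("host.docker.internal must be resolvable".to_string()),
    }
}

/// Top-level check: resolve, classify, and print instructions on failure.
fn check_host_docker_internal() -> Result<(), String> {
    classify(resolves_to_loopback(CHECK_TARGET)).map_err(|msg| {
        eprintln!("ERROR: {}", msg);
        fix_instructions().iter().for_each(|line| eprintln!("{}", line));
        msg
    })
}
```

Splitting resolution from classification also makes the logic testable without touching /etc/hosts, since `classify` can be exercised with hand-built results.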
info!("Checking host.docker.internal resolution...");

// Try to resolve host.docker.internal:80 (port doesn't matter, just need DNS resolution)
match "host.docker.internal:80".to_socket_addrs() {

Copilot AI Feb 2, 2026

The port number 80 used in "host.docker.internal:80" should be defined as a constant per the coding guidelines which require all magic numbers to be constants. Consider defining a constant like DNS_CHECK_PORT at the module level.

Copilot generated this review using guidance from repository custom instructions.
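
A small sketch of what that could look like, assuming the constant names `DOCKER_HOST_INTERNAL` and `DNS_CHECK_PORT` from the review comments (`any_loopback` and `host_name_is_loopback` are hypothetical helper names). It relies on the standard library's `ToSocketAddrs` implementation for `(&str, u16)`, which avoids building a `"host:port"` string at all.

```rust
use std::io;
use std::net::ToSocketAddrs;

const DOCKER_HOST_INTERNAL: &str = "host.docker.internal";

/// Arbitrary port: `to_socket_addrs` needs one, but only name resolution matters.
const DNS_CHECK_PORT: u16 = 80;

/// Returns true if `host` resolves to at least one loopback address.
fn any_loopback(host: &str, port: u16) -> io::Result<bool> {
    // `(&str, u16)` implements `ToSocketAddrs`, so no string formatting is needed.
    Ok((host, port).to_socket_addrs()?.any(|addr| addr.ip().is_loopback()))
}

/// The check from the PR, expressed with the two constants.
fn host_name_is_loopback() -> io::Result<bool> {
    any_loopback(DOCKER_HOST_INTERNAL, DNS_CHECK_PORT)
}
```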
@redpanda-f
Collaborator

Inter-SP Communication Solution Analysis

TL;DR

Where we are: localhost:<port> doesn't work for inter-SP comms because localhost means different things inside vs. outside a container.

The fix: Use host.docker.internal. On the host it resolves to 127.0.0.1 (via /etc/hosts), and inside containers it resolves to the host gateway (via --add-host), so the same URL lets SPs talk to each other while staying accessible from the host.


The Problem

So here's the thing: Docker networking makes this tricky. You've got SPs running in containers that need to:

  1. Talk to each other (inter-SP communication)
  2. Be reachable from the host (for testing/synapse)

But localhost doesn't work because it's context-dependent:

  • On the host, localhost = the machine
  • Inside a container, localhost = that container
  • So when SP1 tries to hit localhost:8080 thinking it's reaching SP2, it's actually hitting itself

Docker DNS names (foc-curio-1, etc.) work great within containers but don't resolve from the host at all. Catch-22.

The Solution: host.docker.internal

Rod's approach (PR #34) is actually pretty elegant. Use host.docker.internal as the unified endpoint:

How it works:

  • Add 127.0.0.1 host.docker.internal to /etc/hosts on the host (one-liner, macOS Docker Desktop does this automatically)
  • Pass --add-host=host.docker.internal:host-gateway when launching Curio containers
  • SPs register themselves in the registry as http://host.docker.internal:<port> instead of localhost:<port>
  • Set CURIO_PULL_ALLOW_INSECURE=1 to bypass TLS cert validation for this internal hostname

Why it works:

  • From the host: host.docker.internal resolves to 127.0.0.1 (via /etc/hosts)
  • From containers: Docker's host-gateway mapping points it back to the host's IP
  • Same hostname, same port, works everywhere — SP1 can reach SP2, and both can be reached from the host
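
To make the host-side half concrete, here is a hypothetical helper that inspects hosts-file text for an entry mapping a name to a loopback address. It is only an illustration of what the /etc/hosts line provides; the PR itself does not parse /etc/hosts and instead resolves the name with `to_socket_addrs`. The function name `hosts_maps_to_loopback` is an assumption.

```rust
use std::net::IpAddr;

/// Returns true if `hosts` (the text of a hosts file) maps `name` to a loopback address.
fn hosts_maps_to_loopback(hosts: &str, name: &str) -> bool {
    hosts.lines().any(|line| {
        // Drop trailing comments, then split into whitespace-separated fields.
        let entry = line.split('#').next().unwrap_or("");
        let mut fields = entry.split_whitespace();
        // The first field is the address; the rest are hostnames and aliases.
        let is_loopback = fields
            .next()
            .and_then(|ip| ip.parse::<IpAddr>().ok())
            .map_or(false, |ip| ip.is_loopback());
        is_loopback && fields.any(|host| host.eq_ignore_ascii_case(name))
    })
}
```

A caller could pass it the result of `std::fs::read_to_string("/etc/hosts")`, though resolving the name, as the PR does, also covers setups where the mapping comes from somewhere other than the hosts file.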

The Good Stuff ✅

  • Single endpoint — one URL works everywhere (host + all containers)
  • Enables inter-SP comms — SPs can actually call each other
  • Host still accessible — testing code on the host works fine
  • Low overhead — literally just DNS + one Docker flag, no architectural changes
  • Minimal setup: /etc/hosts one-liner (automated for macOS)
  • Backward compatible — existing tests keep working

The Tradeoffs ⚠️

  • Host-level modification — Need to edit /etc/hosts. Automated with clear instructions; Linux users do it once.
  • CURIO_PULL_ALLOW_INSECURE=1 — Bypasses TLS validation. Fine for devnet; production would need proper certs.
  • CI/CD setup — GitHub Actions et al. need the /etc/hosts line too (documented in error messages).
  • Docker dependency — Relies on host-gateway being available (it's standard in modern Docker).

None of these are dealbreakers for a devnet tool. The setup friction is minimal and the benefit is immediate.

Validation & Safety

The code already checks that host.docker.internal resolves correctly before startup:

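fn check_host_docker_internal() -> Result<(), Box<dyn std::error::Error>> {
    use std::net::ToSocketAddrs;
    match "host.docker.internal:80".to_socket_addrs() {
        Ok(mut addrs) => {
            let is_localhost = addrs.any(|addr| addr.ip().is_loopback());
            if is_localhost {
                Ok(()) // ✓ we're good
            } else {
                // Friendly error + exact fix instructions (elided here)
                Err("host.docker.internal must resolve to 127.0.0.1".into())
            }
        }
        // Same friendly error when the name does not resolve at all
        Err(_) => Err("host.docker.internal must be resolvable".into()),
    }
}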

Catches the problem early with clear, copy-pasteable fix instructions for different platforms.

What's Actually Implemented

  • Service URL registration: http://host.docker.internal:{port}
  • Container launch flag: --add-host=host.docker.internal:host-gateway
  • Env var: CURIO_PULL_ALLOW_INSECURE=1
  • Pre-flight DNS check: validates setup before startup ✓
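
Put together, the container-side pieces in the list above might be assembled along these lines. This is a sketch only: the function name `curio_run_command`, the `network` and `image` parameters, and passing the variable with Docker's `-e` flag are all assumptions for illustration, since the thread does not show how db_setup.rs or daemon.rs actually wire these in. The command is built but never executed.

```rust
use std::process::Command;

/// Assemble (but do not run) a `docker run` command for one Curio container.
/// `network` and `image` are placeholders supplied by the caller.
fn curio_run_command(network: &str, image: &str) -> Command {
    let mut cmd = Command::new("docker");
    cmd.arg("run")
        .args(["--network", network])
        // Make host.docker.internal resolve to the host gateway inside the container.
        .arg("--add-host=host.docker.internal:host-gateway")
        // Assumption: the variable reaches the container via `-e`.
        .args(["-e", "CURIO_PULL_ALLOW_INSECURE=1"])
        .arg(image);
    cmd
}
```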

@redpanda-f redpanda-f removed this from FOC Feb 2, 2026
@BigLep BigLep added this to FOC Feb 2, 2026
@github-project-automation github-project-automation bot moved this to 📌 Triage in FOC Feb 2, 2026
@BigLep BigLep moved this from 📌 Triage to 🔎 Awaiting review in FOC Feb 2, 2026
@BigLep BigLep requested a review from redpanda-f February 2, 2026 18:35
@BigLep
Contributor

BigLep commented Feb 2, 2026

So @rvagg can't give official-in-github approval because he's the original author of the draft.

I think the next steps are:

  1. @rvagg give comment approval
  2. Once that comes through, @redpanda-f can give "rubber-stamp" approval so he can then merge the PR.

@BigLep BigLep assigned rvagg and unassigned rvagg Feb 2, 2026
@BigLep
Contributor

BigLep commented Feb 2, 2026

@redpanda-f: I assume you'll go through the Copilot comments, see which ones are relevant, and resolve them accordingly.

@github-project-automation github-project-automation bot moved this from 🔎 Awaiting review to ✔️ Approved by reviewer in FOC Feb 3, 2026
@redpanda-f redpanda-f merged commit 624a78b into main Feb 3, 2026
2 checks passed
@github-project-automation github-project-automation bot moved this from ✔️ Approved by reviewer to 🎉 Done in FOC Feb 3, 2026
@redpanda-f redpanda-f deleted the rvagg/multi-sp branch February 3, 2026 10:26
@rjan90 rjan90 added this to the M4.0: mainnet staged milestone Feb 3, 2026
Projects

Status: 🎉 Done

Development

Successfully merging this pull request may close these issues.

Inter-SP communication setup

5 participants