env server/client #744

mikasenghaas · 2026-01-19T13:57:16Z

Description

WIP. Based on #734 and #739.

This PR introduces EnvClient and EnvServer, which together act as a drop-in replacement for an in-process Environment while executing the environment in a separate process (or process pool). This is especially useful for multi-env training (e.g. in prime-rl) and evals (e.g. via vf-eval or in online evals during training). A couple of notes on design decisions:

The EnvClient mirrors the in-process Environment for the most common public-facing methods, such as run_rollout, run_group, generate and evaluate.
The client/server pattern is protocol-agnostic, i.e. the client may communicate with the server via any protocol (gRPC/ZMQ/HTTP/...). For now, only ZMQ is implemented.
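
A minimal sketch of what "protocol-agnostic" could look like, assuming the public methods only build a request and delegate to a single transport hook. The names `SketchEnvClient`, `InMemoryEnvClient`, `_request` and `echo_handler` are hypothetical and are not taken from this PR; the in-memory transport stands in for a real ZMQ/gRPC/HTTP implementation.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Awaitable, Callable, Dict

Handler = Callable[[str, Dict[str, Any]], Awaitable[Dict[str, Any]]]


class SketchEnvClient(ABC):
    """Hypothetical base class: public methods do not depend on the transport."""

    async def run_rollout(self, **kwargs: Any) -> Dict[str, Any]:
        return await self._request("run_rollout", kwargs)

    async def evaluate(self, **kwargs: Any) -> Dict[str, Any]:
        return await self._request("evaluate", kwargs)

    @abstractmethod
    async def _request(self, method: str, payload: Dict[str, Any]) -> Dict[str, Any]:
        """Send one request over the concrete transport (ZMQ, gRPC, HTTP, ...)."""


class InMemoryEnvClient(SketchEnvClient):
    """Toy transport that awaits a handler directly instead of using a socket."""

    def __init__(self, handler: Handler) -> None:
        self._handler = handler

    async def _request(self, method: str, payload: Dict[str, Any]) -> Dict[str, Any]:
        return await self._handler(method, payload)


async def echo_handler(method: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for a server: reports back what it was asked to do.
    return {"method": method, "payload": payload}
```

Under this assumption, supporting another protocol would only mean adding another subclass that implements `_request`; callers of `run_rollout` and `evaluate` would not change.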

Example

The env server pattern is integrated into vf-eval, which can spawn an env server as a sidecar:

uv run vf-eval gsm8k -n5 -r3 --use-env-server

Design

EnvServer

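An EnvServer is initialized like a regular environment with an env_id and env_args

# Build the server for one environment and bind it to the given address
server = ZMQEnvServer(
    env_id=args.env_id,
    env_args=args.env_args,
    address=address
)

try:
    # Serve requests until cancelled
    await server.run()
finally:
    # Always release the server's resources
    await server.close()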

EnvClient

An EnvClient communicates with an env server over the configured address

env = ZMQEnvClient(address=address)

await env.run_rollout(...) # same as Environment.run_rollout
await env.run_group(...) # same as Environment.run_group
await env.evaluate(...) # same as Environment.evaluate

Sidecar Pattern

To sidecar an env server (e.g. from vf-eval), wrap the run_server class method in a Process and connect the client to the same address

from multiprocessing import Process  # assuming the standard-library Process

# Run the server in its own process, bound to the shared address
env_server = Process(
    target=ZMQEnvServer.run_server,
    args=(config.env_id, config.env_args),
    kwargs=dict(address=address)
)
env_server.start()
env = ZMQEnvClient(address=address)

try:
    results = await env.evaluate(...)
finally:
    # Ask the server to exit, then force-kill it if it is still alive after 5s
    env_server.terminate()
    env_server.join(timeout=5)
    if env_server.is_alive():
        env_server.kill()
        env_server.join()
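
The shutdown sequence in the finally block above could be factored into a small helper so that every entrypoint that spawns a sidecar escalates the same way. This is only a sketch: `stop_process` is a hypothetical name, and it relies on nothing beyond the `terminate`, `join`, `is_alive` and `kill` methods of `multiprocessing.Process`.

```python
import multiprocessing


def stop_process(proc: multiprocessing.Process, timeout: float = 5.0) -> bool:
    """Stop a sidecar process, escalating from terminate to kill.

    Returns True if the process ignored terminate() and had to be killed.
    """
    proc.terminate()            # polite request (SIGTERM on POSIX)
    proc.join(timeout=timeout)  # give it `timeout` seconds to exit
    if proc.is_alive():
        proc.kill()             # forceful stop (SIGKILL on POSIX)
        proc.join()             # reap the process
        return True
    return False
```

The finally block would then reduce to a single `stop_process(env_server)` call.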

Breaking

Pass client_config: ClientConfig instead of client: AsyncOpenAI to public-facing methods. Clients are not serializable, so there is no other way to mirror the Environment API.
Concurrency can only be limited for an entire rollout for now, because the AsyncContextManagers are not serializable. We can probably define env-level gen+score concurrency limits that are enforced across all calls, but this is still a breaking change for users of run_rollout and run_group.
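
To illustrate why a config can cross the process boundary where a live client cannot, here is a rough sketch. It assumes a JSON wire format and invents `SketchClientConfig`, `SketchClient`, `encode_config`, `decode_config` and `get_client` purely for illustration; the real ClientConfig fields and the serialization used by this PR may differ.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass(frozen=True)
class SketchClientConfig:
    """Hypothetical config: plain fields only, so it serializes trivially."""
    base_url: str
    api_key_var: str = "OPENAI_API_KEY"
    timeout_s: float = 60.0


class SketchClient:
    """Stand-in for a live API client, which would hold sockets and locks."""

    def __init__(self, config: SketchClientConfig) -> None:
        self.config = config


def encode_config(config: SketchClientConfig) -> bytes:
    """Client side: turn the config into bytes that can be sent to the server."""
    return json.dumps(asdict(config)).encode("utf-8")


def decode_config(raw: bytes) -> SketchClientConfig:
    """Server side: rebuild an equal config from the received bytes."""
    return SketchClientConfig(**json.loads(raw.decode("utf-8")))


_client_cache: Dict[SketchClientConfig, SketchClient] = {}


def get_client(config: SketchClientConfig) -> SketchClient:
    """Server side: lazily create one client per distinct config and reuse it."""
    if config not in _client_cache:
        _client_cache[config] = SketchClient(config)
    return _client_cache[config]
```

Because the dataclass is frozen it is hashable, so equal configs arriving in separate requests map to the same cached client rather than opening a new connection each time.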

TODOs

Typed inputs/outputs (this required lazily creating clients and removing them from the public-facing API)
Integrate with multi-env evals
Logging
Graceful server termination
Unit tests
Multi-process environment execution

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Test improvement

Testing

All existing tests pass when running uv run pytest locally.
New tests have been added to cover the changes

Checklist

My code follows the style guidelines of this project as outlined in AGENTS.md
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
Any dependent changes have been merged and published

Additional Notes

samsja · 2026-01-19T20:48:46Z

verifiers/envs/environment.py

        self,
        input: RolloutInput,
-        client: AsyncOpenAI,
+        client_config: ClientConfig,


wondering if we shouldn't have some kind of backward compatibility and still accept client as input, failing at runtime in the cases where it isn't supported?

yea, we could support this but it would always come at the cost of the EnvClient not being a drop-in replacement for an Environment because we cannot mirror the API with client as it's not serializable

would prefer at least initially not breaking the rollout API -- some integrations (e.g. Tinker) use this directly, we'll need to do some other changes here around non-OpenAI client types like Anthropic, which can get subsumed by a generic Client / ClientConfig type eventually.

It's nice for users to be able to play around with envs in scripts/notebooks where the rollout method can be used directly, and IMO we should still support this with a generic OpenAI client.

My preference would be to handle it the same way we're doing DatasetBuilder, where we have a union type + keep the old var name for back-compatibility, and check the type where relevant. We can allow certain code paths (prime-rl orchestrator, vf-eval) to fail if a generic client is used.

yeah, I was also thinking about a Union type here so that we can do both. we could also introduce a new method that only takes a client config and keep this one for backward compatibility
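
The union-type idea from this thread might look roughly like the sketch below. Everything here is hypothetical (`SketchClient`, `SketchClientConfig`, `resolve_client`, `require_config`); it only shows the type check, with in-process paths accepting either form and server-backed paths rejecting a live client.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class SketchClientConfig:
    """Hypothetical serializable description of a client."""
    base_url: str


class SketchClient:
    """Stand-in for a live client such as AsyncOpenAI."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url


ClientOrConfig = Union[SketchClient, SketchClientConfig]


def resolve_client(client: ClientOrConfig) -> SketchClient:
    """In-process path: accept either form, keeping the old argument name."""
    if isinstance(client, SketchClientConfig):
        return SketchClient(client.base_url)
    return client


def require_config(client: ClientOrConfig) -> SketchClientConfig:
    """Server-backed path: only a config can be sent to another process."""
    if not isinstance(client, SketchClientConfig):
        raise TypeError("a live client cannot be sent to an env server; pass a config")
    return client
```

With this shape, scripts and notebooks that pass a live client keep working in-process, while the failure for server-backed paths is explicit and early.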

samsja · 2026-01-19T20:50:23Z

verifiers/workers/client/__init__.py

@@ -0,0 +1,3 @@
+from verifiers.workers.client.zmq_env_client import ZMQEnvClient


wondering if the client name could be confused with the OpenAI client. Do we expect users to instantiate a client, or should it be handled under the hood by load_environment?

we would expect users to instantiate a client/server. imo naming is pretty clear here, it's called exactly what it is: a client to interface with an environment

env servers wrap load_environment (and hence Environment), clients are used to interface with those env servers

which users here? IMO the spawning of clients/servers should always be automated by entrypoints like vf-eval / orchestrator. both can pull in info from configs, create their own clients, and spawn + connect to as many servers as needed

ah yea, im considering myself a user of verifiers when developing prime-rl haha. i agree, the user who is only running commands is blind to this and should never have to spawn an env server themselves

samsja · 2026-01-19T20:51:04Z

verifiers/workers/client/env_client.py

+)
+
+
+class EnvClient(ABC):


I would use Protocol here over ABC unless we have a lot of shared code. But not a deal breaker, we can also stick with ABC

i will check if possible. i think this might work for the client, but unlikely for the server
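
For reference, the Protocol alternative raised in this thread could be sketched as follows. `EnvClientLike` and `DummyClient` are hypothetical names; the point is only that a `typing.Protocol` is satisfied structurally, so implementations need not inherit from a shared base.

```python
import asyncio
from typing import Any, Dict, Protocol, runtime_checkable


@runtime_checkable
class EnvClientLike(Protocol):
    """Structural interface: any object with these methods qualifies."""

    async def run_rollout(self, **kwargs: Any) -> Dict[str, Any]: ...

    async def evaluate(self, **kwargs: Any) -> Dict[str, Any]: ...


class DummyClient:
    """Does not inherit from EnvClientLike, yet satisfies it."""

    async def run_rollout(self, **kwargs: Any) -> Dict[str, Any]:
        return {"kind": "rollout", **kwargs}

    async def evaluate(self, **kwargs: Any) -> Dict[str, Any]:
        return {"kind": "evaluate", **kwargs}


async def smoke_test(env: EnvClientLike) -> Dict[str, Any]:
    # Works with any client that matches the protocol, whatever its transport.
    return await env.run_rollout(example_id=0)
```

The trade-off is the one noted above: a Protocol carries no shared implementation, so an ABC remains the better fit where subclasses reuse a lot of code. Note that the `runtime_checkable` isinstance check only verifies that the methods exist, not their signatures.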

mikasenghaas changed the base branch from main to eval-tui January 19, 2026 13:57

mikasenghaas added 10 commits January 19, 2026 17:40

poc zmq env server/client

d7ab0c5

deprecate gen/score_sem + pass client config instead of clients

1895c3d

poc integration with vf-eval

777df83

type req/res

18b261b

cleanup server

9e9db4f

graceful shutdown

1e118cc

allow logging to files

cd25701

setup logging on env server

117a7e8

minor

269bbab

unique address for multi-env evals

d8cf093

mikasenghaas force-pushed the mika/env-worker branch from 8d79a1d to d8cf093 Compare January 19, 2026 17:47

mikasenghaas changed the base branch from eval-tui to multi-env-eval+dataset-builder January 19, 2026 17:47

mikasenghaas added 6 commits January 19, 2026 17:59

main runner

a644681

cleanup

ddd7501

cleanup env client

50f4bef

cleanup

6a1d994

revert flattening input keys in state

5fdda90

fix ty

309295b

mikasenghaas requested review from samsja and willccbb January 19, 2026 20:00

rename to use_env_server

b63c5a3

samsja reviewed Jan 19, 2026


mikasenghaas changed the title ~~env worker~~ env server/client Jan 20, 2026
