[Bug] qwen3 30b vl precision mismatch between npu and gpu #23222

@cjy0x

Checklist

  • I searched related issues but found no solution.
  • The bug persists in the latest version.
  • Issues without environment info and a minimal reproducible demo are hard to resolve and may receive no feedback.
  • If this is not a bug report but a general question, please start a discussion at https://github.com/sgl-project/sglang/discussions. Otherwise, it will be closed.
  • Please use English. Otherwise, it will be closed.

Describe the bug

I ran the qwen3 30b vl model with sglang v0.5.10 under slime’s debug-rollout-only mode and found that the raw_reward curves on NPU and GPU do not match.

Can you align the precision between NPU and GPU?

Reproduction

Run the qwen3 30b vl model with sglang v0.5.10 under slime’s debug-rollout-only mode, once on NPU and once on GPU.

Then compare the NPU and GPU raw_reward curves.
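
A minimal sketch of how the gap between the two curves could be quantified, assuming the per-step raw_reward values from each run have already been exported to two equal-length lists. The function name `compare_reward_curves`, the `atol` threshold, and the returned field names are hypothetical and are not part of slime or sglang.

```python
from statistics import mean


def compare_reward_curves(npu_rewards, gpu_rewards, atol=1e-2):
    """Summarize the per-step gap between two raw_reward curves.

    Both arguments are sequences of floats, one value per rollout step.
    `atol` is an assumed tolerance, not a value taken from slime or sglang.
    """
    if len(npu_rewards) != len(gpu_rewards):
        raise ValueError("curves must cover the same number of rollout steps")
    if not npu_rewards:
        raise ValueError("curves must not be empty")

    # Absolute difference at every rollout step.
    diffs = [abs(n - g) for n, g in zip(npu_rewards, gpu_rewards)]
    # Index of the step where the two curves are furthest apart.
    worst_step = max(range(len(diffs)), key=diffs.__getitem__)

    return {
        "mean_abs_diff": mean(diffs),
        "max_abs_diff": diffs[worst_step],
        "worst_step": worst_step,
        "steps_over_atol": sum(d > atol for d in diffs),
    }
```

Reporting the mean gap, the worst step, and the number of steps over the tolerance makes it easier to tell a small constant offset from a divergence that grows over the run.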

Environment

sglang: v0.5.10
slime: v0.2.2
gpu: h100; npu: a3
