vllm.model_executor.layers.quantization.utils.nvfp4_emulation_utils ¶
   __all__  module-attribute  ¶
    kE2M1ToFloat  module-attribute  ¶
 kE2M1ToFloat = tensor(
    [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0], dtype=float32
)
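kE2M1ToFloat maps the eight FP4 (E2M1: 2 exponent bits, 1 mantissa bit) magnitudes to float32; the sign lives in the fourth bit of each 4-bit code. A small worked example of reading one code against this table (the decoding shown is the standard E2M1 bit layout, not code taken from this module):

import torch

kE2M1ToFloat = torch.tensor(
    [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0], dtype=torch.float32
)

# One 4-bit E2M1 code: bit 3 is the sign, bits 0..2 index this magnitude table.
code = 0b1110                                        # sign bit set, magnitude index 6
value = (-1.0 if code & 0x8 else 1.0) * kE2M1ToFloat[code & 0x7]
print(value)                                         # tensor(-4.)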
  break_fp4_bytes ¶
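break_fp4_bytes unpacks a uint8 tensor holding two packed E2M1 codes per byte into a higher-precision tensor. A minimal sketch of that unpacking, assuming an (m, n) uint8 input and low-nibble-first ordering; the name and signature below are illustrative, not the module's actual API:

import torch
from vllm.model_executor.layers.quantization.utils.nvfp4_emulation_utils import (
    kE2M1ToFloat)

def unpack_fp4_bytes_sketch(packed: torch.Tensor, dtype: torch.dtype) -> torch.Tensor:
    # packed: (m, n) uint8; output: (m, 2 * n) values in `dtype`.
    assert packed.dtype == torch.uint8
    m, n = packed.shape
    low, high = packed & 0x0F, packed >> 4               # two 4-bit codes per byte
    codes = torch.stack((low, high), dim=-1).reshape(m, 2 * n)
    sign = 1.0 - 2.0 * (codes >> 3).to(torch.float32)    # bit 3 is the sign bit
    return (sign * kE2M1ToFloat[(codes & 0x7).long()]).to(dtype)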
   cast_to_fp4 ¶
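Casting to FP4 means rounding each value to the nearest representable E2M1 magnitude (0, 0.5, 1, 1.5, 2, 3, 4, 6) while preserving the sign. A hedged nearest-value sketch; it ignores the exact tie-breaking a real E2M1 cast would use, and the helper name is hypothetical:

import torch

_E2M1_VALUES = torch.tensor([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def cast_to_fp4_sketch(x: torch.Tensor) -> torch.Tensor:
    # Round |x| to the nearest representable E2M1 magnitude, keep the sign.
    sign = torch.sign(x)
    dist = (x.abs().unsqueeze(-1) - _E2M1_VALUES).abs()
    return sign * _E2M1_VALUES[dist.argmin(dim=-1)]

print(cast_to_fp4_sketch(torch.tensor([0.3, -2.4, 5.7])))  # tensor([ 0.5000, -2.0000,  6.0000])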
   convert_swizzled_to_linear ¶
 convert_swizzled_to_linear(
    a_sf_swizzled: Tensor, m, k, block_size
)
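NVFP4 block scales are typically stored in the tiled ("swizzled") layout expected by the GPU scaled-GEMM path; this helper recovers a plain row-major (m, k // block_size) scale matrix for emulation. A hedged usage sketch with dummy data; the flat buffer size, fp8 dtype, and output shape are assumptions about the swizzled format rather than guarantees from this page:

import torch
from vllm.model_executor.layers.quantization.utils.nvfp4_emulation_utils import (
    convert_swizzled_to_linear)

m, k, block_size = 256, 1024, 16
# One fp8 scale per 16-element block; dims chosen so no tile padding is needed (assumed).
swizzled = torch.randn(m * (k // block_size)).to(torch.float8_e4m3fn)
linear = convert_swizzled_to_linear(swizzled, m, k, block_size)
print(linear.shape)  # expected: torch.Size([256, 64])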
   dequantize_to_dtype ¶
  Dequantize the fp4 tensor back to high precision.
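Dequantization reverses the two-level NVFP4 scaling: each unpacked E2M1 value is multiplied by its block's scale and divided by the tensor-wide global scale. A minimal sketch of that arithmetic under the usual NVFP4 convention; the function name, arguments, and block_size=16 default are assumptions, not this function's signature:

import torch

def dequantize_blocks_sketch(values: torch.Tensor, block_scales: torch.Tensor,
                             global_scale: torch.Tensor,
                             block_size: int = 16) -> torch.Tensor:
    # values: (m, k) unpacked E2M1 values (e.g. from break_fp4_bytes).
    # block_scales: (m, k // block_size) per-block scales stored relative to global_scale.
    m, k = values.shape
    blocks = values.reshape(m, k // block_size, block_size).to(torch.float32)
    scales = (block_scales.to(torch.float32) / global_scale).unsqueeze(-1)
    return (blocks * scales).reshape(m, k)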
   get_reciprocal ¶
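A zero-safe reciprocal keeps all-zero blocks (whose amax, and hence scale, is zero) from producing inf/NaN during quantization. A minimal sketch of that behavior with a hypothetical name:

import torch

def get_reciprocal_sketch(x: torch.Tensor) -> torch.Tensor:
    # Return 1/x elementwise, mapping x == 0 to 0 instead of inf.
    return torch.where(x == 0, torch.zeros_like(x), 1.0 / x)

print(get_reciprocal_sketch(torch.tensor([2.0, 0.0, 0.5])))  # tensor([0.5000, 0.0000, 2.0000])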
    ref_nvfp4_quant ¶
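ref_nvfp4_quant is the reference (unfused) quantizer: split the input into blocks, take each block's absolute maximum, derive an fp8 (e4m3) block scale relative to the global scale, then rescale and clamp the block into the E2M1 range before rounding to FP4. A sketch of that recipe under those assumptions; the name, signature, and exact scale convention are illustrative:

import torch

FLOAT4_E2M1_MAX = 6.0    # largest E2M1 magnitude

def ref_nvfp4_quant_sketch(x: torch.Tensor, global_scale: torch.Tensor,
                           block_size: int = 16):
    # x: (m, k) activations; global_scale: scalar float32 tensor.
    m, k = x.shape
    blocks = x.to(torch.float32).reshape(m, k // block_size, block_size)
    amax = blocks.abs().amax(dim=-1, keepdim=True)
    # Per-block scale quantized to fp8 e4m3, stored relative to the global scale (assumed).
    scale = (global_scale * amax / FLOAT4_E2M1_MAX).to(torch.float8_e4m3fn).to(torch.float32)
    inv = torch.where(scale == 0, torch.zeros_like(scale), global_scale / scale)
    q = torch.clamp(blocks * inv, -FLOAT4_E2M1_MAX, FLOAT4_E2M1_MAX).reshape(m, k)
    # q still needs rounding to E2M1 code points (see cast_to_fp4 above).
    return q, scale.squeeze(-1)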
   run_nvfp4_emulations ¶
 run_nvfp4_emulations(
    x: Tensor,
    input_global_scale: Tensor,
    weight: Tensor,
    weight_scale_swizzled: Tensor,
    weight_global_scale: Tensor,
)
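run_nvfp4_emulations emulates an NVFP4 linear layer in plain PyTorch: the activation is fake-quantized with input_global_scale, the packed weight is dequantized via its swizzled block scales and weight_global_scale, and the matmul runs in high precision. A hedged usage sketch with dummy data; the packed-uint8 weight layout, fp8 scale dtype and size, and the output shape are assumptions about the expected formats:

import torch
from vllm.model_executor.layers.quantization.utils.nvfp4_emulation_utils import (
    run_nvfp4_emulations)

m, k, n, block = 8, 256, 512, 16
x = torch.randn(m, k, dtype=torch.bfloat16)
input_global_scale = torch.tensor(1.0, dtype=torch.float32)
weight = torch.randint(0, 256, (n, k // 2), dtype=torch.uint8)   # two fp4 values per byte (assumed)
weight_scale_swizzled = torch.randn(n * (k // block)).to(torch.float8_e4m3fn)  # assumed fp8 swizzled scales
weight_global_scale = torch.tensor(1.0, dtype=torch.float32)

out = run_nvfp4_emulations(x, input_global_scale, weight,
                           weight_scale_swizzled, weight_global_scale)
print(out.shape)  # expected: torch.Size([8, 512])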