n1ck-guo
Contributor

@n1ck-guo n1ck-guo commented Sep 23, 2025

quantizer

Replaces the original compressor's quantize method and owns the quantization process itself.
Subclasses, from coarse to fine granularity:

  • mode (RTN, Tune): each mode has its own quantizer flow and its own quantize function logic.
  • model_type (llm, vlm, diffusion): each model type has its own calibration method, data processing, etc.
  • data_type (gguf, mxfp8, waquantizer): some data types need additional algorithms (imatrix for gguf) or special steps (register_act_max_hook for WA, fused_layer_global_scale for nvfp).
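
To make the layering above concrete, here is a rough sketch of how the three levels could stack. It is not the code in this PR: every class and method name (`BaseQuantizer`, `RTNQuantizer`, `LLMRTNQuantizer`, `GGUFLLMRTNQuantizer`, `quantize_block`, `calibration_data`, `pre_quantize`) is hypothetical, the weights are plain Python lists rather than tensors, and the imatrix step is only a named stub.

```python
from abc import ABC, abstractmethod


class BaseQuantizer(ABC):
    """Hypothetical base class: owns the process the compressor used to run."""

    def __init__(self, bits=4):
        self.bits = bits

    @abstractmethod
    def quantize_block(self, weights):
        """Quantize one block of weights; implemented at the mode level."""

    def calibration_data(self):
        # Overridden at the model_type level (llm, vlm, diffusion).
        return []

    def pre_quantize(self):
        # Overridden at the data_type level for extra steps.
        return []

    def quantize(self, blocks):
        # Run data-type-specific steps first, then quantize every block.
        steps = self.pre_quantize()
        return steps, {name: self.quantize_block(w) for name, w in blocks.items()}


class RTNQuantizer(BaseQuantizer):
    """Mode level: symmetric round-to-nearest, no tuning loop."""

    def quantize_block(self, weights):
        qmax = 2 ** (self.bits - 1) - 1
        scale = max(abs(w) for w in weights) / qmax
        if scale == 0:
            scale = 1.0  # all-zero block: any scale works
        # Round to the integer grid, clamp, then map back to floats.
        return [max(-qmax - 1, min(qmax, round(w / scale))) * scale for w in weights]


class LLMRTNQuantizer(RTNQuantizer):
    """Model-type level: only the calibration source changes."""

    def calibration_data(self):
        return ["text"]  # placeholder for a text calibration set


class GGUFLLMRTNQuantizer(LLMRTNQuantizer):
    """Data-type level: adds the extra step this format needs."""

    def pre_quantize(self):
        return ["imatrix"]  # stub: a real version would compute the imatrix here


quantizer = GGUFLLMRTNQuantizer(bits=4)
steps, quantized = quantizer.quantize({"layer0": [7.0, -3.2, 0.0]})
```

Under this arrangement each level overrides exactly one hook, so a new data type only needs to touch `pre_quantize` and inherits the mode and model-type behavior unchanged.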
