Upload processor

#14
by m-ric HF staff - opened
No description provided.

I've noticed that there are two PRs named "Upload Processor," and they both contain some overlapping changes. Which one should I merge?

Additionally, could someone explain what this PR does? Why is it necessary to update the tokenizer?

Hi @m-ric, I noticed in your PR (https://huggingface.co/rhymes-ai/Aria/discussions/11) that weights were added for the vision tower's post_norm layer. However, this layer isn't present in our current model architecture. Could you help clarify this addition?

Hi @m-ric

Thank you for your contribution. After further testing, we've encountered several issues that led us to revert the changes introduced in the following two PRs:

The main reasons for the reversion are as follows:

  1. Model Parameter Name Changes: The new parameter names broke our existing inference and training pipelines, which rely on the original naming convention.
  2. Training Issues: The added vision_tower.post_layernorm parameter has zero weights and no impact on inference, but it must be frozen during training, and our current setup does not do that.
  3. Downstream Modifications: Downstream components such as vLLM and quantization would need to be re-adapted to these changes, adding extra maintenance overhead.
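
To illustrate point 2, here is a minimal sketch of how a training setup might pick out the parameters it needs to freeze by name. It uses plain Python strings only; `split_frozen` and the example parameter names other than the `vision_tower.post_layernorm` prefix are hypothetical and not taken from the Aria codebase.

```python
from typing import Iterable, List, Sequence, Tuple


def split_frozen(
    param_names: Iterable[str],
    frozen_substrings: Sequence[str] = ("vision_tower.post_layernorm",),
) -> Tuple[List[str], List[str]]:
    """Partition parameter names into (frozen, trainable).

    A name is treated as frozen if it contains any of the given substrings.
    """
    frozen: List[str] = []
    trainable: List[str] = []
    for name in param_names:
        if any(sub in name for sub in frozen_substrings):
            frozen.append(name)
        else:
            trainable.append(name)
    return frozen, trainable


# Hypothetical parameter names, for illustration only.
names = [
    "vision_tower.post_layernorm.weight",
    "vision_tower.post_layernorm.bias",
    "language_model.layers.0.mlp.weight",
]
frozen, trainable = split_frozen(names)
print("frozen:", frozen)
print("trainable:", trainable)
```

In a PyTorch training loop, the same name check could be applied while iterating over `model.named_parameters()`, setting `requires_grad` to `False` on the matching parameters. That is the extra step our current pipeline does not perform.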

Given our team's current capacity and priorities, we aren't able to fully maintain these changes at this time. We apologize for any inconvenience this may cause.

To facilitate your work, we have moved your changes to a new branch, hf, where you can continue development. We also suggest that Hugging Face create a separate repository (e.g., Aria-hf) to manage these model-specific modifications, making it easier for others to integrate them into their own workflows.

Thank you for your understanding!
