dummy column refactoring

#688
by alozowski HF staff - opened
Open LLM Leaderboard org

Changes in app.py

  • Search Model Function: Updated to search within AutoEvalColumn.model.name instead of AutoEvalColumn.dummy.name.
  • Select Columns Function: Removed references to AutoEvalColumn.dummy.name and adjusted to only include necessary columns.
  • Dataframe Component in Gradio Interface: Updated to ensure that the table no longer includes AutoEvalColumn.dummy.name.

Changes in css_html_js.py

  • CSS Rules: Removed CSS rules that hid the last column (which was the dummy column).

Changes in utils.py

  • Column Definitions: The main thing – refactored the way columns are defined and initialized. Removed the initialization that directly added the dummy column.
  • Dataclass Initialization: Updated to create the AutoEvalColumn dataclass without the dummy column.
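To make the utils.py change concrete, here is a minimal sketch of building a column dataclass from a list of specs with `dataclasses.make_dataclass`. The `ColumnContent` fields, the two sample columns, and the helper names `build_columns` and `display_names` are assumptions for illustration and may not match the repository. The point it shows is that the order of the spec list becomes the order of the fields, and therefore of the table columns.

```python
from dataclasses import dataclass, fields, make_dataclass


@dataclass(frozen=True)  # frozen makes instances hashable, so they can serve as field defaults
class ColumnContent:
    name: str                  # header shown in the table
    type: str                  # display type, e.g. "markdown" or "number"
    displayed_by_default: bool
    hidden: bool = False


def build_columns(include_dummy: bool):
    """Build the column dataclass; the order of `specs` is the column order."""
    specs = [
        ("model", ColumnContent, ColumnContent("Model", "markdown", True)),
        ("average", ColumnContent, ColumnContent("Average", "number", True)),
    ]
    if include_dummy:
        # Hidden helper column holding the plain model name (hypothetical definition)
        specs.append(
            ("dummy", ColumnContent,
             ColumnContent("model_name_for_query", "str", False, hidden=True))
        )
    return make_dataclass("AutoEvalColumn", specs, frozen=True)


def display_names(columns) -> list:
    """Return the table headers in field order."""
    return [f.default.name for f in fields(columns)]


AutoEvalColumn = build_columns(include_dummy=False)
print(AutoEvalColumn.model.name)
print(display_names(AutoEvalColumn))
```

Because the headers are derived from field order, any refactoring that changes how the spec list is assembled can silently reorder the table.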

Changes in filter_models.py

  • Flag Models Function: Modified to use AutoEvalColumn.model.name for flagging logic instead of using a separate "model_name_for_query" which was tied to the dummy column.
  • Remove Forbidden Models Function: Updated to check against AutoEvalColumn.model.name instead of a non-existent "model_name_for_query".

Changes in read_evals.py

  • EvalResult to_dict Method: Removed the line that added the dummy column's data to the dictionary. Now it only uses relevant and existing columns.

Changes in collections.py

  • Update Collections Function: Updated the sorting and selection logic to use AutoEvalColumn.model.name instead of AutoEvalColumn.dummy.name.

Functionality

I have tested that the search and buttons are working correctly. I'll need your review @clefourrier :3

Open LLM Leaderboard org

Hello!

Looking good!

Some specific comments to address before merging:

  • src/display/utils.py - Doesn't your new system reorder the columns in the table?
  • src/leaderboard/filter_models.py - You removed
        # Merges and moes are flagged automatically
        if model_data[AutoEvalColumn.flagged.name]:
            flag_key = "merged"

which means that some models (those relying on the tags and dynamic data for flagging) will no longer be flagged, I think.
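A rough sketch of the concern, under stated assumptions: the dictionary keys, the `FLAGGED_MODELS` registry, and the `keep_auto_branch` switch are all hypothetical, and the real `flag_models` function may be structured differently. It only illustrates how dropping the automatic branch could clear a flag that was set earlier from tags or dynamic data.

```python
FLAGGED = "Flagged"            # stand-in for AutoEvalColumn.flagged.name
NAME = "model_name_for_query"  # plain model name used for lookups

# Hypothetical flag registry: key -> link explaining the flag
FLAGGED_MODELS = {
    "merged": "https://example.com/discussions/merged",
    "some-org/contaminated-model": "https://example.com/discussions/1",
}


def flag_models(rows, keep_auto_branch=True):
    """Set the flag on each row; `keep_auto_branch` toggles the removed lines."""
    for model_data in rows:
        # Merges and MoEs are flagged automatically
        if keep_auto_branch and model_data[FLAGGED]:
            flag_key = "merged"
        else:
            flag_key = model_data[NAME]

        if flag_key in FLAGGED_MODELS:
            model_data["flag_reason"] = FLAGGED_MODELS[flag_key]
            model_data[FLAGGED] = True
        else:
            model_data[FLAGGED] = False
    return rows


def sample_rows():
    return [
        {NAME: "some-org/merged-model", FLAGGED: True},         # flagged earlier via tags
        {NAME: "some-org/contaminated-model", FLAGGED: False},  # listed in the registry
        {NAME: "some-org/clean-model", FLAGGED: False},
    ]


with_branch = [r[FLAGGED] for r in flag_models(sample_rows())]
without_branch = [r[FLAGGED] for r in flag_models(sample_rows(), keep_auto_branch=False)]
print(with_branch, without_branch)
```

In this sketch the merged model keeps its flag only when the branch is present, which is the behavior the comment asks to preserve.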

Open LLM Leaderboard org
  • The appends in src/display/utils.py are a long-standing problem for me; I keep trying to make them more readable. You're right, this change reorders the columns in the table, so let's revert it.
  • my bad with src/leaderboard/filter_models.py 😱

Besides, I see that if we drop the dummy column, we no longer have a plain model name: the AutoEvalColumn.model.name column holds the model name wrapped in HTML links. For example:
model check: <a target="_blank" href="https://huggingface.co/ceadar-ie/FinanceConnect-13B" style="color: var(--link-text-color); text-decoration: underline;text-decoration-style: dotted;">ceadar-ie/FinanceConnect-13B</a> <a target="_blank" href="https://huggingface.co/datasets/open-llm-leaderboard/details_ceadar-ie__FinanceConnect-13B" style="color: var(--link-text-color); text-decoration: underline;text-decoration-style: dotted;">📑</a>

And when I removed the dummy column and tried to filter by AutoEvalColumn.model.name, the flags filter broke. It's manageable; let me think about it.
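To illustrate why the filter breaks, here is a small self-contained sketch. The row layout, the `make_model_cell` helper, the model name, and the forbidden set are made up for the example; only the idea comes from this thread: the displayed model cell is an HTML link, so an exact-name check against it never matches, while a check against a plain-name column does.

```python
def make_model_cell(model_name: str) -> str:
    """Hypothetical helper: render the model name as an HTML link for display."""
    return f'<a target="_blank" href="https://huggingface.co/{model_name}">{model_name}</a>'


MODEL = "Model"                     # stand-in for AutoEvalColumn.model.name
PLAIN_NAME = "model_name_for_query" # plain-name (dummy) column

rows = [
    {MODEL: make_model_cell("some-org/some-model"), PLAIN_NAME: "some-org/some-model"},
    {MODEL: make_model_cell("other-org/other-model"), PLAIN_NAME: "other-org/other-model"},
]

# Hypothetical set of names that a filter should match exactly
forbidden = {"some-org/some-model"}

# Matching on the display column compares the whole HTML string with a bare name
matched_by_display = [r for r in rows if r[MODEL] in forbidden]

# Matching on the plain-name column compares like with like
matched_by_plain = [r for r in rows if r[PLAIN_NAME] in forbidden]

print(len(matched_by_display), len(matched_by_plain))
```

A substring search on the display cell would find the name, but it would also match inside the URL and style attributes, so exact filters are safer against a plain-name column.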

Open LLM Leaderboard org

So, the dummy column turned out to be useful, but let's rename it to "fullname", which is a bit clearer. Three thoughts:

  1. I should've called this PR "There and Back Again", but at least we found out that the model name column is not a plain model name.
  2. We need to refactor read_evals.py: there are a lot of data transformations right now, and we need to simplify them.
  3. We need to do something about the model name fields; they're currently a mess.
alozowski changed pull request status to open
alozowski changed pull request title from dummy column removal to dummy column refactoring
alozowski changed pull request status to merged
Open LLM Leaderboard org

PR is now merged/closed. The ephemeral Space has been deleted.
(This is an automated message.)
