
finetune.sh will not run as distributed in the repo because of paths in train_example.jsonl and val_example.jsonl #231

Open
@jdalegonzalez

Description



🐛 Bug

With a fresh install, running ./finetune.sh fails even when it can find train_ds.py. The train_example.jsonl and val_example.jsonl files that ship in the data directory and are referenced by the script point to files on a local file system that is not part of the repo. Worse, the error that is raised isn't "file not found". Instead, because of a bug in FunASR, what is reported is "str has no attribute size".
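Until the shipped manifests are fixed, a quick pre-flight check makes the real cause visible. The following is only a minimal sketch: the helper name `find_missing_sources` is hypothetical, and it assumes each line of the .jsonl file is a JSON object whose audio path is stored under a `source` key, which should be verified against the actual files.

```python
import json
import os


def find_missing_sources(jsonl_path, field="source"):
    """Return (line_number, path) pairs whose `field` is not an existing local file.

    Assumption: every non-blank line is a JSON object and the audio path
    lives under `field` ("source" here is a guess, not confirmed by the repo).
    """
    missing = []
    with open(jsonl_path, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            path = record.get(field)
            # A missing field and a non-existent file are both reported.
            if not isinstance(path, str) or not os.path.isfile(path):
                missing.append((line_number, path))
    return missing
```

Running this over train_example.jsonl and val_example.jsonl before launching ./finetune.sh should list every entry that points outside the repo, which is the "file not found" message the training run itself never gives.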

To Reproduce

Steps to reproduce the behavior (always include the command you ran):

  1. Run cmd `./finetune.sh`
  2. See the error in output.log: "str has no attribute size".

Expected behavior

Ideally, all of the variables that a user must change for finetune.sh to run would either be listed together at the top:

  • CUDA_VISIBLE_DEVICES
  • train_data
  • val_data
  • train_tool

with a comment saying "YOU HAVE TO CHANGE THESE. NOTHING IS GOING TO WORK", or be accepted as command-line arguments and not set in the file at all. Better still would be changing train_example.jsonl and val_example.jsonl to point to https://somepath/to/values/on/the/web so that the finetune.sh script would work out of the box. (Something would still need to be done about CUDA_VISIBLE_DEVICES, though.)
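To illustrate the command-line option, here is a rough sketch of how a launcher could require these settings instead of hard-coding them. It is written in Python rather than as a patch to finetune.sh, and everything in it is hypothetical: the function `parse_settings`, the flag names, and the fallback of `0` for the GPU list are assumptions for illustration, not part of the repo.

```python
import argparse
import os


def parse_settings(argv=None):
    """Parse the machine-specific settings; no local paths are hard-coded."""
    parser = argparse.ArgumentParser(description="Hypothetical finetune launcher settings")
    parser.add_argument("--train-data", required=True, help="path to the training .jsonl")
    parser.add_argument("--val-data", required=True, help="path to the validation .jsonl")
    parser.add_argument("--train-tool", required=True, help="path to train_ds.py")
    parser.add_argument(
        "--cuda-visible-devices",
        # Assumption: reuse the caller's environment, else fall back to GPU 0.
        default=os.environ.get("CUDA_VISIBLE_DEVICES", "0"),
        help="comma-separated GPU ids",
    )
    args = parser.parse_args(argv)
    # Fail fast with a plain "file not found" message before training starts.
    for name in ("train_data", "val_data", "train_tool"):
        path = getattr(args, name)
        if not os.path.isfile(path):
            parser.error(f"{name}: file not found: {path}")
    return args
```

Making the three paths required means a fresh checkout cannot silently run against someone else's file system, and checking them up front surfaces a missing file immediately.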

Environment

N/A

Metadata

Labels: bug (Something isn't working)