Force index rebuild when a new repo is pulled #216
base: dev
MLCommons CLA bot: All contributors have signed the MLCommons CLA.
mlc/repo_action.py
Outdated
repos.append(Repo(path=p, meta=meta))

# rebuild index via constructor
Index(self.repos_path, repos)
We are building the index for all the repos here, right? So if a repo has a dependency, we'll rebuild the index twice, right? Is it possible to rebuild the index per repo? And we also need to fix the index on unregister_repo, right?
> We are building the index for all the repos here, right? So if a repo has a dependency, we'll rebuild the index twice, right? Is it possible to rebuild the index per repo?
I think that would require refactoring index.py and every place where the Index class is accessed, because currently the index gets built whenever the Index class is instantiated: build_index() is called in the init function of the Index class.
> And we also need to fix the index on unregister_repo, right?
If we intend to continue script execution after unregister_repo, then yes, we would need that. But I guess that ideally won't be the case; otherwise, the next time an mlc command that accesses the Index class is run, the index will be forcibly rebuilt, since it will pick up that repo.json has been modified.
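The "rebuild on the next command" behaviour described above can be illustrated with a modification-time check. This is only a minimal sketch under stated assumptions: `index_is_stale`, the `index.json` file name, and the comparison by file mtime are hypothetical and are not taken from mlcflow's index.py.

```python
import os
import tempfile


def index_is_stale(repo_list_path, index_path):
    """Return True if the index is missing or older than the repo list file."""
    if not os.path.exists(index_path):
        return True
    return os.path.getmtime(repo_list_path) > os.path.getmtime(index_path)


def demo():
    """Walk through the three cases using throwaway files."""
    with tempfile.TemporaryDirectory() as tmp:
        repo_list = os.path.join(tmp, "repo.json")   # hypothetical location
        index = os.path.join(tmp, "index.json")      # hypothetical file name
        with open(repo_list, "w") as f:
            f.write("[]")
        missing = index_is_stale(repo_list, index)   # no index written yet

        with open(index, "w") as f:
            f.write("{}")
        os.utime(repo_list, (100, 100))              # repo list older than index
        os.utime(index, (200, 200))
        fresh = index_is_stale(repo_list, index)

        os.utime(repo_list, (300, 300))              # repo list modified afterwards
        modified = index_is_stale(repo_list, index)
        return missing, fresh, modified
```

A check like this only helps the next process that loads the index; a script that keeps running after unregister_repo would still hold the old entries in memory.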
@sujik18 we can have a custom update function in the Index class to selectively update the entries for a given repo, right?
The requirement for handling unregister_repo arises when we pull an MLC repo that has a fork already registered in mlcflow. mlcflow then automatically unregisters the fork and pulls the new one. But here, the Index entries may no longer be consistent.
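A selective update of this kind could look roughly like the following. It is a sketch, not the project's Index class: `ToyIndex`, `update_repo`, and the repo-path keyed layout are all hypothetical names chosen for illustration.

```python
class ToyIndex:
    """Hypothetical stand-in for an index that groups entries by repo path."""

    def __init__(self):
        # Maps repo path -> list of entry dicts discovered in that repo.
        self.entries_by_repo = {}

    def update_repo(self, repo_path, entries):
        """Replace only the entries that belong to one repo."""
        self.entries_by_repo[repo_path] = list(entries)

    def all_entries(self):
        """Flatten the per-repo lists into a single list of entries."""
        return [e for entries in self.entries_by_repo.values() for e in entries]


index = ToyIndex()
index.update_repo("/repos/a", [{"alias": "x"}])
index.update_repo("/repos/b", [{"alias": "y"}])
# Re-scanning repo "a" touches only its own entries; repo "b" is left alone.
index.update_repo("/repos/a", [{"alias": "z"}])
```

Grouping entries by repo is what makes the update cheap here; an index stored as one flat list would need to filter by repo first.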
> The requirement for handling unregister_repo arises when we pull an MLC repo that has a fork already registered in mlcflow. mlcflow then automatically unregisters the fork and pulls the new one. But here, the Index entries may no longer be consistent.
@arjunsuresh I have added a remove_repo_from_index function that removes a repo's index entries when the repo is deleted. It is currently called only in the mlc rm command; if it looks good, the same function can be reused to update index entries when unregistering a repo.
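Reusing the removal step on unregister might be sketched as below. The signature and index shape shown here are assumptions made for illustration and are not the function added in this PR; `unregister_repo` is likewise a simplified, hypothetical stand-in.

```python
def remove_repo_from_index(index, repo_path):
    """Return a copy of the index without entries from repo_path.

    Assumed shape: {item_type: [{"alias": ..., "repo_path": ...}, ...]}.
    """
    return {
        item_type: [e for e in entries if e["repo_path"] != repo_path]
        for item_type, entries in index.items()
    }


def unregister_repo(registered, index, repo_path):
    """Hypothetical unregister step that also keeps the index consistent."""
    remaining = [p for p in registered if p != repo_path]
    return remaining, remove_repo_from_index(index, repo_path)


index = {
    "script": [
        {"alias": "detect-os", "repo_path": "/repos/fork"},
        {"alias": "get-python", "repo_path": "/repos/other"},
    ],
    "cache": [{"alias": "tmp", "repo_path": "/repos/fork"}],
}
registered, index = unregister_repo(["/repos/fork", "/repos/other"], index, "/repos/fork")
```

Returning a new dictionary rather than mutating in place keeps the same helper safe to call from both the rm path and the unregister path.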
PR Checklist
Note: PRs must be raised against dev. Do not commit directly to main.
Testing & CI
Documentation
File Hygiene & Output Handling
Safety & Security
Contribution Hygiene
Fixes # or Closes #.
Fixes #209
Logs: