Auto choose most appropriate explainable model #355
gaugup wants to merge 5 commits into interpretml:main
gaugup commented on Dec 17, 2020
- This PR chooses the best surrogate model by training multiple surrogate models and comparing them on accuracy or r2_score.
- If training the multiple surrogate models fails for some reason, we train the explainable model passed in by the user.
- We compute a replication metric (accuracy for classification, r2_score for regression) that indicates which of the surrogate models fits best.
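As a rough illustration of the selection described above (not the code in this PR), the sketch below fits each candidate surrogate to the teacher model's predictions, scores it with the replication metric, and falls back to the user-supplied model if no candidate trains. The function name `select_best_surrogate`, its parameters, and the choice to score on the training data are all assumptions made for this example.

```python
from sklearn.base import clone
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import accuracy_score, r2_score
from sklearn.tree import DecisionTreeRegressor


def select_best_surrogate(candidates, fallback, X, teacher_predictions,
                          is_classification):
    """Hypothetical helper: return (fitted_model, replication_score)."""
    # Replication metric: accuracy for classification, r2_score for regression.
    metric = accuracy_score if is_classification else r2_score
    best_model, best_score = None, float("-inf")
    for candidate in candidates:
        try:
            # Train the surrogate to mimic the teacher, not the true labels.
            model = clone(candidate).fit(X, teacher_predictions)
            # Assumption: score on the same data used for fitting.
            score = metric(teacher_predictions, model.predict(X))
        except Exception:
            # A candidate that fails to train is skipped.
            continue
        if score > best_score:
            best_model, best_score = model, score
    if best_model is None:
        # No candidate trained: use the explainable model passed by the user.
        best_model = clone(fallback).fit(X, teacher_predictions)
        best_score = metric(teacher_predictions, best_model.predict(X))
    return best_model, best_score


# Usage on synthetic data with a random forest as the teacher model.
X, y = make_regression(n_samples=200, n_features=5, random_state=0)
teacher = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, y)
teacher_predictions = teacher.predict(X)

best, score = select_best_surrogate(
    [LinearRegression(), DecisionTreeRegressor(max_depth=3, random_state=0)],
    LinearRegression(),
    X,
    teacher_predictions,
    is_classification=False,
)

# With no candidates at all, the fallback model is trained instead.
fallback_model, fallback_score = select_best_surrogate(
    [], LinearRegression(), X, teacher_predictions, is_classification=False
)
```

The returned score is the replication metric of the chosen surrogate against the teacher's predictions, so values closer to 1 mean the surrogate reproduces the teacher more faithfully.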
Signed-off-by: Gaurav Gupta <gaugup@microsoft.com>
…eModel Signed-off-by: Gaurav Gupta <gaugup@microsoft.com>
imatiach-msft left a comment
I think the code itself looks good, but I'm concerned about structure and complexity. Maybe we can discuss these changes more before moving forward with this PR.
:param reset_index: Uses the pandas DataFrame index column as part of the features when training
    the surrogate model.
:type reset_index: str
:param auto_select_explainable_model: Set this to 'True' if you want to use the MimicExplainer with an
I wonder if this should be a separate explainer or function. Mimic explainer takes a specific surrogate model, not a list, and this also seems to complicate the mimic explainer logic. Maybe we can discuss more.
Thinking of other libraries, there is usually a distinction between hyperparameter tuning and training. For example, both v1 studio and designer have a Train Model module and a separate Tune Hyperparameters or Cross Validate module; in Spark ML the hyperparameter tuner is a separate estimator; and in scikit-learn, grid search CV (GridSearchCV) is similarly separate. I feel that for users who want to do this, we should have a separate function/class instead of complicating the current mimic explainer.
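To make the scikit-learn comparison concrete, here is a minimal sketch of that separation: `GridSearchCV` wraps an ordinary estimator and does the selection, and the resulting `best_estimator_` is a plain fitted model that could then be handed to whatever consumes a single surrogate. The synthetic data, the teacher model, and the parameter grid are assumptions chosen only for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

# Synthetic data and a teacher model whose predictions the surrogate mimics.
X, y = make_regression(n_samples=200, n_features=5, random_state=0)
teacher = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, y)
teacher_predictions = teacher.predict(X)

# Selection lives in its own object, separate from the estimator being tuned.
search = GridSearchCV(
    estimator=DecisionTreeRegressor(random_state=0),
    param_grid={"max_depth": [2, 4, 8]},
    scoring="r2",
    cv=3,
)
search.fit(X, teacher_predictions)

# The winner is an ordinary fitted estimator (refit=True is the default).
surrogate = search.best_estimator_
```

The training step itself never needs to know that a search happened, which is the kind of separation suggested for the mimic explainer.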
if isinstance(training_data, DenseData):
    training_data = training_data.data

self._original_eval_examples = None
This is quite a bit of logic to put inside mimic explainer. I'm really wondering how we could simplify this, as mimic explainer is already quite complicated.