It would be nice to include sections on stability selection + knockoffs + cousins / work in that universe. I don't know much about this but it's a big topic of late, especially in the high dimensional ML community. Nice overview (probably a bit out of date by now?) at https://www.stat.cmu.edu/~ryantibs/journalclub/stability.pdf.
The Simple Filters section encourages feature selection based on p-values from some sort of GLM / GAM / etc. While this is a standard approach, significant features are not necessarily predictive (see http://biorxiv.org/lookup/doi/10.1101/327437 for example). Scoring predictors based on an actual predictive measure seems like a better recommendation (LOOCV / PRESS for people who don't have time for randomization tests, or resampled error estimates / permuted LOOCV for people who do).
Either way, it would be interesting to see simulations comparing various feature "scores" for their efficacy in selecting features via simple filters.
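To make the PRESS idea concrete, here is a minimal sketch of a simple filter that scores each predictor by leave-one-out prediction error instead of a p-value. It assumes a numeric outcome and a one-predictor least-squares fit with an intercept per feature; the names `press_score` and `rank_by_press` and the toy data are made up for illustration and do not come from the book or any package. It uses the standard identity that the leave-one-out residual equals the ordinary residual divided by one minus the leverage, so no refitting is needed.

```python
import random


def press_score(x, y):
    """PRESS (sum of squared leave-one-out residuals) for y ~ a + b*x.

    Assumes at least 3 observations, a non-constant predictor, and no
    single point with leverage equal to 1.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("constant predictor: slope is not identifiable")
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar

    press = 0.0
    for xi, yi in zip(x, y):
        # Leverage (hat-matrix diagonal) for simple linear regression.
        leverage = 1.0 / n + (xi - x_bar) ** 2 / sxx
        residual = yi - (intercept + slope * xi)
        # Leave-one-out residual without refitting the model.
        press += (residual / (1.0 - leverage)) ** 2
    return press


def rank_by_press(features, y):
    """Return (name, PRESS) pairs sorted from most to least predictive."""
    scores = [(name, press_score(x, y)) for name, x in features.items()]
    return sorted(scores, key=lambda pair: pair[1])


# Toy data (hypothetical): one informative predictor, one pure-noise predictor.
rng = random.Random(0)
n = 50
signal = [rng.gauss(0, 1) for _ in range(n)]
noise = [rng.gauss(0, 1) for _ in range(n)]
y = [2.0 * s + rng.gauss(0, 0.5) for s in signal]

ranking = rank_by_press({"signal": signal, "noise": noise}, y)
for name, score in ranking:
    print(f"{name}: PRESS = {score:.2f}")
```

Lower PRESS means better out-of-sample prediction, so a filter would keep the features at the top of the ranking. A simulation along the lines suggested above could swap other scores (p-values, resampled error) into the same ranking step and compare which features each one keeps.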
alexpghayes changed the title from "Thoughts on feature selection" to "Feature selection: potentially add stability selection + knockoffs, using predictive measure for variable scoring instead of p-values" on Feb 26, 2019
alexpghayes changed the title from "Feature selection: potentially add stability selection + knockoffs, using predictive measure for variable scoring instead of p-values" to "Feature selection: potentially add stability selection + knockoffs, use predictive measure for variable scoring instead of p-values" on Feb 26, 2019