Feature selection: potentially add stability selection + knockoffs, use predictive measure for variable scoring instead of p-values #53

Open
alexpghayes opened this issue Feb 26, 2019 · 0 comments


  1. It would be nice to include sections on stability selection + knockoffs + cousins / work in that universe. I don't know much about this but it's a big topic of late, especially in the high dimensional ML community. Nice overview (probably a bit out of date by now?) at https://www.stat.cmu.edu/~ryantibs/journalclub/stability.pdf.

  2. The Simple Filters section encourages feature selection based on p-values from some sort of GLM / GAM / etc. While this is a standard approach, significant features are not necessarily predictive (see http://biorxiv.org/lookup/doi/10.1101/327437 for example). Scoring predictors based on an actual predictive measure seems like a better recommendation (LOOCV / PRESS for people who don't have time for randomization tests, or resampled error estimates / permuted LOOCV for people who do).

Either way, it would be interesting to see simulations comparing various feature "scores" for their efficacy in selecting features via simple filters.
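
To make the second point concrete, here is a rough sketch of what such a comparison could look like for univariate linear filters. It scores each predictor two ways: by the absolute t-statistic of its slope (the quantity behind the usual p-value filter) and by PRESS, the leave-one-out prediction error, computed with the standard leverage shortcut. Everything here is an assumption for illustration only: the function names (`press_score`, `abs_t_stat`, `simulate`), the simulated data-generating process, and the choice of a simple `y ~ 1 + x` model. It is not taken from the book or from any package.

```python
import numpy as np


def press_score(x, y):
    """PRESS (sum of squared leave-one-out residuals) for y ~ 1 + x.

    Uses the identity e_loo_i = e_i / (1 - h_ii), so no refitting is needed.
    Lower is better.
    """
    X = np.column_stack([np.ones_like(x), x])
    # Leverages h_ii are the squared row norms of Q from the thin QR of X.
    Q, _ = np.linalg.qr(X)
    leverage = np.sum(Q**2, axis=1)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(np.sum((resid / (1.0 - leverage)) ** 2))


def abs_t_stat(x, y):
    """Absolute t-statistic of the slope in y ~ 1 + x. Higher is better."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, p = X.shape
    sigma2 = resid @ resid / (n - p)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return float(abs(beta[1]) / np.sqrt(cov[1, 1]))


def simulate(n=200, n_features=10, n_informative=3, seed=0):
    """Hypothetical setup: the first n_informative columns drive y linearly."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, n_features))
    coef = np.zeros(n_features)
    coef[:n_informative] = 1.5
    y = X @ coef + rng.normal(size=n)
    return X, y


X, y = simulate()
press = np.array([press_score(X[:, j], y) for j in range(X.shape[1])])
tstat = np.array([abs_t_stat(X[:, j], y) for j in range(X.shape[1])])

# Rank features under each score (best first) and compare the orderings.
print("by PRESS:", np.argsort(press))
print("by |t|:  ", np.argsort(-tstat))
```

In this simple linear setting the two rankings will often agree, since both are driven by the residual sum of squares. A more informative simulation would presumably swap in nonlinear signals, influential points, or a GAM-style fit, where significance and predictive value are more likely to diverge.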

alexpghayes changed the title from "Thoughts on feature selection" to "Feature selection: potentially add stability selection + knockoffs, using predictive measure for variable scoring instead of p-values" on Feb 26, 2019
alexpghayes changed the title from "Feature selection: potentially add stability selection + knockoffs, using predictive measure for variable scoring instead of p-values" to "Feature selection: potentially add stability selection + knockoffs, use predictive measure for variable scoring instead of p-values" on Feb 26, 2019