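Because of work, I will probably focus more on posting practical material from now on, and I especially hope to post things on large-scale data management and learning. Variable selection matters a great deal in practice, so I am reposting the reading list compiled by wloo, a senior labmate. I hope someone can summarize which of these approaches look promising at large scale as well.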

Reposted from http://proof.ycool.com/post.2663900.html

1. Classical methods
Akaike (1973). Proc. 2nd International Symposium on Information Theory, pp. 267-281. "AIC"
Schwarz (1978). Ann. Statist. 6(2): 461-464. "BIC"

2. The Lasso line
Tibshirani (1996). J. Roy. Statist. Soc. Ser. B 58: 267-288. "Lasso"
Knight and Fu (2000). Ann. Statist. 28: 1356-1378. "Asymptotics for Lasso"
Efron, Hastie, Johnstone and Tibshirani (2004). Ann. Statist. 32: 407-499. "LARS"
Zou (2006). J. Amer. Statist. Assoc. 101: 1418-1429. "Adaptive Lasso"

3. The SCAD line
Fan and Li (2001). J. Amer. Statist. Assoc. 96: 1348-1360. "SCAD and the oracle property"
Fan and Peng (2004). Ann. Statist. 32: 928-961. "SCAD with diverging p"
Zou and Li (2008). Ann. Statist. 36: 1509-1533. "LLA"

4. The Elastic Net line
Zou and Hastie (2005). J. Roy. Statist. Soc. Ser. B 67: 301-320. "EN"
Zou and Zhang (2009). Ann. Statist. 37: 1733-1751. "Adaptive EN"

5. The Dantzig Selector line
Candes and Tao (2007). Ann. Statist. 35: 2313-2351. "DS and nonasymptotics"
Bickel, Ritov and Tsybakov (2009). Ann. Statist. 37: 1705-1732. "Nonasymptotics for Lasso and DS"

6. Screening and preconditioning
Fan and Lv (2008). J. Roy. Statist. Soc. Ser. B 70: 849-911. "SIS"
Paul, Bair, Hastie and Tibshirani (2008). Ann. Statist. 36: 1595-1618. "Preconditioning"
Wasserman and Roeder (2009). Ann. Statist. 37: 2178-2201. "Screening-cleaning"

7. Surveys
Hastie, Tibshirani and Friedman (2009). The Elements of Statistical Learning, 2nd ed., especially Chapters 3, 7 and 18.
Hesterberg, Choi, Meier and Fraley (2008). Statist. Surveys 2: 61-93. "Review for Lasso and LARS"
Fan and Lv (2010). Statist. Sinica, to appear. "Review emphasizing SCAD and SIS"

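A quick plug: this blog will probably lean more toward technical updates from now on. Posts about my life and gossip will mostly move to my Space, although leaving comments there requires a Space account = =
http://wshxzt.spaces.live.com/
Thanks, everyone, and I hope you will keep supporting me~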