The Road Less Traveled By: Does a Hair's-Breadth Miss Really Put You a Thousand Miles Off?

Sunday, April 10, 2011

Does a Hair's-Breadth Miss Really Put You a Thousand Miles Off?

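Xavier Amatriain, a researcher at the research arm of the Spanish telecom company Telefonica, published a blog post a few days ago titled "Recommender Systems: We're doing it (all) wrong". In it he argues that researchers and developers working on recommender systems must pay close attention to the nature of the data they use.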

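Many people build their ratings on a Likert scale. A rating form such as "like very much, like, neutral, dislike, dislike very much" is extremely common. Xavier reminds us, however, that Likert-scale data is ordinal data: it expresses order only, and adjacent ratings are not necessarily equidistant. If such data is used to compute distances (and distance is the basis of similarity), the result may be distorted. Following the same logic, RMSE, the metric commonly used to measure a recommender system's accuracy, may lose its meaning as well.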
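To make the concern concrete, here is a minimal sketch in Python. Everything in it is invented for illustration and none of it comes from Xavier's post: the ratings, the two hypothetical models, and in particular the `stretched` spacing, which simply assumes that the gap between "like" and "like very much" is wider than the other gaps. Both spacings preserve the order of the labels, yet they disagree on which model has the lower RMSE.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length numeric sequences."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def recode(labels, spacing):
    """Map ordinal labels (1..5) to numeric positions under a given spacing."""
    return [spacing[label] for label in labels]

# Two order-preserving ways of placing the five labels on a number line.
# 'stretched' is an assumption made up for this example.
equidistant = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
stretched = {1: 1, 2: 2, 3: 3, 4: 4, 5: 7}

# Hypothetical true ratings and predictions from two hypothetical models.
actual = [5, 5, 2, 2]
model_a = [4, 5, 2, 2]  # one miss, near the top of the scale
model_b = [5, 5, 3, 3]  # two misses, in the middle of the scale

rmse_a_equal = rmse(recode(model_a, equidistant), recode(actual, equidistant))
rmse_b_equal = rmse(recode(model_b, equidistant), recode(actual, equidistant))
rmse_a_stretched = rmse(recode(model_a, stretched), recode(actual, stretched))
rmse_b_stretched = rmse(recode(model_b, stretched), recode(actual, stretched))

print(f"equidistant: A={rmse_a_equal:.3f}  B={rmse_b_equal:.3f}")          # A=0.500  B=0.707
print(f"stretched:   A={rmse_a_stretched:.3f}  B={rmse_b_stretched:.3f}")  # A=1.500  B=0.707
```

Under the equidistant coding model A looks better; under the stretched coding model B does. Since ordinal data alone cannot tell us which spacing is right, the ranking of the two models by RMSE is not determined by the data.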

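From a mathematical standpoint, misapplying a definition is of course a serious lapse in fundamentals. From a practical standpoint, though, it is hard to say how much treating Likert-style ratings as interval data actually affects a recommender system's results. Still, it looks as if quite a few researchers and developers have overlooked this point and mistaken one kind of data for the other!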

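Xavier Amatriain's post was inspired by Judy Robertson's article "We're Doing It Wrong" on Blog@ACM. Judy mentions a paper presented at the 2010 ACM Conference on Human Factors in Computing Systems (CHI), "Powerful and consistent analysis of Likert-type rating scales", whose authors combed through the data and statistical tools used in the papers from the previous year's conference and found something startling. In her words: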

Kaptein, Nass, & Markopoulos (2010) published a paper in CHI last year found that in the previous year's CHI proceedings, 45% of the papers reported on likert type data but only 8% used non-parametric stats to do the analysis. 95% reported on small sample sizes (under 50 people). This is statistically problematic even if it gets past reviewers!

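Close to half of the papers (45%) reported on Likert-type data. In the second half of her article Judy offers her observations on why this happens, along with some recommendations; I am a complete layman in statistics and can only nod along. What caught my eye most, though, was the line "95% reported on small sample sizes". People in industry rarely believe that academia can really produce anything "useful", and there is some truth to that; academia has no one to blame but itself.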
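As a rough illustration of why the quoted passage cares about non-parametric statistics, the sketch below compares a difference of means with the Mann-Whitney U statistic, a rank-based measure. The two groups of ratings and the `stretched` spacing are hypothetical, and the code computes only the U statistic itself, not a p-value. The point is simply that a rank-based statistic depends on order alone, so it does not change when the labels are re-spaced, whereas a difference of means can even change sign.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)

def mann_whitney_u(xs, ys):
    """U statistic for xs: count of (x, y) pairs with x > y, ties counted as 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in xs for y in ys)

def recode(labels, spacing):
    """Map ordinal labels (1..5) to numeric positions under a given spacing."""
    return [spacing[label] for label in labels]

# Two order-preserving codings; 'stretched' is an assumption for this example.
equidistant = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
stretched = {1: 1, 2: 2, 3: 3, 4: 4, 5: 10}

# Hypothetical Likert responses from two small groups.
group_a = [4, 4, 4, 4]
group_b = [5, 3, 3, 3]

diff_equal = mean(recode(group_a, equidistant)) - mean(recode(group_b, equidistant))
diff_stretched = mean(recode(group_a, stretched)) - mean(recode(group_b, stretched))
u_equal = mann_whitney_u(recode(group_a, equidistant), recode(group_b, equidistant))
u_stretched = mann_whitney_u(recode(group_a, stretched), recode(group_b, stretched))

print("mean difference:", diff_equal, "vs", diff_stretched)  # 0.5 vs -0.75
print("U statistic:    ", u_equal, "vs", u_stretched)        # 12.0 vs 12.0
```

With the equidistant coding group A has the higher mean; with the stretched coding group B does. The U statistic is 12 out of a possible 16 in both cases.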

[References]
Kaptein, M., Nass, C., Markopoulos, P. (2010) Powerful and consistent analysis of Likert-type rating scales. In Proceedings CHI 2010, ACM, New York, NY, 2391-2394. DOI= http://doi.acm.org/10.1145/1753326.1753686
