Assessing the Quality of Web Content

Authors Elisabeth Lex, Inayatullah Khan, Horst Bischof, Michael Granitzer
Appeared in ECML/PKDD Discovery Challenge
Date 2010
Abstract This paper describes our approach to the ECML/PKDD Discovery Challenge 2010. The challenge consists of three tasks: (1) a Web genre and facet classification task for English hosts, (2) an English quality task, and (3) a multilingual quality task (German and French). We build an ensemble of three classifiers to make predictions for unseen Web hosts, with each classifier trained on a different feature set. Our final NDCG on the whole test set is 0.575 for Task 1, 0.852 for Task 2, and 0.81 (French) and 0.77 (German) for Task 3, which placed us second in the ECML/PKDD Discovery Challenge 2010.
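
The abstract's two technical ingredients, an ensemble of three classifiers and an NDCG score, can be illustrated with a minimal sketch. The abstract does not say how the three classifiers are combined or which NDCG variant the challenge used, so everything here is an assumption: the scores are averaged, the gain is linear with a log2 discount, and the names `ensemble_scores`, `scores_a`, `scores_b`, `scores_c`, and `labels` are hypothetical, as are all the numbers.

```python
import math


def ensemble_scores(per_classifier_scores):
    """Average the per-host scores of several classifiers.

    Assumption: simple averaging. The abstract does not state the
    combination rule the authors actually used.
    """
    n = len(per_classifier_scores)
    return [sum(host_scores) / n for host_scores in zip(*per_classifier_scores)]


def dcg(relevances):
    """Discounted cumulative gain with linear gain and log2 discount."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(scores, relevances):
    """NDCG of the ranking induced by sorting hosts by descending score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [relevances[i] for i in order]
    ideal_dcg = dcg(sorted(relevances, reverse=True))
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical scores for four hosts from three classifiers, each imagined
# as trained on a different feature set (illustrative values only).
scores_a = [0.9, 0.2, 0.6, 0.4]
scores_b = [0.8, 0.3, 0.5, 0.1]
scores_c = [0.7, 0.1, 0.9, 0.3]
labels = [1, 0, 1, 0]  # hypothetical ground-truth relevance per host

combined = ensemble_scores([scores_a, scores_b, scores_c])
print(ndcg(combined, labels))
```

Averaging is only one possible combination rule; weighted voting or stacking would slot into `ensemble_scores` without changing the evaluation code.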