Between-Subjects Elicitation Studies
- 7 May 2016
- conference paper
- Published by Association for Computing Machinery (ACM)
- p. 3390-3402
- https://doi.org/10.1145/2858036.2858228
Abstract
Elicitation studies, in which users supply proposals meant to effect system commands, have become a popular method for system designers. To date, however, the method has assumed a within-subjects procedure and statistics. Despite the benefits of examining the relative agreement of independent groups (e.g., men versus women, children versus adults, novices versus experts), the lack of appropriate tools for between-subjects agreement rate analysis has so far prevented such comparative investigations. In this work, we extend the elicitation method to between-subjects designs. We introduce a new measure for evaluating coagreement between groups and a new statistical test for agreement rate analysis that reports the exact p-value for the significance of the difference between agreement rates calculated for independent groups. We demonstrate the usefulness of these tools by re-examining previously published gesture elicitation data, for which we discuss significant differences in agreement between technical and non-technical participants, between men and women, and between different acquisition technologies. Our new tools will enable practitioners to properly analyze user-elicited data resulting from complex experimental designs with multiple independent groups and, consequently, will help them understand agreement data and verify hypotheses about agreement at more sophisticated levels of analysis.
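For context, the within-subjects agreement rate that this work generalizes to independent groups is, per referent, the fraction of participant pairs that made the same proposal: AR = Σᵢ |Pᵢ|(|Pᵢ|−1) / (|P|(|P|−1)), where P is the multiset of proposals for a referent and the Pᵢ are its subsets of identical proposals. A minimal sketch of that computation (the function name and input format are illustrative, not taken from the paper):

```python
from collections import Counter

def agreement_rate(proposals):
    """Agreement rate AR for one referent: the fraction of
    participant pairs that supplied the same proposal,
    AR = sum_i |P_i|(|P_i|-1) / (|P|(|P|-1))."""
    n = len(proposals)
    if n < 2:
        return 0.0
    counts = Counter(proposals)  # sizes |P_i| of identical-proposal groups
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

# Example: 3 of 4 participants agree -> 6 agreeing pairs out of 12
print(agreement_rate(["pinch", "pinch", "swipe", "pinch"]))  # 0.5
```

Computing this rate separately for each independent group (e.g., men versus women) yields the per-group agreement rates whose difference the paper's new exact test evaluates.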
Funding Information
- UEFISCDI (PN-II-RU-TE-2014-4-1187)