This is quite a good paper (15 page PDF) that looks in detail at how to address bias in data used in learning analytics. Bias, we should be clear, refers here only to a property of the data set (the term 'bias' in AI also refers to a trainable parameter in neural networks, the offset that sets how readily a neuron activates). The authors study two types of data bias, distribution bias and the less well-known hardness bias, in data from students of different sexes and first-language backgrounds. 'Hardness bias' refers to how easy or hard it is for the algorithm to label data (the paper offers a technical definition in terms of k-NN). It's addressed by resampling a less hard subset of the data from the particular demographic group.
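To make the idea of hardness concrete, here is a minimal sketch, not the paper's code. It assumes hardness is measured as the fraction of an instance's k nearest neighbours that carry a different label (a common k-NN-based measure, which may differ from the paper's exact definition). The function names, the threshold, and the simple filtering step standing in for resampling are all hypothetical choices made for illustration.

```python
import math


def knn_hardness(points, labels, k=3):
    """Return a hardness score per instance (ASSUMED measure, not the paper's):
    the fraction of its k nearest neighbours that have a different label."""
    scores = []
    for i, p in enumerate(points):
        # Euclidean distance from instance i to every other instance.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbours = [j for _, j in dists[:k]]
        disagree = sum(labels[j] != labels[i] for j in neighbours)
        scores.append(disagree / len(neighbours))
    return scores


def keep_less_hard(points, labels, groups, target_group, threshold=0.5, k=3):
    """Return the indices to keep: every instance outside target_group, plus
    the instances inside it whose hardness is at or below the threshold.
    (Hypothetical simplification of 'resampling a less hard subset'.)"""
    scores = knn_hardness(points, labels, k)
    return [
        i
        for i in range(len(points))
        if groups[i] != target_group or scores[i] <= threshold
    ]
```

An instance that sits among neighbours with the opposite label gets a score near 1 and would be dropped from the target group, while instances deep inside their own class score near 0 and are kept.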