A number of questions arise in this case:
- Normal courses have anywhere from 10 to 300+ students. Does the size of the corpus make a big difference when it comes to feedback?
- One to ten assignments per semester is normal. Should the corpus for feedback include all assignments, or just the collection formed by one assignment?
- Somewhat similar to the previous question: should we build one big corpus for the feedback, or use small ones grouped by topic?
- Several statistical techniques are prohibitively expensive to compute in an online, interactive environment because of the size of the term/document matrix. This implies that we have to do some feature selection before calculating principal components. A lot of questions arise from this issue:
- How many features are needed for student feedback?
- As the collection changes, what do we do with new features? Do we calculate everything again?
- Should features that come from training carry a different importance than those that come from the students' essays? If so, how do we express that difference?
- How can we build a model that reflects all these issues so we can compare and answer all these questions?
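To make the feature-selection questions a bit more concrete, here is a minimal sketch of one possible approach. It assumes document frequency as the selection criterion, which is only one option and not something settled above. The function names (`select_vocabulary`, `to_vector`) and the toy corpus are hypothetical, invented for illustration. The idea is to fix a small vocabulary up front, build the reduced term/document matrix from it, and count how much of a new essay falls outside that vocabulary.

```python
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (an assumption); a real system would
    # probably also stem terms and drop stop words.
    return text.lower().split()


def select_vocabulary(documents, k):
    """Return the k terms that occur in the most documents.

    Ties are broken alphabetically so the result is deterministic.
    """
    doc_freq = Counter()
    for doc in documents:
        # set() so that each term counts once per document
        doc_freq.update(set(tokenize(doc)))
    ranked = sorted(doc_freq.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


def to_vector(document, vocabulary):
    """Return a count vector over a fixed vocabulary, plus the number of
    tokens in the document that fall outside that vocabulary."""
    counts = Counter(tokenize(document))
    vector = [counts[term] for term in vocabulary]
    unseen = sum(n for term, n in counts.items() if term not in vocabulary)
    return vector, unseen


# Toy corpus, invented for illustration only.
corpus = [
    "latent semantic analysis of student essays",
    "feedback for student essays",
    "semantic feedback in online courses",
]

vocabulary = select_vocabulary(corpus, k=4)
matrix = [to_vector(doc, vocabulary)[0] for doc in corpus]

# A new essay arrives after the vocabulary has been fixed.
new_vector, unseen = to_vector("automatic feedback for essays", vocabulary)

print(vocabulary)          # ['essays', 'feedback', 'semantic', 'student']
print(matrix)              # [[1, 0, 1, 1], [1, 1, 0, 1], [0, 1, 1, 0]]
print(new_vector, unseen)  # [1, 1, 0, 0] 2
```

The reduced matrix is what would then go into the principal component calculation. The `unseen` count is one cheap signal for the "new features" question: if it stays low as essays come in, the old vocabulary may still be good enough, and if it grows, that could be the trigger for recalculating everything.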
A small problem with Sakai is that in version 2.3.1 the full content of resources is not getting indexed, and I don't know why. For now I will try to write a document about what is getting indexed and what is not.