Thursday, July 9, 2015

What feedback should a programming MOOC provide?



What sort of feedback models can be used in a MOOC and in other online learning activities? In this note we survey the different styles of giving feedback, look at automated feedback in a little more detail, and come to a conclusion about what is essential in the next iteration of the Erlang MOOC.

Different feedback styles



One mechanism is to assign a Teaching Assistant (TA) to assess the work. The advantage of this is that the TA can tune their feedback to the particular piece of work, and to the individual submission for that work. Feedback can be given on partial or incorrect programs, and can cover all aspects of the program’s behaviour. It can also be given at any stage during the assessment, and not only when a completed solution is available.

One limitation is that it can be impossible to give human feedback alone on a large assessment, for which some testing data and other automated analyses may be necessary to provide the TA with suitable diagnostic information. A more fundamental difficulty is that this approach is hard to scale, and hard to provide in a timely way that suits participants’ different speeds of working. So, while it can work for a small, focussed group, it isn’t feasible for the sort of numbers expected in a MOOC (or even in a university class of 150).

Participant self-assessment provides one route to scaling. The model for this is to give the student a comprehensive mark scheme and marking instructions, and to ask them to apply that scheme to their own work. This can then be moderated by a TA or other teacher. This approach has the pedagogical advantage that it gives the student a chance to gain insight into their own work through applying the marking scheme, but it can only be used on final or complete solutions to assessments, and certainly can’t be used to help students towards a final solution.

A variant of this is peer assessment, under which pairs of students are asked to give feedback to each other on their work. Again, this has the pedagogical advantage of allowing students to gain further insights into how work is assessed, and of giving them a better perspective on their own work through seeing another’s. It can also be used to give students support during the development of a solution, and can indeed become a form of peer programming.

There are two variants of the peer assessment approach: the pairs can be assigned “statically”, in advance of the work being done, or dynamically, with a pairing made only when two participants have each made a submission. It is obviously a scalable solution, though without the consistency of having a single marker; it can be supported with appropriate materials, including marking schemes etc., just as for self-assessment.

The final option, and the one that we look at in more detail in the next section, is providing some form of automated feedback to students. Of course, it is possible to combine these modes – as hinted earlier – supplementing some form of person-to-person feedback with data from a variety of automated tests and analyses.

Automated feedback



On the whole, a prerequisite for giving feedback is that the program unit – call it a module here – needs to be compilable. Without that it is difficult to give any meaningful automated feedback. Difficult, but not impossible: some sorts of lexically based analysis are possible even for programs that are incomplete or contain errors. Nevertheless, the rest of this section will assume that the program can be compiled successfully.
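
As a rough illustration of that first hurdle, an automated checker might begin by confirming that a submission compiles and by collecting any compiler warnings for later feedback. A minimal sketch in Erlang, assuming a single-file submission and a hypothetical module name check_submission:

    %% First-stage check: does the submitted file compile, and what does
    %% the compiler report? The module name is illustrative only.
    -module(check_submission).
    -export([compile_check/1]).

    compile_check(File) ->
        case compile:file(File, [binary, return_errors, return_warnings]) of
            {ok, _Module, _Beam, Warnings} ->
                {compiled, Warnings};
            {error, Errors, Warnings} ->
                {failed, Errors, Warnings}
        end.

Anything beyond this sketch, such as rendering the error and warning terms readably for participants, is where most of the real work lies.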


What kind of feedback can be provided using some sort of automated analysis?


  • Static checks
    • type checks
    • compiler warnings and compiler errors
    • abstract interpretations and other approximate analyses, e.g. dead code analysis
  • Style analysis
    • decidable e.g. lengths of identifiers
    • undecidable (requires approximation), e.g. the module inclusion graph.
    • intensional properties: e.g. this is / is not tail recursive
  • Tests
    • hand-written unit tests, performed by the user
    • unit tests within a test framework; can be performed by the user, or the system
    • integration tests of a set of components, or of a module within context
    • these may be tested against mocked components
    • user interface testing
  • Properties
    • logical properties for functions
    • property-based testing for stateful APIs
    • both of these can be accompanied by “shrinking” of counter-examples to a minimal failing case (see the sketch after this list)
  • Non-functional properties
    • efficiency
    • scalability
    • fault-tolerance (e.g. through fuzz testing)
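
To make the unit-test and property items concrete, here is a minimal sketch of the kind of thing that could ship with an exercise. It assumes a hypothetical exercise module double exporting double/1; the unit tests use EUnit, and the property is written in PropEr/QuickCheck style, where a failing case would be shrunk towards a minimal counter-example:

    %% File double_tests.erl: hand-written EUnit unit tests, each checking
    %% a specific input of the hypothetical double:double/1.
    -module(double_tests).
    -include_lib("eunit/include/eunit.hrl").

    double_zero_test()     -> ?assertEqual(0, double:double(0)).
    double_negative_test() -> ?assertEqual(-6, double:double(-3)).

    %% File double_props.erl: a logical property of the same function,
    %% in PropEr/QuickCheck style.
    -module(double_props).
    -include_lib("proper/include/proper.hrl").
    -export([prop_double_is_self_addition/0]).

    prop_double_is_self_addition() ->
        ?FORALL(N, integer(), double:double(N) =:= N + N).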


What role does testing play in online learning?


  • Confirming that a solution is (likely to be) correct or incorrect.
  • Pointing out how a solution is incorrect.
  • Pointing out which parts of a solution are correct / incorrect.
  • Pointing out how a solution is incomplete, e.g. a case overlooked (see the sketch after this list).
  • Assessing non-functional properties: efficiency, scalability, style, etc.
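
One way to support the “which parts” and “case overlooked” roles is to split the provided tests so that each named test covers a single behaviour; the names of the failing tests then say exactly what is wrong or missing. A sketch, assuming a hypothetical shapes module exporting area/1:

    %% Each EUnit test exercises one case of the hypothetical shapes:area/1,
    %% so a failure of, say, area_circle_test points directly at the case
    %% the participant has overlooked.
    -module(shapes_tests).
    -include_lib("eunit/include/eunit.hrl").

    area_rectangle_test() -> ?assertEqual(12, shapes:area({rectangle, 3, 4})).
    area_square_test()    -> ?assertEqual(9, shapes:area({square, 3})).
    area_circle_test()    -> ?assertEqual(math:pi() * 4, shapes:area({circle, 2})).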


Rationale for using automated feedback


Reasons in favour of using automated feedback, either on its own or as one of a number of mechanisms:
  • Timeliness: feedback can be (almost) immediate, and certainly substantially quicker than from a peer reviewer.
  • Comprehensiveness: can do a breadth of assessment / feedback which would be infeasible for an individual to complete.
  • Scalability: in principle it scales indefinitely, or at least to the limit of available resources; if the evaluation is done client-side then there is no server-side resource cost at all.
  • Consistency of the feedback between different participants.
  • Complements peer interactions: discussion with peers can then concentrate on questions like “why did this fail this test?”


There are, however, drawbacks to using automated feedback.
  • Can miss the high-level feedback – covering “semantic” or “pragmatic” aspects – in both positive and negative cases.
  • Restricted to programs that can be compiled or compiled and run. 
  • Difficult to tie errors to faults and therefore to the appropriate feedback.

There is a cost to automating testing, as it needs to be implemented either on the client side – typically within a browser – or on a server. There is a potential security risk if the testing is performed server-side, and so some sort of virtualisation or container will generally be used to run programs that could potentially interact with the operating system. On the server side any technology can be used, whereas on the client side testing is limited to whatever facilities the browser provides.


Recommendations



At the base level we should provide two modes of feedback when the Erlang MOOC is run next.

Dynamic peer feedback, where pairs are matched dynamically as final submissions are made. This allows solutions that are pretty much complete – and hopefully correct – to receive feedback not only on correctness but also on style, efficiency, generality etc. This gives qualitative as well as quantitative feedback, but requires at least one other participant to be at the same stage at roughly the same time.
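
The matching itself is straightforward to sketch: keep a queue of participants waiting for a review partner, and pair each new submission with whoever is at the head of that queue. A minimal Erlang sketch, in which a participant is represented simply by a process id, and persistence, time-outs and re-matching are all ignored:

    %% Minimal sketch of dynamic peer matching. Module, function and message
    %% names are illustrative; a participant is represented by a pid here.
    -module(peer_match).
    -export([start/0, submit/2]).

    start() ->
        spawn(fun() -> loop(queue:new()) end).

    %% Called when a participant makes a final submission.
    submit(Matcher, Participant) ->
        Matcher ! {submission, Participant},
        ok.

    loop(Waiting) ->
        receive
            {submission, P} ->
                case queue:out(Waiting) of
                    {{value, Q}, Rest} ->
                        %% Someone is already waiting: ask each to review the other.
                        P ! {review, Q},
                        Q ! {review, P},
                        loop(Rest);
                    {empty, _} ->
                        %% Nobody waiting yet: P joins the queue.
                        loop(queue:in(P, Waiting))
                end
        end.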

Unit tests and properties should be provided for participants to execute themselves. While this is less seamless than testing on the server side, it gives participants freedom to use the tests as they wish, e.g. in the course of solving the problem, rather than just on submission.
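
Concretely, running the provided tests needs nothing more than the standard tooling. With the hypothetical double_tests.erl and double_props.erl from the earlier sketch alongside the participant’s own double.erl, something along these lines in the Erlang shell would do:

    1> c(double), c(double_tests), c(double_props).
    2> eunit:test(double_tests).
    3> proper:quickcheck(double_props:prop_double_is_self_addition()).

eunit:test/1 reports how many tests passed, or the details of any failures, and proper:quickcheck/1 reports a (shrunk) counter-example if the property does not hold.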

Understanding tools. We should include information about interpreting output from the compiler and dialyzer; some information is already there, but we should make it clearer.
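
As an illustration of the kind of output that needs explaining, the materials could include a small module that the compiler accepts without complaint but that dialyzer flags, for instance because a -spec disagrees with what the function actually returns. A hypothetical fragment:

    %% Hypothetical fragment: this compiles cleanly, but dialyzer should
    %% report that the -spec for add/2 is not consistent with the function,
    %% which returns a tuple rather than an integer. Explaining how to read
    %% that style of message is exactly what the notes need to cover.
    -module(dialyzer_demo).
    -export([add/2]).

    -spec add(integer(), integer()) -> integer().
    add(X, Y) -> {X + Y}.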

Other feedback is desirable, and integrating as much of it as possible into the browser would make for an attractive user experience, particularly if it sits alongside the teaching materials themselves, à la Jupyter. As an alternative, the feedback support could be integrated with the server, but this will require effort to craft a bespoke solution for Moodle if we’re not yet using a mainstream MOOC platform.