Practically Possibly Perfect

I believe I’m supposed to have an opinion on the changes to A-level science subjects with respect to practical work. I can’t muster up much enthusiasm for the debates, although I liked what I saw late last week on Twitter when it was pointed out that Nobel Laureates don’t know very much about how practical work is run/assessed in schools; teachers do. I don’t know very much about how practical work is run/assessed in schools either, but I do know what it looks like from the point of view of teaching the 1st year undergraduate chemistry cohort. The question for this post is how to assess practical work effectively. I’ll probably write a second post on how we deal with our incoming students’ practical skills.

We have two practical exams, one in December of 1st year and the other in May of 2nd year. Both are intended to assess the practical skills of our chemistry students directly, by observing them in the lab, and also by the more typical indirect measures of competent practical work such as results obtained, lab diary entries and data analysis. The vast majority of practical work assessment that goes on throughout degree courses is indirect assessment – that is, assessing the analysis of results obtained (and the results themselves). This is poor for a number of reasons. Firstly, many people confuse assessing results analysis with assessing practical skills. They aren’t the same thing. Given that most undergraduate practicals are comparatively fail-safe, it may not take that much skill to get enough product to do the write-up. Secondly, results can be fudged (or, more scientifically, be open to measurement bias – more on that later), shared, googled, extrapolated and just plain made up. How many departments list ‘doctoring experimental results’ as academic misconduct? Finally, some mark schemes don’t assign enough marks to the bits that require critical appraisal of how the experiment has been carried out. Yield is a good indicator for this, an NMR spectrum is either clean or it isn’t (unless peaks have been deleted in processing), and looking at the quantity of material actually in the sample vial tells you even more. Marks have to be assigned for those tasks.

For many experiments, it is possible to be biased when making measurements. I can understand why titrations are popular experiments: they give a nice convenient number that can be verified by a competent person using the same chemicals and procedure. However, getting two or three concordant readings after running a rough titration doesn’t strike me as particularly robust. Once you know roughly where the end point is, it’s probably very easy for your brain to trick you into thinking you’ve reached it, giving you better results than you might otherwise get. You need to be very honest when performing such techniques, with a good sense of what the end point looks like, to be consistent here. I’d prefer three titrations using three different concentrations of stuff in the burette, then seeing how close the rough titration comes to the precise value. That’s a bit less open to bias, and possibly a better test of the observation and skill needed.

Practical exams are stressful for all concerned and there is no way to get around that. You have to watch what people are doing. Frustratingly, there is substantial variation in what is viewed as correct technique, even within the same module or course.  One member of staff may believe that putting the lab diary down on the balance table when weighing out is bad practice, another just prefers that the students write the mass down quickly and doesn’t care where the lab diary is. One member of staff prefers that students use a weighing boat and quantitative transfer when preparing samples for UV-Vis analysis, another may feel that weighing directly into the volumetric flask is less likely to induce errors in the results (the ones they will mark).  The only way to overcome this is to have a mark scheme that allows for subjectivity on the part of the marker:

In the opinion of the invigilator, the student performed the task (e.g. preparing a solution in volumetric glassware):

0 – did not perform task (e.g. ran out of time, produced no product…)

1 – poorly (significant mistakes made that would affect the quality of the results obtained – e.g. weighed out on a weighing boat but failed to quantitatively transfer the material or weigh by difference)

2 – adequately (minor mistakes made that would affect the quality of the results – e.g. weighed directly into the flask but failed to notice the quantity spilled on the balance resulting in lower mass than expected in the flask)

3 – well (minor mistakes made with some impact on the quality of results – e.g. used the recommended mass from the instructions to calculate the concentration of the prepared solution rather than the actual mass used, actual mass not recorded)

4 – very well (minor mistakes related to the procedure that would not affect the quality of the results – e.g. used the recommended mass from the instructions to calculate the concentration of the prepared solution rather than the actual mass used, but the actual mass was recorded and the mistake could be rectified with prompting)

5 – exceptionally (could not find fault)

Where does the threshold for a pass sit in such a scheme? Probably around 2.5, I’d say – so 50%, a little higher than the standard 40%.
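As a minimal sketch of that arithmetic (the function names and the 2.5 threshold are just the values discussed above, not part of any official scheme), the conversion from a 0–5 rubric score to a percentage and a pass/fail decision looks like this:

```python
def rubric_to_percent(score: float, max_score: float = 5.0) -> float:
    """Convert a rubric score out of max_score into a percentage mark."""
    return 100.0 * score / max_score

def passes(score: float, threshold: float = 2.5) -> bool:
    """Apply the suggested pass threshold of 2.5 out of 5 (i.e. 50%)."""
    return score >= threshold

print(rubric_to_percent(2.5))  # 50.0
print(passes(2.0))             # False
```

So a student scoring 2 (‘adequately’) on the scale above would fall just short, while 3 (‘well’) comfortably clears the line.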

Safety is the next thing to factor in. For our 2nd year practical exam, we operated a 3 strikes policy – if a student had to be prompted on a safety aspect more than 3 times, they failed the component. We meant things like taking off safety goggles, spilling substances and not alerting a demonstrator or cleaning it up (no strike for spilling itself, only if it was ignored), or setting up equipment in a way that meant the demonstrator had to intervene before it was turned on, such as an unclamped reflux. I’m not convinced this is the way to go – perhaps a deduction of marks rather than a hard limit of 3 strikes, because there will always be a couple of students who push that to its limits! We also reserved the right to ask a student to leave if we felt they were unable to work safely and were a danger to those around them, if they turned up inappropriately attired, or if they broke any of the general lab safety rules in a serious way.

Finally, a serious discussion about what a pass looks like in terms of lab performance needs to be had. A bare pass probably involves a fair bit of prompting by the demonstrators, but with the student able to complete the tasks when prompted. For example, a prep may say that the reaction should be cooled rapidly to 0 °C and a student may have forgotten how to do this. A demonstrator may prompt with ‘use an ice bath’ and the student would then do the task. I’d probably give a bare pass if they just used an ice bath, but if they used an ice-water bath and clamped the flask in it, I’d give a little more because there is at least evidence of further thought. This is entirely debatable, however, and it may be the opinion of those conducting the assessment that prompts should not be needed.

The only suggestions I have to minimise the stress of being watched doing lab work are to run a mock exam (which also helps you test drive the mark scheme and train the demonstrators), ask each demonstrator to mark 4–6 students at a time, and ask the students to think of it as a really quiet lab class, but one where common courtesy such as ‘are you finished with the acetone?’ is still permitted. The last point makes a big difference to stress levels and atmosphere. It’s also really important that students understand that they can and should ask for prompts in order to get through the allocated tasks.
