Here’s my take on the 2016 AP chemistry exam debrief, offered by members of the College Board, that occurred at the 2016 BCCE meeting on the morning of Wednesday, August 3rd.

As ever, these notes are made quickly at the time, and no recording is allowed. I mention that since if anyone finds anything that is not 100% accurate, then I would gladly correct it if you could produce evidence that I have got something wrong. Having said that, I’m pretty confident that everything catalogued below is a true representation of the meeting.

As the presentation forms part of the George R. Hague Jr. Memorial AP Chemistry Symposium, the proceedings were introduced by Harvey Gendreau. In his introduction, Harvey mentioned me and described me as ‘noted’ and ‘enthusiastic’. I’ll leave it at that! Harvey went on to make a reference to my persistent calls for an AP chemistry Tsar, and then proceeded to give the College Board folk (and me), a Thermochromic coffee mug to mark the same. Highly amusing, at least to me!

Tsarmug

Prior to the meeting starting, Serena Magrogan (Director, AP Research & AP Science Curriculum & Content Development) came over to say ‘Hi’. She and I had a brief conversation that covered a few points. She confirmed that my current understanding of Prior Knowledge IS correct. She also reiterated a desire to move ahead in a collaborative manner, “that would allow us to move forward.” Fair enough, but I said that I thought the best way to do that would be via the AP Tsar posting answers to any questions on the CB web site. This conversation took place prior to Harvey’s presentation.

As is usual at these meetings, the Chief Examiner for AP Chemistry, Roger Kugel, gave a complete breakdown of the operational FRQ’s. I got a chance to meet Roger this time, and I found him to be very personable. He seemed genuinely interested in meeting me, and listening to what I had to say – I honestly found the brief meeting very amicable.

Kugel’s comments didn’t reveal much that I didn’t already know or that I couldn’t work out for myself, but as usual, there were a few interesting points. I have detailed the things that I thought worth passing on, below.

  • FRQ’s are written and developed only by the TDC, but MCQ’s involve OIW’s (outside item writers)
  • For the foreseeable future, the 7 questions on the FRQ section will have FIXED point values of 10, 10, 10, 4, 4, 4, 4, i.e., will always total 46 points. There will be no flexibility on this. This brought up an interesting point relating to Q5 that I’ll come to in a minute, but it also means that in order to make the MCQ and FRQ sections of equal value, the 46 points in the FRQ section will always be multiplied by 1.087 for as long as the 46 point total remains fixed
  • 154,228 papers were taken in total. This includes the operational (released) exam, plus others (international and alternate)
  • Sig figs were examined on Q7 in 2016
  • During the presentation of each of the FRQ’s, Kugel had certain words and phrases highlighted in three colors: red, green and blue. The red and green words were those that he described as being from the new curriculum, and included, amongst others, the dreaded ‘justify’ and ‘explain’. But here’s the interesting thing. The ‘blue’ words were described by him as ‘legacy words’, and the word that matters in this context was ‘calculate’. Now, that isn’t really telling us something that we don’t already know, but I found this fascinating (also, see below).
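For anyone who wants to check where the 1.087 comes from, here is a quick sketch of the arithmetic. The one assumption is that the MCQ section is worth 50 raw points; that figure was not stated in my notes, but it is what the 1.087 factor implies.

```python
# Fixed FRQ point values reported at the debrief.
FRQ_POINTS = [10, 10, 10, 4, 4, 4, 4]

# ASSUMPTION: 50 raw MCQ points. Not stated in these notes; inferred from
# the 1.087 multiplier.
MCQ_POINTS = 50

frq_total = sum(FRQ_POINTS)      # total raw FRQ points
scale = MCQ_POINTS / frq_total   # factor that puts the FRQ on the MCQ scale

print(f"FRQ total: {frq_total}")
print(f"Scaling factor: {scale:.3f}")
print(f"Scaled FRQ maximum: {frq_total * scale:.1f}")
```

If the fixed 46 ever changed, the factor would simply be recomputed the same way.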

In terms of the exam questions themselves, here are some of the salient points.

FRQ 1. Average = 3.71/10

Since the prompt included units in (a)(ii), there was no need to use units in the answer. Interestingly, it was OK to contradict the question prompt and use OTHER, CORRECT units, e.g., J instead of kJ and still get full credit. I find that a bit odd.

Obviously (at least to me), the word ‘complete’ means that [Ne] was marked wrong.

‘Shielding’ was (correctly IMO) not accepted as an argument for difference in size in part (c). Kugel said that it is, ‘Usually used exclusively for explaining for periodic trends across the period and not down the group’. I agree. Also, the specific use of the terminology of quantum number/shell was not important, rather the key was to explain that whatever term was used, that ‘it’ was occupied.

‘More tightly packed’ was not accepted in part (d), but I suggested that saying ‘smaller r’ would be good; Roger agreed and this reflects a Coulomb’s law emphasis.

Note to students, please avoid ionic charges on water (e.g., O2- and H+). Obviously delta + and – is the way to go.

FRQ 2. Average = 4.39/10

‘Easiest’ question according to the average score. Bob Corell (a reader who was present at the meeting) pointed out that 2 points on this question could be scored by pure, random guessing from a choice of three things given – (a) and (d)(i) – and that this was probably not the best idea! I agree.

FRQ 3. Average = 3.83/10

FRQ 4. Average = 1.35/4

Kugel said that the use of certain technical words such as ‘deprotonated’ certainly has the potential to be a problem for ELL students. Honestly, since this is a technical word associated with chemistry, I really have no issue with this, and this tied in perfectly with Paul Price pointing out that although it could limit accessibility, if it’s in the CED then such words are absolutely fair game. I completely agree with Paul. What I’m less happy about is the general, enormous shift of emphasis away from chemistry and toward process (a little more on that below).

FRQ 5. Average = 2.08/4

Note that no justification was required in part (b). The combination of that, and the stoichiometric ‘2’ in the reaction, meant that nobody could know if the kids actually knew what they were doing when writing ‘2’ as an answer, or had simply guessed! I suggested that if they wanted to clear this up, then they would have to move away from the fixed total of four points for each of Q’s 4-7, and make this part a two point question by forcing justification into the mix.

In part (c), both 5 and 2.5 were accepted numerical values (as I predicted). This led to a longer conversation about precedent, where I pointed out that two values for k in such circumstances had been allowed in the past. The conversation about precedent was clear on one hand (that if a legacy – or indeed any other, earlier question – was examining precisely the same thing as a new question – then precedent might be useful), but less clear on another. The problem here lies in the fact that students and teachers are going to have difficulty deciding if the two questions at hand are indeed sufficiently similar to allow the use of precedent. I think it is possible to imagine a kinetics question in the future where only ONE of the k values would be accepted.

FRQ 6. Average = 0.45/4

Did the use of a complex ion in the equilibrium confuse? Kugel said, ‘maybe’. I really have no issue with this. Just like organic molecules being used to illustrate concepts in IMF questions, this is a thing that teachers need to coach their kids on, in order to make sure that they aren’t fazed by such things. That’s an important point for teachers to note.

Bob Corell (who briefly graded Q6 in 2016) said he saw very little of the Q v. K explanation for part (b). I think this is a CRUCIAL point, and one I made here. Teachers need to get up to speed in terms of the TDC being obsessed with certain things, in this case Q v. K. If one does that, this question is easy IMO. Because people haven’t gotten up to speed on the Q v. K emphasis, a lot of chemistry teachers got this wrong! Wow!

FRQ 7. Average = 1.76/4

In my initial exam comments, I asked the question about the 2nd decimal on a buret. I was taught (and have consequently always taught my own students), that the 2nd place on a buret should always be either a 0 (if the meniscus sits directly on a graduation), or a 5 (if between graduations). Kugel said that he tries to estimate better than that, but since there was a +/-0.02 leeway given on the initial and final readings, I feel that makes this point irrelevant. As such, 37.30 and 5.65 have latitude, and so does the answer, e.g., 37.10 and 5.67 could be the numbers that one uses to do the calculation. BTW, for credit here, one needed to record the initial and final buret readings in the answer. Harvey pointed out that the buret was NOT shown to be graduated in mL!
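To see what that leeway does to the final answer, here is a rough sketch using the two nominal readings and the +/-0.02 allowance on each. I am assuming that 5.65 is the initial reading and 37.30 the final one, and I have left units out on purpose, given Harvey’s point about the graduations. This is only my illustration of the arithmetic, not the official scoring rule.

```python
from decimal import Decimal

# Nominal readings from the notes above.
# ASSUMPTION: 5.65 is the initial reading and 37.30 is the final reading.
initial = Decimal("5.65")
final = Decimal("37.30")
leeway = Decimal("0.02")  # allowance reported for EACH reading

# Volume delivered using the nominal readings.
delivered = final - initial

# Extremes if both readings drift by the full allowance in opposite directions.
lowest = (final - leeway) - (initial + leeway)
highest = (final + leeway) - (initial - leeway)

print(f"Nominal volume delivered: {delivered}")
print(f"Range with leeway on both readings: {lowest} to {highest}")
```

Decimal is used rather than floats only so that the two decimal places print exactly as they would be written in a lab notebook.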

Kugel went on to offer some general advice which turned out to be no more than a lot of things that we see every year. For example, read the questions carefully, since it’s increasingly important to do so. Yes, we KNOW! He told us that the average score on the FRQ was 38.20%, and that in 2016 the statistical analysis suggested that the kids were a little more able than in 2015, so more 5’s were awarded.
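As a sanity check on that 38.20%, the seven per-question averages listed above can be added up and divided by the fixed 46 point total. A minimal sketch, using only the numbers from my notes:

```python
# Average score per FRQ, as recorded in the notes above.
frq_averages = {
    1: 3.71,
    2: 4.39,
    3: 3.83,
    4: 1.35,
    5: 2.08,
    6: 0.45,
    7: 1.76,
}

TOTAL_POINTS = 46  # fixed FRQ total: 10 + 10 + 10 + 4 + 4 + 4 + 4

mean_total = sum(frq_averages.values())
percent = 100 * mean_total / TOTAL_POINTS

print(f"Mean FRQ total: {mean_total:.2f} / {TOTAL_POINTS}")
print(f"As a percentage: {percent:.2f}%")
```

The per-question averages are consistent with the overall figure that Kugel quoted.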

Interestingly, he also said that the new exam has been deliberately testing common misconceptions of conceptual understanding. That’s interesting, but one would have to know what the CB consider ‘common misconceptions’. Luckily, this is something that I specifically address in my online Summer Workshops each year.

Kugel’s comments were followed by an open forum where he, Serena Magrogan, Paul Price and David Yaron opened the floor up for a Q & A. Here are a few highlights from that part of the morning.

A question based on word density and reliance upon reading skills was asked. Serena said that, ‘ESL students are a concern’, and that they would not use terms that are NOT in the CED. However, if a word were to be used that was not in the CED and was unfamiliar, then it would be defined in the question. Kugel suggested that this would add to the wordiness of the questions, which is of course true, but I haven’t seen much of this that I recall, and it certainly would not account for the almost doubling of the average number of words per MCQ that we have seen from legacy to new curriculum.

More than one person reported that now PSAT English scores are better predictors of AP chemistry exam success than PSAT Math. I think that’s NUTS. Paul Price offered that this is because the CB are now specifically ‘promoting skills that require being able to handle data in wordy situations’. He described that as a ‘higher education’ skill, and part of the process of ‘doing science’. This confirms for me that there is a specific philosophical shift, driven by schools of education, that process is placed above content; I believe that’s a huge mistake.

The possibility of ‘tear out’ periodic tables etc. was once again brought up, and Serena said that was driven by the ETS and $. Well, we all know that, but the question is, should the CB be applying a lot more pressure to put the students as a higher priority above $? Interestingly, I had heard the CB suggest that it was to avoid the disintegration of booklets, but it was reported that this had not been much of an issue in the past.

Serena also said that in her opinion lab is ‘very important’. I still completely disagree about this, both in terms of anecdotal evidence of schools asking for notebooks, and in terms of getting a 5 on the exam. I will continue to urge new teachers in particular, to be wary of the horrible bang for buck that a lot of lab work gives.

Paul offered the statement, “4 is the new 5”, which is something that we already know, but to have it come out of the mouth of the co-chair of the TDC underlines that fact.

Serena said that the CB is ‘working toward’ making AP Insight free. No further details on the timeline were offered, so don’t necessarily expect that soon.

Paul also suggested that if we were to analyze the total number of points awarded for calculations, that we would find that things have not changed much from legacy to new. I have not done this analysis yet, but that seems very surprising to me given that everything here is also 100% true. I have no reason to doubt Paul here, but I would say three things.

  1. I would like to have his statement empirically confirmed.
  2. Assuming he is correct, then I wonder if that would stand up if we analyzed the whole exam?
  3. If Paul’s statement did hold up after both the FRQ, AND the MCQ were analyzed, then I believe that a subsequent analysis of the complexity of those calculations would give us all that we need to know.

So, that’s it for this year’s debrief. Not much to report if I’m honest, and there’s still no sign of the AP Chemistry Tsar that we so desperately need.