Article Information

Author:
Sarah Bansilal1

Affiliation:
1Department of Mathematics Education, University of KwaZulu-Natal, South Africa

Correspondence to:
Sarah Bansilal

Postal address:
8 Zeeman Place, Malvern 4093, South Africa

Dates:
Received: 22 Oct. 2011
Accepted: 04 June 2012
Published: 17 Aug. 2012

How to cite this article:
Bansilal, S. (2012). Using conversions and treatments to understand students’ engagement with problems based on the normal distribution curve. Pythagoras, 33(1), Art. #132, 13 pages. http://dx.doi.org/10.4102/pythagoras.v33i1.132

Copyright Notice:
© 2012. The Authors. Licensee: AOSIS OpenJournals.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Using conversions and treatments to understand students’ engagement with problems based on the normal distribution curve
Abstract

Including probability and statistics in the core curriculum of mathematics in South African schools has made it necessary to train teachers to teach statistics at high school level. This study concentrates on practising mathematics teachers who were students in an in-service programme. The purpose of the study was to investigate students’ success rates on different questions of a multi-part task based on the normal distribution curve. The theory that I used to understand the students’ difficulties is Duval’s theory about movement within and between semiotic representation systems, called treatment transformations and conversion transformations respectively. The first two parts of the problem were unknown percentage problems and involved a treatment followed by a conversion. The third was an unknown value problem and required a conversion before the students could undertake a treatment transformation. The findings reveal that the success rates the students achieved in treatment transformations were higher than those they achieved in conversion transformations. The study also revealed that the direction of the conversions played a role in success rates. Recognising the different challenges the two types of transformations pose requires that teachers pay particular attention to actions that involve movement between different representation systems.

Introduction

Including probability and statistics in the core curriculum of mathematics in South African schools has made it necessary to train teachers to teach statistics at high school level. Although the normal distribution curve is not part of the school curriculum, it is part of a basic course in statistics that aims to equip teachers to teach probability and statistics up to Grade 12 level.

This exploratory study was conducted with 290 in-service secondary school mathematics teachers who had enrolled in an in-service mathematics programme. It focuses on one multi-part problem, which was part of the course summative assessment and includes ‘unknown percentage’ and ‘unknown value’ problems (Watkins, Scheaffer & Cobb, 2004).

In unknown percentage problems, students first transform a given value into an associated z-score using the standardisation process. Thereafter, students identify the probability associated with the z-score and interpret the value in terms of the graph. This involves working simultaneously with properties of the standard normal distribution and the properties of particular z-table values.

In unknown value problems, students are given the percentage and must first identify the z-score that corresponds to it from a table of z-values, working with the properties of the standard normal distribution. Thereafter, they calculate the x-score by ‘unstandardising’ it, or reversing the standardising process.

In this article, I refer to these in-service teachers as students because they were participants in the programme. In analysing the students’ performance, I drew on Duval’s (2006) framework for transforming semiotic representations, where he distinguishes between transformations that occur within the same system of representations (treatments) and those that involve a change of register (conversions).

The purpose of this study was, firstly, to investigate whether there were differences in students’ success rates on the two types of transformations (conversions and treatments) that are inherent in one multi-part problem and, secondly, to investigate whether the direction of the conversion transformations influenced success rates.

Students experience unknown value and unknown percentage problems as challenging for many reasons, one of which is that the problems involve applying, and not just recalling, the properties of the normal distribution curve.

In this article I report one particular aspect of the challenges. I am looking at students’ proficiency in carrying out treatment and conversion transformations and investigating whether the differential engagement with these two types of transformations could account for the differences in success rates. In doing so, I do not suggest that this is the only factor that accounts for the challenges associated with these types of problems.

Some literature on the normal distribution curve
Reading and Canada (2011) think that distribution of data is a fundamental concept in its own right, but that it is complex despite its relatively straightforward definition. One can see probability distributions as even more complex, and understanding the differences between data distributions and probability distributions is a key step in statistical reasoning (Cohen & Chechile, 1997). The authors comment that, despite the emphasis on hands-on data analysis and alternative methods of inference, the concept of probability distributions should be part of all introductory statistics courses.

Unlike data distributions, probability distributions are formal theoretical models statisticians use to describe the likelihood of a variable taking on a value or a range of values. It is this theoretical nature that brings out contrasts between probability and data, thereby helping students develop ideas about stochasm (Cohen & Chechile, 1997, p. 2). Wilensky (1997) regards probability distributions as a key concept in probability and statistics because of their importance in understanding statistical models in scientific research and because they stand ‘at the interface between the traditional study of probability and the traditional study of statistics’ (p. 175), and therefore provide an opportunity to make strong connections between the two fields.

Concern about the lack of research into students’ understanding of the normal distribution led to Pfannkuch and Reading (2006) publishing a special issue of the Statistics Education Research Journal. It focused on reasoning about distributions and provides suggested research questions that could address various aspects of reasoning about distributions, including one about the ‘difficulties that students encounter when working with analysing and interpreting distributions’ (p. 5).

Bakker and Gravemeijer (2004) regard a distribution as a conceptual entity for thinking about variability in data. Pfannkuch and Reading (2006) warn that any discussion about the nature of distributions needs to include a conceptual perspective (which clarifies the notions that underpin distributions and why they are important) and an operational perspective (which explains how distributions capture, display and manipulate specific sets of data). Reading and Reid (2006) included both perspectives in their development of a two-cycle hierarchy of reasoning about distributions, based on the application of the structure of observed learning outcomes (SOLO) taxonomy. The first cycle involved understanding key elements whilst the second, more cognitively sophisticated levels, involved using those elements.

Pfaff and Weinberg (2009) believed that actively generating data before analysing them would increase understanding of the statistical concepts. One may see this as indicative of the operational perspective that Pfannkuch and Reading (2006) described. However, their study found that, even though the students actively generated data, their performance in the post-activity assessments was no better than in the pre-activity assessments.

Carlson and Windquist (2011), in their comment about these unexpected results, argued that Pfaff and Weinberg were correct in concluding that ‘the physical act of generating data was not sufficient to produce learning’ (p. 3). However, they disagreed with the conclusion that the authors (Pfaff & Weinberg, 2009) drew that ‘active learning approaches in general are ineffective’ (Carlson & Windquist, 2011, p. 3).

North and Zewotir (2006) move beyond considering only the approach to teaching statistics. They question the content that introductory statistics courses should cover. They call for a re-think of the statistics courses for social scientists and argue for courses that focus on how to use descriptive statistics instead of focusing on calculations like those based on grouped data. They advise that the courses should devote more time to understanding principles and developing statistical reasoning by using rich contexts (North & Zewotir, 2006).

However, the situation of social scientists, who are learning how to interpret and use statistics when studying socioeconomic phenomena, is different to that of teachers who are learning how to teach statistics to school children – the context of the current study. Reading and Canada (2011) describe two studies about the statistical reasoning of elementary teachers. Both studies ‘firmly cast the teacher in the role of the learner’ (p. 229).

In the current study, the teachers were also the learners in a basic course in statistics that aimed to equip them to teach probability and statistics up to Grade 12 level. The module covered aspects of statistics like central tendencies, grouped data, distributions, bivariate data, regression, probability concepts and probability distributions.

A concept like the normal distribution curve is not part of the school curriculum. However, one can see it as an example of what Ball, Thames and Phelps (2008) call horizon knowledge. This is an ‘awareness of how mathematical topics are related over the span of mathematics included in the curriculum’ (p. 403) and is one of the six domains that comprise their model of mathematical knowledge for teaching. Having knowledge of the horizon can help teachers make decisions about how to teach concepts like variation, distributions and other statistical topics.

The analytic framework
A semiotic system is characterised by a set of elementary signs, a set of rules for producing and transforming signs, and an underlying meaning structure that derives from the relationship between the signs within the system (Ernest, 2006).

Radford (2001) has argued that using signs and tools modifies our cognitive functions. On the other hand, Ernest (2006) says that a focus on signs and sign use is the characterising feature of a semiotic perspective of mathematical activity that provides a way of conceptualising the teaching and learning of mathematics. Each semiotic system has its own specific way of working.

Duval (2006) points out that the role semiotic systems of representation play is not only to designate mathematical objects or to communicate but also to work on, and with, mathematical objects. Duval asserts that two different types of transformations of semiotic representations can occur during any mathematical activity. The first type, called treatments, involves transformations from one semiotic representation to another within the same system or register (Duval, 2006, p. 110). Duval (2002) argues that the treatments that one can perform depend on the register one uses and:

the procedures for carrying out a numerical operation depend just as much on the system of representation used for the numbers as on the mathematical properties of the operation. (p. 111)

He illustrates his argument with the fact that the algorithm for adding fractions is different for a decimal notation and a fractional notation of the same numbers ($0.2 + 0.25$ as opposed to $\frac{1}{5} + \frac{1}{4}$).

Furthermore, when dealing with treatments, the semiotic system eases the connection of different representations because the rules of the semiotic system link different representations of the same object.

The second type, called conversions, involves changing the system but retaining the reference to the same objects (Duval, 2006, p. 112).

In order to illustrate the differences between treatments and conversions further, I will use an example from transformation geometry.

Consider a point A(2; 3) on the Cartesian plane with the required transformation on A being an anticlockwise rotation of 90° around the origin. A person can perform the transformation on A by applying the algebraic rule (x; y) → (-y; x) to get the result A′(-3; 2). This transformation is an example of a treatment: it does not require a change in the system of representation because, after the rule is applied, the object is still described in the same representation.
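As a minimal sketch of this treatment, the rule can be applied purely within the coordinate register, with no reference to a drawing of the plane. The function name is hypothetical and is used only for illustration. Note that, in the usual orientation of the axes, the rule (x; y) → (-y; x) describes a quarter turn anticlockwise about the origin.

```python
def rotate_quarter_turn(point):
    """Apply the rule (x; y) -> (-y; x): a 90-degree anticlockwise turn about the origin."""
    x, y = point
    return (-y, x)


A = (2, 3)
A_image = rotate_quarter_turn(A)  # stays in the same (coordinate) register

# Four quarter turns bring the point back to where it started.
back_to_start = A
for _ in range(4):
    back_to_start = rotate_quarter_turn(back_to_start)

print(A_image)
```

The input and the output are both coordinate pairs, which is what makes the operation a treatment rather than a conversion.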

A study by Bansilal and Naidoo (2012), on learners’ engagement with transformation geometry, identified a learner who considered the representation of A in a different register by identifying the location of the point A(2; 3) on the Cartesian plane before performing the rotation transformation. This movement (from the two-coordinate description of A to the location of the point A in the Cartesian plane) is an example of a conversion transformation because the register has changed but not the object (point A). Thereafter the learner worked out the resulting location of the point when he rotated it 90° through the origin by interpreting the motion within the new register. He then identified the location of the rotated point and thereafter assigned the coordinates based on its new position (Bansilal & Naidoo, 2012).

This example illustrates how it is possible for one to perform a transformation using the same representation system (a treatment) and how one could perform it using a representation from a different register. However, the second case needed a conversion transformation to move to the different register of representation before one could perform a treatment using the second system of representation.

Duval gives conversions a more central role in understanding mathematics than he gives treatments and regards conversions as a cognitive threshold that is the main cause of learning difficulties in mathematics. He argues that one cannot reduce a conversion of a representation (change of register) to a treatment. Therefore, conversions account for one of the sources of incomprehension in mathematics.

He believes that ‘we cannot deeply analyse and understand the problem of mathematics comprehension for most learners if we do not start by separating the two types of representation transformation’ (Duval, 2006, p. 127). Duval’s contention is that treatments command more attention in mathematics whilst conversions cause the greatest difficulties in mathematics. He argues that conversions only become relevant because we need to choose ‘the register in which the necessary treatments can be carried out most economically or most powerfully’. Another reason he suggests for using conversions is that they provide ‘a second register to serve as a support or guide for the treatments being carried out in another register’ (p. 127).

The Visualiser/Analyser (VA) model of Zazkis, Dautermann and Dubinsky (1996), which specifies two elements (visualisation and analysis) as two interacting modes of thought, may help us develop an insight into the effort students require to understand conversion transformations. The model describes a series of movements between visual and analytic representations, which are mutually dependent in problem solving rather than unrelated opposites.

In their model, the thinking begins with an act of visualisation, V1 (see Figure 1). It could consist of looking at some ‘picture’ and constructing mental processes or objects. The next step is an act of analysis, A1, which consists of some kind of coordination of the objects and processes constructed in step V1. This analysis can lead to new constructions. In a subsequent act of visualisation, V2, learners return to the same ‘picture’ they used in V1. However, because of the analysis in A1, the picture has changed. As learners repeat the movement between the V and A, they use each act of analysis, based on the previous act of visualisation, to produce new and richer visualisations that they then subject to more sophisticated analyses. This creates a spiral effect.

In this model, the acts of analysis deepen the acts of visualisation and vice versa. It is also important to note that, according to this model, as learners repeat the horizontal motion in the model, the acts of visualisation and analysis become successively closer. At first, the passage from one to the other may represent a major mental effort. However, the two kinds of thought become gradually more interrelated and the movement between them becomes less of a concern.

The VA model suggests that, with repetition, these successive visual and analytic acts move closer together over time. The implication is that this fusion occurs when learners are able to see the properties of the object emerging from the various representations as a whole and can appreciate that the different representations of the same object emphasise different properties of the object. However, it is still one object, like seeing the object from different perspectives. Once learners can see past the differences in representations and understand the connections between the properties revealed by the different registers, conversion transformations are less likely to present barriers.

Therefore, the VA theory suggests that conversion transformations become easier at the stage when the two kinds of perspectives merge. On the other hand, when learners view representations from two registers as being separate and unconnected, conversion transformations would be more laborious because the learners do not appreciate the links between the properties that each representation conveys.

FIGURE 1: Visualisation/Analysis model.

Methodology

The study utilised an interpretive approach because the main goal of the study was to understand the students’ interpretations of reality (Cohen, Manion & Morrison, 2000) when it comes to solving problems based on the normal distribution curve.

The participants were 290 practising teachers who had enrolled in an in-service programme designed to upgrade and retrain mathematics teachers in the Further Education and Training (FET) band. The programme was for an Advanced Certificate in Education (ACE) with a Mathematics FET specialisation. The programme consisted of eight modules, four of which were specific to mathematics, two of which were generic education modules and two were mathematics education modules.

This article focuses on one of the four mathematics modules devoted to a study of introductory probability and statistics suitable for teachers of FET mathematics.

The test items were selected specifically for assessment and research purposes. The three-part task was presented as part of a summative classroom assessment, which included questions from other sections of the module.

One can regard the analysis of the students’ responses as content analysis to throw ‘additional light on the source of communication, its author, and on its intended recipients, those to whom the message is directed’ (Cohen et al., 2000, p. 165). In this case, the students’ responses are the source of the communication intended to convey their engagement with the concept.

The research questions that focused on one multi-part problem based on the normal distribution are:

• Are students more likely to succeed in completing the treatment or conversion transformations the problem requires?
• What role does the direction of the conversion transformations play in the students’ success rates?

The data analysis process involved studying the responses of the 290 students in order to understand the ‘what’, the ‘why’ and the ‘how’ that underlies the data (Henning, 2004). Dey (1993, p. 30) describes data analysis as ‘a process of resolving data into its constituent components to reveal its characteristic elements and structure’.

The students’ responses were broken down into constituent parts that reflected phases of treatments and conversions. I did this to classify and make connections between the data elements (Henning, 2004, p. 128). This means presenting ‘the operations by which data are broken down, conceptualised, and put together in new ways’ (Strauss & Corbin, 1998, p. 120) in order to assess their responses in terms of movement within the same system or between different systems. The students’ responses were then grouped into categories according to their written explanations.

The findings (see below) explain the specific coding, with examples.

Ethical considerations and recruitment procedures
The participants in this study were the teachers who had enrolled in the particular ACE programme. All students signed informed consent forms and agreed that their responses could be used on condition that no real names or personal details would be revealed. No student refused permission.

Reliability and validity
The test items were carefully selected after discussing them with a colleague from the United States of America (USA). I ensured that the questions were ones that the students would have encountered in their learning during the course. The language was sufficiently basic to ensure that most students would understand it.

I coded the responses myself. However, discussions with an experienced statistics education researcher constituted peer debriefing to improve the credibility of the analysis. Peer debriefing occurs when researchers describe the research to peers who ask the ‘why’ and ‘so what’ questions and may suggest alternative frameworks.

The test items
The tasks used an application of the properties of the standard normal distribution as their basis. When the distribution of a variable in a set of data is approximately normal, one can use the properties of the standard normal distribution curve to make inferences about the variable under discussion.

The standard normal distribution has a mean of 0 and a standard deviation of 1. One refers to the scores as z-scores in the standard normal distribution. Converting to standard units, or standardising, is the two-step process of re-centring and re-scaling that turns any normal distribution into the standard normal. Firstly, one re-centres all the values in the normal distribution by subtracting the mean from each. This results in a distribution with a mean of 0. Thereafter, one divides all the values by the standard deviation (re-scaling). This results in a distribution with a standard deviation of 1.
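The two-step process described above can be sketched on a small data set. This is only an illustration under stated assumptions: the five scores are hypothetical values chosen to have a mean of 505, the function name is invented for this sketch, and the population standard deviation is used for the re-scaling step.

```python
from statistics import mean, pstdev


def standardise(values):
    """Re-centre (subtract the mean), then re-scale (divide by the standard deviation)."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD; assumed to be non-zero
    recentred = [v - mu for v in values]   # step 1: the mean becomes 0
    return [r / sigma for r in recentred]  # step 2: the SD becomes 1


scores = [394, 450, 505, 560, 616]  # hypothetical scores, not data from the study
z_scores = standardise(scores)

print(mean(z_scores), pstdev(z_scores))
```

After both steps the transformed values have a mean of 0 and a standard deviation of 1, up to floating-point rounding.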

This process of re-centring and re-scaling allows one to solve problems like the unknown percentage problem (Question 1 and Question 2) and unknown value problem (Question 3) (Watkins et al., 2004). Students encountered both types of questions during class discussions and assessments.

In unknown percentage problems, students first transform a given value into an associated z-score by re-centring and re-scaling. The next step, in which students identify the probability associated with the z-score and interpret its value in terms of the graph, involves working simultaneously with properties of the standard normal distribution and the properties of particular z-table values.

Unknown value problems require students first to identify the z-score from a table of z-values that corresponds to a given percentage. Thereafter, they calculate the x-score by ‘unstandardising’ it, or reversing the standardising process.

The questions under scrutiny in this study are:

A university entrance examination scores are scaled so that they are approximately normal. The mean is about 505 and the standard deviation is about 111.

1. Find the probability that a randomly selected student has a score below 400.
2. Find the probability that a randomly selected student has a score between 450 and 600.
3. The school will offer scholarships to students scoring in the top 10%. What score should be used to decide who should be offered scholarships?

Remarks about, and solutions to, test items
Note that these types of questions were familiar to the students because part of the course was devoted to solving such problems using applications of the normal distribution curve. Defining the random variable X is important for computing the probabilities associated with the random variable.

In this case, the random variable is the entrance examination scores, which have a normal distribution. In order to solve this problem, students received a formula sheet that contained the standardisation formula $z = \frac{x - \mu}{\sigma}$. The students could use scientific calculators.

Different statistics textbooks use different tabulation values of a standard normal curve area for a given positive value z0, like P(0 < Z < z0) or P(Z < z0) or P(Z > z0), where these are associated with the area of the corresponding sectors. In the lectures and the assessments, the z-table the students had used was P(0 < Z < z0) for positive z0.

In order to answer these questions, it is necessary for students to use properties that apply to the standard normal distribution, like having a mean of 0, a standard deviation of 1 and an area (under the curve) of 1. The area under the curve is the probability. The symmetry of the curve means that the area to the left of 0 is equal to the area to the right of 0. Because of symmetry at 0, P(-z0 < Z < 0) = P(0 < Z < z0) and P(Z < -z0) = P(Z > z0), where z0 is positive and -z0 is negative.
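The properties listed above can be checked numerically. The sketch below uses Python's standard-library `NormalDist` in place of a printed z-table; the helper `table_value` is a hypothetical name that mimics the P(0 < Z < z0) convention of the course table, and z0 = 1.0 is an arbitrary illustrative value.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def table_value(z0):
    """Area P(0 < Z < z0) for z0 >= 0, mimicking the course z-table convention."""
    return Z.cdf(z0) - 0.5


z0 = 1.0  # arbitrary positive z-value, for illustration only

left_of_zero = Z.cdf(0)                    # area to the left of 0
inner_left = Z.cdf(0) - Z.cdf(-z0)         # P(-z0 < Z < 0)
lower_tail = Z.cdf(-z0)                    # P(Z < -z0)
upper_tail = 1 - Z.cdf(z0)                 # P(Z > z0)
tail_from_table = 0.5 - table_value(z0)    # tail area recovered from a table reading

print(left_of_zero, inner_left, table_value(z0), lower_tail, upper_tail)
```

The last line of the computation shows the interpretive step a student needs with this kind of table: a tail area is not read off directly but obtained by subtracting the tabulated area from 0.5.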

Question 1: We need to find P(x < 400) = ?
The unknown percentage problem requires students to calculate the corresponding z-score from the given x-score using the process of ‘standardising’:

$$z = \frac{x - \mu}{\sigma} = \frac{400 - 505}{111} \approx -0.946$$

Students then identify the percentage that corresponds to the z-score from the z-table and interpret it. Figure 2 shows the categorisation of the steps as treatments and conversions.

Table 1 presents a summary of the solution with explanatory comments and diagrams.
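A minimal sketch of the Question 1 solution path follows, assuming the stated mean of 505 and standard deviation of 111 and using `NormalDist` as a stand-in for the printed z-table. The variable names are illustrative only.

```python
from statistics import NormalDist

MU, SIGMA = 505, 111  # mean and standard deviation given in the task

# Treatment: standardise the x-score.
z = (400 - MU) / SIGMA

# Conversion: read the area P(0 < Z < |z|), as the course z-table would give it,
# then interpret it on the curve using symmetry to get the lower tail.
table_area = NormalDist().cdf(abs(z)) - 0.5
p_below_400 = 0.5 - table_area

# Cross-check against the normal distribution of the raw scores.
p_direct = NormalDist(MU, SIGMA).cdf(400)

print(round(z, 3), round(p_below_400, 4))
```

The cross-check is included only to show that the table-based route and the direct computation agree; students working by hand would have access to the table route alone.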

Question 2: Here we need P(450 < x < 600) = ?
Figure 3 is a diagram that explains the decomposition of the problem into treatments and conversions.

Table 2 presents a summary of the solution with explanatory comments and diagrams.
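The two-step structure of Question 2 can be sketched in the same way, again assuming the stated mean and standard deviation and using `NormalDist` in place of the printed table. The helper name `table_value` is hypothetical.

```python
from statistics import NormalDist

MU, SIGMA = 505, 111  # mean and standard deviation given in the task
Z = NormalDist()


def table_value(z0):
    """Area P(0 < Z < z0) for z0 >= 0, mimicking the course z-table convention."""
    return Z.cdf(z0) - 0.5


# Treatments: standardise both boundary scores.
z_low = (450 - MU) / SIGMA    # negative, left of the mean
z_high = (600 - MU) / SIGMA   # positive, right of the mean

# Conversions: read both areas from the table and combine them on the curve.
# The interval straddles 0, so the two areas are added.
p_between = table_value(abs(z_low)) + table_value(z_high)

# Cross-check against the normal distribution of the raw scores.
raw = NormalDist(MU, SIGMA)
p_direct = raw.cdf(600) - raw.cdf(450)

print(round(z_low, 3), round(z_high, 3), round(p_between, 4))
```

Because 450 lies below the mean and 600 above it, the two table readings are added; had both scores been on the same side of the mean, they would be subtracted instead.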

Question 3: We need to find an x-score so that P(x > ?) = 0.1
The unknown value problem requires students first to identify the z-score from a table of z-scores that corresponds to the given percentage. The students can then calculate the x-score by ‘unstandardising’, or reversing the standardising process. Figure 4 shows the categorisation of the steps as treatments and conversions.

The diagram in Figure 4 has the arrows reversed from Question 1 to show that the direction of the solution is opposite to that of Question 1. Table 3 presents a summary of the solution with explanatory comments and diagrams.
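The reversed direction of Question 3 can be sketched as follows, assuming the stated mean of 505 and standard deviation of 111. Here `inv_cdf` stands in for the step of searching the z-table for a given area; the variable names are illustrative only.

```python
from statistics import NormalDist

MU, SIGMA = 505, 111  # mean and standard deviation given in the task

p_top = 0.10  # scholarships go to the top 10%

# Conversion: move from the percentage to a z-score.
# With a P(0 < Z < z0) table, the area to look for is 0.5 - 0.10 = 0.40,
# which corresponds to a cumulative area of 0.90.
area_between_0_and_z = 0.5 - p_top
z = NormalDist().inv_cdf(0.5 + area_between_0_and_z)

# Treatment: 'unstandardise' by reversing the standardisation formula.
x_cutoff = MU + z * SIGMA

print(round(z, 2), round(x_cutoff, 1))
```

Note that the order of operations is the opposite of Question 1: the conversion comes first and the treatment second.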

Findings and discussion

One can regard the standardisation procedure as a treatment transformation because it is within the same register. The x-score is the input and the z-score is the output of the procedure. As Duval predicted, most students did not experience problems at this point. For Question 1 and Question 2, students could complete the standardisation procedure in one register with the visualisation serving only as ‘a second register to serve as a support or guide for the treatments being carried out in another register’ (Duval, 2006, p. 127).

For Question 3, the situation is somewhat different: a conversion is necessary because we need to choose ‘the register in which the necessary treatments can be carried out most economically or most powerfully’ (Duval, 2006, p. 127), in this case the register that permits the unstandardising of the z-score. One could not access the z-score without performing a conversion operation that allows movement from the percentage value to the z-score.

FIGURE 2: Question 1 broken down in terms of conversions and treatments.

FIGURE 3: Question 2 broken down in terms of conversions and treatments.

FIGURE 4: Question 3 broken down in terms of conversions and treatments.

TABLE 1: Solution to Question 1 with explanations.

TABLE 2: Solution to Question 2 with explanations.

TABLE 3: Solutions to Question 3 with explanations.

It is necessary to distinguish between direct and inverse problems (Groetsch, 1999) in this study. A direct problem is one that asks for an output when students have the input and the process. In an inverse problem, students have the output and the problem could ask for the input or the process that led to the output.

One can regard Question 1 as a direct problem and Question 2 as a two-step direct problem. One can regard Question 3 as an inverse problem for two reasons. Firstly, it consists of a conversion that takes a p-value and converts it to a z-score. The z-table is organised according to the z-scores. For a given z-value, students can read off a corresponding p-value. In Question 3, the students had a probability value and had to scan the tables until they identified a suitable z-score that corresponded to the given probability. Secondly, the formula in the formula sheet was the standardisation formula $z = \frac{x - \mu}{\sigma}$.

In Question 1 and Question 2, the students used the formula in the form presented. The value x was the input and the output was z. However, for Question 3, the students had output z and they had to calculate the input. Therefore, one can regard Question 3 as a combination of two inverse problems and as an inverse problem in the way that Groetsch (1999) described.
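The difference between the direct and the inverse use of the table and the formula can be sketched in code. This is only an illustration: the table is rebuilt from `NormalDist` with two-decimal z-values and four-decimal areas, which is an assumption about the printed table's layout, and the function names are hypothetical.

```python
from statistics import NormalDist

MU, SIGMA = 505, 111  # mean and standard deviation given in the task
Z = NormalDist()

# A z-table organised by z-score: z0 -> P(0 < Z < z0), for z0 = 0.00 ... 3.99.
z_table = {round(i / 100, 2): round(Z.cdf(i / 100) - 0.5, 4) for i in range(400)}


def direct_lookup(z0):
    """Direct use: the z-score is the input, the area is read off."""
    return z_table[round(z0, 2)]


def inverse_lookup(area):
    """Inverse use: scan the table body for the entry closest to the given area."""
    return min(z_table, key=lambda z0: abs(z_table[z0] - area))


def unstandardise(z0):
    """Inverse use of the formula: z is known and x is recovered."""
    return MU + z0 * SIGMA


z_for_top_10 = inverse_lookup(0.5 - 0.10)  # area between 0 and z is 0.40
x_cutoff = unstandardise(z_for_top_10)

print(direct_lookup(0.95), z_for_top_10, round(x_cutoff, 1))
```

The direct lookup is a single indexed read, whereas the inverse lookup has to search the whole table body, which mirrors the extra scanning effort the inverse direction demands of students.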

In order to present the analysis, the students’ responses are labelled to serve as references, for example, S17 which means the response was that of student 17. Students’ responses are labelled from S1 to S290.

The students’ responses are verbatim, although the layout has been changed because of limited space.

Findings for Question 1

× Blank or unrelated algorithm
Here responses were coded blank if students made no attempt. A response was coded as unrelated algorithms if students wrote a formula where the algorithms did not relate to the standardisation procedure. Two examples follow:

Eqn 5

■ Partial treatments (PT)
Here responses were coded as partial treatments (PT) if students wrote the appropriate standardisation formula but did not substitute the correct values or substituted the correct values but did not compute the result correctly, for example:

Eqn 6

■ Complete or full treatments (FT)
Here responses were coded as complete or full treatments (FT) if students completed the standardisation and arrived at the correct figure of -0.945 or if they wrote the value 0.945 as the value they would read off from the z-table. If they went on to other steps that were incorrect, then the responses were coded as FT. For example, some students (12) did not read off a p-value from the z-table and interpreted the z-score as a probability. An example follows:

Eqn 7

Some students continued and used the resulting ‘probability value’ (obtained as for S1) to determine a z-score in the z-table. An example follows:

Eqn 8

Here the student used the z-score (-0.946) as a probability value, found the z-score that corresponded to the ‘probability value’ and presented the z-score (1.83) as a probability (even though it was greater than 1).

■ Partial conversions (PC)
Responses were coded as partial conversions (PC) if students determined a p-value from the z-table that corresponded to the z-score. The value did not have to be accurate, as long as there was a reading of a p-value from a related z-score. An example follows:

Eqn 9

where 0.95 corresponds to a z-score of 33.65%

■ Complete or full conversions (FC)
Here responses were coded as complete or full conversions (FC) if students interpreted the p-values of the z-table in terms of the area under the curve to provide correct (or nearly correct) answers.

Each step depends on the previous step. Therefore, a student who completed an FC, would have done the PC, FT and PT steps.
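The difference between a partial and a full conversion can be illustrated with a small sketch. It assumes a z-table that lists the area between 0 and z (the table layout is not reproduced in this section, so this is an assumption), and it computes the tabulated value with Python's `statistics.NormalDist` rather than looking it up. The function names are illustrative only.

```python
from statistics import NormalDist


def table_value(z):
    """Partial conversion: the tabulated area between 0 and |z|.

    Assumes a '0 to z' table layout; the value is computed here
    instead of being read from a printed table.
    """
    return NormalDist().cdf(abs(z)) - 0.5


def area_below(z):
    """Full conversion: interpret the table value as the area P(Z < z)."""
    p = table_value(z)
    # For a negative z the required area is the left tail: 0.5 minus the table value.
    return 0.5 - p if z < 0 else 0.5 + p


z = -0.945  # the z-score obtained for Question 1
print(f"table value for |z| = {abs(z)}: {table_value(z):.4f}")
print(f"P(Z < {z}) = {area_below(z):.4f}")
```

Reading off the table value corresponds to the PC step; deciding how that value relates to the required region under the curve corresponds to the FC step.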

Table 4 shows that, of the 290 students, 223 (77%) were able to recognise the correct standardisation formula. Only 199 (69%) were able to complete the standardisation procedure correctly. Fifty-five (19%) performed partial conversions and 79 (27%) completed the conversions and the question.

TABLE 4: Results for Question 1.

In order to get a clearer idea of how the students progressed from the treatment steps to the conversion steps, we can consider the cumulative totals:

• the number of students who managed partial treatments will include those who completed the treatments
• those who completed the treatments will include those who managed partial conversions
• those who managed partial conversions will include those who completed full conversions.

The bar graph in Figure 5 gives these numbers. Of the 290 students, 223 (77%) students began the appropriate standardisation procedure. Of these 223 students, 199 (89%) completed the standardisation treatments and of these, 134 (67%) were able to complete the first part of the conversions. Seventy-nine (59%) of the last group were able to complete the conversions correctly.

Findings for Question 2
The following codes were used for Question 2. It is not necessary to give examples of responses in all categories because they are similar to those for Question 1 except that there are two sets of treatments and conversions.

× Blank or unrelated algorithm

Partial treatments (PT), where students chose the appropriate standardisation formula (in one or in both cases) but did not complete both.

Full or complete treatments (FT), where students completed the standardisation procedure in one or in both cases but completed no further correct steps.

Partial conversions (PC), where students read off a p-value from the z-table in one or in both cases, but did not combine the two p-values correctly, for example:

Eqn 10

Full or complete conversions (FC), where students interpreted the p-values of the z-table in terms of the area under curve to provide correct (or nearly correct) answers.

Table 5 shows that, of the 290 students, 174 (60%) started one or both standardisation procedures, whilst only 156 (54%) were able to complete one or both standardisation procedures correctly. Only 40 students (14%) completed the question correctly (two of whom had a final answer that differed slightly from the expected one).

TABLE 5: Results for Question 2.

In order to get a clearer idea of how the students progressed from the treatment steps to the conversion steps, I considered the cumulative totals from right to left:

• the number of students who managed partial treatments will include those who completed treatments
• those who completed treatments will include those who managed partial conversions
• those who managed partial conversions will include those who completed full conversions.

The bar graph in Figure 5 gives these numbers. Of the 290 students, 174 (60%) were able to recognise the correct standardisation formula, whilst only 156 (90%) of these students were able to complete it correctly once or twice. Of these 156 students, 96 (62%) completed at least the first part of the conversions once or twice (they read off the p-value for the corresponding z-score). Only 40 (42%) of these were able to complete the conversions and arrive at the correct result.

Findings for Question 3
The following codes were used for Question 3:

× Blank or unrelated algorithm

Partial conversions (PC), where students either interpreted the given percentage as the correct p-value (p = 0.4) but did not carry out any further correct steps, or interpreted the percentage as an incorrect p-value.

Eqn 11

Full or complete conversions (FC), where students read off p-values in a z-table to generate a z-score which was correct or incorrect; students who completed full conversions all continued.

Partial treatments (PT), where students chose the appropriate formula for unstandardising a z-score.

Eqn 12

Full or complete treatments (FT), where students completed the procedure for unstandardisation correctly or nearly correctly.

Eqn 13

TABLE 6: Results for Question 3.

The response of S133 was coded as almost correct, in contrast to that of S135, where the final answer was not close to the expected one.

Table 6 shows that, of the 290 students, 108 students did not respond and 34 used an irrelevant algorithm. Therefore, 142 (49%) did not even begin partial conversions. Seventy-eight (27%) tried but did not generate the correct p-value, whilst 20 (7%) students completed partial conversions by correctly extracting the p-value from the given information. Three (1%) students completed the conversions and started the unstandardising treatments, whilst 47 (16%) students managed complete treatments and obtained a correct or almost correct solution (the final answer that 26 students reached differed slightly from the expected answer).

In order to get a clearer idea of how the students progressed from the conversion steps to the treatment steps, I considered the cumulative totals from right to left:

• the number of students who completed partial conversions will include those who completed full conversions
• those who completed full conversions will include those who completed partial treatments
• those who completed partial treatments will include those who completed full treatments.

The bar graph in Figure 6 gives these figures. There were 148 (51%) students who started the conversions (obtained p-values). Of these 148 students, 50 were able to complete the conversions by reading off p-values and chose the correct formula for unstandardising. That is, 34% completed the conversions (read off the p-values for the corresponding z-score) and started treatments whilst 47 (94%) of the 50 students were able to complete the treatments and solve the problem (the final answers of 26 students differed slightly from the expected one).
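The right-to-left accumulation described above can be reproduced with a short Python sketch. The per-stage counts are taken from the text for Question 3: 78 + 20 students stopped at partial conversions, none stopped at full conversions (all who completed the conversions continued), 3 stopped at partial treatments and 47 reached full treatments. The variable names are illustrative only.

```python
from itertools import accumulate

stages = ["PC", "FC", "PT", "FT"]

# Number of students whose furthest stage was each of the above (Question 3).
furthest_stage_counts = [78 + 20, 0, 3, 47]

# Accumulate from right to left: everyone who reached a later stage
# also passed through every earlier stage.
cumulative = list(accumulate(reversed(furthest_stage_counts)))[::-1]

for stage, total in zip(stages, cumulative):
    print(f"{stage}: {total}")
```

The resulting totals (148, 50, 50 and 47) are the cumulative figures used in the paragraph above.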

Performance on the three questions
Students clearly found that Question 2 was more challenging than Question 1 was. Only 40 students got Question 2 correct whilst 79 students managed to complete Question 1 correctly – almost twice as many. Furthermore, there were 67 blank or incorrect algorithms for Question 1 compared to 116 for Question 2. This showed that more students did not attempt to solve Question 2 than those who failed to attempt Question 1.

It is clear that Question 2 was more complex than Question 1 because it involves regions bounded by two given x-scores. Therefore, there were two sets of treatments as well as two sets of partial conversions and completing the conversions meant that students had to take a global view of the two areas and decide how they would use them to generate the required percentages. Consequently, solving Question 2 would have been more demanding than just carrying out treatments followed by conversions, as Question 1 required.
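The extra step that Question 2 demands, deciding how to combine the two table readings, can be sketched as follows. The sketch assumes a z-table that lists the area between 0 and z (an assumption about the table layout), computes that area with `statistics.NormalDist` rather than reading it from a table, and uses a hypothetical mean and standard deviation, not the values of the test item.

```python
from statistics import NormalDist


def table_value(z):
    """Area between 0 and |z|, as a '0 to z' table would list it (assumed layout)."""
    return NormalDist().cdf(abs(z)) - 0.5


def prob_between(a, b, mu, sigma):
    """P(a < X < b) for a < b, built from two treatments and two table readings."""
    z_a = (a - mu) / sigma   # first treatment
    z_b = (b - mu) / sigma   # second treatment
    p_a = table_value(z_a)   # first reading
    p_b = table_value(z_b)   # second reading
    if z_a < 0 <= z_b:
        # Bounds on opposite sides of the mean: the two areas are added.
        return p_a + p_b
    # Bounds on the same side of the mean: the smaller area is subtracted.
    return abs(p_b - p_a)


# Hypothetical parameters, for illustration only (not the article's values).
MU, SIGMA = 500.0, 100.0
print(f"P(450 < X < 600) = {prob_between(450, 600, MU, SIGMA):.4f}")
```

Whether the two readings must be added or subtracted depends on where the bounds lie relative to the mean, which is the global view of the two areas that a full conversion requires.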

Question 3 was challenging for the 142 (49%) students who did not start correctly. Forty-seven completed the whole question correctly or almost correctly. This was more than the 40 who completed Question 2 correctly or almost correctly but fewer than the 79 who completed Question 1 correctly or almost correctly.

If one compares performance on Question 3 with that on Question 1, 67 students did not start Question 1 correctly. On the other hand, there were more than twice as many (142) students who did not begin Question 3 correctly. There are two possible reasons for this.

Firstly, the inverse nature of the question meant that the steps to the solution were reversed, which made it more complex (Bansilal, Mkhwanazi & Mahlabela, in press; Groetsch, 1999; Nathan & Koedinger, 2000). Secondly, students had to complete the conversions before the treatments. This placed the greater barrier first, unlike Question 1, where the first barrier (the treatments) was not as great as the second (the conversions).

Duval’s (2006) theory maintains that conversion transformations are more difficult than treatment transformations are because they require crossing into another register of representation. Conversions are more complex because they involve movement in each of the two registers and movement across them, whilst treatments require movement in one register only.

Success rates in conversion transformations and treatment transformations
The bar graph in Figure 5 provides a visual representation of the progress of students through the stages for Question 1 and Question 2. It shows the number of students who did a PT, FT, FT PC and FT FC respectively and excludes the students who made no response or used a wrong formula. Note that, in this graph, the first set includes the second, which includes the third, which includes the fourth. The figures derive from Table 4 and Table 5.

FIGURE 5: Number of students progressing at each stage to the final solution for Question 1 and Question 2.

FIGURE 6: Number of students progressing at each stage in Question 3 to the final solution.

The cumulative picture for Question 3 (see Figure 6) shows the number of students who completed a PC, FC, FC PT and FC FT respectively. The first set includes the second, which includes the third, which includes the fourth. These figures derive from the information Table 6 provides.

There are clear trends in performance on Question 1 and Question 2.

Of the 290 students, 223 (77%) performed a PT on Question 1. Of these, 199 (89%) completed the treatments. Of this group, 134 (67%) went on to complete a PC and 79 (59%) of this group were successful. For Question 2, the numbers from Table 5 are 290 (original), 174 (PT), 156 (FT), 96 (PC) and 40 (FC). The flow diagrams below show these figures:

Question 1: 100% → (PT) 77% → (FT) 89% → (PC) 67% → (FC) 59%

Question 2: 100% → (PT) 60% → (FT) 90% → (PC) 62% → (FC) 42%

The attrition rate at each stage of Question 2 was higher than that for Question 1, except for the progression from partial treatments to full treatments: 90% of the students who managed partial treatments for Question 2 completed the treatments, whereas the corresponding percentage for Question 1 was 89%. For all other stages, the progression rate from one stage to the next was higher for Question 1 than it was for Question 2. On both questions, the highest attrition rate was in the progress from PC to FC: only 59% of students who started conversions for Question 1 completed them, whilst for Question 2 only 42% of students who started the conversions were able to complete them.

When one considers the performance on Question 3, the numbers from Table 6 are 290, 148 (PC), 50 (FC), 50 (PT), 47 (FT). The flow diagram below shows the figures:

Question 3: 100% → (PC) 51% → (FC) 34% → (PT) 100% → (FT) 94%

Here, as for Question 1 and Question 2, the highest attrition rate was in the movement from PC to FC. Only 34% of the group who started conversions were able to complete them and all of these students went on to start treatments. Thereafter, there were few challenges for this group and only three students did not complete the procedure.
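The percentages in the three flow diagrams can be recomputed from the cumulative counts reported for each question. The following Python sketch does this; the function name `progression_rates` is chosen here for illustration, and each rate is the share of students at one stage who reached the next stage, rounded to the nearest per cent.

```python
def progression_rates(counts):
    """Percentage of students at each stage who reached the next stage."""
    return [round(100 * later / earlier)
            for earlier, later in zip(counts, counts[1:])]


# Cumulative counts, starting with all 290 students.
question_1 = [290, 223, 199, 134, 79]  # all, PT, FT, PC, FC
question_2 = [290, 174, 156, 96, 40]   # all, PT, FT, PC, FC
question_3 = [290, 148, 50, 50, 47]    # all, PC, FC, PT, FT

for name, counts in [("Question 1", question_1),
                     ("Question 2", question_2),
                     ("Question 3", question_3)]:
    print(name, progression_rates(counts))
```

In each list of rates the lowest value after the first stage marks the step from partial to full conversions, which is where the attrition was highest.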

The treatment procedure for Question 3 was not a problem for those students who completed their conversions. Forty-seven of the 50 students (94%) who completed conversions were able to complete treatments.

The conversions were problematic in Question 1 and Question 2. They were insurmountable for many, because only 79 of the 199 (40%) and 40 of the 156 (26%) students who completed treatments were successful with conversions.

A comparison between trends in responses across the questions supports Duval’s assertion that conversion transformations can be more complex than treatments. For Question 1 and Question 2, the percentage of students who proceeded from full treatments to full conversions was 40% (79 of 199) and 26% (40 of 156) respectively, whilst for Question 3 the percentage of students who proceeded from full conversions to full treatments was 94% (47 of 50).

It is clear that, for the group as a whole, the students’ success rates in conversion transformations were lower than in treatment transformations. However, not all the students would have experienced conversions as more difficult than treatments. The movement between the two registers was not a problem for some students.

Direction of conversions
The direction of conversions is another factor that Duval contends affects the complexity of mathematical activities. Duval maintains that a ‘conversion in one direction can be without any cognitive link with this in the reverse direction’ (Duval, 2008, p. 47), suggesting that the direction of the conversions is important. Duval has shown that, when the original and destination registers of conversions change, students’ performances vary considerably. In one case of linear algebra, 83% of students were able to move successfully from a two-dimensional table representation of a vector to a two-dimensional graphical representation, whereas only 34% of students were able to move in the opposite direction.

The direction of the conversions seems to have been a factor that influenced the students’ success rates. Seventy-nine students completed Question 1 correctly, whilst only 47 students did so on Question 3. Of the students who started conversions for Question 1, 59% were able to complete them, whilst only 34% of the students who started conversions for Question 3 were able to do so.

The reason for the lower completion rate for the conversions for Question 3 could lie in the direction of the conversion. In Question 1 the conversion involved moving from the z-score to the probability value (or area), whereas in Question 3 it ran in the opposite direction (from the probability value to the z-score). In addition, 94% of the students who completed conversions for Question 3 went on to complete the treatments. Therefore, the conversions were bigger hurdles. The percentage of students who proceeded from full treatments to full conversions in Question 1 was 40% (79 of 199).

One of the factors that made Question 3 more challenging was the direction of the conversions, which was different in the two cases. Duval’s own observations about linear algebra (2008) support this. However, we need further research to help us understand why conversions in one direction were more challenging to complete than were conversions in another.

Summary

This article presented an analysis of 290 students’ responses to a three-part task using applications of the normal distribution curve. Duval’s framework was used to explain the students’ difficulties with solving the task.

Question 1 and Question 2 of the task are ‘unknown percentage problems’ and Question 3 is an example of an ‘unknown value problem’ (Watkins et al., 2004) and one can regard it as an inverse problem (Groetsch, 1999).

Different parts of the solutions to the questions were categorised into conversions and treatments, depending on whether the operation required students to move across a register or stay within the same register. The students’ responses were coded according to whether they performed partial treatments, complete treatments, partial conversions or complete conversions.

The findings show that Question 2 was more difficult than Question 1: almost twice as many students completed Question 1 correctly compared to Question 2. It was argued that Question 2 was more challenging because students had to complete two sets of conversions and two sets of treatments. The results of these transformations had to be synthesised to produce an answer.

It was also found that Question 3 was more challenging than Question 1 was. Seventy-nine students obtained correct answers for Question 1 and only 47 obtained correct, or close to correct, answers for Question 3. It was argued that one factor could be the inverse nature of Question 3, whilst Question 1 was a direct problem. The other factor could be that students needed to complete the conversion transformations for Question 3 before the treatment transformations. Furthermore, because the conversions were bigger hurdles, more students could not progress further. The students encountered the treatment transformations first in Question 1. More students succeeded with this hurdle than with the first hurdle in Question 3, allowing them to progress.

Duval’s theory that conversions are more challenging than treatments is supported by the findings in this study. When the attrition rate is examined at each stage in each of the three questions, there were clear patterns in the performance of the students. On Question 1 and Question 2, 59% and 42%, respectively, of the group that started conversions were able to complete them. This compares to approximately 90% of the group who started treatments who were able to complete at least one treatment. In addition, only 34% of the group who started conversions for Question 3 were able to complete them, whereas 94% of the group who started treatments were able to complete them. This shows that completing the conversions was harder than completing the treatments in all three of the questions.

Furthermore, this study supports Duval’s (2006) examples in linear algebra that show that the direction of conversions also plays a role in the difficulty level of questions. He writes that ‘when the roles of source register and target register are inverted within a semiotic representation, the problem is radically changed for students’ and that ‘performances vary according to the pairs (source register, target register)’ (p. 122, brackets added). This was true for Question 1 and Question 3.

In Question 1, if one considers the group of 134 who completed the treatments and started the conversions, then 79 of these (or 59%) succeeded in completing the conversions when the movement was from z0 to P(Z < z0). In Question 3, when the movement was from P(Z > z0) to z0, the success rate was 34% (50 of the 148 who had identified some sort of p-value). This shows that the students found the second conversion more difficult. If one considers the percentages for the whole group of 290, then 79 of the 290 (or 27%) were able to complete conversions for Question 1 whilst only 50 of the 290 (or 17%) were able to complete conversions for Question 3.

Implications of the findings
Duval (2006) differentiated between treatments and conversions and commented that ‘we cannot deeply analyse and understand the problem of mathematics comprehension for most learners if we do not start by separating the two types of representation transformation’ (p. 127).

This study has also shown that conversions and treatments in this problem offer different levels of challenges to students. Therefore, educators should note the additional challenge of moving between systems of representations. The findings suggest that educators may need to support conversion transformations more than treatment transformations to help learners to overcome the challenges.

One aspect that deserves notice is that this group of students did not receive any computer-aided instruction, nor could they work through computer simulations of normal curves, as normally happens in probability and statistics modules nowadays.

If they had had some exposure, they might have had a better idea of the visual aspects of the normal distribution curve and may have been able to switch between representations more easily. Applets or other computer simulation activities could allow students to engage with the properties the different representations reveal. They could also help students to explore situations that show links between the changes in the z-scores with the changes in the area values in the different modes of representation.
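As a rough, text-only stand-in for such an exploration (not an applet, and not an activity used in this study), the following Python sketch tabulates how the area to the left of z changes as the z-score changes. It uses the standard library's `statistics.NormalDist`; the function names and the bar rendering are illustrative assumptions.

```python
from statistics import NormalDist


def area_table(z_values):
    """Pair each z-score with the area under the standard normal curve to its left."""
    standard = NormalDist()
    return [(z, standard.cdf(z)) for z in z_values]


def render(rows, width=40):
    """Show each area both as a number and as a bar of proportional length."""
    lines = []
    for z, area in rows:
        bar = "#" * round(area * width)
        lines.append(f"z = {z:+.1f}   P(Z < z) = {area:.4f}   {bar}")
    return "\n".join(lines)


# z-scores from -3.0 to 3.0 in steps of 0.5
rows = area_table([step / 2 for step in range(-6, 7)])
print(render(rows))
```

Seeing the numerical value and the growing bar side by side links the symbolic register (the z-score and the p-value) with a visual one (the area), which is the kind of coordination that the conversions in these questions require.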

Drawing on Zazkis et al.’s (1996) VA model, perhaps such opportunities will help students move more effortlessly between the different registers, thus reducing the barriers related to carrying out conversion transformations.

The solutions to these questions involved coordinating two different registers, which were initially separate. However, Zazkis et al. (1996) suggest, in their VA model, that even though movement between two modes may start as distinct and separate, they eventually merge. Zazkis et al. confine their discussion to the movement between the acts of visualisation and analysis. However, we can apply it to the two registers that we have identified here to suggest that, at some point, the students will regard the combination of these two registers as one that enriches their ‘cognitive architecture’ (Duval, 2006), and which will enable them to move on to further layers of movement between more complicated registers.

Finally, this article delved into students’ engagements with the treatment and conversion transformations associated with one particular problem. Readers may want to consider whether one could look at other areas in similar ways and whether they could help to explain the students’ difficulties in those areas.

It is hoped that this study will encourage other researchers to look for evidence to support or contradict these findings in other areas. Additionally, it is hoped that such further research would help to illuminate further the challenges that learners experience when they work with problems that involve moving across different registers of representation.

Acknowledgements

I acknowledge a grant from the United States Agency for International Development (USAID), administered through the non-governmental organisation Higher Education for Development for research on the different modules in the ACE certification programme. There was no specific grant for this article.

I also acknowledge the contribution from Thomas Schroeder (University at Buffalo, State University of New York [SUNY], USA), who assisted with a preliminary report on this project, sketched the normal distribution curves in the article and acted as peer debriefer during the analysis process.

Competing interests
I declare that I have no financial or personal relationship(s) that may have inappropriately influenced me when I wrote this article.

References

Bansilal, S., & Naidoo, J. (2012). Learners engaging with Transformation Geometry. South African Journal of Education, 32, 26–39. Available from http://www.sajournalofeducation.co.za/index.php/saje/article/view/452/291

Bansilal, S., Mkhwanazi, T.W., & Mahlabela, P. (in press). Mathematical Literacy teachers’ engagement with contextual tasks based on personal finance. Perspectives in Education.

Bakker, A., & Gravemeijer, K. (2004). Learning to reason about distribution. In D. Ben-Zvi, & J. Garfield (Eds.), The challenge of developing statistical literacy, reasoning and thinking (pp. 147–168). Dordrecht: Kluwer Academic Publishers.

Ball, D.L., Thames, M.H., & Phelps, G. (2008). Content knowledge for teaching: What makes it special? Journal of Teacher Education, 59(5), 389–407. http://dx.doi.org/10.1177/0022487108324554

Batanero, C., Tauber, L.M., & Sanchez, V. (2004). Students’ reasoning about the normal distribution. In D. Ben-Zvi, & J. Garfield (Eds.), The challenge of developing statistical literacy, reasoning and thinking (pp. 257–276). Dordrecht: Kluwer. http://dx.doi.org/10.1007/1-4020-2278-6_11

Carlson, K.A., & Winquist, J.R. (2011). Evaluating an active learning approach to teaching introductory statistics: A classroom workbook approach. Journal of Statistics Education, 19(1), 1–22. Available from http://www.amstat.org/publications/jse/v19n1/carlson.pdf

Cohen, L., Manion, L., & Morrison, K. (2000). Research methods in education. London: Routledge Falmer. http://dx.doi.org/10.4324/9780203224342

Cohen, S., & Chechile, R.A. (1997). Probability distributions, assessment and instructional software: Lessons learned from an evaluation of curricular software. In I. Gal, & J.B. Garfield (Eds.), The assessment challenge in statistics education (pp. 253–262). Amsterdam: IOS Press. Available from http://www.stat.auckland.ac.nz/~iase/publications/assessbk/chapter19.pdf

Dey, I. (1993). Qualitative data analysis: A user-friendly guide for social scientists. London and New York: Routledge. http://dx.doi.org/10.4324/9780203412497

Duval, R. (2002). The cognitive analysis of problems of comprehension in the learning of mathematics. Mediterranean Journal for Research in Mathematics Education, 1(2), 1–16.

Duval, R. (2006). A cognitive analysis of problems of comprehension in the learning of mathematics. Educational Studies in Mathematics, 61, 103–131. http://dx.doi.org/10.1007/s10649-006-0400-z

Duval, R. (2008). Eight problems for a semiotic approach in mathematics. In L. Radford, G. Schubring, & F. Seeger (Eds.), Semiotics in mathematics education (pp. 39–62). Rotterdam: Sense Publishers.

Ernest, P. (2006). A semiotic perspective of mathematical activity: The case of number. Educational Studies in Mathematics, 61, 67–101. http://dx.doi.org/10.1007/s10649-006-6423-7

Groetsch, C.W. (1999). Inverse problems: Activities for undergraduates. Washington, DC: Mathematical Association of America.

Henning, H. (2004). Finding your way in qualitative research. Pretoria: Van Schaik Publishers.

Nathan, M.J., & Koedinger, K.R. (2000). Teachers’ and researchers’ beliefs about the development of algebraic reasoning. Journal for Research in Mathematics Education, 31(1), 171–190.

North, D., & Zewotir, T. (2006). Teaching statistics to social science students: Making it valuable. South African Journal of Higher Education, 20(4), 503–514.

Pfaff, T.P., & Weinberg, A. (2009). Do hands-on activities increase student understanding? A case study. Journal of Statistics Education, 17(3), 1–34. Available from http://www.amstat.org/publications/jse/v17n3/pfaff.pdf

Pfannkuch, M., & Reading, C. (2006). Reasoning about distribution: A complex process. Statistics Education Research Journal, 5(2), 4–9.

Radford, L. (2001, July). On the relevance of semiotics in Mathematics Education. Paper presented to the Discussion Group on Semiotics and Mathematics Education at the 25th Conference of the International Group for the Psychology of Mathematics Education. Utrecht, Netherlands. Available from http://www.laurentian.ca/NR/rdonlyres/C81F52CE-648B-44DF-928D-5A1B0C0612C0/0/On_the_relevance.pdf

Reading, C., & Canada, D. (2011). Teachers’ knowledge of distribution. In C. Batanero, G. Burrill, & C. Reading (Eds.), Teaching statistics in school mathematics: Challenges for teaching and teacher education (pp. 223–234). Dordrecht: Springer. http://dx.doi.org/10.1007/978-94-007-1131-0_23

Reading, C., & Reid, J. (2006). An emerging hierarchy of reasoning about distribution: From a variation perspective. Statistics Education Research Journal, 5(2), 46–68. Available from http://www.stat.auckland.ac.nz/~iase/serj/SERJ5(2)_Reading_Reid.pdf

Strauss, A.L., & Corbin, J. (1998). Basics of qualitative research: Techniques and procedures for developing grounded theory. Thousand Oaks, CA: Sage.

Wilensky, U. (1997). What is normal anyway? Therapy for epistemological anxiety. Educational Studies in Mathematics, 33, 171–202. http://dx.doi.org/10.1023/A:1002935313957

Watkins, A.E., Scheaffer, R.L., & Cobb, G.W. (2004). Statistics in action. Understanding a world of data. Emeryville, CA: Key Curriculum Press.

Zazkis, R., Dautermann, J., & Dubinsky, E. (1996). Using visual and analytic strategies: A study of students’ understanding of permutation and symmetry groups. Journal for Research in Mathematics Education, 27(4), 435–457. http://dx.doi.org/10.2307/749876