Supporting Scientific Argumentation in the Classroom

A new generation of standards for the teaching and learning of science highlights the importance of evidence-based argumentation, both as a powerful learning experience and as a critical practice that reflects how knowledge is constructed in the sciences (NGSS Lead States, 2013; Osborne, 2010).

At the same time, while considerable attention, especially for assessment, has been placed on written arguments, research has also advanced our understanding of the essential role that speaking and listening play in student learning (see, for example, Chi & Wylie, 2014). However, while speaking and listening may be fundamental to science literacy and to learning in general, it is the reading and writing modalities that typically receive the preponderance of attention in consequential educational assessments. As a result, teachers have access to relatively few valid and reliable assessments of speaking and listening, and next to none focused on the construction and critique of oral scientific arguments. Needed, then, are resources to monitor and support the disciplinary aspects of oral scientific argumentation in the classroom.

In response, and through funding from the National Science Foundation (DRL#s 1621441 & 1621496), we have created a digital formative assessment system named DiALoG (Diagnosing the Argumentation Levels of Groups). DiALoG helps teachers recognize and assess eight important aspects of oral argumentation as it happens in their classrooms, and it empowers teachers to act on that assessment information with a suite of Responsive Mini-Lessons (RMLs) tailored to the scores given for each construct DiALoG assesses. RMLs are follow-up activities designed to further develop these important facets of oral argumentation.

Oral arguments consist of both the content voiced by students and the dialectical process of generating and working with that content – what Erduran, Simon, and Osborne (2004) characterize as the distinction between argument and argumentation. Incorporating this theoretical distinction, we originally drafted two bundles of assessment items: one group intended to assess the substantive content of what students say when engaging in activities designed to promote scientific argument, and another designed to gauge the quality of social interaction between interlocutors during those same activities. Because our assessment instrument was intended to target classroom discussion, the wording of items was guided by Michaels, O’Connor, and Resnick’s (2008) work on Accountable Talk. Their framework is predicated on a Vygotskian perspective in which the development of social interaction skills is intertwined with individual cognitive development. To wit, talking is thinking.

To evaluate the substantive content of classroom talk, we created items probing the degree to which students were accountable both to the logical requirements of a valid argument and to the scientific accuracy and relevance of their utterances. For the assessment of argumentation processes, items were guided by Michaels et al.’s notion of Accountability to the Learning Community, which emphasizes respect for and critical attention to the contributions of others so that ideas can build upon one another.

Recognizing that formative assessment is most powerful when teachers take specific, targeted action on the information they gather, we developed a portfolio of instructional suggestions aligned with the DiALoG instrument. The development process for these lessons drew on our own work creating and iteratively testing a variety of argumentation and discourse elements of the Amplify Science curriculum. Drawing on the work of Osborne and colleagues (see, for example, Driver, Newton, & Osborne, 2000; Osborne, Erduran, & Simon, 2004; Osborne, 2010), we sought to provide students with both the skills and information needed to participate in scientific argumentation and many open-ended opportunities to practice those skills. Over an eight-year development period, iterative field-testing of oral and written student activities and analysis of student work led us to identify stages through which most students pass as they internalize the norms of a community in which arguing from evidence and respectfully convincing others of scientific ideas are common practices. We leveraged these insights to develop activities that isolate particular aspects of argumentation, such as reasoning, and allow students to practice and develop this difficult skill.

In parallel with this curricular work, we participated in two studies, Constructing and Critiquing Arguments in Middle School Science Classrooms: Supporting Teachers with Multimedia Educative Curriculum Materials (National Science Foundation, DRL-1119584) and Constructing and Critiquing Arguments: Diagnostic Assessment for Information and Action System (Carnegie Corporation of NY, B-8780), both of which enabled us to create specialized lessons intended to increase students’ understanding of, and capacity to participate in, oral argumentation. The knowledge and successful prototypes generated through this eight-year process of supporting students’ oral and written argumentation became the starting point for development of the RMLs. To further strengthen the formative assessment information-action cycle, the RMLs were explicitly designed to correspond to the range of possible scores for each dimension measured by DiALoG.

Within each dimension – critiquing, listening, co-constructing, claims, evidence, and reasoning – there are three possible levels the teacher can assign, based on her formative assessment of students’ abilities in that dimension. A score of 0 (“not descriptive”) or 1 (“somewhat descriptive”) indicates that students likely need more support with that particular component of oral argumentation. A lesson is then suggested that is responsive to both the identified component and the level of support needed: if a score of 0 is given, the corresponding lesson provides basic, introductory support; if a score of 1 is given, the lesson assumes some basic facility with that component, building on the introductory RML and providing more focused practice. RMLs are 30- to 60-minute lessons designed to be used as a near-term follow-up to argumentation episodes for which the DiALoG instrument has revealed a need for further support.
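To make this score-to-lesson logic concrete, the sketch below (in Python) shows one way it could be expressed. The dimension names come from the paragraph above; the lesson identifiers, the two-tier lookup, and the function name are our illustrative assumptions, not the actual contents of the DiALoG system.

```python
# Minimal sketch of the score-to-lesson logic described above.
# Dimension names follow the text; RML identifiers are hypothetical.

DIMENSIONS = ["critiquing", "listening", "co-constructing",
              "claims", "evidence", "reasoning"]

# Hypothetical lookup: each dimension has an introductory RML
# (for a score of 0) and a more focused practice RML (for a score of 1).
RML_LIBRARY = {
    dim: {0: f"RML-{dim}-intro", 1: f"RML-{dim}-practice"}
    for dim in DIMENSIONS
}

def suggest_rmls(scores: dict[str, int]) -> list[str]:
    """Return a suggested lesson for every dimension scored 0 or 1.

    A score of 2 (the highest level) triggers no follow-up lesson.
    """
    return [RML_LIBRARY[dim][score]
            for dim, score in scores.items()
            if score in (0, 1)]

# Example: a group that argued from evidence well but rarely
# engaged with one another's ideas.
print(suggest_rmls({"claims": 2, "evidence": 2, "reasoning": 1,
                    "critiquing": 0, "listening": 1, "co-constructing": 0}))
```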

With data from both the psychometric development of the DiALoG assessment and preliminary classroom use by pilot teachers, we analyzed evidence for the instrument’s validity and reliability as well as for its feasibility for classroom use.
Psychometric Findings

In addition to high inter-rater reliability (R² = .933) for the total scores allocated by each of two raters to n = 28 videotaped episodes of classroom group argumentation, the same two raters scored the same video episodes in the same order eight weeks later, yielding high test-retest reliability as well (r(28) = .966, p < .001). Exploratory factor analysis with varimax rotation allowed us to consolidate items with high levels of shared variance into a more parsimonious set. Figure 2 provides a visual representation of the eight assessment items that remained after this dimension reduction. Inspection of the scree plot identified two primary factors, consistent with our theoretical framework; we interpreted these as intrapersonal and interpersonal, located on the left and right of Figure 2, respectively. The intrapersonal factor had a Cronbach’s alpha of .980, an initial eigenvalue of 7.611, and accounted for 63.42% of the total variance in scores. The interpersonal factor had a Cronbach’s alpha of .933, an initial eigenvalue of 2.683, and accounted for 22.35% of the total variance in scores. No other factor had an initial eigenvalue above 0.6, and most were below 0.3. We now describe the items pertaining to each of these two dimensions that remained after multiple cycles of video coding and empirical review of item correlations and explained variance.
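For readers who want to see the arithmetic behind these statistics, the sketch below shows how Cronbach’s alpha, a test-retest correlation, and the initial eigenvalues behind a scree plot can be computed with standard scientific Python. The data are synthetic and the item count illustrative, so the outputs will not reproduce the values reported above.

```python
# Illustrative reliability computations; synthetic data, not study data.
import numpy as np
from scipy import stats

def cronbach_alpha(item_scores: np.ndarray) -> float:
    """Cronbach's alpha for an (episodes x items) score matrix."""
    k = item_scores.shape[1]
    item_vars = item_scores.var(axis=0, ddof=1)      # variance of each item
    total_var = item_scores.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(0)
time1 = rng.integers(0, 3, size=(28, 4)).astype(float)  # 28 episodes, 4 items
time2 = time1 + rng.normal(0.0, 0.2, size=time1.shape)  # noisy rescoring

# Test-retest reliability: correlate total scores across the two occasions.
r, p = stats.pearsonr(time1.sum(axis=1), time2.sum(axis=1))

# "Initial eigenvalues" (the scree-plot values) are the eigenvalues of the
# item correlation matrix, listed from largest to smallest.
eigenvalues = np.linalg.eigvalsh(np.corrcoef(time1, rowvar=False))[::-1]

print(f"alpha = {cronbach_alpha(time1):.3f}")
print(f"test-retest r = {r:.3f} (p = {p:.3g})")
print("eigenvalues:", np.round(eigenvalues, 3))
```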

In addition to psychometric testing, we have used classroom observations, interviews, and surveys from teachers piloting the system in two different states to iteratively refine this two-pronged formative assessment of oral classroom argumentation. Among the lessons learned: pilot teachers reported that using the DiALoG assessment attuned them to particular classroom interactions they had not previously considered, and that the accompanying RMLs helped fill gaps in their pedagogical content knowledge and repertoire. Furthermore, pilot teacher feedback strongly suggests that refining the user interface of the DiALoG assessment is not just a matter of improving ease of use but also of expanding the ways in which the assessment is used in the classroom. This early formative experience with DiALoG suggests substantial potential for insights into the research, design, and development of educational technology more broadly.

Following a brief overview of the pedagogy and emerging research findings described above, this session will give participants opportunities to access the DiALoG system, along with hands-on, guided activities to gain practice using it. Mirroring a vetted component of the professional learning experiences we provide to teachers in our pilot study, we will engage workshop participants in a collective review of a video clip of oral scientific argumentation in a middle school classroom, with each participant using DiALoG to individually score the videotaped instance of argumentation. Our basic process is:
1) engage participants in a “cold viewing” of the classroom clip, without using DiALoG;
2) discuss what was observed;
3) present and discuss the DiALoG scoring tool;
4) engage participants in a “warm viewing” of the same clip, using the tool;
5) discuss challenges and insights from the experience.

Through these experiences, we expect participants in the workshop to come away with a better understanding of the following:

1) Which disciplinary features are most important to attend to when formatively assessing oral scientific argumentation.
2) How to use a freely available technological tool for formative assessment of classroom argumentation.
3) Early research findings, including insights from teachers who have used the DiALoG system.

In addition to providing materials about the DiALoG system and its underlying research, we will share online access to the freely available digital system (scoring tool, educative materials, and RMLs) so teachers can continue to use it in their classrooms.