4. Emerging Technologies: Artificial Intelligence

4.4. Affordances and Examples of AI Use in Teaching and Learning

In a review of the literature on AI in education, Zawacki-Richter et al. (2019) initially identified 2,656 research papers in English or Spanish. They then narrowed the list by eliminating duplicates, limiting it to articles published in peer-reviewed journals between 2007 and 2018, and eliminating articles that turned out not to be about the use of AI in education. This left a final 145 articles, which were then analyzed and classified into different uses of AI in education. This section draws heavily on this classification. (It should be noted that of the 145 articles, only 92 focused on instruction or student support. The rest were on institutional uses, such as identifying at-risk students before admission.)

The Zawacki-Richter study offers one insight into the main ways that AI was used in education for teaching and learning between 2007 and 2018, the closest we can come to ‘affordances’. First, three main general ‘instructional’ categories (with considerable overlap) from the study are provided below, followed by some specific examples. (I have omitted Zawacki-Richter et al.’s category of profiling and prediction, which is concerned with administrative issues such as admissions, course scheduling, and early warning systems for students at risk.)

Intelligent Tutoring Systems (29 out of 92 articles reviewed by Zawacki-Richter et al.)

Intelligent tutoring systems:

  • Provide teaching content to students and, at the same time, support them by giving adaptive feedback and hints to solve questions related to the content, as well as detecting students’ difficulties/errors when working with the content or the exercises.
  • Curate learning materials based on student needs, such as by providing specific recommendations regarding the type of reading material and exercises done, as well as personalized courses of action.
  • Facilitate collaboration between learners, for instance, by providing automated feedback, generating automatic questions for discussion, and analyzing the collaborative process.

Assessment and Evaluation (36 out of 92)

AI supports assessment and evaluation through:

  • Automated grading
  • Feedback, including a range of student-facing tools, such as intelligent agents that provide students with prompts or guidance when they are confused or stalled in their work
  • Evaluation of student understanding, engagement and academic integrity

Adaptive Systems and Personalization (27 out of 92)

AI enables adaptive systems and the personalization of learning by:

  • Teaching course content then diagnosing strengths or gaps in student knowledge, and providing automated feedback;
  • Recommending personalized content;
  • Supporting teachers in learning design by recommending appropriate teaching strategies based on student performance;
  • Supporting representation of knowledge in concept maps.

Klutka et al. (2018) identified several uses of AI for teaching and learning in universities in the USA. ECoach, developed at the University of Michigan, provides formative feedback for a variety of mainly large classes in the STEM field. It tracks students’ progress through a course and directs them to appropriate actions and activities on a personalized basis. Other applications listed in the report include sentiment analysis (using students’ facial expressions to measure their level of engagement in studying), an application to monitor student engagement in discussion forums, and a tool that organizes commonly shared mistakes in exams into groups, so that the instructor can respond once to each group rather than to every student individually.

Chatbots

A chatbot is a program that simulates the conversation or ‘chatter’ of a human being through text or voice interactions (Rouse, 2018). Chatbots are used in particular to automate communications with students. Bayne (2014) describes one such application in a MOOC with 90,000 subscribers. Much of the student activity took place outside the Coursera platform within social media. The five academics teaching the MOOC were all active on Twitter, each with large networks, and Twitter activity around the MOOC hashtag (#edcmooc) was high across all instances of the course (for example, a total of around 180,000 tweets were exchanged on the first offering of the MOOC). A ‘Teacherbot’ was designed to roam the tweets carrying the course Twitter hashtag, using keywords to identify ‘issues’ and then choosing pre-designed responses to these issues, which often entailed directing students to more specific research on a topic. For a review of research on chatbots in education, see Winkler and Söllner (2018).
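
The keyword-and-canned-response logic described above can be illustrated with a minimal sketch. This is not Bayne's implementation: the rules, keywords, replies and the function name `reply_to` are all hypothetical, chosen only to show the general pattern of scanning hashtagged messages for keywords and returning a pre-designed response.

```python
import re

# Hypothetical keyword -> pre-designed response rules.
# The actual Teacherbot's rules are not given in the source.
RULES = [
    ({"deadline", "due"}, "Assignment dates are listed on the course schedule page."),
    ({"posthumanism", "cyborg"}, "You may find the course readings on posthumanism useful."),
]


def reply_to(tweet, hashtag="#edcmooc"):
    """Return a canned reply if the tweet carries the hashtag and matches a rule."""
    text = tweet.lower()
    if hashtag not in text:
        return None  # ignore tweets outside the course conversation
    words = set(re.findall(r"[a-z']+", text))
    for keywords, response in RULES:
        if words & keywords:  # at least one keyword appears in the tweet
            return response
    return None  # no 'issue' identified, so the bot stays silent


print(reply_to("When is the essay due? #edcmooc"))
```

A bot of this kind never interprets what a student means. It can only match surface keywords, which is why its responses tend to be pointers to further material rather than answers.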

Automated Essay Grading

Natural language processing (NLP) artificial intelligence systems – often called automated essay scoring engines – are now either the primary or secondary grader on standardized tests in at least 21 states in the USA (Feathers, 2019). According to Feathers:

Essay-scoring engines don’t actually analyze the quality of writing. They’re trained on sets of hundreds of example essays to recognize patterns that correlate with higher or lower human-assigned grades. They then predict what score a human would assign an essay, based on those patterns.

Feathers claims, though, that research from psychometricians and AI experts shows that these tools are susceptible to a common flaw in AI: bias against certain demographic groups (see Ongweso, 2019).
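
The general approach Feathers describes, learning patterns that correlate with human-assigned grades and then predicting a score, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the four training essays and their grades are invented, and a single surface feature (mean word length) with a least-squares line stands in for the much richer features and models that real scoring engines use.

```python
def mean_word_length(essay):
    """A surface feature: it measures vocabulary, not the quality of the argument."""
    words = essay.split()
    return sum(len(w) for w in words) / len(words)


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical training set: (essay text, human-assigned grade).
TRAINING = [
    ("the cat sat on the mat", 1),
    ("school uniforms are good because they save time", 2),
    ("uniform policies arguably reduce visible economic differences", 4),
    ("mandatory uniforms demonstrably diminish socioeconomic stratification", 5),
]

SLOPE, INTERCEPT = fit_line(
    [mean_word_length(text) for text, _ in TRAINING],
    [grade for _, grade in TRAINING],
)


def predict_score(essay):
    """Predict the grade a human would assign, from the learned pattern only."""
    return SLOPE * mean_word_length(essay) + INTERCEPT


plain = "dogs are fun and kind to us"
gibberish = "perspicacious obfuscation notwithstanding quintessential paradigmatic"
print(predict_score(plain), predict_score(gibberish))
```

Because the model has only learned that longer words go with higher human grades, the meaningless string of long words is predicted to score higher than the simple but sensible sentence.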

Lazendic et al. (2018) offer a detailed account of the plan for machine grading in Australian high schools. They state:

It is …crucially important to acknowledge that the human scoring models, which are developed for each NAPLAN writing prompt, and their consistent application, ensure and maintain the validity of NAPLAN writing assessments. Consequently, the statistical reliability of human scoring outcomes is fundamentally related to and is the key evidence for the validity of NAPLAN writing marking.

In other words, the marking must be based on consistent human criteria. However, it was announced later (Hendry, 2018) that Australian education ministers agreed not to introduce automated essay marking for NAPLAN writing tests, heeding calls from teachers’ groups to reject the proposal.

Perelman (2013) developed a computer program called the BABEL generator that patched together strings of sophisticated words and sentences into meaningless gibberish essays. The nonsense essays consistently received high, sometimes perfect, scores when run through several different scoring engines. See also Mayfield (2013) and Parachuri (2013) for thoughtful analyses of the issues in the automated marking of writing.
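
To give a sense of how little is needed to produce such text, here is a minimal sketch of a gibberish generator. It is not Perelman's BABEL generator, whose word lists and grammar are not described here. The vocabulary, the sentence template and the function name `gibberish_essay` are hypothetical.

```python
import random

# Hypothetical vocabulary; chosen only to sound sophisticated.
OPENERS = ["Notwithstanding", "Inasmuch as it is", "Paradoxically", "Axiomatically"]
SUBJECTS = ["the epistemology", "a juxtaposition", "the quintessence", "an amanuensis"]
VERBS = ["promulgates", "circumscribes", "obfuscates", "substantiates"]
OBJECTS = ["an inexorable dichotomy", "the paradigmatic assertion",
           "a perspicacious abnegation", "the ubiquitous propensity"]


def gibberish_essay(n_sentences, seed=None):
    """Patch random sophisticated phrases into grammatical but meaningless sentences."""
    rng = random.Random(seed)  # a seed makes the output repeatable
    sentences = []
    for _ in range(n_sentences):
        sentences.append(
            f"{rng.choice(OPENERS)}, {rng.choice(SUBJECTS)} "
            f"{rng.choice(VERBS)} {rng.choice(OBJECTS)}."
        )
    return " ".join(sentences)


print(gibberish_essay(3, seed=1))
```

Each sentence follows a fixed grammatical template, so the result looks like formal writing to any scoring engine that relies on vocabulary and sentence structure rather than meaning.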

At the time of writing, despite considerable pressure to use automated essay grading for standardized exams, many questions about the technology remain unresolved.