The Paradox of Embracing AI in Higher Education
The more I learn about generative AI, the more uneasy I become about the future of academia.
A few months ago, I was optimistic that generative AI might reverse sliding enrolments in the Arts. I could imagine a future full of AI tools like GPT-3.5, which would require active human supervision to get good results. I agreed with those who thought this might actually grow demand for the critical thinking and research skills we claim to teach in the humanities and social sciences. However, as I continue to learn about AI and use newer tools like recursive agents and GPT-4 that require less human “supervision”, I see that I was naïve.
While it is possible that the frantic pace of AI development may slow, that seems increasingly unlikely. The largest tech companies in the world are engaged in a massive arms race and are pouring billions into developing Artificial General Intelligence (AGI). In a leaked internal memo, a Google engineer suggests that “basement” developers using open-source models are actually outpacing big tech in AI development. What we are witnessing is the simultaneous development and democratization of a revolutionary technology—and both are happening at lightning speed.
If it is true that Google has “no moat” to protect its business model, neither does higher education. I still think AI will inevitably transform what we do as scholars and teachers—and that we need to learn to use it effectively—but I now think that it’s likely to accelerate pre-existing enrolment trends away from the humanities.
A ChatGPT Assignment
What really changed my thinking was the ChatGPT assignment I had my third-year history students do this past semester. It required students to write a 4,000-word research essay on some aspect of the fur trade using eight secondary sources and three primary sources. A standard research paper. Students could also opt to do a “ChatGPT version,” using the AI as a research copilot to help them refine their questions, improve their keyword search skills, develop a thesis, create an essay outline, and (above all else) edit their papers. At the time, only ChatGPT 3.5 was freely available, so I explicitly told them not to use it to find sources or factual information and to double-check everything—I would, I said, be paying extra attention to citations. They later had the choice to submit a brief reflection on how they used ChatGPT in the assignment.
The vast majority decided to try the AI assignment, which was more than I expected. At the time, I predicted that although “A” students might improve their writing, “B” and “C” students would benefit the most. I recall being pretty optimistic, telling colleagues that I hoped ChatGPT would allow struggling students to “level up” their writing, in effect raising the bar for everyone. I also expected that some students would submit unsourced and error-ridden papers (i.e., fully plagiarized papers) that would inevitably fail.
It turns out I was wrong.
Results
Contrary to my expectations, as far as I could tell none of the students submitted a plagiarized paper. That is the good news. Remarkably, all the papers had the required number of sources and all used footnotes—and as anyone involved in teaching history knows, that never happens.
In general, though, I found very little evidence that AI improved students’ writing or the quality of the papers. In fact, I could find very little evidence of its use at all. In reading the AI-assisted papers, I found the usual spelling mistakes, grammatical errors, and stylistic oddities. The clearest thing was that the text was most definitely not AI-generated.
In reading both the papers and the reflections, it was clear that few students followed the provided guides. I also got the sense that many found the assignment to be a lot more work than just writing a paper the “old-fashioned way”. In the end, I did not see any evidence that ChatGPT improved the class average: it ended up right where it’s been in my third-year courses for the last fifteen years. It is telling, too, that none of the students who achieved an “A” used ChatGPT.
A Pyrrhic Victory
At first glance, this might sound like a victory for the AI doubters: in this assignment anyway, AI was clearly no magic bullet. Nor did it result in widespread plagiarism. So maybe we should just breathe a sigh of relief, shrug, and move on, right? Long live the classic research essay! That was my initial thought, but upon reflection I don’t think it’s the right conclusion to draw at all.
What strikes me most about the papers is that ChatGPT was conspicuous mainly in its absence. It’s remarkable how little evidence I can find of it in the papers, positive or negative. Students seem to have tried it, become intrigued, but ultimately found that it offered only marginal benefits. As tempting as it might be to blame them for not following directions, it’s now clear to me that my assignment was the problem.
Back in February, I tried to create a task that preserved the traditional research essay approach while leveraging GPT-3.5’s strengths to “level up” student writing. But I also specifically forbade students from using the AI to write the papers for them. Although this flowed logically from my concerns about maintaining academic integrity and fairness, in hindsight no one would ever actually use the technology like that in the “real world.”
Ask yourself a simple question: what was so shocking about ChatGPT when you first saw it in action? Almost certainly it was the speed and relative competence of its answers. This is what makes AI so revolutionary after all: it automates the process of research and writing (as well as coding, data analysis, and a range of other things). In drafting the assignment as I did, I explicitly prohibited students from applying AI in the most obvious, useful way. In effect, I told them to use it for everything except the thing that makes it so revolutionary. And I think my students intuitively realized that this was an unrealistic assignment.
The Medium is the Message
In the humanities, I think we have traditionally conceived of formal writing, including both research papers and creative writing, as existing on an idealized spectrum ranging from elementary school “term papers” at one end to published work at the other. Whether this reflects some empirical reality is certainly debatable, but our programs are geared towards progressing students along just such a spectrum, through the BA, MA, and finally to the PhD, literally the “terminal degree.” But I don’t think that AI-generated writing can actually be integrated into this type of laddered curriculum. The papers it writes for a ten-year-old will be as good as those it writes for a PhD student, provided the questions are the same. In effect, it disrupts and flattens the progressive nature of the learning process by allowing users to skip forward somewhere close to the end. Zooming out, I think this is wonderful from an equity perspective as it will democratize the production and dissemination of knowledge. But returning to the confines of the classroom, I am not sure what this form of “writing” actually is…
I have been thinking a lot about Marshall McLuhan’s Understanding Media lately, because I increasingly think we need to see generative AI as a completely different medium. I see it as existing in parallel with things like human-made art, writing, music, and other traditional forms of expression, but it does not, to me, exactly replace them. Human generative work derives its value not only from the time and labour involved in the act of creation, but also from all the training and skills development that preceded it. That is why we tell parents at university recruitment fairs that graduates, regardless of degree, generally earn better salaries: whatever their major, they leave accredited to research, analyze, and write about problems and ideas.
Generative AI is valuable for precisely the opposite reason: it comes pre-trained, which greatly reduces the skill and experience required of the operator while speeding up the creative process exponentially. In effect, it democratizes creative and expressive possibilities while devaluing the time and effort involved in creation. One does not need to be accredited to use AI, and the results are increasingly indistinguishable from those produced by real people with specialized skills. While it is true that AI can’t (yet) generate outputs that are aesthetically better than the work of highly skilled and experienced humans, most human writers and artists don’t earn their living producing books or fine art. So when we realize that very few history students go on to become published authors, but most aspire to careers in which they will be paid to research and write, the problem should begin to come into view.
What IKEA Can Tell Us About AI
The best analogy I’ve found to help me understand what this all means comes from the history of furniture making (probably because I am a hobbyist woodworker: I make mission-style furniture in my garage). There was a time not long ago when all furniture was made by highly skilled craftspeople. While a dresser with mortise-and-tenon joints and dovetailed drawer sides is stunning, building one takes a long time. The results are also inconsistent, varying with the craftsperson’s skill and the quality (and availability) of materials. The shift to mass-produced furniture after the Second World War was enabled by automation and the introduction of composite materials and machined hardware, which made it possible to quickly manufacture furniture that was cheaper, more consistent in appearance, and easier to distribute. As might be expected, it also reduced wages and eliminated jobs.
But that is only one part of the story. Another less visible consequence was the generalization (or democratization) of the once specialized skills of the joiner and cabinet maker. Industrialization during the Second World War allowed for the cheap production of machined fasteners. When these were combined with the simplicity of the Allen key, ordinary people with no training, tools, or knowledge of woodworking could suddenly assemble “good enough” furniture in their own homes. Commodifying the assembly process as well as the furniture itself devalued both the creation process and the output.
This had a less visible, cultural effect as well. As any woodworker can tell you, friends, family, and neighbours will readily send pictures of furniture from online catalogues and ask whether it is possible to “make something like this” out of solid wood. While it is indeed almost always “possible,” the cost of that bespoke desk is usually at least five to ten times greater than its mass-produced equivalent. Few are willing to pay for something like that. And while most people might intuitively know that solid-wood, handcrafted furniture is “better,” many still prefer the fluid design, speed, ease, consistency, and low cost of mass-produced furniture. Just look on Etsy and you will find lots of woodworkers selling handmade solid wood furniture that looks like it was made in a factory—it’s what the market has come to demand.
Conclusions
There are several important parallels here. Generative AI is not only automating but also commodifying the research and writing process. It’s not just the end product that has been cheapened; the process, too, no longer requires much training, which renders accreditation unnecessary. I can’t imagine how this could not change both the demand for humans with those skills and the value we place on related educational processes. To be clear, I don’t think this is a good thing, but it seems inevitable at this point.
Of course, people will still want to study and read history or English, just as people still want to purchase handmade furniture, whether used or bespoke. But the market for those skills will be much more limited. Most people don’t want to pay a skilled craftsperson to produce a Kallax bookshelf for ten times the price. But we also need to acknowledge that the proliferation of AI will almost certainly change consumer tastes and demands too. Although people will experience AI-generated text alongside human-generated text (at least for the foreseeable future), it is a different medium. It is typically shorter, more concise, almost always grammatically and stylistically correct, and bland. It is not hard to imagine that shorter, more succinct, and less variable forms of writing will soon become more desirable than longer, expository, and potentially problematic texts. Consumers like mass-produced furniture because it is predictable and disposable. I expect employers will find the same qualities appealing in AI-generated writing.
And that brings me back to the ChatGPT assignment. It now seems rather pointless to try to meet AI halfway when that inevitably means artificially blunting its real strengths. What would be the goal of doing so? We would neither be preparing students to use AI effectively nor teaching them to think, research, and write independently.
I am sure there are alternative approaches that would have worked better; hopefully I will figure some of them out before next autumn. But I can’t help but think that right now, with GPT-4, there is not much teaching required to get the AI to produce good text. My six-year-old daughter can ask ChatGPT to “write a paper on the fur trade” just as easily as a second-year student (especially with speech-to-text tools like Whisper). As the technology evolves, the inputs necessary to get original, publishable results are getting simpler by the week.
But it seems to me that if we cede that ground to AI, we are going to be left with a very small number of very dedicated students, probably far fewer than our large network of post-secondary institutions was designed to accommodate. Perhaps there will be a good market for “slow” history, and while that may be good for elite institutions, it’s potentially disastrous for smaller schools.
I am struck by the way this highlights a fundamental paradox. In higher education, embracing the technology almost certainly means automating the very skills that are our bread and butter. But if we refuse to do so, we risk delivering a more or less irrelevant curriculum, one that emphasizes hand tools in an age of mass-produced knowledge. In both cases decline seems inevitable; it’s just a question of how steep the slope gets. Either way, we are still going to need to learn to swim with generative AI.