Microsoft Copilot Necessitates Some Tough Conversations
The integration of generative AI directly into word processors will give rise to a new, acceptable form of synthesized AI-human writing, and universities are not prepared.
For more than a year now, people have been writing about how ChatGPT will make universities rethink the way we teach and assess our students. But while most institutions have devised some sort of policy around generative AI, they've also tended to put off the tough conversations about what we do and why. I think this was feasible while "generative AI" lurked behind foreboding external logins, requiring people to consciously open an OpenAI account in order to use those tools. But I don't think that is possible anymore.
On 15 January, Microsoft finally made Copilot for Office available to individuals and families, including students. Google is doing the same, integrating Gemini into its own productivity suite. Since Microsoft announced (somewhat prematurely) the integration of generative AI into Office last year, I've been anxious to see it in action. I've argued that these integrations will be what actually makes AI usage ubiquitous. Now that AI writing is just a button click away in Word or Google Docs, it will quickly become as ordinary and mundane as spell check. From here on out, anyone still hoping to "catch" AI writing, avoid it, or pretend it isn't happening is going to be out of luck.
My goal in this post is to show how Copilot works in Office and to explain why its very banality will be the thing that makes us actually begin to redefine concepts such as authorship, authenticity, and plagiarism, as well as reassess the utility of written assessments. This also means that all those policies everyone drafted last fall are going to have to change.
Introducing Copilot
First of all, let's talk about the name (because it's confusing) and then what the tool actually does. Copilot is now Microsoft's branded interface for all of its AI tools and applications, which rely on OpenAI's GPT-3.5, GPT-4, and GPT-4 Turbo to do their thing. Bing Chat has been renamed Copilot (RIP Sydney) and is basically a streamlined, more constrained version of ChatGPT, with a similar web-based chat interface and free and paid (Pro) tiers. Here is the confusing part: Copilot is also the name given to Microsoft's AI tools in its productivity software, both in an integrated chatbot form and as standalone widgets that co-exist with the chart creators and image-insertion tools you'll be familiar with. Basically, in Microsoft products, Copilot now just means AI. Google is doing a similar thing, rebranding Bard (its AI chatbot) as Gemini, which is also being integrated into apps like Google Docs.
Purchasing or activating Copilot is a bit confusing too. It's been available to enterprise and business users for months, but now individuals can access it by purchasing a subscription to Copilot Pro for $27.00 CAD per month. This subscription includes unlimited and priority access to the Copilot chatbot and image generator, powered by Microsoft's fine-tuned versions of GPT-4 and GPT-4 Turbo, and by DALL-E 3. Once subscribed to Copilot Pro, users can enable Copilot in both offline and web-based Office apps (Word, PowerPoint, Excel, Outlook, etc.) through their Microsoft account page.
Using Copilot in Word
Copilot integrates generative AI into Word in three main ways: it adds a chatbot widget similar to ChatGPT into Word itself; it offers an editing tool that rewrites existing sentences or creates tables; and it provides a generation function that proactively and earnestly offers to draft whole documents whenever you open a document, or to write new paragraphs whenever you start a new line of text. The functionality of these tools overlaps somewhat and can be customized through dropdown menus to select different writing styles or to change the content via natural-language prompting (example prompts are provided).
Copilot the Chatbot
The chatbot interface will be the most familiar of the tools in that it is exactly what we have come to expect from generative AI chatbots, only this one lives in Word. A Copilot button on the Home tab of the toolbar opens a sidebar chat window where Copilot will “chat, respond to your questions, and help you with writing and summarizing [the active] document.”
In practice, this means you can ask GPT-4 to do tasks or answer questions about your active document: provide a summary, generate a list of action items, or answer things like "how many times did I use the word 'copilot' in my document?" It can also show you how to do things like insert a footnote or a table of contents, but it can't actually act as an agent and do those things for you, which is annoying. Personally, this was really the only thing I was hoping for from Copilot: it would be nice to be able to cut and paste in formatting guidelines and have Copilot set line spacing, margins, tabs, and font size automatically. That will surely come, but sadly not yet.
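In the meantime, that kind of mechanical formatting can at least be scripted outside of Word. Here is a minimal sketch using the python-docx package; the file names and the specific values are my own placeholders, standing in for whatever a style guide might demand:

```python
# Minimal sketch of the "apply these formatting guidelines" step that
# Copilot can't yet do itself, using the python-docx package.
# The file names and specific values are placeholders.
from docx import Document
from docx.shared import Inches, Pt

doc = Document("paper.docx")

# Margins are set per section in python-docx.
for section in doc.sections:
    section.left_margin = Inches(1)
    section.right_margin = Inches(1)
    section.top_margin = Inches(1)
    section.bottom_margin = Inches(1)

# Font size and line spacing via the default paragraph style.
style = doc.styles["Normal"]
style.font.size = Pt(12)
style.paragraph_format.line_spacing = 2.0  # double-spaced

doc.save("paper_formatted.docx")
```

What I'm hoping Copilot eventually offers is essentially a natural-language front end on exactly these kinds of property assignments.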
Copilot seems to automatically assume that any questions you ask it are about the active document. This means that while Word has normally been thought of as a production tool, you can now use it to analyze and understand documents created by third parties, including journal articles, reports, and notes. You can, of course, also ask it to do "normal" ChatGPT things, like explain how to cite sources in Chicago style or provide a list of five sources on the history of Canadian Confederation. When it uses external sources, it tells you that the answer did not come from the document. There is also a "Copy" button in the response window that allows you to easily put any text from the chat window into your Word document.
Clearly, the chatbot is meant to encourage users to edit, add to, change, and interact with large documents, something that would previously have required a lot of cutting and pasting into an external tool like ChatGPT. Unlike external editors, though, Microsoft guarantees that "Copilot for Microsoft 365 is compliant with our existing privacy, security, and compliance commitments to Microsoft 365 commercial customers, including the General Data Protection Regulation (GDPR) and European Union (EU) Data Boundary" and that "prompts, responses, and data accessed through Microsoft Graph aren't used to train foundation LLMs, including those used by Microsoft Copilot for Microsoft 365." In other words, this is going to make it both easier and safer for businesses and ordinary people to do things with AI involving sensitive documents or private information.
Copilot the Editor
Copilot's in-line suggestions in the editor are designed to be intuitive, working much like familiar tools such as autocorrect, grammar check, and spell check. When a user selects text, a small Copilot icon appears in the left-hand margin, offering options to rewrite the selected text or generate a table from it. If you click "rewrite," a widget window appears with three different AI-generated versions of the sentence to choose from, as well as options to regenerate the text in various writing styles, including casual, professional, and imaginative. Users can replace the existing text with one of these AI-generated options, paste the AI text below the original, or hit cancel.
Generating tables works in a similar way. Users can select a paragraph and the AI will essentially try to visualize it as a table, rewriting the key information to fit into rows and columns. Obviously, this works better for some types of text and data than for others, but it is surprisingly versatile and adaptable. A small window allows users to modify the table through natural language, asking the AI to do things like "remove the top row," "merge columns," or "add a row."
Copilot the Writer
Copilot's generation function will be a bit scary for anyone who writes for a living. With Copilot enabled, whenever you open a new Word document, and before you start typing, a small window captioned "Draft with Copilot" appears, containing a small textbox. It actively encourages users to explain what they are planning to do in this new, blank document. And if you type something like "I need to write an essay on the origins of the First World War," it thinks for a moment and then generates a cogent paper, complete with headings and a space to insert your name. You then have the option of accepting the AI-generated text as is, regenerating it in a different style, or modifying it with natural-language commands like "change the bullet points into paragraphs" or "add another paragraph to the conclusion."
But here is where it gets really frightening. This same generation function is also available whenever you start a new paragraph or hit enter to begin a new line, allowing users to expand and build a paper a few paragraphs at a time. You can also ask it to do things like "add 1,000 words" or simply add another paragraph (or ten), quickly expanding an 800-word paper into a 5,000-word essay. I tried this with the paper on the origins of the First World War and had a half-decent B paper assembled in a few minutes. It's also important to note that this same functionality allows users to open an existing paper, or copy and paste text in from a variety of sources, and have Word automatically rewrite it, all while keeping any existing references intact and attached to the correct parts of the text.
While it has always been possible to do something similar in ChatGPT, integrating AI generation directly into Word makes the process more intuitive as well as much simpler and faster. Previously, one would have had to do a lot of copying, pasting, and convoluted prompting between a Word document and ChatGPT. Not anymore.
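To make that contrast concrete, here is a minimal sketch of what the pre-Copilot round trip looked like when scripted rather than done by hand. It assumes the python-docx and openai packages; the file names, model choice, and prompt are placeholders of my own, not anything Microsoft ships:

```python
# A rough sketch of the pre-Copilot workflow: pull the text out of a
# .docx file, send it to a chat model, and write the result back.
# File names, model, and prompt are placeholders.
from docx import Document
from openai import OpenAI

doc = Document("essay_draft.docx")
draft = "\n".join(p.text for p in doc.paragraphs)

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": "You are an editor. Rewrite the draft in a professional style."},
        {"role": "user", "content": draft},
    ],
)

# Save the rewritten text as a new document.
new_doc = Document()
for paragraph in response.choices[0].message.content.split("\n"):
    new_doc.add_paragraph(paragraph)
new_doc.save("essay_rewritten.docx")
```

Copilot collapses this entire loop into a single click inside the document itself.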
Implications of Copilot in Word
Together, Microsoft and Google control 96% of the market for office productivity tools. So make no mistake about it: Copilot and Gemini will quickly normalize a new, synthetic form of AI-human writing whether we like it or not. Of course, people have already been doing this for some time with ChatGPT, but mainly in secret, in large part because it clearly fits traditional definitions of plagiarism. But when a respected, buttoned-down company like Microsoft says it's OK, this will simply become part of how people write, and those definitions will quickly change. Again, I am not saying this is a good thing, but from a purely practical point of view, if you've been running from AI, you've just hit a dead end.
For just over a year now, academics and university administrators have been wringing their hands about how to respond to generative AI. Most institutions have come up with well-meaning guidelines that try to address immediate issues like plagiarism, privacy, and information security while avoiding larger existential questions about the utility of teaching and evaluating writing in the AI era. In effect, we have been trying to adapt existing policies to encompass AI writing, pretending that it does not represent something wholly different from both a technical and a cultural point of view. But Copilot and Gemini blur all the lines that universities have been trying to draw around the use of generative AI, because they are something new. Now we have to contend with them.
At my own institution, as at many universities, we have told students that they can only use generative AI tools if their instructor allows them to do so; if a tool is not explicitly allowed in the syllabus, using it constitutes academic misconduct. This might have seemed logical in October, but it won't work now. Word is now the most capable generative AI writing tool available. Keep in mind that students can buy Word at a discount from the university and that our email system runs on Microsoft Outlook; yet under our own rules, simply using that program could now be considered academic misconduct.
According to our guidance, students must also explicitly cite anything they create with generative AI, but what does that mean for text generated, at least in part, by Word itself? How much AI text is too much, and how little is acceptable? If a student right-clicks a grammatical suggestion in Word and accepts the revision, that is fine. But what if the exact same revision is suggested via Copilot? Even more to the point, what happens when Copilot, spell check, and grammar check are integrated into one universal proofing tool, as they almost certainly will be? We have to ask ourselves: what exactly are we trying to control, and why?
These are all highly technical and pedantic questions, but they illustrate a larger problem: we will either need to constantly redraft policies in order to micromanage an ever-growing array of AI tools and use cases, or we will need to adjust our definitions of authorship, plagiarism, and authenticity. The first option seems like a really depressing and pointless battle, but I have no idea how to go about the second. What I do know is that we can no longer avoid the tough conversations.
Hi Mark!
Thanks for posting this piece. We have been having conversations at my own institution about this very thing for the past year or so. Same concern: GAI will become ubiquitous once it is seamlessly integrated into word processors and other standard office software; we need to prepare for this change.
I also signed up for Copilot Pro and integrated it with my personal Microsoft 365 account. Here are my observations of the product right now:
1. Copilot for Microsoft 365 is still very much a beta product. It is somewhat stapled onto the existing suite of Office applications (currently just Word, Excel, PowerPoint, OneNote, and Outlook), and its functionality and efficacy vary depending on which program you're using. Excel's version of Copilot is quite limited, especially with natural-language data analysis, and nowhere near as sophisticated or useful as ChatGPT Plus.
The Copilot integrations are extremely buggy. The writer function in Word breaks repeatedly, throws up error messages, and sometimes just quits mid-sentence. But if you hammer away at it with prompts, you can get it to do much of what you ask. Here's a video of me writing a 2,000+ word essay on the development of the welfare state in Canada in less than 15 minutes: https://youtu.be/htaX9qZR_e8?si=cpaehJbp8HtjgHh1
It’s impressive, but still limited. With that said, this is the worst this technology is going to get. It will improve with time.
2. The references are unreliable. In my tests writing AI-generated history essays, I struggled to get Copilot in Word to generate a single real citation. I got some real historical quotes, but no accurate references: there were real journal titles, but the articles and other citation details were fake. This continues to be a severe limitation of this kind of software, and I don't know how soon the ability of GAI to generate footnotes will improve. Microsoft still hasn't enabled the ability to add comments to footnotes, so I'm not holding my breath just yet.
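If you wanted to triage Copilot's citations rather than checking each one by hand, something like this rough sketch could flag obvious fabrications. The Crossref REST API endpoint is real and public; the matching heuristic, threshold, and example titles are just my placeholders:

```python
# Rough sketch: sanity-check AI-generated references against the public
# Crossref REST API. The similarity heuristic and example citations are
# placeholders, not a vetted method.
import requests
from difflib import SequenceMatcher

def looks_real(title: str, threshold: float = 0.85) -> bool:
    """Return True if Crossref has a work whose title closely matches."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.title": title, "rows": 1},
        timeout=10,
    )
    items = resp.json()["message"]["items"]
    if not items:
        return False
    found = " ".join(items[0].get("title", [""]))
    return SequenceMatcher(None, title.lower(), found.lower()).ratio() >= threshold

for title in [
    "The Welfare State in Canada: A History",          # placeholder
    "Completely Fabricated Article on Confederation",  # placeholder
]:
    print(f"{'ok' if looks_real(title) else 'SUSPECT'}: {title}")
```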
3. If you watch the video of my welfare state essay, you'll see much of what you describe in your article. I can insert paragraphs, ask for particular details to be added, and expand the piece into something more complex. As I worked on this, a colleague asked me a key question: would you be able to generate such effective prompts if you didn't already have prior knowledge of the subject matter? I should probably try writing something on a topic I know nothing about. That would, at the very least, severely limit my ability to catch factual errors. For example, in an essay on the Fall of New France that I tried writing with Copilot, the first draft said that the Seven Years' War was fought between Britain, France, and India.
4. Writing with Copilot in Word reminds me of vacuuming my living room with a robot vacuum. I have to move the furniture, clear cables and cords off the floor, check for cat toys that might have rolled under the couch, and then do a little sweeping afterward to catch the areas the robot missed. At a certain point, writing with Copilot can start to feel more like just writing the thing myself. The line can get blurry, but from the perspective of academic misconduct, the work required to achieve higher-order thinking and analysis with Copilot starts to exceed the work of simply writing the essay without it.
I think this is where GAI is driving those of us who teach in higher education to rethink assessment. As you say, we will need to have difficult conversations about writing as a method of assessment. Some types of assignments may no longer make sense, and we may have to push for assignments that demand higher-order thinking and analysis. For now, we can start by asking for research sources and citations.
Please keep up your writing on this subject. I’m reading with great interest.
Hi Mark (and Sean), thanks very much for this discussion.
I have a "newbie" question. If I were to write an original research paper in Word with Copilot enabled, would that mean my writing gets added to Copilot's "memory," so to speak, and that my own in-progress research findings would become publicly available?
Case in point: I downloaded Copilot Pro this morning to test it out in Word, and later started to draft a new research article in which I was going to put the results of my recent research, but that little Copilot cursor was blinking at me. It certainly gave me pause.
Does enabling Copilot in MS Office make our data less secure than before? What are your thoughts? Thanks in advance.