Crafting Policies for Generative AI in Research Will Not be Easy for Universities
Why a one-size-fits-all model won’t work
Generative AI caught most universities flat-footed last winter, and the result was something akin to panic. In the intervening months, most of the focus has been on student plagiarism, but universities will soon need to craft policies covering faculty and graduate student research, as well as human resources and administration. That raises a whole new set of issues around ethics, data privacy and security, and academic integrity. But how well do we actually understand the field?
While many universities might prefer to craft a simple, one-size-fits-all policy, that will unfortunately be impossible given the range of generative AI tools already out there and their rapid evolution. While everyone has heard of ChatGPT, few non-experts are familiar with the newer, more specialized tools that will likely prove the most important for researchers in the longer term.
So what are these tools and what sorts of issues do they raise?
Chatbots
Chatbots like ChatGPT are the way most people first learn about and experience generative AI, and so they tend to dominate the discussion. They are general-purpose tools that do a range of things, from parlor tricks to data analysis and visualization, all through a natural language interface. This is what makes them so appealing: you can ask ChatGPT to write you some Python code to reformat a spreadsheet and then immediately ask it for a good guacamole recipe.
Don’t get me wrong: chatbots can do a whole host of useful things that have been amply documented over the past few months, but they also have their limitations, especially when it comes to specialized research. Because ChatGPT is the Swiss Army knife of LLMs, it is unlikely to have the domain-specific expertise necessary to answer the types of complex, specialized questions that publishing academics routinely work on. There are workarounds: you can send it information by uploading text and PDFs (using plugins). But aside from copyright issues, at a practical level the data we use as researchers often comes with ethical or privacy restrictions that prevent us from sharing it freely. That likely includes uploading it to ChatGPT, at least in raw or unredacted form, and so universities and individual researchers will need to consider carefully how they engage with these tools.
But the key takeaway is that despite the hype, chatbots are really an entry-level tool. In the long run, they are unlikely to be the main interface researchers use to harness the power of LLMs.
APIs
This is where Application Programming Interfaces (APIs) come in. APIs allow one computer program to call another over the internet, send it information, and get a response. In the context of generative AI, we can use an API to send information to a model like GPT-4, ask it to do something with that information, and then get the results back. These interactions are also highly customizable: you can give the AI specific instructions, change its temperature setting (which affects the consistency and randomness of the output), and require output in a specific format (like a table). You can also send it customized data and tell it to confine its analysis to that information, limiting hallucinations. Once you learn the ropes, working this way is also a lot faster and more efficient than a chatbot.
Imagine you have a spreadsheet full of hundreds or thousands of textual survey responses that you want to categorize. It would be impractical to cut and paste each one into ChatGPT, but you could write a program in Python that would almost instantly send each response to the GPT-4 API, asking it to categorize them for you according to criteria you specify.
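As a rough sketch of what such a script might look like, assuming the openai Python package’s chat-completion interface (the file name, column layout, and categories here are hypothetical, and the exact call syntax varies by library version):

```python
# pip install openai pandas
import openai
import pandas as pd

openai.api_key = "YOUR_API_KEY"  # placeholder: your own key goes here

CATEGORIES = ["housing", "transit", "safety", "other"]  # example criteria

def categorize(response_text: str) -> str:
    """Send one survey response to GPT-4 and return a single category label."""
    completion = openai.ChatCompletion.create(
        model="gpt-4",
        temperature=0,  # low temperature keeps the labels consistent
        messages=[
            {"role": "system",
             "content": "Classify the survey response into exactly one of: "
                        + ", ".join(CATEGORIES) + ". Reply with the category name only."},
            {"role": "user", "content": response_text},
        ],
    )
    return completion.choices[0].message.content.strip()

# Hypothetical spreadsheet with one free-text 'response' column
df = pd.read_csv("survey_responses.csv")
df["category"] = df["response"].apply(categorize)
df.to_csv("survey_responses_categorized.csv", index=False)
```

A few dozen lines like this can label thousands of responses in minutes, which is exactly the kind of scale a chat window cannot handle.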
APIs are also what power the tools that let you ask an AI questions about a customized library of PDFs and other textual documents. This ability to seed the model with relevant, specific information gives LLMs the domain-specific expertise that ChatGPT lacks. While writing code might sound daunting, third-party apps that handle these tasks for you are rapidly becoming available.
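Under the hood, most of these document question-answering tools follow the same basic pattern: split the documents into chunks, index them, retrieve the chunks most relevant to a question, and send only those to the model along with the question. Here is a minimal sketch of that pattern, with hypothetical file names, using OpenAI’s embedding and chat endpoints (real applications rely on dedicated libraries to do this more robustly):

```python
import numpy as np
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

def embed(texts):
    """Return one embedding vector per input string."""
    result = openai.Embedding.create(model="text-embedding-ada-002", input=texts)
    return np.array([item["embedding"] for item in result["data"]])

# 1. Chunk your own documents (here, naively split by blank line).
chunks = open("my_research_notes.txt").read().split("\n\n")
chunk_vectors = embed(chunks)

# 2. Embed the question and find the most similar chunks (cosine similarity).
question = "What did respondents say about transit funding?"
q_vec = embed([question])[0]
scores = chunk_vectors @ q_vec / (
    np.linalg.norm(chunk_vectors, axis=1) * np.linalg.norm(q_vec)
)
top_chunks = [chunks[i] for i in np.argsort(scores)[-3:]]

# 3. Ask GPT-4 to answer using only the retrieved context.
answer = openai.ChatCompletion.create(
    model="gpt-4",
    temperature=0,
    messages=[
        {"role": "system",
         "content": "Answer using only the provided context. If the answer is not there, say so."},
        {"role": "user",
         "content": "Context:\n" + "\n---\n".join(top_chunks) + "\n\nQuestion: " + question},
    ],
)
print(answer.choices[0].message.content)
```

Telling the model to answer only from the supplied context is also part of what keeps hallucinations in check.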
Many of these API models can also be fine-tuned (given additional training) to do the tasks you want them to do. In essence, this allows researchers to take a base model like GPT-3 and build on it, creating new models tailored specifically to their needs.
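To give a sense of what fine-tuning involves in practice, here is a sketch assuming OpenAI’s fine-tuning workflow for its GPT-3 base models, which expects a file of prompt/completion pairs; the examples and file names are purely illustrative:

```python
import json
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# 1. Prepare training examples drawn from your own research domain (illustrative only).
examples = [
    {"prompt": "Classify this archival record: 'Minutes of the 1952 council meeting...' ->",
     "completion": " municipal governance"},
    {"prompt": "Classify this archival record: 'Letter from a parish priest, 1948...' ->",
     "completion": " religious correspondence"},
]
with open("training_data.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# 2. Upload the file and start a fine-tuning job on a GPT-3 base model.
upload = openai.File.create(file=open("training_data.jsonl", "rb"), purpose="fine-tune")
job = openai.FineTune.create(training_file=upload.id, model="davinci")
print(job.id)  # once training finishes, the custom model can be called like any other
```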
I suspect that as these tools become more commonplace and accessible, academics will find them much more useful than chatbots. But they are also going to create the most headaches for universities, specifically around research ethics, data privacy and security, and copyright. Will research ethics boards (REBs) allow researchers to send raw survey data, as in the example above, to an API for analysis? It will likely depend on the nature of the data as well as the API’s specific terms of service, which are not fixed but change regularly. Some APIs (such as Microsoft Azure’s, which offers a version of OpenAI’s GPT-4) are specifically designed with privacy and data security in mind, while others are not.
Copyright is another major issue that is far from clear-cut. Do educational and fair dealing exemptions apply to generative AI? That will be for the courts to decide. I am no expert, but I suspect that sending GPT-4 an entire book or journal article would not be considered fair dealing. But what about short excerpts, which can normally be communicated for research purposes? I think researchers could make a good case that those same exemptions apply here. Whether a model can lawfully be fine-tuned with data taken from books and articles is also still untested.
The point is that evaluating REB applications like this will require significant knowledge and expertise in a range of highly specialized areas that may be unfamiliar to many REB members and administrators.
Meet Your Local Llama
The most important recent development, though, will ultimately prove to be local llamas. In-house LLMs are open-source or licensed generative AI models that reside on local computers, which may or may not be connected to the internet. Stable Diffusion is an early example of a local model, but the space is now dominated by models created by Meta. In February, the company formerly known as Facebook released Llama to researchers under a non-commercial license that allowed them to run the LLM locally. The catch was that you needed an expensive computer to make it work. But within days the model was leaked, and independent researchers soon found a way to make it run on much more affordable, consumer-grade computers. Last week, Meta changed the rules again by releasing Llama 2, a more powerful, more robust model that has taken the DIY AI world by storm.
Meta’s intention seems to be to steal OpenAI’s thunder by getting people to use LLMs on smaller devices. Although it was created by a major tech company, which might make you skeptical, the model runs entirely independently once downloaded, even on a computer that is not connected to the internet. And that is the key point: local LLMs can do much of what APIs can do, but without the same concerns for data privacy, security, and copyright. For example, while it might be problematic to upload sensitive data to an API, you could use a local LLM to analyze that data on your own machine with roughly the same level of risk you’d have using Microsoft Excel.
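To make the contrast concrete, here is a sketch of what running a model like Llama 2 locally can look like, assuming the llama-cpp-python package and a model file already downloaded to your own machine (the file path and text are placeholders); nothing in this workflow leaves your computer:

```python
# pip install llama-cpp-python
from llama_cpp import Llama

# Load a locally stored Llama 2 chat model (hypothetical path); no internet connection required.
llm = Llama(model_path="./models/llama-2-7b-chat.bin", n_ctx=2048)

sensitive_text = "Interview transcript: the participant describes their housing situation..."

output = llm(
    "Summarize the main themes in the following interview excerpt:\n\n" + sensitive_text,
    max_tokens=256,
    temperature=0.2,  # keep the summary focused and consistent
)
print(output["choices"][0]["text"])
```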
Most importantly, Llama 2 is also licensed for commercial use, and developers are now racing to get the first local Llama applications into production. It’s a certainty that within a few months researchers will be able to purchase off-the-shelf software that lets them run a secure LLM on their own computers. While these tools are still forms of generative AI, they are very different from chatbots and APIs in terms of their implications for security, privacy, and copyright. Will that be understood, though?
Conclusion
This fall, as universities begin to draft policies on generative AI, it will be important to consider each type of tool separately. While most people are familiar with chatbots, universities will need to look beyond them to the emerging tools that will ultimately prove most useful to researchers. This will be difficult because these are still esoteric tools that even people with a background in tech may know relatively little about. The real risk is that in the face of this complexity, universities will take the easy way out, as some have already done with ChatGPT, and move to ban or severely restrict the use of generative AI. Finding the expertise and knowledge base necessary to develop guidelines that protect academic integrity, privacy, and copyright while still allowing researchers to harness the power of AI will not be easy.