I recently wrote a blog post about generative AI and open education for the BCcampus website. I thought I would repost it here as it represents some of the things we have been considering with regard to open education and generative AI.
As an organization that provides funding and support for open educational resource (OER) initiatives in B.C., we have been considering some of the potential benefits and implications of generative artificial intelligence (AI) tools like ChatGPT and the use of AI-generated content as part of the OER creation process. For example, what guidelines do we provide our OER grantees around the use of generative AI tools in our OER creation grants? Can generative AI tools make creating OER more efficient? Who controls the copyright for content created by generative AI tools, and can you apply an open license to AI-generated content? How accurate is the content created by the current generation of generative AI tools? How has the AI tool been trained, what kind of corpus was used, and what biases may be present in the results because of that training data?
There is no doubt generative tools like ChatGPT hold great potential to save time and effort for OER creators. With a few well-crafted prompts, ChatGPT can generate thousands of words on a subject, create dozens of sample questions that students can use in an open textbook to evaluate their own learning, draft lesson plans and assignments, and write question prompts for asynchronous discussion forums: basically any type of learning material an educator can think of. Who doesn’t like a tool that makes work easier? ChatGPT certainly has the capability to do that for people who create OER.
For open educators working with OER, determining who owns the copyright of AI-generated works is important because, by definition, OER are materials unencumbered by legal restrictions that may prevent the reuse, sharing, redistribution, and adaptation of copyrightable works. While some are using the rise of generative AI to question the validity of copyright itself, who owns the copyright when a work is created by AI is still a gray area, both legally and ethically.
Legally in Canada, it appears AI-generated content does not fall under the Canadian Copyright Act and is therefore not copyrightable and is in the public domain, according to analysis from Victoria Fricke of McGill Law. However, this could change quickly as international laws change and adapt. As Quebec lawyer Tom Lebrun notes, “But because Canada is a little fish in a big copyright pond, many decisions about the legal status of generative AI may be settled abroad.”
Ethically, the situation is more complex. For example, should the person who invests time and energy crafting prompts that generate the output be able to make a copyright claim to it? Do the programmers who created the models and gathered the data that trained the AI have an ownership claim? What about the thousands of authors and artists who originally created works that were used to train the AI? Do they have a claim to some form of ownership over what gets created?
This final point has become an especially contentious issue because the specific sources of the data used to train generative AI are often unknown or are not made transparent by the developers of most generative AI tools. This has led to pushback by many in the visual arts community who see their unique styles mimicked in the outputs of AI-generated art and who argue their copyrighted art has been used without their permission.
For open educators, this runs counter to the very reason we use OER in the first place. Many open educators choose OER because there are legal permissions that allow for the ethical reuse of other people’s material, which the creators have generously and freely made available by applying open licenses to it. The thought of using work that has not been freely gifted to the commons by the creator feels wrong for many open educators and is antithetical to the generosity inherent in the OER community.
The lack of transparency around the source of the training material is also a crucial issue when we talk about using generative AI to create learning materials, as learning materials rely on identifiable sources and validated information. Imagine a textbook that did not cite sources. How would you determine if the content is valid? When you create with ChatGPT, the source of the information is not apparent, and ChatGPT is known to just make up citations when you ask it to cite its work. Attribution, the all-important academic trail back to where information comes from, is a foundational component of ensuring you are working with valid and factual information, and it is missing here. So far ChatGPT is failing miserably at making visible how it knows what it knows.
However, it can be argued this is where knowledgeable humans are needed, and the role of an OER author using generative AI tools becomes more like that of an editor whose primary job is to verify the facts and outputs provided by ChatGPT. Texts created by ChatGPT become the first draft, and humans become the revisers. The human subject-matter expert becomes the filter who judges whether the content is credible and then works to revise it. Perhaps tools like ChatGPT signal the start of a revise revolution in OER creation, where the OER creator starts not by creating content but by crafting the right prompts to generate a first draft, then spends the bulk of their time revising and validating that content.
There is also the issue of bias present in the system. When Fast Company’s Kieran Snyder asked ChatGPT to write performance review feedback for different types of job roles, what ChatGPT came up with was riddled with examples of sexist and racist language. While the exact training data being used to train generative AI is not known, we can safely assume it has been harvested from the internet, a place that contains spaces that are both good and beautiful as well as brutal and destructive. There is an old computing acronym, GIGO, which means garbage in, garbage out. If there is bias present in the training corpus, there will be bias present in the outputs.
Finally, there are two issues I think are imperative to keep in mind as we grapple with the implications of ChatGPT in open education. First, the current tools will not be the future tools. If technology has taught us anything, it is that tools are refined and changed over time, and generative AI tools in the future will look vastly different from what we are seeing today. Search engines, for example, are vastly different in 2023 than they were in 2003, so some criticisms today around issues of accuracy and attribution may not be issues in the future as generative AI becomes more refined.
Second, we cannot ignore generative AI tools and hope they will go away. AI has been slowly working its way into our tech for decades, and generative AI tools are only the latest incarnation. It is new and novel now, so there is heightened attention. But it will not be new and novel forever. Soon these tools will just be part of the knowledge-creation tools we already use. Microsoft has already integrated ChatGPT into its search engine, Bing, and Google is rolling out its own version of generative AI, called Bard, to augment its search engine. It is important we all pay attention to what is happening in the generative AI space and approach these tools like we do any other education technology: with a critical and ethical eye that balances the pros and cons to assess whether the tool helps us achieve our goals as open educators.