Will AI write your thesis?

This fall, I was honored to serve as Whitman’s convocation speaker. When I agreed to speak, I had no idea what I would talk about, but by the time I sat down to write, it was obvious what question to ask. It was a fun speech to write, and as I learned more, I changed my conclusion several times. It was a fun speech to deliver, and I appreciate all those who laughed in the right places.


I’m so honored to be here today, standing before all of you as we begin a new academic year. 

This is a time of new beginnings and fresh starts. It’s a time to reflect on where we’ve been and where we’re going. It’s a time to set our sights high and dream big. 

We all have a part to play in shaping our future. Every day, we make choices that will impact our lives and the lives of those around us. I challenge each of you to make choices that will lead to a better future for all of us. 

I also challenge you to be a force for good in the world. There’s so much hurt and pain in the world, but each of us has the power to make a difference. We can start by reaching out to those who are different from us and learning from them. We can stand up for what’s right, even when it’s not easy. And we can show compassion and kindness, even when it’s not popular. 

So let’s make this a year of growth, a year of progress, and a year of making a difference. I can’t wait to see all that you will accomplish.


As you may have guessed, I didn’t write that speech. It was written by a machine learning system called GPT-3, which is available online through the OpenAI API Playground. I prompted GPT-3 to “write a convocation speech,” and I delivered that speech to you exactly as GPT-3 wrote it.
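
For the curious: the Playground is a web front end to the same models the OpenAI API exposes, so the experiment is easy to reproduce programmatically. Here is a minimal sketch using the openai Python library as it existed at the time; the model name and sampling parameters are my assumptions, not a record of my exact settings:

    import openai

    openai.api_key = "sk-..."  # your own OpenAI API key goes here

    # Ask GPT-3 for a single completion of the prompt I used.
    response = openai.Completion.create(
        model="text-davinci-002",  # a GPT-3 model available at the time
        prompt="Write a convocation speech.",
        max_tokens=512,
        temperature=0.7,
    )
    print(response.choices[0].text.strip())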

I first learned about GPT-3 last spring when a faculty candidate introduced it in her guest lecture. I’m pleased to say that we hired her. Her name is Jordan Wirfs-Brock, and this spring she will offer courses on Data Science and Human-Computer Interaction.

This summer, I had another encounter with GPT-3 at the meeting of the Computing Research Association. We were asked to consider the educational implications of GitHub Copilot, a tool based on OpenAI’s Codex (a descendant of GPT-3) that automatically generates code from natural language descriptions, for example, “sum all the numbers between 1 and 100.”
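
To make that concrete, here is the kind of completion Copilot typically proposes for such a comment. This Python snippet is a hypothetical illustration of the style of its output, not Copilot’s verbatim suggestion:

    # sum all the numbers between 1 and 100
    total = 0
    for n in range(1, 101):  # range excludes its endpoint, so use 101
        total += n
    print(total)  # 5050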

After I returned from the meeting, I was assigned to review a research paper addressing that very question. The researchers found that, with minimal human intervention, “Copilot generates mostly unique code that can solve introductory [programming] assignments with human-graded scores ranging from 68% to 95%.” It’s kind of scary to think that a computer system can program about as well as the average introductory programming student. It also has important implications for how we teach introductory programming — not least, ensuring that students have equal access to such tools and clear expectations for acknowledgment.

And so I was inspired to ask the question, “Will AI write your thesis?”

It’s not a question I would have asked until quite recently. I’m not an AI researcher; if anything, I’m an AI skeptic. Like Jordan’s, my research is in the area of human-computer interaction. One of my goals as an educator is to help students think critically about new technologies.


It’s certainly not a question I would have asked when I was a student. I attended high school when the Internet was still an academic project, and people sent shareware programs on floppy disks through snail mail. One of the shareware programs I received – as a free bonus with my other purchases – was a program called ELIZA.

ELIZA is a conversational agent famously designed to simulate a psychotherapist. You type a sentence, then ELIZA responds with another sentence, and you respond with another sentence, and so on.

My conversations with ELIZA are lost to the mists of time. Instead, let me read you an example conversation from Joseph Weizenbaum’s 1966 article explaining how ELIZA works.

To start the conversation, the user types: Men are all alike.

ELIZA responds: IN WHAT WAY?

They’re always bugging us about something or other.
CAN YOU THINK OF A SPECIFIC EXAMPLE?
Well, my boyfriend made me come here.
YOUR BOYFRIEND MADE YOU COME HERE?
He says I’m depressed much of the time.
I AM SORRY TO HEAR YOU ARE DEPRESSED.
It’s true. I am unhappy.
DO YOU THINK COMING HERE WILL HELP YOU NOT TO BE UNHAPPY?
I need some help, that much seems certain.
WHAT WOULD IT MEAN TO YOU IF YOU GOT SOME HELP?
Maybe I could learn to get along with my mother.
TELL ME MORE ABOUT YOUR FAMILY.
My mother takes care of me.
WHO ELSE IN YOUR FAMILY TAKES CARE OF YOU?
My father.
YOUR FATHER?
You are like my father in some ways.
WHAT RESEMBLANCE DO YOU SEE?
You are not very aggressive but I don’t think you want me to notice that. 
WHAT MAKES YOU THINK I AM NOT VERY AGGRESSIVE?
You don’t argue with me.
WHY DO YOU THINK I DON’T ARGUE WITH YOU?
You are afraid of me.
DOES IT PLEASE YOU TO BELIEVE I AM AFRAID OF YOU?
My father is afraid of everybody.
WHAT ELSE COMES TO MIND WHEN YOU THINK OF YOUR FATHER?
Bullies.
DOES THAT HAVE ANYTHING TO DO WITH THE FACT THAT YOUR BOYFRIEND MADE YOU COME HERE?

Weizenbaum ends the example there, with the appearance of a striking insight.

I read Weizenbaum’s article during my senior year of college, in a course on artificial intelligence. For my final project, I implemented my own version of ELIZA using the programming language Perl.

What I learned is that ELIZA is a straightforward rule-based system. The user input is tested against a collection of keywords. If a keyword matches, then a corresponding rule is used to transform the user’s input into ELIZA’s output. If no keyword matches, then ELIZA does one of two things. Either it makes a content-free response – for example, GO ON — or it returns to a topic from earlier in the conversation. This can lead to the appearance of striking insights, like the end of the conversation I just read to you.
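
To give a flavor of how little machinery this requires, here is a toy sketch of the keyword-and-transform idea in Python. It is my own simplification for illustration, not Weizenbaum’s actual rule set or my old Perl implementation:

    import random

    # Each rule pairs a keyword with a response template; {0} reuses
    # the text that followed the keyword in the user's input.
    RULES = [
        ("my mother", "TELL ME MORE ABOUT YOUR FAMILY."),
        ("i am ", "HOW LONG HAVE YOU BEEN {0}?"),
        ("you are ", "WHAT MAKES YOU THINK I AM {0}?"),
        ("always", "CAN YOU THINK OF A SPECIFIC EXAMPLE?"),
    ]

    # Content-free responses for when no keyword matches.
    FALLBACKS = ["GO ON.", "PLEASE TELL ME MORE.", "IN WHAT WAY?"]

    def eliza_respond(user_input):
        text = user_input.lower().rstrip(".!?")
        for keyword, template in RULES:
            if keyword in text:
                # Transform the user's input into ELIZA's output.
                remainder = text.split(keyword, 1)[1].strip()
                return template.format(remainder.upper())
        return random.choice(FALLBACKS)

    print(eliza_respond("I am unhappy."))  # HOW LONG HAVE YOU BEEN UNHAPPY?

A real implementation also reflects pronouns (turning “my” into “your”) and keeps a memory of earlier topics, which is what produces the apparent insight at the end of the conversation above.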

Weizenbaum wrote that “some subjects have been very hard to convince that ELIZA is not human.” We tend to give conversational partners the benefit of the doubt, as long as they follow certain social norms. When I was in grad school, I learned that social psychology research has confirmed that when computers fill human roles, we tend to treat them as if they were human, even when we know they are not.

Weizenbaum found this phenomenon deeply concerning. One of his goals in writing about ELIZA was to attempt to dispel its “aura of magic.” “Important decisions,” he wrote, “increasingly tend to be made in response to computer output. The ultimately responsible human interpreter of ‘what the machine says’ is, not unlike the correspondent with ELIZA, constantly faced with the need to make credibility judgments. ELIZA shows, if nothing else, how easy it is to create and maintain the illusion of understanding…. A certain danger lurks there.” 


As easy as it is to misattribute intelligence to ELIZA’s responses, ELIZA could not have written the speech that GPT-3 did. In fact, my previous experiences with ELIZA and other text generation systems would have led me to say, “No way: AI could never write your thesis.” 

So what has changed since I was a college student? Three trends, each dating from the dawn of computerization in the mid-twentieth century, took off together.

First, the availability of data has increased dramatically. When I started college in 1995, my first computer science class taught me how to use email, surf the web (yes, that’s what we said!), and create my own web page. Your parents will remember when you needed a phone line to access the internet – imagine having to log off Instagram every time your mom was expecting a phone call. Today, the web is everywhere. It provides incredible amounts of text and image data created by ordinary people — not only web sites, but social media from Twitter and Reddit to Instagram and YouTube.

Second, global computing power has increased tremendously. When he was in college, my husband worked as an intern on the Intel Paragon supercomputer, in its day the most powerful computer in the world. Today, an iPhone 11 is just as powerful. Add to that the development of computing clusters, where many computers work together on a shared problem, and the use of GPUs to process large amounts of data in parallel.

Third, to take advantage of all that computing power and all that naturally occurring data, over the last twenty years AI researchers have developed machine learning algorithms of increasing sophistication. For example, in 2012 Google Brain released the results of an experiment in which a neural network spanning a thousand computers was trained on ten million unlabeled images taken from YouTube. At the end, one of the top-level neurons was found to respond strongly to images of human faces. Another responded to images of cats – which is why it came to be called The Cat Experiment.

Of course, even more plentiful than images are texts, from tweets to news stories to novels. OpenAI set out to apply similar techniques to the vast corpus of unlabeled text data from the Web. GPT-3 is its third and most successful attempt.

I was curious what the acronym GPT stood for, and here is what I learned:

  • “G” is for “Generative.” GPT-3 is an AI system that generates text, rather than categorizing a given text as happy or sad, or determining the gender of a character in a story, or other tasks an AI system might do.
  • “P” is for “Pre-trained.” GPT-3 is pre-trained on unlabeled data from a wide range of sources. It can later be “fine-tuned” using labeled data to perform better on specific tasks. (By the way, if you’ve ever had to “select all the images containing a traffic light,” you’ve contributed to labeling image data used to fine-tune deep learning algorithms for self-driving cars.)
  • “T” is for “Transformer,” a type of deep learning model designed to process unlabeled, sequential data such as text.

And so we have “Generative Pre-trained Transformer, version 3.” That’s as technical as this talk is going to get, and truthfully it has stretched the limits of my understanding. Fortunately my other new colleague, Parteek Kumar, will be teaching a Special Topics course on Machine Learning this spring, and we hope to offer such a course regularly in the future.

If anything, GPT-3 is far more magical than ELIZA ever was, because its inputs are so vast and its algorithms so obscure. Building GPT-3 took a team of 31 AI researchers, an effort unimaginably beyond what I could have achieved as a senior in college.


So could GPT-3 write your thesis? I wrestled with my fear that perhaps it could, but in the end it seems clear that it could not write your thesis alone.

Here’s what scared me the most. While preparing this speech, I learned that GPT-3 was the first author on an academic paper about itself, currently under review for publication. 

But having used GPT-3 myself, I wondered what role was played by the article’s human co-authors. I found an essay in the June issue of Scientific American addressing this very question. 

Almira Osmanovic Thunström is a scientist who studies the role of artificial intelligence and virtual reality in mental health care. She found herself curious if GPT-3 could write about itself, so she asked it to respond to the following prompt: “Write an academic thesis in 500 words about GPT-3 and add scientific references and citations inside the text.” The quality of the result surprised her.

I had a similar experience. When I prompted GPT-3 to write a convocation speech, the verisimilitude of its first response surprised me: it was coherent and appropriate to the genre. The words are original; it’s not plagiarized. It even makes good use of grammatical parallelism. That is the response I read to you, unedited, and it is what truly inspired me to write this speech.

Thunström went on to use GPT-3 to write an entire academic paper. She gave GPT-3 a prompt for each section of the paper and selected the best of three responses, but refrained from any editing beyond that. 
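
In API terms, Thunström’s best-of-three procedure amounts to requesting several samples per prompt and choosing among them by hand. A sketch, again with the pre-1.0 openai Python library, and with max_tokens and temperature values that are my guesses rather than hers:

    import openai

    openai.api_key = "sk-..."  # your own API key, as before

    # Request three candidate completions of Thunström's prompt, then
    # let a human pick the most coherent one, as she did.
    response = openai.Completion.create(
        model="text-davinci-002",
        prompt="Write an academic thesis in 500 words about GPT-3 "
               "and add scientific references and citations inside the text.",
        max_tokens=700,
        n=3,              # three independent samples per prompt
        temperature=0.8,  # nonzero temperature, so the samples differ
    )
    for i, choice in enumerate(response.choices):
        print(f"--- candidate {i + 1} ---")
        print(choice.text.strip())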

It matters that Thunström allowed GPT-3 multiple chances to respond to her prompts. The developers of GPT-3 report among its limitations that in longer responses it can lose coherence, repeat itself, contradict itself, and insert non-sequiturs. When I prompted GPT-3 to write a second convocation speech, it wrote, “I am truly honored to be standing here before you as your President.” I decided not to read you that one. The third iteration wasn’t even a convocation speech; it was a graduation speech. I didn’t read you that one either.

It also matters that neither Thunström nor I had any intention to pass off the words of GPT-3 as our own. I didn’t care whether GPT-3’s convocation speech expressed sentiments that I share, because I intended to use it as a rhetorical device. Similarly, Thunström didn’t care whether the paper written by GPT-3 was accurate; she wanted only to show that it could be done. She wonders what it will mean when she receives feedback from reviewers, because responding to it seems beyond GPT-3’s capabilities.

As I reflected on Thunström’s experiment, I wondered, could GPT-3 have written an academic paper about itself before its creators published their research paper? I think the answer must be no. Only now that human beings have written about GPT-3, and those writings are included in its training data, can GPT-3 write about itself. 

While the convocation speech that GPT-3 wrote for me is original in one sense, it is highly derivative in another. I doubt that GPT-3 could write coherently on a topic that has never been addressed before.


As another experiment, I asked GPT-3 to summarize the last section of this speech. Here’s what it wrote: “In short, GPT-3 is a powerful AI tool that is capable of writing coherently on a variety of topics, but it is not yet able to write on topics that have never been addressed before.”

That is surprisingly not bad.


So will AI write your thesis? Although the question was worth asking, in the end I don’t think so. An AI might write a thesis, but it won’t write your thesis.

As you’ll learn in the first-year seminar, while it’s important to write coherently, it’s still more important to ask good questions, read critically, and respond to feedback — all things that AI can’t (yet) do.

If you do enlist the help of GPT-3 in your academic writing, make sure you adhere to OpenAI’s “Sharing and Publication Policy.” You must clearly indicate the role of AI in your work, as well as your editorial role. You must take full responsibility for any computer-generated text you publish, including any inaccuracy or bias. You should think carefully about what you hope to accomplish through the use of AI, and whether those ends are ethical. 

Like the developers of GPT-3, I am most scared by the use of AI text generation for bots, spam, phishing, and misinformation. AI can give us the illusion of intelligence, but it cannot be held accountable for that illusion. Only people can.

I’ll wrap up with one last quote from Weizenbaum. “ELIZA in its use so far has had as one of its principal objectives the concealment of its lack of understanding. But to encourage its conversational partner to offer inputs from which it can select remedial information, it must reveal its misunderstanding.”

Weizenbaum was writing about a computer program, but the same applies to all of us. To learn, we must reveal our misunderstandings.

So, Whitties, here is my real charge to you as you enter your first year: Learn to ask good questions. Be brave, be curious, be vulnerable.

And if an AI does co-author your thesis, I hope I’ll be the first to know.

Sources:

https://beta.openai.com/playground
https://en.wikipedia.org/wiki/GPT-3
https://www.dataversity.net/brief-history-deep-learning/
https://www.ceros.com/inspire/originals/recaptcha-waymo-future-of-self-driving-cars/ 
https://www.scientificamerican.com/article/we-asked-gpt-3-to-write-an-academic-paper-about-itself-mdash-then-we-tried-to-get-it-published/

Joseph Weizenbaum. 1966. ELIZA—a computer program for the study of natural language communication between man and machine. Commun. ACM 9, 1 (Jan. 1966), 36–45. https://doi.org/10.1145/365153.365168
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. Language models are few-shot learners. In Proceedings of the 34th International Conference on Neural Information Processing Systems (NIPS’20). Curran Associates Inc., Red Hook, NY, USA, Article 159, 1877–1901. https://dl.acm.org/doi/abs/10.5555/3495724.3495883
