On a Thursday afternoon in mid-April, nearly 400 physicists and physics students pack into a lecture hall in the Harvard Science Center to learn about their replacement.

Harvard Professor of Physics Matthew D. Schwartz has long been a champion of agentic AI: artificial intelligence capable of acting autonomously, reasoning, and conducting graduate-school-level research in theoretical physics.

“I would say 10 years from now, we’ll be out of the picture. I mean, it’ll really be doing things that we won’t understand. And part of what we’ll be doing is translating and writing popular science books,” Schwartz tells the audience.

The announcement doesn’t shock many in the room; Schwartz has prophesied the end of human-led physics research for years. But now, they’re all listening.

In January of this year, Schwartz released a paper titled “Resummation of the C-Parameter Sudakov Shoulder Using Effective Field Theory.” The last line of the abstract reads: “All calculations, numerical analysis, and manuscript preparation were performed by Claude, an AI assistant developed by Anthropic, working under physicist supervision.”

Schwartz described the process in a blog post published on Anthropic’s website in March. Guiding the LLM solely with text prompts, he completed in two weeks what he says would have taken him two years with a human graduate student.

Schwartz calls the content of the paper itself “not terribly exciting” and at the “level of a second-year grad student,” but to him, it demonstrates capabilities that have the potential to revolutionize a field constrained by the limits of the human mind.

“Humans are limited,” he says. “We focus on things that we can understand, and that’s not necessarily how the universe can be understood.”

But not everyone is as sold on AI’s scientific capabilities.

New York University astrophysicist David W. Hogg says that physics, which he calls the “human understanding of the physical world,” requires more creativity and ingenuity than AI in its current form can muster.

“Everyone’s so hyped up, like it’s the end of the world. And it’s like, okay, web search got slightly better. And everyone’s like, ‘Oh, we’ve created a conscious, intelligent being,’” Hogg says, before adding, “it hasn’t actually done anything exactly. It’s just, everyone thinks it’s going to do something.”

‘It Takes a Lot of Human Supervision’

AI use in the theoretical sciences ranges from simple calculations and coding, which have already been absorbed into researchers’ workflows, to treating large language models as something like a collaborator.

Harvard Physics graduate student Alexander Michel says that while he uses OpenAI’s Codex environment for some coding and other tasks where the output can be verified quickly and with certainty, most of his group’s calculations on theoretical physics problems are still performed by hand.

“If you do have the capacity to generate something quickly verifiable, then that is a pretty huge productivity benefit,” he says. “If you’re talking about using it to generate stuff that’s harder to verify, I’d say right now it has increased output, but the verification is still expensive for a lot of those tasks.”

Some, however, have expanded their use of LLMs beyond coding and computation. Oscar Barrera, a second-year physics graduate student who works with Schwartz’s high energy theory group, was gifted a Claude subscription by Schwartz. Now, he uses it to summarize background information in papers and test out different pathways to solving research problems. Pre-AI, Barrera says he often had to spend time learning things he would never use again in order to understand the procedures in a paper.

Now he uses Claude to connect his current level of understanding to the paper he is reading. Since AI will create annotations to fill in the gaps in his understanding, he can comprehend a complicated paper with a single read-through.

Mathematics professor Lauren K. Williams ’00 is also using AI for its scouting capabilities, such as checking whether a statement she wants to use in her proofs is already known.

In experienced hands, she says, it can speed things up considerably. But the gains are not always net positive. She recently asked an LLM to summarize a topic she was trying to learn, and the output looked good — until she started examining it more closely.

“One thing I wound up spending a lot of time on was checking whether it was really citing papers that existed and citing theorems that were there,” she says.

Williams calls the verification process uninteresting, explaining that it would’ve been more “fun” and “edifying” to have done the work herself from the beginning.

She’s also interested in the other capabilities of LLMs. Through the First Proof Project, she’s working with a team of mathematicians to measure the current research-level reasoning skills of AI. In February, they asked several commercial AI models to prove 10 statements drawn from their own unpublished research across a range of mathematical fields. Without additional prompting, the LLMs solved two out of the 10.

Physics Professor Xi Yin has found AI to be more than just an efficient tool — it allows him to do work he wouldn’t even consider taking on without AI’s computational ability.

“It takes a lot of human supervision,” Yin says. “But for me, it’s still a speed up by a factor of, I would say, at least 100, producing in weeks the code that would take me 10 years to write.”

But for Hogg, the New York University astrophysicist, truly meaningful research requires more creativity than what he sees LLMs doing: recycling existing text.

“When we’re doing research, we’re always heading into some territory that nobody knows,” he says. “So it’s not just that we’re calculating something, we are doing something that’s never been done before. We don’t know what’s going to happen.”

Yin, however, doesn’t see this as a restriction. “It’s really a matter of compute,” he says, pointing to the “sheer quantity” of data LLMs are trained on as something humans are incapable of intuitively grasping. “I personally do not believe there’s any single human intellectual capability that cannot be replicated with AI,” he says.

‘It’s Never Been a Better Time To Be a Crackpot’

The rapid expansion of AI’s capabilities has raised questions about the future of scientific publishing and about what norms should be set to ensure the validity of published papers.

Today, completed physics research is typically written up as a paper and posted to a preprint archive, where it stays until it is peer-reviewed by largely volunteer experts and published in a journal.

Williams, who serves as an editor for several journals, says that with AI hastening research production, the existing backlog of unreviewed and unpublished papers will continue to grow.

“All of a sudden, all of these people who are doing this volunteer job have three times or 10 times as much work. That’s really a problem,” she says.

Schwartz says that researchers must continue to take full responsibility for verifying their publications and be fully transparent about AI use, but that preventing AI slop will require the community to overcome a “pressure to publish.”

“You want to write a paper that someone will remember in five years. That’s much more important than writing 500 papers that no one will remember,” he says.

Rodrigo Córdova Rosado ’19, a postdoctoral fellow in astrophysics, thinks the difficulty also lies in how reliable LLMs can seem, even when they are wrong.

“The barrier to being less careful has been lowered substantially,” he says. “The tools make it seem like they solved the problem, even though they could very well have not, and they could be leading you down a distinctly wrong path. It takes a lot of wisdom and knowledge that you learn through trial and tribulation — it takes a lot of that experience to be able to understand how to use these tools responsibly.”

Alex Lupsasca ’11, a former physics professor at Vanderbilt who now works at OpenAI, doesn’t see this as a problem with AI per se, but with how researchers fulfill their personal responsibility when using it.

He places some responsibility on the companies supplying LLMs, but compares the situation to driving a car off a cliff — “unless the car was defective, it’s kind of on you; you just didn’t use the car properly,” he says.

Lupsasca says AI has the potential to accelerate the rise of misinformation, but that misinformation itself is not a new problem.

“For theoretical physics, there have always been a lot of crackpots throughout that have wack ideas. And it’s true that it’s never been a better time to be a crackpot, just like it’s never been a better time to be a serious researcher,” he says.

Some researchers even suggest that there will, and should, be a radical change in how science is published. Since the 1600s, scientists have published papers in scientific journals, a process digitized with the rise of the internet but otherwise little changed.

“I have already felt, even before AI, that publishing papers is not an efficient way to organize scientific knowledge,” Yin says. “When I look at the arXiv, I feel that this is like a street market, not a grand architecture. I think in the age of AI, maybe there’s some better architecture.”

Yin, however, did not suggest alternatives.

A new paper by Stanford physics professor Xiaoliang Qi, published the same day as the agentic AI talk, outlines one idea for what could replace the paper format: interactive AI agents. Qi writes that these agents would be “capable of explaining the background of the work, the methods used, the reasoning process, intermediate decisions, and relevant tool interfaces.”

‘Why Do We Do Astrophysics?’

Disagreements about the fundamental purpose of theory and how it is practiced in universities are fueling differing views on the way science should change in light of AI.

Hogg says that while he has no problem using AI to write code, he values the human interaction in learning and collaboration that LLMs don’t provide because they “are not learning from your interactions with them.”

“Why do we do astrophysics? I believe we do it mainly to help people, to train people, to teach people to learn things and spread knowledge among people,” he says. “I think a lot of people think that if some entity just produces papers faster than another entity, somehow it’s a better scientist. But I think that’s a mistake about what science is.”

To Lupsasca, the point of physics is to try to understand our place in the universe and how the world works, so “it doesn’t really matter how you figure it out,” he says.

“The point is just to figure it out,” he adds. “Every tool is fair game.”

Yin agrees, saying that he’d be “perfectly happy if the AI can find the answer and have a way to validate that.”

“Whether I came out with the solution myself is secondary.” And it’s not out of the question, given that he believes AI is now “at the level of a competent graduate student,” he adds.

But Hogg says his colleagues’ comparisons of AI’s capabilities to those of graduate students misunderstand the purpose of graduate education in the first place.

“One thing I’m certain is wrong is people will say things like, ‘Oh, interacting with Claude is more useful and more efficient than interacting with a graduate student,’” he says. “I just disagree with that immensely, because more efficient for who and more efficient at what?”

Hogg sees his graduate students not as underlings, but as the inheritors of scientific knowledge. “I want them to learn new things. I want them to grow. I want them to develop new skills. I want them to learn about the universe.”

Schwartz, on the other hand, sees AI as essential to the next big discoveries in physics and as a way to hasten learning.

“That’s the main thing it does, it helps everybody learn much, much faster,” Schwartz says. “We’re all going to be super smart.”

Córdova Rosado says that while the “fantastically optimistic” world where AI frees up researchers’ time to focus on creative thinking would be “amazing,” he’s worried about the way AI makes it easy to bypass the struggle necessary for learning.

“I would challenge all of us to really think critically of what it means to do science with these tools, and how do we use these tools in a responsible way that enables better science, not detracts from our capacity to do science,” he says.

Beyond the challenges AI poses to education, Hogg says that in the imagined future where humans are just translating LLMs’ work instead of doing research themselves, “physics would be dead.”

“Why learn physics? You don’t need to learn physics,” Hogg says. “You can just be told about it on an AI-generated podcast that you listen to when you drive home from work.”

— Magazine writer Maria Borrell Ferrero can be reached at [email protected].

— Staff writer Asher J. Montgomery can be reached at [email protected].
