18-Year-Old High School Student Discovers 1.5 Million Unknown Celestial Bodies Using AI

Matteo Paz, an 18-year-old high school student, utilized AI to identify 1.5 million previously unknown celestial bodies from vast astronomical data.

Recently, OpenAI launched a platform called “ChatGPT Futures”. A total of 26 young individuals or teams were awarded $10,000 each, along with access to cutting-edge models.

Among them, one standout is Matteo Paz. Just last year, he was an 18-year-old high school student who developed a machine learning algorithm to process nearly 200TB of data accumulated over a decade by the NEOWISE infrared survey. He identified and classified 1.9 million infrared variable sources, of which approximately 1.5 million were previously unrecorded potential new discoveries.

His paper was published in the Astronomical Journal. In March 2025, he also won the top prize at the Regeneron Science Talent Search.

Caltech described the work as “a local high school student achieving breakthroughs at Caltech”. Paz is just one of the 26 selected individuals.

Image: On March 11, 2025, 18-year-old Matteo Paz held the Regeneron Science Talent Search trophy, awarded for discovering 1.5 million unknown celestial bodies using AI.

Other notable names include:

  • 18-year-old Crystal Yang: Developed a learning game for 200,000 visually impaired students that uses auditory cues instead of visual ones.
  • 19-year-old Anshi Bhatt: Created an anti-fraud system that has helped 18,000 people avoid online scams.
  • 25-year-old Amrita Bhasin: Built a logistics system that redirected over 5 million pounds of unsold inventory from landfills.

These 26 projects range from astronomy to disaster relief, from healthcare to agriculture, and from education for visually impaired children to financial management for street vendors in South America. None of these projects involved merely “using ChatGPT to write papers”; they tackled complex issues that previously required credentials, institutions, or funding.

AI has empowered them to think big and take action, a feat that previous generations found hard to imagine.

The First Generation of ChatGPT Natives Graduates

The class of 2026 is the first cohort to have had access to ChatGPT for virtually its entire university experience. While “always available” does not mean “fully reliant”, it has significantly reshaped how this generation learns and lives.

About three and a half years ago, in the fall of 2022, the class of 2026 entered college. Just over two months later, on November 30, ChatGPT was released. Their college experience has been intertwined with ChatGPT, marking the birth of the “first generation of ChatGPT natives”.

By the end of their first semester, they had an AI on their desks capable of writing code, finding literature, and discussing any topic.

The 26 individuals and teams include high school students and cross-school research groups. Not all of them are recent graduates, but together they represent a sample of this generation.

OpenAI’s launch of “ChatGPT Futures” aims not only to award prizes but also to showcase “outstanding young people in the AI era”.

Using AI to See What Humans Cannot

What is the first generation of ChatGPT natives doing with AI? Let’s look at three representative projects.

The first is Matteo Paz’s project. He worked with data from NEOWISE, a retired NASA infrared survey telescope that has accumulated a decade’s worth of data.

As Paz’s mentor Davy Kirkpatrick stated, “This dataset has nearly 200 billion rows, recording every detection we’ve made over the past decade.” Sifting through nearly 200 billion rows and nearly 200TB of data is beyond what people can do by hand, but it is the kind of task AI handles well.

Image: In 2023, Matteo Paz presented the initial results of his AI astronomy project at the Caltech Summer Research Connection seminar.

Paz developed a machine learning algorithm called VARnet that combed through the entire dataset and flagged 1.9 million infrared variable sources, 1.5 million of them entirely new discoveries: supermassive black holes, newborn stars, supernovae, and more.

Kirkpatrick initially expected to find just a few variable stars and inform the astronomical community that there were treasures within the data. Instead, Paz provided a complete catalog of the dataset: 1.9 million variable sources, classified into ten categories, all archived.

The second project is called AION-Search, led by Nolan Koblischke. His goal is to make 140 million galaxy images searchable using natural language.

Traditional astronomical image retrieval relies on image similarity or predefined categories. Searching for “spiral galaxies with merger signs” or “suspected gravitational lenses”? Sorry, you would need to train a specialized classifier first.

Image: The AION-Search demo interface supports natural language searches; the paper claims the system can scale to 140 million galaxy images.

Koblischke’s approach involved first having GPT-4.1-mini automatically generate textual descriptions for 275,000 galaxy images (costing $150); then training a contrastive learning model to create a shared retrieval space for images and text; finally, extending this to 140 million images.
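The idea of a shared retrieval space for images and text can be illustrated with a small sketch. This is not the AION-Search code: it assumes a CLIP-style symmetric contrastive loss, and the function names, the toy two-dimensional embeddings, and the temperature of 0.07 are all hypothetical choices made for illustration. It shows how matched image and caption embeddings are pulled together during training, and how a text query is then answered by ranking images by cosine similarity.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def cross_entropy(logits, target):
    """Negative log-softmax of the target entry (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss: row k of each list is a matched pair.

    Assumed CLIP-style objective; the temperature value is illustrative.
    """
    n = len(image_embs)
    sims = [[cosine(i, t) / temperature for t in text_embs] for i in image_embs]
    # Each image should pick out its own caption among all captions...
    image_to_text = sum(cross_entropy(sims[k], k) for k in range(n)) / n
    # ...and each caption should pick out its own image among all images.
    text_to_image = sum(
        cross_entropy([sims[r][k] for r in range(n)], k) for k in range(n)
    ) / n
    return (image_to_text + text_to_image) / 2

def search(query_emb, image_embs, top_k=10):
    """Return indices of the top_k images closest to the text query."""
    order = sorted(
        range(len(image_embs)),
        key=lambda idx: cosine(query_emb, image_embs[idx]),
        reverse=True,
    )
    return order[:top_k]

# Toy data: two images and two captions in a 2-D embedding space.
images = [[1.0, 0.0], [0.0, 1.0]]
captions_matched = [[1.0, 0.0], [0.0, 1.0]]
captions_swapped = [[0.0, 1.0], [1.0, 0.0]]
print(contrastive_loss(images, captions_matched))  # small: pairs are aligned
print(contrastive_loss(images, captions_swapped))  # large: pairs are misaligned
print(search([1.0, 0.1], images, top_k=1))
```

Once training has lowered this loss, searching needs no image labels at all: the query text is embedded once and compared against precomputed image embeddings, which is what makes scaling to a very large archive plausible.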

How effective is this? Gravitational lenses are the rarest targets in galaxy data, accounting for only 0.1% of the entire database: equivalent to finding one image among 1,000.

Using traditional image similarity algorithms, nearly all of the top ten results are incorrect. In contrast, AION-Search yields a significant number of correct results among the top ten.

Ranking quality over the top ten results is commonly measured with a metric called nDCG@10 (normalized discounted cumulative gain at rank 10). AION-Search achieved 0.180, while traditional methods reached only 0.015, a roughly twelvefold improvement in retrieval effectiveness.
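For readers unfamiliar with nDCG@10, here is a minimal sketch of how the metric is computed. It assumes the common linear-gain form with a log2 rank discount; the article does not state which variant or relevance grading the paper uses, and the example labels below are hypothetical. The input is the list of relevance labels in ranked order for the whole candidate pool, so the ideal ordering can be derived by sorting.

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the first k results.

    The result at 0-based position `rank` is discounted by log2(rank + 2),
    so the top result is undiscounted (log2(2) == 1).
    """
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant items anywhere in the pool
    return dcg_at_k(relevances, k) / ideal_dcg

# Hypothetical binary labels: 1 = a true gravitational lens, 0 = anything else.
perfect = [1, 1, 0, 0, 0]
late_hit = [0, 1, 0, 0, 0]
print(ndcg_at_k(perfect))
print(ndcg_at_k(late_hit))
```

A score of 1.0 means every relevant item sits at the top of the list. A ranking whose only relevant item appears in second place scores 1/log2(3), about 0.63, which shows how heavily the metric rewards putting correct results first.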

What used to require astronomers to manually sift through hundreds of thousands of images to find rare phenomena can now be accomplished using natural language.

The third project is WiFind, developed by Nayel Rehman, Arhan Menta, Rushil Kukreja, and Aayush Tendulkar. They use AI to process WiFi signals in an attempt to locate survivors through walls and rubble in disaster zones.

Image: WiFind project team members.

WiFind has won awards at the Springer conference and the Conrad Challenge, but it is still at the prototype stage and has not been deployed as a disaster relief system. Its concept is nonetheless innovative: WiFi routers are ubiquitous, and each one is a potential “life detector”.

Additionally, Zeyneb Kaya is using AI to protect endangered languages, and Amrita Bhasin’s project has redirected over 5 million pounds of unsold inventory from landfills to reuse.

The common thread among these 26 projects is not “using AI to write papers”, but rather “using AI to tackle challenges that humans struggle to address”.

26 Names, Not Just Celestial Bodies and Rescue

When you lay out this list, a more comprehensive picture emerges: the 26 selected individuals (or teams) come from over 20 universities and institutions, including MIT, Stanford, Harvard, Oxford, Berkeley, and Yale. The list essentially covers the top research institutions in North America and the UK.

OpenAI categorized them into three groups: Creators (who make products), Explorers (who conduct research), and Advocates (who promote and disseminate knowledge).

Celestial discoveries, galaxy searches, and disaster relief are just three of the areas covered. Among the remaining projects, some are developing learning aids to reduce pressure on peers; others are translating mental health resources into minority languages to ensure psychological counseling is accessible beyond the English-speaking world; some are creating accessibility features for disabled students to ensure classrooms are inclusive; and others are using AI to identify scam information to prevent elderly individuals from being defrauded.

Kyle Scenna, a 24-year-old entrepreneur from Waterloo, remarked, “I never imagined that the distance from identifying a problem to solving it could be so short.”

Michelle Lawson, a 20-year-old student at Smith College, stated, “I have always believed that with the right support and resources, you can achieve everything you can imagine. AI has made this a reality for me and thousands of others.”

Nolan Windham, 23, who is already an AI lead at a well-known hedge fund, said, “What’s exciting is that this is just the beginning.”

What they have in common is that AI has expanded what they are capable of.

This is the fundamental difference between this generation of “AI natives” and the one before it: they treat AI as default infrastructure, an indispensable part of learning and living, much as the previous generation of internet natives treats Wi-Fi.

The Barrier Has Not Disappeared, Just Shifted

The fact that a high school student can make astronomical discoveries may tempt some into an overly optimistic conclusion: that AI has truly lowered the barriers to scientific research.

However, such a judgment is premature. Let’s take a look at Paz’s complete background. In the summer of 2022, while still in high school, he entered Caltech’s Planet Finder Academy. In 2023, he participated in a six-week Summer Research Connection program at Caltech, mentored by senior astronomer Davy Kirkpatrick.

Paz completed the Pasadena school district’s “Math Academy” program in middle school: he finished AP Calculus BC in eighth grade, a course that typical high school students encounter only in their senior year, and he accomplished this before turning 14.

In other words, Paz is not just “an ordinary high school student with ChatGPT”; he is “a math prodigy at the university level, with top mentors from Caltech for two years, and direct access to IPAC computational resources”, plus AI.

The AION-Search paper also acknowledges its limitations: the vision-language model (VLM) may overlook subtle astronomical structures, and biases from GPT-4.1-mini can carry over into the system. The method works in astronomy partly because datasets like Galaxy Zoo have already been used as training material for GPT.

What AI finds are primarily phenomena that astronomers already know how to label.

The WiFind project, which aims to use WiFi signals to locate survivors through rubble, is still in prototype form and not yet an operational disaster relief system.

AI has lowered the barrier for “repetitive tasks” but has not eliminated the need for “taste, judgment, and long-term training”.

The key point of Paz’s story is not that AI allows any high school student to make astronomical discoveries, but rather that a student who was already on track to make such discoveries has accelerated this process by ten years.

The barrier has not disappeared; it has merely shifted from “can it be done” to “can it be imagined”.
