
With all the recent buzz about Dyson sphere candidates, I was reminded of a similar paper published a few years ago that didn’t get nearly as much fanfare but is arguably just as significant: eight (8) separate, high-confidence SETI signals that had been missed by older, less capable detection algorithms.
What is most significant about this find is that eight good candidates came from a very small data set (150 TB of observations covering 820 nearby stars), which suggests that many more signals could be waiting in larger, more comprehensive data sets. The search used a new deep-learning detection algorithm that catches signals the older, classical algorithms missed.
[Here is a link to the article describing the announcement](https://communities.springernature.com/posts/how-a-deep-learning-algorithm-discovered-8-new-seti-candidates), followed by a transcript of the article. I have bolded certain sections to emphasize important details, and a link to the paper itself is included below the transcript.
> **How a Deep Learning Algorithm discovered 8 New SETI candidates**
>
> Springer Nature, 30 January 2023.
>
> By Peter Xiangyuan Ma, Researcher, University of Toronto
>
> On an uneventful evening of August 2021, I was on an arduous four day long cross country drive from Vancouver to Toronto with my family, when I decided to check some preliminary results on an algorithm that I’ve set to run while I was away. I hooked up to the spotty wifi of some motel in the middle of Manitoba and began scrolling.
>
> That summer I was working on a Deep Learning based search algorithm for radio technosignatures to help investigate the prevalence of extraterrestrial intelligence (ETI) from nearby stars. I was building a new addition to our classical search algorithm, algorithms that are now older than my parents. The goal for this shiny new algorithm is to run faster and to produce better candidates by leveraging AI and modern computer vision techniques. Nonetheless I was expecting to find radio frequency interference (RFI), junk that my algorithm had been returning for months prior to this. Instead, I had found something much more interesting.
>
> My algorithm started to find signals, most importantly ones that matched closely to simulated ETI signals. When I first saw this I dismissed it. I closed my laptop and headed to bed exhausted by the thought of two more days of driving awaiting my family.
>
> When I got back to Toronto I started compiling my results. With the help of my colleague Leo Rizk, my algorithm had returned 30,000 results, each requiring me to manually inspect. As the undergrad and the one who built this thing, I suppose this was a rite of passage. In total we had searched through 150 TB of data of 820 nearby stars, on a dataset that had previously been searched through in 2017 by classical techniques but had been labelled as devoid of interesting signals. I began reviewing all the results by eye and there it was again.
>
> That same signal. Weird. Then there it was again but this time it looked different. These came from a different star. Then again, and again. I began writing them down. Soon my list had grown to more than 10 rather suspicious looking signals. I thought this had to be interference, or it must’ve been picked up by previous searches. Looking them up in our database, I found no matches. I told my supervisor Cherry Ng about this and we were both confused. Were we the first to ever look at these signals?
>
> **Funnily enough, these looked almost perfect. Many of the signals had all the key characteristics we were looking for.**
>
> **1. The signals were narrow band, meaning they had narrow spectral width, on the order of just a few Hz. This is important because natural phenomena are much more broadband.**
>
> **2. The signals had non-zero drift rates, which means the signals had a slope. This could indicate a signal’s origin had some relative acceleration with our receivers, hence not local to the radio observatory.**
>
> **3. The signals appeared in ON-source observations and not in OFF-source observations. If a signal originates from a specific celestial source, it appears when we point our telescope toward the target and disappears when we look away. Human radio interference usually appears in ON and OFF observations due to the source being close by.**
>
> **We were able to rule a few signals out that didn’t pass our visual checks, but ultimately we were still left with eight signals of interest – the eight appearing in our manuscript.**
>
> When we showed our colleagues working in the Breakthrough Listen program, we were still scratching our heads. These were all different signals, of different drift rates originating from different stars and they weren’t picked up by our classical algorithms? This was news. Here we successfully demonstrated for the first time, a complete end-to-end search algorithm using deep learning that discovered signals that no classical algorithms were able to pick up. It finally worked!
>
> Originally this project began nearly two years ago. Back then, I was still in high school sitting in my senior computer science class. I was given a final software project, the goal of which was to come up with an idea and pair up with classmates to work on an app or program to solve a problem. I had previously taught myself machine learning in 11th grade and having an interest in SETI/astronomy I proposed this idea to fellow classmates. Unfortunately, I only received strange stares so I decided to do it alone.
>
> I worked tirelessly, and eventually I had built what became the basis for this paper’s work. At the end of 2019 and into 2020, I began cold emailing everyone at the UC Berkeley SETI group and with a few encouraging exchanges I had faith in my direction. You can still find my high school project on Github here.
>
> Fundamentally what I came up with is a way of leveraging unsupervised and supervised learning paired with a novel transfer learning method. I found that regular supervised models were too restrictive in searching for signals of interest. These methods found candidates that only matched simulated signals they were trained on, and couldn’t generalise to arbitrary anomalies. On the other hand the unsupervised methods were uncontrollable, and they basically identified anything with some slightly weird signal as anomalous, thus returning mostly junk. I found that by intermediately swapping the weights during the training phase of a supervised and an unsupervised model we could balance the best of both worlds. Eventually, in the algorithm ultimately implemented in this paper, this semi-supervised technique evolved into an autoencoder plus random forest technique. Although my high school experiments were unsuccessful, mostly because I was running code locally on my laptop, the groundwork had been set.
>
> I stuck with this project and when I graduated high school began working with the Breakthrough Listen team where I was supervised by Dr. Steve Croft and Dr. Cherry Ng. In 2021, I received funding for this project from the Laidlaw Foundation, and with the support of my supervisors I was off. I spent two months battling RFI, and after orchestrating an armada of 12 GPU’s running non-stop, full throttle, for two weeks, we came out of the trenches with the results in our paper: a successful search for technosignatures using deep learning. We found candidates that no other algorithm has previously found.
>
> **Looking forward, today we’re scaling this search effort to 1 million stars with the MeerKAT telescope and beyond. We believe that work like this will help accelerate the rate we’re able to make discoveries in our grand effort to answer the question “are we alone in the universe?”. Although I, like many others, have wondered if we’ll ever find that elusive technosignature needle in the vast haystack of anthropogenic interference, I hope that readers of our paper will agree that the new capabilities provided by deep learning provide grounds for new excitement and optimism in the search for extraterrestrial intelligence.**

Link to the paper:

Ma, P.X., Ng, C., Rizk, L. et al. [A deep-learning search for technosignatures from 820 nearby stars](https://www.nature.com/articles/s41550-022-01872-z). Nat Astron 7, 492–502 (2023). https://doi.org/10.1038/s41550-022-01872-z
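
For anyone wondering what the three bolded criteria in the article (narrow band, non-zero drift rate, present ON-source but absent OFF-source) look like as actual checks, here is a rough Python sketch. To be clear, this is my own toy illustration and not the authors' code: the paper's search uses deep learning, not a simple rule filter like this. The `Scan` structure, the function names, the 10 Hz width limit, the 0.01 Hz/s drift threshold, and the alternating ON/OFF sequence in `make_cadence` are all assumptions I made up for the example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class Scan:
    """One telescope pointing (hypothetical structure, not from the paper)."""
    on_source: bool                   # True when pointed at the target star
    time_s: float                     # scan time in seconds since the first scan
    freq_hz: Optional[float] = None   # frequency of the hit, None if nothing was seen
    width_hz: Optional[float] = None  # spectral width of the hit, None if nothing was seen


def drift_rate(scans: List[Scan]) -> float:
    """Least-squares slope of hit frequency versus time, in Hz/s."""
    hits = [s for s in scans if s.freq_hz is not None]
    if len(hits) < 2:
        return 0.0  # a slope needs at least two detections
    t_bar = mean([s.time_s for s in hits])
    f_bar = mean([s.freq_hz for s in hits])
    num = sum((s.time_s - t_bar) * (s.freq_hz - f_bar) for s in hits)
    den = sum((s.time_s - t_bar) ** 2 for s in hits)
    return num / den if den else 0.0


def is_candidate(scans: List[Scan],
                 max_width_hz: float = 10.0,      # assumed limit for "a few Hz"
                 min_drift_hz_s: float = 0.01) -> bool:  # assumed "non-zero" threshold
    """Apply the three checks described in the article to one set of scans."""
    on_scans = [s for s in scans if s.on_source]
    on_hits = [s for s in on_scans if s.freq_hz is not None]
    off_hits = [s for s in scans if not s.on_source and s.freq_hz is not None]

    # 1. Narrow band: every ON-source hit is only a few Hz wide.
    narrow = bool(on_hits) and all(
        s.width_hz is not None and s.width_hz <= max_width_hz for s in on_hits
    )
    # 2. Non-zero drift: the frequency slides over time.
    drifting = abs(drift_rate(on_hits)) >= min_drift_hz_s
    # 3. Seen in every ON-source scan and in no OFF-source scan.
    cadence_ok = len(on_hits) == len(on_scans) and not off_hits
    return narrow and drifting and cadence_ok


def make_cadence(drift_hz_s: float, seen_in_off: bool) -> List[Scan]:
    """Build six alternating ON/OFF scans with a synthetic 3 Hz wide signal."""
    scans = []
    for i in range(6):
        on = i % 2 == 0
        t = 300.0 * i
        detected = on or seen_in_off
        scans.append(Scan(
            on_source=on,
            time_s=t,
            freq_hz=1.0e9 + drift_hz_s * t if detected else None,
            width_hz=3.0 if detected else None,
        ))
    return scans


sky_like = make_cadence(drift_hz_s=0.5, seen_in_off=False)  # only in ON scans
rfi_like = make_cadence(drift_hz_s=0.5, seen_in_off=True)   # also in OFF scans
print(is_candidate(sky_like), is_candidate(rfi_like))
```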
by Captain_Hook_
