
A newly published Science Advances article, co-authored by Professor Anand Sarwate of the School of Engineering's Department of Electrical and Computer Engineering, demonstrates that the differences between seemingly similar real data samples and those generated by generative AI can be measured using the right tools. 

Photo: Professor Anand Sarwate

According to Sarwate, the paper, “Understanding Generative AI Output with Embedding Models,” suggests that each generative AI model has an internal component that makes it unique.  

Sarwate, whose research interests focus on mathematical methods for learning from data, was the Rutgers principal investigator, or PI, for the project and, with Dr. Tony Chiang of the Pacific Northwest National Laboratory (PNNL), co-supervised three additional PNNL staff members (Dr. Max Vargas, Reilly Cannon, and Andrew Engel). The project, which received $70,000 in preliminary funding from PNNL’s Mathematics of Artificial Reasoning Systems (MARS) program, gave him, he says, a welcome opportunity to work on something that was “almost purely experimental.”

Asking – and Answering – a Fundamental Question 

According to Sarwate, there is a preconception that generative AI systems process an input, or prompt, into an internal representation that captures all of the prompt’s meaning. So, it is assumed that if two different AI models – Models A and B – both form internal representations that capture the same meaning, they are functionally identical.

“What we are asking – how to compare two models – is very fundamental and very interesting,” Sarwate says. “One way is to compare internal representations, called embeddings, which is difficult. Another way is to compare the outputs by asking if two AI models will generate functionally equivalent text if given the same prompts.” 

For Sarwate’s team, these approaches weren’t good enough to make accurate quantitative comparisons. By using a third AI model – Model C – to make embeddings of the outputs of AI Models A and B, his team could make an “apples-to-apples” comparison.

“What we found is that in the embeddings created by Model C, the outputs from Models A and B looked very different, making it easy to separate them using simple statistical tools,” Sarwate says.  
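To make the idea concrete, here is a minimal sketch of this kind of pipeline, not the paper’s actual code: a third embedding model (standing in for Model C) maps text from two sources into vectors, and a simple, classical linear classifier measures how separable those vectors are. The sentence-transformers model name and the sample texts below are illustrative assumptions, not details from the study.

```python
# Minimal sketch (not the paper's code): embed outputs from two text
# sources with a third "embedding" model, then test how separable the
# embeddings are with a simple, classical linear classifier.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical outputs from "Model A" and "Model B" given the same prompts.
texts_a = ["The cat sat quietly on the mat.",
           "Rain is expected across the region tomorrow.",
           "The meeting was moved to Thursday afternoon."]
texts_b = ["A feline rested upon the rug in silence.",
           "Forecasters predict showers throughout the area.",
           "Thursday afternoon is the new time for the meeting."]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in for "Model C"
X = embedder.encode(texts_a + texts_b)              # one vector per text
y = [0] * len(texts_a) + [1] * len(texts_b)         # which model produced it

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Held-out accuracy well above chance means the two models' outputs
# occupy clearly separable regions of the embedding space.
print("held-out accuracy:", clf.score(X_te, y_te))
```

A held-out accuracy well above chance is one simple way of quantifying that two models’ outputs “look very different” in the embedding space.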

It is an approach that could be extended in various directions. Model C could distinguish real text and images from those that were AI-generated. “We could also use Model C to distinguish between different real data sources, showing that even real images that seem more or less interchangeable might look very different.

“The most surprising thing was that the statistical tools we used to understand these comparisons were very simple and classical,” Sarwate recalls. 

The Uniqueness of Each Generative AI Model 

“Each generative AI model has some internal component that makes it unique, which suggests a different way to approach the idea of detecting AI-generated content, such as phony academic papers or deepfakes,” Sarwate predicts. “It could also be used to better align two AI models to be more compatible. 

“Accomplishing these goals will require a lot more engineering effort, of course. What we wanted to show was that the differences between models are measurable if we use the right tool.”