AI+Science: Faculty Research Showcase

Celebrating faculty research in the College of Science

Cell based artificial intelligence. Courtesy of Bo Sun, Department of Physics

The College of Science is organizing a faculty research showcase during the 2026 Artificial Intelligence Week, on April 13 and 15, 2026.

DESCRIPTION

From methodology and first principles to multiscale modeling of biological and physical systems, with applications to health, energy and the understanding of our universe, see how College of Science faculty are developing and using different aspects of AI in their work, including machine learning, deep learning and large language models.

ORGANIZING COMMITTEE

Vrushali Bokil, Executive Associate Dean, College of Science
Nikki Rak, Administrative Assistant to the Associate Deans, College of Science

DATES & VENUE

Monday April 13: 3 pm to 4 pm | MU 13 – Multipurpose Room
Wednesday April 15: 11 am to Noon | MU 206 – Asian Pacific American Room
Website: https://ai.oregonstate.edu/ai-week-2026

AGENDA

PART 1 RESEARCH AT THE INTERFACE OF AI AND THE BIOLOGICAL AND BIOMEDICAL SCIENCES

Monday April 13th
3 pm to 4 pm | Memorial Union 13 – Multipurpose Room
Zoom Link

RESEARCH TALKS

Douglas Walker, Research Associate of Biochemistry & Biophysics

Abstract: AI/ML techniques have brought great value to the process of scientific research, but an optimized output need not have any grounding in reality. It is the job of the scientist to interpret these outputs and push the technology back to a place of physical meaning. Drawing on my work with AlphaFold and other modeling approaches, I will share several cases in which I relied on my scientific expertise to understand what a model was doing and pushed back to restore physical meaning in ML applications. These examples highlight how AI can reveal important hidden features, needs direction to avoid non-physical solutions, and can be right for the wrong reasons. I will discuss strategies for interrogating AI models, thinking about their underlying constraints, and recovering physical meaning. In the end, AI shifts the source of uncertainty and reinforces the need for careful, critical thinking from the scientist.

Marilyn Mackiewicz, Associate Professor of Chemistry

Abstract: The integration of artificial intelligence (AI) with continuous flow synthesis offers a transformative approach to accelerating nanomaterials discovery, optimization, and scale-up. In this work, we present the development of an AI-enabled continuous flow platform (“Sunshine system”) designed to generate high-quality, reproducible datasets for the synthesis of hybrid lipid–coated nanomaterials. This system enables precise control over nanoparticle size, shape, and surface chemistry while allowing real-time feedback for adaptive optimization. We apply this platform to two key application spaces: (1) the development of nanotracers for tracking nanoplastics in environmental systems, and (2) the scalable production of X-ray–active nanomaterials for imaging triple-negative breast cancer (TNBC) cells. By coupling continuous flow synthesis with AI-guided parameter exploration, we establish a framework for rapid materials discovery, improved reproducibility, and translation-ready nanomaterials. This approach represents a shift toward autonomous, data-driven nanochemistry for complex human and environmental health challenges.

Mark Novak, Associate Professor of Integrative Biology

Abstract: Data limitations remain a central barrier to advancing ecological theory and ecosystem-based management, particularly for predator–prey interactions and key traits such as body mass. We present two complementary research efforts that leverage large language models (LLMs) to address these bottlenecks. First, we develop a customized LLM workflow to systematically identify and classify predator diet survey publications as “useful” or “not useful,” and to extract quantitative data on the fraction of feeding individuals and associated covariates from a vast but historically fragmented literature spanning more than a century. Second, we integrate machine learning and LLMs to predict species body mass from taxonomic information, training models on a large, curated dataset of known body masses. Together, these projects demonstrate how AI can accelerate data discovery, synthesis, and prediction in ecology, enabling new tests of theory and improving our capacity to understand and manage ecosystems under rapid environmental change.

Juan Vanegas, Associate Professor of Biochemistry & Biophysics

Abstract: Proteins are essential components of biological systems, and their functions depend strongly on their three-dimensional structure. However, proteins are also highly dynamic, and characterizing this dynamic range is an ongoing challenge for both simulations and experiments. I will show how the AlphaFold inference pipeline can be used to obtain a broad range of conformational structures by masking the input multiple sequence alignment (MSA). I will show some examples of this approach applied to membrane proteins such as mechanosensitive channels, which undergo large conformational changes in the presence of membrane tension. I will also describe the pitfalls and limitations of the MSA masking approach.

David Kikuchi, Assistant Professor of Integrative Biology

Abstract: The Kikuchi lab studies animal decision-making in predator-prey interactions. We focus on the behaviors that animals use to assess and respond to danger. Our goal is to illuminate the evolutionary and neurological basis of these behaviors to understand their generality across species and the consequences they have for multispecies communities. We use machine learning to construct precise ethograms of animals in experimental settings, letting us efficiently process large datasets of video footage. At the same time, we use generative AI to write custom programs that extract further detail from videos, including thermal information that tells us about the physiological responses animals have to different situations. A long-term goal of our lab is to integrate some of this technology into undergraduate course-based undergraduate research experience (CURE) modules so that students can learn about AI while also massively scaling up our data processing capacity.

Myriam Cotten, Associate Professor of Biochemistry & Biophysics

Abstract: The Cotten lab studies lipid-interacting peptides and proteins, focusing on their roles in host–pathogen interactions and metabolic regulation, and on the development of peptide-based drug delivery systems. Traditionally, our work has relied on biophysical approaches to achieve atomic-level characterization of these molecules and decipher their mechanisms of action. We are now excited to expand this framework by incorporating machine learning methods for both in silico sequence design and the analysis of complex datasets. This integration presents a powerful opportunity to accelerate the discovery of novel peptide sequences with optimized properties, including enhanced antimicrobial activity and reduced toxicity toward mammalian cells.

PART 2 RESEARCH AT THE INTERFACE OF AI AND THE PHYSICAL AND COMPUTATIONAL SCIENCES

Wednesday April 15th
11 am to Noon | Memorial Union Room 206 – Asian Pacific American Room
Zoom Link

RESEARCH TALKS

Tom Sharpton, Professor of Microbiology and Statistics

Abstract: Large language models are rapidly reshaping how scientists design studies, analyze data, and generate hypotheses, yet realizing their potential requires deliberate strategies that promote rigor and reproducibility. In this talk, I present three complementary efforts from my lab that illustrate how LLMs can be responsibly integrated into the research enterprise. First, I introduce a practical risk framework and prompt engineering guidelines for life scientists, currently under review and accompanied by an open-access prompt repository. Second, I describe the development of an AI-enabled co-pilot for microbiome data analysis that uses constrained LLM architectures to guide researchers through complex analytical workflows while maintaining methodological transparency. Third, I discuss our experience with Google's Co-Scientist platform for automated hypothesis generation and present an analytical tool we developed to organize and critically evaluate its outputs, substantially improving its utility for scientific ideation. Together, these projects highlight a common theme: LLMs are most powerful not as black-box oracles, but as structured collaborators whose outputs are made tractable, auditable, and scientifically grounded through thoughtful design.

Patti Hamerski, Assistant Professor of Physics

Abstract: Given the option, many STEM students choose to use generative AI as a tool when solving problems. This practice marks a fundamental change in problem-solving processes. This change has been significant in a junior-level computational physics course at Oregon State University, where students often wrote code to model physics phenomena. I will give an overview of findings from research and curriculum development addressing generative AI usage in this setting. Takeaways from this work include the importance of drawing on student input, the changes to computational problem-solving that were observed, and emergent issues for research and teaching practice.

Tim Zuehlsdorff, Assistant Professor of Chemistry

Abstract: Light-matter interactions are at the heart of a wide range of physical processes, from the generation of electricity from sunlight in photovoltaic devices to photocatalysis and biomedical imaging. The development of robust computational approaches to predict the optical properties of molecules and nanostructured materials has the potential to significantly accelerate the design of novel functional devices. Current methods show promise but are computationally costly, relying on tens of thousands of individual calculations of excited state properties on different system configurations to capture the coupling of electronic excitations to nuclear motion, which makes materials discovery based on high-throughput screening prohibitive. In this talk, we show how state-of-the-art machine learning approaches can significantly reduce the computational cost associated with first principles modeling. This can potentially both enable high-throughput prediction of structure-property relationships and significantly reduce the carbon footprint associated with computational modeling of these challenging processes.

Jeff Hazboun, Assistant Professor of Physics

Abstract: Gravitational-wave astronomy depends on the ability to identify faint astrophysical signals in large, complex, and noisy datasets. In this talk, we discuss how machine learning is being incorporated into our research to address key challenges in gravitational-wave data analysis. We focus in particular on the use of normalizing flows for efficient likelihood and posterior estimation, enabling rapid and flexible inference that complements traditional analysis pipelines. These approaches offer new ways to accelerate parameter estimation, improve scalability, and explore higher-dimensional signal models as detector sensitivity and data volume continue to grow. We conclude by outlining opportunities for further integration of AI methods into gravitational-wave science.

Kyriakos Stylianou, Associate Professor of Chemistry

Abstract: Sharing our synthetic efforts to synthesize materials after searching the large chemical space.

Nicholas Marshall, Assistant Professor of Mathematics

Abstract: In this talk, we describe a novel randomized algorithm for constructing binary neural networks with tunable accuracy. This approach is motivated by hyperdimensional computing (HDC), a brain-inspired paradigm that leverages high-dimensional vector representations, offering efficient hardware implementation and robustness to model corruptions. Unlike traditional low-precision methods that use quantization, we consider binary embeddings of data as points in the hypercube equipped with the Hamming distance. We propose a novel family of floating-point neural networks, G-Nets, which are general enough to mimic standard network layers. Each floating-point G-Net has a randomized binary embedding, an embedded hyperdimensional (EHD) G-Net, that retains the accuracy of its floating-point counterpart, with theoretical guarantees, due to the concentration of measure. This talk is based on joint work with Alireza Aghasi, Saeid Pourmand, and Wyatt Whiting.