Georgia Tech-Led Papers
Adversarial Attention Perturbations for Large Object Detection Transformers
Zachary Yahn, Selim Tekin, Fatih Ilhan, Sihao Hu, Tiansheng Huang, Yichang Xu, Margaret Loper, Ling Liu
ASCENT: Annotation-free Self-supervised Contrastive Embeddings for 3D Neuron Tracking in Fluorescence Microscopy
Haejun Han, Hang Lu
Clink! Chop! Thud! – Learning Object Sounds from Real-World Interactions
Mengyu Yang, Yiming Chen, Haozheng Pei, Siddhant Agarwal, Arun Vasudevan, James Hays
Contrastive Flow Matching
George Stoica, Vivek Ramanujan, Xiang Fan, Ali Farhadi, Ranjay Krishna, Judy Hoffman
Fewer Denoising Steps or Cheaper Per-Step Inference: Towards Compute-Optimal Diffusion Model Deployment
Zhenbang Du, Yonggan Fu, Lifu Wang, Jiayi Qian, Xiao Luo, Yingyan Celine Lin
HyPiDecoder: Hybrid Pixel Decoder for Efficient Segmentation and Detection
Fengzhe Zhou, Humphrey Shi
Orchid: Image Latent Diffusion for Joint Appearance and Geometry Generation
Akshay Krishnan, Xinchen Yan, Vincent Casser, Abhijit Kundu
OuroMamba: A Data-Free Quantization Framework for Vision Mamba
Akshat Ramachandran, Mingyu Lee, Huan Xu, Souvik Kundu, Tushar Krishna
SplatTalk: 3D VQA with Gaussian Splatting
Anh Thai, Kyle Genova, Songyou Peng, Leonidas Guibas, Thomas Funkhouser
T2I-Copilot: A Training-Free Multi-Agent Text-to-Image System for Enhanced Prompt Interpretation and Interactive Generation
Chieh-Yun Chen, Min Shi, Gong Zhang, Humphrey Shi
Task-Specific Zero-shot Quantization-Aware Training for Object Detection
Changhao Li, Xinrui Chen, Ji Wang, Kang Zhao, Jianfei Chen
Partner-Led Papers
CLIP-GS: Unifying Vision-Language Representation with 3D Gaussian Splatting
Siyu Jiao, Haoye Dong, Yuyang Yin, Zequn Jie, Yinlong Qian, Yao Zhao, Humphrey Shi, Yunchao Wei
CompCap: Improving Multimodal Large Language Models with Composite Captions
Xiaohui Chen, Satya Narayan Shukla, Mahmoud Azab, Aashu Singh, Qifan Wang, David Yang, ShengYun Peng, Hanchao Yu, Shen Yan, Xuewen Zhang, Baosheng He
EmbodiedSplat: Personalized Real-to-Sim-to-Real Navigation with Gaussian Splats from a Mobile Device
Gunjan Chhablani, Xiaomeng Ye, Muhammad Zubair Irshad, Zsolt Kira
IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance
Jiayi Guo, Chuanhao Yan, Xingqian Xu, Yulin Wang, Kai Wang, Gao Huang, Humphrey Shi
Modeling Saliency Dataset Bias
Matthias Kümmerer, Harneet Singh Khanuja, Matthias Bethge
One Last Attention for Your Vision-Language Model
Liang Chen, Ghazi Shazan Ahmad, Tianjun Yao, Lingqiao Liu, Zhiqiang Shen
SummDiff: Generative Modeling of Video Summarization with Diffusion
Kwanseok Kim, Jaehoon Hahm, Sumin Kim, Jinhwan Sul, Byung-Hak Kim, Joonseok Lee

ACM SIGCHI Conference on Computer-Supported Cooperative Work & Social Computing
Bergen, Norway | Oct 18–22, 2025

Papers
Advocacy Work
Charismatic Data and Material Traces: Monitoring Bird-Building Collisions through Citizen Science
Ashley Boone, Carl DiSalvo, Christopher Le Dantec
Bird collisions with man-made structures pose a significant threat to bird populations. In [Southern City], a small group of dedicated volunteers track these deaths with hopes of advocating for local policy requiring the use of bird-safe building materials. In addition to recording observations in a mobile application, volunteers log their efforts and collect the bodies of birds they find to add to university specimen collections. We offer a detailed empirical account of the work done by volunteers to produce (1) a digital record of local bird strikes, (2) a log of volunteer monitoring efforts, and (3) a collection of bird specimens. Unpacking the multiple forms of data produced by volunteer efforts, we examine how Project Safe Flight produced data oriented towards advocacy work. We find that Safe Flight data practices are deeply intertwined with the material qualities of these traces: mass, decay, feathers, and charisma. Finally, we discuss implications for data activism, including the link between materiality and charismatic data and next steps for action citizen science.
Metrics and Macchiatos: Challenges for Service-Industry Workers and the Need for Worker-Driven ICTs
Xander Koo, Lucy Scott, Amy Bruckman
Nearly 30 million people work in the foodservice and retail industries in the United States, representing approximately 18 percent of the total U.S. workforce. These service-industry workers contend with pressures from algorithmic management and other workplace technologies, yet they typically do not benefit from technologies that might help foster mutual support in the way that white-collar workers do. Recently, Starbucks, a major service-industry employer, has garnered media attention for issues with understaffing, labor law violations, and algorithm-based operations. We conducted interviews with sixteen Starbucks employees about their workplace issues, interactions with technology, and communication practices. These interviews illustrate how workplace technologies worsen existing issues for service-industry workers and how challenges to worker-to-worker communication reduce their capacity to rectify these issues, especially at the cross-store level. Our participants want better communication with other workers, such as through labor unions or new information and communication technologies (ICTs), to help improve their working conditions. We discuss how HCI scholars can use action research to help design localized, worker-driven ICTs to facilitate more connectivity and collaborative practices outside of the workplace. We conclude by outlining our ongoing work studying and designing ICTs for service-industry workers.
AI Applications for Safety and Support
“Poker with Play Money”: Exploring Psychotherapist Training with Virtual Patients
Cynthia Baseman, Masum Hasan, Nathaniel Swinger, Sheila Rauch, Ehsan Hoque, Rosa Arriaga
Role-play exercises are widely utilized for training across a variety of domains; however, they have many shortcomings, including low availability, resource intensity, and lack of diversity. Large language model-driven virtual agents offer a potential avenue to mitigate these limitations and offer lower-risk role-play. The implications, however, of shifting this human-human collaboration to human-agent collaboration are still largely unexplored. In this work we focus on the context of psychotherapy, as psychotherapists-in-training extensively engage in role-play exercises with peers and/or supervisors to practice the interpersonal and therapeutic skills required for effective treatment. We provide a case study of a realistic “virtual patient” system for mental health training, evaluated by trained psychotherapists in comparison to their previous experiences with both real role-play partners and real patients. Our qualitative, reflexive analysis generated three themes and thirteen subthemes regarding key interpersonal skills of psychotherapy, the utility of the system compared to traditional role-play techniques, and factors which impacted psychotherapist-perceived “humanness” of the virtual patient. Although psychotherapists were optimistic about the system’s potential to bolster therapeutic skills, this utility was impacted by the extent to which the virtual patient was perceived as human-like. We leverage the Computers Are Social Actors framework to discuss human–virtual-patient collaboration for practicing rapport, and discuss challenges of prototyping novel human-AI systems for clinical contexts which require a high degree of unpredictability. We pull from the “SEEK” three-factor theory of anthropomorphism to stress the importance of adequately representing a variety of cultural communities within mental health AI systems, in alignment with decolonial computing.
The Practice of Online Peer Counseling and the Potential for AI-Powered Support Tools
Tony Wang, Amy Bruckman, Diyi Yang
What challenges do volunteers providing peer support in online mental health platforms (OMHPs) face in operating and growing their communities? How could the HCI community develop human-AI systems to help? Recent work on online peer counseling has led to the development of novel AI tools for conversational interaction, but it remains unknown how such technology can fit into existing practices. In this research, we conducted interviews and design exercises with seventeen peer counselors from 7 Cups of Tea, a large online therapy and counseling platform, to design tools — AI or not — that resolve challenges that arise from day-to-day community practices. Participant responses suggest three classes of tools that could improve online peer counseling: real-time decision support, productivity, and management and training. Investigation of design motivations surfaced four practice-based challenges including chat interface limitations, difficulties in support seeker management, fragmented contexts of practice, and lack of visibility due to privacy concerns. Based on counselors’ discussion of benefits and risks associated with AI features in the tools they designed, we offer suggestions for research on AI tools that build on peer counseling practices, and connect our findings with broader implications about online peer counseling as a form of volunteer-based mental health practice.
The Typing Cure: Experiences with Large Language Model Chatbots for Mental Health Support
Inhwa Song, Sachin Pendse, Neha Kumar, Munmun De Choudhury
People experiencing severe distress increasingly use Large Language Model (LLM) chatbots as mental health support tools. Discussions on social media have described how engagements were lifesaving for some, but evidence suggests that general-purpose LLM chatbots also have notable risks that could endanger the welfare of users if not designed responsibly. In this study, we investigate the lived experiences of people who have used LLM chatbots for mental health support. We build on interviews with 21 individuals from globally diverse backgrounds to analyze how users create unique support roles for their chatbots, fill in gaps in everyday care, and navigate associated cultural limitations when seeking support from chatbots. We ground our analysis in psychotherapy literature around effective support, and introduce the concept of therapeutic alignment, or aligning AI with therapeutic values for mental health contexts. Our study offers recommendations for how designers can approach the ethical and effective use of LLM chatbots and other AI mental health support tools in mental health care.
Beyond AI: Additional Considerations for Enhancing Healthcare
[HONORABLE MENTION] Bridging Ontologies of Neurological Conditions: Towards Patient-centered Data Practices in Digital Phenotyping Research and Design
Jianna So, Faye Yang, Krzysztof Gajos, Naveena Karusala, Anoopum Gupta
Amidst the increasing datafication of healthcare, deep digital phenotyping is being explored in clinical research to gather comprehensive data that can improve understanding of neurological conditions. However, participants currently do not have access to this data due to researchers’ apprehension around whether such data is interpretable or useful. This study focuses on patient perspectives on the potential of deep digital phenotyping data to benefit people with neurodegenerative diseases, such as ataxias, Parkinson’s disease, and multiple system atrophy. We present an interview study (n=12) to understand how people with these conditions currently track their symptoms and how they envision interacting with their deep digital phenotyping data. We describe how participants envision the utility of this deep digital phenotyping data in relation to multiple stages of disease and stakeholders, especially its potential to bridge different and sometimes conflicting understandings of their condition. Looking towards a future in which patients have increased agency over their data and can use it to inform their care, we contribute implications for shaping patient-driven clinical research practices and deep digital phenotyping tools that serve a multiplicity of patient needs.
Care Work
Jiaying “Lizzy” Liu, Shuer Zhuo, Xingyu Li, Andrew Dillon, Noura Howell, Angela D. R. Smith, Yan Zhang
Enhancing emotional well-being has become an important focus in HCI and CSCW, with technologies increasingly designed to track, visualize, and manage emotions. However, these approaches have faced criticism for potentially suppressing certain emotional experiences. Through a scoping review of 53 empirical studies from ACM proceedings implementing Technology-Mediated Emotion Intervention (TMEI), we critically examine current practices through lenses drawn from HCI critical theories. Our analysis reveals emotion intervention mechanisms that extend beyond traditional “emotion regulation” paradigms, identifying care-centered goals that prioritize non-judgmental emotional support and preserve users’ identities. The findings demonstrate how researchers design technologies to generate artificial care, intervene in power dynamics, and nudge behavioral changes. We contribute the concept of “emotion support” as an alternative approach to “emotion regulation,” emphasizing human-centered approaches to emotional well-being. This work advances the understanding of diverse human emotional needs beyond individual and cognitive perspectives, offering design implications that critically reimagine how technologies can honor emotional complexity, preserve human agency, and transform power dynamics in care contexts.
Caregiving & Caregivers
Kefan Xu, Cynthia Baseman, Nathaniel Swinger, Myeonghan Ryu, Rosa Arriaga
Informal caregivers play an important role in taking care of family members with chronic disease. Informal caregivers’ mental health can be negatively impacted by life-changing events (e.g., patients’ diagnosis, care transitioning, etc.). This leads the caregiver to suffer from interpersonal and intrapersonal conflicts, causing a sense of disorientation and escalating malaise. In this study, we investigated informal caregivers’ experiences of facing conflicts and life-changing events by qualitatively analyzing data from online health communities. We categorized conflicts using a psychodynamic framework. We further looked at the interplay of life-changing events and conflicts and how this leads to caregivers’ sense-making and decisions to mediate conflicts. We also found that online health communities provide support by helping caregivers interpret and navigate conflicts and raising awareness of the temporal resolution of life-changing events. We conclude by discussing how online health communities can be designed to better support such practices.
Caring at a Distance
Lan Gao, Munmun De Choudhury, Jennifer Kim
In remote psychotherapy, challenges arising from remote client-therapist interactions can impact the therapeutic alliance and overall outcomes. HCI research has focused on leveraging sensing technology to bridge gaps in remote interactions. In this work, we investigate the values and risks of integrating sensing technology in remote psychotherapy, specifically to capture and interpret non-verbal cues, by conducting a speculative design study with both clients and therapists. Our findings reveal that sensing technology has the potential to facilitate self-reflection in therapy. The sharing of tracked non-verbal cues could also possibly foster mutual disclosure, supporting therapists’ judgments and balancing power dynamics between clients and therapists. However, clients and therapists were concerned about the accuracy of sensing systems, potential privacy threats, and additional cognitive burden. Our insights into system values imply how sensing technology could potentially balance power dynamics in client-therapist relationships as well as general interpersonal relationships. We also emphasize that sensing-technology-empowered communication requires greater consideration in remote psychotherapy than in non-vulnerable settings.
Helping the Helper: Supporting Peer Counselors via AI-Empowered Practice and Feedback
Shang-Ling Hsu, Raj Shah, Prathik Senthil, Zahra Ashktorab, Casey Dugan, Werner Geyer, Diyi Yang
Millions of users come to online peer counseling platforms to seek support. However, studies show that online peer support groups are not always as effective as expected largely due to users’ negative experiences with unhelpful counselors. Peer counselors are key to the success of online peer counseling platforms, but most often do not receive appropriate training. Hence, we introduce CARE: an AI-based tool to empower and train peer counselors through practice and feedback. Concretely, CARE helps diagnose which counseling strategies are needed in a given situation and suggests example responses to counselors during their practice sessions. Building upon the Motivational Interviewing framework, CARE utilizes large-scale counseling conversation data with text generation techniques to enable these functionalities. We demonstrate the efficacy of CARE by performing quantitative evaluations and qualitative user studies through simulated chats and semi-structured interviews, finding that CARE especially helps novice counselors in challenging situations. The code is available at https://app.box.com/s/z3a4dwgmeqfy8vbzi9cgmg0yhn6t4j53.
Core Concepts in Privacy Research
Measuring, Modeling, and Helping People Account for Privacy Risks in Online Self-Disclosures with AI
Isadora Krsek, Anubha Kabra, Yao Dou, Tarek Naous, Laura Dabbish, Alan Ritter, Wei Xu, Sauvik Das
In pseudonymous online fora like Reddit, the benefits of self-disclosure are often apparent to users (e.g., I can vent about my in-laws to understanding strangers), but the privacy risks are more abstract (e.g., will my partner be able to tell that this is me?). Prior work has sought to develop natural language processing (NLP) tools that help users identify potentially risky self-disclosures in their text, but none have been designed for or evaluated with the users they hope to protect. Absent this assessment, these tools will be limited by the socio-technical gap: users need assistive tools that help them make informed decisions, not paternalistic tools that tell them to avoid self-disclosure altogether. To bridge this gap, we conducted a study with N=21 Reddit users; we had them use a state-of-the-art NLP disclosure detection model on two of their own posts, and asked them questions to understand if and how the model helped, where it fell short, and how it could be improved to help them make more informed decisions. Despite its imperfections, users responded positively to the model and highlighted its use as a tool that can help them catch mistakes, inform them of risks they were unaware of, and encourage self-reflection. However, our work also shows how, to be useful and usable, AI for supporting privacy decision making must account for posting context, disclosure norms, and users’ lived threat models, and provide explanations that help contextualize detected risks.
Data Visualization
Arpit Narechania, Alex Endert, Clio Andris
Choropleth maps are a common and effective way to visualize geographic thematic data. Although cartographers have established many principles about map design, data binning and color usage, less is known about how mapmakers make individual decisions in practice. We interview 16 cartographers and geographic information systems (GIS) experts from 13 government organizations, NGOs, and federal agencies about their choropleth mapmaking decisions and workflows. We categorize our findings and report on how mapmakers follow cartographic guidelines and personal rules of thumb, collaborate with other stakeholders within and outside their organization, and how organizational structures and norms are tied to decision-making during data preparation, data analysis, data binning, map styling, and map post-processing. We find several points of variation as well as regularity across mapmakers and organizations and present takeaways to inform cartographic education and practice, including broader implications and opportunities for CSCW, HCI, and information visualization researchers and practitioners.
Designing for Privacy
Design(ing) Fictions for Collective Civic Reporting of Privacy Harms
Yuxi Wu, William Agnew, W. Keith Edwards, Sauvik Das
Individually-experienced privacy harms are often difficult to demonstrate and quantify, which impedes efforts for their redress. Their effects often appear small and are inconsistently documented, and they only become more obvious when aggregated over time and across populations. Taking a design fiction approach, we explore the design requirements and cultural ideals of a government-run system that empowers people to collectively report on and make sense of experiences of privacy harm from online behavioral advertising. Through the use of fictional inquiry, story completion, and comicboarding methods, delivered in an online survey with 50 participants, we found that participants had detailed conceptions of the user experience of such a tool, but wanted assurance that their labor and personal data would not be exploited further by the government if they contributed evidence of harm. We extrapolate these design insights to government-supported complaint-reporting platforms in other domains, finding multiple common design gaps that might disincentivize people to report experiences of harm, be they privacy-related or otherwise.
Fighting Misinformation, Building Believability
Mohsin Yousufi, Charlotte Alexander, Nassim Parvin
Marginalized groups often face situations in which their knowledge and experiences are dismissed due to prejudice or bias—a phenomenon identified and theorized as epistemic injustice in feminist philosophy. These circumstances frequently compel individuals to produce additional evidence to support their claims, ranging from paper documentation to data generated by technologies such as location logs. This paper examines the case of Heat Seek, an internet-connected temperature sensor designed to provide tenants in New York City with “objective and reliable data” when filing heating complaints and appearing in housing court. We present findings from a qualitative study, supplemented by document review and artifact analysis, to illuminate the tool’s functions and uses. Drawing on this case, we introduce a class of civic technologies—credibility boosters. We find that these technologies aim to overcome credibility deficits by: (1) backing individual and collective claims with objective data, (2) materializing intangible experiences as tangible evidence with aesthetic reliability, and (3) shifting epistemic authority to perceived neutral third parties. We conclude by demonstrating the institutional and social impacts of such technologies and call for greater attention to epistemic injustices within CSCW research, advocating for the design of institutional, legal, and social systems that confront biased systems and empower marginalized communities.
Harassment & Micro-Aggressions
Lara Karki, Kayla Uleah, Carl DiSalvo, Sierra Traynail Ross, Jadin Butler, Selamawit Husein, Emanuel Bryant, Dana Priest, Justin Booker, Betsy DiSalvo
LinkedIn is central to salaried job search and professional networking. In a career development program for adults seeking upward socioeconomic mobility through middle-wage computing work, we aimed to use LinkedIn to find and develop new social ties. However, we could not use the platform for this purpose. Through a participatory research approach, we formed a research team with diverse positionalities to understand why LinkedIn was difficult to use and how it could be better for our program. We analyzed recorded walk-throughs and confirmed our findings with two years of ethnographic field notes and written reflections. Our findings demonstrate that LinkedIn’s embedded algorithms and interface design prioritize users with large networks who can afford a LinkedIn Premium subscription. We argue that such platform-embedded power differentials lead to platform-delivered microaggressions. Non-Premium users and users with small networks must endure microaggressions to participate in the salaried labor market. We argue the politics of LinkedIn as a platform are such that its embedded power differentials are beyond our control and unlikely to change. Therefore, we recommend sociotechnical coping and mitigation strategies for career development programs in lieu of design implications for LinkedIn or similar platforms. We contribute a detailed example of how a technology reinforces pre-existing privilege without users’ knowledge.
Hate Speech
[BEST PAPER] Harm in Layers: Compositions of Misinformative Hate in Anti-Asian Speech and Their Impacts on Perceived Harmfulness
Jiawei Zhou, Gaurav Verma, Lei Zhang, Nicholas Chang, Munmun De Choudhury
During times of crisis, heightened anxiety and fear make individuals more vulnerable, creating fertile ground for hate speech and misinformation, as people are more likely to fall for and be influenced by it. This paper looks into the interwoven relationship between anti-Asian hatred and COVID-19 misinformation amid the pandemic. By analyzing 785,798 Asian hate tweets and surveying 308 diverse participants, this empirical study explores how hateful content portrays the Asian community, whether it is based on truth, and what makes such portrayal harmful. We observed a high prevalence of misinformative hate speech that appeared to be lengthier, less emotional, and carried more pronounced motivational drives than general hate speech. Overall, we found that anti-Asian rhetoric was characterized by an antagonism and inferiority framing, with misinformative hate underscoring antagonism and general hate emphasizing calls for action. Among all entities being explicitly criticized, China and the Chinese were constantly named to assign blame, with misinformative hate more likely to finger-point than general hate. Our survey results indicated that hateful messages with misinformation, demographic targeting, or divisive references were perceived as significantly more damaging. Individuals who placed less importance on free speech, had personal encounters with hate speech, or believed in the natural origin of COVID-19 were more likely to perceive higher severity. Taken together, this work highlights the distinct compositions of hate within misinformative hate speech that influence perceived harmfulness and add to the complexity of defining and moderating harmful content. We discuss the implications for designing more contextualized and culturally sensitive counter-strategies, as well as building more adaptive, explainable moderation approaches.
Humanized AI: Avatars, Agents, and Voice Assistants
Virtual agent-based communication skills training to facilitate health persuasion among peers
Farnaz Nouraei, Keith Rebello, Mina Fallah, Prasanth Murali, Haley Matuszak, Valerie Jap, Andrea Parker, Michael Paasche-Orlow, Timothy Bickmore
Many laypeople are motivated to improve the health behavior of their family or friends but do not know where to start, especially if the health behavior is potentially stigmatizing or controversial. We present an approach that uses virtual agents to coach community-based volunteers in health counseling techniques, such as motivational interviewing, and allows them to practice these skills in role-playing scenarios. We use this approach in a virtual agent-based system to increase COVID-19 vaccination by empowering users to influence their social network. In a between-subjects comparative design study, we test the effects of agent system interactivity and role-playing functionality on counseling outcomes, with participants evaluated by standardized patients and objective judges. We find that all versions are effective at producing peer counselors who score adequately on a standardized measure of counseling competence, and that participants were significantly more satisfied with interactive virtual agents compared to passive viewing of the training material. We discuss design implications for interpersonal skills training systems based on our findings.
Identifying and Mitigating AI Risks
A Risk Taxonomy and Reflection Tool for LLM Adoption in Public Health
Jiawei Zhou, Amy Chen, Darshi Shah, Laura Schwab Reese, Munmun De Choudhury
Recent breakthroughs in large language models (LLMs) have generated both interest and concern about their potential adoption as information sources or communication tools across different domains. In public health, where stakes are high and impacts extend across diverse populations, adopting LLMs poses unique challenges that require thorough evaluation. However, structured approaches for assessing potential risks in public health remain under-explored. To address this gap, we conducted focus groups with public health professionals and individuals with lived experience to unpack their concerns, situated across three distinct and critical public health issues that demand high-quality information: infectious disease prevention (vaccines), chronic and well-being care (opioid use disorder), and community health and safety (intimate partner violence). We synthesize participants’ perspectives into a risk taxonomy, distinguishing and contextualizing the potential harms LLMs may introduce when positioned alongside traditional health communication. This taxonomy highlights four dimensions of risk: individuals, human-centered care, the information ecosystem, and technology accountability. For each dimension, we discuss specific risks and offer example reflection questions to help practitioners adopt a risk-reflexive approach. We discuss the need to revisit pre-existing mental models of help-seeking and complement evaluations with external validity and domain expertise through lived experience and real-world practices. Together, this work contributes a shared vocabulary and reflection tool for people in both computing and public health to collaboratively anticipate, evaluate, and mitigate risks in deciding when to employ LLM capabilities (or not) and how to mitigate harm.
Partisan Discourse Online
Pooja Casula, Richmond Wong
Social media platforms have been widely perceived as centers of political discourse, and have been shown to facilitate political participation among young adults (18-26 years). However, as the effects of online political discourse and behaviors have become pervasive offline, significantly affecting global political processes such as deterring women from public political office and influencing election outcomes, questions arise regarding how young adult users engage in these online political spaces of discourse. In this paper, we focus on the perceptions and forms of engagement of Gen Z social media users, specifically those of Gen Z young adult women. We broadly ask: how do voting-age Generation (Gen) Z young adult women perceive spaces of political discourse on social media, and do these perceptions affect how they choose to engage in them? To explore this question, we conducted 17 interviews with voting-age Gen Z women across the United States. We found that our participants were largely critical of social media as spaces of political discourse. They were skeptical of the credibility of the political information shared on social media, questioned the usefulness of sharing political information through social media, and felt that social media was not conducive to having productive political discussions. We find that participant perceptions of social media political discourse led them to limit their online engagement or disengage entirely from online public political spaces, while expanding their offline private political engagement through in-person discussion. Our findings indicate that our participants were not politically disinterested, but rather did not partake in public forms of social media political engagement, leading us to question and reconsider widespread interpretations of ‘political participation’ that center and emphasize public forms of action and expression.
Drawing on our findings, we propose that the practice of ‘disengagement’ from public spaces of online political discourse should be considered a dimension of political engagement and not separate from it. In proposing this, we also broadly question the efficacy of social media as a forum to promote and facilitate political discourse.
The Role of Partisan Culture in Mental Health Language Online
Sachin Pendse, Ben Rochford, Neha Kumar, Munmun De Choudhury
The impact of culture on how people express distress in online support communities is increasingly a topic of interest within Computer Supported Cooperative Work (CSCW) and Human-Computer Interaction (HCI). In the United States, distinct cultures have emerged from each of the two dominant political parties, forming a primary lens by which people navigate online and offline worlds. We examine whether partisan culture may play a role in how U.S. Republican and Democrat users of online mental health support communities express distress. We present a large-scale observational study of 2,184,356 posts from 8,916 statistically matched Republican, Democrat, and unaffiliated online support community members. We utilize methods from causal inference to statistically match partisan users along covariates that correspond with demographic attributes and platform use, in order to create comparable cohorts for analysis. We then leverage methods from natural language processing to understand how partisan expressions of distress compare between these sets of closely matched opposing partisans, and between closely matched partisans and typical support community members. Our data spans January 2013 to December 2022, a period of both rising political polarization and mental health concerns. We find that partisan culture does play into expressions of distress, underscoring the importance of considering partisan cultural differences in the design of online support community platforms.
Reflecting on Methodology
Reflexive Data Walks: Cultivating Feminist Ethos through Place-Based Inquiry
Sylvia Janicki, Shubhangi Gupta, Nassim Parvin
Reflexivity, as conceived by feminist epistemologies, is essential to advancing social justice design practice. Reflexivity is thus critical for CSCW and HCI scholars and practitioners who seek to build equitable technological futures, as it allows for a critical examination of explicit and implicit values and politics in design and research processes. In this paper, we put forth a participatory walking method grounded in feminist ethos for cultivating reflexivity by engaging with the theme of boundaries in space. We outline this method through three integrated place-based strategies, including an activity in the home, a data walk in the city, and making and sharing visualizations for collaborative understandings of place. We argue that engaging with place is critical to foregrounding positionality and cultivating reflexivity in research. We share our findings from two workshops where we examined the efficacy of this method. We outline how the method deepens understandings of the built environment, self, and others; welcomes vulnerability and fosters openness to change; and scaffolds practices of critical self-questioning. In doing so, it leads to a recognition of the entanglement of socio-political values in design and data creation, revealing uncertainties and ambiguities that can open up new areas for inquiry and design.
Social and Environmental Justice
[HONORABLE MENTION] Sustaining Workers Who Sustain the World: Assets-Based Design for Conservation Technologies in Madagascar
Eric Greenlee, David Klinges, Lalatiana Randriamiharisoa, Kim Valenta, Jhoanny Rasojivola, Justorien Rambeloniaina, Nicolas Naina Rasolonjatovo, Georges Razafindramavo, Joel Ratsirarson, Zovelosoa Raharinavalomanana, Edouard Ramahatratra, Abigail Ross, Thomas Kelly, Jean Claude Rakotoarivelo, Tafitasoa Mijoro, Eric Tsiriniaina Rajoelison, Efitiria Efitiria, Josiah Hester, Ellen Zegura, Alex Cabral
Local workers and their knowledge are essential for sustainable and effective conservation efforts. However, many technology-assisted conservation programs are guided by global benchmarks (e.g., forest cover) and industry metrics (e.g., cost per acre), which often devalue local knowledge and fail to consider the economic and conservation goals of local workers. Assets-based design is well-suited to center workers and their strengths, yet it may fail to fully address the complexities of long-term conservation programs by not explicitly emphasizing workers’ goals or bolstering their assets. We extend recent approaches in assets-based design literature that address these limitations through our case studies of reforestation, biodiversity monitoring, and carbon sequestration programs in three protected areas in Madagascar. We leverage a mixed-methods approach of direct reactive observations, unstructured interviews, and an informal design workshop, revealing emergent themes surrounding economic sustainability and the value of local ecological knowledge in conservation. Finally, we explore examples, tensions, and design considerations for worker-centered conservation technology to: (1) prioritize local knowledge, (2) foster love of nature, (3) center economic goals, and (4) embrace local autonomy. This work advances the dialogue on assets-based design, promoting the co-creation of equitable and sustainable conservation technologies with workers in Global South settings by centering local economic priorities and enhancing workers’ strengths.
Camille Harris, Clio Andris
In 2021, the City of Atlanta and Atlanta Police Foundation launched joint plans to build a large police training facility in the South River Forest in unincorporated DeKalb County, GA. At this time, residents of Atlanta and DeKalb County, environmental activists, police and prison abolitionists, and other activists and concerned individuals formed the movement in opposition to the facility, known as the Stop Cop City / Defend the Atlanta Forest movement. Social media and digital maps became common tools for communicating information about the facility and the movement. In this work, we examine online maps about the facility and the opposition movement, originating from grassroots organizations, the City of Atlanta, news media outlets, the Atlanta Police Foundation, and individuals. We gather and examine 32 publicly available maps collected through the Google Search API, Twitter (now X), Instagram, and Reddit. Then, using a framework of critical cartography, we conduct a content analysis of these maps to identify the mapping technologies and techniques (data, cartographic elements, styles) used by different stakeholders in the construction of the facility, and the roles that maps and mapping technologies can play in social movements. Finally, we examine the extent to which these maps provide data to confirm or dispute concerns raised by grassroots organizations and local residents about the facility. We argue that documenting the use of maps to communicate information about a contentious project can help enumerate positions and perspectives about community issues. We find that the use of (and access to) geospatial technologies is uneven across stakeholders and mapmakers, and advocate for accessible mapmaking tools. We conclude by discussing the implications of accessibility of mapping technology and posting maps to social media, and share example map images that extend the geographic information systems (GIS) techniques seen in the retrieved maps.
Supporting Older Adults’ Care
[HONORABLE MENTION] Rethinking Technological Solutions for Community-Based Older Adult Care: Insights from ‘Older Partners’ in China
Yuling Sun, Sam Ankenbauer, Yuchen Chen, Xiaojuan Ma, Zhifan Guo, Liang He
Aging in place refers to the enabling of individuals to age comfortably and securely within their own homes and communities. Continued community living creates a number of potential areas for design and, accordingly, various information and communication technologies have been employed to support older adult care. At the same time, human-led care services have been designed to support aging in place. Through a long-term ethnographic study that includes semi-structured interviews with 24 stakeholders, we consider these technology- and human-driven care infrastructures for aging in place, examining their origins, deployment, interactions with older adults, and challenges. In doing so, we reconsider the value of these different forms of older adult care, highlighting the various issues associated with using, for instance, health monitoring technology or appointment scheduling systems to care for older adults aging in place. We suggest that technology should take a “supportive, not substitutive” role in older adult care infrastructure and that designing for aging in place should not be synonymous with designing for independence but should, instead, consider the larger community and its dynamics.
Team Work Makes the Dream Work
Nathaniel Swinger, Cynthia Baseman, Myeonghan Ryu, Saeed Abdullah, Christopher Wiese, Andrew Sherrill, Rosa Arriaga
The mental health crisis in the United States spotlights the need for more scalable training for mental health workers. While present-day AI systems have sparked hope for addressing this problem, we must not be too quick to incorporate or solely focus on technological advancements. We must ask empirical questions about how to ethically collaborate with and integrate autonomous AI into the clinical workplace. For these Human-Autonomy Teams (HATs), poised to make the leap into the mental health domain, special consideration around the construct of trust is in order. A reflexive look toward the multidisciplinary nature of such HAT projects illuminates the need for a deeper dive into varied stakeholder considerations of ethics and trust. In this paper, we investigate the impact of domain—and the ranges of expertise within domains—on ethics- and trust-related considerations for HATs in mental health. We outline our engagement of 23 participants in two speculative activities: design fiction and factorial survey vignettes. Grounded by a video storyboard prototype, AI- and Psychotherapy-domain experts and novices alike imagined TEAMMAIT, a prospective AI system for psychotherapy training. From our inductive analysis emerged 10 themes surrounding ethics, trust, and collaboration. Three can be seen as substantial barriers to trust and collaboration, where participants imagined they would not work with an AI teammate that didn’t meet these ethical standards. Another five of the themes can be seen as interrelated, context-dependent, and variable factors of trust that impact collaboration with an AI teammate. The final two themes represent more explicit engagement with the prospective role of an AI teammate in psychotherapy training practices. We conclude by evaluating our findings through the lens of Mayer et al.’s Integrative Model of Organizational Trust to discuss the risks of HATs and adapt models of ability-, benevolence-, and integrity-based trust. These updates motivate implications for the design and integration of HATs in mental health work.
Trauma & Abuse
Making Sense of Trauma Over Time: Interweaving Feminist Temporalities to Understand Histories
Catherine Wieczorek, Cindy Lin, Shaowen Bardzell
Trauma, an emotional response to events with lasting impacts, is a significant public health issue influencing technology interactions. This paper focuses on the sixth principle of trauma-informed care—Cultural, Historical, and Gender Issues—by exploring multiple timescales of trauma and generational impacts through two ethnographic vignettes: a trauma-informed healthcare design project in Chicago and environmental advocacy in Borneo, Indonesia. We integrate feminist temporality to understand temporal contingencies in cultural contexts to inform future trauma-informed design and computing work. Our contributions include detailed ethnographic accounts that shift the focus from trauma as an individual event to a historically and communally felt phenomenon, advancing CSCW scholarship by incorporating historicist sensibilities and feminist theorizations of temporality.
More Research
Doctoral Consortium
The Mechanisms of Muting: Deconstructing the Technology-Mediated Violence of Silence
Jasmine Foriest
This research addresses a critical gap in HCI: while the field engages with “harm,” it inadequately conceptualizes “violence.” One gap lies in how digital artifacts mediate structural violence through muting. Muting — the systemic silencing of marginalized groups — prevents vulnerable populations from accessing potentially life-saving resources and results in preventable morbidity and mortality. Drawing from Muted Group Theory, I demonstrate how technologies imbued with dominant values amplify muting in unprecedented ways through information suppression in suicide reporting, social-computing design that silences gender-based violence survivors, and epistemic inequity perpetuated by generative AI. My dissertation employs survivor-centered mixed methods (surveys, narrative interviews, and phenomenological analysis) to understand how intimate partner violence survivors use digital artifacts in help-seeking. This work will produce the first empirical understanding of relationships between muting experiences and adverse outcomes, alongside design recommendations for remediating muting in help-seeking technologies. My goal is to establish cross-disciplinary approaches to violence prevention through ethical technology design.
Panels/ SIGs
PANEL: Computing and the Arts: Establishing Theoretical and Methodological Foundations for Cross-Disciplinary Collaboration
Angela Schöpke-Gonzalez, Kellie Dunn, Shaowen Bardzell, Federico Bomba, Barbara Carreras, Makayla Lewis, Maria Murray
The last five years have resulted in substantial changes to how computing affects work, how work affects computing, and how work and computing operate in tandem to affect society. From advances in automation, artificial intelligence, and virtual/extended reality, to the entrenchment of hybrid and remote work arrangements, and the documented harmful societal impacts that computing work has produced, these changes to computing-work relationships raise concerns and open opportunities to reimagine these relationships in new ways. CSCW has an opportunity and a responsibility to ensure that the kinds of futures we imagine and enact benefit workers, communities, and future generations. Artistic research is well-positioned to help us not only understand, but imagine new pathways forward in response to pressing CSCW questions. By hosting a panel of experts in artistic methods well-equipped to help us imagine these futures, we expect to lay the groundwork for mutually respectful cross-disciplinary collaboration between arts and computing that makes more space in our field for different kinds of thinking, approaches to problems, and new imaginaries.
SIG: Alternative Technology Consumption Under Capitalism
Alternative Technology Consumption Under Capitalism
Yuxi Wu, Beatriz Palacios Abad, Vishal Sharma, Hanlin Li, Alexandra To
Even as large technology companies come under increasing legal and political scrutiny, their market dominance continues to grow. As Big Tech tends toward monopoly, however, people continue to seek out alternative technology systems and uses. What are the conditions that lead people to choose alternatives? What are the long-term values associated with having viable alternatives? This SIG presents alternative technology, or AltTech, as a growing area of interest for the CSCW community to consider. We invite community members with interests in technology non-use, design for disruption, and post-growth design to join us for a sketch-based speculative discussion to better understand the landscape and future of AltTech.
SIG: Conducting Research in Oppressive Settings
Conducting Research in Oppressive Settings
Adrian Petterson, Benedetta Lusi, Cristina Bosco, Ashique Ali Thuppilikkat, Anupriya Tuli, Catherine Wieczorek, Robert Soden, Emily Tseng, Priyank Chandra
As justice-related research faces increasing transnational and domestic repression, researchers working on topics like reproductive justice, LGBTQ2SIA+ equity, decolonization, climate justice, and social movements encounter escalating constraints and risks. While the CSCW community has increasingly advocated for research in these domains, the current political climate exacerbates the precarity experienced by scholars engaged in this work. Institutional mechanisms such as ethics approvals frequently fail to address researchers’ safety concerns, particularly for those from marginalized communities themselves. Collaborators within the same project experience varying levels of risk based on location, career stage, and identity. This Special Interest Group (SIG) will facilitate dialogue on practical strategies for conducting research under oppressive contexts, drawing on expertise from researchers who have developed survival and safety tactics. Discussions will address data storage practices, visibility considerations, transnational collaboration strategies, and psychological safety mechanisms. Our goal is to establish a collaboratively curated resource collection supporting researchers as they navigate oppressions in their collaborations, recognizing these threats continue to grow in scale and intensity.
Posters
From Hashtag to Human-Centered Insights: Rethinking Disability Awareness Across Languages
Zainab AlMeraj, Fatemah Husain, Rosa Arriaga
As global discourse on disability expands, much of the digital awareness and inclusion efforts remain anchored in English-language narratives. This linguistic dominance limits our understanding of how disability is perceived, discussed, and mobilized across culturally diverse regions, particularly within underrepresented communities in the Global South. This study investigates cross-lingual and cross-cultural perspectives on disability awareness by analyzing three years of public posts from X (formerly Twitter), using the hashtag #peoplewithdisabilities. Through natural language processing (NLP), we examine (1) posting behaviors and engagement dynamics, (2) sentiment and empathy-oriented language, and (3) culturally embedded narrative framings in both Arabic and English content. Our interdisciplinary lens, drawing from computational linguistics and disability studies, allows us to interpret trends beyond surface metrics. Findings reveal that Arabic posts often reflect familial, religious, and collectivist viewpoints rooted in local cultural values, while English posts emphasize rights-based advocacy and individual empowerment. Emotional expression and engagement patterns also diverge, highlighting that awareness itself is not universal but culturally constructed and contextually nuanced. We argue that designing inclusive technologies requires more than linguistic translation; it demands sensitivity to the cultural frameworks shaping disability discourse.
Workshops
Structuring Collaborative Reflection: Integrating Diary Study and Focus Group Discussion
Jixiang Fan, Jiacheng Zhao, Sunggyeol Oh, Michael Bolmer, Yoonje Lee, Nick Flammer, Yuhao Chen, D. Scott McCrickard
We present a structured reflection framework integrating diary study and focus group discussion to support collaborative meaning-making in HCI education. The framework follows a multi-phase design in which students progress from individual journaling to a two-stage group discussion sequence: first within shared application contexts, then across emergent experiential themes. To support this process, we extended DiaryQuest, a lightweight educational tool incorporating AI-assisted grouping, image-based prompts, and a Jigsaw-inspired workflow to scaffold participation. A preliminary classroom deployment with 11 undergraduate students suggests that the approach lowers the barrier to reflective dialogue, encourages cross-perspective engagement, and helps students surface design-relevant insights grounded in lived experience. These findings point to new opportunities for structuring reflection in sociotechnical learning environments.
CSCW Contributions to Critical Futures of Work
Alina Lushnikova, Michael Muller, Shaowen Bardzell, Toby Li, Saiph Savage
As the CSCW community evolves and participates in envisioning the impact of technologies on work practices, we want to ensure that critical and alternative computing perspectives are well represented while we co-construct the future of work. In this hybrid workshop, we invite researchers, practitioners, civic actors, economists, and other interested parties to challenge dominant, powerful, status-quo narratives and imaginaries when considering the future of work, nurturing CSCW's commitments and methods. Co-constructing the workshop with participants, we aim to develop actionable insights and strengthen the community.
Exploring Resistance and Other Oppositional Responses to AI
Eric Baumer, Inha Cha, Vera Khovanskaya, Rosemary Steup, Janet Vertesi, Richmond Wong
This workshop will gather researchers and practitioners who study, and/or engage in, opposition to the proliferation of AI technologies. It will do so based on an inclusive conceptualization of what counts as AI, thereby assembling a diverse collection of participants and perspectives. The organizers will especially solicit submissions that respond to a variety of specific themes: resistance in organizational contexts; understandings of community-based collective resistance; research around non-voluntary adoption; considerations around distributions of power in the creation and use of AI; implications for designing technologies to support opposition; and the possibility of resistance indirectly reifying current conceptions of AI. Prospective participants will be invited to submit descriptions of their work either studying or engaging in oppositional practices, as well as a challenge they have faced in doing so. The workshop will involve a series of interactive, hands-on activities to enable participants to share both challenges and strategies. In addition to catalyzing connections among researchers, the workshop will also produce two concrete outputs: a living annotated bibliography of relevant citations across diverse domains, and a practical guide with context-sensitive tactics for challenging the perceived inevitability of AI.

ACM Conference on Computer and Communications Security
Taipei, Taiwan | Oct 13–17, 2025
Applied Cryptography
Distance-Aware OT with Application to Fuzzy PSI
Lucas Piske, Jaspal Singh, Ni Trieu, Vladimir Kolesnikov, Vassilis Zikas
May the Force Not be With You: Brute-Force Resistant Biometric Authentication and Key Reconstruction
Alexandra Boldyreva, Deep Inder Mohan, Tianxin Tang
Toss: Garbled PIR from Table-Only Stacking
Lucien K. L. Ng, Vladimir Kolesnikov
Blockchain and Distributed Systems
Lite-PoT: Practical Powers-of-Tau Setup Ceremony
Lucien K. L. Ng, Pedro Moreno-Sanchez, Mohsen Minaei, Panagiotis Chatzigiannis, Adithya Bhat, Duc Le
Hardware, Side Channels, and Cyber Physical Systems
MOLE: Breaking GPU TEE with GPU-Embedded MCU
Hongyi Lu, Yunjie Deng, Sukarno Mertoguno, Shuai Wang, Fengwei Zhang
One Video to Steal Them All: 3D-Printing IP Theft through Optical Side-Channels
Twisha Chattopadhyay, Fabricio Ceschin, Marco Garza, Dymytriy Zyunkin, Animesh Chhotaray, Aaron Stebner, Saman Zonouz, Raheem Beyah
WireTap: Breaking Server SGX via DRAM Bus Interposition
Alex Seto, Oytun Kuday Duran, Samy Amer, Jalen Chuang, Stephan van Schaik, Daniel Genkin, Christina Garman
Machine Learning and Security
VillainNet: Targeted Poisoning Attacks Against SuperNets Along the Accuracy-Latency Pareto Frontier
David Oygenblik, Abhinav Vemulapalli, Animesh Agrawal, Debopam Sanyal, Alexey Tumanov, Brendan Saltaformaggio
Privacy and Anonymity
Fingerprinting SDKs for Mobile Apps and Where to Find Them: Understanding the Market for Device Fingerprinting
Michael Specter, Abbie Farr, Bo Ma, Robin Lassonde, Mihai Christodorescu
Security Usability and Measurement
A Sea of Cyber Threats: Maritime Cybersecurity from the Perspective of Mariners
Anna Raymaker, Akshaya Kumar, Miuyin Yong Wong, Ryan Pickren, Animesh Chhotaray, Frank Li, Saman Zonouz, Raheem Beyah
The Challenges and Opportunities with Cybersecurity Regulations: A Case Study of the US Electric Power Sector
Sena Sahin, Burak Sahin, Robin Berthier, Kate Davis, Saman Zonouz, Frank Li
Web Security
Enhanced Web Application Security Through Proactive Dead Drop Resolver Remediation
Jonathan Fuller, Mingxuan Yao, Saumya Agarwal, Srimanta Barua, Taleb Hirani, Amit Kumar Sikder, Brendan Saltaformaggio
Head(er)s Up! Detecting Security Header Inconsistencies in Browsers
Jannis Rautenstrauch, Trung Tin Nguyen, Karthik Ramakrishnan, Ben Stock
Lock the Door But Keep the Window Open: Extracting App-Protected Accessibility Information from Browser-Rendered Websites
Haichuan Xu, Runze Zhang, Mingxuan Yao, David Oygenblik, Yizhi Huang, Jeman Park, Brendan Saltaformaggio

ACM Symposium on User Interface Software and Technology
Busan, Korea | Sep 28–Oct 1, 2025
Best Paper
DissolvPCB: Fully Recyclable 3D-Printed Electronics with Liquid Metal Conductors and PVA Substrates
Zeyu Yan, Su Hwan Hong, Josiah Hester, Tingyu Cheng, Huaishu Peng
We introduce DissolvPCB, an electronic prototyping technique for fabricating fully recyclable printed circuit board assemblies (PCBAs) using affordable FDM 3D printing, with polyvinyl alcohol (PVA) as a water-soluble substrate and eutectic gallium-indium (EGaIn) as the conductive material. When obsolete, the PCBA can be easily recycled by immersing it in water: the PVA dissolves, the EGaIn re-forms into a liquid metal bead, and the electronic components are recovered. These materials can then be reused to fabricate a new PCBA. We present the DissolvPCB workflow, characterize its design parameters, evaluate the performance of circuits produced with it, and quantify its environmental impact through a lifecycle assessment (LCA) comparing it to conventional CNC-milled FR-4 boards. We further develop a software plugin that automatically converts PCB design files into 3D-printable circuit substrate models. To demonstrate the capabilities of DissolvPCB, we fabricate and recycle three functional prototypes: a Bluetooth speaker featuring a double-sided PCB, a finger fidget toy with a 3D circuit topology, and a shape-changing gripper enabled by Joule-heating-driven 4D printing. The paper concludes with a discussion of current technical limitations and opportunities for future directions.
Papers
Yue Lyu, Xizi Wang, Hanlu Ma, Yalong Yang, Jian Zhao
Effective communication between pilots and air traffic control (ATC) is essential for aviation safety, but verbal exchanges over radios are prone to miscommunication, especially under high workload conditions. While cockpit-embedded visual aids offer the potential to enhance ATC communication, little is known about how to design and integrate such aids. We present an exploratory, user-centered investigation into the design and integration of icon-based visual aids, named ATCion, to support in-cockpit ATC communication, through four phases involving 22 pilots and 1 ATC controller. This study contributes a validated set of design principles and visual icon components for ATC messages. In a comparative study of ATCion, text-based visual aids, and no visual aids, we found that our design improved readback accuracy and reduced memory workload, without negatively impacting flight operations; most participants preferred ATCion over text-based aids, citing their clarity, low cognitive cost, and fast interpretability. Further, we point to implications and opportunities for integrating icon-based aids into future multimodal ATC communication systems to improve both safety and efficiency.
BIOGEM: A Fully Biodegradable Gelatin-Based McKibben Actuator with Embedded Sensing
Gaolin Ge, Haoran Lu, Yingting Gao, Qifeng Yang, Josiah Hester, Tingyu Cheng, Yiyue Luo
We present BIOGEM, a fully biodegradable McKibben actuator with integrated sensing, made from gelatin-based composites. By tailoring the material compositions, we customize the mechanical and electrical properties of the biodegradable composites, creating an integrated biodegradable system that combines actuation and sensing functionalities. BIOGEM integrates a McKibben actuating structure by using stiff gelatin as the outer braiding and stretchable gelatin as the air chambers. It also integrates resistive strain sensing through ionic gelatin, allowing the actuator to monitor its own deformation without relying on conventional electronics. We characterize the actuator’s performance across key parameters including braid angle, wall thickness, and material stiffness, demonstrating reliable contraction and repeatable force output at low pressures. Biodegradation is validated through both enzyme-assisted and backyard soil studies, confirming the material’s sustainable end-of-life behavior under realistic conditions. We illustrate the potential of this platform through interactive, edible, and environmentally-degradable prototypes across human–computer interaction and soft robotics scenarios.
CoSight: Exploring Viewer Contributions to Online Video Accessibility Through Descriptive Commenting
Ruolin Wang, Xingyu Bruce Liu, Biao Wang, Wayne Zhang, Ziqian Liao, Ziwen Li, Amy Pavel, Xiang Chen
The rapid growth of online video content has outpaced efforts to make visual information accessible to blind and low vision (BLV) audiences. While professional Audio Description (AD) remains the gold standard, it is costly and difficult to scale across the vast volume of online media. In this work, we explore a complementary approach to broaden participation in video accessibility: engaging everyday video viewers as they watch and comment. We introduce CoSight, a Chrome extension that augments YouTube with lightweight, in-situ nudges to support descriptive commenting. Drawing from Fogg’s Behavior Model, CoSight provides visual indicators of accessibility gaps, pop-up hints for what to describe, reminders to clarify vague comments, and related captions and comments as references. In an exploratory study with 48 sighted users, CoSight helped integrate accessibility contributions into natural viewing and commenting practices, resulting in 89% of comments including grounded visual descriptions. Follow-up interviews with four BLV viewers and four professional AD writers suggest that while such comments do not match the rigor of professional AD, they can offer complementary value by conveying visual context and emotional nuance for understanding the videos.
DropPop: Designing Drop-to-Deploy Mechanisms with Bistable Scissors Structures
Yibo Fu, Emily Guan, Jianzhe Gu, Dinesh K Patel, Justin U Soza Soto, Yichi Luo, Carmel Majidi, Josiah Hester, Lining Yao
Deployable structures often rely on complex deployment mechanisms such as external pneumatic pumps, electric motors, or manual assembly. These conventional methods, which are intended for applications in shape morphing architectures, robotics, and product design, can be bulky and unwieldy for everyday interaction and daily use. We introduce a new class of deployable structures that harness the locomotion of a single bistable cap to drive the expansion of a scissor-like mechanism. Such structures can be rapidly deployed (0.2-0.7s) by a small trigger and stabilize themselves, requiring no sustained energy input. We explore various input modalities for deployment, such as hand dropping and drone deployment, and showcase demo applications. Additionally, we provide a computational design tool for customizing shape primitives with physics simulation and offer design guidelines for fabrication.
ForcePinch: Force-Responsive Spatial Interaction for Tracking Speed Control in XR
Chenyang Zhang, Tiffany S Ma, John Andrews, Eric J Gonzalez, Mar Gonzalez-Franco, Yalong Yang
Spatial interaction in 3D environments requires balancing efficiency and precision, which calls for dynamic tracking speed adjustments. However, existing techniques often couple tracking speed adjustments directly with hand movements, reducing interaction flexibility. Inspired by the natural friction control inherent in the physical world, we introduce ForcePinch, a novel force-responsive spatial interaction method that enables users to intuitively modulate pointer tracking speed and smoothly transition between rapid and precise movements by varying their pinching force. To implement this concept, we developed a hardware prototype integrating a pressure sensor with a customizable mapping function that translates pinching force into tracking speed adjustments. We conducted a user study with 20 participants performing well-established 1D, 2D, and 3D object manipulation tasks, comparing ForcePinch against the distance-responsive technique Go-Go and the speed-responsive technique PRISM. Results highlight distinctive characteristics of the force-responsive approach across different interaction contexts. Drawing on these findings, we highlight the contextual meaning and versatility of force-responsive interactions through four illustrative examples, aiming to inform and inspire future spatial interaction design.
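The pinch-force-to-tracking-speed mapping described in the abstract can be sketched as a simple monotone gain function. All constants, thresholds, and names below are illustrative assumptions, not the paper's actual calibration:

```python
def force_to_gain(force, f_min=0.5, f_max=4.0, g_slow=0.2, g_fast=2.5):
    """Map pinch force (in newtons) to a pointer tracking-speed gain.

    Below f_min the gain stays at g_slow (precise mode); above f_max it
    saturates at g_fast (rapid mode); in between, a smoothstep curve gives
    a gentle transition. Constants are illustrative, not from the paper.
    """
    if force <= f_min:
        return g_slow
    if force >= f_max:
        return g_fast
    t = (force - f_min) / (f_max - f_min)
    t = t * t * (3.0 - 2.0 * t)  # smoothstep easing
    return g_slow + t * (g_fast - g_slow)

# Pointer displacement = hand displacement * gain:
print(force_to_gain(0.3))  # light pinch -> 0.2 (precise)
print(force_to_gain(5.0))  # hard pinch -> 2.5 (rapid)
```

The flat regions at both ends mirror the abstract's goal of smooth transitions between precise and rapid movement rather than an abrupt mode switch.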
Adam J Coscia, Shunan Guo, Eunyee Koh, Alex Endert
As multi-turn dialogues with large language models (LLMs) grow longer and more complex, how can users better evaluate and review progress on their conversational goals? We present OnGoal, an LLM chat interface that helps users better manage goal progress. OnGoal provides real-time feedback on goal alignment through LLM-assisted evaluation, explanations for evaluation results with examples, and overviews of goal progression over time, enabling users to navigate complex dialogues more effectively. Through a study with 20 participants on a writing task, we evaluate OnGoal against a baseline chat interface without goal tracking. Using OnGoal, participants spent less time and effort to achieve their goals while exploring new prompting strategies to overcome miscommunication, suggesting tracking and visualizing goals can enhance engagement and resilience in LLM dialogues. Our findings inspired design implications for future LLM chat interfaces that improve goal communication, reduce cognitive load, enhance interactivity, and enable feedback to improve LLM performance.
Posters
MILO: An LLM Multi-Stage Conversational Agent for Fostering Teenagers’ Mental Resilience
Han Bao, Yongan Yu, Bohan Wang, Xiaowen Lu, Xin Tong
Adolescence is a significant period that shapes long-term development and well-being. Mental disorders contribute to 15% of the global disease burden among teenagers, according to the WHO. Adverse well-being during adolescence can not only compromise physical health but also lead to a wide range of negative social outcomes throughout life. Motivated by the potential of generative AI conversational agents to provide scalable and personalized support for cultivating mental resilience, we designed Milo, an LLM digital companion grounded in cognitive behavioral therapy (CBT) and tailored specifically for teenagers. Milo promotes greater involvement of teenagers in developing emotional awareness and resilience strategies through agent customization and an interactive interface.
Noetic Dream: A Personalized VR and Meditation System for Lucid Dream Training
Yichen Yu, Qiaoran Wang
Lucid dreaming relies on a high level of metacognition and requires significant time and effort to master induction techniques, presenting obstacles for those seeking such experiences. This study proposes Noetic Dream, a personalized lucid dreaming training system that combines virtual reality (VR) with open-monitoring (OM) meditation, acting on the mechanism of “dream awareness” through both external and internal pathways. VR provides immersive dream-based games to help users practice identifying unrealistic states, while OM meditation stabilizes internal focus and implants lucid intent. The training cycle uses multimodal cues to help users establish dream recognition mechanisms, thereby increasing the likelihood of lucid dreaming. The contributions of this study include: applying large language models (LLMs) to construct dream VR scenarios, designing dream anomaly detection game mechanisms to stimulate dream awareness, and integrating OM meditation to achieve a non-invasive lucid dreaming training pathway, thereby effectively increasing the probability of spontaneous lucid dreaming.

Telecommunications Policy Research Conference
Washington, D.C. | Sept. 18–20, 2025
Data Governance
This paper focuses on cross-border data flow regulations regarding Connected Vehicles (CVs) in the People’s Republic of China (PRC), the European Union (EU), and the United States of America (USA). The paper reviews the engineering-cybersecurity literature regarding CVs and derives from this a classification of data types generated by the CV ecosystem. It then analyzes the legal and policy texts regarding CVs from the three jurisdictions. By mapping the data types to each jurisdiction’s restrictions and regulations, the paper unpacks how they conceptualize the risks or threats from CV data and how they operationalize these concerns into CV data regulation. The paper’s objective is to provide a detailed examination of the similarities and differences among the three jurisdictions. We discover that governments’ attempts to regulate data flows push them into classification systems for information, and that governments attach different values or policy interests to these categories.
Platforms and Competition
Interconnection and Rivalry in Global Monetary Networks
Karim Farhat, Milton L. Mueller, Vagisha Srivastava
In this white paper, we apply concepts of network competition to analyze the contest for dominance between the US dollar, a BRICS alliance against the dollar, and a politically neutral money like Bitcoin.
Global money networks have network externalities; a currency becomes more valuable as more users in more countries accept it and use it. Users thus tend to converge on a single, dominant network for payments that maximizes their demand-side economies of scope. Drawing on empirical evidence from telecommunications competition and network externality theory, we show that when three systems with network externalities compete, an interconnection agreement between the dominant system and one of the two competitors can isolate and exclude the third system. We analyze the governance of dollar stablecoins as the monetary equivalent of an interconnection agreement between the fiat dollar and Bitcoin. We argue that the fiat dollar can strengthen its global dominance by fostering a stronger interconnection with Bitcoin via dollar stablecoins.
Dollar stablecoins are the optimal conversion asset between a liquid medium of exchange like the dollar and a less liquid store of value like Bitcoin. With a formal interconnection between dollar stablecoins and Bitcoin, demand-side economies of scope are shared, and strong complementarities become evident. Stablecoins serve as a medium of remittance and short-term savings, while Bitcoin serves as a longer-term store of value or speculative asset, much as gold does. At the same time, an interconnection agreement acts as an implicit check, imposing fiscal discipline on US dollar governance. If the dollar weakens excessively, a positive feedback loop ensues: the more users diversify into Bitcoin, the more its price appreciates, drawing still more value away from the fiat dollar.
As such, we argue policymakers should proactively foster an interconnection between dollar stablecoins and Bitcoin to strengthen the US dollar’s global dominance and forestall long-term threats to its hegemony. The interconnection agreement should center around:
• Designing a federal regulatory framework for stablecoins centered on open capital markets — without picking favorites.
• Incentivizing stablecoin operators to reduce short-term bonds in favor of longer-term securities and harder assets, enhancing stability and market confidence.
• Encouraging emerging markets and BRICS nations to freely access dollar stablecoins and Bitcoin as reliable stores of value, depending on their needs.
• Eliminating capital gains and tax reporting requirements for long-term Bitcoin saving and long-term Bitcoin-to-dollar-stablecoin conversions, to retain capital in the United States and simultaneously encourage more dollar exports for the foreseeable future.
By pursuing these policies, the dollar’s network advantage can be reinforced, ensuring it remains the dominant currency in an increasingly contested global monetary landscape.
Routing Security Adoption
The Role of RIRs in RPKI Adoption
Josephine Wolff, Cecilia Testart
Recognizing the relevance of securing inter-domain routing to protect traffic flows in the Internet, the Internet Engineering Task Force (IETF) standardized the Resource Public Key Infrastructure (RPKI), a framework that provides networks with a system to cryptographically validate routing data. Despite many obstacles, RPKI has emerged as the consensus approach to improving routing security, and currently about 50% of routed IP address blocks are part of the system. The Regional Internet Registries (RIRs) are in charge of allocating address space in five different geographical zones and play a crucial role in RPKI: they are the roots of trust of the cryptographic system and provide the infrastructure to host RPKI certificates and keys for the Internet resources allocated in their region. Organizations and networks wanting to issue RPKI records for their address space need to follow the process from the RIR that delegated their address space. In this paper, we analyze the RIRs’ implementation of RPKI infrastructure from the perspective of network operators. Based on in-depth interviews with 13 network engineers who have been involved in their organizations’ efforts to adopt RPKI, we examine the RIR initiatives
that have or would have most supported RPKI adoption for different types of organizations. Given RIRs have independently developed and implemented the cryptographic infrastructure as well as the tooling to issue and manage certificates, we offer recommendations on strategies that have encouraged RPKI adoption.
Satellite and Space Networks
Are LEO Networks the Future of National Emergency Failover? – A Quantitative Study and Policy Blueprint
Vaibhav Bhosale, Zachary Bischof, Fabián E. Bustamante, Ying Zhang, Sameer Kapoor, Robin Kim, Miguel Schlicht, Muskaan Gupta, Ekaterina Tumanova, Alberto Dainotti, Ahmed Saeed
Low Earth Orbit (LEO) satellite networks are emerging as backups for national-scale outages. While they have demonstrated value in small-scale disasters such as supporting first responders during hurricanes, their effectiveness during large-scale infrastructure failures remains underexplored. This paper evaluates the capacity of LEO networks to act as national failover infrastructure using six real-world submarine cable failures. The failure capacity provided by a LEO network to a specific nation depends on a few key factors: the size of the country, the distribution of the user terminals, and the policies of the network operator for spectrum allocation and traffic engineering. We find that coordinated policies between governments and network operators, especially regarding terminal placement and spectrum use, can improve failover capacity by up to 1.8× without requiring additional infrastructure. However, even under optimistic conditions with 200,000 terminals and a dedicated failover network, LEO networks can only restore 0.9–14.7% of lost submarine cable capacity in most cases.
User-Generated Content
The Impact of Premium Licenses on Creator Behavior
Jae Sang Rhee
The creator economy relies on third-party free-sharing platforms, which enable creators to reach wide audiences and enhance monetization opportunities. However, creators often remain uncompensated, their visibility declines due to content oversaturation, and the unauthorized use of their work poses significant risks, as training datasets vital to artificial intelligence (AI) frequently draw from freely accessible creator content. These issues directly harm both creators and platforms. Some platforms introduced a premium license, offering subscription-based exclusive content, upfront creator payments, and enhanced copyright protection. This paper investigates the impact of premium licensing on creator behavior by leveraging a unique natural experiment. Using data from Unsplash and Pexels, we find that introducing premium licenses on free-sharing platforms reduces the volume of freely available content by 13.2%. Notably, this decline is observed even among creators who were not admitted to the premium license program. We further identify two mechanisms driving this decline. First, reduced multi-homing occurs as existing creators deactivate accounts and move away from the platform offering the premium license. Second, creators improve free content quality to stay competitive with premium offerings. Our findings highlight crucial trade-offs associated with premium licensing, demonstrating significant unintended consequences for content volume and quality. These issues directly impact both creators and platforms, underscoring the importance of strategic policy design in platform monetization.

Research Activities
A Common’s Approach to Cybersecurity Policy (Tutorial)
Vaibhav Garg, Comcast; Holly Peterson, Louisiana State University; and Milton Mueller, Georgia Tech
There are two dominant paradigms in tech policy. The first assumes technology outcomes to be a public good and grounds policy interventions in regulatory responses. The second asserts these outcomes to be private goods and targets policy solutions that address market incentives as well as associated dynamics. Yet the interconnected nature of telecommunications technologies, such as the Internet, as well as the correlated nature of associated risks, such as cybersecurity, means that there is a third option. This third way assumes technology outcomes to be common pool resources. Untrammeled extraction of these resources may lead to a Tragedy of the Commons. Numerous institutions across distinct domains have been able to avoid said Tragedy by investing in community-based governance. Research documenting the commonalities between such institutions led to Elinor Ostrom’s 2009 Nobel Prize-winning work, the Institutional Analysis and Development (IAD) framework.
Despite IAD’s successful application in many risk domains, its formal application to Telecommunications Policy, especially in cybersecurity, has been underexplored. Yet, telecommunications policy stakeholders – especially in emerging technologies – will often leverage community-based interventions. Applying IAD to such interventions may provide significant insights, making community-based governance both more effective and efficient. Furthermore, formal application of IAD to Telecommunications Policy may open opportunities for new policy solutions in cybersecurity. The goal of this workshop is to introduce TPRC attendees to the IAD framework and teach its application to cybersecurity.
The Regulatory Challenge of Artificial Intelligence (Panel)
The character of generative AI technologies presents unique challenges to traditional regulatory paradigms. The panel participants have been conducting research in this field and will report briefly on their recent findings to provoke discussion among the panel members and audience.
Topics include: the intersection of intellectual property rights with AI; the framing of AI ethics in its social, economic, and political contexts; the regulatory ramifications of the potential existential risk of AI systems; current regulatory models in the U.S. and Europe; and a view of AI as distributed computing.
Panelists:
Russ Neuman, New York University
Christopher Yoo, University of Pennsylvania
Christos Makridis, Stanford University
Chloé Bakalar, Meta
Milton L. Mueller, Georgia Institute of Technology

USENIX Security Symposium
Seattle | August 13 – 15, 2025
Hardware Security 1: Microarchitectures
FLOP: Breaking the Apple M3 CPU via False Load Output Predictions
Jason Kim, Jalen Chuang, Daniel Genkin, Yuval Yarom
To bridge the ever-increasing gap between the fast execution speed of modern processors and the long latency of memory accesses, CPU vendors continue to introduce newer and more advanced optimizations. While these optimizations improve performance, research has repeatedly demonstrated that they may also have an adverse impact on security. In this work, we identify that recent Apple M- and A-series processors implement a load value predictor (LVP), an optimization that predicts the contents of memory that the processor loads before the contents are actually available. This allows processors to alleviate slowdowns from Read-After-Write dependencies, as instructions can now be executed in parallel rather than sequentially. To evaluate the security impact of Apple’s LVP implementation, we first investigate the implementation, identifying the conditions for prediction. We then show that although the LVP cannot directly predict 64-bit values (e.g., pointers), prediction of smaller-size values can be leveraged to achieve arbitrary memory access. Finally, we demonstrate end-to-end exploit chains that build on the LVP to obtain a 64-bit read primitive within the Safari and Chrome browsers.
Hardware Security 3: Side-Channel and Fault Injection Attacks
ECC.fail: Mounting Rowhammer Attacks on DDR4 Servers with ECC Memory
Nureddin Kamadan, Walter Wang, Stephan van Schaik, Christina Garman, Daniel Genkin, Yuval Yarom
Rowhammer is a hardware vulnerability present in nearly all computer memory, allowing attackers to modify bits in memory without directly accessing them. While Rowhammer has been extensively studied on client and even mobile platforms, no successful Rowhammer attack has been demonstrated on server platforms using DDR4 ECC memory. Tackling this challenge, in this paper we demonstrate the first end-to-end Rowhammer technique effective against Intel servers using Hynix DDR4 ECC memory. To that end, we first characterize the Hynix implementation of Target Row Refresh (TRR) on server parts, demonstrating effective hammering patterns on both FPGA and Intel-based testing platforms with ECC disabled. We then reverse engineer Intel’s ECC implementation on Skylake and Cascade Lake servers. We find that it has a coding distance of four, which often allows triggering incorrect ECC correction with just two bit flips. Combining the two observations, we present an end-to-end Rowhammer attack which can flip bits on Intel servers, without causing crashes. Finally, we demonstrate the effectiveness of our attack by hammering RSA public keys loaded into memory, causing the server to accept messages not signed by the original key.
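The distance-four observation can be illustrated with a toy code. The sketch below uses an extended Hamming(8,4) code (which also has minimum distance four; it is not Intel's actual ECC) to show that two bit flips can leave a word exactly as close to a different codeword as to the original, so a decoder forced to pick a nearest codeword can miscorrect:

```python
from itertools import product, combinations

def hamming_84_encode(d):
    """Extended Hamming(8,4): Hamming(7,4) plus an overall parity bit."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    word = [p1, p2, d1, p3, d2, d3, d4]
    word.append(sum(word) % 2)  # overall parity raises distance to 4
    return tuple(word)

codewords = [hamming_84_encode(d) for d in product((0, 1), repeat=4)]

def dist(a, b):
    return sum(x != y for x, y in zip(a, b))

# The code's minimum distance is 4.
assert min(dist(a, b) for a in codewords for b in codewords if a != b) == 4

def find_ambiguous_flip(c):
    """Find two bit flips leaving the word distance 2 from another codeword."""
    for i, j in combinations(range(8), 2):
        corrupted = list(c)
        corrupted[i] ^= 1
        corrupted[j] ^= 1
        corrupted = tuple(corrupted)
        rivals = [w for w in codewords if w != c and dist(corrupted, w) == 2]
        if rivals:
            return i, j, rivals
    return None

i, j, rivals = find_ambiguous_flip(codewords[5])
print(f"flips {i},{j}: corrupted word is distance 2 from {len(rivals)} "
      f"other codeword(s) -> a nearest-codeword decoder may miscorrect")
```

Since any two codewords at distance four share a four-bit difference, flipping two of those bits yields a word equidistant from both, which is the ambiguity the attack exploits.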
Privacy 1: Differential Privacy and Audit
General-Purpose f-DP Estimation and Auditing in a Black-Box Setting
Önder Askin, Holger Dette, Martin Dunsche, Tim Kutta, Yun Lu, Yu Wei, Vassilis Zikas
In this paper we propose new methods to statistically assess f-Differential Privacy (f-DP), a recent refinement of differential privacy (DP) that remedies certain weaknesses of standard DP (including tightness under algorithmic composition). A challenge when deploying differentially private mechanisms is that DP is hard to validate, especially in the black-box setting. This has led to numerous empirical methods for auditing standard DP, while f-DP remains less explored. We introduce new black-box methods for f-DP that, unlike existing approaches for this privacy notion, do not require prior knowledge of the investigated algorithm. Our procedure yields a complete estimate of the f-DP trade-off curve, with theoretical guarantees of convergence. Additionally, we propose an efficient auditing method that empirically detects f-DP violations with statistical certainty, merging techniques from non-parametric estimation and optimal classification theory. Through experiments on a range of DP mechanisms, we demonstrate the effectiveness of our estimation and auditing procedures.
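The idea of tracing an empirical trade-off curve from black-box samples can be sketched with the Gaussian mechanism, where threshold tests are optimal by the Neyman-Pearson lemma. This toy version carries none of the paper's convergence guarantees; it only shows the shape of the estimation problem:

```python
import random

random.seed(0)

# Black-box samples: a Gaussian mechanism (sensitivity 1, noise scale sigma)
# run on two adjacent datasets whose true sums differ by 1.
sigma = 1.0
n = 20000
samples_d0 = [random.gauss(0.0, sigma) for _ in range(n)]  # M(D)
samples_d1 = [random.gauss(1.0, sigma) for _ in range(n)]  # M(D')

def errors(t):
    """Type I / type II errors of the threshold test 'output > t says D''."""
    alpha = sum(x > t for x in samples_d0) / n   # false alarm on D
    beta = sum(x <= t for x in samples_d1) / n   # miss on D'
    return alpha, beta

# Sweep thresholds to trace an empirical (alpha, beta) trade-off curve.
curve = [errors(x / 10) for x in range(-20, 31)]
alphas, betas = zip(*curve)

# Any valid trade-off curve is monotone: beta rises as alpha falls.
assert all(b2 >= b1 for b1, b2 in zip(betas, betas[1:]))
assert all(a2 <= a1 for a1, a2 in zip(alphas, alphas[1:]))
```

For this mechanism the empirical curve approaches the 1-GDP trade-off function; a general black-box estimator, as in the paper, must also handle mechanisms where the optimal test is unknown.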
Privacy 2: Consent, Compliance, and Provable Privacy
Evaluating Privacy Policies under Modern Privacy Laws At Scale: An LLM-Based Automated Approach
Qinge Xie, Karthik Ramakrishnan, Frank Li
Website privacy policies detail an online service’s information practices, including how they handle user data and rights. For many sites, these disclosures are now necessitated by a growing set of privacy regulations, such as GDPR and multiple US state laws, offering visibility into privacy practices that are often not publicly observable. Motivated by this visibility, prior work has explored techniques for automated analysis of privacy policies and characterized specific aspects of real-world policies on a larger scale. However, existing approaches are constrained in the privacy practices they evaluate, as they rely upon rule-based methods or supervised classifiers, and many predate the prominent privacy laws now enacted that drastically shape privacy disclosures. Thus, we lack a comprehensive understanding of modern website privacy practices disclosed through privacy policies. In this work, we seek to close this gap by providing a systematic and comprehensive evaluation of website privacy policies at scale. We first systematize the privacy practices discussed by 10 notable privacy regulations currently in effect in the European Union and the US, identifying 34 distinct clauses on privacy practices across 4 overarching themes. We then develop and evaluate an LLM-based approach for assessing these clauses in privacy policies, providing a more accurate, comprehensive, and flexible analysis compared to prior techniques. Finally, we collect privacy policies from over 100K websites, and apply our LLM method to a subset of sites to investigate in-depth the privacy practices of websites today. Ultimately, our work supports broader investigations into web privacy practices moving forward.
Software Security 3: Fuzzing
Hybrid Language Processor Fuzzing via LLM-Based Constraint Solving
Yupeng Yang, Shenglong Yao, Jizhou Chen, Wenke Lee
Language processors, such as compilers and interpreters, play a crucial role in modern cyberspace. Faulty language processors can lead to severe consequences such as incorrect functionalities or malicious attacks. It is non-trivial to automatically test language processors to detect faulty behaviors, because language processors are multistaged and require various complex constraints to reach deep program states. Existing testing (fuzzing) approaches either fail to effectively generate inputs that satisfy the complex constraints or fail to generalize due to their heavy reliance on target-specific constraint modeling heuristics. In this paper, we explore the potential of using LLMs for constraint solving to address these limitations and identify two challenges regarding constraint prioritization and context construction. To effectively address these challenges, we propose two novel solutions, hybrid centrality prioritization and iterative context construction. We implement the solutions in a hybrid fuzzing framework, HLPFuzz, which leverages an LLM to overcome complex constraints and reach deep program states. In our evaluation, HLPFuzz successfully discovers 52 bugs in 9 popular language processors, of which 37 are confirmed and 14 are fixed. HLPFuzz also outperforms state-of-the-art solutions by up to 190% in code coverage and discovers 5x more bugs than the second-best fuzzer, with minimal reliance on target-specific heuristics.
Waltzz: WebAssembly Runtime Fuzzing with Stack-Invariant Transformation
Lingming Zhang, Binbin Zhao, Jiacheng Xu, Peiyu Liu, Qinge Xie, Yuan Tian, Jianhai Chen, Shouling Ji
WebAssembly (Wasm) is a binary instruction format proposed by major browser vendors to achieve near-native performance on the web and other platforms. By design, Wasm modules should be executed in a memory-safe runtime, which acts as a trusted computing base. Therefore, security vulnerabilities inside runtime implementation can have severe impacts and should be identified and mitigated promptly. Fuzzing is a practical and widely adopted technique for uncovering bugs in real-world programs. However, to apply fuzzing effectively to the domain of Wasm runtimes, it is vital to address two primary challenges: (1) Wasm is a stack-based language and runtimes should verify the correctness of stack semantics, which requires fuzzers to meticulously maintain desired stack semantics to reach deeper states. (2) Wasm acts as a compilation target and includes hundreds of instructions, making it hard for fuzzers to explore different combinations of instructions and cover the input space effectively. To address these challenges, we design and implement Waltzz, a practical greybox fuzzing framework tailored for Wasm runtimes. Specifically, Waltzz proposes the concept of stack-invariant code transformation to preserve appropriate stack semantics during fuzzing. Next, Waltzz introduces a versatile suite of mutators designed to systematically traverse diverse combinations of instructions in terms of both control and data flow. Moreover, Waltzz designs a skeleton-based generation algorithm to produce code snippets that are rarely seen in the seed corpus. To demonstrate the efficacy of Waltzz, we evaluate it on seven well-known Wasm runtimes. Compared to state-of-the-art works, Waltzz surpasses the nearest competitor, achieving 12.4% more code coverage even on large code bases and uncovering 1.38x more unique bugs. Overall, Waltzz has discovered 20 new bugs, all of which have been confirmed, with 17 CVE IDs assigned.
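The stack-invariant idea can be shown with a toy single-typed stack model. Real Wasm validation also tracks value types and control frames, and Waltzz's actual mutators are richer; the sketch below only demonstrates why a net-zero splice keeps a module valid:

```python
# Toy model: each op is (pops, pushes) on a single-typed value stack.
OPS = {
    "i32.const": (0, 1),
    "i32.add":   (2, 1),
    "drop":      (1, 0),
}

def stack_effect(code):
    """Validate an op sequence; return final stack height, or None if a
    runtime validator would reject it for stack underflow."""
    height = 0
    for name in code:
        pops, pushes = OPS[name]
        if height < pops:
            return None
        height += pushes - pops
    return height

seed = ["i32.const", "i32.const", "i32.add"]

# A stack-invariant mutation: splice in a snippet whose net stack effect is
# zero and which never dips below the current height, so validation passes.
neutral = ["i32.const", "i32.const", "i32.add", "drop"]
mutated = seed[:2] + neutral + seed[2:]

assert stack_effect(seed) == stack_effect(mutated) == 1
```

Because the spliced snippet is self-balancing, arbitrarily many such mutations compose without ever producing a module the runtime's validator rejects, letting the fuzzer reach deeper execution states.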

ACM Conference on International Computing Education Research
Charlottesville | August 3 – 6, 2025
Doctoral Consortium
Ethical Computing Education in the Age of Generative AI
Grace Barkhuff
Educating computing students in ethical practices is vitally important. This education is complicated by the rapid rise of generative AI (GenAI) and its use in higher education by students and instructors alike. My research aims to understand computing educators’ perceptions of ethically educating computing students, both about and with GenAI.
Lightning Talks and Posters
Benchmarking of Generative AI Tools in Software Engineering Education: Formative Insights for Curriculum Integration
Nimisha Roy, Oleksandr Horielko, Fisayo Omojokun
Exploring Community Perceptions and Experiences Towards Academic Dishonesty in Computing Education
Chandler C. Payne, Kai A. Hackney, Lucas Guarenti Zangari, Emmanuel Munoz, Sterling R. Kalogeras, Juan Sebastián Sánchez-Gómez, Fisayo Omojokun, Pedro Guillermo Feijóo-García
Should I Submit or Should I Not? Exploring the Effects of Mandatory vs. Voluntary Tasks on Student Engagement in Computing Education
Lucas Guarenti Zangari, Emilio Aponte-Archila, Pedro Guillermo Feijóo-García
What Computing Faculty Want: Designing AI Tools for High-Enrollment Courses Beyond CS1
Rodrigo Borela, Meryem Yilmaz Soylu, Jeonghyun Lee, Nimisha Roy

International Conference on Machine Learning
Vancouver | July 13 – 19, 2025
Algorithms
Learning to Stop: Deep Learning for Mean Field Optimal Stopping
Lorenzo Magnino, Yuchen Zhu, Mathieu Lauriere
Optimal stopping is a fundamental problem in optimization with applications in risk management, finance, robotics, and machine learning. We extend the standard framework to a multi-agent setting, named multi-agent optimal stopping (MAOS), where agents cooperate to make optimal stopping decisions in a finite-space, discrete-time environment. Since solving MAOS becomes computationally prohibitive as the number of agents grows large, we study the mean-field optimal stopping (MFOS) problem, obtained as the number of agents tends to infinity. We establish that MFOS provides a good approximation to MAOS and prove a dynamic programming principle (DPP) based on mean-field control theory. We then propose two deep learning approaches: one that learns optimal stopping decisions by simulating full trajectories and another that leverages the DPP to compute the value function and to learn the optimal stopping rule using backward induction. Both methods train neural networks to approximate optimal stopping policies. We demonstrate the effectiveness and the scalability of our work through numerical experiments on 6 different problems in spatial dimension up to 300. To the best of our knowledge, this is the first work to formalize and computationally solve MFOS in discrete time and finite space, opening new directions for scalable MAOS methods.
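The dynamic programming principle behind backward induction for optimal stopping can be illustrated with a tabular, single-agent toy (the paper's setting is mean-field, with neural networks replacing the table): V_t(x) = max(g(x), E[V_{t+1}(X_{t+1}) | x]), with V_T = g. The state space, dynamics, and reward below are made up for illustration:

```python
# Tabular backward induction for a toy optimal stopping problem.
T = 5                            # horizon
states = range(6)                # toy finite state space
reward = {x: x for x in states}  # stop reward g(x) = x

def next_states(x):
    """Toy dynamics: move to x-1 or x+1 with equal probability, clipped."""
    lo, hi = max(x - 1, 0), min(x + 1, 5)
    return [(lo, 0.5), (hi, 0.5)]

V = {x: reward[x] for x in states}        # terminal condition V_T = g
for t in reversed(range(T)):              # DPP: compare stop vs. continue
    V = {x: max(reward[x],
                sum(p * V[y] for y, p in next_states(x)))
         for x in states}

print(V)  # V_0: value of stopping optimally from each initial state
```

The paper's second method follows this same backward recursion, but approximates each V_t with a neural network over the mean-field state rather than a lookup table.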
Mustafa Burak Gurbuz, Xingyu Zheng, Constantine Dovrolis
As deep learning continues to be driven by ever-larger datasets, understanding which examples are most important for generalization has become a critical question. While progress in data selection continues, emerging applications require studying this problem in dynamic contexts. To bridge this gap, we pose the Incremental Data Selection (IDS) problem, where examples arrive as a continuous stream, and need to be selected without access to the full data source. In this setting, the learner must incrementally build a training dataset of predefined size while simultaneously learning the underlying task. We find that in IDS, the impact of a new sample on the model state depends fundamentally on both its geometric relationship in the feature space and its prediction error. Leveraging this insight, we propose PEAKS (Prediction Error Anchored by Kernel Similarity), an efficient data selection method tailored for IDS. Our comprehensive evaluations demonstrate that PEAKS consistently outperforms existing selection strategies. Furthermore, PEAKS yields increasingly better performance returns than random selection as training data size grows on real-world datasets. The code is available at https://github.com/BurakGurbuz97/PEAKS.
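The insight that a streamed sample's value depends on both its prediction error and its feature-space geometry can be sketched as a scoring rule. The combination below is an illustrative stand-in, not PEAKS's actual criterion:

```python
import math

def rbf(u, v, gamma=1.0):
    """RBF kernel similarity between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def selection_score(feature, error, selected_features, gamma=1.0):
    """Illustrative PEAKS-style score: favor samples with high prediction
    error that are also dissimilar to the buffered set (low max kernel
    similarity). Not the paper's exact formula."""
    if not selected_features:
        return error
    redundancy = max(rbf(feature, f, gamma) for f in selected_features)
    return error * (1.0 - redundancy)

buffer = [(0.0, 0.0)]
# A high-error duplicate of a buffered point scores near zero ...
near = selection_score((0.0, 0.0), error=0.9, selected_features=buffer)
# ... while an equally high-error but novel point scores high.
far = selection_score((3.0, 3.0), error=0.9, selected_features=buffer)
assert near < far
```

In a streaming loop, each arriving example would be scored this way and kept only if it beats the weakest member of the fixed-size training buffer.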
Unpaired Point Cloud Completion via Unbalanced Optimal Transport
Taekyung Lee, Jaemoo Choi, Jaewoong Choi, Myungjoo Kang
Unpaired point cloud completion is crucial for real-world applications, where ground-truth data for complete point clouds are often unavailable. By learning a completion map from unpaired incomplete and complete point cloud data, this task avoids the reliance on paired datasets. In this paper, we propose the Unbalanced Optimal Transport Map for Unpaired Point Cloud Completion (UOT-UPC) model, which formulates the unpaired completion task as an Unbalanced Optimal Transport (UOT) problem. Our method employs a neural OT model to learn the UOT map. Our model is the first attempt to leverage UOT for unpaired point cloud completion, achieving competitive or superior performance on both single-category and multi-category benchmarks. In particular, our approach is especially robust under the class imbalance problem, which is frequently encountered in real-world unpaired point cloud completion scenarios.
Alignment
CollabLLM: From Passive Responders to Active Collaborators
Shirley Wu, Michel Galley, Baolin Peng, Hao Cheng, Gavin Li, Yao Dou, Weixin Cai, James Zou, Jure Leskovec, Jianfeng Gao
Large Language Models are typically trained with next-turn rewards, limiting their ability to optimize for long-term interaction. As a result, they often respond passively to ambiguous or open-ended user requests, failing to help users reach their ultimate intents and leading to inefficient conversations. To address these limitations, we introduce CollabLLM, a novel and general training framework that enhances multiturn human-LLM collaboration. Its key innovation is a collaborative simulation that estimates the long-term contribution of responses using Multiturn-aware Rewards. By reinforcement fine-tuning on these rewards, CollabLLM goes beyond responding to user requests and actively uncovers user intent and offers insightful suggestions, a key step towards more human-centered AI. We also devise a multiturn interaction benchmark with three challenging tasks, such as document creation. CollabLLM significantly outperforms our baselines, with averages of 18.5% higher task performance and 46.3% improved interactivity as rated by LLM judges. Finally, we conduct a large user study with 201 judges, where CollabLLM increases user satisfaction by 17.6% and reduces users’ time spent by 10.4%.
Applications
Generalization Principles for Inference over Text-Attributed Graphs with Large Language Models
Haoyu Wang, Shikun Liu, Rongzhe Wei, Pan Li
Large language models (LLMs) have recently been introduced to graph learning, aiming to extend their zero-shot generalization success to tasks where labeled graph data is scarce. Among these applications, inference over text-attributed graphs (TAGs) presents unique challenges: existing methods struggle with LLMs’ limited context length for processing large node neighborhoods and the misalignment between node embeddings and the LLM token space. To address these issues, we establish two key principles for ensuring generalization and derive the framework LLM-BP accordingly: (1) **Unifying the attribute space with task-adaptive embeddings**, where we leverage LLM-based encoders and task-aware prompting to enhance generalization of the text attribute embeddings; (2) **Developing a generalizable graph information aggregation mechanism**, for which we adopt belief propagation with LLM-estimated parameters that adapt across graphs. Evaluations on 11 real-world TAG benchmarks demonstrate that LLM-BP significantly outperforms existing approaches, achieving 8.10\% improvement with task-conditional embeddings and an additional 1.71\% gain from adaptive aggregation. The code and task-adaptive embeddings are publicly available.
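As a toy illustration of the second principle, the belief-propagation update that LLM-BP builds on can be sketched for binary node labels. Note this is a minimal sketch of classical loopy BP: the paper's LLM-estimated, graph-adaptive parameters are replaced here by a single hand-set coupling value.

```python
def bp_marginals(edges, priors, coupling=0.8, iters=10):
    """Loopy belief propagation for binary node labels on an undirected graph.

    priors: {node: P(label=1)} from per-node evidence (e.g. text embeddings);
    coupling: assumed probability that two neighbors share a label
    (hand-set here; LLM-estimated and graph-adaptive in LLM-BP).
    Returns approximate marginals {node: P(label=1)}.
    """
    # messages[(u, v)] = message from u to v, as (mass for label 0, label 1)
    msgs = {(u, v): (0.5, 0.5) for u, v in edges}
    msgs.update({(v, u): (0.5, 0.5) for u, v in edges})
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, []).append(v)
        nbrs.setdefault(v, []).append(u)
    for _ in range(iters):
        new = {}
        for (u, v) in msgs:
            # belief of u excluding the message coming back from v
            b0, b1 = 1.0 - priors[u], priors[u]
            for w in nbrs[u]:
                if w != v:
                    b0 *= msgs[(w, u)][0]
                    b1 *= msgs[(w, u)][1]
            # pass the belief through the pairwise coupling, then normalize
            m0 = coupling * b0 + (1 - coupling) * b1
            m1 = (1 - coupling) * b0 + coupling * b1
            z = m0 + m1
            new[(u, v)] = (m0 / z, m1 / z)
        msgs = new
    marg = {}
    for u in nbrs:
        b0, b1 = 1.0 - priors[u], priors[u]
        for w in nbrs[u]:
            b0 *= msgs[(w, u)][0]
            b1 *= msgs[(w, u)][1]
        marg[u] = b1 / (b0 + b1)
    return marg

# A confident node pulls its uncertain neighbors toward its label.
m = bp_marginals(edges=[(0, 1), (1, 2)], priors={0: 0.95, 1: 0.5, 2: 0.5})
```

On this path graph, the confident node 0 shifts nodes 1 and 2 above 0.5, with the effect decaying with distance, which is the aggregation behavior the framework relies on.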
Chemistry, Physics, and Earth Sciences
LLM-Augmented Chemical Synthesis and Design Decision Programs
Haorui Wang, Jeff Guo, Lingkai Kong, Rampi Ramprasad, Philippe Schwaller, Yuanqi Du, Chao Zhang
Retrosynthesis, the process of breaking down a target molecule into simpler precursors through a series of valid reactions, stands at the core of organic chemistry and drug development. Although recent machine learning (ML) research has advanced single-step retrosynthetic modeling and subsequent route searches, these solutions remain restricted by the extensive combinatorial space of possible pathways. Concurrently, large language models (LLMs) have exhibited remarkable chemical knowledge, hinting at their potential to tackle complex decision-making tasks in chemistry. In this work, we explore whether LLMs can successfully navigate the highly constrained, multi-step retrosynthesis planning problem. We introduce an efficient scheme for encoding reaction pathways and present a new route-level search strategy, moving beyond the conventional step-by-step reactant prediction. Through comprehensive evaluations, we show that our LLM-augmented approach excels at retrosynthesis planning and extends naturally to the broader challenge of synthesizable molecular design.
Convex
Geometric Algebra Planes: Convex Implicit Neural Volumes
Irmak Sivgin, Sara Fridovich-Keil, Gordon Wetzstein, Mert Pilanci
Volume parameterizations abound in recent literature, encompassing methods from classic voxel grids to implicit neural representations. While implicit representations offer impressive capacity and improved memory efficiency compared to voxel grids, they traditionally require training through nonconvex optimization, which can be slow and sensitive to initialization and hyperparameters. We introduce GA-Planes, a novel family of implicit neural volume representations inspired by Geometric Algebra that can be trained using convex optimization, addressing the limitations of nonconvex methods. GA-Planes models generalize many existing representations including any combination of features stored in tensor basis elements followed by a neural feature decoder, and can be adapted to convex or nonconvex training as needed for various inverse problems. In the 2D setting, we prove GA-Planes models are equivalent to a low-rank plus low-resolution matrix factorization that outperforms the classic low-rank plus sparse decomposition for fitting a natural image. In 3D, GA-Planes models exhibit competitive expressiveness, model size, and optimizability across tasks such as radiance field reconstruction, 3D segmentation, and video segmentation.
Deep Learning
Can Transformers Reason Logically? A Study in SAT Solving
Leyan Pan, Vijay Ganesh, Jacob Abernethy, Chris Esposo, Wenke Lee
We formally study the logical reasoning capabilities of decoder-only Transformers in the context of the boolean satisfiability (SAT) problem. First, we prove by construction that decoder-only Transformers can decide 3-SAT, in a non-uniform model of computation, using backtracking and deduction via Chain-of-Thought (CoT). Second, we implement our construction as a PyTorch model with a tool (PARAT) that we designed to empirically demonstrate its correctness and investigate its properties. Third, rather than \textit{programming} a transformer to reason, we evaluate empirically whether it can be \textit{trained} to do so by learning directly from algorithmic traces (“reasoning paths”) from our theoretical construction. The trained models demonstrate strong out-of-distribution generalization on problem sizes seen during training but have limited length generalization, which is consistent with the implications of our theoretical result.
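The two ingredients the construction expresses as Chain-of-Thought, deduction (unit propagation) and backtracking, are the core of a classical DPLL-style solver. A minimal sketch (not the paper's Transformer encoding) whose execution trace corresponds to such a reasoning path:

```python
def dpll(clauses, assignment=None):
    """Decide satisfiability of a CNF formula via deduction (unit
    propagation) and backtracking.

    clauses: list of lists of non-zero ints (positive = variable,
    negative = its negation), e.g. [[1, -2], [2]].
    Returns a satisfying assignment {var: bool} or None if unsatisfiable.
    """
    assignment = dict(assignment or {})
    # Deduction: repeatedly simplify and satisfy unit clauses.
    while True:
        simplified, unit = [], None
        for clause in clauses:
            lits, satisfied = [], False
            for lit in clause:
                val = assignment.get(abs(lit))
                if val is None:
                    lits.append(lit)          # still undecided
                elif (lit > 0) == val:
                    satisfied = True          # clause already true
                    break
            if satisfied:
                continue
            if not lits:
                return None                   # conflict: trigger backtracking
            simplified.append(lits)
            if len(lits) == 1:
                unit = lits[0]
        clauses = simplified
        if unit is None:
            break
        assignment[abs(unit)] = unit > 0      # forced (deduced) assignment
    if not clauses:
        return assignment                     # every clause satisfied
    # Backtracking: branch on the first unassigned variable.
    var = abs(clauses[0][0])
    for guess in (True, False):
        result = dpll(clauses, {**assignment, var: guess})
        if result is not None:
            return result
    return None

sat = dpll([[1, 2], [-1, 2], [-2, 3]])        # satisfiable instance
unsat = dpll([[1], [-1]])                     # unsatisfiable instance
```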
LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models
Dachuan Shi, Yonggan Fu, Xiangchi Yuan, Zhongzhi Yu, Haoran You, Sixu Li, Xin Dong, Jan Kautz, Pavlo Molchanov, Yingyan (Celine) Lin
Recent advancements in Large Language Models (LLMs) have spurred interest in numerous applications requiring robust long-range capabilities, essential for processing extensive input contexts and continuously generating extended outputs. As sequence lengths increase, the number of Key-Value (KV) pairs in LLMs escalates, creating a significant efficiency bottleneck. In this paper, we propose a new KV cache optimization paradigm called LaCache, a training-free method for efficient and accurate generative inference of LLMs. LaCache enables LLMs to simultaneously address both of the critical challenges in long-range modeling: robust long-range capabilities and continuous generation without running out-of-memory (OOM). Specifically, LaCache integrates two key innovations: (1) a ladder-shaped KV cache pattern that stores KV pairs not only sequentially (left-to-right within each layer) but also across layers (from shallow to deep), providing an extended span for capturing long-range dependencies under a fixed storage budget, thereby boosting long-range capabilities; and (2) an iterative compaction mechanism that progressively compresses older caches, freeing up space for new tokens within a fixed cache size. This token distance-based dynamic compression enables more effective continuous generation under constrained cache budgets. Experiments across various tasks, benchmarks, and LLM models consistently validate LaCache’s effectiveness in enhancing LLMs’ long-range capabilities. Our code is available at https://github.com/GATECH-EIC/LaCache.
Deep RL
Deep Reinforcement Learning from Hierarchical Preference Design
Alexander Bukharin, Yixiao Li, Pengcheng He, Tuo Zhao
Reward design is a fundamental, yet challenging aspect of reinforcement learning (RL). Researchers typically utilize feedback signals from the environment to handcraft a reward function, but this process is not always effective due to the varying scale and intricate dependencies of the feedback signals. This paper shows that, by exploiting certain structures, one can ease the reward design process. Specifically, we propose a hierarchical reward design framework, HERON, for two scenarios: (I) the feedback signals naturally present a hierarchy; (II) the reward is sparse, but less important surrogate feedback is available to help policy learning. Both scenarios allow us to design a hierarchical decision tree, induced by the importance ranking of the feedback signals, to compare RL trajectories. With such preference data, we can then train a reward model for policy learning. We apply HERON to several RL applications, and we find that our framework can not only train high-performing agents on a variety of difficult tasks, but also provide additional benefits such as improved sample efficiency and robustness.
Efficient Online Reinforcement Learning for Diffusion Policy
Haitong Ma, Tianyi Chen, Kai Wang, Na Li, Bo Dai
Diffusion policies have achieved superior performance in imitation learning and offline reinforcement learning (RL) due to their rich expressiveness. However, the conventional diffusion training procedure requires samples from the target distribution, which is impossible in online RL since we cannot sample from the optimal policy. Backpropagating the policy gradient through the diffusion process incurs huge computational costs and instability, and is thus expensive and not scalable. To enable efficient training of diffusion policies in online RL, we generalize the conventional denoising score matching by reweighting the loss function. The resulting Reweighted Score Matching (RSM) preserves the optimal solution and low computational cost of denoising score matching, while eliminating the need to sample from the target distribution and enabling the optimization of value functions. We introduce two tractable reweighted loss functions to solve two commonly used policy optimization problems, policy mirror descent and max-entropy policy, resulting in two practical algorithms named Diffusion Policy Mirror Descent (DPMD) and Soft Diffusion Actor-Critic (SDAC). We conduct comprehensive comparisons on MuJoCo benchmarks. The empirical results show that the proposed algorithms outperform recent online RL methods for diffusion policies on most tasks, and that DPMD improves by more than 120% over Soft Actor-Critic on Humanoid and Ant.
Foundation Models
Primitive Vision: Improving Diagram Understanding in MLLMs
Shan Zhang, Aotian Chen, Yanpeng Sun, Jindong Gu, Yi-Yu Zheng, Piotr Koniusz, Kai Zou, Anton Hengel, Yuan Xue
Mathematical diagrams have a distinctive structure. Standard feature transforms designed for natural images (e.g., CLIP) fail to process them effectively, limiting their utility in multimodal large language models (MLLMs). Current efforts to improve MLLMs have primarily focused on scaling mathematical visual instruction datasets and strengthening LLM backbones, yet fine-grained visual recognition errors remain unaddressed. Our systematic evaluation of the visual grounding capabilities of state-of-the-art MLLMs highlights that fine-grained visual understanding remains a crucial bottleneck in visual mathematical reasoning (GPT-4o exhibits a 70% grounding error rate, and correcting these errors improves reasoning accuracy by 12%). We thus propose a novel approach featuring a geometrically-grounded vision encoder and a feature router that dynamically selects between hierarchical visual feature maps. Our model accurately recognizes visual primitives and generates precise visual prompts aligned with the language model’s reasoning needs. In experiments, PRIMITIVE-Qwen2.5-7B outperforms other 7B models by 12% on MathVerse and is on par with GPT-4V on MathVista. Our findings highlight the need for better fine-grained visual integration in MLLMs. Code is available at github.com/AI4Math-ShanZhang/SVE-Math.
General Machine Learning
On the Power of Learning-Augmented Search Trees
Jingbang Chen, Xinyuan Cao, Alicia Stepin, Li Chen
We study learning-augmented binary search trees (BSTs) via Treaps with carefully designed priorities. The result is a simple search tree in which the depth of each item $x$ is determined by its predicted weight $w_x$. Specifically, each item $x$ is assigned a composite priority of $-\lfloor\log\log(1/w_x)\rfloor + U(0, 1)$, where $U(0, 1)$ is a uniform random variable. By choosing $w_x$ as the relative frequency of $x$, the resulting search trees achieve static optimality. This approach generalizes the recent learning-augmented BSTs [Lin-Luo-Woodruff ICML'22], which only work for Zipfian distributions, by extending them to arbitrary input distributions. Furthermore, we demonstrate that our method can be generalized to a B-Tree data structure using the B-Treap approach [Golovin ICALP'09]. Our search trees are also capable of leveraging localities in the access sequence through online self-reorganization, thereby achieving the working-set property. Additionally, they are robust to prediction errors and support dynamic operations, such as insertions, deletions, and prediction updates. We complement our analysis with an empirical study, demonstrating that our method outperforms prior work and classic data structures.
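A minimal sketch of the priority scheme from the abstract: a standard Treap in which each item's priority is $-\lfloor\log\log(1/w_x)\rfloor + U(0,1)$, so items with larger predicted weight float toward the root. The weights below are illustrative, not from the paper.

```python
import math
import random

class Node:
    def __init__(self, key, priority):
        self.key, self.priority = key, priority
        self.left = self.right = None

def priority(w):
    """Composite priority -floor(log log(1/w)) + U(0,1) for predicted weight w."""
    return -math.floor(math.log(math.log(1.0 / w))) + random.random()

def insert(root, key, pri):
    """Standard Treap insertion: BST order on keys, max-heap order on
    priorities, so high-weight (frequent) items end up near the root."""
    if root is None:
        return Node(key, pri)
    if key < root.key:
        root.left = insert(root.left, key, pri)
        if root.left.priority > root.priority:        # rotate right
            pivot, root.left = root.left, root.left.right
            pivot.right = root
            return pivot
    else:
        root.right = insert(root.right, key, pri)
        if root.right.priority > root.priority:       # rotate left
            pivot, root.right = root.right, root.right.left
            pivot.left = root
            return pivot
    return root

def depth(root, key, d=0):
    if root.key == key:
        return d
    return depth(root.left if key < root.key else root.right, key, d + 1)

random.seed(0)
# Illustrative relative frequencies: item 0 is hot, the rest are colder.
weights = {k: w for k, w in enumerate([0.5, 0.2, 0.1, 0.1, 0.05, 0.05])}
root = None
for k, w in weights.items():
    root = insert(root, k, priority(w))
```

Here the deterministic term buckets items by weight (item 0 gets priority in $[1,2)$, the others below $1$), so the hot item is guaranteed shallow while the uniform noise breaks ties within a bucket.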
Generative Models and Autoencoders
ConceptAttention: Diffusion Transformers Learn Highly Interpretable Features
Alec Helbling, Tuna Han Salih Meral, Benjamin Hoover, Pinar Yanardag, Polo Chau
Do the rich representations of multi-modal diffusion transformers (DiTs) exhibit unique properties that enhance their interpretability? We introduce ConceptAttention, a novel method that leverages the expressive power of DiT attention layers to generate high-quality saliency maps that precisely locate textual concepts within images. Without requiring additional training, ConceptAttention repurposes the parameters of DiT attention layers to produce highly contextualized *concept embeddings*, contributing the major discovery that performing linear projections in the output space of DiT attention layers yields significantly sharper saliency maps compared to commonly used cross-attention maps. ConceptAttention even achieves state-of-the-art performance on zero-shot image segmentation benchmarks, outperforming 15 other zero-shot interpretability methods on the ImageNet-Segmentation dataset. ConceptAttention works for popular image models and even seamlessly generalizes to video generation. Our work contributes the first evidence that the representations of multi-modal DiTs are highly transferable to vision tasks like segmentation.
Diffuse Everything: Multimodal Diffusion Models on Arbitrary State Spaces
Kevin Rojas, Yuchen Zhu, Sichen Zhu, Felix Ye, Molei Tao
Diffusion models have demonstrated remarkable performance in generating unimodal data across various tasks, including image, video, and text generation. In contrast, the joint generation of multimodal data through diffusion models is still in the early stages of exploration. Existing approaches heavily rely on external preprocessing protocols, such as tokenizers and variational autoencoders, to harmonize varied data representations into a unified, unimodal format. This process demands highly accurate encoders and decoders, which can be problematic for applications with limited data. To lift this restriction, we propose a novel framework for building multimodal diffusion models on arbitrary state spaces, enabling native generation of coupled data across different modalities. By introducing an innovative decoupled noise schedule for each modality, we enable both unconditional and modality-conditioned generation within a single model simultaneously. We empirically validate our approach on text-image generation and mixed-type tabular data synthesis, demonstrating that it achieves competitive performance.
Kaiwen Zheng, Yongxin Chen, Huayu Chen, Guande He, Ming-Yu Liu, Jun Zhu, Qinsheng Zhang
While likelihood-based generative models, particularly diffusion and autoregressive models, have achieved remarkable fidelity in visual generation, the maximum likelihood estimation (MLE) objective, which minimizes the forward KL divergence, inherently suffers from a mode-covering tendency that limits the generation quality under limited model capacity. In this work, we propose Direct Discriminative Optimization (DDO) as a unified framework that integrates likelihood-based generative training and GAN-type discrimination to bypass this fundamental constraint by exploiting reverse KL and self-generated negative signals. Our key insight is to parameterize a discriminator implicitly using the likelihood ratio between a learnable target model and a fixed reference model, drawing parallels with the philosophy of Direct Preference Optimization (DPO). Unlike GANs, this parameterization eliminates the need for joint training of generator and discriminator networks, allowing for direct, efficient, and effective finetuning of a well-trained model to its full potential beyond the limits of MLE. DDO can be performed iteratively in a self-play manner for progressive model refinement, with each round requiring less than 1\% of pretraining epochs. Our experiments demonstrate the effectiveness of DDO by significantly advancing the previous SOTA diffusion model EDM, reducing FID scores from 1.79/1.58/1.96 to new records of 1.30/0.97/1.26 on CIFAR-10/ImageNet-64/ImageNet 512$\times$512 datasets without any guidance mechanisms, and by consistently improving both guidance-free and CFG-enhanced FIDs of visual autoregressive models on ImageNet 256$\times$256.
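A toy numpy sketch of the central idea: an implicit discriminator parameterized by the log-likelihood ratio between the learnable target model and a fixed reference model, scored with a GAN-style logistic loss on real versus self-generated samples. The function name, the scale `beta`, and the exact loss form are illustrative assumptions, not the paper's precise objective.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ddo_style_loss(logp_theta_real, logp_ref_real,
                   logp_theta_fake, logp_ref_fake, beta=1.0):
    """DPO-style implicit discriminator (illustrative sketch).

    The discriminator logit is the scaled log-likelihood ratio
    d(x) = beta * (log p_theta(x) - log p_ref(x)); real samples should
    score high and self-generated negatives low, so no separate
    discriminator network is trained.
    """
    d_real = beta * (logp_theta_real - logp_ref_real)
    d_fake = beta * (logp_theta_fake - logp_ref_fake)
    return (-np.mean(np.log(sigmoid(d_real)))
            - np.mean(np.log(sigmoid(-d_fake))))

# A model that raises likelihood on real data and lowers it on its own
# negatives (relative to the reference) achieves a lower loss than a
# model identical to the reference.
improved = ddo_style_loss(np.array([1.0]), np.array([0.0]),
                          np.array([-1.0]), np.array([0.0]))
unchanged = ddo_style_loss(np.array([0.0]), np.array([0.0]),
                           np.array([0.0]), np.array([0.0]))
```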
Learnings from Scaling Visual Tokenizers for Reconstruction and Generation
Philippe Hansen-Estruch, David Yan, Ching-Yao Chuang, Orr Zohar, Jialiang Wang, Tingbo Hou, Tao Xu, Sriram Vishwanath, Peter Vajda, Xinlei Chen
Visual tokenization via auto-encoding empowers state-of-the-art image and video generative models by compressing pixels into a latent space. However, questions remain about how auto-encoder design impacts reconstruction and downstream generative performance. This work explores scaling in auto-encoders for reconstruction and generation by replacing the convolutional backbone with an enhanced Vision Transformer for Tokenization (ViTok). We find scaling the auto-encoder bottleneck correlates with reconstruction but exhibits a nuanced relationship with generation. Separately, encoder scaling yields no gains, while decoder scaling improves reconstruction with minimal impact on generation. As a result, we determine that scaling the current paradigm of auto-encoders is not effective for improving generation performance. Coupled with Diffusion Transformers, ViTok achieves competitive image reconstruction and generation performance on 256p and 512p ImageNet-1K. In videos, ViTok achieves SOTA reconstruction and generation performance on 16-frame 128p UCF-101.
Jaemoo Choi, Jaewoong Choi, Dohyun Kwon
We address the convergence problem in learning the Optimal Transport (OT) map, where the OT Map refers to a map from one distribution to another while minimizing the transport cost. Semi-dual Neural OT, a widely used approach for learning OT Maps with neural networks, often generates spurious solutions that fail to transfer one distribution to another accurately. We identify a sufficient condition under which the max-min solution of Semi-dual Neural OT recovers the true OT Map. Moreover, to address cases when this sufficient condition is not satisfied, we propose a novel method, OTP, which learns both the OT Map and the Optimal Transport Plan, representing the optimal coupling between two distributions. Under sharp assumptions on the distributions, we prove that our model eliminates the spurious solution issue and correctly solves the OT problem. Our experiments show that the OTP model recovers the optimal transport map where existing methods fail and outperforms current OT-based models in image-to-image translation tasks. Notably, the OTP model can learn stochastic transport maps when deterministic OT Maps do not exist, such as one-to-many tasks like colorization.
RocketKV: Accelerating Long-Context LLM Inference via Two-Stage KV Cache Compression
Payman Behnam, Yaosheng Fu, Ritchie Zhao, Po-An Tsai, Zhiding Yu, Alexey Tumanov
Transformer-based Large Language Models rely critically on the KV cache to efficiently handle extended contexts during the decode phase. Yet, the size of the KV cache grows proportionally with the input length, burdening both memory bandwidth and capacity as decoding progresses. To address this challenge, we present RocketKV, a training-free KV cache compression strategy containing two consecutive stages. In the first stage, it performs coarse-grain permanent KV cache eviction on the input sequence tokens. In the second stage, it adopts a hybrid sparse attention method to conduct fine-grain top-k sparse attention, approximating the attention scores by leveraging both head and sequence dimensionality reductions. We show that RocketKV provides a compression ratio of up to 400×, end-to-end speedup of up to 3.7× as well as peak memory reduction of up to 32.6% in the decode phase on an NVIDIA A100 GPU compared to the full KV cache baseline, while achieving negligible accuracy loss on a variety of long-context tasks. We also propose a variant of RocketKV for multi-turn scenarios, which consistently outperforms other existing methods and achieves accuracy nearly on par with an oracle top-k attention scheme.
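The second stage's fine-grain top-k sparse attention can be sketched as follows. RocketKV approximates the attention scores via head- and sequence-dimension reductions; this sketch computes exact scores so as to isolate the top-k step itself, and all shapes are illustrative.

```python
import numpy as np

def topk_sparse_attention(q, K, V, k):
    """Attend only over the k keys with the largest scores for query q.

    q: (d,) query; K, V: (n, d) cached keys and values.
    Returns the attended output and the (sorted) indices of the kept keys.
    """
    scores = K @ q / np.sqrt(q.shape[0])       # (n,) scaled dot-products
    idx = np.argpartition(scores, -k)[-k:]     # indices of the top-k keys
    s = scores[idx]
    w = np.exp(s - s.max())                    # softmax over kept keys only
    w /= w.sum()
    return w @ V[idx], np.sort(idx)

rng = np.random.default_rng(0)
n, d = 64, 8
K, V = rng.normal(size=(n, d)), rng.normal(size=(n, d))
q = K[17] * 4.0                                # query strongly aligned with key 17
out, kept = topk_sparse_attention(q, K, V, k=8)
```

Only `k` of the `n` cached KV pairs participate in the softmax, which is where the decode-phase savings come from; the strongly aligned key 17 survives the selection.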
Graph Neural Networks
Biswadeep Chakraborty, Harshit Kumar, Saibal Mukhopadhyay
Graph Neural Networks (GNNs) face a critical limitation known as oversmoothing, where increasing network depth leads to homogenized node representations, severely compromising their expressiveness. We present a novel dynamical systems perspective on this challenge, revealing oversmoothing as an emergent property of GNNs’ convergence to low-dimensional attractor states. Based on this insight, we introduce **DYNAMO-GAT**, which combines noise-driven covariance analysis with anti-Hebbian learning to dynamically prune attention weights, effectively preserving distinct attractor states. We provide theoretical guarantees for DYNAMO-GAT’s effectiveness and demonstrate its superior performance on benchmark datasets, consistently outperforming existing methods while requiring fewer computational resources. This work establishes a fundamental connection between dynamical systems theory and GNN behavior, providing both theoretical insights and practical solutions for deep graph learning.
Health / Medicine
EARTH: Epidemiology-Aware Neural ODE with Continuous Disease Transmission Graph
Guancheng Wan, Zewen Liu, Xiaojun Shan, Max Lau, B. Aditya Prakash, Wei Jin
Effective epidemic forecasting is critical for public health strategies and efficient medical resource allocation, especially in the face of rapidly spreading infectious diseases. However, existing deep-learning methods often overlook the dynamic nature of epidemics and fail to account for the specific mechanisms of disease transmission. In response to these challenges, we introduce an innovative end-to-end framework called Epidemiology-Aware Neural ODE with Continuous Disease Transmission Graph (EARTH). To learn continuous and regional disease transmission patterns, we first propose EANO, which seamlessly integrates the neural ODE approach with the epidemic mechanism, considering the complex spatial spread process during epidemic evolution. Additionally, we introduce GLTG to model global infection trends and leverage these signals to guide local transmission dynamically. To accommodate both the global coherence of epidemic trends and the local nuances of epidemic transmission patterns, we build a cross-attention approach to fuse the most meaningful information for forecasting. Through the smooth synergy of both components, EARTH offers a more robust and flexible approach to understanding and predicting the spread of infectious diseases. Extensive experiments show EARTH's superior performance in forecasting real-world epidemics compared to state-of-the-art methods. The code is available at https://github.com/GuanchengWan/EARTH.
Kernel methods
Statistical and Computational Guarantees of Kernel Max-Sliced Wasserstein Distances
Jie Wang, March Boedihardjo, Yao Xie
Optimal transport has been very successful for various machine learning tasks; however, it is known to suffer from the curse of dimensionality. Hence, dimensionality reduction is desirable when applied to high-dimensional data with low-dimensional structures. The kernel max-sliced (KMS) Wasserstein distance is developed for this purpose by finding an optimal nonlinear mapping that reduces data into $1$ dimension before computing the Wasserstein distance. However, its theoretical properties have not yet been fully developed. In this paper, we provide sharp finite-sample guarantees under milder technical assumptions compared with state-of-the-art for the KMS $p$-Wasserstein distance between two empirical distributions with $n$ samples for general $p\in[1,\infty)$. Algorithm-wise, we show that computing the KMS $2$-Wasserstein distance is NP-hard, and then we further propose a semidefinite relaxation (SDR) formulation (which can be solved efficiently in polynomial time) and provide a relaxation gap for the obtained solution. We provide numerical examples to demonstrate the good performance of our scheme for high-dimensional two-sample testing.
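Once data are mapped to one dimension, the p-Wasserstein distance between two equal-size empirical distributions reduces to matching order statistics. A sketch of that 1-D step, with the optimized nonlinear kernel mapping replaced (as a simplifying assumption) by a small set of fixed linear projections:

```python
import numpy as np

def wasserstein_1d(x, y, p=2):
    """p-Wasserstein distance between two equal-size 1-D empirical
    distributions: the optimal coupling matches sorted samples pairwise."""
    x, y = np.sort(x), np.sort(y)
    return (np.mean(np.abs(x - y) ** p)) ** (1.0 / p)

def max_sliced_over_directions(X, Y, directions, p=2):
    """Crude 'max-sliced' surrogate: take the best linear 1-D projection
    from a fixed candidate set. KMS instead optimizes over nonlinear
    mappings in a kernel space, which is what makes the problem hard."""
    return max(wasserstein_1d(X @ u, Y @ u, p) for u in directions)

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
Y = X + np.array([3.0, 0.0])            # shifted only along the first axis
dirs = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
d = max_sliced_over_directions(X, Y, dirs)
```

The slice along the first axis sees the full shift of 3 while the second axis sees none, so the max-sliced value recovers the shift exactly; the dimensionality-reduction benefit is that each slice only ever compares 1-D samples.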
Large Language Models
CommVQ: Commutative Vector Quantization for KV Cache Compression
Junyan Li, Yang Zhang, Muhammad Yusuf Hassan, Talha Chafekar, Tianle Cai, Zhile Ren, Pengsheng Guo, Foroozan Karimzadeh, Colorado Reed, Chong Wang, Chuang Gan
Large Language Models (LLMs) are increasingly used in applications requiring long context lengths, but the key-value (KV) cache often becomes a memory bottleneck on GPUs as context grows. To address this, we propose Commutative Vector Quantization (CommVQ) to significantly reduce memory usage for long-context LLM inference. We first introduce additive quantization with a lightweight encoder and codebook to compress the KV cache, which can be decoded via simple matrix multiplication. To further reduce computational costs during decoding, we design the codebook to be commutative with Rotary Position Embedding (RoPE) and train it using an Expectation-Maximization (EM) algorithm. This enables efficient integration of decoding into the self-attention mechanism. Our approach achieves high accuracy with additive quantization and low overhead via the RoPE-commutative codebook. Experiments on long-context benchmarks and GSM8K show that our method reduces FP16 KV cache size by 87.5% with 2-bit quantization, while outperforming state-of-the-art KV cache quantization methods. Notably, it enables 1-bit KV cache quantization with minimal accuracy loss, allowing a LLaMA-3.1 8B model to run with a 128K context length on a single RTX 4090 GPU. The source code is available at: https://github.com/UMass-Embodied-AGI/CommVQ.
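A toy sketch of the additive-quantization idea: each vector is represented by one codeword index per codebook and reconstructed as a sum of codewords, which is a plain matrix multiplication when codes are stored one-hot. The greedy encoder and the sizes below are illustrative assumptions; CommVQ additionally uses a lightweight learned encoder, EM-trained codebooks, and the RoPE-commutative structure, none of which are shown here.

```python
import numpy as np

def encode_greedy(x, codebooks):
    """Greedily pick one codeword per codebook so their sum approximates x."""
    codes, residual = [], x.copy()
    for C in codebooks:                       # C: (num_codes, dim)
        j = np.argmin(((residual - C) ** 2).sum(axis=1))
        codes.append(j)
        residual = residual - C[j]            # quantize what is left over
    return codes

def decode(codes, codebooks):
    """Reconstruction is a sum of the selected codewords -- equivalently a
    matmul of one-hot code vectors with the stacked codebooks, which is
    what keeps decoding cheap."""
    return sum(C[j] for j, C in zip(codes, codebooks))

rng = np.random.default_rng(0)
dim, num_codes = 16, 32
codebooks = [rng.normal(size=(num_codes, dim)) for _ in range(4)]
x = rng.normal(size=dim)
codes = encode_greedy(x, codebooks)           # 4 small indices instead of 16 floats
x_hat = decode(codes, codebooks)
```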
Siqi Guo, Ilgee Hong, Vicente Balmaseda, Changlong Yu, Liang Qiu, Xin Liu, Haoming Jiang, Tuo Zhao, Tianbao Yang
Supervised fine-tuning (SFT) has become a crucial step for aligning pretrained large language models (LLMs) using supervised datasets of input-output pairs. However, despite being supervised, SFT is inherently limited by its generative training objective. To address its limitations, the existing common strategy is to follow SFT with a separate phase of preference optimization (PO), which relies on either human-labeled preference data or a strong reward model to guide the learning process. In this paper, we address the limitations of SFT by exploring one of the most successful techniques in conventional supervised learning: discriminative learning. We introduce **Discriminative Fine-Tuning (DFT)**, an improved variant of SFT, which mitigates the burden of collecting human-labeled preference data or training strong reward models. Unlike SFT that employs a generative approach and overlooks negative data, DFT adopts a **discriminative paradigm** that increases the probability of positive answers while suppressing potentially negative ones, aiming for **data prediction** instead of token prediction. Our contributions include: (i) a discriminative probabilistic framework for fine-tuning LLMs by explicitly modeling the discriminative likelihood of an answer among all possible outputs given an input; (ii) efficient algorithms to optimize this discriminative likelihood; and (iii) extensive experiments demonstrating DFT’s effectiveness, achieving performance better than SFT and comparable to if not better than SFT→PO. The code can be found at https://github.com/Optimization-AI/DFT.
Diving into Self-Evolving Training for Multimodal Reasoning
Wei Liu, Junlong Li, Xiwen Zhang, Fan Zhou, Yu Cheng, Junxian He
Self-evolving training—where models iteratively learn from their own outputs—has emerged as a key approach for complex reasoning tasks, addressing the scarcity of high-quality chain-of-thought data. However, its effectiveness in multimodal reasoning, a domain more intricate than text-only reasoning, remains underexplored, and the understanding of critical factors in this training paradigm remains limited. Furthermore, a central challenge for this training method is performance saturation, which impedes further improvements and scalability. Inspired by reinforcement learning (RL), in this paper, we reframe self-evolving training for multimodal reasoning through the lens of RL, identifying three pivotal factors: $\textit{Training Method}$, $\textit{Reward Model}$, and $\textit{Prompt Variation}$. Through systematic analysis, we establish relatively optimal design principles that significantly enhance multimodal reasoning capabilities. Moreover, delving deeper into training dynamics, we uncover the roots of saturation and propose a new automatic balancing mechanism to mitigate this limitation. Building on these insights, we propose M-STaR (**M**ultimodal **S**elf-evolving **T**r**a**ining for **R**easoning), a framework that achieves consistent performance gains across models of varying sizes and diverse benchmarks. All resources will be made publicly available.
Mechanistic Unlearning: Robust Knowledge Unlearning and Editing via Mechanistic Localization
Phillip Guo, Aaquib Syed, Abhay Sheshadri, Aidan Ewart, Gintare Karolina Dziugaite
Methods for knowledge editing and unlearning in large language models seek to edit or remove undesirable knowledge or capabilities without compromising general language modeling performance. This work investigates how mechanistic interpretability—which, in part, aims to identify model components (circuits) associated with specific interpretable mechanisms that make up a model capability—can improve the precision and effectiveness of editing and unlearning. We find a stark difference in unlearning and edit robustness when training components localized by different methods. We highlight an important distinction between methods that localize components based primarily on preserving outputs, and those finding high-level mechanisms with predictable intermediate states. In particular, localizing edits/unlearning to components associated with the *lookup-table mechanism* for factual recall 1) leads to more robust edits/unlearning across different input/output formats, and 2) resists attempts to relearn the unwanted information, while also reducing unintended side effects compared to baselines, on both a sports facts dataset and the CounterFact dataset across multiple models. We also find that certain localized edits disrupt the latent knowledge in the model more than any baseline, making unlearning more robust to various attacks.
Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding
Jiajun Zhu, Peihao Wang, Ruisi Cai, Jason Lee, Pan Li, Zhangyang “Atlas” Wang
Transformers rely on both content-based and position-based addressing mechanisms to make predictions, but existing positional encoding techniques often diminish the effectiveness of position-based addressing. Many current methods enforce rigid patterns in attention maps, limiting the ability to model long-range dependencies and adapt to diverse tasks. Additionally, most positional encodings are learned as general biases, lacking the specialization required for different instances within a dataset. To address this, we propose con**T**extualized equivari**A**nt **P**osition **E**ncoding (**TAPE**), a novel framework that enhances positional embeddings by incorporating sequence content across layers. TAPE introduces dynamic, context-aware positional encodings, overcoming the constraints of traditional fixed patterns. By enforcing permutation and orthogonal equivariance, TAPE ensures the stability of positional encodings during updates, improving robustness and adaptability. Our method can be easily integrated into pre-trained transformers, offering parameter-efficient fine-tuning with minimal overhead. Extensive experiments show that TAPE achieves superior performance in language modeling, arithmetic reasoning, and long-context retrieval tasks compared to existing positional embedding techniques.
Scaling Sparse Feature Circuits For Studying In-Context Learning
Dmitrii Kharlapenko, Stepan Shabalin, Arthur Conmy, Neel Nanda
Sparse autoencoders (SAEs) are a popular tool for interpreting large language model activations, but their utility in addressing open questions in interpretability remains unclear. In this work, we demonstrate their effectiveness by using SAEs to deepen our understanding of the mechanism behind in-context learning (ICL). We identify abstract SAE features that (i) encode the model’s knowledge of which task to execute and (ii) whose latent vectors causally induce the task zero-shot. This aligns with prior work showing that ICL is mediated by task vectors. We further demonstrate that these task vectors are well approximated by a sparse sum of SAE latents, including these task-execution features. To explore the ICL mechanism, we scale the sparse feature circuits methodology of Marks et al. (2024) to the Gemma 1 2B model for the more complex task of ICL. Through circuit finding, we discover task-detecting features with corresponding SAE latents that activate earlier in the prompt and detect when tasks have been performed. They are causally linked to task-execution features through the attention and MLP sublayers.
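The claim that task vectors are well approximated by a sparse sum of SAE latents can be illustrated with a generic sparse-approximation sketch. The dictionary, dimensions, and greedy selection below are toy stand-ins, not the paper's SAEs or its actual fitting procedure:

```python
import numpy as np

def sparse_approx(v, dictionary, k=3):
    """Greedily pick k dictionary rows (stand-ins for SAE decoder
    directions) and least-squares refit, matching-pursuit style."""
    residual = v.copy()
    chosen = []
    for _ in range(k):
        scores = np.abs(dictionary @ residual)  # correlation with residual
        scores[chosen] = 0.0                    # don't reuse a latent
        chosen.append(int(np.argmax(scores)))
        A = dictionary[chosen].T                # refit on all chosen atoms
        coef, *_ = np.linalg.lstsq(A, v, rcond=None)
        residual = v - A @ coef
    return chosen, coef, residual

rng = np.random.default_rng(0)
D = rng.normal(size=(50, 16))                   # toy dictionary of 50 latents
D /= np.linalg.norm(D, axis=1, keepdims=True)
v = D[[3, 17, 41]].T @ np.array([2.0, -1.5, 1.0])  # a synthetic "task vector"
idx, coef, res = sparse_approx(v, D, k=3)
print(sorted(idx), np.linalg.norm(res))
```

On this synthetic example the residual shrinks as atoms are added; with real SAE latents the paper reports a similarly good sparse fit.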
Monte Carlo and Sampling Methods
Annealing Flow Generative Models Towards Sampling High-Dimensional and Multi-Modal Distributions
Dongze Wu, Yao Xie
Sampling from high-dimensional, multi-modal distributions remains a fundamental challenge across domains such as statistical Bayesian inference and physics-based machine learning. In this paper, we propose Annealing Flow (AF), a method built on Continuous Normalizing Flows (CNFs) for sampling from high-dimensional and multi-modal distributions. AF is trained with a dynamic Optimal Transport (OT) objective incorporating Wasserstein regularization, and guided by annealing procedures, facilitating effective exploration of modes in high-dimensional spaces. Compared to recent NF methods, AF significantly improves training efficiency and stability, with minimal reliance on MC assistance. We demonstrate the superior performance of AF compared to state-of-the-art methods through extensive experiments on various challenging distributions and real-world datasets, particularly in high-dimensional and multi-modal settings. We also highlight AF’s potential for sampling the least favorable distributions.
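The annealing idea, separate from the paper's CNF and optimal-transport machinery, can be sketched as a geometric path between a unimodal base and a bimodal target, sampled with simple Metropolis steps. All densities, schedules, and step sizes below are illustrative assumptions:

```python
import numpy as np

# Geometric annealing: log pi_beta = (1 - beta) log pi0 + beta log pi1.
def log_pi0(x):                      # base: standard normal
    return -0.5 * x**2

def log_pi1(x):                      # target: two far-apart modes
    return np.logaddexp(-0.5 * (x - 4)**2, -0.5 * (x + 4)**2)

rng = np.random.default_rng(0)
x = rng.normal(size=500)             # start from the base distribution
for beta in np.linspace(0.0, 1.0, 50):
    prop = x + rng.normal(scale=0.5, size=x.size)   # random-walk proposal
    logp = lambda z: (1 - beta) * log_pi0(z) + beta * log_pi1(z)
    accept = np.log(rng.uniform(size=x.size)) < logp(prop) - logp(x)
    x = np.where(accept, prop, x)    # Metropolis accept/reject

print(np.mean(x > 0))                # both modes stay populated
```

Because the intermediate densities move the modes apart gradually, the walkers track them and neither mode is lost, which is the failure mode annealing is designed to avoid.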
Neuroscience, Cognitive Science
Jingyang Ke, Feiyang Wu, Jiyi Wang, Jeffrey Markowitz, Anqi Wu
Traditional approaches to studying decision-making in neuroscience focus on simplified behavioral tasks where animals perform repetitive, stereotyped actions to receive explicit rewards. While informative, these methods constrain our understanding of decision-making to short timescale behaviors driven by explicit goals. In natural environments, animals exhibit more complex, long-term behaviors driven by intrinsic motivations that are often unobservable. Recent works in time-varying inverse reinforcement learning (IRL) aim to capture shifting motivations in long-term, freely moving behaviors. However, a crucial challenge remains: animals make decisions based on their history, not just their current state. To address this, we introduce SWIRL (SWitching IRL), a novel framework that extends traditional IRL by incorporating time-varying, history-dependent reward functions. SWIRL models long behavioral sequences as transitions between short-term decision-making processes, each governed by a unique reward function. SWIRL incorporates biologically plausible history dependency to capture how past decisions and environmental contexts shape behavior, offering a more accurate description of animal decision-making. We apply SWIRL to simulated and real-world animal behavior datasets and show that it outperforms models lacking history dependency, both quantitatively and qualitatively. This work presents the first IRL model to incorporate history-dependent policies and rewards to advance our understanding of complex, naturalistic decision-making in animals.
Learning Time-Varying Multi-Region Brain Communications via Scalable Markovian Gaussian Processes
Weihan Li, Yule Wang, Chengrui Li, Anqi Wu
Understanding and modeling the dynamic communications across multiple brain regions is fundamental to modern systems neuroscience, yet current methods struggle to find time-varying region-level communications or to scale to large neural datasets with long recording durations. We present a novel framework using Markovian Gaussian Processes to learn brain communications with time-varying temporal delays from multi-region neural recordings, named Adaptive Delay Model (ADM). Our method combines Gaussian Processes with State Space Models and employs parallel scan inference algorithms, enabling efficient scaling to large datasets while identifying concurrent communication patterns that evolve over time. This time-varying approach captures how brain region interactions shift dynamically during cognitive processes. Validated on synthetic and multi-region neural recording datasets, our approach discovers both the directionality and temporal dynamics of neural communication. This work advances our understanding of distributed neural computation and provides a scalable tool for analyzing dynamic brain networks. Code is available at https://github.com/BRAINML-GT/Adaptive-Delay-Model.
Neural Encoding and Decoding at Scale
Yizi Zhang, Yanchen Wang, Mehdi Azabou, Alexandre Andre, Zixuan Wang, Hanrui Lyu, International Brain Laboratory, Eva Dyer, Liam Paninski, Cole Hurwitz
Recent work has demonstrated that large-scale, multi-animal models are powerful tools for characterizing the relationship between neural activity and behavior. Current large-scale approaches, however, focus exclusively on either predicting neural activity from behavior (encoding) or predicting behavior from neural activity (decoding), limiting their ability to capture the bidirectional relationship between neural activity and behavior. To bridge this gap, we introduce a multimodal, multi-task model that enables simultaneous Neural Encoding and Decoding at Scale (NEDS). Central to our approach is a novel multi-task-masking strategy, which alternates between neural, behavioral, within-modality, and cross-modality masking. We pretrain our method on the International Brain Laboratory (IBL) repeated site dataset, which includes recordings from 83 animals performing the visual decision-making task. In comparison to other large-scale modeling approaches, we demonstrate that NEDS achieves state-of-the-art performance for both encoding and decoding when pretrained on multi-animal data and then fine-tuned on new animals. Surprisingly, NEDS’s learned embeddings exhibit emergent properties: even without explicit training, they are highly predictive of the brain regions in each recording. Altogether, our approach is a step towards a foundation model of the brain that enables seamless translation between neural activity and behavior.
Online
Novelty Detection in Reinforcement Learning with World Models
Geigh Zollicoffer, Kenneth Eaton, Jonathan Balloch, Julia Kim, Wei Zhou, Robert Wright, Mark Riedl
Reinforcement learning (RL) using world models has found significant recent successes. However, when a sudden change to world mechanics or properties occurs, agent performance and reliability can dramatically decline. We refer to such sudden changes in visual properties or state transitions as novelties. Implementing novelty detection within generated world model frameworks is a crucial task for protecting the agent when deployed. In this paper, we propose straightforward bounding approaches to incorporate novelty detection into world model RL agents by utilizing the misalignment of the world model’s hallucinated states and the true observed states as a novelty score. We provide effective approaches to detecting novelties in a distribution of transitions learned by an agent in a world model. Finally, we show the advantage of our work in a novel environment compared to traditional machine learning novelty detection methods as well as currently accepted RL-focused novelty detection algorithms.
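The proposed novelty score, a measure of misalignment between the world model's predicted state and the observed one, can be sketched generically. The distance metric, dimensions, and quantile-based threshold below are illustrative assumptions, not the paper's bounding approach:

```python
import numpy as np

def novelty_score(predicted_state, observed_state):
    """Score a transition by how far the world model's prediction
    falls from the observed next state."""
    return float(np.linalg.norm(predicted_state - observed_state))

rng = np.random.default_rng(0)
# Calibrate a threshold on residuals from in-distribution transitions.
normal_errors = rng.normal(scale=0.1, size=(1000, 4))
threshold = np.quantile(np.linalg.norm(normal_errors, axis=1), 0.99)

pred = np.zeros(4)
usual = pred + rng.normal(scale=0.1, size=4)      # familiar transition
novel = pred + np.array([2.0, 0.0, 0.0, 0.0])     # changed dynamics

print(novelty_score(pred, usual), threshold)
print(novelty_score(pred, novel) > threshold)     # flagged as novelty
```

A transition produced by changed dynamics lands far outside the calibrated residual distribution and is flagged, while familiar transitions typically stay below the threshold.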
Online Learning and Bandits
On Mitigating Affinity Bias through Bandits with Evolving Biased Feedback
Matthew Faw, Constantine Caramanis, Jessica Hoffmann
Unconscious bias has been shown to influence how we assess our peers, with consequences for hiring, promotions and admissions. In this work, we focus on affinity bias, the component of unconscious bias which leads us to prefer people who are similar to us, despite no deliberate intention of favoritism. In a world where the people hired today become part of the hiring committee of tomorrow, we are particularly interested in understanding (and mitigating) how affinity bias affects this feedback loop. This problem has two distinctive features: 1) we only observe the _biased value_ of a candidate, but we want to optimize with respect to their _real value_ 2) the bias towards a candidate with a specific set of traits depends on the _fraction_ of people in the hiring committee with the same set of traits. We introduce a new bandits variant that exhibits those two features, which we call affinity bandits. Unsurprisingly, classical algorithms such as UCB often fail to identify the best arm in this setting. We prove a new instance-dependent regret lower bound, which is larger than that in the standard bandit setting by a multiplicative function of $K$. Since we treat rewards that are _time-varying_ and _dependent on the policy’s past actions_, deriving this lower bound requires developing proof techniques beyond the standard bandit techniques. Finally, we design an elimination-style algorithm which nearly matches this regret, despite never observing the real rewards.
Online Learning, Active Learning and Bandits
Improved and Oracle-Efficient Online $\ell_1$-Multicalibration
Rohan Ghuge, Vidya Muthukumar, Sahil Singla
We study *online multicalibration*, a framework for ensuring calibrated predictions across multiple groups in adversarial settings, across $T$ rounds. Although online calibration is typically studied in the $\ell_1$ norm, prior approaches to online multicalibration have taken the indirect approach of obtaining rates in other norms (such as $\ell_2$ and $\ell_{\infty}$) and then transferred these guarantees to $\ell_1$ at additional loss. In contrast, we propose a direct method that achieves improved and oracle-efficient rates of $\widetilde{\mathcal{O}}(T^{-1/3})$ and $\widetilde{\mathcal{O}}(T^{-1/4})$ respectively, for online $\ell_1$-multicalibration. Our key insight is a novel reduction of online $\ell_1$-multicalibration to an online learning problem with product-based rewards, which we refer to as *online linear-product optimization* ($\mathtt{OLPO}$). To obtain the improved rate of $\widetilde{\mathcal{O}}(T^{-1/3})$, we introduce a linearization of $\mathtt{OLPO}$ and design a no-regret algorithm for this linearized problem. Although this method guarantees the desired sublinear rate (nearly matching the best rate for online calibration), it is computationally expensive when the group family $\mathcal{H}$ is large or infinite, since it enumerates all possible groups. To address scalability, we propose a second approach to $\mathtt{OLPO}$ that makes only a polynomial number of calls to an offline optimization (*multicalibration evaluation*) oracle, resulting in *oracle-efficient* online $\ell_1$-multicalibration with a corresponding rate of $\widetilde{\mathcal{O}}(T^{-1/4})$. Our framework also extends to certain infinite families of groups (e.g., all linear functions on the context space) by exploiting a $1$-Lipschitz property of the $\ell_1$-multicalibration error with respect to $\mathcal{H}$.
Optimization
Fast Tensor Completion via Approximate Richardson Iteration
Mehrdad Ghadiri, Matthew Fahrbach, Yunbum Kook, Ali Jadbabaie
We study tensor completion (TC) through the lens of low-rank tensor decomposition (TD). Many TD algorithms use fast alternating minimization methods to solve _highly structured_ linear regression problems at each step (e.g., for CP, Tucker, and tensor-train decompositions). However, such algebraic structure is often lost in TC regression problems, making direct extensions unclear. This work proposes a novel _lifting_ method for approximately solving TC regression problems using structured TD regression algorithms as blackbox subroutines, enabling sublinear-time methods. We analyze the convergence rate of our approximate Richardson iteration-based algorithm, and our empirical study shows that it can be 100x faster than direct methods for CP completion on real-world tensors.
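Richardson iteration, in its classical form, is the fixed-point update x_{k+1} = x_k + w(b - A x_k). The minimal numpy sketch below runs it on a small symmetric positive definite system; the paper's contribution is an *approximate* variant where the product with A is computed by a structured TD regression subroutine, which this toy does not model:

```python
import numpy as np

def richardson(A, b, w, iters=2000):
    """Damped Richardson iteration for A x = b."""
    x = np.zeros_like(b)
    for _ in range(iters):
        x = x + w * (b - A @ x)      # converges when ||I - wA|| < 1
    return x

rng = np.random.default_rng(0)
M = rng.normal(size=(5, 5))
A = M @ M.T + 5 * np.eye(5)          # symmetric positive definite
b = rng.normal(size=5)
w = 1.0 / np.linalg.eigvalsh(A)[-1]  # step 1/lambda_max guarantees convergence
x = richardson(A, b, w)
print(np.linalg.norm(A @ x - b))     # residual driven to near zero
```

The tolerance of this scheme to inexact residual evaluations is what makes the blackbox, approximately-solved subproblems in the paper viable.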
Planning
Navigating the Social Welfare Frontier: Portfolios for Multi-objective Reinforcement Learning
Cheol Kim, Jai Moondra, Shresth Verma, Madeleine Pollack, Lingkai Kong, Milind Tambe, Swati Gupta
In many real-world applications of Reinforcement Learning (RL), deployed policies have varied impacts on different stakeholders, creating challenges in reaching consensus on how to effectively aggregate their preferences. Generalized $p$-means form a widely used class of social welfare functions for this purpose, with broad applications in fair resource allocation, AI alignment, and decision-making. This class includes well-known welfare functions such as Egalitarian, Nash, and Utilitarian welfare. However, selecting the appropriate social welfare function is challenging for decision-makers, as the structure and outcomes of optimal policies can be highly sensitive to the choice of $p$. To address this challenge, we study the concept of an $\alpha$-approximate portfolio in RL, a set of policies that are approximately optimal across the family of generalized $p$-means for all $p \in [-\infty, 1]$. We propose algorithms to compute such portfolios and provide theoretical guarantees on the trade-offs among approximation factor, portfolio size, and computational efficiency. Experimental results on synthetic and real-world datasets demonstrate the effectiveness of our approach in summarizing the policy space induced by varying $p$ values, empowering decision-makers to navigate this landscape more effectively.
Privacy
Underestimated Privacy Risks for Minority Populations in Large Language Model Unlearning
Rongzhe Wei, Mufei Li, Mohsen Ghassemi, Eleonora Kreacic, Yifan Li, Xiang Yue, Bo Li, Vamsi Potluru, Pan Li, Eli Chien
Large Language Models (LLMs) embed sensitive, human-generated data, prompting the need for unlearning methods. Although certified unlearning offers strong privacy guarantees, its restrictive assumptions make it unsuitable for LLMs, giving rise to various heuristic approaches typically assessed through empirical evaluations. These standard evaluations randomly select data for removal, apply unlearning techniques, and use membership inference attacks (MIAs) to compare unlearned models against models retrained without the removed data. However, to ensure robust privacy protections for every data point, it is essential to account for scenarios in which certain data subsets face elevated risks. Prior research suggests that outliers, particularly including data tied to minority groups, often exhibit higher memorization propensity which indicates they may be more difficult to unlearn. Building on these insights, we introduce a complementary, minority-aware evaluation framework to highlight blind spots in existing frameworks. We substantiate our findings with carefully designed experiments, using canaries with personally identifiable information (PII) to represent these minority subsets and demonstrate that they suffer at least 20\% higher privacy leakage across various unlearning methods, MIAs, datasets, and LLM scales. Our proposed minority-aware evaluation framework marks an essential step toward more equitable and comprehensive assessments of LLM unlearning efficacy.
XAttnMark: Learning Robust Audio Watermarking with Cross-Attention
Yixin Liu, Lie Lu, Jihui Jin, Lichao Sun, Andrea Fanelli
The rapid proliferation of generative audio synthesis and editing technologies has raised significant concerns about copyright infringement, data provenance, and the spread of misinformation through deepfake audio. Watermarking offers a proactive solution by embedding imperceptible, identifiable, and traceable marks into audio content. While recent neural network-based watermarking methods like WavMark and AudioSeal have improved robustness and quality, they struggle to achieve both robust detection and accurate attribution simultaneously. This paper introduces the Cross-Attention Robust Audio Watermark (XAttnMark), which bridges this gap by leveraging partial parameter sharing between the generator and the detector, a cross-attention mechanism for efficient message retrieval, and a temporal conditioning module for improved message distribution. Additionally, we propose a psychoacoustic-aligned temporal-frequency masking loss that captures fine-grained auditory masking effects, enhancing watermark imperceptibility. Our approach achieves state-of-the-art performance in both detection and attribution, demonstrating superior robustness against a wide range of audio transformations, including challenging generative editing with strong editing strength. This work represents a significant step forward in protecting intellectual property and ensuring the authenticity of audio content in the era of generative AI.
Reinforcement Learning and Planning
Incentivize without Bonus: Provably Efficient Model-based Online Multi-agent RL for Markov Games
Tong Yang, Bo Dai, Lin Xiao, Yuejie Chi
Multi-agent reinforcement learning (MARL) lies at the heart of a plethora of applications involving the interaction of a group of agents in a shared unknown environment. A prominent framework for studying MARL is Markov games, with the goal of finding various notions of equilibria in a sample-efficient manner, such as the Nash equilibrium (NE) and the coarse correlated equilibrium (CCE). However, existing sample-efficient approaches either require tailored uncertainty estimation under function approximation, or careful coordination of the players. In this paper, we propose a novel model-based algorithm, called VMG, that incentivizes exploration via biasing the empirical estimate of the model parameters towards those with higher collective best-response values of all the players when fixing the other players’ policies, thus encouraging the policy to deviate from its current equilibrium for more exploration. VMG is oblivious to different forms of function approximation, and permits simultaneous and uncoupled policy updates of all players. Theoretically, we also establish that VMG achieves a near-optimal regret for finding both the NEs of two-player zero-sum Markov games and CCEs of multi-player general-sum Markov games under linear function approximation in an online environment, which nearly match their counterparts with sophisticated uncertainty quantification.
Robotics
Letian Chen, Nina Moorman, Matthew Gombolay
Reinforcement learning (RL) has demonstrated compelling performance in robotic tasks, but its success often hinges on the design of complex, ad hoc reward functions. Researchers have explored how Large Language Models (LLMs) could enable non-expert users to specify reward functions more easily. However, LLMs struggle to balance the importance of different features, generalize poorly to out-of-distribution robotic tasks, and cannot represent the problem properly with only text-based descriptions. To address these challenges, we propose ELEMENTAL (intEractive LEarning froM dEmoNstraTion And Language), a novel framework that combines natural language guidance with visual user demonstrations to better align robot behavior with user intentions. By incorporating visual inputs, ELEMENTAL overcomes the limitations of text-only task specifications, while leveraging inverse reinforcement learning (IRL) to balance feature weights and match the demonstrated behaviors optimally. ELEMENTAL also introduces an iterative feedback loop through self-reflection to improve feature, reward, and policy learning. Our experimental results demonstrate that ELEMENTAL outperforms prior work by 42.3% on task success, and achieves 41.3% better generalization in out-of-distribution tasks, highlighting its robustness in LfD.
Robustness
SGD Jittering: A Training Strategy for Robust and Accurate Model-Based Architectures
Peimeng Guan, Mark Davenport
Inverse problems aim to reconstruct unseen data from corrupted or perturbed measurements. While most work focuses on improving reconstruction quality, generalization accuracy and robustness are equally important, especially for safety-critical applications. Model-based architectures (MBAs), such as loop unrolling methods, are considered more interpretable and achieve better reconstructions. Empirical evidence suggests that MBAs are more robust to perturbations than black-box solvers, but the accuracy-robustness tradeoff in MBAs remains underexplored. In this work, we propose a simple yet effective training scheme for MBAs, called SGD jittering, which injects noise iteration-wise during reconstruction. We theoretically demonstrate that SGD jittering not only generalizes better than the standard mean squared error training but is also more robust to average-case attacks. We validate SGD jittering using denoising toy examples, seismic deconvolution, and single-coil MRI reconstruction. Both SGD jittering and its SPGD extension yield cleaner reconstructions for out-of-distribution data and demonstrate enhanced robustness against adversarial attacks.
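The core training idea, noise injected into the iterates of an unrolled solver, can be sketched on a toy denoising problem. The quadratic fidelity term, step size, and noise scale here are illustrative assumptions, not the paper's MBA pipeline:

```python
import numpy as np

def unrolled_recon(y, steps=10, lr=0.3, jitter=0.0, rng=None):
    """Unrolled gradient descent on 0.5 * ||x - y||^2, with optional
    SGD-jittering-style noise injected into each iterate."""
    x = np.zeros_like(y)
    for _ in range(steps):
        grad = x - y                  # gradient of the fidelity term
        x = x - lr * grad
        if jitter > 0:                # noise injection during training only
            x = x + rng.normal(scale=jitter, size=x.shape)
    return x

rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0, 2 * np.pi, 64))
y = signal + 0.1 * rng.normal(size=64)          # noisy measurement
clean = unrolled_recon(y)                        # inference: no jitter
jittered = unrolled_recon(y, jitter=0.05, rng=rng)  # training-time pass
print(np.linalg.norm(clean - y))
```

At inference the jitter is switched off; the paper's claim is that training with it yields reconstructions that degrade more gracefully under perturbed inputs.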
Safety
Tiansheng Huang, Gautam Bhattacharya, Pratik Joshi, Joshua Kimball, Ling Liu
Safety-aligned Large Language Models (LLMs) are vulnerable to harmful fine-tuning attacks: a few harmful data points mixed into the fine-tuning dataset can break an LLM’s safety alignment. While several defenses have been proposed, our evaluation shows that existing defenses fail *when some specific training hyper-parameters are chosen*: a large learning rate or a large number of training epochs in the fine-tuning stage can easily invalidate the defense. To this end, we propose Antidote, a post-fine-tuning-stage solution, which remains ***agnostic to the training hyper-parameters in the fine-tuning stage***. Antidote relies on the philosophy that by removing the harmful parameters, the harmful model can be recovered from its harmful behaviors, regardless of how those harmful parameters were formed in the fine-tuning stage. With this philosophy, we introduce a one-shot pruning stage after harmful fine-tuning to remove the harmful weights that are responsible for the generation of harmful content. Despite its embarrassing simplicity, empirical results show that Antidote can reduce the harmful score while maintaining accuracy on downstream tasks.
Security
Topological Signatures of Adversaries in Multimodal Alignments
Minh Vu, Geigh Zollicoffer, Huy Mai, Ben Nebgen, Boian S Alexandrov, Manish Bhattarai
Multimodal Machine Learning systems, particularly those aligning text and image data like CLIP/BLIP models, have become increasingly prevalent, yet remain susceptible to adversarial attacks. While substantial research has addressed adversarial robustness in unimodal contexts, defense strategies for multimodal systems are underexplored. This work investigates the topological signatures that arise between image and text embeddings and shows how adversarial attacks disrupt their alignment, introducing distinctive signatures. We specifically leverage persistent homology and introduce two novel Topological-Contrastive losses based on Total Persistence and Multi-scale kernel methods to analyze the topological signatures introduced by adversarial perturbations. We observe a pattern of monotonic changes in the proposed topological losses emerging in a wide range of attacks on image-text alignments, as more adversarial samples are introduced in the data. By designing an algorithm to back-propagate these signatures to input samples, we are able to integrate these signatures into Maximum Mean Discrepancy tests, creating a novel class of tests that leverage topological signatures for better adversarial detection.
Sequential Models, Time series
In-Context Fine-Tuning for Time-Series Foundation Models
Matthew Faw, Rajat Sen, Yichen Zhou, Abhimanyu Das
Motivated by the recent success of time-series foundation models for zero-shot forecasting, we present a methodology for _in-context fine-tuning_ of a time-series foundation model. In particular, we design a pretrained foundation model that can be prompted (at inference time) with multiple time-series examples, in order to forecast a target time-series into the future. Our foundation model is specifically trained to utilize examples from multiple related time-series in its context window (in addition to the history of the target time-series) to help it adapt to the specific distribution of the target domain at inference time. We show that such a foundation model that uses in-context examples at inference time can obtain much better performance on popular forecasting benchmarks compared to supervised deep learning methods, statistical models, and other time series foundation models. Interestingly, our in-context fine-tuning approach even matches the performance of a foundation model that is explicitly fine-tuned on the target domain.
Jiecheng Lu, Shihao Yang
Autoregressive attention-based time series forecasting (TSF) has drawn increasing interest, with mechanisms like linear attention often outperforming vanilla attention. However, deeper Transformer architectures frequently misalign with autoregressive objectives, obscuring the underlying VAR structure embedded within linear attention and hindering their ability to capture the data-generating processes in TSF. In this work, we first show that a single linear attention layer can be interpreted as a dynamic vector autoregressive (VAR) structure. We then explain that existing multi-layer Transformers have structural mismatches with the autoregressive forecasting objective, which impair interpretability and generalization ability. To address this, we show that by rearranging the MLP, attention, and input-output flow, multi-layer linear attention can also be aligned as a VAR model. Then, we propose Structural Aligned Mixture of VAR (SAMoVAR), a linear Transformer variant that integrates interpretable dynamic VAR weights for multivariate TSF. By aligning the Transformer architecture with autoregressive objectives, SAMoVAR delivers improved performance, interpretability, and computational efficiency compared to SOTA TSF models.
WAVE: Weighted Autoregressive Varying Gate for Time Series Forecasting
Jiecheng Lu, Xu Han, Yan Sun, Shihao Yang
We propose a Weighted Autoregressive Varying gatE (WAVE) attention mechanism equipped with both Autoregressive (AR) and Moving-average (MA) components. It can adapt to various attention mechanisms, enhancing and decoupling their ability to capture long-range and local temporal patterns in time series data. In this paper, we first demonstrate that, for the time series forecasting (TSF) task, the previously overlooked decoder-only autoregressive Transformer model can achieve results comparable to the best baselines when appropriate tokenization and training methods are applied. Moreover, inspired by the ARMA model from statistics and recent advances in linear attention, we introduce the full ARMA structure into existing autoregressive attention mechanisms. By using an indirect MA weight generation method, we incorporate the MA term while maintaining the time complexity and parameter size of the underlying efficient attention models. We further explore how indirect parameter generation can produce implicit MA weights that align with the modeling requirements for local temporal impacts. Experimental results show that WAVE attention that incorporates the ARMA structure consistently improves the performance of various AR attentions on TSF tasks, achieving state-of-the-art results.
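The ARMA structure that WAVE builds into attention can be recalled in its classical statistical form: an AR term on past values plus an MA term on past forecast errors. The sketch below is purely an illustration of the ARMA(1,1) recursion, not of the paper's attention-based weight generation:

```python
import numpy as np

phi, theta = 0.7, 0.4                # AR and MA coefficients (toy values)
rng = np.random.default_rng(0)
eps = rng.normal(size=500)           # innovation sequence
x = np.zeros(500)
for t in range(1, 500):
    # ARMA(1,1): past value (AR) plus current and past shocks (MA)
    x[t] = phi * x[t - 1] + eps[t] + theta * eps[t - 1]

# One-step-ahead forecasts from the same recursion with known errors:
pred = phi * x[:-1] + theta * eps[:-1]
resid = x[1:] - pred                 # equals eps[1:] by construction
print(np.allclose(resid, eps[1:]))   # True
```

The MA term lets the model absorb short-lived local shocks that a pure AR term would smear across many lags, which is the local-pattern role WAVE assigns to its indirectly generated MA weights.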