New England Computer Vision (NECV) Workshop 2025

University of Massachusetts, Amherst, MA

Friday, November 21, 2025



The New England Computer Vision Workshop (NECV) brings together researchers in computer vision and related areas for an informal exchange of ideas through a full day of presentations and posters. Held conveniently after the CVPR deadline and before the NeurIPS conference, NECV offers opportunities to network and showcase research. NECV attracts researchers from universities and industry research labs in New England. As in previous years, the workshop will focus on graduate student presentations. Welcome to UMass Amherst!

- Grant & Subhransu


Registration and Submission

Academic researchers: Participation is free for all researchers at academic institutions. Please register here and submit your abstract here.

Industry participants: For our industry friends, a limited number of registrations are available for a fee. Please register here.

Deadlines: Early-bird registration (lunch and parking provided) by November 11. Please submit your abstract by November 16. Oral decisions will be released by November 18.

Submission guidelines: Please submit a one-page PDF abstract using the CVPR 2026 rebuttal template. Please include the title of your work and the list of authors in the abstract. You may present work that has already been published or work that is in progress. All relevant submissions will be granted a poster presentation, and selected submissions from each institution will be granted 12-minute oral presentations. Post-docs and faculty may submit for poster presentations, but oral presentations are reserved for graduate students. There will be no publications resulting from the workshop, so presentations will not be considered "prior peer-reviewed work" according to any definition we are aware of. Thus, work presented at NECV can be subsequently submitted to other venues without citation. The workshop is after the CVPR submission deadline, so come and show off your new work in a friendly environment. It's also just before the NeurIPS conference, so feel free to come and practice your presentation.


Presentation

Oral presentation: Each presentation is allocated a 12-minute slot, with an additional 3 minutes for questions and the change of speaker. We ask all presenters to bring their own laptops for their presentations.

Poster presentation: The poster session will be held in the same room as the oral session (Marriott Room, 11th Floor, Campus Center). Your assigned poster ID can be found on this website. Please locate the correct poster board to display your poster. These boards can accommodate up to 57.75" x 46" posters (width x height). You are welcome to use any format within that size limit.


Logistics

Schedule

Time Topic
9:00-9:50 Registration, Poster Setup & Breakfast
9:50-10:00 Opening Remarks
10:00-11:30 Oral Session I
[10:00] Not All Birds Look The Same: Identity-Preserving Generation For Birds, Aaron Sun, UMass Amherst
[10:15] Generative Action Tell-Tales: Assessing Human Motion in Synthesized Videos, Xavier Thomas, Boston University
[10:30] CObL: Toward Zero-Shot Ordinal Layering without User Prompting, Aneel Damaraju, Harvard University
[10:45] Consensus-Driven Active Model Selection, Justin Kay, MIT
[11:00] Struct2D: A Perception-Guided Framework for Spatial Reasoning in MLLMs, Hanhui Wang, Northeastern University
[11:15] Structured Light with a Million Light Planes per Second, Dhawal Sirikonda, Dartmouth College
11:30-12:30 Coffee & Poster Session I
[1] Tell the Story, Not the Frames: Narrative-Aware Retrieval for Audio Description, Seung Hyun Hahm, Dartmouth College
[2] Relational Representation Learning, Ian Hajra, Brown University
[3] Progressive Stereo Edge Correspondences and Refinement, Chiang-Heng Chien, Brown University
[4] Learning and Stabilizing Isometries for Robust Vision, Javid Lakha, Harvard University
[5] stable-worldmodel: An Ecosystem For World Model Research, Lucas Maes, Mila
[6] Augmented Reality Active Area Labels for Dynamic Scenes, Lana Yang-Maccini, Brown University
[7] Exploring Texture Guidance in Diffusion Models, Eric Yee, MIT
[8] PackUV: Packed Gaussian UV Maps for 4D Volumetric Video, Aashish Rai, Brown University
[9] Compositional Targeted Multi-Label Universal Perturbations, Hassan Mahmood, Northeastern University
[10] Blind to Shape, Bound to Semantics: A VLM’s Dilemma, Zachary Meurer, Boston University
[11] Curvature Tuning: Provable Training-free Model Steering From a Single Parameter, Leyang Hu, Brown University
[12] FLIGHT: Fibonacci Lattice-based Inference for Geometric Heading in real-Time, Dave Dirnfeld, UMass Amherst
[13] ID-Sim: An Identity-Focused Perceptual Similarity Metric, Nayoung Chae, MIT
[14] Enhancing Autonomous Navigation by Imaging Hidden Objects using Single-Photon LiDAR, Nevindu Batagoda, Dartmouth College
[15] Audio Geolocation: An Investigation with Natural Sounds, Wuao Liu, UMass Amherst
[16] Some Modalities are More Equal Than Others: Decoding and Architecting Multimodal Integration in MLLMs, Tianle Chen, Boston University
[17] Do VLMs see texture like humans and CNNs? Evidence from slant-from-texture, Qian Zhang, Brown University
[18] A Monte Carlo Rendering Framework for Simulating Optical Heterodyne Detection, Juhyeon Kim, Dartmouth College
[19] LVT: Large-Scale Scene Reconstruction via Local View Transformers, Tooba Imtiaz, Northeastern University
[20] LayerCraft-Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration, Yuyao Zhang, Dartmouth College
[21] Iris: Integrating Language into Diffusion-based Monocular Depth Estimation, Ziyao Zeng, Yale University
[22] Underwater Optical Backscatter Communication using Acousto-Optic Beam Steering, Dhawal Sirikonda, Dartmouth College
[23] BabyVLM-V2: Toward Developmentally Grounded Pretraining and Benchmarking of Vision Foundation Models, Shengao Wang, Boston University
[24] DevCV Toolbox: Toward Developmentally Grounded Benchmarking of Vision Foundation Models, Max Whitton, Boston University
[25] PRISM: Controllable Diffusion for Compound Image Restoration with Scientific Fidelity, Rupa Kurinchi-Vendhan, MIT
[26] Words That Make Language Models Perceive, Sophie Wang, MIT
[27] Active Measurement: Efficient Estimation at Scale, Max Hamilton, UMass Amherst
[28] Spatially-Varying Autofocus, Yingsi Qin, Carnegie Mellon University
[29] VisReason: A Large-Scale Dataset for Visual Chain-of-Thought Reasoning, Lingxiao Li, Boston University
[30] Residual Primitive Fitting of 3D Shapes with SuperFrusta, Aditya Ganeshan, Brown University
[31] CHAIR : An interpretable pipeline for AI-expert collaboration on elephant Re-identification, Antoine Salaun, MIT
[32] The LLM Bottleneck: Why Open-Source Vision LLMs Struggle with Hierarchical Visual Recognition, Yuwen Tan, Boston University
12:30-2:00 Lunch
2:00-3:00 Coffee & Poster Session II
[1] Generative Action Tell-Tales: Assessing Human Motion in Synthesized Videos, Xavier Thomas, Boston University
[2] Looking at the Sky, Shrenik Borad, George Washington University
[3] Not All Birds Look The Same: Identity-Preserving Generation For Birds, Aaron Sun, UMass Amherst
[4] Super-Resolution with Structured Motion, Gabby Litterio, Brown University
[5] PLLM: Pseudo-Labeling Large Language Models for CAD Program Synthesis, Yuanbo Li, Brown University
[6] Unsafe2Safe: Controllable Image Anonymization for Downstream Utility, Minh Dinh, Dartmouth College
[7] Exploring Efficient and Practical Unified Multimodal Model, Xu Ma, Northeastern University
[8] Scale-DiT: Ultra-High-Resolution Image Generation with Hierarchical Local Attention, Yuyao Zhang, Dartmouth College
[9] HouseCrafter: Lifting Floorplans to 3D Scenes with 2D Diffusion Models, Yiwen Chen, Northeastern University
[10] Outlier-Aware Post-Training Quantization for Image Super-Resolution, Hailing Wang, Northeastern University
[11] Does learning about time improve out-of-distribution generalization in object detection?, Kai Van Brunt, MIT
[12] Vision Masked Image Modeling Transfers Across Domains, Pranav Sankar, Brown University
[13] Can LVLMs Harness Visual Contexts to Untangle Ambiguity In Language?, Heejeong Nam, Brown University
[14] SNAP: Towards Segmenting Anything in Any Point Cloud, Aniket Gupta, Northeastern University
[15] Trace Anything: Representing Any Video in 4D via Trajectory Fields, Xinhang Liu, Dartmouth College
[16] LASER: Layer-wise Scale Alignment for Training-Free Streaming 4D Reconstruction, Tianye Ding, Northeastern University
[17] RealBirdID: Benchmarking Bird Species Identification in the Era of MLLMs, Logan Lawrence, UMass Amherst
[18] Combining Translation with Magnification to Resolve Ambiguity in Super-Resolution, Daniel Fu, Brown University
[19] DIET-CP: Lightweight and Data Efficient Self Supervised Continued Pretraining, Jakob Ambsdorf, Brown University
[20] Potion Brewing Laboratory: An Environment for Continual Learning in World Models, Taj Gillin, Brown University
[21] CObL: Toward Zero-Shot Ordinal Layering without User Prompting, Aneel Damaraju, Harvard University
[22] Consensus-Driven Active Model Selection, Justin Kay, MIT
[23] Struct2D: A Perception-Guided Framework for Spatial Reasoning in MLLMs, Hanhui Wang, Northeastern University
[24] Attribution Robustness via Implicit Curvature Regularization, Matteo Gamba, Brown University
[25] Discontinuous 2D Neural Fields without Meshing, Javid Lakha, Harvard University
[26] S3: Learnable Spline-Wavelets for State Space Models, Daniel Cai, Brown University
[27] SimpleCall: A Lightweight Image Restoration Agent in Label-Free Environments with MLLM Perceptual Feedback, Jianglin Lu, Northeastern University
[28] SuperRivolution: Fine-Scale Rivers from Coarse Temporal Satellite Imagery, Rangel Daroya, UMass Amherst
[29] Coffee: Controllable Diffusion Fine-tuning, Ziyao Zeng, Yale University
[30] Structured Light with a Million Light Planes per Second, Dhawal Sirikonda, Dartmouth College
[31] Arbitrary-Scale 3D Gaussian Super-Resolution, Huimin Zeng, Northeastern University
[32] 3D Curvix: From Multiview 2D Edges to 3D Curve Segments, Chiang-Heng Chien, Brown University
3:00-4:30 Oral Session II
[3:00] Tell the Story, Not the Frames: Narrative-Aware Retrieval for Audio Description, Seung Hyun Hahm, Dartmouth College
[3:15] Curvature Tuning: Provable Training-free Model Steering From a Single Parameter, Leyang Hu, Brown University
[3:30] Iris: Integrating Language into Diffusion-based Monocular Depth Estimation, Ziyao Zeng, Yale University
[3:45] BabyVLM-V2: Toward Developmentally Grounded Pretraining and Benchmarking of Vision Foundation Models, Shengao Wang, Boston University
[4:00] Words That Make Language Models Perceive, Sophie Wang, MIT
[4:15] Residual Primitive Fitting of 3D Shapes with SuperFrusta, Aditya Ganeshan, Brown University
4:30-4:45 Closing Remarks

Venue

The workshop will be held in the Marriott Room, 11th Floor, Campus Center, University of Massachusetts, Amherst.

Parking

If you signed up by the early registration deadline, we will pay for your parking. Please follow these instructions:

If you did not sign up by the early registration deadline, the Campus Center Parking Garage is still the most convenient place to park, and you can pay at the parking kiosks. The rate is $1.85/hr.


Sponsorship


Organizers

Hosts: Grant Van Horn and Subhransu Maji.

Program committee: Deep Chakraborty, Mustafa Chasmai, Rangel Daroya, Max Hamilton, Logan Lawrence, Wuao Liu, Aaron Sun, and Vikas Thamizharasan.

Logistics committee: Rangel Daroya and Logan Lawrence.

Corporate relations chair: Samson Timoner.

Website chair: Fabien Delattre.

Steering committee: Erik Learned-Miller (UMass Amherst), Kate Saenko (Boston University), Yun (Raymond) Fu (Northeastern University), Octavia Camps (Northeastern University), Todd Zickler (Harvard), James Tompkin (Brown), Benjamin Kimia (Brown), Phillip Isola (MIT), Pulkit Agrawal (MIT), SouYoung Jin (Dartmouth), Adithya Pediredla (Dartmouth), Yu-Wing Tai (Dartmouth), and Alex Wong (Yale).


Past Years