NECV 2025

Logistics

Schedule

Time

Topic

9:00-9:50

Registration, Poster Setup & Breakfast

9:50-10:00

Opening Remarks

10:00-11:30

Oral Session I

[10:00]	Not All Birds Look The Same: Identity-Preserving Generation For Birds, Aaron Sun, UMass Amherst
[10:15]	Generative Action Tell-Tales: Assessing Human Motion in Synthesized Videos, Xavier Thomas, Boston University
[10:30]	CObL: Toward Zero-Shot Ordinal Layering without User Prompting, Aneel Damaraju, Harvard University
[10:45]	Consensus-Driven Active Model Selection, Justin Kay, MIT
[11:00]	Struct2D: A Perception-Guided Framework for Spatial Reasoning in MLLMs, Hanhui Wang, Northeastern University
[11:15]	Structured Light with a Million Light Planes per Second, Dhawal Sirikonda, Dartmouth College

11:30-12:30

Coffee & Poster Session I

[1]	Tell the Story, Not the Frames: Narrative-Aware Retrieval for Audio Description, Seung Hyun Hahm, Dartmouth College
[2]	Relational Representation Learning, Ian Hajra, Brown University
[3]	Progressive Stereo Edge Correspondences and Refinement, Chiang-Heng Chien, Brown University
[4]	Learning and Stabilizing Isometries for Robust Vision, Javid Lakha, Harvard University
[5]	stable-worldmodel: An Ecosystem For World Model Research, Lucas Maes, Mila
[6]	Augmented Reality Active Area Labels for Dynamic Scenes, Lana Yang-Maccini, Brown University
[7]	Exploring Texture Guidance in Diffusion Models, Eric Yee, MIT
[8]	PackUV: Packed Gaussian UV Maps for 4D Volumetric Video, Aashish Rai, Brown University
[9]	Compositional Targeted Multi-Label Universal Perturbations, Hassan Mahmood, Northeastern University
[10]	Blind to Shape, Bound to Semantics: A VLM’s Dilemma, Zachary Meurer, Boston University
[11]	Curvature Tuning: Provable Training-free Model Steering From a Single Parameter, Leyang Hu, Brown University
[12]	FLIGHT: Fibonacci Lattice-based Inference for Geometric Heading in real-Time, Dave Dirnfeld, UMass Amherst
[13]	ID-Sim: An Identity-Focused Perceptual Similarity Metric, Nayoung Chae, MIT
[14]	Enhancing Autonomous Navigation by Imaging Hidden Objects using Single-Photon LiDAR, Nevindu Batagoda, Dartmouth College
[15]	Audio Geolocation: An Investigation with Natural Sounds, Wuao Liu, UMass Amherst
[16]	Some Modalities are More Equal Than Others: Decoding and Architecting Multimodal Integration in MLLMs, Tianle Chen, Boston University
[17]	Do VLMs see texture like humans and CNNs? Evidence from slant-from-texture, Qian Zhang, Brown University
[18]	A Monte Carlo Rendering Framework for Simulating Optical Heterodyne Detection, Juhyeon Kim, Dartmouth College
[19]	LVT: Large-Scale Scene Reconstruction via Local View Transformers, Tooba Imtiaz, Northeastern University
[20]	LayerCraft: Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration, Yuyao Zhang, Dartmouth College
[21]	Iris: Integrating Language into Diffusion-based Monocular Depth Estimation, Ziyao Zeng, Yale University
[22]	Underwater Optical Backscatter Communication using Acousto-Optic Beam Steering, Dhawal Sirikonda, Dartmouth College
[23]	BabyVLM-V2: Toward Developmentally Grounded Pretraining and Benchmarking of Vision Foundation Models, Shengao Wang, Boston University
[24]	DevCV Toolbox: Toward Developmentally Grounded Benchmarking of Vision Foundation Models, Max Whitton, Boston University
[25]	PRISM: Controllable Diffusion for Compound Image Restoration with Scientific Fidelity, Rupa Kurinchi-Vendhan, MIT
[26]	Words That Make Language Models Perceive, Sophie Wang, MIT
[27]	Active Measurement: Efficient Estimation at Scale, Max Hamilton, UMass Amherst
[28]	Spatially-Varying Autofocus, Yingsi Qin, Carnegie Mellon University
[29]	VisReason: A Large-Scale Dataset for Visual Chain-of-Thought Reasoning, Lingxiao Li, Boston University
[30]	Residual Primitive Fitting of 3D Shapes with SuperFrusta, Aditya Ganeshan, Brown University
[31]	CHAIR: An interpretable pipeline for AI-expert collaboration on elephant Re-identification, Antoine Salaun, MIT
[32]	The LLM Bottleneck: Why Open-Source Vision LLMs Struggle with Hierarchical Visual Recognition, Yuwen Tan, Boston University

12:30-2:00

Lunch

2:00-3:00

Coffee & Poster Session II

[1]	Generative Action Tell-Tales: Assessing Human Motion in Synthesized Videos, Xavier Thomas, Boston University
[2]	Looking at the Sky, Shrenik Borad, George Washington University
[3]	Not All Birds Look The Same: Identity-Preserving Generation For Birds, Aaron Sun, UMass Amherst
[4]	Super-Resolution with Structured Motion, Gabby Litterio, Brown University
[5]	PLLM: Pseudo-Labeling Large Language Models for CAD Program Synthesis, Yuanbo Li, Brown University
[6]	Unsafe2Safe: Controllable Image Anonymization for Downstream Utility, Minh Dinh, Dartmouth College
[7]	Exploring Efficient and Practical Unified Multimodal Model, Xu Ma, Northeastern University
[8]	Scale-DiT: Ultra-High-Resolution Image Generation with Hierarchical Local Attention, Yuyao Zhang, Dartmouth College
[9]	HouseCrafter: Lifting Floorplans to 3D Scenes with 2D Diffusion Models, Yiwen Chen, Northeastern University
[10]	Outlier-Aware Post-Training Quantization for Image Super-Resolution, Hailing Wang, Northeastern University
[11]	Does learning about time improve out-of-distribution generalization in object detection?, Kai Van Brunt, MIT
[12]	Vision Masked Image Modeling Transfers Across Domains, Pranav Sankar, Brown University
[13]	Can LVLMs Harness Visual Contexts to Untangle Ambiguity In Language?, Heejeong Nam, Brown University
[14]	SNAP: Towards Segmenting Anything in Any Point Cloud, Aniket Gupta, Northeastern University
[15]	Trace Anything: Representing Any Video in 4D via Trajectory Fields, Xinhang Liu, Dartmouth College
[16]	LASER: Layer-wise Scale Alignment for Training-Free Streaming 4D Reconstruction, Tianye Ding, Northeastern University
[17]	RealBirdID: Benchmarking Bird Species Identification in the Era of MLLMs, Logan Lawrence, UMass Amherst
[18]	Combining Translation with Magnification to Resolve Ambiguity in Super-Resolution, Daniel Fu, Brown University
[19]	DIET-CP: Lightweight and Data Efficient Self Supervised Continued Pretraining, Jakob Ambsdorf, Brown University
[20]	Potion Brewing Laboratory: An Environment for Continual Learning in World Models, Taj Gillin, Brown University
[21]	CObL: Toward Zero-Shot Ordinal Layering without User Prompting, Aneel Damaraju, Harvard University
[22]	Consensus-Driven Active Model Selection, Justin Kay, MIT
[23]	Struct2D: A Perception-Guided Framework for Spatial Reasoning in MLLMs, Hanhui Wang, Northeastern University
[24]	Attribution Robustness via Implicit Curvature Regularization, Matteo Gamba, Brown University
[25]	Discontinuous 2D Neural Fields without Meshing, Javid Lakha, Harvard University
[26]	S3: Learnable Spline-Wavelets for State Space Models, Daniel Cai, Brown University
[27]	SimpleCall: A Lightweight Image Restoration Agent in Label-Free Environments with MLLM Perceptual Feedback, Jianglin Lu, Northeastern University
[28]	SuperRivolution: Fine-Scale Rivers from Coarse Temporal Satellite Imagery, Rangel Daroya, UMass Amherst
[29]	Coffee: Controllable Diffusion Fine-tuning, Ziyao Zeng, Yale University
[30]	Structured Light with a Million Light Planes per Second, Dhawal Sirikonda, Dartmouth College
[31]	Arbitrary-Scale 3D Gaussian Super-Resolution, Huimin Zeng, Northeastern University
[32]	3D Curvix: From Multiview 2D Edges to 3D Curve Segments, Chiang-Heng Chien, Brown University

3:00-4:30

Oral Session II

[3:00]	Tell the Story, Not the Frames: Narrative-Aware Retrieval for Audio Description, Seung Hyun Hahm, Dartmouth College
[3:15]	Curvature Tuning: Provable Training-free Model Steering From a Single Parameter, Leyang Hu, Brown University
[3:30]	Iris: Integrating Language into Diffusion-based Monocular Depth Estimation, Ziyao Zeng, Yale University
[3:45]	BabyVLM-V2: Toward Developmentally Grounded Pretraining and Benchmarking of Vision Foundation Models, Shengao Wang, Boston University
[4:00]	Words That Make Language Models Perceive, Sophie Wang, MIT
[4:15]	Residual Primitive Fitting of 3D Shapes with SuperFrusta, Aditya Ganeshan, Brown University

4:30-4:45

Closing Remarks

Venue

The workshop will be held in the Marriott Room, 11th Floor, Campus Center, University of Massachusetts, Amherst.

Parking

If you signed up by the early registration deadline, we will pay for your parking. Please follow these instructions:

Park in the Campus Center Parking Garage on Levels 2-6.
Do not pay for parking.
Note your license plate number; you will be asked for it at the registration desk.
Proceed to the registration desk at the Marriott Room, 11th Floor, Campus Center.
At the registration desk, you will receive a link to register your car for parking. The link can be used within 20 minutes of your arrival.

If you did not sign up by the early registration deadline, the Campus Center Parking Garage is still the most convenient place to park, and you can pay at the parking kiosks. The rate is $1.85/hr.

New England Computer Vision (NECV) Workshop 2025

University of Massachusetts, Amherst, MA

Friday, November 21, 2025
