Vision

In keeping with New York University’s mission to become a fully connected global network university, faculty from NYU New York and Abu Dhabi created the NYU Multimedia and Visual Computing Lab as an intellectual hub where faculty, researchers and students from both campuses come together to study and address the key challenges in multimedia and visual data processing. With advances in data acquisition techniques, we have observed an exponential increase in visual data across different domains and modalities, such as 2D images, 2D videos, 2D sketches, 2.5D depth images, 3D point clouds and 3D meshed surfaces. We are therefore faced with an ever-increasing demand for approaches to automatic visual data processing, understanding and analysis. Visual data are often highly complex, subject to large structural variations, intrinsically imprecise and ambiguous, and heavily affected by noise and incompleteness. For instance, a car built by different manufacturers is likely to differ significantly in its 3D shape representation; a building viewed from different angles is likely to differ in its 2D view representation; and a horse sketched by individuals with different experience and cognition is likely to differ in its sketch representation. Our research lab aims to develop a unified framework based on state-of-the-art techniques in big data and deep learning to address these challenges. Specifically, the ongoing research threads in the lab are as follows:

  • 3D computer vision: The development of novel techniques that handle challenging research problems in 3D object detection, classification and registration.
  • Large-scale visual computing: The development of novel techniques for addressing challenging research problems caused by the exponential growth of visual data in the era of “Big Data”.
  • Deep visual computing: The development of novel techniques that use state-of-the-art deep learning to discover hidden visual patterns for visual object recognition.
  • Deep cross-domain model: The development of novel techniques for deeply mining the intrinsic relationships among loosely related data across different domains, such as 2D images and 3D shapes.
  • Deep cross-modality model: The development of novel techniques for deeply exploring the affinity arising from divergent relations among data present in different modalities, such as 3D shapes and semantic text-based descriptions.
  • 3D scene understanding: The development of novel techniques for scene segmentation, detection, tracking and semantic scene labeling for RGB-D data.
  • 3D Computational Structural Biology: The development of new methods to address challenging issues caused by the significant conformational structural flexibility of biological molecules.