Faculty

On the Horizon

More powerful artificial intelligence (AI) is entering the mainstream on a screen near you. It’s now possible to test-drive a handful of services and applications that show this new computing power and automation in high gear.

As this tech spreads across our digital landscape, we asked Georgia Tech experts publishing new research in computer vision — a subfield of AI involving the processing of visual data (e.g. photos and video) — to share their take on the state of AI and what this new horizon of technology looks like.

“Sunrise” multimedia created using Dall•E and Photoshop. (Credit: Josh Preston)

Faculty Experts on the State of Artificial Intelligence

Dhruv Batra

Assoc. Professor, Interactive Computing

Contemporary discussion — and unfortunately, hype — about large language models (LLMs) and artificial general intelligence (AGI) seems oblivious to Moravec’s paradox. We’ve hypothesized since the ’80s that the hardest problems in AI involve sensorimotor control, not abstract thought or reasoning. This explains why AI is mastering games, dialogue, and scene generation. But robots are conspicuously missing from this revolution.

At Georgia Tech, we are pursuing the embodied road to general intelligence – the idea that intelligence emerges from an agent’s interaction with an environment and from sensorimotor activity. We study embodied navigation, mobile manipulation, world modeling, future prediction, social interaction, and related problems in rich 3D simulators at scale, along with sim2real transfer to real robots.
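
To make the sensorimotor loop Batra describes concrete, here is a minimal sketch of an agent that perceives its world only through observations and changes it only through actions. The ToyNavEnv class is a made-up grid-world stand-in, not Habitat or any real 3D simulator, and the random policy is a placeholder for a learned one.

```python
import random

# A toy grid world standing in for a 3D simulator: the agent perceives the world only
# through observations and changes it only through actions (the sensorimotor loop).
class ToyNavEnv:
    ACTIONS = ["forward", "turn_left", "turn_right"]

    def __init__(self, size=8):
        self.size = size
        self.goal = (size - 1, size - 1)
        self.reset()

    def reset(self):
        self.pos, self.heading = (0, 0), 0                      # heading: 0=N, 1=E, 2=S, 3=W
        return self._observe()

    def _observe(self):
        # Egocentric observation: only the Manhattan distance to the goal, no global map.
        return abs(self.goal[0] - self.pos[0]) + abs(self.goal[1] - self.pos[1])

    def step(self, action):
        if action == "turn_left":
            self.heading = (self.heading - 1) % 4
        elif action == "turn_right":
            self.heading = (self.heading + 1) % 4
        else:  # forward
            dx, dy = [(0, 1), (1, 0), (0, -1), (-1, 0)][self.heading]
            self.pos = (min(max(self.pos[0] + dx, 0), self.size - 1),
                        min(max(self.pos[1] + dy, 0), self.size - 1))
        obs = self._observe()
        return obs, obs == 0                                     # done when the goal is reached

env = ToyNavEnv()
obs, done = env.reset(), False
for step in range(200):                                          # random policy as a placeholder
    obs, done = env.step(random.choice(ToyNavEnv.ACTIONS))
    if done:
        break
print(f"reached goal: {done} after {step + 1} steps")
```

Research systems replace the toy environment with photorealistic 3D simulators and the random policy with learned navigation and manipulation policies, then transfer them to physical robots (sim2real).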

Yongxin Chen

Asst. Professor, Aerospace Engineering

The emergence of generative AI such as text2image and ChatGPT heralds a new era of AI. It not only greatly expands the applications of AI but also gives the public access to frontier AI technologies. Its commercial potential will further boost the development of AI.

Our lab focuses on diffusion models, the backbone of text2image systems. Our CVPR paper presents a method that can effectively use incomplete or fragmented data to train generative models, with the aim of relaxing the conventional dependency on complete datasets for better scalability with respect to training data availability.
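
As a rough illustration of how a diffusion model is trained, and of where incomplete data could enter the picture, here is a minimal PyTorch sketch. The TinyDenoiser network, the noise schedule, and the per-value observed_mask are all illustrative assumptions; masking the loss this way is a simple stand-in for the general idea, not the method from the CVPR paper Chen describes.

```python
import torch
import torch.nn as nn

# Toy denoiser: given a noised sample and the (normalized) timestep, predict the added noise.
class TinyDenoiser(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 128), nn.ReLU(), nn.Linear(128, dim))

    def forward(self, x_t, t):
        return self.net(torch.cat([x_t, t.unsqueeze(-1)], dim=-1))

T = 1000
betas = torch.linspace(1e-4, 0.02, T)                  # standard linear noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

def training_step(model, x0, observed_mask, optimizer):
    """One denoising training step; positions where observed_mask == 0 are excluded from the loss."""
    b, _ = x0.shape
    t = torch.randint(0, T, (b,))
    noise = torch.randn_like(x0)
    a_bar = alphas_bar[t].unsqueeze(-1)
    x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise   # forward diffusion q(x_t | x_0)
    pred = model(x_t, t.float() / T)
    loss = (((pred - noise) ** 2) * observed_mask).sum() / observed_mask.sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

model = TinyDenoiser()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x0 = torch.randn(8, 64)                                # stand-in "images" (8 samples, 64 values each)
mask = (torch.rand(8, 64) > 0.3).float()               # 1 = observed value, 0 = missing value
print(training_step(model, x0, mask, opt))
```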

Irfan Essa

Professor, Interactive Computing

James Hays

Assoc. Professor, Interactive Computing

The long-standing computer vision “grand challenge” — to give machines a human-like understanding of images — is nearly solved due to advances in learning, computation, and training data (in increasing order of importance).

Similar advances in other artificial intelligence fields have led to worries that AI is an “existential threat”. There is no reason to worry in the near and medium term. AI is not embodied, and even if it were, it would still need widespread human cooperation to be a threat.

We have passed many AI “singularities” (e.g. machines have been faster at math for a century), and it is exciting to pass a few more this decade, but many more remain!

Judy Hoffman

Asst. Professor, Interactive Computing

It’s truly an amazing time to be working in AI! The world is paying attention and eager to find out how AI will impact people’s lives. Late-breaking research developments are rapidly being integrated into products and propelling new commercial frontiers.

As the reach of this technology expands further in the coming years, our challenge in research will be not only to expand capabilities and make them more accessible, but also to advance reliability and trustworthiness so that AI continues to benefit society.

Zsolt Kira

Asst. Professor, Interactive Computing

One of the exciting aspects of current progress in AI is the unification of models across modalities, including language, vision, and audio. As language models become more powerful, I’m excited about the interaction between perception and language, allowing us to ground language models in the real world — preventing these models from hallucinating — and to chat with computers about images and videos.

The ability to process all of these types of information jointly and reason about them brings us closer to more general intelligence.
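
One readily available taste of tying vision to language is an off-the-shelf image-captioning model from the Hugging Face transformers library, sketched below. It is only a small slice of the unified, grounded multimodal models Kira describes; the checkpoint name and prompt are simply those of the public BLIP captioning model, and example.jpg is a placeholder for any local photo.

```python
# Requires: pip install transformers torch pillow
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

# Public BLIP image-captioning checkpoint; "example.jpg" is a placeholder for any local photo.
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

image = Image.open("example.jpg").convert("RGB")
inputs = processor(images=image, text="a photo of", return_tensors="pt")  # optional text prompt

out = model.generate(**inputs, max_new_tokens=30)
print(processor.decode(out[0], skip_special_tokens=True))   # caption grounded in the image
```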

Yingyan (Celine) Lin

Assoc. Professor, Computer Science

AI is transforming numerous sectors of technology and society, thanks to its amazing breakthroughs, especially in computer vision and natural language processing. While its potential is immense, promising a future of efficient, eco-conscious technology driving societal change, these advances also bring computational and environmental challenges.

Training powerful AI models demands significant resources and produces enormous carbon emissions. Looking forward, balancing computational demand, model performance, and sustainability will be crucial for the advancement of AI and its widespread application.
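
A back-of-envelope calculation shows why training cost and carbon footprint move together: energy scales with GPU count, per-GPU power draw, training time, and datacenter overhead, and emissions scale with the carbon intensity of the grid. Every number in the sketch below is an illustrative assumption, not a measurement of any particular model.

```python
# Back-of-envelope footprint of a training run:
#   energy (kWh) = GPUs x per-GPU power (kW) x hours x datacenter overhead (PUE)
#   emissions (kg CO2) = energy x grid carbon intensity (kg CO2 / kWh)
# All defaults are illustrative assumptions, not measurements of any real model.

def training_footprint(num_gpus, gpu_power_kw, hours, pue=1.1, grid_kg_co2_per_kwh=0.4):
    """Return (energy in kWh, emissions in kg of CO2) for one training run."""
    energy_kwh = num_gpus * gpu_power_kw * hours * pue
    co2_kg = energy_kwh * grid_kg_co2_per_kwh
    return energy_kwh, co2_kg

# Hypothetical example: 512 GPUs drawing ~0.4 kW each, running for two weeks.
energy, co2 = training_footprint(num_gpus=512, gpu_power_kw=0.4, hours=24 * 14)
print(f"{energy:,.0f} kWh, roughly {co2 / 1000:,.1f} tonnes of CO2")
```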

Ling Liu

Professor, Computer Science

James Rehg

Professor, Interactive Computing
