On the Horizon
More powerful artificial intelligence (AI) is entering the mainstream on a screen near you. It’s now possible to test drive a handful of services and applications that show this new computing power and automation in high gear.
As this tech spreads across our digital landscape, we asked Georgia Tech experts publishing new research in computer vision — a subfield of AI involving the processing of visual data (e.g. photos and video) — to share their take on the state of AI and what this new horizon of technology looks like.
Faculty Experts on the State of Artificial Intelligence
Dhruv Batra
Assoc. Professor, Interactive Computing
Contemporary discussion (and, unfortunately, hype) about large language models (LLMs) and artificial general intelligence (AGI) seems oblivious to Moravec’s paradox. We’ve hypothesized since the ’80s that the hardest problems in AI involve sensorimotor control, not abstract thought or reasoning. It explains why AI is mastering games, dialogue, and scene generation. But robots are conspicuously missing from this revolution.
At Georgia Tech, we are pursuing the embodied road to general intelligence: the idea that intelligence emerges in the interaction of an agent with an environment and as a result of sensorimotor activity. We study problems of embodied navigation, mobile manipulation, world modeling, future prediction, social interaction, etc., in rich 3D simulators at scale, as well as sim2real transfer to real robots.
Yongxin Chen
Asst. Professor, Aerospace Engineering
The emergence of generative AI such as text2image and ChatGPT heralds a new era of AI. It not only greatly expands the applications of AI but also enables the public to have access to frontier AI technologies. Its commercial potential will further boost the development of AI.
Our lab focuses on the diffusion model, the backbone of text2image. Our CVPR paper presents a method that can effectively use incomplete or fragmented data to train generative models, with the aim of relaxing the conventional dependency on complete datasets and scaling better with the training data that is available.
Irfan Essa
Professor, Interactive Computing
James Hays
Assoc. Professor, Interactive Computing
The long-standing computer vision “grand challenge” of giving machines a human-like understanding of images is nearly solved due to advances in learning, computation, and training data (in increasing order of importance).
Similar advances in other artificial intelligence fields have led to worries that AI is an “existential threat”. There is no reason to worry in the near and medium term. AI is not embodied, and even if it were, it would still need widespread human cooperation to be a threat.
We have passed many AI “singularities” (e.g. machines have been faster at math for a century), and it is exciting to pass a few more this decade, but many more remain!
Judy Hoffman
Asst. Professor, Interactive Computing
It’s truly an amazing time to be working in AI! The world is paying attention and eager to find out how AI will impact their lives. Late-breaking research developments are rapidly being integrated into products and propelling new commercial frontiers.
As the reach of this technology further expands in the coming years, our challenge in research will be not only to expand capabilities and make them more accessible, but also to advance reliability and trustworthiness so that AI continues to benefit society.
Zsolt Kira
Asst. Professor, Interactive Computing
One of the exciting aspects of current progress in AI is the unification of models across all modalities including language, vision, and audio. As language models become more powerful, I’m excited about the interaction between perception and language, allowing us to ground language models to the real world — preventing these models from hallucinating — and chat with computers about images and videos.
The ability to process all of these types of information jointly and reason about them brings us closer to more general intelligence.
Yingyan (Celine) Lin
Assoc. Professor, Computer Science
AI is transforming numerous sectors of our technology and society, thanks to its amazing breakthroughs, especially in computer vision and natural language processing. While its potential is immense, promising a future of efficient, eco-conscious technology driving societal change, this brings computational and environmental challenges.
Training powerful AI models demands significant resources and produces enormous carbon emissions. Looking forward, balancing computational demand, model performance, and sustainability is crucial for AI advancement and its widespread applications.
Ling Liu
Professor, Computer Science
James Rehg
Professor, Interactive Computing