Assistant Professor Judy Hoffman is at the forefront of training computing systems to ‘see’ the world as people do and adapt in real time as the situation demands
By Joshua Preston | Photos by Kevin Beasley
Judy Hoffman is planning a large-scale research event that in some ways could be compared to a Broadway musical — it’s a limited production with the promise of high spectacle and will draw a global audience eager to see a show.
The event is this summer’s Computer Vision and Pattern Recognition Conference (CVPR), an international gathering of researchers in computer vision, a subfield of artificial intelligence that, at its simplest, is about training computers to process image and video pixels so they “see” the world as people do.
Over the past 12 months, Hoffman, an assistant professor in the School of Interactive Computing at Georgia Tech, has helped engineer a global production as program co-chair for CVPR. A sampling of her duties includes: coordinating with experts across industries, universities, and national labs; guiding 400+ research area chairs in charge of the peer-review process; and defining the scope and shape of the technical program for the conference. Starting June 18, Hoffman will see the results of the team’s year-long effort when 8,000+ researchers descend on Vancouver.
“It’s been a lot, but well worth it,” says Hoffman. “The research community is producing work that is pretty striking, especially when you start to see the basic research making its way into mainstream applications.”
As a byproduct of her role, Hoffman is helping shape computer vision at a time when the broader field of artificial intelligence is getting more attention. AI captured the public’s imagination late last year with the release of “generative AI” bots. A person types a request into a web browser, and within seconds these bots, such as ChatGPT, can produce high-fidelity text (think essays or emails) or photo-realistic art. Seeing the results firsthand is often enough to convince skeptics.
Predictions abound that AI will become a major disruptor and that critical applications, such as self-driving vehicles, are now closer on the horizon. But Hoffman and her peers in the field understand the level of complexity at play and what must be considered before humanity steps into this science fiction future.
Academia’s Advantage as a Career Path
Hoffman has a fluid understanding of computer vision and its diverse mechanics, a skill that traces back to her undergraduate and Ph.D. studies at UC Berkeley and her postdoctoral work at Stanford. At Berkeley she earned a dual degree in computer science and electrical engineering, a pathway that allowed her to study everything from signal processing to AI and higher-level computing.
“It allowed me to kind of find my niche of the things that I found interesting,” says Hoffman.
It also gave her a clear sense that academia was her future professional home.
“It’s kind of a unique time for the AI field in that there is so much access to research outside of academia,” notes Hoffman, a former research scientist at Facebook AI Research. “But for me personally, academia and teaching students is what I like to do, because it’s so open in terms of what you can work on, who you can talk to, and you can quickly pivot to investigating new topics with new people.
“Not only do you get to choose who you work with, but also who you mentor. The relationship that you have with Ph.D. students in mentoring them for five or more years is very unique. So it’s not only about the research for me, it’s kind of the entire job description.”
Now after four years at Georgia Tech, Hoffman has an established lab with a mixture of students at various levels of seniority, ranging from undergraduates to Ph.D. students.
“I love having that pipeline. Obviously, what people accomplish at different levels of seniority is not equivalent, but they can learn from each other,” says Hoffman. “I think it makes everybody more productive, and I tend to push my senior students to take on mentorship roles.”
Hoffman says that a metric of her success will be if she can design a lab that continues to have that type of pipeline hierarchy where everybody is helping each other and improving.
“What’s been really awesome to see is that the most senior Ph.D. students in the lab have now started to work on lots of projects at once and also kind of advise on a few different projects at a low level. They are helping students brainstorm ideas and helping them with things like all their code bugs or characterizing what is the right way to prioritize experimentation and think about research.”
The Challenge of Computer Vision
When people see new things, they can usually adapt to what the situation demands. People use their vision to perceive their environment and don’t think twice about it. For machines, it’s not so intuitive, and sometimes the lessons they’ve already learned don’t stick.
According to Hoffman, computers need to be able to implicitly use a 3D model of the spaces they see (like humans do) in order to navigate or understand that space. They also need to understand the semantics of the world, or, put more simply, consistently recognize common objects. For example, an apple is not sometimes an apple and sometimes a rock, it’s always an apple.
Hoffman says spatial modeling and semantic knowledge are just two of the many requirements on the roadmap to making computer vision more robust.
The California native’s overall goal is to build technologies that solve these computer vision tasks reliably and, ideally, with some measure of interpretability, meaning steps a human can understand.
With the commercial traction of a lot of AI technologies, Hoffman notes that one risk is the deployment of systems before they’re ready. The implications could be significant — accidents involving self-driving cars are a commonly cited scenario.
Researchers typically benchmark their algorithms under a “closed-world assumption,” testing them on the same kinds of data they were trained on. Progress, even very good progress, Hoffman says, can be misleading if there is a heavy focus on benchmarking in one particular setting.
“It’s very natural for non-experts to see that progress and to think that system X should immediately generalize to new settings in system Y.”
Hoffman’s projects emphasize finding ways to automatically characterize deficiencies in existing systems so that users know what might or might not work if integrating that computer vision system into their own applications.
“There’s a lot of different ways in which you could accidentally miss out on very real aspects of the world,” Hoffman says. “If they are missing, the system really can only generalize to the type of data that it’s seen examples of.”
Her teams specialize in trying to create learning systems that can adapt when confronted with new situations and then building systems that can be as robust and reliable as possible.
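The closed-world pitfall Hoffman describes can be illustrated with a deliberately simple sketch. This is a hypothetical toy example, not code from her lab: a classifier that only knows the categories it was trained on will silently force every new input into one of them, unless it is also built to admit what it has never seen.

```python
# Toy illustration of the closed-world assumption (hypothetical example,
# not from any real system): a model trained on a fixed label set forces
# every input into that set, even inputs it has never encountered.

TRAINED_LABELS = {"apple", "rock", "table", "laptop"}

def closed_world_predict(obj: str) -> str:
    """Always answers with a trained label, right or wrong."""
    # If the object is unfamiliar, the model still picks *some* known label.
    return obj if obj in TRAINED_LABELS else "rock"

def open_world_predict(obj: str) -> str:
    """Admits uncertainty when the input falls outside the training set."""
    return obj if obj in TRAINED_LABELS else "unknown -- needs human review"

print(closed_world_predict("bicycle"))  # confidently wrong: "rock"
print(open_world_predict("bicycle"))    # flags the gap instead of guessing
```

The difference between the two functions is the whole point: a system that can only generalize to data it has seen examples of gives no signal when it is out of its depth.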
Generalization versus Adaptation
If computer vision systems are to operate reliably in the real world, with all its messiness and unpredictability, adaptation to the unknown is essential. It is a recurring theme when Hoffman talks about her research and is central to her work.
A new National Science Foundation CAREER project she is leading advocates “for resilient vision systems through a new integrated approach which iterates between generalizing across available visual domains and rapidly adapting given new domain data.”
In other words, general vision systems that may have a large menu of pre-trained scenarios need to pair well with adaptive systems designed to handle the unknown. Making the two work together is no trivial task.
“There’s kind of a fundamental mismatch with generalization and adaptation approaches that needs to be resolved in order to really get the best of what both fields have to offer,” Hoffman says.
Hoffman goes on to explain that techniques used to improve generalization can sometimes hurt a system’s calibration: the system may be doing better overall at solving a task, but its confidence scores no longer reflect how reliable its answers are.
“It’s just as likely to say that something is a table with 99% confidence even if it’s actually a laptop. It may be doing better overall at recognizing tables and laptops but at the cost of not having the nuance of understanding uncertainty.
“If you are trying to adapt a computer vision system, then that system, and the people it affects, need to know when there’s this uncertainty. Otherwise the human doesn’t really know when to intervene.”
Creating The Next
The new breed of AI chatbots and text-to-image algorithms is likely to be followed in quick succession by others. When they arrive, computer vision will play a role more often than not, and it will take experts such as Hoffman and her students to advocate for the safest and smartest capabilities for these and other, more critical systems.
One student recently joined Hoffman’s lab while working with another group on a quadrupedal robot that uses a vision system to navigate a real-world environment. Hoffman made sure the student had time to finish the work while also starting his Ph.D. program. It’s an example of her mentoring approach and of how she prioritizes the advancement of the field by making sure students are positioned for success.
One key to success that Hoffman has learned: “Students understandably get very ‘heads down’ focusing on their particular projects. That goes with the territory, but I think that the best research and the best ideas come from breaking out of what you are thinking about on a day-to-day basis.
“Something that people don’t necessarily think about with research is that it’s a really social discipline and the best things come from casual collaboration or in a social setting.”
Hoffman speaks from experience. At a junior faculty mixer, she connected with an aerospace engineer and is now part of a large collaborative grant involving NASA and other partners. They are working on systems for urban air mobility.
“Instead of self-driving cars, it’s self-driving cars…that fly.”
Maybe that science fiction future isn’t so far off after all.