Director
Breanna (Bree) Shi
Bree Shi is a fourth-year PhD student studying Bioinformatics with minors in Machine Learning and Higher Education. She holds several prestigious awards, including the GEM Ph.D. Engineering and Science Fellowship issued by the National GEM Consortium, the STEM Diversity PhD Fellowship issued by Graduate Fellowships for STEM Diversity, and the GAANN Biology Fellowship issued by the U.S. Department of Education. Before becoming a PhD student at Georgia Tech, Bree earned her master’s in Mathematics from the University of Minnesota, where she received the Diversity of Views and Experiences Fellowship, and her BS in Mathematics from Stetson University.
Dr. Nicholas Lytle
Dr. Nick Lytle is the Director of Research for Georgia Tech’s OMSCS program. Lytle has a Ph.D. in computer science and has worked as a researcher, consultant, and data scientist. His research expertise is in Computing Education, Educational Technology, and AI for Education. His goal is to make computing education more accessible and effective for everyone.
Charles (Charlie) Clark
Charlie Clark is a current M.S. in Computer Science student specializing in Machine Learning. He has a B.S. with dual majors in Applied Mathematics and Economics from Stony Brook University (2022, summa cum laude). He has been working on a novel vision transformer architecture for end-to-end object re-identification (reID) as part of the Cichlid CV team. In his spare time, he enjoys hiking, playing the violin, and exploring the latest advances in AI.
Thomas Deatherage
I’m a software engineer in Georgia. I’ve worked as a developer at a wide range of companies, from very big to very small. I’ve also been a developer/researcher with HAAG since its inception in summer 2024. I expect to complete my master’s at Georgia Tech sometime next year.
Cichlid Computer Vision Project
Researchers: Charles Clark, Kailey Quesada, Thuan Nguyen, Adam Thomas, Jeanette Schofield
Collaborator: Dr. Patrick McGrath
The project will focus on analyzing the newly available data from multiple cichlid species using the Cichlid Bower Tracking Repository. This will involve processing the raw video footage through the existing pipeline to extract annotated behavioral data. Concurrently, a multi-species animal tracking dataset will be curated by combining data from the various species. Building upon the initial analysis, the project will explore data distillation techniques to improve the efficiency and scalability of the tracking process. This may involve techniques such as data subsampling, compression, or feature extraction to reduce the computational overhead without significantly compromising accuracy. Novel challenges such as occlusion, where animals partially or fully obscure each other, will be tackled through the development of specialized algorithms and model architectures.
Publications:
- Johnson, Z.V., Arrojwala, M.T.S., Aljapur, V. et al. Automated measurement of long-term bower behaviors in Lake Malawi cichlids using depth sensing and action recognition. Sci Rep 10, 20573 (2020). https://doi.org/10.1038/s41598-020-77549-2
- Long, L., Johnson, Z.V., Li, J. et al. Automatic classification of cichlid behaviors using 3D convolutional residual networks. iScience 23, 101591 (2020). https://doi.org/10.1016/j.isci.2020.101591
Natural Florida History Museum Project
Researchers: Thomas Deatherage, Vy Nguyen, Romouald Dombrovski
Collaborator: Dr. Arthur Porto
Natural history collections are invaluable resources for scientific research, education, and public engagement. However, the sheer volume and diversity of specimens often make it challenging to efficiently search and retrieve specific items. The goal of this project is to develop a sophisticated search interface that leverages advanced machine learning techniques to embed images from natural history collections, enabling users to search the database using images or natural language queries.
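As a rough illustration of embedding-based search, the sketch below ranks specimens by cosine similarity between a query embedding and stored image embeddings. The specimen IDs, the three-dimensional vectors, and the `search` helper are hypothetical and made up for this example; a real system would obtain its embeddings from a trained image/text model, which is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def search(query_embedding, index, top_k=3):
    """Return the top_k (specimen_id, score) pairs, best match first."""
    scored = [
        (cosine_similarity(query_embedding, emb), specimen_id)
        for specimen_id, emb in index.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(specimen_id, score) for score, specimen_id in scored[:top_k]]


# Hypothetical toy index: specimen ID -> embedding (illustrative values only).
index = {
    "specimen_001": [0.9, 0.1, 0.0],
    "specimen_002": [0.1, 0.9, 0.1],
    "specimen_003": [0.7, 0.3, 0.1],
}

# Hypothetical query embedding, e.g. from an uploaded image or a text query.
query = [1.0, 0.0, 0.0]
results = search(query, index, top_k=2)
for specimen_id, score in results:
    print(specimen_id, round(score, 3))
```

Because image and text queries can be mapped into the same vector space by a suitable model, the same ranking step could serve both query types.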
Researchers: Thomas Orth, Michael Bock
Collaborator: Charlotte Alexander, J.D.
One of this project’s goals is to summarize legal documents from clearinghouse.net in order to automate work that the Clearinghouse currently does with law students. The team is also working on a document classification task on a dno case dataset from UPenn, attempting to automatically populate fields such as the allegation and the companies involved.
Researchers: Karol Gutierrez Suarez, Alejandro Gomez, Víctor Fernandez
Collaborator: Charlotte Alexander, J.D.
This project applies NLP to court case documents from the Dominican Republic to estimate processing duration and thereby improve case triaging. The narrowest goal is to extract from each sentencia the procedural history of the case, focusing on dates. From that, a timeline of each case can be constructed. Other variables can also be built, e.g. the type of case, the court, and the judge, and eventually a model can be built that helps identify or even predict the types of cases that take the longest to progress from step A to the final outcome. The secondary goal is to help the public policy staff in the judiciary identify possible interventions that would reduce delays. The next-level goal is to demonstrate to the judiciary the value of structured data. If the judges realize that structuring their sentencias more consistently will make this kind of analysis easier, and therefore make their courts more efficient, they might be motivated to adopt and follow rules that standardize the information they include in the sentencias. Greater standardization would in turn make information extraction easier. An additional next-level goal is to use this work as a proof of concept for other countries’ courts and legal systems, as a way to demonstrate the value of a data science approach to courts and court operations.
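The date-extraction step can be pictured with a minimal sketch like the one below. It assumes dates appear in a numeric day/month/year form and uses a made-up sample sentence; real sentencias are likely to use other formats (for example, dates written out in words), so the pattern and the helper names `extract_dates` and `case_duration_days` are illustrative assumptions rather than the team's actual method.

```python
import re
from datetime import date

# Assumed format: dd/mm/yyyy. Real documents may need additional patterns.
DATE_PATTERN = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")


def extract_dates(text):
    """Return all valid dd/mm/yyyy dates found in the text, in reading order."""
    found = []
    for day, month, year in DATE_PATTERN.findall(text):
        try:
            found.append(date(int(year), int(month), int(day)))
        except ValueError:
            continue  # skip impossible dates such as 31/02/2020
    return found


def case_duration_days(text):
    """Days between the earliest and latest date mentioned, or None."""
    dates = sorted(extract_dates(text))
    if len(dates) < 2:
        return None
    return (dates[-1] - dates[0]).days


# Made-up example text, not taken from any real case.
sample = (
    "Demanda depositada el 05/03/2019. "
    "Audiencia celebrada el 17/09/2019. "
    "Sentencia dictada el 02/03/2020."
)

timeline = sorted(extract_dates(sample))
print(timeline)
print(case_duration_days(sample))
```

A sorted list of dates per case is the simplest form of timeline; pairing each date with the procedural step it refers to would require further extraction that this sketch does not attempt.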
Lizard Computer Vision Project
Researchers: Mercedes Quintana, Philip Woolley, Jacob Dallaire, Ruiqing Wang, Ayush Parikh
Collaborator: Dr. James Stroud
The project is a collaboration with the Stroud lab in the Department of Biological Sciences, within the domain of Ecology and Evolution. The research investigates the morphology-to-fitness connection in adaptive evolution, and specifically the performance cost of missing a leg in lizards. Given videos of experimental trials involving lizards jumping and sprinting, the team uses DeepLabCut (DLC) to track the positions of key body parts, from which biophysical information of interest to biologists can be extracted.
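As one example of the kind of biophysical quantity that can be derived from tracked positions, the sketch below computes frame-to-frame speed for a single body part. It assumes the per-frame (x, y) pixel coordinates have already been read from the tracking output; the frame rate, the pixel-to-centimeter scale, the coordinates, and the `speeds_cm_per_s` helper are all hypothetical values and names chosen for illustration.

```python
import math


def speeds_cm_per_s(positions_px, fps, px_per_cm):
    """Speed between consecutive frames for one tracked body part.

    positions_px: list of (x, y) pixel coordinates, one per frame.
    fps: video frame rate in frames per second (assumed known).
    px_per_cm: calibration scale in pixels per centimeter (assumed known).
    """
    speeds = []
    for (x0, y0), (x1, y1) in zip(positions_px, positions_px[1:]):
        dist_cm = math.hypot(x1 - x0, y1 - y0) / px_per_cm
        speeds.append(dist_cm * fps)  # distance per frame times frames per second
    return speeds


# Hypothetical track of one body part over four frames (pixels).
track = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (6.0, 8.0)]

speeds = speeds_cm_per_s(track, fps=100.0, px_per_cm=10.0)
peak_speed = max(speeds)
print(speeds, peak_speed)
```

In practice, low-confidence detections would usually be filtered or smoothed before differentiating, since frame-to-frame differences amplify tracking noise.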
Researchers: Yoon Kim, Michael Falter, Vikas Agarwal, Eve Dang
The goal of this project is to create and maintain a structure for large research groups in higher education. The work covers codebase solutions, contribution tracking, resource management, researcher support, and program development. Members of this team also often contribute individually to other projects, as directed.