At the crossroads of computer vision and natural language processing
Wednesday, December 16, 2020
Samira Ebrahimi-Kahou, who recently joined the Department of Software Engineering as a faculty researcher, is interested in the learning of multimodal representations that allow reasoning in supervised and reinforcement learning tasks. Her research activities focus on problem solving at the intersection of computer vision and natural language processing.
Take autonomous vehicles, for example. A vehicle’s interaction with its environment requires a deep understanding of everything that may occur while driving: possible collisions, interactions with humans, objects moving through space, etc. "To understand the real world, a car must integrate information coming from many sources, including vision, text and sound. I’m trying to develop deep learning methods that address the problems related to integrating all this information," she explained.
The researcher believes firmly that her work will enable the design of models that contribute to humanity’s well-being. As examples, she cites applications to improve accessibility for the visually impaired and enhance the performance of personal digital assistants. Other applications could enable visual content creation from linguistic exchanges, which could result in vastly improved design tools for many purposes.
Particularly interested in designing applications that can respond to humanitarian disasters, the professor is currently collaborating on efforts to deal with the historic infestation of locusts in Africa, the Middle East and Southwest Asia, brought about by a combination of factors including unusual weather conditions. This transnational plague is the worst seen in at least a generation and a major threat to food security.
Convinced that machine learning and artificial intelligence will play a growing role in our daily lives, Professor Ebrahimi-Kahou believes that researchers in this field have a social responsibility, and that training the next generation is of prime importance.
Her academic background
Samira Ebrahimi-Kahou is crazy about math and holds a bachelor’s degree in applied mathematics and a master’s in computer systems and components. She obtained her PhD in computer engineering at Polytechnique Montreal/Mila in 2016, after which she joined Microsoft Research Montreal. She was also a postdoctoral fellow at McGill University.
In the course of her doctoral studies, she focused on developing new deep learning methods for video analysis. Her work on recognizing emotions in video content has served as a basis for many other projects.
Since obtaining her doctorate, she has focused on various multimodal problems, including iterative image generation from dialogues and improving visual navigation. She has contributed to the creation of several large-scale datasets, including:
- FigureQA - visual reasoning on mathematical plots;
- Something-Something - fine-grained video captioning;
- ReDial - conversational movie recommendations.
On the application side, she works on machine learning for disaster response with a focus on modeling extreme weather events.
Professor Ebrahimi-Kahou was a teaching assistant and lecturer for both undergraduate and graduate level courses. She mentored many students and junior researchers, and gave several talks on topics related to her research.
Her research interests
- Learning from multimodal information such as video and text
- Learning via interaction with an environment
- Machine learning methods for disaster response
Communications service