Video Action Recognition Under Label Uncertainty
Subproject 1: Multiview Video Understanding
– Zahid Hasan
– Overview –
Multiview video understanding aims to recognize actions from any camera angle by developing view-invariant systems. These systems extract features based solely on the underlying actions, regardless of angular viewpoint or camera sensor differences.
– Significance –
This approach ensures robust decision-making despite variations in camera viewpoints. In applications such as robotics and autonomous vehicles, systems must reliably understand scenarios from different perspectives while focusing only on the actions taking place.
– Obstacles –
The range of possible viewpoints is vast, making it difficult to gather and label data from every angle. Additionally, distribution shifts between viewpoints complicate the task of maintaining consistent performance.
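The view-invariance idea above can be sketched as a small contrastive objective: embeddings of the same action captured from two camera views are pulled together, so the learned features depend on the action rather than the viewpoint. This is a minimal illustration with an invented linear encoder and synthetic data, not the project's actual model.

```python
# Toy sketch of view-invariant representation learning via an
# InfoNCE-style contrastive loss. All names and shapes are illustrative.
import numpy as np

def encode(x, W):
    """Toy encoder: linear projection followed by L2 normalization."""
    z = x @ W
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def info_nce(z_a, z_b, temperature=0.1):
    """Contrastive loss: row i of z_a should match row i of z_b."""
    logits = z_a @ z_b.T / temperature           # pairwise similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))           # positives on the diagonal

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 16))
actions = rng.normal(size=(8, 64))               # 8 actions, 64-dim features
view_a = actions + 0.05 * rng.normal(size=actions.shape)  # camera view A
view_b = actions + 0.05 * rng.normal(size=actions.shape)  # camera view B
loss = info_nce(encode(view_a, W), encode(view_b, W))
print(round(float(loss), 3))
```

Minimizing this loss over many action pairs would push the encoder toward features shared across views, which is one common way to pursue the view invariance described above.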
Subproject 2: Novel Categories Discovery
– Zahid Hasan
– Overview –
This project focuses on identifying and clustering novel categories from datasets where only a subset of classes is labeled. The aim is to group unlabeled instances into meaningful clusters based on their semantic features, making the model capable of recognizing new classes beyond those it has seen during training. This type of approach is especially useful when dealing with vast datasets where not all classes are predefined or labeled.
– Significance –
Novel category discovery (NCD) is critical for building generalizable machine learning models that can handle new, unseen data. It extends the flexibility of AI systems by allowing them to adapt and understand novel classes autonomously, making it useful in dynamic environments like autonomous vehicles, medical diagnosis, and e-commerce. These models go beyond mere classification—they learn to detect patterns in new, unlabeled categories, enabling applications such as automatic sorting or detecting anomalies.
– Obstacles –
One of the key challenges in NCD is handling partially labeled data. Models must transfer knowledge from labeled data to unlabeled data in a way that clusters the novel categories effectively. Accurately estimating the number of novel classes and handling imbalanced or sparse datasets are also difficult. The gap between visual features and semantic understanding, as well as the scalability of clustering large-scale datasets, adds complexity to the problem. Privacy and trustworthiness are also important considerations when dealing with sensitive or personal data.
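The clustering step at the heart of NCD can be illustrated with a minimal k-means over features of unlabeled instances, standing in for the semantic grouping described above. The data, feature dimensions, and class count here are invented stand-ins, not the project's pipeline.

```python
# Toy sketch of novel-category clustering: group unlabeled feature
# vectors with a minimal k-means. Purely illustrative.
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each point to its nearest center
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(axis=1)
        # move each center to the mean of its assigned points
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

rng = np.random.default_rng(1)
# two "novel" classes the model never saw labels for
novel = np.vstack([rng.normal(0, 0.3, (20, 8)),
                   rng.normal(3, 0.3, (20, 8))])
clusters = kmeans(novel, k=2)
print(len(set(clusters.tolist())))  # number of discovered groups
```

In practice the features would come from an encoder trained on the labeled subset, and estimating `k` itself is one of the open challenges noted above.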
AI-Driven Adaptive Agro-Wildlife Overlap Prediction and Mitigation
– Bipendra Basnyat
Research Objectives:
The primary objective of this research is to develop an advanced AI-driven system capable of predicting and mitigating the long-term impacts of agricultural development and infrastructure construction on wildlife habitats and movement patterns. This system will integrate multi-modal data sources, including satellite imagery, ecological surveys, climate data, and historical wildlife movement patterns, to create dynamic, high-resolution models of agro-wildlife interactions. The AI will be designed to adapt to changing environmental conditions and human activities, providing continuous, updated predictions of how infrastructure projects such as road construction, fencing, and agricultural expansion affect wildlife over extended periods, potentially spanning decades.
A key focus of the research will be on developing AI algorithms that can simulate and predict complex, cascading effects of human interventions on ecosystems. This includes modeling changes in migration routes, breeding patterns, and population dynamics of various species in response to landscape alterations. The system will also incorporate socio-economic factors to balance human development needs with wildlife conservation, proposing optimal strategies for land use that minimize negative impacts on biodiversity while supporting sustainable agricultural growth. By providing data-driven insights and actionable recommendations, this research aims to inform policy-making, guide sustainable development practices, and foster harmonious coexistence between agricultural communities and wildlife in shared landscapes.
Research Contributions:
- The research will produce a predictive AI model for agro-wildlife interactions, developed using machine learning on historical data and satellite imagery.
- A long-term impact assessment tool for infrastructure projects will be created through the integration of ecological models with AI prediction algorithms.
- An optimal land-use strategy generator will be designed using multi-objective optimization techniques and scenario simulations.
- The project will build a wildlife migration pattern simulator with agent-based modeling, validated against GPS tracking data.
- Finally, a policy recommendation framework for sustainable development will be formulated by combining AI predictions with expert knowledge and stakeholder input.
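As a toy illustration of the agent-based modeling mentioned in the contributions, the sketch below moves synthetic "animal" agents toward a seasonal destination while a fence with a single crossing gap deflects their paths. The scenario, parameters, and dynamics are invented for illustration only.

```python
# Toy agent-based migration sketch: agents drift toward a seasonal
# target; a fence blocks crossings except through a wildlife gap.
import numpy as np

def simulate(n_agents=50, steps=200, fence_x=5.0, seed=0):
    rng = np.random.default_rng(seed)
    pos = np.column_stack([np.zeros(n_agents), rng.uniform(0, 10, n_agents)])
    target = np.array([10.0, 5.0])                  # seasonal destination
    gap = (4.0, 6.0)                                # wildlife crossing in fence
    for _ in range(steps):
        step = 0.1 * (target - pos)                 # drift toward target
        step += 0.05 * rng.normal(size=pos.shape)   # random foraging noise
        nxt = pos + step
        # block crossings of the fence except through the gap
        crossing = (pos[:, 0] < fence_x) & (nxt[:, 0] >= fence_x)
        blocked = crossing & ~((nxt[:, 1] > gap[0]) & (nxt[:, 1] < gap[1]))
        nxt[blocked, 0] = pos[blocked, 0]           # barrier stops x-movement
        pos = nxt
    return pos

final = simulate()
arrived = np.mean(final[:, 0] > 5.0)    # fraction that got past the fence
print(f"{arrived:.2f} of agents passed the barrier")
```

Varying the gap placement or fence position in such a simulator, and validating against GPS tracking data, is the kind of what-if analysis the contributions above envision at far greater fidelity.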
Wearable-based Human Action Recognition and Assessment
– Indrajeet Ghosh
- Research Objectives:
The growing interest in combining machine learning algorithms with wearable sensors to address real-world challenges necessitates a deeper understanding of this evolving research direction. This work provides a comprehensive overview of state-of-the-art machine learning approaches, methodologies, and hypotheses in the domain of sports performance assessment. Wearable sensors (such as galvanic skin response, photoplethysmography, accelerometers, gyroscopes, and magnetometers) allow for continuous, real-time monitoring of physical activities like walking, running, and complex sports actions. These technologies offer a scalable, non-intrusive, and personalized method for fitness tracking, tele-rehabilitation, and health monitoring. By integrating machine learning, wearable-based human action recognition provides intelligent motion feedback, skill assessment, and interpretability, advancing the future of human-machine teaming systems.
Despite these advancements, a notable gap exists in sensor-based performance assessment, particularly in fine-grained limb movement analysis. To address this, we focus on developing quantifiable athlete performance measures by leveraging intelligent motion analysis. This approach is key to designing next-generation IoT and AR/VR systems for automated human performance assessment. Beyond enhancing performance assessment, our second objective is to develop a robust action recognition system with limited supervision, using minimal labeled data to train state-of-the-art machine learning models capable of accurately recognizing a wide range of human activities.
Research Contributions:
- Can we develop action recognition systems that address real-world challenges, such as the limited availability of annotated data and noisy samples (domain-induced variations)?
- Can we develop data-driven algorithms that simultaneously learn subtle dissimilarities and distinctive traits from each limb to assess individual performance and specific skill sets?
- Can AI-driven feedback and visual cues boost overall performance, enhance skill development, and improve individual shortcomings?
- Can we learn the correlation between activities and their associated features, i.e., tackle intra-/inter-class variations to reduce the conditional distribution gap between source and target domains?
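The limited-supervision recognition question above can be illustrated with a deliberately simple sketch: statistical features are extracted from accelerometer windows, and a nearest-centroid classifier is fit on only a few labeled windows per activity. The signals, activity names, and feature choices are synthetic stand-ins, not the project's actual data or models.

```python
# Toy sketch of wearable action recognition with limited labels:
# per-axis statistics from 3-axis accelerometer windows, classified
# by distance to per-class centroids. Purely illustrative.
import numpy as np

def features(window):
    """Per-axis mean, std, and mean absolute first difference."""
    return np.concatenate([
        window.mean(axis=0),
        window.std(axis=0),
        np.abs(np.diff(window, axis=0)).mean(axis=0),  # frequency-sensitive
    ])

def make_windows(freq_hz, n, seed):
    """Synthetic 3-axis accelerometer windows oscillating at freq_hz."""
    rng = np.random.default_rng(seed)
    t = np.arange(100) / 50.0                       # 2 s at 50 Hz
    base = np.sin(2 * np.pi * freq_hz * t)[:, None] * np.ones((1, 3))
    return [base + 0.1 * rng.normal(size=base.shape) for _ in range(n)]

walk = make_windows(2.0, 10, seed=0)                # slower oscillation
run = make_windows(6.0, 10, seed=1)                 # faster oscillation

# "limited supervision": only 3 labeled windows per class
centroids = {
    "walk": np.mean([features(w) for w in walk[:3]], axis=0),
    "run": np.mean([features(w) for w in run[:3]], axis=0),
}

def predict(window):
    f = features(window)
    return min(centroids, key=lambda c: np.linalg.norm(f - centroids[c]))

correct = sum(predict(w) == "walk" for w in walk[3:]) \
        + sum(predict(w) == "run" for w in run[3:])
print(f"{correct}/14 held-out windows classified correctly")
```

Real wearable pipelines would replace these hand-picked statistics with learned representations and address the domain-shift and label-scarcity questions listed above, but the sketch shows how little supervision a well-separated feature space can require.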