Projects

Deb8 | Clash of the Models:
Deb8 | Clash of the Models was developed for the Meta Llama 3 Hackathon, where it won 2nd place out of more than 500 participants. The project used a multi-agent system of debaters, moderators, and scorers to automate debate script generation. It integrated VAPI text-to-voice APIs with Unity 3D models to create a game show aesthetic, and it evaluated the reasoning skills of AI models such as Llama, GPT, Gemma, and Mistral through AI-powered scoring. It also connected with Hugging Face transformers to score over 1 million open-source models. The project was co-developed in under 24 hours. Through it, I gained hands-on experience with multi-agent systems, text-to-voice API integration, and Unity 3D models, and I strengthened my rapid prototyping and collaborative development skills under tight hackathon deadlines. Evaluating the reasoning abilities of different AI models also deepened my understanding of AI performance metrics and of practical applications of AI in entertainment and education.
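The debater, moderator, and scorer roles described above can be sketched roughly as follows. This is a minimal illustration, not the hackathon code: the function names (run_debate, make_stub_debater, stub_moderator, stub_scorer) are hypothetical, and the stub agents stand in for real model calls, which are not shown here.

```python
def run_debate(topic, debaters, moderator, scorer, rounds=2):
    """Run a scripted debate.

    debaters: dict mapping a debater name to a callable(topic, transcript) -> str.
    moderator: callable(topic, round_no) -> str.
    scorer: callable(topic, text) -> float.
    """
    transcript = []
    for round_no in range(1, rounds + 1):
        # The moderator opens each round, then every debater speaks once.
        transcript.append(("moderator", moderator(topic, round_no)))
        for name, speak in debaters.items():
            transcript.append((name, speak(topic, transcript)))

    # Score only the debaters' turns, not the moderator's.
    scores = {name: 0.0 for name in debaters}
    for speaker, text in transcript:
        if speaker in scores:
            scores[speaker] += scorer(topic, text)
    return transcript, scores


# Hypothetical stand-ins for model-backed agents.
def make_stub_debater(stance):
    def speak(topic, transcript):
        return f"{stance} on '{topic}' (turn {len(transcript)})"
    return speak


def stub_moderator(topic, round_no):
    return f"Round {round_no}: {topic}"


def stub_scorer(topic, text):
    # Placeholder metric: word count. A real scorer would call a model.
    return float(len(text.split()))
```

Passing the agents in as plain callables keeps the loop independent of any particular model provider, so a stub can be swapped for a real model call without changing run_debate.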
Implementation video link.

BCCNet for Breast Cancer Classification:
BCCNet for Breast Cancer Classification was my Capstone Project. It focused on developing a deep learning framework for accurately classifying various types of breast cancer from histopathological images. By combining Squeeze-and-Excitation networks with a multi-headed attention mechanism, BCCNet aims to distinguish subtle imaging differences between multiple breast cancer types more precisely. The project used the BreakHis dataset, comprising over 7900 images across different magnification factors, to train and validate the model. Through data augmentation and the integration of pre-trained CNNs such as VGG16, SEResNet152, and EfficientNet_v2_Small, the model achieved high accuracy and robustness. BCCNet performed best at 40X magnification, where the custom model achieved an accuracy of 98.70% and an F1-score of 98.80%. The project shows the potential of AI to improve diagnostic capabilities and highlights the value of attention mechanisms in medical image analysis. It gave me experience in handling large datasets, implementing complex model architectures, and fine-tuning hyperparameters to optimize performance, all of which contribute to the broader goal of integrating AI into clinical settings for better diagnostic outcomes.

Integrated Image and Location Analysis for Wound Classification:
I published a paper titled "Integrated Image and Location Analysis for Wound Classification" in Scientific Reports, a Nature Portfolio journal. This project aimed to enhance wound diagnosis by developing a multi-modal deep learning network that combines wound images with their corresponding body locations to improve classification accuracy for diabetic, pressure, surgical, and venous ulcers. Using models such as VGG16, ResNet152, and EfficientNet, along with Squeeze-and-Excitation modules and multi-headed attention mechanisms, the model achieved superior accuracy. This project provided valuable insights into integrating multi-modal data, advanced neural network architectures, and the potential of AI in clinical diagnostics.

3D Image Slicing and Segmentation:
The 3D Image Slicing and Segmentation project focuses on developing a powerful tool for analyzing and segmenting 3D medical images. This project utilizes advanced image processing techniques and machine learning algorithms to slice and segment 3D images, facilitating detailed analysis of complex structures within the images. The primary objective is to enable precise segmentation, which is crucial for various medical applications, such as identifying and isolating tumors, bones, and other anatomical features. By implementing state-of-the-art models and methods, the project demonstrates how 3D image data can be effectively processed and analyzed to assist in medical diagnostics and treatment planning. Through this project, I gained valuable experience in handling 3D image data, applying segmentation algorithms, and fine-tuning models for optimal performance, contributing to the development of advanced tools for medical imaging and analysis.

Cloud Based Attendance-Log API:
The Cloud Based Attendance Log API project focuses on developing a robust API for maintaining and managing employee attendance logs using cloud computing technologies. The project leverages the power of cloud platforms to ensure scalability, reliability, and efficiency in handling attendance data. The API is designed to record, store, and retrieve attendance records, offering seamless integration with various front-end applications and systems. By utilizing cloud services, the project ensures that the attendance log system can handle large volumes of data with high availability and minimal downtime. This project explores the use of cloud computing principles and technologies to create a scalable solution for attendance management, demonstrating practical applications of cloud-based API development in real-world scenarios. Through this project, I gained hands-on experience in designing, implementing, and deploying cloud-based APIs on both AWS and Azure, enhancing my skills in cloud computing and API development.

Pneumonia Detection:
My professor, Dr Sussan Mc Roy, advised me to take on Pneumonia Detection as my Course Project for Artificial Intelligence. The project uses convolutional neural networks (CNNs) and compares pre-trained models to build an effective tool for identifying pneumonia, with lung image data as the primary input. It demonstrates the potential to make medical diagnoses faster and more effective, enabling prompt treatment and potentially saving lives as AI technology becomes more accessible. The model's clinical relevance could be extended to other conditions such as cancer, tumors, and bone fractures. The work involved implementing well-known models such as VGG19 and ResNet-152/200, as well as building custom CNN models and tuning hyperparameters to achieve higher accuracy. It also surveys current and past AI and ML efforts aimed at detecting pneumonia, contributing insights into the development of medical diagnostic tools.

Image Caption Generator:
During my Master’s, I created an Image Caption Generator as a Course Project for Natural Language Processing. It is an application that generates descriptive captions for images. Using techniques from predictive modeling, deep learning, natural language processing, and image processing, it analyzes the objects within an image to produce meaningful captions. The core functionality includes inspecting objects in the image, generating labels based on the identified categories, and suggesting relevant captions as users upload images into the application.
Technologies utilized in this project encompass predictive modeling to forecast and generate possible captions, deep learning for training models to interpret image data, natural language processing for creating coherent and contextually accurate captions, image processing for analyzing visual information, multi-class text classification for labeling different objects, and web scraping for gathering and preprocessing training data.

Face Recognition:
I created a Face Recognition project that tracks employee attendance, i.e., the in-time and out-time of each employee. This opportunity was given to me by 3i Infotech Ltd. during my internship, where I worked remotely under the guidance of Ushasri Oddiraju. I designed and deployed it as an independent project, using face recognition and a database, with the laptop's primary camera as the input.
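The in-time and out-time bookkeeping can be sketched as below. This is a minimal illustration under stated assumptions, not the internship code: the table layout and the names open_log and record_sighting are hypothetical, SQLite is used only as a convenient stand-in for the database, and the face-recognition step is assumed to have already produced an employee identifier.

```python
import sqlite3
from datetime import datetime


def open_log(path=":memory:"):
    """Open (or create) the attendance log. One row per employee per day."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS attendance ("
        "employee TEXT, day TEXT, in_time TEXT, out_time TEXT, "
        "PRIMARY KEY (employee, day))"
    )
    return conn


def record_sighting(conn, employee, seen_at):
    """Record that a recognized employee was seen by the camera at seen_at."""
    day = seen_at.date().isoformat()
    stamp = seen_at.time().isoformat(timespec="seconds")
    # First sighting of the day creates the row; in_time is never overwritten.
    conn.execute(
        "INSERT OR IGNORE INTO attendance VALUES (?, ?, ?, ?)",
        (employee, day, stamp, stamp),
    )
    # Every later sighting pushes out_time forward.
    conn.execute(
        "UPDATE attendance SET out_time = ? "
        "WHERE employee = ? AND day = ? AND out_time < ?",
        (stamp, employee, day, stamp),
    )
    conn.commit()
```

Treating the first sighting of the day as the in-time and the latest sighting as the out-time avoids needing a separate "check out" action from the employee.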

Traffic Sign Recognition:
The college gave my classmates and me the chance to design and deploy a mini-project during the 6th semester. My teammates and I worked closely to create a front end and a model that helps drivers recognise traffic signs. The model was trained on GTSRB (the German Traffic Sign Recognition Benchmark). We used Keras, CNNs, image augmentation, and deep learning, along with libraries such as OpenCV and Matplotlib. The GUI provides both textual and audio output.
Implementation video link.

Zoom Automation:
During the pandemic, with everyone working from home, logging into one meeting after another could feel hectic and tedious, so I created a small program that opens Zoom links at the times specified in a list. The user only has to copy and paste the links at the beginning of the day, and the code handles the rest, opening each link at its scheduled time. It is simple to use and helpful to all the lazy college students who don’t like opening their groups and repeatedly checking for the link or the meeting ID.
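A minimal sketch of the idea is shown below. It is an illustration rather than the original program: the "HH:MM <url>" input format and the names parse_schedule and run are assumptions, and the standard-library webbrowser module is used here to open each link.

```python
import time
import webbrowser
from datetime import datetime


def parse_schedule(text):
    """Parse lines of 'HH:MM <url>' into a time-sorted list of (time, url)."""
    entries = []
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        clock, url = line.split(maxsplit=1)
        entries.append((datetime.strptime(clock, "%H:%M").time(), url.strip()))
    return sorted(entries)


def run(schedule, open_link=webbrowser.open, now=datetime.now, sleep=time.sleep):
    """Open each link once its start time is reached.

    Links whose time has already passed are opened immediately.
    The open_link, now, and sleep hooks exist so the loop can be tested.
    """
    for start, url in schedule:
        while now().time() < start:
            sleep(20)  # poll every 20 seconds
        open_link(url)


if __name__ == "__main__":
    # Placeholder links pasted at the start of the day.
    todays_links = """
    09:30 https://example.com/meeting-a
    14:00 https://example.com/meeting-b
    """
    run(parse_schedule(todays_links))
```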

Stock Price Predictor:
I created a stock predictor that predicts a stock's price from its previous closing price. I collected the previous month's data from the NSE, cleaned it, and used it to train the models. To find the best way to predict the next closing price of a particular stock, I compared three approaches: Linear Regression, Polynomial Regression, and a Radial Basis Function (RBF) SVM. This small project was inspired by an idea from my dad, who is a stock broker.
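The linear regression variant can be sketched in a few lines. This is a simplified illustration, not the original code: it fits an ordinary least-squares line from each day's close to the next day's close using plain Python, the function names are hypothetical, and the polynomial and RBF SVM variants are not shown.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_next_close(closes):
    """Predict the next close from a list of past closes (oldest first)."""
    # Training pairs: (previous close, following close).
    slope, intercept = fit_line(closes[:-1], closes[1:])
    return slope * closes[-1] + intercept
```

The input needs at least three closes with some variation between them; otherwise the slope is undefined and fit_line raises ZeroDivisionError.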

OYO Web Scraping:
While learning web scraping, I wanted to practise on a website with a lot of data, and the OYO website was a great fit. I collected data on the hotels in a particular area or city, including the name, address, price, ratings, and amenities of each hotel, and saved it to a .csv file.
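The extract-and-save step can be sketched as follows. This is an illustration only: the HTML structure assumed here (a div with class "hotel" containing one span per field) is made up for the example and is not the real markup of the OYO website, and the names HotelParser and hotels_to_csv are hypothetical. Fetching the page is left out.

```python
import csv
import io
from html.parser import HTMLParser

FIELDS = ["name", "address", "price", "rating", "amenities"]


class HotelParser(HTMLParser):
    """Collect one dict per <div class="hotel"> from hypothetical markup."""

    def __init__(self):
        super().__init__()
        self.hotels = []
        self._field = None  # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "div" and cls == "hotel":
            self.hotels.append(dict.fromkeys(FIELDS, ""))
        elif tag == "span" and cls in FIELDS and self.hotels:
            self._field = cls

    def handle_data(self, data):
        if self._field:
            self.hotels[-1][self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def hotels_to_csv(html, out):
    """Parse hotel records from html and write them as CSV to the file-like out."""
    parser = HotelParser()
    parser.feed(html)
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(parser.hotels)
    return len(parser.hotels)
```

To write a real file, open it with newline="" (for example, open("hotels.csv", "w", newline="")) and pass it as out, so the csv module controls the line endings.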

CLAB:
CLAB saves the SSH work and data as they were just before an interruption occurs, such as a disconnection or reaching the maximum limit of 24 hours. It stores the data in the drive so that you can continue your model training from where it stopped. This project was created during the ShellHacks Hackathon 2020, which my club mates and I took part in during the 5th semester.
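The save-and-resume idea can be sketched as below. This is a generic checkpointing illustration, not CLAB's actual implementation: the JSON checkpoint format and the names save_checkpoint, load_checkpoint, and train are hypothetical, the "training step" is a stand-in, and the checkpoint path is assumed to point at storage that survives the session.

```python
import json
import os


def save_checkpoint(path, step, state):
    """Write the checkpoint atomically so an interruption cannot corrupt it."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"step": step, "state": state}, f)
    os.replace(tmp, path)


def load_checkpoint(path):
    """Return (step, state), or a fresh start if no checkpoint exists."""
    if not os.path.exists(path):
        return 0, {}
    with open(path) as f:
        data = json.load(f)
    return data["step"], data["state"]


def train(path, total_steps, interrupt_at=None):
    """Run from the last saved step; interrupt_at simulates a disconnection."""
    step, state = load_checkpoint(path)
    total = state.get("total", 0)
    while step < total_steps:
        if step == interrupt_at:
            return step  # simulated disconnection or session limit
        total += step  # stand-in for one real training step
        step += 1
        save_checkpoint(path, step, {"total": total})
    return step
```

Calling train again with the same path after an interruption picks up at the saved step instead of starting over.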
Implementation video link.

Basic School Administration Tool:
It’s a tool that collects basic student details such as name, age, contact number, and email ID, and stores them in a .csv file. It lets you enter student details into a spreadsheet without opening Excel or even knowing how to use it.
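A minimal sketch of the storage step, assuming a fixed set of columns; the column names and the add_student function are illustrative, not taken from the original tool.

```python
import csv
import os

FIELDS = ["name", "age", "contact_number", "email"]


def add_student(path, name, age, contact_number, email):
    """Append one student to the CSV file, writing the header on first use."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(FIELDS)
        writer.writerow([name, age, contact_number, email])
```

For example, add_student("students.csv", "Asha", 12, "555-0100", "asha@example.com") adds one row, and the resulting file opens directly in Excel.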