I am currently a final-year B.Tech (ECE) student at Visvesvaraya National Institute of Technology (VNIT), Nagpur. I have been working on Deep Learning for quite a while now. My primary focus has been the mathematics behind Computer Vision and Deep Learning algorithms, along with their real-time implementations. I mainly work with images and sequential data and wish to contribute more towards the progress of these fields. With sheer persistence, I have been able to complete a diverse range of projects in a short time. Furthermore, I have mentored a few projects that have led to international research papers and patents. I plan to join academia for productive research! I have also had exposure to other fields such as Natural Language Processing, Electronics, Communication Systems, CAD, Graphic Design, Video Editing, etc.
[Aug. 2020] Started working at CAIR Lab, DRDO on drone image tracking, detection and pose estimation.
[Apr. 2020] Started a remote internship at the National University of Singapore (NUS).
[Feb. 2020] Bagged first position in two different competitions, 'Techno.Docx' and 'Electroblitz', at AXIS’20, VNIT | Central India’s Largest Techfest.
Aaron Bobick | Udacity
Helper robots are widely used in various situations, for example at airports and railway stations. This paper presents a pipeline to multiplex the tracking and detection of a person in dynamic environments using a stereo camera in real-time. Recent developments in object detection using ConvNets have led to robust person detection. These deep convolutional neural networks generally fail to run with high frame rates on devices with less computing power. Trackers are also used to retain the identity of the target person as well as impose fewer constraints on hardware. A concept of multiplexed detection and tracking is used which makes the pipeline faster by many folds. TurtleBot-2 is used for prototyping the robot and tuning of the motion controller. Robot Operating System (ROS) is used to set up communication between various nodes of the pipeline. The results found were comparable to current state-of-the-art person followers and can be readily used in day to day life.
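The multiplexing idea can be illustrated with a minimal Python sketch. It assumes a simple schedule that the abstract does not specify: the expensive detector runs on every `detect_every`-th frame and whenever the target is lost, and a cheap tracker handles the frames in between. The names `follow_person`, `detect`, `track`, and `detect_every` are hypothetical placeholders, not the paper's actual code.

```python
def follow_person(frames, detect, track, detect_every=10):
    """Alternate between a slow detector and a fast tracker.

    detect(frame) -> box or None        (hypothetical, expensive)
    track(frame, box) -> box or None    (hypothetical, cheap)
    Returns a list of (source, box) pairs, one per frame.
    """
    box = None
    log = []
    for index, frame in enumerate(frames):
        # Re-detect periodically, and immediately if the target was lost.
        need_detection = box is None or index % detect_every == 0
        if need_detection:
            box = detect(frame)
            source = "detector"
        else:
            box = track(frame, box)
            source = "tracker"
        log.append((source, box))
    return log


if __name__ == "__main__":
    # Stand-in callables: the "detector" returns a fixed-size box and the
    # "tracker" simply carries the previous box forward.
    fake_detect = lambda frame: (frame, 0, 10, 10)
    fake_track = lambda frame, box: box
    result = follow_person(range(25), fake_detect, fake_track)
    print(sum(1 for source, _ in result if source == "detector"))
```

With this schedule the detector cost is paid on roughly one frame in `detect_every`, which is one plausible way a pipeline of this kind gains speed on low-power hardware.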
Our current (implemented) solution provides robust registration plate detection, extracts other features such as car model, speed, face (if visible), and date and time of entry/exit, and uploads the extracted data to a centralized IoT-integrated database. Beneficiaries include malls, colleges, parking lots, and other sites with multiple gates. Whenever the gate camera detects a departing car, the corresponding owner is notified. Further, the owner can use the Alert feature to warn the guard. The web application has two levels of access: the first provides general information about a specific car to its owner, and the second is for the Authority and stores all the data of a campus. This can be used to monitor traffic on the campus and for surveillance applications.
A control system is designed to stabilise the camera gimbal system used in different airborne systems for applications such as target tracking, surveillance, aerial photography, autonomous navigation and so on. The technique is applied in everything from self-stabilising cameras to helicopters and noise-reducing equipment. This camera gimbal system replaces many traditional tracking systems, such as radar, which are too heavy and large to mount on air vehicles. Stabilising the camera gimbal is therefore very important: it eliminates shakes and vibrations in photography and improves accuracy.
NOTE: This project was selected for SIH-20 from the internal hackathon conducted by our college. Further details will be shared after the results of SIH-20.
There are four designated locations in the grid world indicated by R(ed), G(reen), Y(ellow), and B(lue). When the episode starts, the taxi starts off at a random square and the passenger is at a random location. The taxi drives to the passenger's location, picks up the passenger, drives to the passenger's destination (another one of the four specified locations), and then drops off the passenger. Once the passenger is dropped off, the episode ends. Observations: There are 500 discrete states, since there are 25 taxi positions, 5 possible locations of the passenger (including the case when the passenger is in the taxi), and 4 destination locations (25 × 5 × 4 = 500). This problem was solved using the Q-Learning approach. The trained model is consistently among the top 5 on the OpenAI Gym Leaderboard.
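To make the state count and the learning rule concrete, here is a minimal tabular Q-learning sketch in plain Python. It is an illustration under assumptions, not the code used for the leaderboard entry: the six-action count, the function names, and the values of `alpha`, `gamma`, and `epsilon` are all placeholders chosen for the example.

```python
import random

N_STATES = 25 * 5 * 4   # taxi positions x passenger locations x destinations
N_ACTIONS = 6           # assumed: four moves, pickup, dropoff


def encode_state(row, col, passenger, destination):
    """Pack (row, col, passenger, destination) into one index in [0, 500)."""
    return ((row * 5 + col) * 5 + passenger) * 4 + destination


def choose_action(q_table, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, else act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(N_ACTIONS)
    values = q_table[state]
    return max(range(N_ACTIONS), key=lambda action: values[action])


def q_update(q_table, state, action, reward, next_state, done,
             alpha=0.1, gamma=0.99):
    """One Q-learning step: move Q(s, a) towards r + gamma * max Q(s', .)."""
    best_next = 0.0 if done else max(q_table[next_state])
    target = reward + gamma * best_next
    q_table[state][action] += alpha * (target - q_table[state][action])


if __name__ == "__main__":
    q_table = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    rng = random.Random(0)
    state = encode_state(2, 3, 1, 0)
    action = choose_action(q_table, state, epsilon=0.1, rng=rng)
    # A made-up transition, only to show the update call.
    q_update(q_table, state, action, reward=-1,
             next_state=encode_state(3, 3, 1, 0), done=False)
    print(q_table[state][action])
```

In a full training loop, `choose_action` and `q_update` would be called once per environment step, with `epsilon` typically decayed over episodes.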
Word embedding is the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) where words or phrases from the vocabulary are mapped to vectors of real numbers. Conceptually, it involves a mathematical embedding from a space with many dimensions per word to a continuous vector space with a much lower dimension. Word2vec is a group of related models that are used to produce word embeddings. These models are shallow, two-layer neural networks that are trained to reconstruct linguistic contexts of words. Word2vec takes as its input a large corpus of text and produces a vector space, typically of several hundred dimensions, with each unique word in the corpus being assigned a corresponding vector in the space. Word vectors are positioned in the vector space such that words that share common contexts in the corpus are located close to one another in the space.
From the above image, we can see that the distance between the 'Obama' and 'President' word vectors in the projected 2D space is the smallest. This is because those two words are highly correlated. The same reasoning applies to word pairs like 'media' and 'press', 'speaks' and 'greets', etc. The accuracy will increase with a larger vocabulary and longer training time. For my results, visit the GitHub repository.
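As a rough illustration of what "reconstructing linguistic contexts" means in practice, the sketch below builds skip-gram style (center, context) training pairs from a token list and compares two vectors by cosine similarity. It is a simplified stand-in, not the repository's code: the function names, the window size, and the example sentence are assumptions made for this example.

```python
import math


def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every word within the window."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    sentence = ["obama", "speaks", "to", "the", "press"]
    for pair in skipgram_pairs(sentence, window=1):
        print(pair)
```

Words that keep appearing in the same pairs are pushed towards similar vectors during training, which is why related words end up close together in the projected space.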
I mentored this summer project at IvLabs, VNIT.
The MNIST handwritten digit dataset is a standard benchmark used in computer vision and deep learning. Although the dataset is effectively solved, it can be used as the basis for learning and practicing how to develop, evaluate, and use convolutional deep learning neural networks for image classification from scratch. This includes how to develop a robust test harness for estimating the performance of the model, how to explore improvements to the model, and how to save the model and later load it to make predictions on new data. This was coded from scratch using NumPy. For more results and details about the algorithm, visit the GitHub page. For the full demo, click the Video button.
Simple harmonic motion can serve as a mathematical model for a variety of motions, such as the oscillation of a spring. With the aim of learning computer vision and MATLAB, I worked on analyzing the motion of a target object undergoing damped harmonic motion. The target object was separated from the background using color thresholding and approximated as a point object. The coordinates of this point were recorded and used to estimate the parameters associated with the mathematical model of the system, such as maximum displacement, mean position, and velocity at different time instants. A mathematical model was estimated by fitting a curve to the recorded data using the MATLAB Curve Fitting Toolbox.
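The parameter estimation step can be sketched in plain Python. Note the substitution: the project used the MATLAB Curve Fitting Toolbox, whereas this example uses the simpler logarithmic-decrement idea on successive peaks, and it assumes the model x(t) = mean + A·exp(−decay·t)·cos(ω·t) with clean, noise-free samples. All function and parameter names are hypothetical.

```python
import math


def damped_oscillation(t, amplitude, decay, omega, mean):
    """Assumed model: mean + A * exp(-decay * t) * cos(omega * t)."""
    return mean + amplitude * math.exp(-decay * t) * math.cos(omega * t)


def find_peaks(times, values):
    """Return (time, value) for every strict local maximum."""
    peaks = []
    for i in range(1, len(values) - 1):
        if values[i - 1] < values[i] > values[i + 1]:
            peaks.append((times[i], values[i]))
    return peaks


def estimate_parameters(times, values):
    """Estimate period, decay rate, mean position and first-peak excursion."""
    peaks = find_peaks(times, values)
    if len(peaks) < 3:
        raise ValueError("need at least three peaks")
    (t0, p0), (t1, p1), (t2, p2) = peaks[:3]
    period = (t2 - t0) / 2.0
    # Peak heights above the mean shrink by a constant factor per period,
    # so the ratio of successive differences cancels the unknown mean.
    ratio = (p1 - p2) / (p0 - p1)
    decay = -math.log(ratio) / period
    excursion = (p0 - p1) / (1.0 - ratio)   # height of first peak above mean
    return {
        "period": period,
        "omega": 2.0 * math.pi / period,
        "decay": decay,
        "mean": p0 - excursion,
        "max_displacement": excursion,
    }


if __name__ == "__main__":
    # Synthetic track standing in for the recorded point coordinates.
    times = [i * 0.005 for i in range(1200)]
    values = [damped_oscillation(t, 5.0, 0.4, 2.0 * math.pi / 0.8, 2.0)
              for t in times]
    print(estimate_parameters(times, values))
```

Real tracked coordinates are noisy, so a least-squares curve fit like the one used in the project is the more robust choice; the peak-based estimate is mainly useful as a sanity check or as an initial guess for the fit.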
The hand gesture controlled bot receives its commands from the pitch and roll of the user's hand. This is helpful for people in wheelchairs who have limited finger or hand movement, since the same approach can be used for wheelchair control. These bots are also useful in many applications such as remote surveillance and military operations. Hand gesture controlled industrial-grade robotic arms can be developed as well.
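One simple way to turn pitch and roll into drive commands is a threshold mapping with a dead zone, sketched below in Python. This is an assumed scheme for illustration, not the bot's actual firmware: the 20-degree threshold, the sign conventions, and the command names are all hypothetical.

```python
def gesture_to_command(pitch_deg, roll_deg, threshold_deg=20.0):
    """Map hand tilt (in degrees) to a drive command.

    Assumed conventions: positive pitch tilts the hand forward and
    positive roll tilts it to the right.
    """
    # Small tilts inside the dead zone are treated as "hold still".
    if abs(pitch_deg) < threshold_deg and abs(roll_deg) < threshold_deg:
        return "stop"
    # Otherwise the axis with the larger tilt decides the command.
    if abs(pitch_deg) >= abs(roll_deg):
        return "forward" if pitch_deg > 0 else "backward"
    return "right" if roll_deg > 0 else "left"


if __name__ == "__main__":
    for pitch, roll in [(0, 0), (30, 5), (-30, 5), (5, 40), (5, -40)]:
        print(pitch, roll, gesture_to_command(pitch, roll))
```

The dead zone keeps the bot from reacting to small, unintended hand movements, which matters for users with limited motor control.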
The code is written from scratch using PyTorch for data loading, matrix calculations and GPU acceleration. This was my first introduction to DL, where I wrote the code myself while learning the mathematics and techniques required to optimize a network (PS. This also included learning ways to tune hyperparameters). The Deep Learning models implemented are listed below: