Mission
Quizify is an AI-driven platform designed to enhance educational experiences through personalized quizzes. Developed at Radical AI, New York, NY, the project leverages modern AI technologies to generate and deliver customized educational content tailored to the unique learning needs of each user.
Technologies Used
Python: Primary programming language for backend development.
Streamlit: Used to build a user-friendly web interface for seamless user interaction.
Google Cloud Vertex AI: Powers advanced question generation.
Chroma DB: Used for robust data handling and reliable quiz delivery.
Features
Personalized Quiz Generation: Utilizes Vertex AI to create quizzes tailored to each user's learning needs.
User Interface: Built with Streamlit, the interface provides a smooth user interaction model.
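A minimal sketch of how the quiz-generation flow could be wired together is shown below. The prompt text, the generate_quiz helper, and the model name are illustrative assumptions, not the project's actual implementation; it assumes the streamlit and google-cloud-aiplatform packages are installed and a GCP project is configured.

```python
# Illustrative sketch (not the project's actual code): a Streamlit page that
# asks Vertex AI to draft quiz questions on a user-chosen topic.
import streamlit as st
import vertexai
from vertexai.generative_models import GenerativeModel

# Assumed placeholders: replace with a real GCP project ID and region.
vertexai.init(project="your-gcp-project", location="us-central1")
model = GenerativeModel("gemini-1.0-pro")  # model name is an assumption

def generate_quiz(topic: str, num_questions: int) -> str:
    """Hypothetical helper: asks the model for multiple-choice questions."""
    prompt = (
        f"Write {num_questions} multiple-choice quiz questions about {topic}. "
        "Include four options and mark the correct answer for each."
    )
    return model.generate_content(prompt).text

st.title("Quizify")
topic = st.text_input("Topic", "Photosynthesis")
n = st.slider("Number of questions", 1, 10, 5)
if st.button("Generate quiz"):
    st.write(generate_quiz(topic, n))
```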
This research introduces a transformative approach to
empower visually impaired individuals in navigating their
surroundings independently. We present a real-time scene
description system employing the innovative ExpansionNet
v2 model and a user-friendly app. This groundbreaking
technology achieves an impressive 85% accuracy in providing dynamic audio descriptions of scenes, significantly enhancing the mobility of visually impaired individuals both
indoors and outdoors.
3. Proposed Model
ExpansionNet v2 utilizes a modified version of the Vision Transformer (ViT) architecture as its backbone. ViT’s
inherent ability to model long-range dependencies and relationships within data makes it well-suited for handling
the expanded blocks and heterogeneous sequences generated by BSE. The training strategy employs a two-stage approach:
• Stage 1: Cross-entropy pre-training: The model is trained
on both the original image and its expanded blocks, along
with captions generated from them. This stage utilizes
cross-entropy loss to minimize the difference between the
model’s generated captions and the actual captions, fostering basic image understanding and caption generation
skills.
• Stage 2: Reinforcement learning fine-tuning: In the final
stage, BLEU score guides the model’s learning through
reinforcement learning. This incentivizes the model to
generate captions that are not only factually accurate but
also fluent, grammatically correct, and engaging. This
stage polishes the model’s skills, ensuring its captions effectively capture the essence of the image.
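The two training stages can be summarized in a hedged sketch. The model, tokenizer, and data interfaces below are placeholders rather than ExpansionNet v2's actual API; only the loss structure (cross-entropy pre-training followed by a BLEU-rewarded policy-gradient update, in the spirit of self-critical sequence training) reflects the description above.

```python
# Illustrative sketch of the two-stage training described above.
# `model` is assumed to expose `logits(images, captions)` for teacher forcing and
# `sample(images)` returning sampled token ids with their log-probabilities;
# these interfaces are assumptions, not the actual ExpansionNet v2 code.
import torch
import torch.nn.functional as F
from nltk.translate.bleu_score import sentence_bleu

def stage1_cross_entropy_step(model, images, caption_ids, optimizer):
    """Stage 1: teacher-forced cross-entropy against the reference captions."""
    logits = model.logits(images, caption_ids[:, :-1])       # predict next token
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        caption_ids[:, 1:].reshape(-1),
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def stage2_bleu_rl_step(model, images, references, tokenizer, optimizer):
    """Stage 2: policy-gradient fine-tuning with BLEU as the reward."""
    sampled_ids, log_probs = model.sample(images)             # stochastic decoding
    rewards = []
    for ids, refs in zip(sampled_ids, references):
        hypothesis = tokenizer.decode(ids).split()
        rewards.append(sentence_bleu([r.split() for r in refs], hypothesis))
    rewards = torch.tensor(rewards, device=log_probs.device)
    baseline = rewards.mean()                                 # simple variance-reduction baseline
    loss = -((rewards - baseline) * log_probs.sum(dim=1)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return rewards.mean().item()
```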
This project focuses on developing a deep learning-based approach for human pose estimation using the popular
COCO dataset. Leveraging the advanced capabilities of
convolutional neural networks (CNNs), the project implements
a pose estimation model that accurately identifies and
tracks human body keypoints in images. The core of the
project is built around a tailored architecture based on the
PoseEstimationWithMobileNet model, which is known for its
efficiency and accuracy in processing images for keypoint
detection.
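A simplified sketch of the idea behind a MobileNet-based keypoint detector is shown below. It is not the PoseEstimationWithMobileNet implementation itself; the head layout and the use of 17 COCO keypoint heatmaps are illustrative assumptions.

```python
# Illustrative sketch: MobileNetV2 features feeding a small convolutional head
# that predicts one heatmap per COCO keypoint (17 keypoints). This mirrors the
# general structure of lightweight pose estimators, not the project's exact model.
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2

class KeypointHeatmapNet(nn.Module):
    def __init__(self, num_keypoints: int = 17):
        super().__init__()
        self.backbone = mobilenet_v2(weights=None).features   # lightweight feature extractor
        self.head = nn.Sequential(
            nn.Conv2d(1280, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, num_keypoints, kernel_size=1),      # one heatmap per keypoint
        )

    def forward(self, x):
        return self.head(self.backbone(x))

def heatmaps_to_keypoints(heatmaps):
    """Take the argmax of each heatmap as the (x, y) keypoint location."""
    b, k, h, w = heatmaps.shape
    flat = heatmaps.view(b, k, -1).argmax(dim=-1)
    return torch.stack((flat % w, flat // w), dim=-1)          # (batch, keypoints, 2)

model = KeypointHeatmapNet()
dummy = torch.randn(1, 3, 256, 256)
print(heatmaps_to_keypoints(model(dummy)).shape)               # torch.Size([1, 17, 2])
```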
The project’s outcome demonstrates the model’s proficiency in
detecting human poses with high accuracy. The results are quantified using standard metrics and visually represented to show
the model’s effectiveness. This report encapsulates the journey
from conceptualization to implementation, offering insights into
the challenges faced and the innovative solutions adopted. The
project sets a foundation for future enhancements in real-time
applications and more complex pose estimation challenges.
This project involves the development of an autonomous robot system capable of independently locating its charging station, planning a collision-free path, and accurately aligning itself with the docking station for charging. The entire system is built using ROS2 (Robot Operating System Version 2), which provides a robust framework for robot software development, enhancing real-time performance and supporting more complex and distributed systems.
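A minimal rclpy sketch of the final docking-alignment step is shown below; the topic names, gains, and the assumption that the detected dock pose arrives on a /dock_pose topic are illustrative, not the project's actual interfaces.

```python
# Hedged sketch of a ROS2 node that steers the robot toward a detected dock pose.
# Topic names, message choices, and gains are assumptions for illustration.
import math
import rclpy
from rclpy.node import Node
from geometry_msgs.msg import Twist, PoseStamped

class DockingController(Node):
    def __init__(self):
        super().__init__('docking_controller')
        self.cmd_pub = self.create_publisher(Twist, '/cmd_vel', 10)
        self.create_subscription(PoseStamped, '/dock_pose', self.on_dock_pose, 10)

    def on_dock_pose(self, msg: PoseStamped):
        # Dock pose assumed to be expressed in the robot's base frame.
        dx, dy = msg.pose.position.x, msg.pose.position.y
        distance = math.hypot(dx, dy)
        heading_error = math.atan2(dy, dx)
        cmd = Twist()
        if distance > 0.05:                           # stop within 5 cm of the dock
            cmd.linear.x = min(0.2, 0.5 * distance)   # proportional approach speed
            cmd.angular.z = 1.0 * heading_error       # proportional heading correction
        self.cmd_pub.publish(cmd)

def main():
    rclpy.init()
    rclpy.spin(DockingController())
    rclpy.shutdown()

if __name__ == '__main__':
    main()
```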
Pipeline
The goal of this project is to drive the KUKA youBot to pick up a block at the start location, carry it to the desired location, and put it down in the simulation software V-REP. The project covers the following topics:
1. Plan a trajectory for the end-effector of the youBot mobile manipulator.
2. Generate the kinematics model of the youBot, consisting of the mobile base with 4 mecanum wheels and the robot arm with 5 joints.
3. Apply feedback control to drive the robot to implement the desired task (a control sketch follows below).
4. Conduct the simulations in V-REP.
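The feedback-control step (topic 3) can be sketched with the modern_robotics package as a feedforward-plus-PI law on the end-effector twist; the gain values and variable names below are illustrative assumptions, not the project's actual parameters.

```python
# Hedged sketch of a feedforward + PI task-space controller for the end-effector,
# as used in mobile-manipulation projects of this kind; gains are illustrative.
import numpy as np
import modern_robotics as mr

def feedback_control(X, Xd, Xd_next, Kp, Ki, dt, integral_error):
    """Compute the commanded end-effector twist V (in the end-effector frame).

    X        : current end-effector configuration (4x4 SE(3) matrix)
    Xd       : desired configuration at this timestep
    Xd_next  : desired configuration at the next timestep
    """
    # Feedforward reference twist that carries Xd to Xd_next in time dt.
    Vd = mr.se3ToVec(mr.MatrixLog6(np.dot(mr.TransInv(Xd), Xd_next))) / dt
    # Configuration error expressed as a twist.
    X_err = mr.se3ToVec(mr.MatrixLog6(np.dot(mr.TransInv(X), Xd)))
    integral_error = integral_error + X_err * dt
    # Feedforward + proportional + integral terms.
    V = (np.dot(mr.Adjoint(np.dot(mr.TransInv(X), Xd)), Vd)
         + np.dot(Kp, X_err) + np.dot(Ki, integral_error))
    return V, X_err, integral_error

# Example gains: diagonal PI gains on the 6-dimensional twist error (illustrative).
Kp = np.eye(6) * 2.0
Ki = np.eye(6) * 0.01
```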
Sampling-based path planning algorithms play an important role in autonomous robotics. However, a common
problem among the RRT-based algorithms is that the initial path generated is not optimal and the convergence is
too slow to be used in real-world applications. In this paper, we propose a novel image-based learning algorithm
(CBAGAN-RRT) using a Convolutional Block Attention
Generative Adversarial Network with a combination of
spatial and channel attention and a novel loss function
to design the heuristics, find a better optimal path, and
improve the convergence of the algorithm in terms of both time and speed. The probability distribution of the
paths generated from our GAN model is used to guide the
sampling process for the RRT algorithm. We train and
test our network on the dataset generated by (Zhang et al.,
2021) and demonstrate that our algorithm outperforms
the previous state-of-the-art algorithms on both image
generation quality metrics such as the IoU score, Dice
score, and FID score, and path planning metrics such as
time cost and the number of nodes.
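The way the GAN output guides sampling can be illustrated with a small sketch: the generated probability map is treated as a heatmap, and the RRT sampler draws from it with some probability, falling back to uniform sampling otherwise. The bias value and function name are assumptions for illustration.

```python
# Hedged sketch of heatmap-guided sampling for RRT: with probability `bias`,
# sample a node from the GAN-generated probability map; otherwise sample uniformly.
# The 0.9 bias and the function name are illustrative assumptions.
import numpy as np

def sample_node(heatmap: np.ndarray, bias: float = 0.9, rng=None):
    """heatmap: (H, W) non-negative scores produced by the GAN for one map."""
    rng = rng or np.random.default_rng()
    h, w = heatmap.shape
    if rng.random() < bias and heatmap.sum() > 0:
        probs = (heatmap / heatmap.sum()).ravel()
        idx = int(rng.choice(h * w, p=probs))      # biased draw from the heatmap
        return idx // w, idx % w
    return int(rng.integers(0, h)), int(rng.integers(0, w))   # uniform fallback
```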
Dataset and Augmentation
We used the dataset generated by (Zhang et al., 2021)
to validate our results. The dataset was generated by
randomly placing different obstacles on the map and
randomly sampling the start and goal nodes which are
denoted by red and blue dots on the map respectively.
The RRT algorithm was run to generate a feasible path,
which is shown in green and serves as the ground truth. The
dimensions of all the images are (3x256x256) where the
height and the width of the images are 256 and the number of channels is 3. We use 8000 images for training and
2000 images for testing from this dataset.
The data augmentation parameters, namely the height shift
of the map, the width shift of the map, the shift step, the
rotation probability, and the number of augmented maps
generated per map, are shown below.
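A minimal sketch of how such shift and rotation augmentations might be applied to a map image is given below; the parameter values shown are placeholders rather than the settings actually used in the experiments.

```python
# Hedged sketch of map augmentation via random shifts and right-angle rotations.
# Parameter values are placeholders, not the settings used in the experiments.
import numpy as np

def augment_map(image: np.ndarray,
                max_height_shift: int = 20,
                max_width_shift: int = 20,
                shift_step: int = 5,
                rotation_prob: float = 0.5,
                rng=None) -> np.ndarray:
    """image: (3, H, W) map with obstacles, start/goal dots, and ground-truth path."""
    rng = rng or np.random.default_rng()
    dy = int(rng.integers(-max_height_shift, max_height_shift + 1)) // shift_step * shift_step
    dx = int(rng.integers(-max_width_shift, max_width_shift + 1)) // shift_step * shift_step
    shifted = np.roll(image, shift=(dy, dx), axis=(1, 2))       # shift map content
    if rng.random() < rotation_prob:
        shifted = np.rot90(shifted, k=int(rng.integers(1, 4)), axes=(1, 2)).copy()
    return shifted

# Generate several augmented maps per original map (count is illustrative).
maps_per_image = 4
original = np.zeros((3, 256, 256), dtype=np.float32)
augmented = [augment_map(original) for _ in range(maps_per_image)]
```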
The goal of this project is to create an image panorama by stitching a set of images together.
Image Registration
I used SURF for feature point extraction and matching, then used random sample consensus (RANSAC) to estimate the transform matrix.
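A hedged OpenCV sketch of this registration step is shown below. SURF requires the opencv-contrib build, and the ratio-test threshold and RANSAC reprojection tolerance are illustrative choices rather than the values used in the project.

```python
# Hedged sketch of feature matching and homography estimation with OpenCV.
# SURF lives in opencv-contrib; thresholds and parameters are illustrative.
import cv2
import numpy as np

def estimate_homography(img1, img2):
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
    kp1, des1 = surf.detectAndCompute(img1, None)
    kp2, des2 = surf.detectAndCompute(img2, None)

    # Match descriptors and keep good matches via Lowe's ratio test.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(des1, des2, k=2)
    good = [m for m, n in matches if m.distance < 0.7 * n.distance]

    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

    # Robust transform-matrix estimation with RANSAC.
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H
```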
Image Warping
Use the derived transform matrix and project the warped image onto a planar surface.
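A small sketch of this warping step, assuming the homography H maps the source image into the reference image's plane; the output canvas size is an illustrative choice.

```python
# Hedged sketch: project the source image onto the reference plane using H.
import cv2

def warp_to_plane(src_img, H, canvas_size):
    # canvas_size = (width, height) of the output panorama plane (illustrative).
    return cv2.warpPerspective(src_img, H, canvas_size)
```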
Image Blending
I blend the images using the center-weighting algorithm: compute the distance from each pixel to the 4 boundaries of the image and take the smallest ratio between those distances and the image dimensions as the corresponding pixel value in the mask matrix. The mask we derived is shown in the following image:
For each image, I derive a mask and then warp the mask in the same way as I warp the image.
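A minimal sketch of the center-weighting mask described above: each pixel's value is its smallest normalized distance to an image boundary, so the weights fall off toward the edges. The exact normalization the project used may differ.

```python
# Hedged sketch of the center-weighting mask: each pixel gets the smallest of its
# distances to the four image borders, normalized by the image size.
import numpy as np

def center_weight_mask(height: int, width: int) -> np.ndarray:
    ys = np.arange(height)[:, None].astype(np.float64)
    xs = np.arange(width)[None, :].astype(np.float64)
    # Distance from each pixel to the top/bottom and left/right borders.
    dist_y = np.minimum(ys, height - 1 - ys) / height
    dist_x = np.minimum(xs, width - 1 - xs) / width
    mask = np.minimum(dist_y, dist_x)            # smallest ratio wins
    return mask / mask.max()                     # normalize to [0, 1]

# The mask is then warped with the same homography as its image before blending.
```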
Cropping
After doing image stitching and image blending, I get the panorama shown below.
Using Python to find the largest rectangle that does not include the black region in the panorama image, I get the final panorama shown below.
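The cropping step can be sketched as a largest-rectangle search over the non-black pixels of the panorama; the histogram-based approach below is one standard way to do it and is not necessarily the exact script the project used.

```python
# Hedged sketch: find the largest axis-aligned rectangle containing no black pixels,
# using the classic "largest rectangle in a histogram" technique row by row.
import numpy as np

def largest_inner_rectangle(panorama: np.ndarray):
    """panorama: (H, W, 3) image; black (all-zero) pixels are treated as invalid."""
    valid = (panorama.sum(axis=2) > 0).astype(int)
    h, w = valid.shape
    heights = np.zeros(w, dtype=int)
    best = (0, 0, 0, 0, 0)                       # (area, top, left, bottom, right)
    for row in range(h):
        heights = (heights + 1) * valid[row]     # consecutive valid pixels above
        stack = []                               # histogram largest-rectangle scan
        for col in range(w + 1):
            cur = heights[col] if col < w else 0
            while stack and heights[stack[-1]] >= cur:
                top_h = heights[stack.pop()]
                left = stack[-1] + 1 if stack else 0
                area = top_h * (col - left)
                if area > best[0]:
                    best = (area, row - top_h + 1, left, row, col - 1)
            stack.append(col)
    _, top, left, bottom, right = best
    return panorama[top:bottom + 1, left:right + 1]
```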