Computer Vision Dataset Toolkit Python | SAM2 | Flask

Computer Vision Dataset Toolkit

Role: Technology Intern

Tools used: PyQt5, Vimba Python API, AWS SDK (Boto3) with Amazon S3, Meta SAM2 model, Flask, React.js

PyQt5 GUI to capture training images with Allied Vision cameras

At Smart Design, I helped build a custom imaging rig for collecting training data for computer vision models. The rig used Allied Vision industrial cameras controlled through the Vimba Python API. I developed a PyQt5 graphical interface that let team members capture and manage training images. The tool supported live camera previews, adjustable settings, and batch image capture, which streamlined data collection for machine learning applications.
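The batch capture step can be illustrated with a minimal sketch. This is not the project's actual code: `capture_batch`, the `grab_frame` callable, and the file naming scheme are hypothetical, and the camera is replaced by a stand-in function so the example runs without Vimba or PyQt5 installed. In a real tool, `grab_frame` would presumably wrap the camera API call that returns one encoded image.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def capture_batch(grab_frame: Callable[[], bytes], out_dir: Path,
                  count: int, prefix: str = "img") -> List[Path]:
    """Grab `count` frames and write each to a sequentially numbered file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for index in range(count):
        # Hypothetical camera hook: assumed to return one encoded image as bytes.
        frame = grab_frame()
        path = out_dir / f"{prefix}_{index:04d}.png"
        path.write_bytes(frame)
        saved.append(path)
    return saved


if __name__ == "__main__":
    # Stand-in for a real camera: returns placeholder bytes, not a valid PNG.
    def fake_camera() -> bytes:
        return b"placeholder-frame"

    with tempfile.TemporaryDirectory() as tmp:
        paths = capture_batch(fake_camera, Path(tmp), count=3)
        print([p.name for p in paths])
```

Passing the frame source in as a callable keeps the file handling independent of the camera, so the same loop can be exercised with a fake camera during development and wired to real hardware later.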

Photo of the imaging rig

Screenshot from the Capture GUI

Web-based Segmentation Tool Using Meta SAM2

To complement the camera capture workflow, I developed a web application that uses Meta’s Segment Anything Model 2 (SAM2) to automatically segment objects in the captured training images. Users can upload images, visualize segmentation masks, and refine or export selected regions for annotation or dataset generation. This reduced manual effort in dataset preparation and made segmentation more consistent across training images.
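As an illustration of the export step, the sketch below turns a binary segmentation mask into a bounding box and a pixel count, two values commonly stored in annotation files. It is a sketch under stated assumptions rather than the tool's implementation: the mask is assumed to be a plain list of rows containing 0 or 1, the helper names `mask_to_bbox` and `mask_area` are hypothetical, and no SAM2 or Flask call is made here.

```python
from typing import List, Optional, Tuple

# Assumed mask format: a list of rows, each a list of 0/1 values (1 = object pixel).
Mask = List[List[int]]


def mask_to_bbox(mask: Mask) -> Optional[Tuple[int, int, int, int]]:
    """Return (x, y, width, height) of the tightest box around the mask, or None if empty."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [x for row in mask for x, value in enumerate(row) if value]
    x_min, x_max = min(cols), max(cols)
    y_min, y_max = rows[0], rows[-1]
    return (x_min, y_min, x_max - x_min + 1, y_max - y_min + 1)


def mask_area(mask: Mask) -> int:
    """Count the object pixels in the mask."""
    return sum(1 for row in mask for value in row if value)


if __name__ == "__main__":
    example = [
        [0, 0, 0, 0, 0],
        [0, 1, 1, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0],
    ]
    print(mask_to_bbox(example))  # (1, 1, 3, 2)
    print(mask_area(example))     # 5
```

The (x, y, width, height) ordering follows the convention used by COCO-style annotations; a different export format would need the tuple rearranged.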