This summer, I was fortunate to work as an intern at Dr. Do’s Lab of Medical Imaging and Computation at Massachusetts General Hospital. The lab used machine learning to build artificial intelligence that recognizes and categorizes medical images. It functioned in hybrid mode: two days in person and three days remote, with lab members staying connected on remote days through Microsoft Teams and progress meetings three times a week.
For my first project, I was assigned to work with Brady, another intern from Rivers. We were directed to create a program that would recognize different body parts from X-rays. To accomplish this, we built a convolutional neural network and trained it on correctly labeled images from the training dataset. The network takes an input image, assigns importance to various aspects and objects in the image, and uses them to differentiate one class of labeled image from the others (for example, ankle X-rays from elbow X-rays). We then fed our program images from the testing dataset, which it had not seen during training, to measure its accuracy. After much work and many tweaks, the accuracy of our program reached 99.3%.
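For readers curious about how an accuracy figure like this is measured, here is a minimal sketch in Python. It is not our actual code: the function names and the five example labels are made up for illustration, and it assumes the model's predictions for the test images have already been collected into a list.

```python
from collections import Counter


def accuracy(true_labels, predicted_labels):
    """Fraction of test images whose predicted class matches the true class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("need at least one test image")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def per_class_accuracy(true_labels, predicted_labels):
    """Accuracy computed separately for each true class."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical test-set labels and predictions (illustration only).
y_true = ["ankle", "ankle", "elbow", "elbow", "elbow"]
y_pred = ["ankle", "elbow", "elbow", "elbow", "elbow"]

overall = accuracy(y_true, y_pred)
by_class = per_class_accuracy(y_true, y_pred)
print(f"overall accuracy: {overall:.1%}")
print(by_class)
```

Looking at accuracy per class as well as overall is useful because a single number can hide one body part that the model confuses much more often than the others.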
After the completion of our first project, I was assigned to analyze a dataset of chest radiographs and create a program that would recognize whether a patient has Covid-19. Currently, Covid-19 can be diagnosed with a polymerase chain reaction (PCR) test, which detects genetic material from the virus. Unfortunately, it can take a few hours or even days before the test results are back. By contrast, chest radiographs can be obtained in minutes, but because Covid-19 looks very similar to other viral and bacterial pneumonias on chest radiographs, it is extremely difficult to diagnose this way. A computer program that can identify Covid-19 from chest radiographs could therefore give patients a quicker diagnosis and better care for their condition. The chest radiographs in my dataset were labeled in four different ways: Negative for Pneumonia, Typical Appearance of Covid-19, Indeterminate Appearance of Covid-19, and Atypical Appearance of Covid-19. I was assigned to read the DICOM files and display nine images for each label, as well as organize the DICOM files into four separate directories based on their labels.
Negative for Pneumonia
Typical for Covid-19
Indeterminate for Covid-19
Atypical for Covid-19
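The directory-sorting step can be sketched with nothing but the Python standard library. This is an illustration under assumptions, not the code I wrote in the lab: the function name `organize_by_label`, the filename-to-label dictionary, and the choice to copy rather than move files are all hypothetical, and the demo uses empty placeholder files instead of real DICOM images.

```python
import shutil
import tempfile
from pathlib import Path

# The four labels used in the dataset.
LABELS = [
    "Negative for Pneumonia",
    "Typical Appearance of Covid-19",
    "Indeterminate Appearance of Covid-19",
    "Atypical Appearance of Covid-19",
]


def organize_by_label(source_dir, dest_dir, label_by_filename):
    """Copy each file into a subdirectory of dest_dir named after its label.

    label_by_filename maps a filename to its label (assumed to be known
    already, e.g. from the dataset's annotation table).
    Returns the number of files copied per label.
    """
    source_dir, dest_dir = Path(source_dir), Path(dest_dir)
    counts = {}
    for filename, label in label_by_filename.items():
        target_dir = dest_dir / label.replace(" ", "_")
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source_dir / filename, target_dir / filename)
        counts[label] = counts.get(label, 0) + 1
    return counts


# Demo on empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "raw"
    dst = Path(tmp) / "sorted"
    src.mkdir()
    demo_labels = {"a.dcm": LABELS[0], "b.dcm": LABELS[1], "c.dcm": LABELS[1]}
    for name in demo_labels:
        (src / name).touch()
    demo_counts = organize_by_label(src, dst, demo_labels)
    demo_sorted = sorted(p.relative_to(dst).as_posix() for p in dst.rglob("*.dcm"))

print(demo_counts)
print(demo_sorted)
```

Copying instead of moving keeps the original dataset untouched, so the sorting can be rerun safely if a label turns out to be wrong.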
After I completed my work with the dataset of chest radiographs, Dr. Do taught me how to use MarkIt, a newly created annotation tool. MarkIt uses pre-trained artificial intelligence to create preliminary annotations, and it builds a high-quality dataset by exchanging datasets between researchers and using blockchain technology to evaluate the value of the data. Machine learning methods require well-annotated datasets: if images are incorrectly labeled or there aren’t enough images for each label, using the resulting algorithms in clinical practice can have serious consequences. MarkIt looks to solve this issue by allowing radiologists to label (annotate) images efficiently and by using blockchain technology to track user activity, measuring the accuracy and time cost of each user’s annotations to ensure a high-quality dataset.
I am extremely fortunate to have had this opportunity! I learned a lot about working in a lab environment, particularly how to work effectively with those around me. I learned so much about machine learning and coding from the lab members and gained important insight into the medical world, as well as the importance of computer science in the advancement of healthcare. Thank you so much to Dr. Do and Mr. Schlenker for providing me with this fantastic experience, and to all the lab members who made my day-to-day experience so great!