Joel Manasseh: Mass General Hospital
This summer, I had the opportunity to work at the Lab of Medical Imaging and Computation (LMIC) at Massachusetts General Hospital. The LMIC specializes in using machine learning to advance medical imaging. My first task was to learn as much as I could about the Python programming language and the machine learning library Keras. To familiarize myself with the machine learning process, I wrote a program that teaches the computer to recognize handwritten digits from 0 to 9 by looking for patterns in a large set of labeled examples. Thankfully, I didn’t have to write tens of thousands of digits myself, because they are widely available in a dataset called MNIST, which contains 70,000 images: 60,000 for training and 10,000 for testing. After the computer went through the training images 100 times, the model achieved an accuracy of 99% on the test set of 10,000 digits. The next step for this short project was to test the model on my own handwritten digits from 0 to 9. Unfortunately, after I wrote the digits on lined paper and ran them through the model, it predicted 8 for every one of them.
Apparently, it is very difficult to create your own dataset of numbers. Nevertheless, I learned a great deal from this little project.
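For readers curious what such a program can look like, here is a minimal sketch of the MNIST exercise described above. It is not the program I actually wrote: the function names `prepare_images` and `train_and_evaluate`, the layer sizes, and the batch size are all assumptions chosen for illustration. The `invert` option reflects one general fact about MNIST (its digits are light strokes on a dark background), which may or may not be why my digits on lined paper all came out as 8.

```python
import numpy as np


def prepare_images(images, invert=False):
    """Scale 8-bit grayscale images to [0, 1] and add a channel axis.

    MNIST digits are light strokes on a dark background, so a photo of
    dark ink on light paper can be flipped with invert=True. This is an
    illustrative assumption, not the method used in the original project.
    """
    x = np.asarray(images, dtype="float32") / 255.0
    if invert:
        x = 1.0 - x
    return x[..., np.newaxis]  # e.g. (n, 28, 28) becomes (n, 28, 28, 1)


def train_and_evaluate(epochs=100):
    """Train a small CNN on MNIST and return its accuracy on the test set."""
    # Imported here so prepare_images can be used without TensorFlow installed.
    from tensorflow import keras

    # Keras ships MNIST already split into training and test images.
    (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
    x_train, x_test = prepare_images(x_train), prepare_images(x_test)

    # Hypothetical architecture: one convolution layer, then a classifier
    # with ten outputs, one per digit.
    model = keras.Sequential([
        keras.layers.Input(shape=(28, 28, 1)),
        keras.layers.Conv2D(32, 3, activation="relu"),
        keras.layers.MaxPooling2D(),
        keras.layers.Flatten(),
        keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    # One epoch is one full pass over the training images.
    model.fit(x_train, y_train, epochs=epochs, batch_size=128)
    _, accuracy = model.evaluate(x_test, y_test)
    return accuracy
```

Calling `train_and_evaluate()` downloads MNIST on first use and trains for 100 passes by default; a smaller `epochs` value is enough for a quick try. A photo of my own handwriting would first need to be cropped and resized to 28 by 28 pixels before going through `prepare_images(photo, invert=True)`.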
After completing that project, I jumped into the main project of my time here. Along with two other interns, Apsi and Hannah Daniel, I was asked to participate in a competition hosted by the SIIM (Society for Imaging Informatics in Medicine) and the ACR (American College of Radiology) on a site called Kaggle. Kaggle is an online data science platform where machine learning engineers and data scientists build models and learn from each other through competitions, many of which offer cash prizes.
In this competition, we have to build a model that can detect a lung condition called pneumothorax from a chest X-ray. Pneumothorax is a collapsed lung, which occurs when air leaks into the space between the chest wall and the lung.
Because pneumothorax can be fatal, detecting it is vital, and it has become important to hand the detection job over to computers, since humans can easily make mistakes when recognizing it with the naked eye. To build our model, I forked (legally copied) a public kernel (batch of code) from another competitor on Kaggle. Once I had the code in my own Python workspace, I had to figure out how to run the model on the GPU located in another room. Otherwise, the model would run on the local computer underneath the desk and would take over a month straight to train. This task required a lot of work in the terminal, so I asked Dr. Kim, a research fellow, who was willing to assist me. Now that we have this working and have submitted our model to Kaggle, our next step is to improve the model substantially to raise our accuracy and hopefully move up the leaderboard. This goal requires a lot of time googling, learning, and asking the other research fellows in the office. Thankfully, I have found that the people here are very eager to help me and guide me through this learning experience.
Because most of this internship has been a self-guided experience, I find that asking questions is the only way I can get the boost I need to keep working and not get distracted. Right now, I am in the middle of my six-week stay here, and I’ve already learned more about real-world applications of coding than I ever have before. Now, I can see how vital machine learning can be in the medical field and its potential to save lives. It’s amazing to see that the steps necessary to teach a computer to recognize numbers are the same steps necessary to detect diseases in humans. This internship has also given me insight into normal office life in the city and the steps necessary to succeed in an industry. I am very grateful for the opportunity to work here, as it has answered many of my questions about what I want to do in the future. I will definitely use the skills I learned here in my academic future and beyond.