Advice to an Undergraduate Researcher
Short advice for young machine learning researchers (and my former self).
Hello everyone 👋
I hope you are enjoying the last remnant of summer (if you are in the northern hemisphere that is).
In this edition, I want to share the advice I would give to my younger undergraduate self, or to any undergraduate interested in research.
This thought was prompted by a message I received:
Hi Yacine!
My name's [BLANKED OUT], and I'm a 19 year old 2nd year Undergraduate at University of [BLANKED OUT] studying computer engineering. I recently joined one of my campus' research labs as an intern, it focuses on ML applications within healthcare and medicine.
I'm still unfamiliar with many of the concepts we utilize, but I luckily have some coding experience.
I was wondering if you have any advice for a beginner in this field?
How did you become interested and so familiar with the machine learning concepts you teach in your videos?
I'm eager to make meaningful additions to my colleagues work but I currently feel like I'm taking up more of their time having them explain certain things to me-- rather than helping them with their research.
By the way, he’s talking about these tutorials I post on machine-learning topics that interest me.
My answers seemed useful to this bright student so I thought about sharing them here more broadly!
The Answers
Hey man,
First of all, big props for reaching out. This is the right mindset to have, especially if you are doing something as difficult as contributing to research while doing a bachelor's degree.
I’ll split this answer in two, just so that you have the right context for it.
I’ll first tell you how I got involved in research as an undergraduate because I was in the same exact situation as you.
Then I’ll tell you what I would do if I could do it again.
A1: What did I do to contribute early on?
I got into a lab doing lots of physiological signal analysis and I was responsible for a portion of the coding.
The first thing I did was try to learn as much as possible about how everything worked.
I was mainly trying to build my knowledge of the lab from the bottom up and to help in whatever way I could with whatever project came up.
I didn’t always have the full context of the project I was helping out with. I was mostly glad to have a specific project I could help with.
The very first project I was the full owner of was a Java project which involved lots of signal acquisition and processing with the Android SDK.
These types of signals.👆
I went down a bit of a rabbit hole on some of the technical aspects I had to help with, which was very cool.
It allowed me to get a very practical grasp on how everything was analyzed, collected, or processed in some of the lab projects.
As I gained more knowledge about the technical side, I started proposing ways in which we could get interesting results by modifying some of the analyses.
I also made sure to optimize the code I was building so that the throughput of the lab was better, for instance by making heavy use of parallelization later on (we had free access to a supercluster, which was awesome).
Anyway, at the end of it, I was able to publish quite a few papers which were mostly “assist” papers where I was a co-author. I was the main contributor to the general codebase of the lab and I became the go-to researcher for all the tricky technical problems.
A2: What would I do to contribute in retrospect?
Okay, so that’s what I did. In retrospect, I would do it a bit differently.
I would absorb the context the lab is operating in wayyyyyy earlier.
Looking back there was a lot of wasted technical effort in the process I followed. I learned a lot, but I could have been more impactful faster.
To get the context of the lab faster I would do the following:
Take one of the papers the lab already published (like the best recent one) and then break it down into its technical components.
I would ask very targeted questions about why some things were done in a certain way and how I could replicate the results.
Once I understood that blueprint, I would have a very good idea of how the lab is structured to generate science-worthy results:
Who did they collaborate with?
Who made this code?
Why is it set up like this (sometimes it’s silly like “the intern only knew how to do it in R”)?
Why did they choose this analysis over another one?
What have they tried that failed and what changed?
All of that meta-information is absorbable in under a month and you will get up to speed with how the lab functions fast.
THEN with that context, I would help whoever is most collaborative in the lab with their experiment and focus on two areas:
the mundane repetitive tasks that are annoying to do: in our case, it was cleaning up the EEG data we collected.
the coding tasks to push analysis forward: given your background, the coding part is where you can bring some assistance. Since you already understand the setup you should be able to contribute quickly.
After that, I would figure out what I’m most interested in and double down on that area. This way, I would grow and the lab would benefit from what I learn.
btw have fun during the process too, that’s the most important part.
A3: How did I become interested in machine learning?
To be frank, I’m not directly interested in machine learning.
I’m interested in learning in general, as a meta-concept. I started my bachelor's in cell biology and I was fascinated by neurons.
So fascinated that I interned in a neurobiology lab during my summer studying memory in Aplysia (it’s a sea slug).
beautiful!
I was doing my best to cut out a protein with multiple ubiquitin-binding domains to check how this would affect it. This was all part of a greater study about mice, genes, and memory.
That’s my protein! 👆
I was fascinated, and I went on to learn about memory in humans in a psychiatric lab. I had a project where I was trying to teach the memory palace technique to patients with schizophrenia.
Then I took lots and lots of computer science classes which included signal processing and machine learning.
I also took online classes on the subject, but where I learned the most was by directly trying to apply the concepts in the research projects I was working on.
These projects directly benefited my research lab and I had great motivation to dig deeper into both the data that my lab was generating and how to use machine learning properly.
Don’t be afraid to self-learn machine learning. This topic is so vast and multidisciplinary that you will never find one resource that will show you everything you need to know.
It’s the kind of field that you learn in various ways, and where you try to apply ML to your own problems in creative fashion.
Let me know if you want to voice-chat later this week, might be easier to answer your questions!
END OF THE MESSAGE.
Hope that was useful everyone 🌹
Have a great rest of the week!
Best
Yacine Mahdid
PS: feel free to shoot me questions by replying to this email, I love them.