I graduated this year with a master’s in statistics. In this article, I will explain the process that ultimately led to my offer for a Senior Data Scientist - Team Lead position at a company in the SF Bay Area. The components of the process that led to my success, in no particular order, were: crafting my resume and LinkedIn, building skills and projects, staying motivated (during the pandemic), decoding the data science interview process, and determining my professional goals.
As with any statistical inference, a sample of one won’t yield robust conclusions. I was an unusual applicant to my grad program, and am an unorthodox candidate for DS roles, which is why it took me six months to find a job while my peers all had several offers immediately following graduation (and some months in advance!). I worked for 6 years between my undergrad and master’s in the nonprofit world and had many different job titles, as noted in Edit #2 below. Coming back to school was a huge pivot and career shift, so I am extremely fortunate to have found a firm that recognized the unique strengths I bring to the table. I was also extremely fortunate to interact with this firm at a time when my particular combination of strengths was part of their strategic plan.
Takeaway: My experience is not a modal experience, but the tools I used and the lessons I learned may be useful for others. I would have appreciated reading something like this two years ago, so I’m putting it here in case others relate. Also, a friendly reminder to aspiring or current data scientists not to conflate prior and posterior probabilities.
I completely botched my first DS resume. I borrowed a classmate’s resume, used it as a template, and tried to copy what they had done. But they had internships, relevant projects, and a better GPA than me, so my version looked… weird, since I didn’t have any of those things. I was also still expecting people to “read between the lines” on my resume instead of being as clear as possible. I started applying and connecting with folks, and what shocked me is that almost nobody I asked about my resume gave substantial or useful feedback. The one useful piece of feedback came from my parents, who remarked “this doesn’t seem to really sell you; you’re much better in person than on this paper.” I was initially resistant to overhauling my resume, but I decided to spend a full week, almost full time, doing exactly that. It paid off: I saw a significant uptick in responses and was able to get several first round interviews. The main changes I made:
Takeaway: your image matters a lot. Make sure to craft it carefully, and tailor it for roles that you are really interested in.
My strategy for learning something is to spend at least a week or two finding the best resource, then pay whatever it costs (within your budget) and use it 100%. Don’t find 16 free cheat sheets and “shortcuts”. I researched every resource I could find (many thanks to r/datascience, r/machinelearning, and r/cscareerquestions) and tried out a few, but many only give free temporary access to a subsection of the platform, so you can’t really explore past the first few questions or modules. However, I saw a Reddit post about a site called DataCamp that offered a 7-day free trial with full access. I looked through the catalogue and found a lot of what I wanted to learn, so I took a week and devoted 8 hours per day to going through the modules. There are some things I would change, but for the most part it is very well designed and extremely helpful. I earned somewhere around 20K “experience” on the platform, which means I finished ~100-200 exercises covering data engineering, modeling, and OOP in Python. At the end of the free trial, they emailed me a 62%-off coupon for a year’s subscription, which brought it down to an insanely reasonable number, somewhere between 100 and 150 bucks. Easy decision, since I had already mapped my curriculum through the rest of their materials, and they have new courses coming out every 1-2 weeks.
For textbooks, anything from O’Reilly with an animal on the front is probably going to be a good resource. I burned through about a half dozen of those books, taking notes and building the example projects, then moving to DataCamp to do similar projects. Once I felt confident, I would find a dataset from Kaggle or the UCI ML repo, try to carry out the same steps on my own, and then benchmark my findings against a Medium article where someone did the same thing. Try to keep projects at the center of your learning, then find materials that will add to the project. This is much more transferrable to a job, and learning to think in this way will help you in interviews.
I saw an Instagram account I follow put out a survey that was getting a lot of responses, but the way they were reporting the data didn’t do full justice to the story they were trying to tell. So I reached out and asked if I could take a look, and they were super excited to have someone with experience weigh in. I ended up getting a few different spreadsheets, some with both categorical and quantitative data and some purely categorical, plus one field meant for long free-text responses (some users entered over 1000 words). Do you see where this is going? It’s basically a playground where my boss has 0 expectations and all I have to do is improve on autogenerated Excel charts. I began cleaning the data in a notebook, then built a set of scripts, then loaded a database, then made a dashboard for the team (using a Python Flask app), and scheduled cron jobs to extract the data and report results to the CEO/founder of this nonprofit. Every new DataCamp module I completed was one more piece of the puzzle of how to present and improve the data visuals, the process, and my code. I got invited to meetings with the other leaders, was asked about business decisions, and got to be part of the real life cycle of their mission.
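For a concrete picture, here is a minimal sketch of what the clean, load, and summarize steps of a pipeline like this could look like. It is illustrative only and not the code from the project above: the column names, sample rows, and function names are all invented here, and it sticks to Python’s built-in `csv` and `sqlite3` modules. The Flask dashboard and the cron schedule are left out.

```python
import csv
import io
import sqlite3

# Hypothetical survey export; these columns and rows are made up for illustration.
RAW_CSV = """respondent_id,age_group,satisfaction,comments
1,18-24,4,Loved the events
2, 25-34 ,5,
3,18-24,,Would like more online options
4,25-34,2,Too long
"""


def clean_rows(text):
    """Strip stray whitespace and turn blank ratings into None."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rating = row["satisfaction"].strip()
        rows.append({
            "respondent_id": int(row["respondent_id"]),
            "age_group": row["age_group"].strip(),
            "satisfaction": int(rating) if rating else None,
            "comments": row["comments"].strip(),
        })
    return rows


def load(conn, rows):
    """Create the table if needed and upsert the cleaned rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS responses ("
        "respondent_id INTEGER PRIMARY KEY, age_group TEXT, "
        "satisfaction INTEGER, comments TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO responses VALUES "
        "(:respondent_id, :age_group, :satisfaction, :comments)",
        rows,
    )
    conn.commit()


def summarize(conn):
    """Average rating and response count per age group (AVG skips NULLs)."""
    cur = conn.execute(
        "SELECT age_group, AVG(satisfaction), COUNT(*) "
        "FROM responses GROUP BY age_group ORDER BY age_group"
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # a file path would persist between runs
    load(conn, clean_rows(RAW_CSV))
    for age_group, avg_rating, n in summarize(conn):
        print(age_group, avg_rating, n)
```

Keeping each step in its own function is what makes the later stages possible: a scheduled job can call `load` on a fresh export, and a dashboard route can call `summarize` without knowing anything about the cleaning.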
Now that I had a taste of what that looked like, I reached out to my gym. They keep all of their members’ data on lifting progress and workout goals in an app, and I was able to give them a fun graphic and report for their members, which they shared on social media before seeing an uptick in new memberships! I considered packaging this “product” and emailing other gyms, but I got overwhelmed by the pandemic/election and decided to put extra stuff on the back burner until I have more skills.
Takeaway: make your learning project-driven, and document your entire project, including packaging it in several different formats, making a clear write-up, and preparing versions of a verbal explanation that take 1 minute, 5 minutes, and 20 minutes. Then, explain it to a PhD, a CEO, a peer, and a non-technical client (or whatever audiences you want, provided they vary by technical understanding and business investment). Try to carry every project across the finish line. As an example, this post/article is my way of compiling a high-level overview of the job search process: the “finish line” of this six-month project.
Have you ever been invited to church by your friend, but they didn’t explain anything before you got there? You don’t know when to raise your hands, or to stand up or sit down, or why the man up front is yelling? That’s how I felt for the last 6 months. From when you’re supposed to negotiate salary, to wtf a “first year cliff” is, to what you’re allowed to ask and to whom, nobody teaches you this stuff. Why does everything have to be so goddamned awkward and needlessly confusing? I have teaching experience, so as a very eager learner all of this infuriated me.
There are two kinds of people you will encounter:
Nobody will tell you the truth to your face, or give you meaningful feedback of any kind, and I asked for it constantly. They will send you a form email, ghost you, or dodge your questions and judge you for breaking etiquette you have no idea about.
I decided to submit some applications on LinkedIn every other day as a benchmark, and took advantage of the “Easy Apply” feature to get more applications out. There is a tradeoff between quality and quantity in the applications you send out. Beyond sending more applications, I needed more information, so I decided to use my network to do some decoding.
I went on Facebook, IG, and my LinkedIn and filtered by software, data, CS, analyst etc until I had a list of people to ask questions to. I contacted each of them and asked for a brief phone call to get their advice and to hear about their experience in role R at company C. Here are examples of the questions I asked:
The final question I always ask is:
I got some first round interviews or conversations with recruiters through this method, but none of the connections panned out. I only got one technical interview, a coding challenge on which I answered 5 of 6 questions correctly, and I was not invited to the next round.
Now that I had exhausted my first round connections, it was time to go to strangers. I went to company pages on LinkedIn and clicked “people” and filtered by Data Scientist / Analyst / Data Engineer, then reached out with the following message:
Subject Line: [Fellow University Alum]* wondering about [Company]
Hey [name],
My name is [name] and I just finished up at [school] with a [degree] in [major]! I have a background in [sub-field] and love what I have seen in the job descriptions at [company], and I was wondering if you wouldn’t mind connecting and answering some questions I have about the data scientist role and how your experience has been. Thanks so much for your time!
Best, [name]
* replace “Fellow University Alum” with whatever way you can connect with the person based on their profile. Otherwise just say “Aspiring Data Scientist” or something humble and eager.
I got several interviews and referrals from strangers this way.
Takeaway: use your network and reach out to make as many connections as possible in order to learn more about what you want or don’t want. They may also be happy to refer you to a position.
I interviewed for the following positions: Intern, Research Associate, Data Engineer, Machine Learning Engineer, Data Analyst, Product Analyst, Analyst, Consultant, Product Manager, and others.
I talked to a lot of people and wanted to understand what motivates them, what they are experiencing in their role, and what they hope for in the future. What skills do they have, and are those skills transferrable? It seems to me that coding practices and statistical intuition are very transferrable, and so I wanted a role that would allow me to improve those two things. I want to be able to transfer what I learn in my next role to future roles, and I’m not attached to any particular industry. So it was important for me to distinguish myself from those who love coding, or those who want a 9-5 without much challenge, or those who want to do analyst work but don’t want to become leaders. Benchmarking and measuring your goals and feelings against others similar to you but in different roles and spaces is an excellent way to figure out what you want to do, and even what size of company you prefer.
My set of values pre-job offer:
Takeaway: find out what positions interest you, and try to craft your profile, projects, and skills to fit that role. Don’t be afraid to say no to positions if they don’t meet your criteria.
The 2020 turbulence shook everything that wasn’t securely tied down. I’ve spent much of my free time on calls with friends and family about navigating the challenges they are facing this year. I had weekends and whole weeks where I didn’t do anything except scroll on Reddit, TikTok, IG, etc. and felt like shit. I had other weeks where I felt like a superhero, learning things and gaining confidence, getting a website to work, debugging part of a data pipeline, etc. Here are the things that helped me stay on track:
I was SO LONELY on this journey, and resources on Reddit have helped me massively. As a way to give back to the community, I want to offer the following things for free:
In the Reddit thread linked below, and in many online forums in general, data scientists are characterized and measured by the number of lines of code they have written, or by how many tools, technologies, and algorithms they know or can reproduce on the spot. However, because data scientists often stand between engineering and business teams, communication, documentation, and cross-functional collaboration are massively underrated skills. There are many posts complaining about being underpaid or underappreciated for one’s skill level that never mention soft skills, which I’m certain have a substantial effect on the poster’s experience. They certainly shaped my own interactions with students in elite CS, Math, and Stats programs in the past.
I was also accused of lying or misleading others, and received several hateful messages. I received these messages as affirmations of the extraordinary nature of my journey, and an indication of high levels of stress and comparison or toxicity in the greater data science ecosystem.
In my 7 years in the nonprofit world, I did not see a model for transitioning successfully into a different industry. Almost none of my coworkers invested in transferrable skills beyond what was required in their day to day jobs, and when the systems or bureaucracy left them without room to grow or thrive, they were unwilling or unable to iterate into new environments which better suited them. My wife and I have both transitioned fully into tech companies in the SF Bay Area, and it is not due to training or opportunities afforded us by the natural career path of a nonprofit world employee, but rather the relentless documentation, coaching practice, outside materials, luck, and careful attention to our personal development which led us here. We took risk after risk and fought almost biannually to ensure we could move teams, positions, projects, and opportunities in order to guarantee development. In essence, there is no injective function which matches job experiences and titles to the employees in those roles, but only people who are limited by circumstances, organizations, mentorship, or lack of imagination. Our story is proof that a job title and its context are only one ingredient in your development and preparation, and that it’s possible to forge a path to your goal no matter how tangential your current opportunity seems.
If you’re reading this on my personal website, see the original post here.