This is what I wish I had read before starting.
This article is supposed to be different from the one I posted a few days ago: A quick guide to get started on Machine Learning and Computer Vision. That one is more of a collection of resources that focus mostly on getting up to speed on Machine Learning and Computer Vision, but it lacks the story part. You know, when you don't know where to start, sometimes you need somebody that guides you step by step from the beginning and doesn't throw you in the middle of a shitton of resources.
That article was about throwing you in the dumpster. This one tries to be helpful for those who want to start doing this for real.
This "guide" (or whatever you want to call it) is more or less what worked for me (and exactly how I'm still doing it, because this is a lifelong learning experience, buddy).
First, let's get the bad news out of the way
If you want to dedicate your life to Machine Learning, there will be math involved. Calculus, linear algebra, statistics, and probability. Do you have to be an expert? Of course not, but you'll have to make peace with the idea of moonlighting while reading math concepts that you forgot ten years ago. If you like math, then this won't be a problem, and if you don't, well, it is not the end of the world, but it will have to be part of your life.
A lot of people recommend starting with a math refresher before anything else, but that would never work for me. There's a lot of math out there, and I wanted to make sure I wasn't overcomplicating my life with things that were not relevant. I began with Machine Learning theory and only looked at the specific math concepts as they got in front of me; so that's what I'd recommend you do as well.
Here you have a list of free online resources from the MIT Open Courseware that cover everything you need (and if you haven't seen the MIT Open Courseware site, consider this your End-Of-2018 gift):
- Mathematics for Computer Science
- Linear Algebra
- Introduction to Probability and Statistics
- Single Variable Calculus
- Multivariable Calculus
If you are like me, then you'd like to grab a book or two to put on a shelf and never read. But you know, at least you have it just in case you are bored some day. Here are the books that I'd recommend (some of these I own, some are recommendations from other people, and most of them I haven't read):
- Linear Algebra Done Right
- No bullshit guide to linear algebra
- Introduction to Linear Algebra
- Matrix Computations
- A First Course in Probability
- Introduction to Probability Theory
- Machine Learning: A Probabilistic Perspective
- Introduction to Stochastic Processes
- Introduction to Statistical Theory
- Applied Multivariate Statistical Analysis
You need the theory
Alright, so putting the math aside, like with anything else in life, you'll need some Machine Learning theory to know what you are doing. Of course, you can start messing with things directly and skip the lectures, but this would be like trying to fly a plane without going to pilot school first (except the crashing part).
A lot of people will try to send you to graduate school, and although this is one way, it's not necessary. Today, there's a lot of online coursework that you can take. For a couple of examples, check Udacity and Coursera. For a little bit more specific advice, check A quick guide to get started on Machine Learning and Computer Vision; I added some specifics there.
I had the blessing to go to graduate school. I wasn't thinking about doing Machine Learning, but once inside, I decided to specialize in it. This works, of course, but I know smarter people that didn't pay a ton of money or spend a ton of time slaving in college. Yesterday, companies cared a lot about your credentials, but right now, they are taking anyone with the necessary skills: companies need people that care, like the field, and know how to get stuff done. And you can do all of this without going to school.
So start taking courses and reading papers. Free or paid, doesn't matter. Just start learning. You can also read books, of course, but the papers will keep you on top of what's new, and hot, and buzzwordy. Papers will be harder to read, but a great exercise to get smarter (yes, you'll get smarter) and get a headache from time to time.
Pick a language
Controversial topic, for sure, but I honestly don't care: I picked Python because, by the time Machine Learning became a thing for me, I already had experience with Python. And also, because I believe Python is the best.
There, I said it.
But you can also do R and be perfectly fine with it. I know somebody (yes, a single person) that does R. But he also does Python whenever TensorFlow is involved, so, as I said, do Python.
And if you are a software developer, expect a whole lot fewer lines of code to get cool things to happen here. You are probably used to counting your lines of code by the thousand (or heck, even by the million), but here it will be different. The focus is on the data and not on the code, and you'll learn that the hard way like everyone else. Just a heads up that here you'll find average programmers achieving amazing things. Emulate them.
Then, make things happen
This is the fun part, and unfortunately, the part that some people forget: start doing things. Whenever you apply for a job, this is what companies want to see from you: shit that works. Who cares how many papers you've read and how many fancy words you can spell? Show me things that run and work, and you are getting to the next level!
Your coursework will give you a ton of exercises, so make sure you create your GitHub account and upload all your solutions there (you can set them private if the school doesn't like you to share them). Make sure you write a report or some explanation together with the code. If you have nothing else to show, this will be your Plan B.
But if you have time, and you don't have kids that ruin your nights, you can download a dataset from anywhere and do something with it. I don't care what it is, but make it interesting. Just play with the data, and select some algorithms to do something with it. Document your findings, your struggles, and push it to your GitHub account. Repeat three times, and you'll get more from it than from reading all the theory books ever written.
If you are lucky enough to participate in some real-world project as part of your school or company, then great! You are all set, and headed down the right path. If you are not, another option is to participate in an open-source project (everyone says this, so I'm repeating it, but open-source projects are more intimidating than people tend to make them sound). I'd recommend asking around (StackOverflow or Quora may help) and finding a good place that allows you to follow along, and who knows, you may be able to contribute here and there.
Doing is the way you'll learn the most, so make it happen.
Expect this to be your job
First, you'll spend a lot of your time dealing with data: getting it, cleaning it, making sense out of it, cleaning it some more, and playing with it in every imaginable way.
The data will be your everything.
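To make the "cleaning it, then cleaning it some more" part a bit more concrete, here is a minimal sketch using only Python's standard library. The CSV content, the column names, and the `clean_rows` helper are all made up for illustration; real data will be messier and will need its own rules.

```python
import csv
import io

# Hypothetical raw export: inconsistent casing, stray spaces, a missing
# value, and a duplicated row. None of this comes from a real dataset.
RAW_CSV = """name,age,city
 Alice ,34,boston
Bob,,Chicago
alice,34,Boston
Carol,29, denver
"""


def clean_rows(raw_text):
    """Normalize text fields, drop rows with no usable age, remove duplicates."""
    cleaned = []
    seen = set()
    for row in csv.DictReader(io.StringIO(raw_text)):
        name = row["name"].strip().title()
        city = row["city"].strip().title()
        age_text = row["age"].strip()
        if not age_text.isdigit():
            continue  # skip rows with a missing or malformed age
        record = (name, int(age_text), city)
        if record in seen:
            continue  # skip exact duplicates after normalization
        seen.add(record)
        cleaned.append({"name": name, "age": record[1], "city": city})
    return cleaned


if __name__ == "__main__":
    for row in clean_rows(RAW_CSV):
        print(row)
```

Dropping rows with a missing age is just one possible choice here; depending on the problem, you might prefer to fill the gap or keep the row and flag it.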
Then, you'll pick an algorithm (or two), and you'll run your data through it. It sounds easier than it is, but that's pretty much the summary. It is unlikely that you'll be inventing new algorithms; instead, you'll be using everything that others have already put together for you. And don't let people minimize this step: as I said, this is tricky stuff. Knowing what to pick and how to apply it is an art that you'll need to master.
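As a rough illustration of picking "an algorithm or two" and running the same data through both, here is a small sketch in plain Python. Everything in it is an assumption made for the example: the synthetic two-cluster data, the 150/50 split, and the two predictors (a majority-class baseline and a 1-nearest-neighbor classifier). In practice you would most likely reach for an existing library rather than write these by hand.

```python
import random


def make_toy_data(n=200, seed=0):
    """Two made-up 2-D clusters labeled 0 and 1 (synthetic, for illustration)."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 0.0 if label == 0 else 3.0
        points.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return points


def majority_baseline(train, _features):
    """Always predict the most common label in the training set."""
    labels = [label for _, label in train]
    return max(set(labels), key=labels.count)


def nearest_neighbor(train, features):
    """Predict the label of the closest training point (1-NN)."""
    def squared_distance(item):
        (x, y), _ = item
        return (x - features[0]) ** 2 + (y - features[1]) ** 2
    return min(train, key=squared_distance)[1]


def accuracy(predict, train, test):
    """Fraction of test points the given predictor labels correctly."""
    hits = sum(predict(train, features) == label for features, label in test)
    return hits / len(test)


# Hold out the last 50 points so both predictors are scored on unseen data.
data = make_toy_data()
train, test = data[:150], data[150:]
baseline_acc = accuracy(majority_baseline, train, test)
knn_acc = accuracy(nearest_neighbor, train, test)

if __name__ == "__main__":
    print(f"majority baseline: {baseline_acc:.2f}")
    print(f"1-nearest neighbor: {knn_acc:.2f}")
```

The baseline is there on purpose: if your fancy algorithm can't beat "always guess the most common answer," that's worth knowing before you show the results to anyone.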
Then, you'll communicate your results. This is very important: unless you can get people to understand what you did, your work has no value. Here you'll learn how to talk, write, and visualize your data. You'll need to master the art of making people that know nothing about math, statistics, and probabilities understand what you need them to know.
Finally, you'll look at your work, decide that it is not that good after all, and go back to the first step to iterate and make it better. Rinse and repeat, my friend. You'll do this until it's good enough or you run out of budget, whichever comes first.
Be ready to suck at this for a long time. I've been doing it for some time, and I'm still horrible at it. I have hope, and I'm certainly better than I was last week, but this is a forever thing, and only practice and time will help you get there.
And of course, don't let this discourage you. Nobody can expect to show up at training camp and beat those who have been playing for five straight years. But if you get your sorry ass out of the comfort zone and start dedicating some time to this, you'll make it just like everyone else.
Have a Merry Christmas and Happy New 2019!