Road-map — Data Scientist

Hidevs Community
9 min read · Oct 7, 2020

Necessity is the mother of invention

Over the past 10 years, new technologies and inventions have created a strong demand for skilled engineers. Among these new jobs, Data Scientist and Machine Learning Engineer have been called the sexiest jobs of the 21st century: highly paid and future-proof. Nowadays everyone wants to become a Data Scientist or Machine Learning Engineer. I was no different. When I first heard about the field in my 3rd year of college, I decided to become a Data Scientist too. I started searching Google for the steps to follow, the courses, tools, and frameworks, or any roadmap, but I could not find a proper format or sequence of steps. So I read from many different websites, courses, codebases, tools, and frameworks. Hard work always pays off: I successfully got a job as a Data Scientist.

After working successfully as a Data Scientist, I decided to write a proper roadmap.

Let’s start our journey.

If you are an absolute beginner or new to the programming world, start with step 1. If you already know basic programming and VCS, or you are experienced, you can start directly from step 8. Personally, though, I recommend starting from step 1 in every case, spending more time on the topics you don't know and less time on the others.

Roadmap to becoming a Data Scientist

1. How to use Google efficiently — 2 Days

Before jumping directly into data science, we first have to learn how to use the world's best search engine, Google, because nowadays it is involved in all our daily tasks. The more efficiently you learn to use Google, the better the results you get from it.

2. 5w1h on Data Science — 2 Days

There is no use in learning anything without knowing its basics, reasons, applications, and so on. Here too, we should know what data science is, why we need it, how it works, and where it is applied. I use this rule whenever I start learning and exploring a new topic; it gives me proper information and simple notes. I am running a campaign on it on LinkedIn.

3. Basic Programming Languages like C, C++, Python, Java, and Data structures — 15 days

Programming is the foundation of the technology world, and without coding experience we cannot start a career in the Data Science field, so we should learn programming.

4. Linux commands and shell scripting — 7 days

You may be wondering why we need this step. Let me explain. Most companies prefer to work with open-source technologies, libraries, and operating systems, so most of the time we run into basic Linux commands or have to write automation scripts. This is not specific to data science; it is required in every role in the programming world. I recommend that every programmer follow and learn everything up to step 7.

5. Basic Networking concepts, commands — 7 days

After completing these four steps, we proceed to step 5, where you learn basic networking concepts and commands. You may again be confused about why networking is needed on the Data Science career path. Hold on, it will become clear: when we work on machine learning, we have to deploy models on the cloud, train models with parallel processing, configure different things in a cluster, and work with master and slave architectures, all of which require networking knowledge.
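To make the master and slave idea a little more concrete, here is a minimal sketch in Python using only the standard library. It assumes a single worker on the local machine that accepts one task over TCP and replies; the names `run_worker`, `send_task`, and `demo` are made up for this illustration, and a real training cluster would use a proper framework rather than raw sockets.

```python
import socket
import threading


def run_worker(server_socket):
    """Worker side: accept one connection, read a task, send back a reply."""
    connection, _ = server_socket.accept()
    with connection:
        task = connection.recv(1024).decode("utf-8")
        connection.sendall(f"done: {task}".encode("utf-8"))


def send_task(host, port, task):
    """Master side: connect to the worker, send a task, wait for the reply."""
    with socket.create_connection((host, port), timeout=5) as client:
        client.sendall(task.encode("utf-8"))
        return client.recv(1024).decode("utf-8")


def demo():
    """Run worker and master in one process, talking over localhost."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_socket:
        server_socket.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server_socket.listen(1)
        host, port = server_socket.getsockname()
        worker = threading.Thread(target=run_worker, args=(server_socket,))
        worker.start()
        reply = send_task(host, port, "train shard 1")
        worker.join()
    return reply


print(demo())
```

Even this toy version touches the concepts this step is about: IP addresses, ports, listening, connecting, and sending bytes between two processes.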

6. Version Control System (VCS) — 7 days

When we work on any project, we need a system that can manage our code, and this is where a VCS comes in. With a VCS, you can track your daily coding activity, provided you commit and push your code every day. You can use any VCS, but the most popular choice is Git, which is open source and easy to use, together with GitHub for hosting; I personally use GitHub. If you maintain your GitHub account with a proper coding style and well-organized repositories, it also helps when you apply for a job.

7. Need some experience in any domain like Web/Mobile — 15 days

OK, one more. Why this? Machine learning is a concept; in simple terms, it is all about algorithms. Those algorithms have to be integrated somewhere. There is no use in doing machine learning if you don't know how to integrate the algorithms into something the user can actually use.

8. Python and its Data Structures — 10 days

As we all know, Python is one of the most used programming languages in the Machine Learning and Data Science world. So you need a good command of it: practice as much as you can with its different data structures, and try to learn advanced Python too, such as writing test cases, debugging, and code optimization techniques.
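As a small illustration of this kind of practice, the sketch below uses a few built-in data structures (`Counter`, `set`, `list`, `deque`) and checks each function with plain `assert` test cases. The function names are invented for this example; they are just one way to exercise the ideas.

```python
from collections import Counter, deque


def word_frequencies(text):
    """Count how often each lowercase word appears (dict-like Counter)."""
    return Counter(text.lower().split())


def unique_in_order(items):
    """Drop duplicates but keep first-seen order (set for lookup, list for order)."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


def moving_average(values, window):
    """Average of the most recent `window` values at each step (bounded deque)."""
    recent = deque(maxlen=window)
    averages = []
    for value in values:
        recent.append(value)
        averages.append(sum(recent) / len(recent))
    return averages


# Simple test cases written as plain assertions
assert word_frequencies("Data science needs data")["data"] == 2
assert unique_in_order([3, 1, 3, 2, 1]) == [3, 1, 2]
assert moving_average([2, 4, 6], window=2) == [2.0, 3.0, 5.0]
```

Once plain assertions feel comfortable, the same checks can be moved into `unittest` or `pytest` test files.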

9. Linear algebra, Probability, and Statistics — 15 days

Learn the foundations, or building blocks, of machine learning algorithms. Linear algebra gives us matrix calculations, vectors, inverses, and transposes. Probability gives us Bayes' theorem and models built on it, such as Naive Bayes. Statistics gives us statistical models such as linear and logistic regression, which help a lot in the coming steps.
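Here is a rough sketch of what these building blocks look like in plain Python. The matrix, the probabilities in the Bayes example, and all function names are made-up illustration values, and the inverse is limited to 2x2 matrices to keep the code short.

```python
import statistics


def transpose(matrix):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(column) for column in zip(*matrix)]


def inverse_2x2(matrix):
    """Inverse of a 2x2 matrix using the determinant formula."""
    (a, b), (c, d) = matrix
    determinant = a * d - b * c
    if determinant == 0:
        raise ValueError("matrix is singular, so it has no inverse")
    return [[d / determinant, -b / determinant],
            [-c / determinant, a / determinant]]


def matmul(left, right):
    """Matrix product: each entry is a row-by-column dot product."""
    return [[sum(x * y for x, y in zip(row, column)) for column in zip(*right)]
            for row in left]


def bayes_posterior(prior, true_positive_rate, false_positive_rate):
    """P(condition | positive result) from Bayes' theorem."""
    evidence = true_positive_rate * prior + false_positive_rate * (1 - prior)
    return true_positive_rate * prior / evidence


A = [[4, 7], [2, 6]]
print(transpose(A))                      # [[4, 2], [7, 6]]
print(matmul(A, inverse_2x2(A)))         # close to the identity matrix
print(bayes_posterior(0.01, 0.9, 0.05))  # posterior for a 1% prior
print(statistics.mean([2, 4, 6]), statistics.pstdev([2, 4, 6]))
```

Multiplying a matrix by its inverse should give (approximately) the identity matrix, which is a handy self-check when you write these routines yourself.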

10. Machine Learning Libraries and Tools — 15 days

In this step you need to learn the important machine learning libraries that we are going to use in the coming steps: Pandas for data cleaning, preprocessing, and manipulation; NumPy for n-dimensional array calculations and basic arithmetic operations on them; Matplotlib for graphs; and Seaborn for advanced graphs and in-depth analysis of the data. You should also practice with data science IDEs such as Jupyter Notebook and Spyder. In this step, you also have to learn how to set up a machine learning development environment. I created an automated script that downloads and installs all the required libraries and packages in one shot.
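The install script mentioned above is not reproduced here. As a rough stand-in, the sketch below only checks which of the four libraries are importable and prints the `pip` command you would run for the missing ones. The `REQUIRED` list and the `missing_packages` helper are assumptions made for this example.

```python
import importlib.util

# Import names of the libraries named in this step (assumed list for the example)
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    # Print the command instead of running it, so nothing is installed silently
    print("Install with: python -m pip install " + " ".join(missing))
else:
    print("All libraries are available.")
```

Printing the command rather than running it keeps the check safe to run anywhere, including inside a virtual environment you have just created.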

11. Machine Learning Introduction, Algorithms, and Building blocks — 20 days

In this step, you learn the introduction to machine learning and the mathematics of its basic algorithms, such as regression, classification, SVM, SVR, KNN, and decision trees, along with building blocks such as gradient descent, cost functions, loss functions, stochastic gradient descent, mini-batch gradient descent, stochastic gradient descent convergence, online learning, and data parallelism. You should implement all the algorithms in core Python without using any libraries. If you code every algorithm from scratch, you get to know how the algorithms work and what their key points are. This step is one of the most important, crucial, and time-consuming in the roadmap. When I was learning, I spent more than 3 months learning and coding every algorithm in core Python. Here is my GitHub repository.
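To give a feel for what "from scratch in core Python" means, here is a minimal sketch of batch gradient descent for one-variable linear regression. It is not taken from the repository mentioned above; the toy data, the learning rate, and the number of epochs are assumptions chosen so that the example converges.

```python
def predict(w, b, x):
    """Linear model: y = w * x + b."""
    return w * x + b


def cost(w, b, xs, ys):
    """Halved mean squared error over the dataset."""
    n = len(xs)
    return sum((predict(w, b, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * n)


def gradient_descent(xs, ys, learning_rate=0.1, epochs=2000):
    """Batch gradient descent: every update uses all training samples."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [predict(w, b, x) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n  # d(cost)/dw
        grad_b = sum(errors) / n                             # d(cost)/db
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

w, b = gradient_descent(xs, ys)
print(f"w = {w:.3f}, b = {b:.3f}, cost = {cost(w, b, xs, ys):.6f}")
```

Stochastic and mini-batch gradient descent follow the same update rule; they differ only in how many samples are used to compute `grad_w` and `grad_b` at each step.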

12. Optimization, Debugging and Validation Techniques — 10 days

Learn how to optimize your algorithms and how to test and debug machine learning algorithms or solutions. This is another important step in the data scientist roadmap, because knowing how to debug or optimize an algorithm or solution is much more important than implementing it. Topics include evaluating a hypothesis, model selection with training, validation, and test sets, K-fold cross-validation, XGBoost, diagnosis, regularization and bias/variance, and learning curves.
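K-fold cross-validation is a good one to write by hand once. The sketch below splits sample indices into k folds and scores a deliberately trivial "model" that always predicts the training mean. Both function names are invented for this example, and the split is not shuffled, which real projects usually do first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    # Spread the remainder so fold sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


def cross_validate_mean_model(ys, k):
    """Average validation MSE of a baseline that predicts the training mean."""
    scores = []
    for train, validation in k_fold_indices(len(ys), k):
        prediction = sum(ys[i] for i in train) / len(train)
        mse = sum((ys[i] - prediction) ** 2 for i in validation) / len(validation)
        scores.append(mse)
    return sum(scores) / len(scores)


for train, validation in k_fold_indices(5, 2):
    print("train:", train, "validation:", validation)

print(cross_validate_mean_model([1.0, 2.0, 3.0, 4.0], k=2))
```

Every sample lands in the validation set exactly once, so the averaged score uses all of the data without ever scoring a model on samples it was trained on.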

13. Project on ML and deploy using flask and AWS — 7 days

After completing all these 12 steps, it's time to deploy the machine learning model on a website or app. Here you can use Flask, Django, or TensorFlow.js for a website and React Native for mobile apps.
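To show the shape of such a deployment without installing anything, here is a sketch of a tiny prediction endpoint written against WSGI, the interface that Flask and Django applications also run on. It uses the standard library's `wsgiref` instead of Flask itself, and the "model" (`y = 2x + 1`), the `/predict` route, and the helper names are all assumptions made for this example.

```python
import io
import json
from wsgiref.simple_server import make_server

# Hypothetical "trained" model for the example: y = 2x + 1
MODEL = {"w": 2.0, "b": 1.0}


def predict(x):
    return MODEL["w"] * x + MODEL["b"]


def respond(start_response, status, payload):
    """Send a JSON response with the given status line."""
    body = json.dumps(payload).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(body)))])
    return [body]


def app(environ, start_response):
    """WSGI app: POST /predict with a JSON body such as {"x": 3}."""
    if environ.get("REQUEST_METHOD") != "POST" or environ.get("PATH_INFO") != "/predict":
        return respond(start_response, "404 Not Found", {"error": "use POST /predict"})
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
        payload = json.loads(environ["wsgi.input"].read(length) or b"{}")
        x = float(payload["x"])
    except (KeyError, TypeError, ValueError):
        return respond(start_response, "400 Bad Request", {"error": "body needs a numeric x"})
    return respond(start_response, "200 OK", {"prediction": predict(x)})


def call_app(method, path, body=b""):
    """Call the app directly, without a network socket (handy for tests)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    environ = {"REQUEST_METHOD": method, "PATH_INFO": path,
               "CONTENT_LENGTH": str(len(body)), "wsgi.input": io.BytesIO(body)}
    chunks = app(environ, start_response)
    return captured["status"], json.loads(b"".join(chunks))


def serve(host="127.0.0.1", port=8000):
    """Start a local development server (blocks until interrupted)."""
    with make_server(host, port, app) as server:
        server.serve_forever()


print(call_app("POST", "/predict", b'{"x": 3}'))
```

Calling `serve()` starts a local server you can send POST requests to. In a Flask project the routing and JSON handling would be done by the framework, but the flow stays the same: load the model, parse the request, return the prediction.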

Closing Notes

Good luck with your Data Scientist journey! It's certainly not going to be easy, but by following this roadmap and guide, you are one step closer to becoming the Data Scientist you always wanted to be.

Thanks for reading this far. I hope it helps you all. One humble request: follow all the steps without skipping any. If you cheat or leave out any point, you are only cheating yourself. You might be thinking that there is so much to learn, so many courses to join, so many websites and blogs to read, and so many libraries to practice with, but you don't need to worry; I am with you. If you like this roadmap and want me to share a full explanation with the resources I used in each step (articles, tools, courses, etc.), or if you want roadmaps for other tech stacks such as Back-End Developer, MEAN or MERN developer, NLP, Computer Vision, DevOps, or MLOps, send me your request on Twitter, LinkedIn, or my website, or comment in the section below.

If you like this article and would like to be notified of every new post, please consider following me on Medium (deepakchawla), and don't forget to follow me on Twitter, LinkedIn, and my website.

All the best for your Data Scientist Journey.
