# DataGenX

My e-Notes about DataScience, Machine Learning, Python, Data Analytics, DataStage, DWH and ETL Concepts

## Monday, 15 June 2020

How to approach an ML problem is a very common question among novice ML practitioners who aspire to become ML engineers able to solve any problem with their knowledge.

First of all, I want to say that not every problem needs to be solved with ML. We should focus on the simplest, easiest solution, and it doesn't matter which technique gets us there. If a problem can be solved with a few algebraic equations, it is not a good candidate for ML, so think before jumping to "How can I use ML here?"
In this post, I am going to share a few basic steps. They are not new at all, but I use them for every problem (focusing on ML problems here) that I come across in my day-to-day work.

Let's go through the flow below first, then we will discuss things in detail.

I personally follow this flow to solve an ML problem and, I swear, it has never let me down. So let's look into the details:

a. Conceptualize the Problem:
What do I mean by that? It is a human tendency to jump the gun: someone wearing a bulletproof jacket rushes in without checking which kind of gun they are facing. So always analyze the problem first, before jumping to a solution. Read the problem statement at least 2-3 times to get a firm hold on it.

b. Decide ML Algo & Merits:
Every problem can be attacked with some algorithm (leaving accuracy aside for now), so start from there. If you don't have any idea, start with a Random Forest. Then decide the merits (the evaluation metrics) on which you are going to judge your model.

c. Pseudo-code the Problem:
Just sketch a rough code structure showing how you are going to execute all the steps needed to solve this problem.
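Such a skeleton could look something like the sketch below. All function names and the toy logic inside them are hypothetical placeholders; the point is only to fix the order of the steps before any real code is written.

```python
def collect_and_clean(source):
    # Step d placeholder: load the raw records and drop the unusable ones.
    return [row for row in source if row is not None]

def engineer_features(rows):
    # Step e placeholder: turn each raw record into (features, label).
    # The label here is a made-up "is the number odd" target.
    return [((row,), row % 2) for row in rows]

def train(examples):
    # Step f placeholder: the "model" is just a memorised lookup table.
    return {features: label for features, label in examples}

def validate(model, examples):
    # Step g placeholder: fraction of examples the model gets right.
    hits = sum(model.get(features) == label for features, label in examples)
    return hits / len(examples)

def run_pipeline(source):
    rows = collect_and_clean(source)
    examples = engineer_features(rows)
    model = train(examples)
    # Scored on the training examples only to keep the skeleton short;
    # a real run would score on held-out data.
    score = validate(model, examples)
    return model, score

model, score = run_pipeline([1, 2, None, 3, 4])
```

Each placeholder body gets replaced as the corresponding step is worked out, while `run_pipeline` keeps the overall shape stable.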

d. Data Collection and Cleaning:
From this step onward, we actually work on the problem; until now we were only watching the highlights of the match. Keep in mind that data cleaning and feature engineering take up most of the total time spent on a problem.
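As a small illustration of the cleaning part, here is a sketch using only the standard `csv` module. The sample data, the column names and the three rules (strip whitespace, drop incomplete rows, drop exact duplicates) are assumptions made for this example; real data usually needs rules of its own.

```python
import csv
import io

# Hypothetical raw export: stray whitespace, a missing value, a duplicate row.
RAW = """name,age,city
 Asha ,34,Pune
Ravi,,Delhi
Meena,29, Mumbai
 Asha ,34,Pune
"""

def clean_rows(text):
    """Strip whitespace, drop rows with empty fields, drop exact duplicates."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        row = {key: value.strip() for key, value in row.items()}
        if any(value == "" for value in row.values()):
            continue  # incomplete record
        fingerprint = tuple(row.values())
        if fingerprint in seen:
            continue  # duplicate record
        seen.add(fingerprint)
        row["age"] = int(row["age"])  # cast the numeric column
        cleaned.append(row)
    return cleaned

rows = clean_rows(RAW)
```

Here `rows` ends up with two records, one for Asha and one for Meena. Dropping incomplete rows is the bluntest option; filling in missing values is often the better choice when data is scarce.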

e. Feature Engineering:
This step is crucial: decide the minimum number of columns required to solve the problem in an optimal way, and which kind of pre-processing is required. You might have to revisit this step back and forth as you work on your model. Once the features have been processed, split the data into 3 parts: train, validation and test. Remember to shuffle the data before splitting so that every kind of record is distributed across all the parts.
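The randomized three-way split can be sketched in a few lines of plain Python. The 70/15/15 proportions and the fixed seed are my assumptions for the example, not a rule.

```python
import random

def train_val_test_split(rows, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle a copy of rows, then cut it into train, validation and test."""
    shuffled = list(rows)                      # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)      # seeded, so the split is repeatable
    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

data = list(range(100))
train, val, test = train_val_test_split(data)
print(len(train), len(val), len(test))  # 70 15 15
```

A plain shuffle is fine for roughly balanced data. If one class is rare, a stratified split (shuffling and cutting each class separately) keeps it represented in every part.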

f. Model Training & Tuning:
Now proceed with model training and tune the model against the merits you decided on in step b. Here you can change your mind and use another ML algo or other merits, in which case you may have to revisit the feature engineering step.
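Tuning against a chosen merit can be as simple as a grid search: try each candidate setting, score it on the validation data, keep the best. The sketch below uses a deliberately tiny toy model (a single threshold on one feature) and made-up data, so the only thing to take from it is the loop itself.

```python
def predict(threshold, xs):
    """Toy model: predict 1 when the feature exceeds the threshold."""
    return [1 if x > threshold else 0 for x in xs]

def accuracy(y_true, y_pred):
    """The merit used for tuning in this example."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def grid_search(candidates, x_val, y_val):
    """Return (best_score, best_threshold) measured on the validation data."""
    best_score, best_threshold = -1.0, None
    for threshold in candidates:
        score = accuracy(y_val, predict(threshold, x_val))
        if score > best_score:
            best_score, best_threshold = score, threshold
    return best_score, best_threshold

# Hypothetical validation set: one numeric feature and a binary label.
x_val = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_val = [0, 0, 0, 1, 1, 1]

best_score, best_threshold = grid_search([0.5, 2.5, 3.5, 4.5], x_val, y_val)
print(best_score, best_threshold)  # 1.0 3.5
```

With a real model, the threshold becomes a set of hyperparameters and `predict` becomes a train-then-predict step, but the structure stays the same.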

g. Model Validation:
Once the model is ready, validate it against the validation data; if the dataset is small, you can use cross-validation instead. The validation result tells you whether you have to revisit the model algo, merits, etc.
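For the small-data case, here is a minimal sketch of how k-fold cross-validation divides the rows. The function name and the defaults are mine; it only produces the index splits, and the train-and-score step for each fold is left out.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal slices
    for i, val_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, val_idx

splits = list(k_fold_indices(10, k=5))
for train_idx, val_idx in splits:
    print(len(train_idx), len(val_idx))  # 8 2 on every line
```

The model is trained k times, once per pair, and the k scores are averaged, so every row is used for validation without ever being validated by a model that trained on it.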

h. Model Deployment:
If you are satisfied with the results of all the steps above, you can proceed with model deployment. It can be on a local server or in the cloud; remember to convert the model into the format your deployment target expects.
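What "the right format" means depends entirely on where the model is deployed, so the sketch below only shows the general idea: export the trained parameters to a portable file and rebuild the model from it on the other side. The `ThresholdModel` class and the JSON layout are hypothetical, chosen to keep the example self-contained.

```python
import json
import os
import tempfile

class ThresholdModel:
    """Hypothetical stand-in for a trained model."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, xs):
        return [1 if x > self.threshold else 0 for x in xs]

    def to_dict(self):
        return {"kind": "threshold", "threshold": self.threshold}

    @classmethod
    def from_dict(cls, payload):
        return cls(payload["threshold"])

model = ThresholdModel(threshold=3.5)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(model.to_dict(), f)      # export on the training side
    with open(path, encoding="utf-8") as f:
        restored = ThresholdModel.from_dict(json.load(f))  # load on the serving side

predictions = restored.predict([1.0, 4.0])
print(predictions)  # [0, 1]
```

Real frameworks ship their own export formats, so in practice this step usually means calling the framework's save function rather than writing JSON by hand.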

I hope this post gives you an idea of how to proceed with any ML problem. Let me know in the comment box.
Till then... Happy Learning!!!


## Disclaimer

The postings on this site are my own and don't necessarily represent IBM's or other companies' positions, strategies or opinions. All content provided on this blog is for informational purposes and knowledge sharing only.
The owner of this blog makes no representations as to the accuracy or completeness of any information on this site or found by following any link on this site. The owner will not be liable for any errors or omissions in this information nor for the availability of this information. The owner will not be liable for any losses, injuries, or damages from the display or use of this information.