What is Machine Learning, Data Science?
Today we are going to discuss the Machine Learning, Analytics and Data Science techniques that Data Scientists use. This will help you understand where each statistical technique is used and update your skills accordingly. After going through this blog you will have a working picture of data science, and you will no longer be asking what Machine Learning is.
What types of techniques are available in Machine Learning?
Before discussing the types of techniques available, we must know what kinds of problems we have. Business problems are much like the problems we face in life, so looking closely at our own life and its requirements helps us understand business problems.
Let's take an example from our own life. Suppose you are searching for a job and have just attended an interview. You will have the questions below.
- Will you get the job or not?
- What salary will you get?
- What salary will you be drawing over the next 5 or 10 years?
- How did your interview go? What did the interviewer ask? What was discussed?
- How did you feel after talking to the interviewer? Was the interviewer Positive, Negative or Neutral about you?
Will you get the job or not?
After giving an interview, we want to know whether we will get the job or not. Here we have two classes (categories): getting the job or not getting the job (Yes or No). Problems of this kind are called classification problems, whether they come up in our life or in a project in any organization. They are solved by classification techniques such as Logistic Regression, Decision Tree, Random Forest, SVM (Support Vector Machine), Naïve Bayes and many more, all of which are Machine Learning techniques. To apply Machine Learning we have options such as R, Python and SAS. It is also like our life. If we want to watch TV shows, we do not pick up the AC remote; we use the TV remote to move through the channels. In the same way, we use the techniques mentioned above when we have a classification problem. When working in the Data Science domain, we need to learn these techniques so that we can easily tell which technique a project requirement calls for.
In organizations, classification is used to predict whether an employee or customer will leave the company, whether a customer will commit fraud, and more.
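To make the Yes/No idea concrete, here is a minimal sketch of Logistic Regression written in plain Python. Everything in it is an assumption made for illustration: the interview scores, the outcomes and the helper names fit_logistic and probability_of_job are invented, and in a real project you would normally rely on a library in R, Python or SAS instead of hand-written gradient descent.

```python
import math


def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(scores, outcomes, learning_rate=0.1, epochs=1000):
    """Fit P(job) = sigmoid(w * (score - mean) + b) by batch gradient descent."""
    n = len(scores)
    mean = sum(scores) / n
    centered = [s - mean for s in scores]  # centering keeps the updates stable
    w, b = 0.0, 0.0
    for _ in range(epochs):
        # Error = predicted probability minus the actual 0/1 outcome.
        errors = [sigmoid(w * x + b) - y for x, y in zip(centered, outcomes)]
        w -= learning_rate * sum(e * x for e, x in zip(errors, centered)) / n
        b -= learning_rate * sum(errors) / n
    return w, b, mean


def probability_of_job(model, score):
    """Return the predicted probability of the Yes class for one score."""
    w, b, mean = model
    return sigmoid(w * (score - mean) + b)


# Made-up interview scores (out of 10) and outcomes (1 = got job, 0 = did not).
interview_scores = [2, 3, 4, 5, 6, 7, 8, 9]
got_job = [0, 0, 0, 0, 1, 1, 1, 1]

model = fit_logistic(interview_scores, got_job)
for score in (3, 8):
    p = probability_of_job(model, score)
    label = "Yes" if p >= 0.5 else "No"
    print(f"score={score}: P(job)={p:.2f} -> {label}")
```

The model returns a probability, and we turn it into a class by cutting at 0.5. That cut-off is itself a choice; a fraud team, for example, might prefer a lower threshold.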
What salary will you get?
Now let's discuss what salary you will get if the company gives you a job offer. Suppose you get 80K USD per year. Notice that 80K is a numeric value. Problems of this kind are continuous value prediction problems, also called regression problems. We have Machine Learning techniques for predicting continuous values as well, such as Regression analysis, SVM, Decision Tree and Random Forest.
Now you may notice that SVM, Decision Tree and Random Forest appear under both classification and continuous value prediction. So what is the difference? I have not listed Logistic Regression for continuous values because logistic regression is used only for classification problems, in the same way that Regression analysis (for example, linear regression) is used to predict continuous values. The other techniques, SVM, Decision Tree and Random Forest, can predict both types of values. They are like a universal remote that works for both the TV and Dish TV (Dish TV is a channel provider; the provider differs from country to country).
In organizations, regression is used to decide, for example, what loan amount a bank should give a customer or what credit card limit a customer should get.
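As a small illustration of predicting a numeric value, the sketch below fits a straight line (simple linear regression by least squares) in plain Python. The experience and salary figures are invented for the example, and fit_line and predict are hypothetical helper names, not part of any library.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict a continuous value for a new input."""
    return intercept + slope * x


# Made-up data: years of experience and offered salary in thousand USD.
experience_years = [1, 2, 3, 4, 5]
offered_salary_k = [45, 50, 60, 65, 80]

slope, intercept = fit_line(experience_years, offered_salary_k)
predicted = predict(slope, intercept, 6)
print(f"Predicted offer for 6 years of experience: {predicted:.1f}K USD")
# prints "Predicted offer for 6 years of experience: 85.5K USD"
```

Unlike the classification case, the output here is a number on a continuous scale, not a Yes/No label.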
What salary will you be drawing over the next 5 or 10 years?
Now let's discuss the problem that makes us happy just thinking about it: what salary we will be getting in the future. Imagining our future salary gives us a picture of our future. How do we think about this kind of problem? We reason like this: in the first year of my job I was getting 50K per year, in the second year 60K, then 70K in the 3rd year, 80K in the 4th year and 90K in the 5th year. Now we want to estimate what salary we will be getting in the 6th, 7th, 8th, 9th and 10th years. Problems of this kind are solved by Time Series Forecasting techniques, because we are predicting future values from previous (historical) data, which is salary in our case. There are univariate, multivariate and seasonal time series forecasting models, and the choice depends on the business requirement.
In organizations, Time Series Forecasting is used to predict sales, profit, employee head count, employee performance, etc.
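The salary example can be sketched with one of the simplest univariate baselines, the drift method, which extends the average year-over-year change into the future. This is only a baseline chosen for illustration, not a full forecasting model, and drift_forecast is a hypothetical helper name.

```python
def drift_forecast(history, steps):
    """Extend the average change per period of the history into the future."""
    if len(history) < 2:
        raise ValueError("need at least two past values")
    average_change = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + average_change * h for h in range(1, steps + 1)]


# Salary in thousand USD for years 1 to 5, as in the example above.
salary_history_k = [50, 60, 70, 80, 90]

forecast = drift_forecast(salary_history_k, steps=5)
for year, value in enumerate(forecast, start=len(salary_history_k) + 1):
    print(f"year {year}: {value:.0f}K")
# prints year 6: 100K, year 7: 110K, year 8: 120K, year 9: 130K, year 10: 140K
```

Real salary or sales data rarely grows this evenly, which is where seasonal and multivariate models come in.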
How did your interview go? What did the interviewer ask? What was discussed?
It is time to discuss how the interview went. What did we do in the interview? We talked with the interviewer about the job profile. The interviewer asked lots of questions and you answered them. So what are we going to do here? We are going to analyze the conversation between you and the interviewer, which is text data. Suppose you went to interview for a Data Scientist profile and the interviewer kept asking questions about web development. After analyzing the interview conversation, anybody can tell that you went to the wrong interview and that you will be rejected, because that job profile did not match yours.
Text data is analyzed with NLP (Natural Language Processing) techniques, also called text mining. We can use Topic Modelling, Text Clustering and other techniques to solve these kinds of problems.
Organizations use NLP to analyze employee surveys, customer surveys, or any other kind of text data available to them.
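The wrong-interview example can be sketched with plain keyword counting, which is a far simpler stand-in for Topic Modelling or Text Clustering. The transcript, the keyword lists and the function name best_matching_profile are all invented for this illustration.

```python
import re
from collections import Counter

# Hypothetical keyword lists for two job profiles.
PROFILE_KEYWORDS = {
    "data science": {"model", "regression", "statistics", "python", "data"},
    "web development": {"html", "css", "javascript", "frontend", "browser"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def best_matching_profile(text):
    """Count keyword hits per profile and return the best match with all scores."""
    counts = Counter(tokenize(text))
    scores = {
        profile: sum(counts[word] for word in keywords)
        for profile, keywords in PROFILE_KEYWORDS.items()
    }
    return max(scores, key=scores.get), scores


# Made-up questions from the interviewer.
transcript = (
    "How would you structure the HTML for this page? "
    "Which CSS layout would you choose? "
    "How do you debug JavaScript in the browser? "
    "Have you used Python before?"
)

profile, scores = best_matching_profile(transcript)
print(profile, scores)
# prints: web development {'data science': 1, 'web development': 4}
```

Topic Modelling goes further than this: it discovers the topics from the text itself instead of relying on keyword lists written by hand.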
How did you feel after talking to the interviewer? Was the interviewer Positive, Negative or Neutral about you?
Let's discuss the final remaining problem type. Suppose you interviewed for a data scientist role and the interviewer did ask data science questions. Now we want to judge how the interviewer was behaving with you. Did his words sound happy, with words like "good" and "great", or was he rude, saying things like "you should know this if you are working as a data scientist"?
This kind of problem helps us find the sentiment of a text, which is the interview conversation in your case. If the sentiment is Positive, you will probably get the job; if it is Negative, you probably will not; and if it is Neutral, there is roughly a 50-50 chance either way.
We find the sentiment of text with the help of sentiment analysis techniques. Organizations use them to track the sentiment of employees or customers about the organization. If a customer has a negative sentiment, there is a chance he will move to another company and we will lose that customer. Companies take proactive action to retain such customers.
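A very rough way to label text as Positive, Negative or Neutral is to count words from small positive and negative word lists. The sketch below does exactly that; the word lists, the sample sentences and the function name sentiment are assumptions made for illustration, and real sentiment analysis tools handle negation, tone and context far better.

```python
import re

# Tiny hand-made word lists, for illustration only.
POSITIVE_WORDS = {"good", "great", "excellent", "impressive", "nice"}
NEGATIVE_WORDS = {"bad", "poor", "wrong", "rude", "disappointing"}


def sentiment(text):
    """Label text by comparing counts of positive and negative words."""
    words = re.findall(r"[a-z]+", text.lower())
    score = sum(w in POSITIVE_WORDS for w in words)
    score -= sum(w in NEGATIVE_WORDS for w in words)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


print(sentiment("Good, that is a great answer."))    # prints "Positive"
print(sentiment("That is wrong. A poor answer."))    # prints "Negative"
print(sentiment("Tell me about your last project."))  # prints "Neutral"
```

Note that a rude remark with no listed negative word would come out as Neutral here, which is one reason word counting alone is not enough in practice.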
Note: We have discussed statistical techniques by scenario here, whether or not they belong to Machine Learning. Time Series Forecasting, NLP and Sentiment Analysis are not always counted as Machine Learning techniques, but whether we talk about Machine Learning, Time Series Forecasting or NLP, all of these techniques are part of the data science domain and Data Scientists use all of them. Data Scientist salaries also tend to be high compared to other job profiles in the Analytics industry. Learning all these techniques will take you a long way toward mastering Data Science.
We have prepared end-to-end project implementation courses that will help you understand these techniques easily and crack business problems.