Machine Learning

 


        Machine learning is a field of computer science that uses statistical techniques to give computer systems the ability to "learn" from data without being explicitly programmed.

        Machine learning is a technology that enables computers to learn automatically from past data, build a model, and predict outputs for future inputs.

        In simple words, machine learning is all about learning from data.

        Now, in the definition, "explicitly programmed" means that we write a program for every scenario to handle that scenario.

        But in machine learning, we don’t do that. Instead, we have some data and an algorithm, and we instruct the algorithm to go over the data and identify patterns between the input and the output. Once it has learned those patterns, we provide new input data to it, and it generates the output.

        In a conventional or traditional program, you write a program based on the logic you have created for a given problem or scenario. Then, you provide data to the program, and the computer generates the output.

But in machine learning, things are different. Here, you provide data that includes both inputs and outputs. However, you don’t write a program or create logic manually. Instead, the logic is automatically generated by the machine learning algorithm.

So, the good part is that you don’t have to write a program and logic for every condition or scenario, because the machine learning algorithm derives the logic from the data.



        Example:

        If you write a program to add two numbers, whenever you provide two numbers to it, the program will return their sum.

        But in machine learning, you can provide data in an Excel sheet where each row contains numbers and their sum. The machine learning algorithm trains on this data and learns the pattern, so in the future, when you provide new numbers, it can work out their sum without anyone having written the addition rule. The program you wrote to add two numbers, however, cannot handle more than two numbers as input because it is explicitly coded to add only two numbers; a learning algorithm can instead be retrained on rows that contain more numbers. This is the main difference.

        So, from this example, you can understand why ML is so popular nowadays.
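The contrast can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the training rows are made-up sample data standing in for the Excel sheet, and `fit_two_weights` and `learned_add` are hypothetical names. The "model" here is a simple least-squares fit with two weights, which is one possible choice among many.

```python
def add_two(a, b):
    """Traditional program: the rule (a + b) is written by hand."""
    return a + b


def fit_two_weights(rows):
    """Learn w1, w2 so that w1*a + w2*b matches the target column.

    Solves the 2x2 least-squares normal equations with Cramer's rule.
    """
    saa = sum(a * a for a, b, t in rows)
    sab = sum(a * b for a, b, t in rows)
    sbb = sum(b * b for a, b, t in rows)
    sat = sum(a * t for a, b, t in rows)
    sbt = sum(b * t for a, b, t in rows)
    det = saa * sbb - sab * sab
    w1 = (sat * sbb - sab * sbt) / det
    w2 = (saa * sbt - sab * sat) / det
    return w1, w2


# Made-up "spreadsheet": each row is (number, number, their sum).
training_rows = [(1, 2, 3), (4, 1, 5), (3, 3, 6), (7, 2, 9), (5, 8, 13)]

# Nobody tells the algorithm that the rule is addition; it finds the weights.
w1, w2 = fit_two_weights(training_rows)


def learned_add(a, b):
    """Prediction from the learned weights (no hand-written '+' rule)."""
    return w1 * a + w2 * b


print(add_two(10, 20))
print(learned_add(10, 20))
```

Both calls should agree, because the learned weights come out as 1 and 1 for this data. Note that this small model still takes exactly two inputs; handling three numbers would mean retraining on rows with three input columns, not rewriting the logic by hand.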

        Now, we will discuss in which scenarios ML is more useful than normal software or traditional programming.

        1) In some scenarios, you can't write a program or define all possible cases, and that's where Machine Learning helps.

Example – You are trying to build a spam classifier that identifies whether a given email is spam or not. As a software developer, what can you do? First, you can analyze a large number of spam emails and try to identify some patterns. For example, if the word "discount" appears more than three times, or the word "sale" is used frequently, or if the email contains too many images, these could be indicators of spam. Based on these patterns, you could write a long list of if-else conditions to develop spam detection software.

But if an advertising company finds out that your code marks an email as spam when the word "discount" appears more than three times, they can bypass this by using synonyms like "offer." As a result, your program will no longer detect this condition, and the email won’t be identified as spam.
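A minimal sketch of such a hand-written filter shows how easily it is bypassed. The function name `is_spam_rule_based`, the threshold used for "sale", and the two sample emails are assumptions made for illustration only.

```python
def is_spam_rule_based(email_text):
    """Hand-written if-else rules, as a developer might code them."""
    words = email_text.lower().split()
    if words.count("discount") > 3:
        return True
    if words.count("sale") > 3:  # assumed threshold for "used frequently"
        return True
    return False


original = "discount discount discount discount on every item"
rewritten = "offer offer offer offer on every item"

print(is_spam_rule_based(original))   # True: the "discount" rule fires
print(is_spam_rule_based(rewritten))  # False: the synonym slips through
```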

As a software developer, you would need to continuously update the code and refine the logic to account for such variations. This means you must regularly update your program or software to ensure it continues to work properly.

In Machine Learning, this maintenance burden is much smaller because the system learns from data. If the data changes, you retrain the algorithm on the new examples and it adapts its logic without anyone rewriting the rules. That is the beauty of Machine Learning.

In Machine Learning, you only need to write one program or algorithm for a given scenario, and it can handle various cases dynamically without requiring constant manual updates.
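To make this concrete, here is a rough sketch of a learned spam filter. It is a simplified word-count classifier in the spirit of Naive Bayes (equal class priors assumed, unseen words ignored), not a production method. The tiny datasets and the names `train` and `is_spam` are made up for illustration.

```python
import math
from collections import Counter


def train(examples):
    """Count how often each word appears in spam and in non-spam (ham)."""
    counts = {"spam": Counter(), "ham": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts


def is_spam(counts, text):
    """Sum smoothed log-likelihood ratios of the known words in the email."""
    vocab = set(counts["spam"]) | set(counts["ham"])
    spam_total = sum(counts["spam"].values()) + len(vocab)
    ham_total = sum(counts["ham"].values()) + len(vocab)
    score = 0.0
    for word in text.lower().split():
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        score += math.log((counts["spam"][word] + 1) / spam_total)
        score -= math.log((counts["ham"][word] + 1) / ham_total)
    return score > 0


# Made-up training data collected before advertisers changed their wording.
old_data = [
    ("discount discount discount today", "spam"),
    ("big discount on sale", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
]
# Newer labeled emails that use the synonym "offer".
added_data = [
    ("exclusive offer offer inside", "spam"),
    ("special offer ends soon", "spam"),
]

new_email = "special offer offer offer"

old_model = train(old_data)
new_model = train(old_data + added_data)  # same code, only the data changed

print(is_spam(old_model, new_email))  # False: "offer" was never seen
print(is_spam(new_model, new_email))  # True: learned from the new examples
```

The program itself is identical in both cases; only the training data differs, which is the point the paragraph above makes.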

2) Machine Learning is used for complex tasks where there are countless possible cases that are difficult to anticipate. In such scenarios, traditional programming may not be effective.

Example – Image classification: given an image, decide whether a dog is present in it or not.

 

Types of machine learning

Machine learning types depend on three different factors, but today we will focus on the amount of supervision needed for a machine learning algorithm to be trained.

 

Figure: Types of Machine Learning

 

As shown in the figure, based on the amount of supervision, Machine Learning is divided into four categories. The first category is Supervised Machine Learning, the second is unsupervised Machine Learning, the third is Semi-Supervised Machine Learning, and the last is reinforcement learning.

This is a famous categorization of machine learning types. If you read any book or watch a YouTube video, you will see these categories: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning.

So, we will discuss each one in detail, one by one, to understand the logic behind it.

Machine learning is all about learning from data and training on it. If both the input and the output (labeled data) are present in the dataset, and the task is to find the relationship between them so that the output can be predicted for a new input, then this type of learning is called supervised machine learning.

Let me explain with an example. Suppose we have data on 5,000 students, which includes two pieces of information: first, the student's IQ, and second, their CGPA. Additionally, we have one more piece of information—whether the student was placed or not.

 

Sr. No   IQ    CGPA   Placement (Y/N)
1        120   8.5    Y
2        110   7.8    Y
3        130   9.1    Y
4        105   7.0    N
...      ...   ...    ...
4997     102   6.9    N
4998     138   9.3    Y
4999     99    6.2    N
5000     125   8.7    Y

Now, if someone asks which of these three columns of information is the input and which is the output, you can easily say that IQ and CGPA are the input columns, while the column indicating whether the student was placed or not is the output column. So, here, the output column (student was placed or not) depends on the other two input columns.
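In code, this split into input and output columns might look like the sketch below. It uses only the eight rows visible in the table; the names `X` and `y` are a common convention and are an assumption here, not something the dataset prescribes.

```python
# The eight rows shown in the table: (sr_no, iq, cgpa, placed)
rows = [
    (1, 120, 8.5, "Y"),
    (2, 110, 7.8, "Y"),
    (3, 130, 9.1, "Y"),
    (4, 105, 7.0, "N"),
    (4997, 102, 6.9, "N"),
    (4998, 138, 9.3, "Y"),
    (4999, 99, 6.2, "N"),
    (5000, 125, 8.7, "Y"),
]

# Input columns: IQ and CGPA. The serial number is only an identifier.
X = [(iq, cgpa) for _, iq, cgpa, _ in rows]

# Output column: whether the student was placed.
y = [placed for _, _, _, placed in rows]

print(X[0], y[0])  # (120, 8.5) Y
```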

Now, in this data, both Input and Output are present. When you apply a Machine Learning algorithm, it identifies the mathematical relationship between Input and Output. This allows the model to predict whether a student will be placed or not based on their IQ and CGPA in the future. This is known as Supervised Machine Learning.

Supervised Machine Learning has two types: Regression and Classification. To understand these, we first need to know the types of data. Generally, data is categorized into two types:

  1. Numerical data – Examples: age, weight, IQ, CGPA, etc.
  2. Categorical data – Examples: gender, nationality, blood group, and education level (e.g., Bachelor's, Master's, PhD).

Now, let's understand what Regression is. If you are working on a Supervised Machine Learning problem, it means you have a dataset where both input and output columns are present. If the output column contains numerical values, then the Supervised Machine Learning problem is called Regression.

Example:

Student Placement Data

Sr. No.   IQ    CGPA   Package (in LPA)
1         120   8.5    10.5
2         110   7.8    6.8
3         130   9.1    15.2
4         105   7.0    5.5
...       ...   ...    ...
4997      102   6.9    4.8
4998      138   9.3    18.0
4999      99    6.2    3.9
5000      125   8.7    12.5

         

This table contains 5000 records of students' data with four columns:

  1. Sr. No. – Serial number of the record.
  2. IQ – The intelligence quotient of the student.
  3. CGPA – The cumulative grade point average.
  4. Package (in LPA) – The salary package offered in Lakhs Per Annum (LPA).

The data can be used for Regression-based Supervised Machine Learning, where IQ and CGPA are inputs, and the package is the numerical output. This helps in predicting salary packages based on academic performance and intelligence.
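As a rough illustration, the sketch below fits a linear regression to the eight rows shown in the table, using plain Python and the least-squares normal equations. Linear regression is just one possible regression algorithm, chosen here as an assumption; the helper names and the query student (IQ 115, CGPA 8.0) are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear_regression(X, y):
    """Least-squares fit of y = w0 + w1*IQ + w2*CGPA via normal equations."""
    rows = [[1.0, *features] for features in X]  # leading 1.0 is the intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights, features):
    """Apply the fitted weights to one (IQ, CGPA) pair."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], features))


# Inputs (IQ, CGPA) and numerical output (package in LPA) from the table.
X = [(120, 8.5), (110, 7.8), (130, 9.1), (105, 7.0),
     (102, 6.9), (138, 9.3), (99, 6.2), (125, 8.7)]
y = [10.5, 6.8, 15.2, 5.5, 4.8, 18.0, 3.9, 12.5]

weights = fit_linear_regression(X, y)

# Estimated package for a hypothetical new student.
print(predict(weights, (115, 8.0)))
```

The output is a number, not a category, which is what makes this a regression problem. With only eight rows the estimate is illustrative at best; the full 5000 records would give a more trustworthy fit.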

If you understand Regression, then Classification is easy to understand. In Supervised Machine Learning, if the output column is categorical instead of numerical, the problem is called Classification.

Example:

Student Placement Data

Sr. No   IQ    CGPA   Placement (Y/N)
1        120   8.5    Y
2        110   7.8    Y
3        130   9.1    Y
4        105   7.0    N
...      ...   ...    ...
4997     102   6.9    N
4998     138   9.3    Y
4999     99    6.2    N
5000     125   8.7    Y

This table contains 5000 student records with a serial number, two input columns (IQ and CGPA), and one output column. It is used for Classification-based Supervised Machine Learning, where the goal is to predict whether a student will be placed or not based on their IQ and CGPA.

Columns Explanation:

  1. Sr. No. – A unique serial number for each student in the dataset.
  2. IQ – The Intelligence Quotient of the student, which measures cognitive ability.
  3. CGPA – The Cumulative Grade Point Average, representing academic performance.
  4. Placement (Y/N) – The output label (target variable):
    • Y (Yes) → The student was placed in a job.
    • N (No) → The student was not placed.
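A classifier for this table can be sketched with k-nearest neighbours, which predicts the majority label among the most similar known students. This is only one of many classification algorithms and is chosen here as an assumption; `knn_predict`, the value k=3, and the two query students are hypothetical.

```python
import math
from collections import Counter


def knn_predict(X, y, query, k=3):
    """Return the majority label among the k training rows closest to query."""
    nearest = sorted(range(len(X)), key=lambda i: math.dist(X[i], query))[:k]
    votes = Counter(y[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Inputs (IQ, CGPA) and categorical output (placed Y/N) from the table.
# Note: the features are not scaled, so IQ dominates the distance here.
X = [(120, 8.5), (110, 7.8), (130, 9.1), (105, 7.0),
     (102, 6.9), (138, 9.3), (99, 6.2), (125, 8.7)]
y = ["Y", "Y", "Y", "N", "N", "Y", "N", "Y"]

print(knn_predict(X, y, (128, 9.0)))  # Y: the three nearest students were placed
print(knn_predict(X, y, (100, 6.5)))  # N: the three nearest students were not
```

Here the output is one of two categories (Y or N), which is what makes this a classification problem.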

We will discuss some examples to determine whether they are Regression or Classification problems:

1) Given house data, you need to predict the price of a house → Regression Problem

2) Given email data, you need to predict whether an email is spam or not → Classification Problem

3) Given weather data, you need to predict whether it will rain today or not → Classification Problem

4) Given an image, you need to determine whether a dog is present or not → Classification Problem

5) Given an image, you need to predict how many dogs are present → Regression Problem

6) Predicting a person's age based on their height and weight → Regression Problem

7) Predicting the number of tickets sold for a concert based on past data → Regression Problem

8) Determining whether a loan applicant will default or not → Classification Problem

9) Estimating the fuel efficiency (miles per gallon) of a car based on engine size → Regression Problem
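The rule behind these answers (numerical output means Regression, categorical output means Classification) can be written as a small helper. This is a rough sketch: `problem_type` is a hypothetical function, and the sample output values are made up for illustration.

```python
def problem_type(output_values):
    """Return 'Regression' if every output is a number, else 'Classification'.

    Caveat: categories stored as numbers (e.g. 0/1 for No/Yes) would be
    misread as numerical, so real data needs a human check of the column.
    """
    def is_number(value):
        # bool is a subclass of int in Python, so exclude it explicitly
        return isinstance(value, (int, float)) and not isinstance(value, bool)

    if all(is_number(v) for v in output_values):
        return "Regression"
    return "Classification"


print(problem_type([250000.0, 310500.0, 189900.0]))  # house prices
print(problem_type(["spam", "not spam", "spam"]))    # email labels
print(problem_type([0, 2, 1, 3]))                    # number of dogs in an image
print(problem_type(["Y", "N", "Y"]))                 # placed or not
```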

 

 

 
