Day 3
Let's start with Random Forest.
1. It combines the output of multiple decision trees to reach a single result.
2. It handles both regression and classification problems, so we won't run into the issues we encountered with the Ordinary Least Squares method.
3. It is made of many decision trees, but I am yet to learn decision trees.
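To see the two points above in action, here's a minimal sketch using scikit-learn (my assumption — the post doesn't name a library) on its built-in iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load a small classification dataset and hold out a test split
X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# 100 decision trees each vote; their combined output is the single result
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(clf.score(X_te, y_te))  # accuracy on the held-out data
```

For the regression case, `RandomForestRegressor` offers the same API.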
Let's step back and learn decision trees first.
1. Similar to Random Forest, a decision tree can handle both regression and classification.
Let's dive into some math before we start:
1. Entropy:
Measures the impurity or disorder of a set of data. High entropy means the data is more mixed up (e.g., equal numbers of different classes), while low entropy means it's more pure (mostly one class).
2. Information Gain
It is the decrease in entropy achieved by splitting the data on a particular attribute. A decision tree splits on the attribute that gives the highest information gain, as this leads to the most informative splits.
Formula for Entropy:
Entropy(S) = - Σ (p_i * log2(p_i))
where:
- S is the set of examples
- p_i is the proportion of examples in S that belong to class i
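The entropy formula can be computed straight from a list of class labels; a small sketch (the function name `entropy` is my own choice):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy(S) = -Σ p_i * log2(p_i), where p_i is the proportion of class i."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

# A 50/50 mix is maximally impure for two classes: entropy = 1.0
print(entropy(["yes", "yes", "no", "no"]))  # → 1.0
# A pure set has entropy 0
print(entropy(["yes", "yes", "yes", "yes"]))  # → 0.0
```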
Formula for Information Gain:
InformationGain(S, A) = Entropy(S) - Σ ((|S_v| / |S|) * Entropy(S_v))
where:
- S is the set of examples
- A is the attribute we're considering splitting on
- S_v is the subset of examples in S that have value v for attribute A
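The information-gain formula follows the same pattern: partition S by the values of attribute A, then subtract the weighted entropies of the subsets. A sketch (function names are my own):

```python
from collections import Counter, defaultdict
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Gain(S, A) = Entropy(S) - Σ (|S_v| / |S|) * Entropy(S_v).

    `values` holds each example's value for attribute A; `labels` its class.
    """
    subsets = defaultdict(list)
    for v, y in zip(values, labels):
        subsets[v].append(y)  # group labels into the subsets S_v
    n = len(labels)
    weighted = sum(len(s) / n * entropy(s) for s in subsets.values())
    return entropy(labels) - weighted

# Splitting on an attribute that perfectly separates the classes
# recovers all of the entropy:
print(information_gain(["sunny", "sunny", "rain", "rain"],
                       ["no", "no", "yes", "yes"]))  # → 1.0
```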
3. Gini Impurity
The lower the Gini impurity, the purer the node.
Formula for Gini Impurity:
Gini(S) = 1 - Σ (p_i)^2
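Gini impurity is even simpler to compute than entropy, since there's no logarithm. A sketch of the formula above:

```python
from collections import Counter

def gini(labels):
    """Gini(S) = 1 - Σ (p_i)^2; 0 for a pure node."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

print(gini(["a", "a", "a", "a"]))  # pure node → 0.0
print(gini(["a", "a", "b", "b"]))  # 50/50 mix → 0.5
```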
4. Attribute Selection
For every node, a decision tree uses information gain or Gini impurity to choose the best attribute to split on. The goal is to maximize the purity of the resulting child nodes.
5. Recursive Partitioning
Decision trees build their structure by recursively splitting the data based on the chosen
attributes. This process continues until a stopping criterion is met (e.g., maximum depth,
minimum number of samples per leaf).
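Attribute selection and recursive partitioning can be combined into one small from-scratch tree builder. This is a toy sketch under my own assumptions (Gini as the criterion, numeric features, thresholds of the form `feature <= t`), not a production implementation:

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Try every (feature, threshold) pair; keep the one with lowest weighted Gini."""
    best, best_score, n = None, gini(labels), len(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = len(left) / n * gini(left) + len(right) / n * gini(right)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3, min_samples=2):
    # Stopping criteria: pure node, maximum depth, or too few samples
    if gini(labels) == 0 or depth == max_depth or len(labels) < min_samples:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    # Recursive partitioning: build a subtree for each side of the split
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left],
                       depth + 1, max_depth, min_samples),
            build_tree([r for r, _ in right], [y for _, y in right],
                       depth + 1, max_depth, min_samples))

def predict(tree, row):
    while isinstance(tree, tuple):  # internal node: (feature, threshold, left, right)
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree  # leaf: a class label

tree = build_tree([[1], [2], [8], [9]], ["a", "a", "b", "b"])
print(predict(tree, [1.5]), predict(tree, [8.0]))
```

A Random Forest would then train many such trees on random subsets of the rows and features, and combine their predictions.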