Sébastien De Greef committed · commit 89d0322 · 1 parent: 7cade5b

feat: Add sklearn to requirements.txt and include transfer learning and early stopping in theory section
Files changed:

- requirements.txt (+2 -1)
- src/_quarto.yml (+4 -0)
- src/theory/early_stopping.qmd (+63 -0)
- src/theory/transfer_learning.qmd (+69 -0)
requirements.txt
CHANGED
@@ -2,4 +2,5 @@ pandas
 seaborn
 jupyter
 torch
-matplotlib
+matplotlib
+sklearn
src/_quarto.yml
CHANGED
@@ -76,6 +76,10 @@ website:
         text: "Underfitting"
       - href: theory/hyperparameter_tuning.qmd
         text: "Hyperparameter Tuning"
+      - href: theory/transfer_learning.qmd
+        text: "Transfer Learning"
+      - href: theory/early_stopping.qmd
+        text: "Early Stopping"
 
       - href: theory/perplexity_in_ai.qmd
         text: "Perplexity and Quantization"
src/theory/early_stopping.qmd
ADDED
@@ -0,0 +1,63 @@
# Early Stopping in AI Training: Maximizing Efficiency for Better Results

Early stopping is a technique used during the training of artificial intelligence (AI) models to prevent overfitting and improve final performance. It works by monitoring the model's performance on a validation set and terminating training when that performance stops improving or starts to deteriorate. In this article, we explore the concept of early stopping, its benefits, how to implement it effectively in Python, and how to visualize its impact.
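
Before looking at framework support, it helps to see the idea as a plain training loop. The following is a minimal, framework-agnostic sketch; `train_one_epoch`, `evaluate`, `model`, `train_data`, and `val_data` are hypothetical placeholders for your own training and validation steps:

```python
# Minimal early-stopping sketch. train_one_epoch() and evaluate() are
# assumed placeholders for one training pass and one validation pass.
best_val_loss = float("inf")
epochs_without_improvement = 0
patience = 5  # number of non-improving epochs tolerated before stopping

for epoch in range(100):
    train_one_epoch(model, train_data)
    val_loss = evaluate(model, val_data)
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        epochs_without_improvement = 0
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            print(f"Stopping early after epoch {epoch}")
            break
```
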
## Understanding Overfitting and Early Stopping

Overfitting occurs when a model learns the training data too well, capturing noise and random fluctuations instead of the underlying patterns, which leads to poor performance on new, unseen data. Early stopping mitigates overfitting and improves a model's ability to generalize. The snippet below creates the small, noisy regression dataset used in the examples that follow:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Generating a simple dataset with noise
np.random.seed(0)
X = np.linspace(-1, 1, 20).reshape(-1, 1)
y = np.sin(X).ravel() + np.random.normal(scale=0.3, size=len(X))
```

## Implementing Early Stopping in Python

TensorFlow and Keras provide early stopping as a built-in callback that is passed to `model.fit()`. Here's how to set it up:

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping

# Define a simple model architecture
model = Sequential([Dense(1, input_shape=(1,))])
model.compile(optimizer='adam', loss='mse')

# Set up early stopping callback
early_stopper = EarlyStopping(monitor='val_loss', patience=5)

# Train the model with early stopping
history = model.fit(X, y, epochs=100, validation_split=0.2, callbacks=[early_stopper])
```
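
In practice it is often useful to pass `restore_best_weights=True` to `EarlyStopping`, so that when training stops the model keeps the weights from its best epoch rather than the last one; `patience` controls how many epochs without improvement are tolerated before stopping.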

## Visualizing Early Stopping Effectiveness

To see how early stopping behaves during training, we can plot the training and validation losses recorded in the `history` object returned by `model.fit()`:

```python
import matplotlib.pyplot as plt

# Plotting training and validation losses
plt.figure(figsize=(12, 6))
val_losses = history.history['val_loss']
train_losses = history.history['loss']
epochs = range(len(train_losses))

plt.plot(epochs, train_losses, 'b', label='Training loss')
plt.plot(epochs, val_losses, 'r', label='Validation loss')
plt.title('Training and validation losses over time')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()
```

## Conclusion

Early stopping is a valuable technique for AI training: it prevents overfitting by ending training as soon as validation performance stops improving or begins to deteriorate. With callbacks such as `EarlyStopping`, it can be added to TensorFlow/Keras training in a few lines, and plotting the training and validation losses makes its effect easy to verify.

By incorporating early stopping into your machine learning workflows, you take a significant step towards building robust, high-performing models that are better suited for real-world applications.

src/theory/transfer_learning.qmd
ADDED
@@ -0,0 +1,69 @@
# Transfer Learning: Techniques and Applications

## Introduction

Transfer learning is a technique in machine learning that leverages knowledge from one problem domain to improve learning efficiency or performance on another, related domain. It has reshaped deep learning by allowing models to achieve state-of-the-art results with less data and fewer computational resources.

## What is Transfer Learning?

At its core, transfer learning involves two tasks: a source task (domain A) for which abundant labeled data exists, and a target task (domain B) for which data is limited or noisy. The goal is to reuse the knowledge gained from solving the source task to improve performance on the target task.

## Techniques for Transfer Learning

Transfer learning techniques can be broadly classified into two categories: **fine-tuning** and **feature extraction**.

### Fine-Tuning

Fine-tuning takes a model pre-trained on a large dataset and continues training it on the target task. The most common approach is to replace the final layer(s) of the network with new layers tailored to the target problem while keeping the earlier layers frozen.

Here's an example using Keras/TensorFlow:

```python
from tensorflow import keras

num_classes = 10  # number of classes in the target task (example value)

# Load a pre-trained model (ResNet50 trained on ImageNet) without its classification head
pretrained_model = keras.applications.ResNet50(weights="imagenet", include_top=False, input_shape=(224, 224, 3))

# Freeze the first 18 layers so their pre-trained weights are not updated
for layer in pretrained_model.layers[:18]:
    layer.trainable = False

# Add new classification head for target task
x = keras.layers.GlobalAveragePooling2D()(pretrained_model.output)
x = keras.layers.Dense(units=num_classes, activation="softmax")(x)
final_model = keras.models.Model(inputs=pretrained_model.input, outputs=x)
```
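
Once the new head is attached, the model can be compiled and trained on the target data. A minimal sketch, assuming hypothetical `train_ds` and `val_ds` `tf.data.Dataset` pipelines that yield preprocessed 224x224 images and one-hot labels:

```python
# Sketch only: train_ds and val_ds are assumed tf.data.Dataset pipelines.
# A small learning rate is typical when fine-tuning, so the pre-trained
# weights are adjusted gently rather than overwritten.
final_model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-4),
                    loss="categorical_crossentropy",
                    metrics=["accuracy"])
final_model.fit(train_ds, validation_data=val_ds, epochs=10)
```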

### Feature Extraction

In feature-extraction-based transfer learning, the pre-trained model is used as a fixed feature extractor: its weights are frozen, and its outputs serve as inputs to a new classifier trained from scratch for the target task. This approach does not modify the original network; it simply reuses the features the network has already learned.

Here's an example using Keras/TensorFlow:

```python
from tensorflow import keras

num_classes = 10  # number of classes in the target task (example value)

# Load pre-trained model (ResNet50) and freeze it so it acts as a fixed feature extractor
pretrained_model = keras.applications.ResNet50(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
pretrained_model.trainable = False

# Feature maps produced by the frozen backbone
features = pretrained_model.output

# Pool the features and add a new classification head for the target task
x = keras.layers.GlobalAveragePooling2D()(features)
x = keras.layers.Dense(units=num_classes, activation="softmax")(x)
final_model = keras.models.Model(inputs=pretrained_model.input, outputs=x)
```
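
A common variant of this workflow is to run the frozen backbone over the dataset once and train a lightweight classifier on the extracted feature vectors, for example with scikit-learn (the `sklearn` dependency this commit adds). A minimal sketch, assuming hypothetical `X_train` (a preprocessed 224x224 image array) and `y_train` (integer labels):

```python
from sklearn.linear_model import LogisticRegression

# Sketch only: X_train is assumed to have shape (n, 224, 224, 3) and to be
# preprocessed with keras.applications.resnet50.preprocess_input;
# y_train holds integer class labels.
feature_maps = pretrained_model.predict(X_train)   # shape (n, 7, 7, 2048)
feature_vectors = feature_maps.mean(axis=(1, 2))   # global average pool -> (n, 2048)

clf = LogisticRegression(max_iter=1000)
clf.fit(feature_vectors, y_train)
```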

## Benefits of Transfer Learning

Transfer learning offers several advantages:

1. **Reduced Data Requirement**: By leveraging models pre-trained on large datasets (e.g., ImageNet), transfer learning achieves high performance even with limited labeled data in the target domain.
2. **Faster Convergence**: Because much of the model is already trained, training times are significantly shorter than when building a network from scratch.
3. **Improved Performance**: Reusing knowledge from related tasks or domains often leads to better generalization and accuracy.

65 |
+
|
66 |
+
## Conclusion
|
67 |
+
|
68 |
+
Transfer learning has transformed machine learning applications across various fields such as computer vision, natural language processing, and speech recognition. By understanding the techniques of fine-tuning and feature extraction, developers can effectively apply transfer learning to their problems, saving time and resources while achieving impressive results.
|
69 |
+
|