Sébastien De Greef committed · Commit 0bc83a1 · Parent(s): a430533

feat: Update website format and remove unused files
Files changed:
- src/_quarto.yml (+0 −2)
- src/theory/hyperparameter_tuning.qmd (+87 −0)
src/_quarto.yml
CHANGED
@@ -108,8 +108,6 @@ website:
       contents:
         - href: vision/tasks.qmd
           text: "Tasks"
-        - href: math.qmd
-          text: "Math"
 
   - section: "Tools and Frameworks"
     contents:
src/theory/hyperparameter_tuning.qmd
CHANGED
@@ -74,6 +74,95 @@ model = SomeModel()
 bayes_search = BayesSearchCV(estimator=model, search_spaces=param_space)
 ```
 
+### 4. Gradient-based Optimization
+
+Gradient-based optimization techniques leverage the gradients of the loss function with respect to hyperparameters to find the optimal values. These methods are often more efficient than exhaustive search because they follow the direction of steepest descent, making them suitable for continuous hyperparameters.
+
+One popular approach in this category is the gradient descent algorithm, which iteratively adjusts hyperparameters in the direction that reduces the loss function. Another is hypergradient descent, which extends this idea to also adapt the learning rate during optimization.
+
+```python
+import tensorflow as tf
+from tensorflow.keras.optimizers import Adam
+
+# Example of gradient-based optimization using TensorFlow
+model = SomeModel()
+optimizer = Adam(learning_rate=0.01)
+
+# Define a training step
+@tf.function
+def train_step(data, labels):
+    with tf.GradientTape() as tape:
+        predictions = model(data)
+        loss = compute_loss(labels, predictions)
+    gradients = tape.gradient(loss, model.trainable_variables)
+    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
+```
+
+### 5. Evolutionary Algorithms
+
+Evolutionary algorithms (EAs) draw inspiration from natural evolution to optimize hyperparameters. These algorithms use mechanisms such as selection, mutation, and crossover to evolve a population of candidate solutions over several generations. Techniques like Genetic Algorithms (GA) and Covariance Matrix Adaptation Evolution Strategy (CMA-ES) are prominent examples.
+
+EAs can efficiently explore complex hyperparameter spaces and are particularly useful when the objective function is noisy or has multiple local optima.
+
+```python
+import random
+from deap import base, creator, tools, algorithms
+
+# Define fitness function and individual
+creator.create("FitnessMax", base.Fitness, weights=(1.0,))
+creator.create("Individual", list, fitness=creator.FitnessMax)
+
+# Register evolutionary operations
+toolbox = base.Toolbox()
+toolbox.register("attr_float", random.random)
+toolbox.register("individual", tools.initRepeat, creator.Individual, toolbox.attr_float, n=10)
+toolbox.register("population", tools.initRepeat, list, toolbox.individual)
+toolbox.register("evaluate", evaluate_model)
+toolbox.register("mate", tools.cxTwoPoint)
+toolbox.register("mutate", tools.mutGaussian, mu=0, sigma=1, indpb=0.2)
+toolbox.register("select", tools.selTournament, tournsize=3)
+
+# Perform the evolutionary algorithm
+population = toolbox.population(n=50)
+algorithms.eaSimple(population, toolbox, cxpb=0.5, mutpb=0.2, ngen=40, stats=None, halloffame=None, verbose=True)
+```
+
+### 6. Population Based Training (PBT)
+
+Population Based Training (PBT) is a hybrid optimization technique that combines elements of evolutionary algorithms and hyperparameter tuning. PBT maintains a population of models with different hyperparameters, periodically replacing poor-performing models with better-performing ones and mutating hyperparameters to explore new configurations.
+
+PBT is particularly effective for large-scale models and tasks that require significant computational resources, as it allows multiple models to be evaluated and optimized in parallel.
+
+```python
+import numpy as np
+import tensorflow as tf
+from tensorflow.keras.optimizers import Adam
+
+# Define a simple PBT loop
+population_size = 10
+generations = 5
+population = [SomeModel() for _ in range(population_size)]
+optimizers = [Adam(learning_rate=0.01) for _ in range(population_size)]
+
+for generation in range(generations):
+    # Train each model in the population
+    for i in range(population_size):
+        for data, labels in dataset:
+            with tf.GradientTape() as tape:
+                predictions = population[i](data)
+                loss = compute_loss(labels, predictions)
+            gradients = tape.gradient(loss, population[i].trainable_variables)
+            optimizers[i].apply_gradients(zip(gradients, population[i].trainable_variables))
+
+    # Evaluate performance and apply selection/mutation
+    scores = [evaluate_model(model) for model in population]
+    best_indices = np.argsort(scores)[-population_size // 2:]
+    worst_indices = np.argsort(scores)[:population_size // 2]
+
+    for i in range(len(worst_indices)):
+        population[worst_indices[i]] = population[best_indices[i % len(best_indices)]]
+        optimizers[worst_indices[i]].learning_rate = optimizers[best_indices[i % len(best_indices)]].learning_rate * (0.8 + 0.4 * np.random.rand())
+```
+
+### Conclusion
+
+Hyperparameter tuning is an essential step in building effective machine learning models. By using techniques like grid search, random search, Bayesian optimization, gradient-based optimization, evolutionary algorithms, and Population Based Training (PBT), we can find the best hyperparameters for our models and improve their performance on unseen data. Each method has its strengths and is suited to different types of problems and computational constraints, making it important to choose the appropriate technique for a given task.
 ### Conclusion
 
 Hyperparameter tuning is an essential step in building effective machine learning models. By using techniques like grid search, random search, or Bayesian optimization, we can find the best hyperparameters for our model and improve its performance on unseen data.
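As an aside, the exploit/explore step at the heart of the PBT loop added in this commit can be isolated in a few lines of dependency-free Python. This is a minimal sketch, not the commit's code: the dict-of-hyperparameters representation and the `pbt_exploit_explore` helper are illustrative, though the 0.8–1.2 perturbation range mirrors the factor used in the commit's PBT snippet.

```python
import random

def pbt_exploit_explore(population, scores, rng=random.Random(0)):
    """Replace the worst half of a population with perturbed copies of the best half.

    population: list of dicts of hyperparameters; scores: matching fitness
    values (higher is better). Mutates and returns `population`.
    """
    order = sorted(range(len(population)), key=lambda i: scores[i])
    half = len(population) // 2
    worst, best = order[:half], order[-half:]
    for w, b in zip(worst, best):
        copied = dict(population[b])           # exploit: copy a top performer
        factor = 0.8 + 0.4 * rng.random()      # explore: perturb in [0.8, 1.2)
        copied["learning_rate"] *= factor
        population[w] = copied
    return population

# Toy usage: members 0 and 2 score worst, so they inherit perturbed
# copies of members 3 and 1 respectively.
pop = [{"learning_rate": 0.01 * (i + 1)} for i in range(4)]
pop = pbt_exploit_explore(pop, scores=[0.2, 0.9, 0.4, 0.7])
```

In a real PBT run this step would also copy model weights alongside the hyperparameters, as the commit's TensorFlow loop does by replacing whole models.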
|