Jonathan Lorraine

lorraine2

AI & ML interests

machine learning, computer vision, generative AI

lorraine2's activity

posted an update 3 months ago
🚨 Code now available for "Using Large Language Models for Hyperparameter Optimization" at https://github.com/michaelrzhang/LLM-HyperOpt 🚨

TL;DR: You can just ask LLMs which hyperparameters to use, and it works pretty well! You can even directly optimize your model’s code as a hyperparameter with this.

Check out the paper at https://arxiv.org/abs/2312.04528, written with Michael Zhang, Nishkrit Desai, Juhan Bae, and Jimmy Ba.
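For a rough sense of the interface, here is a minimal sketch of asking an LLM for the next hyperparameter configuration. It is illustrative rather than the paper’s exact protocol: the `openai` client, model name, task description, and JSON keys are all assumptions.

```python
# Minimal sketch: ask an LLM for the next hyperparameter configuration.
# Assumes the `openai` package (v1+) and an OPENAI_API_KEY in the environment;
# the model name, task description, and JSON schema are illustrative.
import json
from openai import OpenAI

client = OpenAI()

def suggest_hyperparameters(history):
    """Ask the LLM for a new configuration, given (config, score) pairs so far."""
    prompt = (
        "You are tuning a CNN on CIFAR-10. Past trials (config and val acc):\n"
        f"{json.dumps(history, indent=2)}\n"
        "Propose the next configuration as JSON with keys "
        '"learning_rate", "weight_decay", and "batch_size". Reply with JSON only.'
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return json.loads(response.choices[0].message.content)

history = [{"config": {"learning_rate": 0.1, "weight_decay": 1e-4,
                       "batch_size": 128}, "val_acc": 0.71}]
next_config = suggest_hyperparameters(history)
```

Each suggested configuration would then be trained and its score appended to `history` for the next round of suggestions.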
posted an update 4 months ago
⚡ My PhD thesis, “Scalable Nested Optimization for Deep Learning,” is now on arXiv! ⚡

tl;dr: We develop various optimization tools; highlights include:
· Making the momentum coefficient complex for adversarial games like GANs.
· Optimizing millions of hyperparameters using implicit differentiation (sketched after the link below).
· Tuning hyperparameters using hypernetworks.
· Differentiably finding bifurcations in optimization for diverse solutions.

https://arxiv.org/abs/2407.01526
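As a taste of the implicit-differentiation highlight, here is a simplified PyTorch sketch of the hypergradient from the implicit function theorem, with a truncated Neumann series standing in for the inverse training Hessian. The function names, number of Neumann steps `k`, and step size `alpha` are illustrative choices, not the thesis’s exact code.

```python
# Simplified sketch: hypergradient via the implicit function theorem, with a
# truncated Neumann series approximating the inverse training Hessian.
import torch

def hypergradient(train_loss, val_loss, params, hparams, k=3, alpha=0.1):
    # v = dL_val/dw, the validation gradient w.r.t. model weights
    v = torch.autograd.grad(val_loss, params, retain_graph=True)
    # dL_train/dw, with a graph so we can take second derivatives through it
    dtrain_dw = torch.autograd.grad(train_loss, params, create_graph=True)
    # Neumann approximation of H^{-1} v: p ≈ alpha * Σ_j (I - alpha·H)^j v
    cur = [alpha * vi for vi in v]
    p = [ci.clone() for ci in cur]
    for _ in range(k):
        hvp = torch.autograd.grad(dtrain_dw, params, grad_outputs=cur,
                                  retain_graph=True)  # Hessian-vector product
        cur = [ci - alpha * hi for ci, hi in zip(cur, hvp)]
        p = [pi + ci for pi, ci in zip(p, cur)]
    # Indirect term of the hypergradient: -(∂²L_train/∂λ∂w)ᵀ p
    indirect = torch.autograd.grad(dtrain_dw, hparams, grad_outputs=p)
    return [-g for g in indirect]
```

Here `train_loss` must depend on `hparams` (e.g., per-parameter weight-decay coefficients), and the returned gradients can be fed to any optimizer over the hyperparameters.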
posted an update 5 months ago
New #NVIDIA paper: Improving Hyperparameter Optimization with Checkpointed Model Weights

Hyperparameter optimization often dominates the cost of model design. So, we want cheap surrogate functions that approximate model performance to guide our search. Existing methods can train on optimization metadata – like a trajectory of losses – to build these surrogates.

In our work, we add the ability to train our hyperparameter optimization surrogates on checkpointed model weights with a graph metanetwork. This allows us to leverage a large, pre-existing source of information that can featurize the architecture, dataset, losses, and optimization procedure.
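As a deliberately crude illustration of the surrogate idea (not our graph metanetwork), one can featurize each checkpoint with per-tensor weight statistics and fit an off-the-shelf regressor. Below, `runs` and `new_checkpoint` are assumed to be given, and all runs are assumed to share one architecture.

```python
# Toy illustration: featurize checkpointed weights and regress performance.
# The paper uses a graph metanetwork; simple per-tensor statistics stand in
# here as a deliberately crude substitute. Assumes `runs` (a list of
# (state_dict, final_val_acc) pairs from one shared architecture) and
# `new_checkpoint` are given.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def featurize_checkpoint(state_dict):
    """Per-tensor summary statistics: a stand-in for a learned metanetwork."""
    feats = []
    for tensor in state_dict.values():
        t = tensor.float()
        feats += [t.mean().item(), t.std().item(), t.norm().item()]
    return np.array(feats)

X = np.stack([featurize_checkpoint(sd) for sd, _ in runs])
y = np.array([acc for _, acc in runs])
surrogate = RandomForestRegressor().fit(X, y)

# Score a candidate's early checkpoint to decide whether to keep training it:
predicted_acc = surrogate.predict(featurize_checkpoint(new_checkpoint)[None])[0]
```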

🔍 Project page: https://research.nvidia.com/labs/toronto-ai/FMS/
👨‍💻 Code for reproduction: https://github.com/NVlabs/forecasting-model-search
📄 Full Paper: https://arxiv.org/abs/2406.18630

Our project was a collaboration between NVIDIA’s Toronto AI Lab and the TAO team.

Check out more work from Toronto AI Lab here: https://research.nvidia.com/labs/toronto-ai/

You can view the TAO toolkit here: https://developer.nvidia.com/tao-toolkit
replied to their post 8 months ago

We include a narrated 30-second summary video here. Our project webpage additionally has a video demonstrating our model's usage and a 3-minute overview explaining our method.

posted an update 8 months ago
New NVIDIA GTC24 paper 🎊

We generate high-quality 3D assets in only 400ms from text by combining (a) amortized optimization for speed, (b) surface rendering for quality, and (c) 3D data for robustness.
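To make ingredient (a) concrete, here is a toy sketch of amortized optimization: instead of optimizing a 3D representation per prompt, a single network is trained across many prompts to map text embeddings to 3D parameters, so test-time generation is one forward pass. Everything below (`prompt_embeddings`, `rendering_loss`, `embed`, the sizes) is a placeholder, not LATTE3D’s actual architecture or objective.

```python
# Toy sketch of amortized text-to-3D: one network maps text embeddings to 3D
# parameters and is trained across many prompts, so test-time generation is a
# single forward pass rather than a per-prompt optimization loop.
import torch
import torch.nn as nn

text_dim, shape_dim = 512, 4096  # assumed embedding / 3D-parameter sizes

amortized_generator = nn.Sequential(
    nn.Linear(text_dim, 1024), nn.ReLU(), nn.Linear(1024, shape_dim)
)
opt = torch.optim.Adam(amortized_generator.parameters(), lr=1e-4)

for text_emb in prompt_embeddings:  # train over many prompts jointly
    shape_params = amortized_generator(text_emb)
    loss = rendering_loss(shape_params, text_emb)  # e.g. an SDS-style loss (placeholder)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Test time: no optimization loop, just one forward pass per new prompt.
new_shape = amortized_generator(embed("a ceramic teapot"))  # embed() assumed
```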

☕ LATTE3D project details: https://research.nvidia.com/labs/toronto-ai/LATTE3D/