RE: I’d like to get a better understanding of orders of magnitude for compute in deep learning.
Understanding orders of magnitude of compute in deep learning comes down to three main factors:

1. **Data size**: Deep learning often uses large amounts of data, and larger datasets require more compute. For instance, training a model on a dataset of 10,000 images demands far less compute than training on a dataset of 1 million images.

2. **Model complexity**: Models with more layers and/or larger layers demand more compute. A small neural network might train comfortably on a personal computer, while a large transformer model like GPT-3 needs a large cluster of accelerators.

3. **Iterations**: Training for many epochs or iterations multiplies the cost, and so does hyperparameter search, since each setting you try effectively multiplies the resources needed.

For a quick order-of-magnitude estimate before you train anything, a back-of-envelope calculation goes a long way (see the first sketch below). To see where compute actually goes once you are training, I recommend a tool like TensorFlow's Profiler, which visualizes the time and memory spent in each part of your model (see the second sketch below).

Keep in mind this is a simplified picture. The real numbers depend on the hardware you're using, other tasks the machine is performing, and optimizations you can make to your model. So always be ready to experiment, profile, and optimize!
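To put rough numbers on factors 1-3, here is a minimal back-of-envelope sketch. It uses the common rule of thumb that training a dense transformer-style model costs roughly 6 FLOPs per parameter per training token (forward plus backward pass); the parameter count, token count, and sustained throughput below are illustrative assumptions, not measurements.

```python
# Back-of-envelope training compute estimate (rule of thumb: ~6 FLOPs
# per parameter per training token for dense transformer-style models).

def training_flops(num_params: float, num_tokens: float, epochs: int = 1) -> float:
    """Approximate total training FLOPs: ~6 * parameters * tokens * epochs."""
    return 6.0 * num_params * num_tokens * epochs


# Illustrative example: a 125M-parameter model trained on 2B tokens for 1 epoch.
flops = training_flops(num_params=125e6, num_tokens=2e9)
print(f"~{flops:.1e} total FLOPs")  # ~1.5e+18

# Convert to wall-clock time on one accelerator sustaining 100 TFLOP/s
# (an assumed utilization figure; real sustained throughput is often lower).
sustained = 100e12
print(f"~{flops / sustained / 3600:.1f} hours on one such device")  # ~4.2 hours
```

Doubling the dataset, the parameter count, the number of epochs, or the number of hyperparameter settings you sweep each roughly doubles this figure, which is exactly why those three factors dominate the order of magnitude.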
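Once you are actually training, the Profiler shows where that budget goes. Here is a minimal, self-contained sketch using Keras's TensorBoard callback to capture a profile of a training batch; the toy data and model are placeholders, so swap in your own.

```python
import tensorflow as tf

# Toy data and model, just so the example runs end to end.
x = tf.random.normal((1024, 32))
y = tf.random.uniform((1024,), maxval=10, dtype=tf.int32)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10),
])
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)

# Profile the 5th training batch; traces are written under ./logs and can be
# inspected in TensorBoard's "Profile" tab (step times, op-level breakdown,
# memory usage).
tensorboard_cb = tf.keras.callbacks.TensorBoard(log_dir="logs", profile_batch=5)
model.fit(x, y, epochs=1, batch_size=32, callbacks=[tensorboard_cb])
```

Then run `tensorboard --logdir logs` and open the Profile tab to see where time and memory are being spent.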