Half precision
In this episode I talk about the reduced-precision floating-point formats float16 (aka half precision) and bfloat16. I'll discuss what floating-point numbers are, how these two formats differ, and some of the practical considerations that arise when you are working with numeric code in PyTorch that also needs to work in reduced precision. Did you know that we do all CUDA computations in float32, even if the source tensors are stored as float16? Now you know!
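To make the difference between the two formats concrete, here is a minimal sketch in plain Python (no PyTorch required). It assumes the usual layouts: float16 has 5 exponent bits and 10 mantissa bits, while bfloat16 keeps float32's 8 exponent bits and only 7 mantissa bits. The helper names `to_float16` and `to_bfloat16` are made up for this illustration, and the bfloat16 rounding is emulated by hand, so treat it as an approximation of what real hardware does rather than a reference implementation.

```python
import struct


def to_float16(x: float) -> float:
    """Round x to IEEE 754 binary16 and back, using struct's 'e' format."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_bfloat16(x: float) -> float:
    """Emulate bfloat16 by keeping the top 16 bits of the float32 encoding.

    Uses round-to-nearest-even on the discarded low 16 bits. This sketch is
    only meant for finite values; NaN handling is deliberately left out.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (bits + rounding_bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# Precision: float16 can represent 1 + 2**-10, bfloat16 cannot.
small_step = 1.0 + 2.0 ** -10
print(to_float16(small_step))   # 1.0009765625
print(to_bfloat16(small_step))  # 1.0

# Range: 70000 is beyond float16's largest finite value (65504),
# but bfloat16 can hold it, only coarsely.
try:
    to_float16(70000.0)
    float16_overflowed = False
except OverflowError:
    float16_overflowed = True
print(float16_overflowed)       # True
print(to_bfloat16(70000.0))     # 70144.0
```

In short, float16 spends its bits on precision and bfloat16 spends them on range.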
Further reading.
- The Wikipedia article on IEEE floating point is pretty great https://en.wikipedia.org/wiki/IEEE_754
- How bfloat16 works out when doing training https://arxiv.org/abs/1905.12322
- Definition of acc_type in PyTorch https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/AccumulateType.h
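The reason a wider accumulation type matters can be sketched without any GPU. The example below is an illustration under stated assumptions, not PyTorch's actual kernel code: it rounds to float16 with the standard library's `struct` module, and it uses Python's built-in float (which is float64) as a stand-in for a float32 accumulator. The function names are hypothetical.

```python
import struct


def round_to_float16(x: float) -> float:
    """Round x to IEEE 754 binary16 and back, using struct's 'e' format."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def sum_with_float16_accumulator(values):
    """Sum values, rounding the running total to float16 after every add."""
    acc = 0.0
    for v in values:
        acc = round_to_float16(acc + round_to_float16(v))
    return acc


def sum_with_wide_accumulator(values):
    """Sum float16 inputs in a wider type, rounding to float16 only once."""
    acc = 0.0  # Python float (float64), standing in for a float32 accumulator
    for v in values:
        acc += round_to_float16(v)
    return round_to_float16(acc)


values = [1.0] * 4096
print(sum_with_float16_accumulator(values))  # 2048.0
print(sum_with_wide_accumulator(values))     # 4096.0
```

The float16 accumulator stalls at 2048 because the spacing between adjacent float16 values there is 2, so adding 1 rounds straight back to 2048. Accumulating in a wider type and rounding once at the end avoids that.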