Docs: fp16 page #404
The documentation is not available anymore as the PR was closed or merged.
LGTM. float16 made all the difference for me in terms of being able to run it on my hardware; it's good to have examples like this of its use. A few other details to consider (but could also follow in a future update):
Looks good to me - feel free to merge. Left some suggestions as comments :-)
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Explained by @patrickvonplaten after a suggestion by @keturn. Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
…to docs-optim-fp16
Added a couple of tweaks on top of your suggestions. Thanks a lot @patrickvonplaten and @keturn, very useful observations!
Very nice!
* Initial version of `fp16` page.
* Fix typo in README.
* Change titles of fp16 section in toctree.
* PR suggestion
  Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* PR suggestion
  Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Clarify attention slicing is useful even for batches of 1
  Explained by @patrickvonplaten after a suggestion by @keturn.
  Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Do not talk about `batches` in `enable_attention_slicing`.
* Use Tip (just for fun), add link to method.
* Comment about fp16 results looking the same as float32 in practice.
* Style: docstring line wrapping.

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
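For readers skimming this thread, the memory and accuracy trade-off behind the `fp16` page can be sketched with the Python standard library alone. This is only an illustration of half-precision storage, not the code added in this PR and not the library's API: the `roundtrip` helper and the sample `weight` value are hypothetical names chosen for the example.

```python
import struct


def roundtrip(value: float, fmt: str) -> float:
    """Pack a float into the given IEEE 754 format and unpack it again."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


# Storage size per value: "e" is half precision, "f" is single precision.
BYTES_FP16 = struct.calcsize("<e")
BYTES_FP32 = struct.calcsize("<f")

# Hypothetical model weight, used only to show the rounding behaviour.
weight = 0.1234567
as_fp16 = roundtrip(weight, "<e")
as_fp32 = roundtrip(weight, "<f")

print(f"bytes per value: fp16={BYTES_FP16}, fp32={BYTES_FP32}")
print(f"fp16 rounding error: {abs(as_fp16 - weight):.2e}")
print(f"fp32 rounding error: {abs(as_fp32 - weight):.2e}")
```

Each value takes half the space in half precision, at the cost of a larger but still small rounding error, which is consistent with the comments above that float16 helps on memory-limited hardware while results look the same as float32 in practice.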
Part of #293.