TorchServe IPEX Blog #2 (#2079)
msaroufim merged 45 commits into pytorch:master from min-jean-cho:minjean/torchserve_with_ipex_2
Conversation
✅ Deploy Preview for pytorch-tutorials-preview ready!
@msaroufim, PR for our second joint TorchServe IPEX blog. Could you please help review? Thanks!
msaroufim left a comment
Fantastic blog (as usual) - thank you @min-jean-cho
svekars left a comment
A few editorial suggestions. Also, we need to fix these links:
/var/lib/jenkins/workspace/intermediate/torchserve_with_ipex.rst:2: WARNING: Duplicate explicit target name: "intel® vtune™ profiler".
/var/lib/jenkins/workspace/intermediate/torchserve_with_ipex_2.rst:3: WARNING: Duplicate explicit target name: "config.properties"
You probably just need to make the duplicated links anonymous by using a double underscore in the link (`` `my link <url>`__ ``).
Grokking PyTorch Intel CPU performance from first principles (Part 2)
=====================================================================

Authors: Min Jean Cho, Jing Xu, Mark Saroufim
Let's make these links to GitHub profiles.
Thanks @svekars for the review. Have done so.
Throughout this blog, we'll use `Top-down Microarchitecture Analysis (TMA) <https://www.intel.com/content/www/us/en/develop/documentation/vtune-cookbook/top/methodologies/top-down-microarchitecture-analysis-method.html>`_ to profile and show that the Back End Bound (Memory Bound, Core Bound) is often the primary bottleneck for under-optimized or under-tuned deep learning workloads, and we'll demonstrate optimization techniques via Intel® Extension for PyTorch* for improving Back End Bound. We'll also use `Intel® VTune™ Profiler's Instrumentation and Tracing Technology (ITT) <https://github.com/pytorch/pytorch/issues/41001>`_ to profile at finer granularity.
*****************
We typically don't add a table of contents like this because it's autogenerated on the right-hand side under Shortcuts. It is difficult to keep a manual TOC like this up to date with any future changes to the content.
I'm fine with either option, but I thought a TOC would help readers when reading the intro, since this tutorial has lots of content. Let me know what you'd prefer!
I suggest removing it, as we don't have one in any other tutorials.
- Intel® Extension for PyTorch* Optimizations
- Intel® Extension for PyTorch* with TorchServe
- Exercise
Where are the prerequisites? In the TOC above they are listed as the first section.
Do we need an explicit "Prerequisites" heading? The TOC is intended more as an at-a-glance summary of the content than as links to the sections.
It would be good to have a prerequisites section listing what the user needs to know and have installed before they can complete this tutorial.
OK. Let me try moving some paragraphs around to have content under each heading.
Co-authored-by: Svetlana Karslioglu <svekars@fb.com>
Thank you @svekars for checking on this. I've fixed the img sizes, could you check if they look fine now?
Could I have a look at this expanded (maybe a screenshot)? I want to see if they show up as I've intended.
May I know the source of this warning, if it's from an internal CI? I've tried fixing the links to the ``my link <url>`__`` form where applicable, but I'm not sure whether that will resolve the warning.
Additionally, let's profile with PyTorch Profiler.

.. figure:: /_static/img/torchserve-ipex-images-2/13.png
   :width: 150%
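For readers following along, a minimal sketch of this profiling step might look like the following. The tiny feed-forward model is a stand-in assumption, not the model used in the tutorial; the ``torch.profiler`` calls are the standard API.

```python
import torch
from torch.profiler import profile, ProfilerActivity

# Hypothetical toy model standing in for the tutorial's workload.
model = torch.nn.Sequential(
    torch.nn.Linear(128, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 64),
)
x = torch.randn(32, 128)

# Collect CPU-side operator timings and input shapes.
with profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
    with torch.no_grad():
        y = model(x)

# Print the operators that dominated CPU time.
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
```

The resulting table attributes time to individual ``aten::`` operators, which is useful for cross-checking the hotspots that VTune's TMA view reports.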
Can we please set this image to not more than 100% width, since it causes a formatting issue? The images should be clickable so the user can enlarge them.
All images have already been changed to less than 100% width. This comment is on an outdated file, prior to that change.
Additionally, let's profile with PyTorch Profiler.

.. figure:: /_static/img/torchserve-ipex-images-2/15.png
   :width: 150%
Can we please set the width to not more than 100%?
When tuning the CPU for optimal performance, it's useful to know where the bottleneck is. Most CPU cores have on-chip Performance Monitoring Units (PMUs). PMUs are dedicated pieces of logic within a CPU core that count specific hardware events as they occur on the system. Examples of these events are Cache Misses or Branch Mispredictions. PMUs are used for Top-down Microarchitecture Analysis (TMA) to identify the bottlenecks. TMA consists of hierarchical levels as shown:

.. figure:: /_static/img/torchserve-ipex-images-2/2.png
   :width: 130%
Please set the width to 100%
1. The feature has to be explicitly enabled with *torch.autograd.profiler.emit_itt()*.
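As a sketch of what enabling this looks like in practice: the context manager below emits ITT task annotations around each operator so VTune can attribute hardware events to individual PyTorch ops; outside VTune it is effectively a no-op, so the snippet runs anywhere. The toy ``Conv2d`` model is an assumption for illustration, not the tutorial's model.

```python
import torch

# Hypothetical toy model; any forward pass works the same way.
model = torch.nn.Conv2d(3, 16, kernel_size=3)
x = torch.randn(1, 3, 32, 32)

# emit_itt() wraps each dispatched operator in an ITT task.
# The annotations only become visible when the process is
# launched under Intel VTune Profiler's ITT collection.
with torch.no_grad(), torch.autograd.profiler.emit_itt():
    y = model(x)

print(tuple(y.shape))
```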
*********************************************
Right, it looks like you have three headings on lines 66 - 71, meaning that TorchServe with Intel® Extension for PyTorch* and Leveraging Advanced Launcher Configuration: Memory Allocator are empty. Can we add a short overview paragraph under each of these headings?
Thanks @svekars for the review, I have updated according to your comments. The updates are:
svekars left a comment
LGTM! Thank you for addressing the comments.
Thanks @svekars for the thorough review; you caught the grammatical errors, and I also realize it's easier to read now with content under each heading. Thanks :) Will you be merging this?
:card_description: A case study on the TorchServe inference framework optimized with Intel® Extension for PyTorch (Part 2).
:image: _static/img/thumbnails/cropped/generic-pytorch-logo.png
:link: intermediate/torchserve_with_ipex_2
:tags: Model-Optimization,Production




No description provided.