Tools and Frameworks for Deep Learning CPU Benchmarks

analystanand | Last Updated: 02 Jan, 2025 | 12 min read

Deep learning has revolutionized the way we solve complex problems, from image recognition to natural language processing. However, while training these models often relies on high-performance GPUs, deploying them effectively in resource-constrained environments such as edge devices or systems with limited hardware presents unique challenges. CPUs, being widely available and cost-efficient, often serve as the backbone for inference in such scenarios. But how do we ensure that models deployed on CPUs deliver optimal performance without compromising accuracy?

This article dives into the benchmarking of deep learning model inference on CPUs, focusing on three critical metrics: latency, CPU utilization, and memory utilization. Using a spam classification example, we explore how popular frameworks like PyTorch, TensorFlow, JAX, ONNX Runtime, and OpenVINO Runtime handle inference workloads. By the end, you’ll have a clear understanding of how to measure performance, optimize deployments, and select the right tools and frameworks for CPU-based inference in resource-constrained environments.

Impact: Optimal inference execution can save a significant amount of money and free up resources for other workloads.

Learning Objectives

  • Understand the role of CPU inference benchmarks in assessing hardware and framework performance for deploying AI models.
  • Learn how to benchmark deep learning inference on CPUs to compare frameworks and optimize computational efficiency.
  • Evaluate PyTorch, TensorFlow, JAX, ONNX Runtime, and OpenVINO Runtime to choose the best for your needs.
  • Master tools like psutil and time to collect accurate performance data and optimize inference.
  • Prepare models, run inference, and measure performance, applying techniques to diverse tasks like image classification and NLP.
  • Identify bottlenecks, optimize models, and enhance performance while managing resources efficiently.

This article was published as a part of the Data Science Blogathon.

Optimizing Inference with Runtime Acceleration

Inference speed is essential for user experience and operational efficiency in machine learning applications. Runtime optimization plays a key role in enhancing this by streamlining execution. Using hardware-accelerated libraries like ONNX Runtime takes advantage of optimizations tailored to specific architectures, reducing latency (time per inference).

Additionally, lightweight model formats such as ONNX minimize overhead, enabling faster loading and execution. Optimized runtimes leverage parallel processing to distribute computation across available CPU cores and improve memory management, ensuring better performance especially on systems with limited resources. This approach makes models faster and more efficient while maintaining accuracy.
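For example, ONNX Runtime exposes session options that control threading and graph-level optimizations on CPU. The snippet below is a minimal sketch under the assumption of a 2-core machine and a model file named model.onnx (both placeholders, not settings used in this article’s benchmarks):

import onnxruntime as ort

# Hypothetical tuning for a small CPU-only machine
sess_options = ort.SessionOptions()
sess_options.intra_op_num_threads = 2  # assumed: match the number of physical cores
sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL

# "model.onnx" is a placeholder path to an exported model
session = ort.InferenceSession("model.onnx", sess_options, providers=["CPUExecutionProvider"])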

Model Inference Performance Metrics

To evaluate the performance of our models, we focus on three key metrics:

Latency

  • Definition: Latency refers to the time it takes for the model to make a prediction after receiving input. It is often measured as the time from sending the input data to receiving the output (prediction).
  • Importance: In real-time or near-real-time applications, high latency leads to delays, which can result in slower responses.
  • Measurement: Latency is typically measured in milliseconds (ms) or seconds (s). Shorter latency means the system is more responsive and efficient, which is crucial for applications requiring immediate decisions or actions, as shown in the snippet below.
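A minimal way to measure latency for a single prediction uses Python's time module; in this sketch, predict and sample are placeholders for your own inference function and input:

import time

start = time.perf_counter()
prediction = predict(sample)  # placeholder inference call
latency_ms = (time.perf_counter() - start) * 1000
print(f"Latency: {latency_ms:.2f} ms")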

CPU Utilization

  • Definition: CPU Utilization is the percentage of the CPU’s processing power that is consumed while performing inference tasks. It tells you how much of the system’s computational resources are being used during model inference.
  • Importance : High CPU usage means that the machine might struggle to handle other tasks concurrently, leading to bottlenecks. Efficient use of CPU resources ensures that the model inference does not monopolize the system resources.
  • Measurement: CPU utilization is typically measured as a percentage (%) of the total available CPU resources. Lower utilization for the same workload generally indicates a more optimized model that uses CPU resources more effectively; the snippet below shows one way to sample it.
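A simple way to sample per-process CPU utilization around an inference call with psutil is sketched below (predict and sample are again placeholders):

import os
import psutil

process = psutil.Process(os.getpid())
process.cpu_percent()  # first call primes the counter and returns 0.0
prediction = predict(sample)  # placeholder inference call
print(f"CPU utilization: {process.cpu_percent():.1f}%")

Note that on a multi-core machine psutil can report values above 100% when a process uses more than one core, which is why some frameworks in the results below exceed 100%.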

Memory Utilization

  • Definition: Memory utilization refers to the amount of RAM used by the model during the inference process. It tracks the memory consumed by the model’s parameters, intermediate computations, and the input data.
  • Importance: Optimizing memory usage is especially critical when deploying models to edge devices or systems with limited memory. High memory consumption can lead to memory overflow, slower processing, or system crashes.
  • Measurement: Memory utilization is measured in megabytes (MB) or gigabytes (GB). Tracking memory consumption at different stages of inference can help identify inefficiencies or memory leaks; see the sketch after this list.
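Memory usage of the current process can be tracked with psutil by reading the resident set size (RSS) before and after inference; predict and sample are placeholders:

import os
import psutil

process = psutil.Process(os.getpid())
mem_before = process.memory_info().rss
prediction = predict(sample)  # placeholder inference call
mem_after = process.memory_info().rss
print(f"Memory delta: {(mem_after - mem_before) / (1024 * 1024):.2f} MB")
print(f"Total RSS: {mem_after / (1024 * 1024):.2f} MB")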

Assumptions and Limitations

To keep this benchmarking study focused and practical, we made the following assumptions and set a few boundaries:

  • Hardware Constraints: The tests are designed to run on a single machine with limited CPU cores. While modern hardware is capable of handling parallel workloads, this setup mirrors the constraints often seen in edge devices or smaller-scale deployments.
  • No Multi-System Parallelization: We didn’t incorporate distributed computing setups or cluster-based solutions. The benchmarks reflect performance under standalone conditions, suitable for single-node environments with limited CPU cores and memory.
  • Scope: The primary focus is on CPU inference performance only. While GPU-based inference is an excellent option for resource-intensive tasks, this benchmarking aims to provide insights into CPU-only setups, which are more common in cost-sensitive or portable applications.

These assumptions ensure the benchmarks remain relevant for developers and teams working with resource-constrained hardware or who need predictable performance without the added complexity of distributed systems.

Tools and Frameworks

We’ll explore the essential tools and frameworks used to benchmark and optimize deep learning model inference on CPUs, providing insights into their capabilities for efficient execution in resource-constrained environments.

Profiling Tools

  • Python time (time library): The time library in Python is a lightweight tool for measuring the execution time of code blocks. By recording start and end timestamps, it helps calculate the time taken for operations like model inference or data processing.
  • psutil (CPU and memory profiling): psutil is a Python library for system monitoring and profiling. It provides real-time data on CPU usage, memory consumption, disk I/O, and more, making it ideal for analyzing resource usage during model training or inference.

Frameworks for Inference

  • TensorFlow : A robust framework for deep learning that is widely used for both training and inference tasks. It offers strong support for various models and deployment strategies.
  • PyTorch: Known for its ease of use and dynamic computation graphs, PyTorch is a popular choice for research and production deployment.
  • ONNX Runtime: An open-source, cross-platform engine for running ONNX (Open Neural Network Exchange) models, providing efficient inference across various hardware and frameworks.
  • JAX : A functional framework focused on high-performance numerical computing and machine learning, offering automatic differentiation and GPU/TPU acceleration.
  • OpenVINO: Optimized for Intel hardware, OpenVINO provides tools for model optimization and deployment on Intel CPUs, GPUs and VPUs.

Hardware Specification and Environment

We are using a GitHub Codespaces virtual machine with the following configuration:

  • Specification of Virtual Machine: 2 cores, 8 GB RAM, and 32 GB storage
  • Python Version: 3.12.1

Install Dependencies

The versions of the packages used are listed below; they primarily include the five deep learning inference libraries: TensorFlow, PyTorch, ONNX Runtime, JAX, and OpenVINO.

!pip install numpy==1.26.4
!pip install torch==2.2.2
!pip install tensorflow==2.16.2
!pip install onnx==1.17.0
!pip install onnxruntime==1.17.0
!pip install jax==0.4.30
!pip install jaxlib==0.4.30
!pip install openvino==2024.6.0
!pip install matplotlib==3.9.3
!pip install pillow==8.3.2
!pip install psutil==5.8.0

Problem Statement and Input Specification

Since model inference consists of performing a few matrix operations between network weights and input data, it doesn’t require model training or datasets. For our benchmarking example, we simulated a standard classification use case. This mirrors common binary classification tasks such as spam detection and loan application decisions (approval or denial). The binary nature of these problems makes them ideal for comparing model performance across different frameworks. This setup reflects real-world systems but allows us to focus on inference performance across frameworks without needing large datasets or pre-trained models.

Problem Statement

The sample task involves predicting whether a given sample is spam or not (or, analogously, whether a loan application is approved or denied), based on a set of input features. This binary classification problem is computationally efficient, allowing for a focused analysis of inference performance without the complexity of multi-class classification tasks.

Input Specification

To simulate real-world email data, we generated random input embeddings. These embeddings mimic the type of data that might be processed by spam filters while avoiding the need for external datasets. This simulated input allows for benchmarking without relying on any specific external dataset, making it ideal for testing model inference times, memory usage, and CPU performance. Alternatively, you can use image classification, NLP, or any other deep learning task to perform this benchmarking process.

Models Architecture and Formats

Model selection is a critical step in benchmarking as it directly influences the inference performance and insights gained from the profiling process. As mentioned in the previous section, for this benchmarking study, we chose a standard Classification use case, which involves identifying whether a given email is spam or not. This task is a straightforward two-class classification problem that is computationally efficient yet provides meaningful results for comparison across frameworks.

Models Architecture for Benchmarking

The model for the Classification task is a Feedforward Neural Network (FNN) designed for binary classification (Spam vs. Not Spam). It consists of the following layers:

  • Input Layer: Accepts a vector of size 200 (embedding features). We provide the PyTorch example below; the other frameworks follow the exact same network configuration.
self.fc1 = torch.nn.Linear(200, 128)
  • Hidden Layers : The network has 5 hidden layers, with each successive layer containing fewer units than the previous one.
self.fc2 = torch.nn.Linear(128, 64)
self.fc3 = torch.nn.Linear(64, 32)
self.fc4 = torch.nn.Linear(32, 16)
self.fc5 = torch.nn.Linear(16, 8)
self.fc6 = torch.nn.Linear(8, 1)
  • Output Layer: A single neuron with a sigmoid activation function that outputs a probability between 0 (Not Spam) and 1 (Spam). We use a sigmoid layer as the final output for binary classification.
self.sigmoid = torch.nn.Sigmoid()

The model is simple yet effective for the classification task.

The model architecture diagram used for benchmarking in our use case is shown below:

Figure: Neural network architecture used for benchmarking

Examples of Additional Networks for Benchmarking

  • Image Classification : Models like ResNet-50 (medium complexity) and MobileNet (lightweight) can be added to the benchmark suite for tasks involving image recognition. ResNet-50 offers a balance between computational complexity and accuracy, while MobileNet is optimized for low-resource environments.
  • NLP Tasks : DistilBERT: A smaller, faster variant of the BERT model, suited for natural language understanding tasks.

Model Formats

  • Native Formats: Each framework supports its own native model format, such as .pt for PyTorch and .h5 for TensorFlow. These formats are optimized for their respective frameworks, making them easy to save, load, and deploy within those ecosystems (see the sketch after this list).
  • Unified Format (ONNX): To ensure compatibility across frameworks, we exported the PyTorch model to the ONNX format (model.onnx). ONNX (Open Neural Network Exchange) acts as a bridge, enabling models to be used in other frameworks like PyTorch, TensorFlow, JAX, or OpenVINO without significant modifications. This is especially useful for multi-framework testing and real-world deployment scenarios where interoperability is critical.
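As a reference, saving a model in its native format typically looks like the sketch below; pytorch_model and tensorflow_model refer to the models defined later in this article, and the file names are placeholders:

import torch

# Native PyTorch format (.pt): save the learned parameters (state dict)
torch.save(pytorch_model.state_dict(), "model.pt")

# Native TensorFlow/Keras format (.h5)
tensorflow_model.save("model.h5")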

Benchmarking Workflow

This workflow aims to compare the inference performance of multiple deep learning frameworks (TensorFlow, PyTorch, ONNX, JAX, and OpenVINO) using the classification task. The task involves using randomly generated input data and benchmarking each framework to measure the average time taken for a prediction.

  • Import python packages
  • Disable GPU usage and suppress TensorFlow logging
  • Input data preparation
  • Model Implementations for each framework
  • Benchmarking function definition
  • Model Inference and Benchmarking execution for each framework
  • Visualization and export of Benchmarking Results

Import Necessary Python Packages

To get started with benchmarking deep learning models, we first need to import the essential Python packages that enable seamless integration and performance evaluation.

import time
import os
import numpy as np
import torch
import tensorflow as tf
from tensorflow.keras import Input
import onnxruntime as ort
import matplotlib.pyplot as plt
from PIL import Image
import psutil
import jax
import jax.numpy as jnp
from openvino.runtime import Core
import csv

Disable GPU Usage and Suppress TensorFlow Logging

os.environ["CUDA_VISIBLE_DEVICES"] = "-1" # Disable GPU
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3" #Suppress Tensorflow Log

Input Data Preparation

In this step, we randomly generate input data for spam classification:

  • Dimensionality of a sample (200-dimensional features)
  • The number of classes (2: Spam or Not Spam)

We generate random data using NumPy to serve as input features for the models.

#Generate dummy data
input_data = np.random.rand(1000, 200).astype(np.float32)

Model Definition

In this step, we define the network architecture or set up the model for each deep learning framework (TensorFlow, PyTorch, ONNX, JAX, and OpenVINO). Each framework requires its own method for loading a model and setting it up for inference.

  • PyTorch Model: In PyTorch, we define a simple neural network architecture with six fully connected layers.
  • TensorFlow Model: The TensorFlow model is defined using the Keras API and consists of a simple feedforward neural network for the classification task.
  • JAX Model: The model is initialized with parameters, and the prediction function is compiled using JAX’s Just-in-Time (JIT) compilation for efficient execution.
  • ONNX Model: For ONNX, we export the model from PyTorch. After exporting to the ONNX format, we load the model using the onnxruntime.InferenceSession API. This allows us to run inference on the model across different hardware specifications.
  • OpenVINO Model: OpenVINO is used for optimizing and deploying models, particularly those trained in other frameworks (such as PyTorch or TensorFlow). We load the ONNX model and compile it with OpenVINO’s runtime.

PyTorch

class PyTorchModel(torch.nn.Module):
    def __init__(self):
        super(PyTorchModel, self).__init__()
        self.fc1 = torch.nn.Linear(200, 128)
        self.fc2 = torch.nn.Linear(128, 64)
        self.fc3 = torch.nn.Linear(64, 32)
        self.fc4 = torch.nn.Linear(32, 16)
        self.fc5 = torch.nn.Linear(16, 8)
        self.fc6 = torch.nn.Linear(8, 1)
        self.sigmoid = torch.nn.Sigmoid()

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = torch.relu(self.fc3(x))
        x = torch.relu(self.fc4(x))
        x = torch.relu(self.fc5(x))
        x = self.sigmoid(self.fc6(x))
        return x
        
# Create the PyTorch model
pytorch_model = PyTorchModel()

TensorFlow

tensorflow_model = tf.keras.Sequential([
    Input(shape=(200,)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(8, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
tensorflow_model.compile()

JAX

@jax.jit  # JIT-compile the prediction function, as described above
def jax_model(x):
    x = jax.nn.relu(jnp.dot(x, jnp.ones((200, 128))))
    x = jax.nn.relu(jnp.dot(x, jnp.ones((128, 64))))
    x = jax.nn.relu(jnp.dot(x, jnp.ones((64, 32))))
    x = jax.nn.relu(jnp.dot(x, jnp.ones((32, 16))))
    x = jax.nn.relu(jnp.dot(x, jnp.ones((16, 8))))
    x = jax.nn.sigmoid(jnp.dot(x, jnp.ones((8, 1))))
    return x

ONNX

# Convert PyTorch model to ONNX
dummy_input = torch.randn(1, 200)
onnx_model_path = "model.onnx"
torch.onnx.export(
    pytorch_model, 
    dummy_input, 
    onnx_model_path, 
    export_params=True, 
    opset_version=11, 
    input_names=['input'], 
    output_names=['output'], 
    dynamic_axes={'input': {0: 'batch_size'}, 'output': {0: 'batch_size'}}
)

onnx_session = ort.InferenceSession(onnx_model_path)

OpenVINO

# OpenVINO Model Definition
core = Core()
openvino_model = core.read_model(model="model.onnx")
compiled_model = core.compile_model(openvino_model, device_name="CPU")

Benchmarking Function Definition

This function executes benchmarking tests across the different frameworks by taking three arguments: predict_function, input_data, and num_runs. By default, it executes 1,000 runs, but this can be increased as required.

def benchmark_model(predict_function, input_data, num_runs=1000):
    start_time = time.time()
    process = psutil.Process(os.getpid())
    cpu_usage = []
    memory_usage = []
    for _ in range(num_runs):
        predict_function(input_data)
        cpu_usage.append(process.cpu_percent())
        memory_usage.append(process.memory_info().rss)
    end_time = time.time()
    avg_latency = (end_time - start_time) / num_runs
    avg_cpu = np.mean(cpu_usage)
    avg_memory = np.mean(memory_usage) / (1024 * 1024)  # Convert to MB
    return avg_latency, avg_cpu, avg_memory

Model Inference and Benchmarking for Each Framework

Now that we have loaded the models, it’s time to benchmark the performance of each framework. The benchmarking process performs inference on the generated input data.

PyTorch

# Benchmark PyTorch model
def pytorch_predict(input_data):
    pytorch_model(torch.tensor(input_data))

pytorch_latency, pytorch_cpu, pytorch_memory = benchmark_model(lambda x: pytorch_predict(x), input_data)

TensorFlow

# Benchmark TensorFlow model
def tensorflow_predict(input_data):
    tensorflow_model(input_data)

tensorflow_latency, tensorflow_cpu, tensorflow_memory = benchmark_model(lambda x: tensorflow_predict(x), input_data)

JAX

# Benchmark JAX model
def jax_predict(input_data):
    jax_model(jnp.array(input_data))

jax_latency, jax_cpu, jax_memory = benchmark_model(lambda x: jax_predict(x), input_data)

ONNX

# Benchmark ONNX model
def onnx_predict(input_data):
    # Process inputs one sample at a time
    for i in range(input_data.shape[0]):
        single_input = input_data[i:i+1]  # Extract single input
        onnx_session.run(None, {onnx_session.get_inputs()[0].name: single_input})

onnx_latency, onnx_cpu, onnx_memory = benchmark_model(lambda x: onnx_predict(x), input_data)

OpenVINO

# Benchmark OpenVINO model
def openvino_predict(input_data):
    # Process inputs one sample at a time
    for i in range(input_data.shape[0]):
        single_input = input_data[i:i+1]  # Extract single input
        compiled_model.infer_new_request({0: single_input})

openvino_latency, openvino_cpu, openvino_memory = benchmark_model(lambda x: openvino_predict(x), input_data)
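The workflow above listed visualization and export of the benchmarking results as the final step. A minimal sketch using the matplotlib and csv modules imported earlier is shown below; the output file names are placeholders:

frameworks = ["PyTorch", "TensorFlow", "JAX", "ONNX", "OpenVINO"]
latencies = [pytorch_latency, tensorflow_latency, jax_latency, onnx_latency, openvino_latency]
cpu_usages = [pytorch_cpu, tensorflow_cpu, jax_cpu, onnx_cpu, openvino_cpu]
memory_usages = [pytorch_memory, tensorflow_memory, jax_memory, onnx_memory, openvino_memory]

# Export the raw numbers to CSV
with open("benchmark_results.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Framework", "Avg Latency (s)", "Avg CPU Usage (%)", "Avg Memory (MB)"])
    writer.writerows(zip(frameworks, latencies, cpu_usages, memory_usages))

# Plot the three metrics side by side
fig, axes = plt.subplots(1, 3, figsize=(15, 4))
for ax, values, title in zip(
        axes,
        [latencies, cpu_usages, memory_usages],
        ["Avg Latency (s)", "Avg CPU Usage (%)", "Avg Memory (MB)"]):
    ax.bar(frameworks, values)
    ax.set_title(title)
    ax.tick_params(axis="x", rotation=45)
plt.tight_layout()
plt.savefig("framework_comparison.png")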

Results and Discussion

Here we discuss the results of the performance benchmarking of the deep learning frameworks mentioned above. We compare them on latency, CPU usage, and memory usage, and include tabular data and plots for quick comparison.

Latency Comparison

Framework     Latency (ms)    Relative Latency (vs. PyTorch)
PyTorch       1.26            1.00× (baseline)
TensorFlow    6.61            ~5.25×
JAX           3.15            ~2.50×
ONNX          14.75           ~11.72×
OpenVINO      144.84          ~115×

Insights:

  • PyTorch leads as the fastest framework with ~1.26 ms latency.
  • TensorFlow has ~6.61 ms latency, about 5.25× PyTorch’s time.
  • JAX sits between PyTorch and TensorFlow in absolute latency.
  • ONNX is relatively slow as well, at ~14.75 ms.
  • OpenVINO is the slowest in this experiment, at ~145 ms (115× slower than PyTorch).

CPU Usage

Framework     CPU Usage (%)   Relative CPU Usage (vs. OpenVINO)
PyTorch       99.79           ~1.00×
TensorFlow    112.26          ~1.13×
JAX           130.03          ~1.31×
ONNX          99.58           ~1.00×
OpenVINO      99.32           1.00× (baseline)

Insights:

  • JAX uses the most CPU (~130%), about 31% more than OpenVINO.
  • TensorFlow sits at ~112%, higher than PyTorch, ONNX, and OpenVINO but still lower than JAX.
  • PyTorch, ONNX, and OpenVINO all show similar CPU usage of ~99–100%.

Memory Usage

Framework     Memory (MB)     Relative Memory Usage (vs. PyTorch)
PyTorch       ~959.69         1.00× (baseline)
TensorFlow    ~969.72         ~1.01×
JAX           ~1033.63        ~1.08×
ONNX          ~1033.82        ~1.08×
OpenVINO      ~1040.80        ~1.08–1.09×

Insights:

  • PyTorch and TensorFlow have similar memory usage of around 960–970 MB.
  • JAX, ONNX, and OpenVINO use around ~1,030–1,040 MB of memory, approximately 8–9% more than PyTorch.

Here is the plot comparing the Performance of Deep Learning Frameworks:

Figure: Comparison of deep learning inference frameworks across latency, CPU usage, and memory usage

Conclusion

In this article, we presented a comprehensive benchmarking workflow to evaluate the inference performance of prominent deep learning frameworks—TensorFlow, PyTorch, ONNX, JAX, and OpenVINO—using a spam classification task as a reference. By analyzing key metrics such as latency, CPU usage and memory consumption, the results highlighted the trade-offs between frameworks and their suitability for different deployment scenarios.

PyTorch demonstrated the most balanced performance, excelling in low latency and efficient memory usage, making it ideal for latency-sensitive applications like real-time predictions and recommendation systems. TensorFlow provided a middle-ground solution with moderately higher resource consumption. JAX showcased high computational throughput but at the cost of increased CPU utilization, which might be a limiting factor for resource-constrained environments. Meanwhile, ONNX and OpenVINO lagged in latency, with OpenVINO’s performance particularly hindered by the absence of hardware acceleration.

These findings underline the importance of aligning framework selection with deployment needs. Whether optimizing for speed, resource efficiency, or specific hardware, understanding the trade-offs is essential for effective model deployment in real-world environments.

Key Takeaways

  • CPU inference benchmarks provide critical insights into framework performance, aiding in selecting the right stack for deployment on resource-constrained hardware.
  • Measuring latency, CPU utilization, and memory utilization together exposes trade-offs that no single metric would reveal.
  • PyTorch achieved the best latency (1.26 ms) and maintained efficient memory usage, making it ideal for real-time and resource-limited applications.
  • TensorFlow balanced latency (6.61 ms) with slightly higher CPU usage, suitable for tasks that can tolerate moderate performance compromises.
  • JAX delivered competitive latency (3.15 ms) but at the cost of excessive CPU utilization (~130%), limiting its utility in constrained setups.
  • ONNX Runtime showed higher latency (14.75 ms), but its cross-platform support makes it flexible for multi-framework deployments.

Frequently Asked Questions

Q1. Why is PyTorch preferred for real-time applications?

A. PyTorch’s dynamic computation graph and efficient execution pipeline allow for low-latency inference (1.26 ms), making it well-suited for applications like recommendation systems and real-time predictions.

Q2. What affected OpenVINO’s performance in this study?

A. OpenVINO’s optimizations are designed for Intel hardware. Without this acceleration, its latency (144.84 ms) and memory usage (1040.8 MB) were less competitive compared to other frameworks.

Q3. How do I choose a framework for resource-constrained environments?

A. For CPU-only setups, PyTorch is the most efficient. TensorFlow is a strong alternative for moderate workloads. Avoid frameworks like JAX unless higher CPU utilization is acceptable.

Q4. What role does hardware play in framework performance?

A. Framework performance depends heavily on hardware compatibility. For instance, OpenVINO excels on Intel CPUs with hardware-specific optimizations, while PyTorch and TensorFlow perform consistently across varied setups.

Q5. Can benchmarking results differ with complex models or tasks?

A. Yes, these results reflect a simple binary classification task. Performance could vary with more complex architectures such as ResNet or with NLP tasks, where frameworks may leverage specialized optimizations.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.
