This package includes an inference demo console script. The script also provides benchmarking and accuracy-checking features that developers can use to verify that ...
Optimizing the deployment of Large Language Models (LLMs) is expensive today since it requires experimentally running an application workload against an LLM implementation while exploring large ...
Abstract: Many artificial intelligence applications based on convolutional neural networks are directly deployed on mobile devices to avoid network unavailability and user privacy leakage. However, ...
Abstract: Edge-device co-inference, which concerns the cooperation between edge devices and an edge server for completing inference tasks over wireless networks, has been a promising technique for ...