Just drop in your AI code and run your algorithms. Deploy any deep learning model from any deep learning framework end-to-end with ease. Harness the power of DEEPX's NPU today and take your deep learning experience to the next level.
DXNN® - DEEPX NPU Software (SDK)
DEEPX's DXNN software framework is an all-in-one solution that streamlines the deployment of deep learning
models into DEEPX's AI SoCs. DXNN comprises two essential components: the NPU compiler, DX-COM,
and the NPU runtime system software, DX-RT. DX-COM delivers the tools necessary for high-performance
quantization, model optimization, and NPU inference compilation. Meanwhile, DX-RT includes the NPU
device driver, runtime with APIs, and NPU firmware. With DXNN, you can easily and efficiently deploy
deep learning models, unlocking the full potential of DEEPX's AI SoCs.
DXNN® Key features
NPU COMPILER FOR DNN MODELS
The DEEPX compiler compiles trained DNN models into binaries for the
DEEPX NPU. The result is execution code optimized for accuracy, latency,
throughput, and efficiency. The execution binary uses the NPU's compute
resources efficiently to optimize power consumption, processing
performance, memory bandwidth, and memory footprint. The tool explores
many different schedules of NPU operations and picks the best one to
generate an optimized runtime.
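To make the idea of schedule exploration concrete, here is a minimal sketch, not DX-COM's actual algorithm. It assumes a hypothetical five-operator graph (`GRAPH`), a hypothetical cost model (peak size of live tensors), and an exhaustive search that only works for tiny graphs; a production compiler would use heuristics and a far richer cost model.

```python
from itertools import permutations

# Hypothetical toy graph: op name -> (input ops, output tensor size in KB).
GRAPH = {
    "a": ((), 4),
    "b": (("a",), 8),
    "c": (("a",), 2),
    "d": (("b",), 1),
    "e": (("c", "d"), 1),
}

def is_valid(order, graph):
    """A schedule is valid if every op runs after all of its inputs."""
    seen = set()
    for op in order:
        if any(dep not in seen for dep in graph[op][0]):
            return False
        seen.add(op)
    return True

def peak_memory(order, graph):
    """Peak total size of tensors that are live at any step of the schedule."""
    position = {op: i for i, op in enumerate(order)}
    # A tensor stays live until its last consumer has executed.
    last_use = dict(position)
    for op in order:
        for dep in graph[op][0]:
            last_use[dep] = max(last_use[dep], position[op])
    peak = 0
    for step in range(len(order)):
        live = sum(graph[o][1] for o in order[: step + 1] if last_use[o] >= step)
        peak = max(peak, live)
    return peak

def best_schedule(graph):
    """Enumerate every valid schedule and keep the one with the lowest peak."""
    candidates = [o for o in permutations(graph) if is_valid(o, graph)]
    return min(candidates, key=lambda o: peak_memory(o, graph))

schedule = best_schedule(GRAPH)
print(schedule, peak_memory(schedule, GRAPH))
```

In this toy graph, running `d` before `c` frees the large output of `b` earlier, so that ordering has the smallest peak footprint.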
WORLD'S TOP PERFORMING QUANTIZER
DXNN supports automatic quantization of DNN models trained in floating-point format. It takes a model description and representative inputs and automatically quantizes the model to fixed-point data types, greatly reducing execution time and increasing power efficiency. The SDK's quantizer converts trained models from FP32 to INT8 or lower-bit integer representations. The DXNN quantizer delivers extremely high AI accuracy in NPU solutions: the accuracy of a DXNN-quantized model is close to that of the same model running in FP32 on a GPU, and in some cases even higher!
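The FP32-to-INT8 conversion described above can be illustrated with a generic textbook scheme: symmetric quantization with max-abs calibration. This is a minimal sketch and not the DXNN quantizer; the function names and sample values are made up for illustration.

```python
def calibrate_scale(samples, num_bits=8):
    """Derive a symmetric scale from representative values (max-abs calibration)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for INT8
    max_abs = max(abs(v) for v in samples)
    return max_abs / qmax if max_abs > 0 else 1.0

def quantize(values, scale, num_bits=8):
    """Map floats to integers in [-qmax, qmax], rounding to nearest and clamping."""
    qmax = 2 ** (num_bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]

def dequantize(q_values, scale):
    """Map the integers back to approximate floating-point values."""
    return [q * scale for q in q_values]

# Hypothetical FP32 weights standing in for a trained layer.
weights = [0.82, -1.27, 0.05, 0.40]
scale = calibrate_scale(weights)
q = quantize(weights, scale)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, max_error)
```

For values inside the calibrated range, the round-trip error is bounded by half the scale, which is why the choice of representative inputs matters for accuracy.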
WORLD'S TOP OPTIMIZER STREAMLINES THE MODEL INFERENCE PROCESS
The DXNN optimizer is in charge of optimizing user DNN models. It applies both traditional optimization techniques and emerging graph-level optimization techniques, and can greatly reduce the amount of computation without loss of AI accuracy.
The optimizer aggressively replaces sub-graphs with optimized equivalents, for example by fusing operators or reordering them.
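One well-known instance of operator fusion is folding a batch-normalization step into the linear layer that precedes it, so that two operators become one with identical outputs. The sketch below shows that general technique in plain Python; it is not taken from DXNN, and all names and values are illustrative assumptions.

```python
import math

def linear(x, weights, bias):
    """y = W x + b for a single input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Per-channel inference-time batch normalization."""
    return [g * (v - m) / math.sqrt(s + eps) + bt
            for v, g, bt, m, s in zip(y, gamma, beta, mean, var)]

def fuse_linear_bn(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold the batch-norm parameters into the linear layer's weights and bias."""
    fused_w, fused_b = [], []
    for row, b, g, bt, m, s in zip(weights, bias, gamma, beta, mean, var):
        k = g / math.sqrt(s + eps)  # per-channel rescaling factor
        fused_w.append([w * k for w in row])
        fused_b.append((b - m) * k + bt)
    return fused_w, fused_b

# Hypothetical layer parameters and input.
W = [[0.5, -1.0], [2.0, 0.25]]
b = [0.1, -0.3]
gamma, beta = [1.5, 0.8], [0.2, -0.1]
mean, var = [0.05, 0.4], [0.9, 1.6]
x = [1.0, 2.0]

unfused = batch_norm(linear(x, W, b), gamma, beta, mean, var)
fused_W, fused_b = fuse_linear_bn(W, b, gamma, beta, mean, var)
fused = linear(x, fused_W, fused_b)
print(unfused, fused)
```

The fused layer performs one matrix-vector product instead of a product followed by a normalization pass, and its outputs match the unfused pair up to floating-point rounding.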
USER FRIENDLY HOST COMMUNICATION AND RUNTIME
The SDK includes Linux (x86/Arm) and Windows (x86) drivers that support
communication between the host and DEEPX NPU. DEEPX’s runtime API
supports commands for model loading, inference execution, passing model
inputs, receiving inference data, and a set of functions to manage the devices.