
    NVIDIA expands its Ampere GPU lineup with the A2 Tensor Core GPU accelerator

    NVIDIA, the world’s largest producer of graphics cards and a leading player in data centre GPUs, is expanding its professional data centre lineup of Ampere GPUs with the A2 Tensor Core GPU accelerator.

    According to sources, the A2 is the most entry-level data centre design we have seen from NVIDIA so far, and its specifications are decent for that market segment.

    In terms of specifications, the NVIDIA A2 Tensor Core GPU is designed specifically for inferencing and will replace the Turing-powered T4 Tensor Core GPU. It features a variant of the Ampere GA107 GPU SKU with 1280 CUDA cores and 40 Tensor cores.

    The GPU cores run at a clock speed of 1.77 GHz and are built on the Samsung 8nm process node; only the higher-end GA100 GPU SKUs are built on the TSMC 7nm process node. Memory-wise, the A2 comes with 16 GB of GDDR6 on a 128-bit wide bus interface, clocked at an effective 12.5 Gbps for a total bandwidth of 200 GB/s.
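
    The quoted bandwidth and the FP32 figure in the table below can be cross-checked from the numbers above. The following is a minimal sketch, not NVIDIA's own methodology: the function names are made up for illustration, and the FP32 estimate assumes each CUDA core retires one fused multiply-add (2 FLOPs) per clock, which is the usual convention for peak figures.

```python
# Figures quoted in the article for the A2 (GA107 variant).
CUDA_CORES = 1280
BOOST_CLOCK_GHZ = 1.77
BUS_WIDTH_BITS = 128
DATA_RATE_GBPS = 12.5  # effective per-pin data rate


def memory_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s: pins * Gbit/s per pin / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


def fp32_tflops(cuda_cores, boost_clock_ghz, flops_per_core_per_clock=2):
    """Peak FP32 throughput in TFLOPS.

    Assumption: one FMA (counted as 2 FLOPs) per CUDA core per clock.
    """
    return cuda_cores * flops_per_core_per_clock * boost_clock_ghz / 1000


if __name__ == "__main__":
    bw = memory_bandwidth_gbs(BUS_WIDTH_BITS, DATA_RATE_GBPS)
    tf = fp32_tflops(CUDA_CORES, BOOST_CLOCK_GHZ)
    print(f"Memory bandwidth: {bw:.1f} GB/s")
    print(f"Peak FP32: {tf:.2f} TFLOPS")
```

    Under these assumptions the sketch gives 200 GB/s and roughly 4.5 TFLOPS, in line with the figures quoted for the A2.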

    NVIDIA’s new A2 Tensor Core GPU operates at a TDP between 40 and 60 Watts and comes in a passively cooled, half-height, half-length form factor. It also uses a PCIe Gen 4.0 x8 interface instead of the standard x16 link.
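
    To put the x8 link in perspective, here is a rough sketch of the theoretical one-direction bandwidth of a PCIe Gen 4.0 link. The signalling rate (16 GT/s per lane) and the 128b/130b encoding come from the PCIe specification rather than from the article, the function name is hypothetical, and the result ignores packet and protocol overhead, so real-world throughput will be lower.

```python
# PCIe 4.0 signalling parameters (from the PCIe specification, not the article).
GEN4_GT_PER_S = 16.0             # giga-transfers per second, per lane
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line encoding


def pcie_gen4_bandwidth_gbs(lanes):
    """Approximate one-direction link bandwidth in GB/s, before protocol overhead."""
    return GEN4_GT_PER_S * ENCODING_EFFICIENCY * lanes / 8


if __name__ == "__main__":
    for lanes in (8, 16):
        print(f"PCIe 4.0 x{lanes}: {pcie_gen4_bandwidth_gbs(lanes):.2f} GB/s per direction")
```

    By this estimate the A2's x8 link offers about half the host bandwidth of an x16 card, which is still well below its own 200 GB/s of on-board memory bandwidth.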

    NVIDIA Ampere Professional GPU Lineup:

    GPU Name | A100 | A40 | A30 | A16 | A10 | A2
    Process Node | TSMC 7nm | Samsung 8nm | TSMC 7nm | Samsung 8nm | Samsung 8nm | Samsung 8nm
    GPU SKU | GA100-884 | GA102-895 | GA100-890 | 4x GA107 | GA102-890 | GA107
    GPU Transistors | 54.2B | 28.3B | 54.2B | TBA | 28.3B | TBA
    CUDA Cores | 6912 | 10752 | 3584 | 2560 x4 | 9216 | 1280
    Tensor Cores | 432 | 336 | 224 | 80 x4 | 288 | 40
    Boost Clock | 1.41 GHz | 1.74 GHz | 1.44 GHz | 1.69 GHz | 1.69 GHz | 1.77 GHz
    FP32 Compute | 19.49 TFLOPS | 37.42 TFLOPS | 10.32 TFLOPS | 8.678 TFLOPS x4 | 31.24 TFLOPS | 4.5 TFLOPS
    FP64 Compute | 9.74 TFLOPS | 1.16 TFLOPS | 5.16 TFLOPS | 0.27 TFLOPS x4 | 0.97 TFLOPS | 0.14 TFLOPS
    FP16 Compute | 77.97 TFLOPS | 37.42 TFLOPS | 10.32 TFLOPS | 8.67 TFLOPS x4 | 31.24 TFLOPS | 4.5 TFLOPS
    INT8 Tensor Compute | 624 TOPS | 598.6 TOPS | 330 TOPS | TBA | 500 TOPS | 36 TOPS
    TF32 Tensor Compute | 156 TFLOPS | 149.6 TFLOPS | 82 TFLOPS | TBA | 125 TFLOPS | 9 TFLOPS
    Interconnect | NVLink 3 (12 Links) | PCIe 4.0 x16 | PCIe 4.0 x16 + NVLink 3 (4 Links) | PCIe 4.0 x16 | PCIe 4.0 x16 | PCIe 4.0 x8
    Memory Capacity | 40 GB HBM2e | 48 GB GDDR6 | 24 GB HBM2e | 16 GB x4 GDDR6 | 24 GB GDDR6 | 16 GB GDDR6
    Memory Bus | 5120-bit | 384-bit | 3072-bit | 128-bit x4 | 384-bit | 128-bit
    Memory Clock | 1215 MHz | 1812 MHz | 1215 MHz | 1812 MHz | 1563 MHz | 1563 MHz
    Bandwidth | 1.55 TB/s | 695.8 GB/s | 933.1 GB/s | 231.9 GB/s x4 | 600.2 GB/s | 200 GB/s
    TDP | 400W | 300W | 165W | 250W | 150W | 60W
    Form Factor | SXM4 | PCIe Dual Slot, Full Length | PCIe Dual Slot, Full Length | PCIe Dual Slot, Full Length | PCIe Single Slot, FLHH | PCIe Single Slot, HHHL
