ARM Neoverse N1 Processor Architecture: Features, Working & Its Applications

The ARM Neoverse processors are a family of 64-bit ARM processor cores. The ARM Neoverse platform was introduced to extend ARM’s low-power architecture to infrastructure computing, and its cores are designed for high-performance computing, edge computing, and data center applications. The family includes the V-Series, N-Series, and E-Series, each optimized for a different balance of power efficiency and performance. These processor cores are known for scalability, high performance, and for serving as a base for custom silicon aimed at current infrastructure workloads. This article examines the ARM Neoverse N1 processor, its operation, and its applications.


What is ARM Neoverse N1?

The ARM Neoverse N1 is the first-generation Neoverse CPU core, designed for infrastructure and cloud applications such as 5G and cloud computing. It provides power efficiency and high performance for networking, edge computing, and cloud-native workloads. This processor combines strong single-thread performance and server-class features with advanced low-power design methods. Thus, it delivers high performance per watt for next-generation cloud-to-edge infrastructure.

This processor platform comprises the N1 CPU and the CMN-600 coherent mesh interconnect, which connects multiple cores and custom accelerators. So it provides scalability and server-class features for high core counts, and it can be combined with Arm’s Ethos AI processors to build high-performance, power-efficient platforms.

ARM Neoverse N1 Processor

How does ARM Neoverse N1 Work?

The Arm Neoverse N1 core executes instructions on a high-frequency, out-of-order pipeline derived from the Cortex-A76 and tuned for server-grade workloads. A multi-level cache hierarchy with advanced data prefetchers improves memory access efficiency. The core features an L1 D-cache, a coherent L1 I-cache, a private unified L2 cache, and an optional shared SLC (System Level Cache) within the system interconnect. In addition, the ARM Neoverse N1 architecture supports the ARMv8.2-A instruction set, which includes features like efficient virtualization, power management, and server-class RAS for edge and cloud infrastructure.

The ARM Neoverse N1 is designed for server-class workloads and cloud-to-edge infrastructure. It pairs with the CMN-600 interconnect for extreme scalability, allowing systems with 8 to more than 128 cores. This architecture provides high performance per watt for storage, security, AI/ML, and networking applications.

ARM Neoverse N1 Architecture

The Arm Neoverse N1 uses a CPU microarchitecture designed for high-performance infrastructure applications like data centers, edge computing, and networking. It is derived from the Cortex-A76 core. This first-generation Neoverse product implements the ARMv8.2-A instruction set and scales through a mesh interconnect to 128 or more cores per socket. The processor provides significant performance and power-efficiency gains over earlier architectures. Thus, it supports a range of compute solutions from the cloud to the network edge.

ARM Neoverse N1 Architecture

Components

The Arm Neoverse N1 is an Armv8.2-A core with features like a 48-bit physical address range, a superscalar pipeline, optional cryptographic extensions, and TrustZone. Its system components mainly include the CoreSight SoC-400, the CoreLink MMU-600, and the CoreLink NIC-450. All these components are connected through a high-performance interconnect designed for infrastructure markets, with a focus on performance per watt for edge and cloud applications.

Memory Management Unit or MMU

The memory management unit or MMU of the Arm Neoverse N1 translates software-generated virtual addresses to physical addresses. This enables features like address relocation and controlled access permissions and attributes for memory regions. It uses a two-level TLB (Translation Lookaside Buffer) to cache recent address translations, which accelerates the process. The MMU is essential for current operating systems because it isolates the memory of different tasks. Thus, applications can run without knowledge of the physical memory layout.
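The two-level TLB lookup described above can be sketched in Python. This is a simplified model and not the N1 implementation: the 4 KB page size, the TLB capacities, the LRU replacement, and the flat dictionary standing in for the page-table walk are all assumptions chosen for illustration, and the class names `Tlb` and `Mmu` are hypothetical.

```python
from collections import OrderedDict

PAGE_SHIFT = 12  # assumption: 4 KB pages


class Tlb:
    """A small LRU cache of virtual-page -> physical-page mappings."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, vpn):
        if vpn in self.entries:
            self.entries.move_to_end(vpn)  # mark as most recently used
            return self.entries[vpn]
        return None

    def insert(self, vpn, ppn):
        self.entries[vpn] = ppn
        self.entries.move_to_end(vpn)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used


class Mmu:
    """Translate addresses through an L1 TLB, an L2 TLB, then a table walk."""

    def __init__(self, page_table, l1_size=4, l2_size=16):
        self.page_table = page_table  # dict vpn -> ppn, stands in for the walk
        self.l1 = Tlb(l1_size)
        self.l2 = Tlb(l2_size)
        self.stats = {"l1_hit": 0, "l2_hit": 0, "walk": 0}

    def translate(self, vaddr):
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        ppn = self.l1.lookup(vpn)
        if ppn is not None:
            self.stats["l1_hit"] += 1
        else:
            ppn = self.l2.lookup(vpn)
            if ppn is not None:
                self.stats["l2_hit"] += 1
            else:
                if vpn not in self.page_table:
                    raise LookupError(f"translation fault at {vaddr:#x}")
                ppn = self.page_table[vpn]
                self.stats["walk"] += 1
                self.l2.insert(vpn, ppn)
            self.l1.insert(vpn, ppn)  # refill L1 on any L1 miss
        return (ppn << PAGE_SHIFT) | offset


mmu = Mmu({0x10: 0x80, 0x11: 0x81})
print(hex(mmu.translate(0x10123)))  # first access needs a table walk
print(hex(mmu.translate(0x10456)))  # same page, served from the L1 TLB
print(mmu.stats)
```

The second access to the same page avoids the walk entirely, which is the speed-up the TLB provides; the page offset is passed through unchanged in both cases.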

Superscalar Out-of-Order Pipeline

This processor has a superscalar, variable-length, out-of-order execution pipeline. It is designed for efficient instruction processing in data center and cloud computing workloads. It pairs a wide in-order front-end with an out-of-order back-end to maximize instruction throughput.
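The benefit of out-of-order issue can be illustrated with a toy scheduler. This sketch is not a model of the N1 pipeline: the issue width of 2, the instruction latencies, and the five-instruction program are hypothetical, and it assumes every instruction writes a unique register (as if register renaming had removed false dependencies).

```python
def run(program, out_of_order, width=2):
    """Return the cycle in which the last result becomes available.

    program is a list of (dest, sources, latency) tuples in program order.
    """
    ready_at = {}  # register -> cycle at which its value is available
    waiting = list(program)
    cycle = finish = 0
    while waiting:
        issued = 0
        stalled = False
        unissued_dests = set()  # results of older instructions not yet issued
        still_waiting = []
        for dest, srcs, latency in waiting:
            can_issue = (
                not stalled
                and issued < width
                and all(
                    s not in unissued_dests and ready_at.get(s, 0) <= cycle
                    for s in srcs
                )
            )
            if can_issue:
                ready_at[dest] = cycle + latency
                finish = max(finish, cycle + latency)
                issued += 1
            else:
                still_waiting.append((dest, srcs, latency))
                unissued_dests.add(dest)
                if not out_of_order:
                    stalled = True  # in-order: nothing younger may pass
        waiting = still_waiting
        cycle += 1
    return finish


# A slow load followed by a dependent add, then an independent chain.
program = [
    ("r1", [], 4),      # load (hypothetical 4-cycle latency)
    ("r2", ["r1"], 1),  # needs the load result
    ("r3", [], 1),      # independent work
    ("r4", ["r3"], 1),
    ("r5", ["r4"], 1),
]

print("in-order cycles:    ", run(program, out_of_order=False))
print("out-of-order cycles:", run(program, out_of_order=True))
```

In the in-order run the independent chain waits behind the stalled add, while the out-of-order run overlaps it with the load latency and finishes sooner.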

Advanced SIMD & Floating-Point Support

This is implemented by an integrated execution unit with hardware support for single-precision and double-precision arithmetic and for fused multiply-accumulate instructions. It includes dedicated execution pipelines for SIMD, a set of SIMD and floating-point registers, and a dedicated FPCR (Floating-Point Control Register) for controlling floating-point behavior.
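The value of a fused multiply-accumulate is that the product and sum are rounded once instead of twice. The sketch below shows the numerical difference in Python by emulating the fused operation with exact rational arithmetic; it illustrates the rounding behavior only and does not execute Arm instructions. The helper names are hypothetical.

```python
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """Compute a*b + c exactly, then round once to the nearest double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def separate_multiply_add(a, b, c):
    """Round after the multiply and again after the add."""
    return a * b + c


a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

# The exact product is 1 - 2**-54, which rounds to 1.0 as a double,
# so the unfused version loses the small term completely.
print(separate_multiply_add(a, b, c))
print(fused_multiply_add(a, b, c))
```

With these inputs the unfused result is zero while the fused result keeps the small residual term, which matters in dot products and other accumulation-heavy kernels.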

Armv8.1-A & Armv8.2-A Instruction Sets

This processor implements the Armv8.2-A architecture, which builds on Armv8.1-A, and supports the A64, A32, and T32 instruction sets, where A64 is the main 64-bit instruction set. The optional Cryptographic Extension adds Advanced SIMD instructions to A64 that accelerate AES and SHA operations.

RAS Extensions

The Arm Neoverse N1 processor supports RAS (Reliability, Availability, and Serviceability) extensions to enhance system fault handling and data integrity. These include Error Data Record registers used for recovery actions, Fault Handling Interrupts and Error Recovery Interrupts used for software error management, and error injection for testing purposes. RAS features are critical for hyperscale cloud reliability (five-nines uptime). Additionally, the processor implements cache protection using parity and SECDED (Single Error Correct, Double Error Detect) ECC to safeguard data RAMs, L1 and L2 cache tags, and MMU RAMs.
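The SECDED principle can be shown with a small extended Hamming code. This is a minimal sketch that protects only 4 data bits with an 8-bit codeword; the code width and bit layout are assumptions for illustration, and the ECC actually used on the N1 RAMs covers wider words with a layout this article does not specify.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming (SECDED) codeword."""
    d = [(nibble >> i) & 1 for i in range(4)]
    word = [0] * 8  # index 0 holds the overall parity bit
    # Positions 1..7 form a Hamming(7,4) word; parity bits sit at 1, 2 and 4.
    word[3], word[5], word[6], word[7] = d
    word[1] = word[3] ^ word[5] ^ word[7]
    word[2] = word[3] ^ word[6] ^ word[7]
    word[4] = word[5] ^ word[6] ^ word[7]
    word[0] = sum(word[1:]) & 1
    return word


def decode(word):
    """Return (status, nibble); status is 'ok', 'corrected' or 'uncorrectable'."""
    word = list(word)
    syndrome = 0
    for pos in range(1, 8):
        if word[pos]:
            syndrome ^= pos  # XOR of set-bit positions locates a single error
    overall = sum(word) & 1  # 0 when the total parity is intact
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: treat as a single-bit error and repair it.
        word[syndrome] ^= 1  # syndrome 0 means the overall parity bit itself
        status = "corrected"
    else:
        # Non-zero syndrome with intact overall parity: two bits flipped.
        return "uncorrectable", None
    nibble = word[3] | (word[5] << 1) | (word[6] << 2) | (word[7] << 3)
    return status, nibble


codeword = encode(0b1011)
one_flip = list(codeword)
one_flip[6] ^= 1
two_flips = list(one_flip)
two_flips[2] ^= 1

print(decode(codeword))
print(decode(one_flip))
print(decode(two_flips))
```

A single flipped bit is located and repaired, while two flipped bits are only reported, which is why a double-bit error in a protected RAM is detectable but not correctable.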

Arm TrustZone Technology

It is a hardware-level security technology incorporated into the processor to create two isolated environments: the Secure World and the Non-secure World. The Secure World manages critical operations like authentication and encryption, while the Non-secure World runs the general-purpose operating system and applications. This hardware-enforced isolation keeps sensitive data and code protected from potentially malicious software and provides a secure foundation for devices.

Coherent Mesh Network

The CMN (Coherent Mesh Network) is a highly scalable mesh interconnect, such as the CMN-600 or CMN-700, that connects the cores and other IP. It provides a scalable, high-performance, cache-coherent fabric for connecting accelerators, I/O, and compute cores in a SoC (System-on-Chip). This coherent backplane allows low-latency, high-bandwidth data transfer and supports customizable system architectures that increase power efficiency and compute density for Armv8-A processors.
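One way to reason about mesh scalability is to count hops between crosspoints. The sketch below assumes dimension-ordered (X then Y) routing on a rectangular grid, which is a common mesh policy used here purely for illustration; the article does not state the routing scheme, and the grid sizes are hypothetical.

```python
def xy_route(src, dst):
    """Return the (x, y) crosspoints visited from src to dst, X first."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def average_hops(cols, rows):
    """Average hop count over all ordered pairs of distinct crosspoints."""
    nodes = [(x, y) for x in range(cols) for y in range(rows)]
    pairs = [(s, d) for s in nodes for d in nodes if s != d]
    return sum(len(xy_route(s, d)) - 1 for s, d in pairs) / len(pairs)


print(xy_route((0, 0), (2, 1)))
for size in (2, 4, 8):
    print(f"{size}x{size} mesh: {average_hops(size, size):.2f} average hops")
```

The average distance grows roughly with the side length of the grid rather than with the number of nodes, which is one reason a mesh scales better than a single shared bus as core counts rise.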

CMN600 Interconnect with Core Clusters

Generic Interrupt Controller

The ARM Neoverse N1 platform includes an Arm CoreLink GIC-600, a standard Generic Interrupt Controller (GIC) that manages interrupts from peripherals and routes them to the appropriate CPU core within the system. This GIC, part of Arm’s CoreLink family, is designed for multicore systems and handles priority, state, and routing for shared and private interrupts to ensure efficient processor operation.
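The priority and routing decision can be sketched as follows. This is a simplified software model, not the GIC-600 programming interface: the `Interrupt` record and function names are hypothetical, and breaking ties by lowest interrupt ID is an assumption. It does follow two GIC conventions: a lower priority value means higher priority, and interrupt IDs 0 to 15 are SGIs, 16 to 31 are PPIs (both private to a core), and 32 upward are shared SPIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interrupt:
    intid: int
    priority: int  # lower value = higher priority
    target_core: int


def kind(intid):
    """Classify an interrupt ID as SGI, PPI (private) or SPI (shared)."""
    if intid < 16:
        return "SGI"
    if intid < 32:
        return "PPI"
    return "SPI"


def highest_pending(pending, core, priority_mask=0xFF):
    """Pick the interrupt to signal to a core, or None if all are masked."""
    candidates = [
        i for i in pending
        if i.target_core == core and i.priority < priority_mask
    ]
    if not candidates:
        return None
    # Assumption: ties are broken by the lowest interrupt ID.
    return min(candidates, key=lambda i: (i.priority, i.intid))


pending = [
    Interrupt(intid=40, priority=0xA0, target_core=0),  # shared peripheral
    Interrupt(intid=27, priority=0x20, target_core=0),  # private to core 0
    Interrupt(intid=33, priority=0x10, target_core=1),  # routed to core 1
]

winner = highest_pending(pending, core=0)
print(winner.intid, kind(winner.intid))
print(highest_pending(pending, core=0, priority_mask=0x20))
```

Raising the priority mask on a core (a lower mask value) filters out less urgent interrupts, which is how software defers them during critical sections.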

Memory Controllers

The DMC-620 is a memory controller that interfaces with DDR4 memory. This architecture supports dual-channel DDR4 memory, giving the system a 2x 72-bit interface. Together with the CMN-600 interconnect and other components, these memory controllers are designed to handle the fast, large memory accesses required by infrastructure-focused applications.
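A quick calculation shows what a 2x 72-bit interface means in practice. In a typical ECC DIMM arrangement, each 72-bit channel carries 64 data bits plus 8 ECC bits; that split and the DDR4-3200 speed grade used below are assumptions for illustration, since the article gives neither.

```python
def peak_bandwidth_gb_s(transfers_mt_s, data_bits=64, channels=2):
    """Theoretical peak data bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = data_bits // 8
    return transfers_mt_s * 1e6 * bytes_per_transfer * channels / 1e9


def ecc_overhead(total_bits=72, data_bits=64):
    """Fraction of each transfer spent on ECC check bits."""
    return (total_bits - data_bits) / total_bits


# Assumed speed grade: DDR4-3200 (3200 mega-transfers per second).
print(f"{peak_bandwidth_gb_s(3200):.1f} GB/s peak over two channels")
print(f"{ecc_overhead():.1%} of the bus width carries ECC")
```

Sustained bandwidth is lower than this theoretical peak because of refresh, bank conflicts, and read/write turnaround.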

CoreSight

It is a debug and trace framework used to monitor and debug the system. This framework provides visibility into the processor and system. It includes on-chip trace buffers such as the 32KB TMC_3 and TMC_4, an ETR that routes trace to DDR memory, and a TPIU for off-chip trace. Thus, it allows system-level debugging, cross-triggering among subsystems, and high-bandwidth data collection with a consistent programmer’s model for tool support.

Generic Timers

The generic timers in this processor provide a consistent timer framework. It includes a system-wide System Counter and per-core timers whose comparators use this common count to trigger interrupts or events. This framework allows event scheduling, timestamp generation, and integration with the WFE (Wait for Event) mechanism, and it ensures a consistent view of time across the cores of the system.
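The relationship between the shared counter and the per-core comparators can be modeled in a few lines. This is a behavioral sketch only: the class names and the 100 MHz counter frequency are hypothetical, and real software programs the timers through system registers rather than Python objects.

```python
class SystemCounter:
    """A single free-running count shared by every core."""

    def __init__(self, frequency_hz):
        self.frequency_hz = frequency_hz
        self.count = 0

    def advance(self, ticks):
        self.count += ticks


class CoreTimer:
    """A per-core timer that compares the shared count against a deadline."""

    def __init__(self, counter):
        self.counter = counter
        self.compare_value = None  # None means the timer is not armed

    def set_timeout_us(self, microseconds):
        ticks = microseconds * self.counter.frequency_hz // 1_000_000
        self.compare_value = self.counter.count + ticks

    def condition_met(self):
        return (
            self.compare_value is not None
            and self.counter.count >= self.compare_value
        )


counter = SystemCounter(frequency_hz=100_000_000)  # assumed 100 MHz
core0, core1 = CoreTimer(counter), CoreTimer(counter)
core0.set_timeout_us(1)  # 100 ticks from now
core1.set_timeout_us(3)  # 300 ticks from now

counter.advance(100)
print(core0.condition_met(), core1.condition_met())
counter.advance(200)
print(core0.condition_met(), core1.condition_met())
```

Because both timers read the same counter, timestamps taken on different cores are directly comparable.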

Software System

The software for the ARM Neoverse N1 is an open-source stack for the Neoverse N1 System Development Platform. It comprises SCP-firmware, Trusted Firmware-A, EDK II UEFI, GRUB, Linux (like Ubuntu 18.04), and user-space components. Ubuntu 18.04 support demonstrates ARM’s open-source readiness for cloud-native workloads.

Together with the development tools, these firmware, bootloader, operating system, and user-space components support hardware prototyping, software development, system validation, and performance profiling on a platform designed for cloud-to-edge infrastructure.

Difference between ARM Neoverse N1 and ARM Neoverse N2

The differences between ARM Neoverse N1 and ARM Neoverse N2 are listed below. Overall, N2 offers higher IPC and adds Armv9-A enhancements like SVE2, along with MPAM for better workload partitioning.

ARM Neoverse N1 | ARM Neoverse N2
It is a 64-bit Arm CPU core from the first Neoverse generation. | It is a power-efficient and high-performance Armv9-A CPU core.
This processor is based on the Armv8.2-A architecture. | This processor is based on the Armv9-A architecture.
It has a strong design for infrastructure workloads, with features like large-system core scalability and cache stashing. | It has features like MPAM (Memory Partitioning and Monitoring), which helps manage shared system resources within large-scale deployments and improves SLAs (Service Level Agreements).
It provides a significant performance boost over its predecessors, with ~60% higher performance than the Cortex-A72 core at the same frequency. | It delivers a 40% IPC uplift over the N1, enhancing performance for cloud-to-edge infrastructure.
It is optimized for scalability, power efficiency, and compute density in cloud and edge environments. | It supports SVE2 (Scalable Vector Extension 2), enhanced memory partitioning to handle shared resources, and better cryptography features.
It enables a new generation of cloud-native processors, like AWS Graviton2. | It is designed for 5G infrastructure, edge, and even more demanding scale-out cloud applications, which require higher efficiency and performance.

Advantages

The advantages of ARM Neoverse N1 include the following.

  • The Arm Neoverse N1 processor provides superior performance per watt, outstanding scalability, and strong compute performance. Thus, it allows more work to be finished with less power in edge infrastructure and data centers.
  • It provides significant power savings, which allows lower power consumption and more compute density for edge devices and hyperscale data centers.
  • It provides a highly scalable SoC design that allows a flexible approach to designing systems with high core counts for demanding workloads.
  • It merges strong single-thread performance and server-class features with advanced low-power methods. Thus, it delivers high compute efficiency, with at least 30% performance gains over earlier Arm infrastructure cores.
  • The N1 processor supports higher core counts and core densities, leading to more efficient and powerful systems. Thus, it is particularly helpful for hyperscale cloud environments.

Disadvantages

The disadvantages of ARM Neoverse N1 include the following.

  • It performs worse than x86 alternatives in video encoding tasks, where AMD’s Zen 2 architecture provides significantly better performance.
  • It has limited support for certain performance event counters, which restricts complete performance analysis.
  • It does not provide events for counting retired instructions by type, which hinders detailed architectural instruction-mix analysis.
  • It includes mechanisms to detect and correct errors; however, certain 2-bit errors within protected RAMs may not be correctable, and the impact depends on the particular RAM technology.
  • Uncorrected errors can cause silent data corruption in the system. This reliability concern is common to high-performance computing designs, and the risk is reduced by memory error detection and correction such as system-level ECC.

Applications

The applications of ARM Neoverse N1 include the following.

  • This processor is designed for internet infrastructure, where it enables efficient and scalable cloud-to-edge deployments.
  • It is used in security, networking, storage, data centers, edge computing devices, etc.
  • It is used for general-purpose computing within servers, cloud-native workloads, and infrastructure deployments for AI/ML inference.
  • This processor provides power-efficient and high-performance CPU cores for building cloud infrastructure and scalable data centers.
  • This platform can be used in networking nodes and equipment to support high-speed data traffic and network functions.
  • It is used in AWS Graviton2 and Ampere Altra-based deployments.
  • In addition, the Neoverse N1 processor can be used in storage solutions to provide efficient processing.
  • It is used in heterogeneous systems to supply accelerated functions, mainly for 5G deployments.
  • It suits the demands of infrastructure and edge devices by supporting the shift of capabilities from the cloud to the edge.
  • The Neoverse N1 processor can host accelerators for ML and AI inference applications by integrating with Arm’s Ethos processors.

Conclusion:

The ARM Neoverse N1 processor effectively accelerated the transformation of cloud and edge infrastructure. It provides a power-efficient, scalable, and high-performance Arm architecture and achieves significant improvements over earlier Arm designs for a variety of workloads. It allows lower TCO (Total Cost of Ownership) and greater design diversity, mainly for data center operators and edge deployments. The Neoverse N1 platform became a key building block for next-generation storage, compute, and network processing, and it unlocks innovation across the cloud-to-edge ecosystem. Here is a question for you: What is ARM Neoverse N2?