THE ULTRA SPARC ARCHITECTURE
Here I will be listing out a few of the important architectural features of UltraSPARC.
Hardware Trap Stack: A hardware trap stack is provided to allow nested traps. It contains all of the machine state necessary to return to the previous trap level.
Hyperprivileged Mode: This mode simplifies porting of operating systems and provides robust handing of error conditions.
Extended Instruction Set: It includes many instruction set extensions.
Chip Level Multithreading: It provides a control architecture for highly threaded processor implementations.
Register Window: The SPARC “register window” architecture allows for straightforward, high-performance compilers and a reduction in memory load/store instructions. Some of the windows overlap thereby reducing context switching.
Binary Compatibility: It should be compatible across different architecture for application software down to the binary level.
PROCESSOR ARCHITECTURE
UltraSPARC architecture processor contains two units:
1. IU (Integer Unit)
2. FPU(Floating Point Unit)
Virtual processor or a strand is the hardware containing the state for execution of a thread.
Physical Core is the hardware required to execute instructions.
1. IU
a) IU contains the general purpose registers and controls the overall operation of the virtual processor.
b) It performs integer arithmetic and calculates memory addressing for load/store operations.
c) Maintains program counters and controls instruction execution for FPU.
2. FPU
a) An OpenSPARC FPU has thirty-two 32-bit (single-precision) floating-point registers, thirty-two 64-bit (double-precision) floating-point registers, and sixteen 128-bit (quad-precision) floating-point registers, some of which overlap.
b) If FPU is disabled and if a call for FPU is made then either of the two things might happen:
(i) Enable the FPU and reexecute the trapping instruction
(ii) Emulate the trapping instruction in software
INSTRUCTIONS
Instructions fall into the following basic categories:
- Memory Access
Integer load and store instructions support byte, halfword (16 bit), word (32 bit) and extended-word (64bit) accesses
1. Memory Alignment Restrictions
A memory access on an OpenSPARC virtual processor must typically be aligned on an address boundary greater than or equal to the size of the datum being accessed, else an exception or a trap may be generated.
2. Addressing Convention
An unmodified OpenSPARC processor uses big-endian byte-order by default
3. Addressing Range
An OpenSPARC implementation supports 64 bit virtual address space. The supported range of virtual addresses is limited to two equal sized ranges at the extreme upper and lower ends of 64 bit addresses; i.e. for n bit virtual addresses, the valid address ranges are from 0 to 2^(n-1)- 1 and (2^64) - (2^(n-1)) to (2^64)-1
4. Load/Store Alternate
Versions of load-store instructions can specify an arbitrary 8-bit ASI for the load-store data access.
5. Separate Instruction and Data Memories
The interpretation of addresses in an unmodified OpenSPARC process is split; instruction references use on caching and translation mechanism and data references use another. The same underlying main memory is used.
In such split memory system, the coherency mechanism may be split so that a write into data memory is not immediately reflected in the instruction memory. Therefore self-modifying code must use FLUSH instructions to bring the instruction and data caches in a consistent state.
6. I/O Registers
The UltraSPARC architecture assumes that I/O registers are accessed through load/store alternate instructions, normal load/store instruction or read/write Ancillary State register instructions (RDasr, WRasr)
7. Memory Synchronization
Two instructions are required for synchronization of memory operations: FLUSH and MEMBAR (Memory Barrier).
- Integer Arithmetic/Logical/Shift Instructions
These instructions compute a result that is a function of two source operands; the result is either written into a destination register or is discarded.
Shift instruction shifts the contents of an R register by a given number of bits, which is specified by the constant in the instruction or by the contents in the R register.
- Control Transfer
Control Transfer Instructions (CTIs) include PC-relative branches and calls, register-indirect jumps and conditional traps. Most of the control transfer instructions are delayed i.e. the instruction immediately following a CTI in logical sequence, is dispatched before the control transfer to the target address is completed.
The instruction following a delayed CTI is called a delay instruction. Setting the annul bit in a conditional delayed CTI causes the delay instruction to be annulled if and only if the branch is not taken. Setting the annul bit in an Unconditional delayed CTI causes the delay instruction to be always annulled.
- State Register Access
1. The "read" and "write" ancillary state registers read and write the contents of ancillary state registers visible to non-privileged software.
2. These registers are visible to privileged and hyperprivileged software
3. These are visible to only hyper-privileged software
- Floating Point Operate
- Condition Move
- Register Window Management
- SIMD
TRAPS
A trap is a vectored transfer of control to a privileged or a hyperprivileged software through a trap table which contains the first 8 instructions of each trap handler. The base address of the trap table is contained in the state register. Part of the trap table is reserved for hardware traps and a part for the software traps.
A trap can be caused by an asynchronous request, an exception or a interrupt not directly related to the instruction.
Source: OpenSPARC_Internals_Book OpenSPARCT2_Core_Micro_Arch