Tweaking Barcelona via WPCREDIT.
This is the English version of the article. The Russian version is available here.
In this article I would like to describe one of the most important applications of WPCREDIT: tweaking the integrated A64 memory controllers, and Barcelona in particular. Since Intel desktop chipsets use PCI-E registers, one may ask whether WPCREDIT is still relevant today, a question that came up in the discussion of my previous article about WPCREDIT. AMD processors, however, have the north bridge integrated into the CPU. One of the most important components of the north bridge is the memory controller (DCT), and all of its settings are located in the PCI registers of the north bridge (0,0,0).
So what interesting innovations does the new AMD microarchitecture bring? First of all, there are two memory controllers, each with a 64-bit bus and its own memory channel. This means we can set different timings for different memory channels, although the memory frequency must be the same. The processor may be configured to behave as a single dual-channel DCT, which is called ganged mode, or as two single-channel DCTs, which is called unganged mode. Ganged mode is similar to the behavior of the K8 processor memory controller. Note that, according to AMD terminology, a DRAM channel is the group of the DRAM interface pins that connect to one series of DIMMs. The processor supports two DDR channels. The processor includes two DCTs. Each DCT controls one 64-bit DDR DIMM channel (BIOS and Kernel Developer's Guide (BKDG) For AMD Family 10h Processors, p.60). The picture gives an idea of how the number and type of memory modules affect the maximum supported frequency:
Let's move on to the tweaking itself. First, download the datasheet with the register descriptions. Open WPCREDIT and select the device with DevID-VendorID 1202-1022. The datasheet describes the registers in 32-bit view, so it is better to switch WPCREDIT into the corresponding view; the bit numbering will then match the datasheet. Here is a list of what are, in my opinion, the most interesting settings:
78[31:22]=MaxRdLatency: maximum read latency
78=EarlyArbEn: early arbitration enable
78[13:12]=Trdrd[3:2]: read to read timing
78[11:10]=Twrwr[3:2]: write to write timing
78[9:8]=Twrrd[3:2]: write to read DIMM termination turnaround
84[22:20]=Tcwl: CAS write latency
84[6:4]=Twr: write recovery
88[31:24]=MemClkDis: MEMCLK disable
88[23:22]=Trrd: row to row delay (or RAS to RAS delay)
88[21:20]=Twr: write recovery time.
88[19:16]=Trc: row cycle time.
88[15:12]=Tras: row active strobe
88[11:10]=Trtp: read to precharge time
88[9:7]=Trp: row precharge time
88[6:4]=Trcd: RAS to CAS delay
88[3:0]=Tcl: CAS latency
8C[31:29]=Trfc3: auto-refresh row cycle time for logical DIMM 3
8C[28:26]=Trfc2: auto-refresh row cycle time for logical DIMM 2
8C[25:23]=Trfc1: auto-refresh row cycle time for logical DIMM 1
8C[22:20]=Trfc0: auto-refresh row cycle time for logical DIMM 0
8C[17:16]=Tref: refresh rate
8C[15:14]=Trdrd[1:0]: read to read timing
8C[13:12]=Twrwr[1:0]: write to write timing
8C[11:10]=Twrrd[1:0]: write to read DIMM termination turnaround
8C[9:8]=Twtr: internal DRAM write to read command delay
8C[7:4]=TrwtTO: read to write turnaround for data, DQS contention
8C[3:0]=TrwtWB: read to write turnaround for opportunistic write bursting
90[22:21]=IdleCycLowLimit: idle cycle low limit
90=DynPageCloseEn: dynamic page close enable
90=Width128: width of DRAM interface in 128-bit mode
94[31:28]=FourActWindow[3:0]: four bank activate window
94[27:24]=DcqBypassMax: DRAM controller queue bypass maximum
94=SlowAccessMode: slow access mode (a.k.a. 2T mode)
94=DcqArbBypassEn: DRAM controller arbiter bypass enable
94[2:0]=MemClkFreq: memory clock frequency
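The notation used in the list, such as 88[15:12], names a bit range inside a 32-bit register. The following minimal Python sketch shows one way to read such a range out of a register value, and where the same bits would sit if WPCREDIT were left in byte view. It does not touch any hardware. The function names and the sample value 0x00009002 are hypothetical and chosen only for illustration.

```python
def field_mask(hi, lo):
    """Return a mask covering bits hi..lo (inclusive) of a 32-bit register."""
    if not (0 <= lo <= hi <= 31):
        raise ValueError("expected 0 <= lo <= hi <= 31")
    return ((1 << (hi - lo + 1)) - 1) << lo


def get_field(value, hi, lo):
    """Extract bits hi..lo from a 32-bit register value."""
    return (value & field_mask(hi, lo)) >> lo


def byte_view_location(reg, bit):
    """Map bit `bit` of the 32-bit register at offset `reg` to
    (byte offset, bit within that byte).

    PCI configuration space is little-endian, so bits 7:0 of the
    32-bit value are stored in the lowest byte offset.
    """
    if not (0 <= bit <= 31):
        raise ValueError("expected 0 <= bit <= 31")
    return reg + bit // 8, bit % 8


# Hypothetical value of register 88, for illustration only.
sample = 0x00009002

print(format(get_field(sample, 3, 0), "04b"))    # bits 3:0
print(format(get_field(sample, 15, 12), "04b"))  # bits 15:12

# Where bits 15 and 12 of register 88 land in byte view.
for bit in (15, 12):
    offset, bit_in_byte = byte_view_location(0x88, bit)
    print(f"88[{bit}] -> byte {offset:02X}h, bit {bit_in_byte}")
```

In this sketch the field 88[15:12] ends up in bits 7:4 of byte 89h, which is the kind of renumbering that the 32-bit view avoids.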
The full list is much longer and can be found in the datasheet. Let's see how to work with these registers. As an example, let's set the timings to 3-3-3-12 1T for DDR2-667. Select register 88 and look at the register description:
Set 88[3:0]=0010, that is, set bit 1 to "1" and the remaining bits (3, 2, 0) to "0". With bits 9:4 left at "0", the register will look as follows: 88[9:0]=0000000010. This gives us 3-3-3. Then set 88[15:12]=1001, which gives 3-3-3-12. Finally, set 94=0 (slow access mode off, i.e. 1T). We now have the "formula" 3-3-3-12 1T. To change it to 4-4-4-12 1T, we should change register 88 to 88[9:0]=0100010011. In the same way, by following the register descriptions in the datasheet, we can set any timings we want, and not only timings.
I hope this article will help those who are studying how the A64 memory controllers work, as well as those who want to tune them for maximum performance.
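The bit edits for the 3-3-3-12 example can also be worked out on paper or in a short script before entering them in WPCREDIT. The sketch below is only an illustration: it operates on a plain integer, not on real hardware, and the helper name set_field and the starting value 0xFFFFFFFF are assumptions (the starting value was picked so that the cleared bits are easy to see).

```python
def set_field(value, hi, lo, field):
    """Return `value` with bits hi..lo (inclusive) replaced by `field`.

    All other bits of the 32-bit value are left unchanged.
    """
    if not (0 <= lo <= hi <= 31):
        raise ValueError("expected 0 <= lo <= hi <= 31")
    width = hi - lo + 1
    if not (0 <= field < (1 << width)):
        raise ValueError("field does not fit in the given bit range")
    mask = ((1 << width) - 1) << lo
    return (value & ~mask & 0xFFFFFFFF) | (field << lo)


# Hypothetical starting contents of register 88.
reg88 = 0xFFFFFFFF

# 88[9:0]=0000000010 (the 3-3-3 part of the example)
reg88 = set_field(reg88, 9, 0, 0b0000000010)

# 88[15:12]=1001 (the trailing 12 in 3-3-3-12)
reg88 = set_field(reg88, 15, 12, 0b1001)

print(f"88 = {reg88:08X}h")
print("88[9:0]   =", format(reg88 & 0x3FF, "010b"))
print("88[15:12] =", format((reg88 >> 12) & 0xF, "04b"))
```

The point of the mask-and-replace approach is that bits outside the edited ranges, such as 88[11:10] and 88[31:16], keep whatever value they had before.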