SNES controllers typically look like this:
An SNES controller is a cheap and easy way to add a joystick interface to a DIY project. One can be had from AliExpress or similar for around 2 USD.
The controller has just 12 digital (on/off) buttons organized in various patterns, as you see in the photo: D-pad, ABXY buttons, two shoulder buttons, start and select. There are no analog interfaces, no LEDs etc.
Internally, all the buttons are connected to a 16-bit parallel-in / serial-out shift register. The original SNES controller used two CMOS 4021 chips connected in series; each is an 8-stage parallel-in, serial-out register. The host (the game console) communicates with a controller over 3 signals: LATCH, CLK, DATAOUT. At the beginning of a read-out cycle, the console briefly pulses LATCH LO-HI-LO to latch the state of all buttons into the shift register. Then the console sends 16 pulses on CLK (HI-LO-HI) to read the shift register out serially via the DATAOUT signal.
The following two oscilloscope pictures demonstrate the behaviour with the “B” button on the controller. The B button is the first output bit on DATAOUT in each cycle. Buttons are registered active-low, i.e. a pressed button will send “0” to the console, released will send “1”.
DATAOUT changes during LATCH (bits from buttons are immediately loaded into the shift register), and then just after every rising edge of CLK when the next bit is shifted out.
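The read-out cycle described above can be sketched as a small Python model. This is only a toy illustration, not a description of any real chip: the class and function names are made up for this example, the button order after “B” follows the commonly documented SNES order (the text above only establishes that “B” comes first), and the value shifted in behind the last bit is an arbitrary assumption.

```python
# Wire order of the 12 buttons (commonly documented SNES order; assumption).
BUTTONS = ["B", "Y", "Select", "Start", "Up", "Down", "Left", "Right",
           "A", "X", "L", "R"]


class ShiftRegisterController:
    """Toy model of a 16-bit parallel-in / serial-out register."""

    def __init__(self):
        self.pressed = set()      # names of buttons currently held down
        self.bits = [1] * 16      # active-low: 1 means released

    def latch(self):
        # LATCH pulse: load all button states into the register at once.
        self.bits = [0 if name in self.pressed else 1 for name in BUTTONS]
        self.bits += [1] * (16 - len(BUTTONS))   # unused positions

    @property
    def dataout(self):
        # DATAOUT always shows the first stage of the register.
        return self.bits[0]

    def clock_rising_edge(self):
        # A rising CLK edge shifts the next bit onto DATAOUT.
        # The value shifted in at the far end is arbitrary in this model.
        self.bits = self.bits[1:] + [1]


def read_controller(ctrl):
    """Host side: one LATCH pulse, then 16 samples, one per CLK pulse."""
    ctrl.latch()
    frame = []
    for _ in range(16):
        frame.append(ctrl.dataout)   # sample the current bit first...
        ctrl.clock_rising_edge()     # ...then clock out the next one
    return frame


ctrl = ShiftRegisterController()
ctrl.pressed = {"B"}
print(read_controller(ctrl))   # first bit is 0, the other 15 are 1
```

Note that the host samples DATAOUT before it generates the rising edge, which is the same order of events as in the hardware description above.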
Here is how an SNES controller may look inside:
There is one main PCB for most electronics, and two miniature PCBs for the shoulder buttons. The component side of the main PCB looks like this:
The shift registers are implemented by the black blob, which is a chip-on-board construction. The overall production quality is not very high (this is not an original SNES, of course), but it works as expected.
The Mess
Now we get to the problem at hand. Over the course of 2023 I bought a total of 4 SNES controllers for use with the X16. The first two came from the same AliExpress listing, ordered just 1 month apart. When I finally tested them, I noticed that the earlier one works OK, but the second one has a problem with the “B” button: while “B” is held down, the computer intermittently reads pressed/released, an unintended autofire.
The following video is a screen capture of the excellent joystick test program “CONTEST” written by Dušan Štarkl. The “B” button was firmly pressed, but the computer readings are a bit random:
The controllers from AliExpress were unbranded; I just colored one red and the other blue to distinguish them.
Unsure about the situation, I bought two more controllers from a Czech online shop (at 5x the AliExpress price); these are branded “Under Control”. Both new controllers show the same problem with “B” as the second one from AliExpress.
Outside, all four controllers look identical.
Inside (and wow!) the problematic ones are quite different. The main PCB has a much simpler rectangular shape, without the intricate cutouts of the good one:
On the component side of the main PCB we can see a completely different implementation:
Instead of a chip-on-board, we see an SO-type SMD package with some IC. The IC marking is rubbed off, but as we will see later on, I guess it is a microcontroller with firmware inside (wtf?!).
Let's see what output signal the problematic controller gives. The following two pictures show the same situation as before: first with the “B” button released, second with the “B” button pressed:
First of all, we see a strangely slow rising edge on DATAOUT during the LATCH signal. This indicates that the chip is not driving the signal actively, but rather through a pull-up resistor. The 4021 cannot do this, because it has only a push-pull output driver. But a microcontroller could behave like this.
When the button “B” is pressed in the second picture, the data-valid time window is completely different than what we saw previously. The DATAOUT signal value becomes valid some 0.5us AFTER the falling edge of CLK, and the next button bit is shifted out some 1us AFTER the rising edge of CLK.
Moreover, all edges on DATAOUT are fuzzy: the timing is not stable (this was clearly visible in the live oscilloscope image, but it is harder to see in a static picture). This tells me that the rubbed-off IC is not a hardware shift register, but a microcontroller with its own clock that simulates a shift register in software, bit-banging style.
Now, how could it be that two different implementations with quite different DATAOUT signals both work correctly with original SNES consoles (but not with the X16)? We have to think like the engineers in the 1980s when the SNES was designed. There was no bit-banging in software at the time; that would have been too wasteful of resources. They designed the controller as a shift register, parallel-in to serial-out, and then they designed the console hardware as a shift register too, serial-in to parallel-out. The shift register in the console always captures the DATAOUT bit on the rising edge of CLK. This works correctly with the first bit after LATCH as well as with any further bit. The DATAOUT bit is captured at the latest possible moment, just before the shift register in the controller shifts out the next bit. (The clk-to-out plus propagation delay must be longer than the data-hold time of the console’s register, but that is easy to achieve.)
With this explanation, we can understand what the problematic controller is doing with DATAOUT: it provides a valid button signal just before and just after a rising edge on CLK, because that is the moment when the console samples it. Unlike a 4021, it does not provide the signal immediately after LATCH, and it does not have it ready at the falling edge of CLK.
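The difference between the two controllers can be expressed as a data-valid window per bit. The sketch below is a rough model under stated assumptions: the 0.5us and 1us figures for the problematic controller are the ones read off the scope pictures above, while `CLK_TO_OUT_US`, `CLK_LOW_US` and the 0.25us "early" sample point are illustrative values I picked, not measurements. All function names are made up for this example.

```python
CLK_TO_OUT_US = 0.1   # assumed register delay after a rising CLK edge (illustrative)


def valid_window(controller, clk_low_us):
    """Return (start, end) in us of the window in which DATAOUT is valid.

    Time 0 is the falling edge of CLK; the rising edge is at clk_low_us.
    """
    if controller == "shift_register":
        # Valid ever since the previous rising edge (or LATCH); it changes
        # only shortly after the next rising edge.
        return (float("-inf"), clk_low_us + CLK_TO_OUT_US)
    if controller == "mcu":
        # Valid about 0.5 us after the falling edge, replaced by the next
        # bit about 1 us after the rising edge (figures from the scope).
        return (0.5, clk_low_us + 1.0)
    raise ValueError(f"unknown controller: {controller}")


def sample_is_safe(controller, sample_at_us, clk_low_us):
    """True if a read at sample_at_us falls inside the valid window."""
    start, end = valid_window(controller, clk_low_us)
    return start < sample_at_us < end


CLK_LOW_US = 2.0   # hypothetical CLK low time
for name in ("shift_register", "mcu"):
    at_rising_edge = sample_is_safe(name, CLK_LOW_US, CLK_LOW_US)
    soon_after_falling = sample_is_safe(name, 0.25, CLK_LOW_US)
    print(name, at_rising_edge, soon_after_falling)
```

In this model a host that samples on the rising edge is safe with both controllers, while a host that samples soon after the falling edge is only safe with the real shift register.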
Compatibility with X16 ROM
The X16 ROM implements the SNES interface by bit-banging through the VIA (GPIO) port in the file kernal/drivers/x16/joystick.s:
joystick_scan:
    KVARS_START_TRASH_A_NZ
    lda nes_ddr
    and #$ff-bit_data1-bit_data2-bit_data3-bit_data4
    ora #bit_latch+bit_jclk
    sta nes_ddr
    lda #bit_latch
    trb nes_data
    lda #bit_jclk
    tsb nes_data
    ; pulse latch
    lda #bit_latch
    tsb nes_data ; ACCESS #(1) => LATCH HI
    pha
    pla
    pha
    pla
    pha
    pla
    pha
    pla
    trb nes_data ; ACCESS #(2) => LATCH LO
    ; read 3x 8 bits
    ldx #0
l2: ldy #8
l1: lda #bit_jclk
    ; ACCESS #(3): => CLK LO
    trb nes_data ; Drive NES clock low (NES controller doesn't change when low)
    lda nes_data ; Read all controller bits ; ACCESS #(4) => READ DATAOUT
    pha
    lda #bit_jclk
    tsb nes_data ; Drive NES clock high ; ACCESS #(5) => CLK HI
    pla
    ; process while NES clock is high (bits change)
    rol ; Move bit 7 into C
    rol joy1,x ; Roll C into joy1
    rol ; Move bit 6 into C
    rol joy2,x ; Roll C into joy2
    rol ; Roll bit 5 into C
    rol joy3,x ; Roll C into joy3
    rol ; Roll bit 4 into C
    rol joy4,x ; Roll C into joy4
    dey
    bne l1
    inx
    cpx #3
    bne l2
    ....
I marked the important lines where the SNES port is accessed with numbers #(1) to #(5) in the above listing. By connecting the VIA (GPIO) chip-select signal to an additional oscilloscope channel, we can correlate the source code listing with real-time hardware behaviour.
The following picture shows the same situation as the previous one: the “B” button is pressed on the problematic controller. Channel 4 is connected to the VIA CS signal:
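For readers not fluent in 6502 assembly, here is a small Python model of what the read loop in the listing does with each port sample. It is a sketch based only on the comments in the listing (port bits 7, 6, 5 and 4 feed joy1 to joy4); the function name `scan` and the list layout are made up for this example, and whatever the ROM does with the bytes after the elided part of the routine is not modelled.

```python
def scan(port_samples):
    """Distribute 24 port reads into three bytes for each of four pads.

    port_samples: 24 integers (0..255), one per CLK pulse, in wire order.
    Returns joy[pad][byte] with pad 0..3 standing for joy1..joy4.
    """
    joy = [[0, 0, 0] for _ in range(4)]
    for i, sample in enumerate(port_samples):
        x = i // 8                        # byte index, like the X register
        for pad in range(4):              # port bits 7, 6, 5, 4
            bit = (sample >> (7 - pad)) & 1
            # Same effect as "rol joyN,x": shift left, new bit enters at bit 0.
            joy[pad][x] = ((joy[pad][x] << 1) | bit) & 0xFF
    return joy


# Pad 1 reports 0 in the very first bit (button "B" pressed, active-low),
# every other line stays at 1.
samples = [0x7F] + [0xFF] * 23
joy = scan(samples)
print([hex(b) for b in joy[0]])   # ['0x7f', '0xff', '0xff']
```

The first bit on the wire ends up in bit 7 of the first byte, so a glitch on the very first read shows up exactly there.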
The correlated steps are:
1. LATCH -> HI
2. LATCH -> LO
3. CLK -> LO
4. READ DATAOUT (the critical timing step!)
5. CLK -> HI
As you can see, by unlucky chance, step #(4) READ DATAOUT is positioned right at the point where the controller changes the output signal. Here is a zoomed view (I also increased the persistence time to 1 s to show multiple acquisitions):
Depending on whether the falling edge on the DATAOUT (magenta) line comes before or after step #(4), the computer intermittently reads the “B” button state as pressed (0) or released (1).
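The race between the read and the DATAOUT edge can be mimicked with a few lines of Python. This is a toy model, not a measurement: I assume the read lands 500 ns after the falling CLK edge and that the controller's edge jitters between 400 ns and 600 ns; both numbers are illustrative, chosen only to match the "about 0.5us" seen on the scope. The function names are made up for this example.

```python
def read_b_button(sample_at_ns, transition_at_ns):
    """Value the host reads for a pressed "B" button (active-low).

    Until the controller drives DATAOUT low at transition_at_ns, the line
    still shows the stale level 1. Times count from the falling CLK edge.
    """
    return 0 if sample_at_ns >= transition_at_ns else 1


def scan_many(sample_at_ns, transitions_ns):
    """Repeat the read for a series of scans with different edge positions."""
    return [read_b_button(sample_at_ns, t) for t in transitions_ns]


# Hypothetical jitter: the edge lands anywhere from 400 ns to 600 ns.
transitions = list(range(400, 601, 20))

early = scan_many(500, transitions)          # read right at the edge
late = scan_many(500 + 875, transitions)     # same read, 875 ns later

print(early)   # a mix of 0 and 1: the unintended autofire
print(late)    # all 0: the button reads as steadily pressed
```

Moving the read well past the latest possible edge position removes the ambiguity, whatever the exact jitter is.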
Solution in X16 ROM
The solution is to delay the DATAOUT read in software by about 1us so that it happens safely after the magenta transition. Here is the updated code snippet:
    ; read 3x 8 bits
    ldx #0
l2: ldy #8
l1: lda #bit_jclk
    trb nes_data ; Drive NES clock low ; ACCESS #(3): => CLK LO
    pha ; 3T = 375ns
    pla ; 4T = 500ns
    lda nes_data ; Read all controller bits ; ACCESS #(4) => READ DATAOUT
    pha
    lda #bit_jclk
    tsb nes_data ; Drive NES clock high ; ACCESS #(5) => CLK HI
    pla
I added one PHA / PLA instruction pair. This is equivalent to a NOP, because PHA just pushes the accumulator onto the stack and PLA pulls it back. The two instructions together take 7 processor cycles, which at 8MHz is 0.875us. Here is a new oscilloscope picture after the software change:
Now step #(4) READ DATAOUT always happens well after the signal has settled.
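The delay figures are easy to double-check. The snippet below is just the arithmetic, using the cycle counts from the comments in the patched listing (PHA = 3 cycles, PLA = 4 cycles) and the 8MHz clock; the helper name `delay_ns` is made up for this example.

```python
CPU_HZ = 8_000_000                 # X16 CPU clock
CYCLES = {"pha": 3, "pla": 4}      # cycle counts as noted in the listing


def delay_ns(instructions, cpu_hz=CPU_HZ):
    """Execution time in nanoseconds of a sequence of instructions."""
    cycles = sum(CYCLES[name] for name in instructions)
    return cycles * 1_000_000_000 / cpu_hz


print(delay_ns(["pha"]))           # 375.0
print(delay_ns(["pla"]))           # 500.0
print(delay_ns(["pha", "pla"]))    # 875.0
```

At a different CPU clock the same pair gives a different delay, so the `cpu_hz` parameter is there to make that dependency explicit.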
For completeness, here is the overview of the new timing:
The final step will be to get this small timing modification into the mainline x16-rom repository.
Conclusion
When the SNES was released 33 years ago, the controller and the console (host) implemented the joystick interface with hardware shift registers. Over time, the SNES controller got reused in other projects, and the host side of the protocol got bit-banged in software. Finally, the circle closes: someone in China realizes that he could make a penny more by implementing the SNES controller with a generic MCU instead of shift registers. And so here we are: both sides, controller and host, are microcontrollers running software that desperately pretends to be… a humble shift register. And both sides do it wrong… :-/