CHIRBot design discussions
The sections below are Q&A-style discussions with members of the GP2040-CE community that covered a number of useful use cases. They're here temporarily until the content is more fully integrated into the draft specification, although they're also beneficial for restating how the pieces described above fit together.
wren: "...i’m curious on how you’re gonna handle lag on devices like the ps1 (is it deterministic?) and stuff requiring strict inputs from power-on."
One note about deterministic console behavior - it's an unfortunate reality that many output devices will be unable to do consistent full-length macro playback without modifications (what I would call, from the TASBot perspective, a console verification of a TAS). This is a problem we face today, which is why the list of consoles the TAStm32 replay device can handle is primarily focused on things like the NES. I've done extensive research on related topics, at times down to the level of academic journals, industry journals, and company whitepapers, and I can speak authoritatively and at great length to the difficulty of non-deterministic behavior. The gist of it is that we have to be able to control all sources of entropy for any pseudo random number generator, as well as control or predict all system clock sources, if we expect to see deterministic results. It's a much longer topic than I should type about here, so let's just put this one in the category of "I'm painfully familiar with this challenge and know the limits intimately."
wren: "I would also recommend a DB15 output for superguns, as verifying TASes made on something like mame-rr or MAMEhawk on original hardware would be awesome"
We chose USB Type-C, directly attached to the input and output modules, to connect to the core CHIRBot board in a way that allows for USB 2.0 communication, power delivery, and, critically, Debug Accessory Mode to allow using the remaining pins for SPI clock signals. It was the hardest and most fraught decision of the project and delayed it by literally two years. However, with that decision out of the way, an output adapter module could absolutely have a DB15 output if someone made it, which would be completely possible thanks to the open hardware, open source software, open documentation design.
Henré: "Regarding clock, does your architecture allow for a way to provide an out-of-band synchronisation signal? I'm thinking of stuff like having a sensor watching the refresh rate on a display to provide timing information to automated inputs like macros"
I touch on it briefly in one of the output module examples. An Atari 2600 and similar controller types have no latch signal or other source of controller polling/frame edge timing, so we use a VSync extraction method. Specifically, that was the method I used when I helped Omnigamer prove that Todd Rogers had cheated in Dragster, which ultimately led to him losing his Guinness World Record. The video at https://youtu.be/oXMxZbPzRzs doesn't show it well, as it was one of my earlier videos and it's not edited well, but there's a breakout board that extracts the VSync signal from the composite video output (after adaptation from RF), which is then used as the clock source for when to transition to the next set of inputs.
One caveat with any device that watches the video signal is that it could lead to an aimbot that is the exact opposite of what I want to build here. There's a fine line between providing tools that are useful and necessary to help a user with disabilities play a singleplayer video game as opposed to providing tools that allow a jerk to cheat in an online multiplayer game.
Henré: "Ultimately I don't think you can do much about cheaters, they're going to be who they're going to be"
The downside of deliberately choosing to be open hardware/software/firmware/documentation is that some people may use it in ways I find objectionable; the best I can do is ensure the main branded CHIRBot product I'm releasing does everything as above reproach as possible.
jfedor: "Yeah, SNES might have given people the false sense that perfect replay is easy or even possible. On platforms that use USB you might replay the USB frames perfectly and still get different results on two instances of original hardware (or even the same instance) because V-sync and USB clocks are not synchronized."
We used a USB plus VSync extraction method when we did Super Mario Maker 2 on the Switch and it can work, but it works way less consistently in a game like The Legend of Zelda: Breath of the Wild due to non-deterministic behavior. For instance, I did a series of tests replaying the same button presses when jumping off a tower and the landing location was wildly different. The SNES is by no means easy to do a perfect replay on and we're going to insane lengths to try to make it more deterministic; the issue is that 30 years on, the ceramic oscillator driving the APU has degraded in most people's consoles to the point that, to use an analogy, middle C is no longer perfectly middle C. We're working on a project to restore SNES consoles to match the timing behavior that game developers were told to expect in the documentation they were provided, which involves ensuring consistent startup of the console clocks and components. Here's an image showing how complex that process is:
Henré: "I'm sure there's insane complexity hidden there. Just curious about the general idea of feeding in a separate clock signal of some kind"
And you're right! There is insanity-inducing complexity. Ask me how I know. :) I anticipate some consoles will never achieve full consistency, but even in the Breath of the Wild example above, it should be possible to create a macro that allows a user to press just one button to kick off a dodge roll and repeatedly attack an enemy; in other words, reducing the number of buttons that have to be held simultaneously and reducing the amount of button mashing. It may not work every time, especially as it's still timing oriented, but it could allow a user with disabilities to overcome battles that would otherwise be impossible. One of the future aspects of the CHIRBot project is an idea I have for a bounty board where users can request macros to be created and can filter through a list of macros to download and use. Yes, even for otherwise fully capable scrub tier casuals that are just struggling with one aspect, there's no discrimination (other than against jerks that abuse what we're trying to build in order to cheat in online player vs player games; those abusers are going to be the death of me).
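As a rough illustration of the one-button macro idea above, here is a minimal Python sketch that expands a short list of steps into one button set per frame. Everything in it is an assumption made for illustration: the `DODGE` and `ATTACK` labels are hypothetical placeholders rather than real Breath of the Wild bindings, the frame counts are arbitrary, and this is not the CHIRBot macro format.

```python
def expand_macro(steps):
    """Expand (buttons, hold_frames) steps into one frozenset of buttons per frame."""
    frames = []
    for buttons, hold_frames in steps:
        if hold_frames < 1:
            raise ValueError("each step must last at least one frame")
        frames.extend([frozenset(buttons)] * hold_frames)
    return frames


def dodge_then_mash(attack_count):
    """Build a hypothetical macro: one dodge, a pause, then repeated attack taps."""
    steps = [({"DODGE"}, 2), (set(), 10)]   # arbitrary timings, not measured
    for _ in range(attack_count):
        steps.append(({"ATTACK"}, 2))       # tap the attack button
        steps.append((set(), 4))            # release before the next tap
    return steps


# One trigger press would start playback of this per-frame list.
frames = expand_macro(dodge_then_mash(3))
```

Because playback of a list like this is still timing oriented, it inherits the caveat above: on a console that isn't deterministic, the same frames may not land the same way every time.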
In summary, yep, it's going to be complex and not all output devices or games will be capable of full game playback, but that's okay; there are enough scenarios where it's good enough that it's still worth doing. Even if macros aren't perfect, the project would still allow a user to use an Xbox Adaptive Controller on a Nintendo Switch or even a GameCube, which is still a win.
Henré: "My question is how do you feed that signal into CHIRBot?"
An output module contains the connectors and electrical support required for whatever signals are needed - for instance, an output adapter module for an NES would only need a native controller plug end that an extension cable could attach to.
As in, a Nintendo Entertainment System or AV Famicom controller has a specific connector, and third parties sell controller extension cables (some of which have all 7 wires needed to support the additional data lines present). An NES output adapter module would only need this one connector because the NES has a latch signal we can use as the clock source.
In fact, the NES and SNES are so similar that we could make an output adapter module that contains both an NES and an SNES connector, although I don't know that I would want to do that.
For an Atari 2600, you'd have a native controller cable port (basically a DB9), but you'd additionally have an RCA jack for attaching a composite signal, which you'd use to extract the VSync signal. Since an Atari 2600 sends an RF signal, you'd need to convert that externally, or the output module could get into a bit of scope creep and also handle RF conversion and spit out a composite video output and mono audio output, but again, scope creep ;)
Henré: "Alright so the output module reads that signal. I assume the core is what reads that & polls the input accordingly? Or it polls input as fast as possible, but times output messages according to the output module's clock signal. Is that right?"
The input adapter module reads the input devices at the maximum speed allowed, likely every 1 ms. The most current input is sent to the output side every time the output side signals a new poll edge (think Super Mario Bros. 1 triggering a latch signal every frame, which at 60 FPS for NTSC translates to roughly every 16.6 ms).
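The "sample fast, forward only the latest" behavior described above can be sketched in a few lines of Python. This is a simulation under stated assumptions, not CHIRBot firmware: the class and function names are hypothetical, buttons are encoded as an arbitrary bitmask, and the poll period is rounded to a whole 17 ms so the loop can step in 1 ms ticks.

```python
class LatestInputMailbox:
    """Holds only the most recent input sample; older samples are overwritten."""

    def __init__(self):
        self.latest = 0  # bitmask of held buttons (hypothetical encoding)

    def on_input_sample(self, buttons):
        # Called by the input side, here once per simulated millisecond.
        self.latest = buttons

    def on_poll_edge(self):
        # Called when the console signals a new poll (for example an NES latch).
        return self.latest


def simulate(changes_by_ms, total_ms, poll_period_ms):
    """Sample every 1 ms and forward the latest state at each poll edge.

    changes_by_ms maps a time in ms to the new button bitmask from then on.
    """
    mailbox = LatestInputMailbox()
    sent = []
    current = 0
    for t in range(total_ms):
        current = changes_by_ms.get(t, current)
        mailbox.on_input_sample(current)
        if t % poll_period_ms == poll_period_ms - 1:
            sent.append(mailbox.on_poll_edge())
    return sent


A, B = 0x01, 0x02
# A is tapped from 3 ms to 5 ms, then B is held from 20 ms onward.
sent = simulate({3: A, 5: 0, 20: B}, total_ms=34, poll_period_ms=17)
```

In this run the 2 ms tap on A never reaches the console because it begins and ends between two poll edges, while the held B is forwarded on the second poll, so `sent` is `[0, B]`.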
Let's talk about a specific use case - well-known world record holder Mitchflowerpower is doing RTA speedrun attempts of Super Mario Bros. 3 on an NES attached to an output adapter module on CHIRBot. He has his original NES controller attached directly to the NES controller port of an NES input adapter module, which samples his input continuously and sends that over SPI in a format such as the TASD spec to the CHIRBot core. He configures CHIRBot in Record mode and powers on the NES, at which point the console starts polling the controller twice per frame due to DPCM clock filtering (which can be abused via the technique described at https://fuzyll.com/2016/the-smb3-input-polling-glitch but let's assume we're properly providing a copy of the same input due to configuring CHIRBot to account for it). As Mitch plays, every time the NES raises the latch signal, we record the output that was sent, writing it to the microSD card. Once the run is complete and Mitch powers off the NES, he can switch CHIRBot to Playback mode, which sets the input source to the file he just recorded. When he powers on the console, he'll see a perfect replay of what he did thanks to the NES using only player input as the seed source for the pseudo random number generator. All throughout this process, the input is being shown on a visualization display or being sent out to an attached PC and shown as a Twitch overlay that visualizes the controller buttons pressed.
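The Record and Playback flow in that use case can be sketched roughly as follows. This is a simplified Python model with loud assumptions: `LatchLog` is a hypothetical name, an in-memory `io.BytesIO` stands in for the file on the microSD card, and the one-byte-per-latch layout is invented for illustration and is not the TASD format.

```python
import io


class LatchLog:
    """Records one byte of button state per latch, or replays a prior recording."""

    def __init__(self, storage, mode, live_input=lambda: 0):
        if mode not in ("record", "playback"):
            raise ValueError("unknown mode: " + str(mode))
        self.storage = storage        # file-like object standing in for the microSD file
        self.mode = mode
        self.live_input = live_input  # callable returning the current button byte

    def on_latch(self):
        """Return the button byte to present to the console for this latch."""
        if self.mode == "record":
            buttons = self.live_input() & 0xFF
            self.storage.write(bytes([buttons]))
            return buttons
        data = self.storage.read(1)
        # Past the end of the recording, present "nothing held".
        return data[0] if data else 0


# Record six latches (two per frame, with the same input supplied for both polls).
card = io.BytesIO()
presses = iter([0x01, 0x01, 0x00, 0x00, 0x02, 0x02])
recorder = LatchLog(card, "record", live_input=lambda: next(presses))
recorded = [recorder.on_latch() for _ in range(6)]

# Power cycle, switch to playback, and feed the same bytes back latch by latch.
card.seek(0)
player = LatchLog(card, "playback")
replayed = [player.on_latch() for _ in range(6)]
```

Keying the log on latches rather than wall-clock time is what lets playback line up with the console's own polling: as long as the console polls the same number of times, `replayed` matches `recorded` byte for byte.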
Henré: "I always wonder about stuff like that: What happens if I press A, release it, and press B all within one (host) poll window? What gets sent? Should this differ in different situations?"
This answer varies by controller type and console polling mechanism, but for the NES the closest analogy would be the following situation: You, a human player holding an NES controller, release A and press B "simultaneously" (for the sake of argument, let's assume it's literally simultaneous). You happen to do it while the state of the buttons is being polled. The result is predictable; the NES will first raise the latch signal to indicate it's about to ask what buttons are held and to reset to the first button in a pre-arranged sequence. The console then raises the clock line and the controller responds with the state of the first button, such as A (this ignores several complications, like the fact that it's normally holding the line high and signals the button is held by pulling the data line low, but I digress again). The console then raises the clock line again and requests the next button, let's say it's B. If you simultaneously released A and pressed B in the few nanoseconds between those two, both A and B would be seen as held (because you were still holding A when A's state was sent and you started holding B before it sent B's state). With all that said, it doesn't technically work exactly that way for a variety of reasons, and it's better to think of it as all of the buttons you're holding being locked in when the latch signal hits, but that should give you the general picture.
In the case of CHIRBot, every 1 ms the input will be collected and then replaced 1 ms later by the next input poll, only advancing to the output console when the console polls for input, with the result that the vast majority of your inputs are, er, sacrificed in the name of latency, heh. The point is that there's little concern about the type of situation you describe meaningfully affecting the outcome of the game you're playing, because the timings are beyond human ability. It matters a lot more when you're making complicated TAS payloads, but you don't have to worry about that as a human player.
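To make the difference between the two mental models above concrete, here is a small Python sketch comparing "each button is sampled as it is clocked out" against "everything is locked in when the latch hits". It is a toy model under assumptions: the function names are hypothetical, time is reduced to eight discrete read steps, and electrical details such as the active-low data line are ignored.

```python
# Order in which a standard NES controller reports its buttons.
NES_BUTTON_ORDER = ["A", "B", "Select", "Start", "Up", "Down", "Left", "Right"]


def read_per_clock(held_at_step):
    """Naive model: each button's state is sampled at the moment it is clocked out."""
    return [name for step, name in enumerate(NES_BUTTON_ORDER)
            if name in held_at_step(step)]


def read_latched(held_at_step):
    """Snapshot model: all eight buttons are captured when the latch fires (step 0)."""
    snapshot = held_at_step(0)
    return [name for name in NES_BUTTON_ORDER if name in snapshot]


def release_a_press_b(step):
    # The player holds A during step 0, then swaps to B before step 1.
    return {"A"} if step == 0 else {"B"}


per_clock = read_per_clock(release_a_press_b)   # ["A", "B"]
latched = read_latched(release_a_press_b)       # ["A"]
```

Under the naive model the swap shows up as both A and B held, which is the scenario described in the answer; under the snapshot model only A is reported for this poll and B would appear on the next one.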