Thank you for the praise! The lack of dependencies helps with getting this behavior right. crossterm is a good example for this, because it’s not really a good abstraction over UNIX and Windows.
I’m one of the Windows console subsystem maintainers and how it reads input is what may be diplomatically described as "not good": https://github.com/crossterm-rs/crossterm/blob/6af9116b6a8ba365d8d8e7a806cbce318498b84d/src/event/source/windows.rs#L49-L51
Every single character you type, will trigger 4 syscalls. Your maximum throughput is then maybe? 10kB/s? People often complain to us that Windows Terminal is laggy - more often than not, they’re unknowingly jus…
Thank you for the praise! The lack of dependencies helps with getting this behavior right. crossterm is a good example for this, because it’s not really a good abstraction over UNIX and Windows.
I’m one of the Windows console subsystem maintainers and how it reads input is what may be diplomatically described as "not good": https://github.com/crossterm-rs/crossterm/blob/6af9116b6a8ba365d8d8e7a806cbce318498b84d/src/event/source/windows.rs#L49-L51
Every single character you type, will trigger 4 syscalls. Your maximum throughput is then maybe? 10kB/s? People often complain to us that Windows Terminal is laggy - more often than not, they’re unknowingly just using poorly written code. For fun, I wrote an example to demonstrate this performance issue. Here’s crossterm reading stdin VS msedit reading stdin:
EEAGXPKT6dfyUMaS.mp4
The left half goes on for at least another 5min, after which I simply aborted it. The difference is rather comical.
FWIW crossterm should just use ReadConsoleInputExW with CONSOLE_READ_NOWAIT and batch the reads.
In any case, to recognize resizing, on UNIX you need to subscribe to SIGWINCH and on Windows you should read with ReadConsoleInput or ReadConsoleInputEx which will contain WINDOW_BUFFER_SIZE_RECORD events.
Then, during rendering of a fullscreen application like this, you can emit explicit newlines at the end of each line. This ensures that resizing the terminal, but before your application had a chance your react, the terminal won’t try to reflow (word wrap) the lines, which would create an ugly zigzag line.
12 replies
This already looks pretty good though! If you have never written a FSM before, this can be a neat training IMO for your first time doing it.
For a full-featured VT parser I can’t recommend anything besides that flowchart, because pretty much everyone refers to this exact flowchart in their implementation. To translate that flowchart into code, you need to keep in mind that the "anywhere" boxes are the same as the "ground" state.
First, you should start by adding just the "ground" state to your state enum. Then check the flowchart and find all the "anywhere" starters. Each arrow from there will contain a "byte range / action" description, where the byte ranges are the bytes you should match and the "action" is what you should do with it. But the action doesn’t matter that much for you yet.
You can see how prettypretty translates this literally here: https://github.com/apparebit/prettypretty/blob/ba6b06a10bb6877eb06b8eba8b1696e01b012805/crates/prettytty/src/scan/machine.rs (They split the nested match clauses up into individual functions, but the concept is the same.)
I hate to seem stupid but is there anything else you could recommend besides the VT100.net flowchart? I just cannot wrap my head around it.
The alternative approach, and the approach I chose, is to just "vibe code" it, but without the AI. Start writing your application and switch into the terminal raw mode I mentioned before. Then if you press your arrow keys, home/end, or your F keys or something, etc., you’ll see sequences like \x1b[123;456m in your input. Those are CSI sequences, and you can parse them by just writing a state machine like you already did, but without reading the flowchart. Just... wing it. You can already see how the escape key starts some escape sequence so clearly you need an State::Esc state. And then you encounter a [ so you transition to State::Csi. And then you encounter numeric parameters separated by ; so clearly you need to accumulate them somehow into an array or vector. The finalizer is some non-numeric ASCII character.
If you swing it that way, it’ll also be good enough.
Okay, I’m back a day later with incredible news. Instead of the flowchart which I found unnecessarily complicated, I opted for this resource instead, listing all the different ANSI escape codes, including OSC and DCS. Coming from Python roots and taking a page out of my parsing era, I compiled the following. It’s Lark grammar and it should just be self-explanatory but I added some annotations nevertheless. I’m going to use it as a blueprint for how I want things to be parsed in my VT parser:
ESC: "\x1b"
// Different types of printables
ONE_PRINTABLE_NO_SEMICOLON: /([\x20-\x3A]|[\x3C-\x7F])/
PRINTABLE_NO_SEMICOLON: /([\x20-\x3A]|[\x3C-\x7F])+/
ONE_PRINTABLE: /[\x20-\x7F]/
PRINTABLE: /[\x20-\x7F]+/
DIGITS: /\d+/
// String terminator - used in OSC and DCS sequences
ST: "\x07" | ESC "\\" | "\x9C"
// ESC sequences
esc: ESC ONE_PRINTABLE
// CSI sequences
CSI: ESC "["
CSI_HEADER.1: "?" | "!" | ">"
CSI_FOOTER: ONE_PRINTABLE_NO_SEMICOLON
csi_params: DIGITS (";" DIGITS)*
csi: CSI CSI_HEADER? csi_params? CSI_FOOTER?
// OSC sequences
OSC: ESC "]"
osc: OSC PRINTABLE* ST
// DCS sequences
DCS: "\x90"
dcs: DCS PRINTABLE* ST
// Printable characters
char: ONE_PRINTABLE
// Control characters
control: /[\x00-\x1B]/
// Group them all together and iteratively search for these.
// Lark, by default, will collect these into a tree for us.
next: csi | osc | esc | char | control
start: next*
Given that it’s in this familiar layout, I think I can easily turn this into a parser. That would be a lot easier for me, since I’ve never really used a state machine before - although that definitely just sounds like a skill issue. Either way, I made a breakthrough and I figured you should be the first to know.
As I go and do that, if you’re able to mentally parse the grammar I showed (the reference for which is here), am I missing anything per se? I went with your OSC and DCS Token enum members where the data is given as a raw string.
I’m not familiar with Lark, but I think char: PRINTABLE* would make more sense to me.
PRINTABLE should also not include 0x7f since that’s the escape character. But it should include 0x80-0xff or some sort of it, in order to support UTF8 sequences.
0x9C is not an ST you’ll see with UTF8 encoded output (that’s for the legacy byte-wise ASCII-style encoding before UTF8 was a thing).
CSI_FOOTER should probably be 0x40-0x7e as per the DEC spec linked above. You can ignore any other (invalid) chars if you’d like or ignore the entire sequence.
I have implemented char: PRINTABLE* in my own parser and none of my ranges include 0x7F. Thank you for pointing out CSI_FOOTER as I didn’t catch that initially. If an invalid CSI_FOOTER byte comes through, I’ll just not include the footer in the final Token::Csi enum.
Also, if 0x9C isn’t an ST, then are 0x07 and b"\x1b\\" the only ones?
I have been making spectacular progress and I’m excited to show you once it’s done.
Something else I want to mention while making my own parser: I found this in Edit and I wondered: is this not better off rewritten with .saturating_x?
// Edit code
let add = bytes[self.off] as u32 - b'0' as u32;
let value = *dst as u32 * 10 + add;
*dst = value.min(u16::MAX as u32) as u16;
// Suggested replacement
*dst = dst.saturating_mul(10).saturating_add(add);
I figure that’s cleaner than casting between u32 and u16 to account for overflows.
Another thing on the list: I recall you mentioned keeping a 1KiB-4KiB buffer for stdin. How exactly am I supposed to implement that? Obviously I don’t wait for that buffer to fill up before I process instructions, but am I not reading stdin byte-by-byte? Do I just collect a few then throw them in the parser? I’m starting to get confused there.
I got a working example! https://mystb.in/bd3e29bd413da21bb5 🎉
Let me know what needs improving and where to go from here. I already have the rendering thing - do I just start writing commands for things like entering alternate screen and moving the cursor, or am I missing a step?
Also another thing now that I think about it: where did you get the ANSI representations of all the special keys? I don’t want to directly rip everything from your code but I’m not sure where else to look. I have some but not all the mappings, and I’m not sure if they differ between Windows and Linux.