The first step in any compiler is lexical analysis - breaking source code into tokens. When you see 2 + 3, you immediately recognize three separate things: a number, a plus sign, and another number. Your brain does this automatically. But a compiler has to explicitly walk through each character and figure out what’s meaningful.

Let’s say we want to tokenize this expression:

2 + 3

We need to walk through the string character by character. When we see 2, that’s a number. The space doesn’t mean anything. The + is an operator. Another space. Then 3 is another number.

So we end up with three tokens: NUMBER(2), PLUS(+), NUMBER(3).

That’s it. That’s what a lexer does - it turns a string of characters into a list of meaningful tokens.
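The walkthrough above can be sketched as a small function. This is a minimal sketch, not a full lexer: the token names NUMBER and PLUS come from the text, but the function name `tokenize` and the tuple representation of tokens are assumptions made for illustration.

```python
# Minimal sketch of a lexer for expressions like "2 + 3".
# The function name `tokenize` and the (kind, text) tuple shape
# are illustrative choices, not part of the original text.

def tokenize(source):
    tokens = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            # Spaces don't mean anything; skip them.
            i += 1
        elif ch.isdigit():
            # Consume a full run of digits as one NUMBER token.
            start = i
            while i < len(source) and source[i].isdigit():
                i += 1
            tokens.append(("NUMBER", source[start:i]))
        elif ch == "+":
            tokens.append(("PLUS", ch))
            i += 1
        else:
            raise ValueError(f"unexpected character: {ch!r}")
    return tokens

print(tokenize("2 + 3"))
# [('NUMBER', '2'), ('PLUS', '+'), ('NUMBER', '3')]
```

Running it on `2 + 3` yields exactly the three tokens described above: NUMBER(2), PLUS(+), NUMBER(3).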

