Preface
- Introduction
- What and Why is Redis?
- The Redis Serialization Protocol (RESP)
- How does RESP Work?
- What is a Parser?
- What is a Token?
- What is a Lexer?
- How to use our Tokens: Building a Lexer
- What is a Command
- Defining GET
- Implementing a Parser
- Parser Recap
- Creating Values
- Defining Expressions
- Storing Values
- Evaluating Commands
Introduction
Redis revolutionized the database world by embracing a radical idea: complexity should exist within simplicity. This article serves as your entry point into that philosophy. We will reverse-engineer Redis step by step—starting from a simple TCP socket, implementing the Redis Serialization Protocol (RESP), and finally evaluating core commands—all built in Golang using elegant, minimalist solutions.
Note: 👋 This article is designed for anyone curious about how Redis works under the hood. Instead of overwhelming you with thousands of lines of code, we will focus on the essential algorithms and architectural principles that power this Redis clone.
Want to dive deeper? Check out the full project on GitHub! 🚀
What and Why is Redis?
Before Redis, scaling applications presented a significant bottleneck. As usage soared, traditional infrastructure often struggled under the weight of high-frequency tasks like managing shopping carts, user sessions, and real-time leaderboards. These systems typically relied on slow, disk-bound databases and inconsistent, fragmented caching layers.
Redis provided the breakthrough. It is a blazingly fast, in-memory database built for simplicity: featuring atomic operations, a lightweight protocol (RESP), and the raw speed of RAM. By offloading ephemeral data—such as sessions and cached queries—Redis became the essential engine allowing systems to scale beyond conventional limits. 🌟
The Redis Serialization Protocol (RESP)
Think of computer communication like a professional conference. For participants to exchange ideas effectively, they need a shared language. In the world of Redis, that language is RESP.
In our architecture, the Redis Server acts as the keynote speaker, while the Redis Clients are the participants. For them to understand each other, every interaction—or connection—must follow the strict rules of RESP.
How does RESP Work?
Redis commands like GET, SET, and PING aren't just strings; they follow a specific grammar. Let’s
look at how a simple PING command is structured in RESP:
The PING Command:
At first glance, this looks like "computer gibberish", but it’s actually a highly efficient way to structure a request. In RESP, every command is sent as an Array, even if it only contains a single string.
Here's the breakdown:
| Element | Type | Data |
|---|---|---|
*1 | Array | Tells the server an array with 1 element is coming. |
\r\n | CRLF | The Protocol Delimiter. It marks the end of a header or data block. |
$4 | Bulk String | Tells the server the next item is a string of 4 bytes. |
\r\n | CRLF | Separates the length indicator from the actual data. |
PING | Data | The actual payload. |
\r\n | CRLF | The final terminator for the entire command. |
While PING is a great starting point, most Redis interactions
involve data. Let's see how to parse a more complex command like GET.
In RESP, this is represented as an array of two bulk strings:
At first, this looks significantly more complex, but the logic remains identical. We are simply nesting more data inside our Command Array.
| Element | Type | Data |
|---|---|---|
*2 | Array | Declares an Array with 2 elements ( The second element is the key ). |
\r\n | CRLF | Protocol Delimiter |
$3 | Bulk String | Declares the first element is 3 bytes long. |
\r\n | CRLF | Protocol Delimiter |
GET | Data | The actual command the Redis Instance is going to execute. |
\r\n | CRLF | Protocol Delimiter |
$5 | Bulk String | Declares that the argument is 5 bytes long. |
\r\n | CRLF | Protocol Delimiter |
mykey | Key Argument |
The GET command requires a key to look
up in memory. In our example, that key is mykey.
|
\r\n | CRLF | The final terminator for the entire command. |
While PING and GET provide a solid foundation,
they only scratch the surface of what we are building.
Here are some examples of other Redis Commands:
SET <key> <value>
GETSET <key> <value>
GETDEL <key>
INCRBY <key> <increment>
INCR <key>
Later in this article, we will dedicate specific sections to implementing the
full range of commands. To finish our look at RESP, let's see how the protocol
scales with a 3-part command like SET:
Give it a try before you see the solution.
| Element | Type | Data |
|---|---|---|
*3 | Array | Declares an Array with 3 elements. |
\r\n | CRLF | Protocol Delimiter |
$3 | Bulk String | Declares the following BulkString is 3 bytes long. |
\r\n | CRLF | Protocol Delimiter |
SET | Command | The command the Redis Instance is going to execute. |
\r\n | CRLF | Protocol Delimiter |
$8 | Bulk String |
Declares that the first argument of our SET command is
8 bytes long.
|
\r\n | CRLF | Protocol Delimiter |
userName | Key Argument |
The first argument of the SET command is the key the value is going to be stored on.
|
\r\n | CRLF | Protocol Delimiter |
$9 | Bulk String | Declares that the second argument is 9 bytes long. |
\r\n | CRLF | Protocol Delimiter |
Alejandro | Value Argument |
The second argument of the SET command
is the value to be stored.
|
\r\n | CRLF | The final terminator for the entire command. |
It might seem a long process but gluing it all together had come to a result.
If you are still confused about how RESP works, I highly recommend you to consult this Redis article about RESP. You can find it here! 🚀
If you're wondering why we're doing all of this, remember that computers
communicate with each other in a shared language.
This language had to be efficient and as expressive as possible, as this is crucial
to optimize the Redis Parser.
But wait, what is a Parser?
What is a Parser?
Definition: Parsing is the process of structuring a linear
representation in accordance with a given grammar.
WHAT? Even if this definition can sound a little complex, it is, in fact, is
pretty simple.
A Linear Representation can be either a sentence of a language,
a piece of music, or a Redis Command.
And grammar refers to the structure of a sentence, or a command,
this of it as a formula to get different sentences.
So, applied to our use case:
Case Definition: Parsing is the process of structuring a
Redis Command in accordance with a given Command Structure.
This is the theory, but in practice, we'll have an input string that we'll try to parse and the output will be a representation of the Redis Command known as an Abstract Syntax Tree (AST).
But still we haven't implemented anything yet, and we still don't know what is a Token. So...
What is a Token?
A Token is the smallest atomic unit of a command. Think of them
as the fundamental building blocks—like verbs (GET) and
nouns (mykey) —that our parser uses to reconstruct a
client’s command.
There are several kinds of tokens in our system:
- Commands: The instructions (e.g.,
GET,SET,PING). - Identifiers: The keys or values (e.g.,
userName,Alejandro). - Numbers: Numeric values used in operations (e.g.,
100).
As shown in the diagram above, the command SET userName Alejandro is not treated as one long string. Instead, the Lexer
identifies three distinct tokens: a specific SET command followed
by two Identifiers.
Defining Token Kinds
Now that we’ve defined the concept of a Token, we need a way to categorize
them in our code. In Go, we do this using Enums (enumerated
constants).
We will create a TokenKind type to represent every possible
category our parser might encounter—such as Commands, Identifiers, or Numbers.
By using Go's iota keyword, we can efficiently assign unique,
incremental values to each kind, starting with technical markers like EOF (End of File) and ILLEGAL (for unknown characters).
type TokenKind byte
// Check the source code to see all the Tokens!
const (
EOF TokenKind = iota // EOF = End Of File
ILLEGAL
// Special
CRLF
TEXT
// Commands
keyword_beg
IDENT
SET
GET
INCR
INCRBY
... byte for TokenKind?
You might wonder why we chose a byte instead of a string or a standard int. There are two primary engineering
reasons:
- Memory Efficiency: Since our Lexer will generate a new instance for every single word or symbol in a command, using a 1-byte type keeps our memory footprint as small as possible.
- Sufficient Capacity: A
byte( an alias foruint8) can represent up to 256 unique values. Since our Redis clone will only have a few dozen command types and symbols, this is the perfect, lightweight fit for ourTokenKind.
Knowing this, we need a container to hold both the TokenKind of the token and the raw data it represents, called the Literal.
The Token struct is composed of:
-
Literal: The raw string value extracted from the input (e.g.,GETormykey). -
Kind: The TokenKind (our byte-sized Enum) that tells the parser how to interpret that literal.
🔥 Fun Fact: Even thought i just said that the Token struct should be as lightweight as possible, this is not the most lightweight
representation for a Token.
The Golang Parser uses an amazing technique: instead of storing
a piece of text in the struct as we do, they just store the indices from where the piece of text beings and ends, a simple and ingenious trick if
you want to implement it, Check the details here.
But now that we have defined our tokens it's time to use them!
What is a Lexer?
Think of the Lexer (or Scanner) as "your eyes" reading through a page.
If you read the phrase "Redis is fast", you don't think about as the 13 characters and spaces. Instead, you instantly group those characters into three distinct concepts: [Redis], [is], and [fast].
In computer science, this process is known as Lexical Analysis (or "Lexing"). A Lexer performs this exact task for our Commands:
it takes a unstructured string of text and transforms it into an organized list
of Tokens.
Here's an example. This command is given to the Lexer:
The output that the Lexer produces kinda looks like this:
SET_Tokens = [
# (Skipping the underlying RESP protocol details for now)
SET,
Identifier("username"),
Identifier("Alejandro"),
Argument("EX"),
Integer(20)
]
To achieve this result we will use a Two pointer strategy.
You might wonder why we don't just use a single index to walk through the string.
While a single pointer can identify a single character, it cannot predict what
comes next. By using a "Two-Pointer" approach, we gain a Lookahead of 1 (k=1):
- Multi-byte Patterns: The Redis protocol (RESP) relies heavily on
\r\nto terminate commands. With two pointers, our "scout" (readPos) can verify the\nbefore our "focus" (pos) even leaves the\r. - Zero-Copy Slicing: Instead of copying characters into a temporary
buffer as we find them, we simply track the "start" and "end" of a Token. This
allows us to use Go’s
input[pos:readPos]syntax, which creates a lightweight "view" of the data rather than a memory-heavy copy.
But let's take a look at the Lexer mechanics to see how all of this apply.
This animation captures the "heartbeat" of our Lexer: scanning characters,
identifying patterns, and bundling them into structured Tokens.
Feel free to play with the visualization above if you have any questions
about the Lexer Mechanics. Watching how the pos and readPos pointers
coordinate to "snip" out identifiers like username is often more helpful than reading
fifty lines of code.
Once you have a feel for the rhythm of the pointers, let's look at how we translate
this logic into idiomatic Go code.
How to use our Tokens: Building a Lexer
To transform raw input like SET username Alejandro into a predictable
stream of tokens, we need a Lexer (often called a Scanner).
If Tokens are the nouns of our language, the Lexer is the eye that identifies
them. It scans the input text and converts it into a structured stream of data
that our Parser can actually understand.
The Engine Room: Key Lexer Mechanics
To navigate through strings efficiently, our Lexer tracks its internal state using three specific fields:
-
pos: The starting index of the character we are currently examining. -
readPos: Our "look-ahead" pointer. It sits one step ahead of pos to see what’s coming next (crucial for detecting\r\nsequences). -
ch: The actual byte value at our current position.
By maintaining these two pointers, we can perform substring extraction with
zero memory waste. To isolate a word like userName, we
simply "slice" the input string: input[pos:readPos]
type Lexer struct {
input string // Current command as raw text
pos int
readPos int
ch byte
} The Constructor: Initializing the API
To make our Lexer a reliable API, we have to create some helper functions first.
-
trimInput(): This helper removes trailing whitespaces to the right of the command. This prevents the Lexer from processing "ghost" characters at the end of a line. -
The
New()constructor initializes thelexerby callingnext(), which handles character-by-character advancement through the input string. Here’s how it works:
func trimInput(input string) string {
return strings.TrimRightFunc(input, func(r rune) bool {
return r == ' ' || r == '\t'
})
}
func New(input string) *Lexer {
l := &Lexer{
input: trimInput(input),
pos: 0,
readPos: 0,
}
l.next()
return l
}
Now that we have our New() constructor, we can define next().
func (l *Lexer) next() {
if l.readPos >= len(l.input) {
l.ch = byte(token.EOF)
return
} else {
l.ch = l.input[l.readPos]
}
l.pos = l.readPos
l.readPos++
}
The next() method is the internal engine that drives character-by-character
advancement through the input string. It handles two critical cases:
- End of File (EOF): If
readPosexceeds the input length,chis set totoken.EOF(the ASCII Null character). This signals that the input string has been fully consumed. - Advancing: Otherwise,
chis updated to the character atreadPos.posthen catches up toreadPos, andreadPosincrements to "scout" the next index.
With character advancement handled by next(), we'll now
implement NextToken() to scan and return structured tokens from
the input.
Note: If you review the code implementation, you'll notice that I have skipped some code. This is because the Lexer is
used in both the Serializer and Parser.
However, the ideal approach is to create separate implementations for them,
since this allows you to leverage the full potential of RESP.
func (l *Lexer) NextToken() token.Token {
// First Operation
if l.ch == byte(token.EOF) {
return token.New(token.EOF, "")
}
// Second Operation
l.skipWhitespaces()
// Third Operation
if l.ch == '\r' && l.peekChar() == '\n' {
l.next()
return token.New(token.CRLF, "\r\n")
}
...
}
The first thing you can see is that the NextToken() returns
a token.Token we have defined here.
Also this method performs three key operations:
-
If
l.chistoken.EOF, this indicates that the input string has been fully consumed. In this case, an emptyTokenwith itsKindset totoken.EOFis returned. -
Skips any leading white space characters (e.g., spaces, tabs
\t, newlines\n). -
Using the
peekChar()function which returns us the next character without advancingreadPos, we can check if the current character is\rand the next character is\n(forming\r\nknown as CRLF), which acts as aTokenseparator, so a CRLF token is returned.
EmptyToken{
Kind: token.EOF,
Literal: "",
}
CRLFToken{
Kind : token.CRLF,
Literal : "\r\n",
}
Now, we need to be able to create Tokens for like SET, userName, and Alejandro. As you may
have noticed, what these all have in common is that they are composed of letters.
To handle the identification of keywords and identifiers more effectively, let's
create two utility functions.
func isLetter(ch byte) bool {
return ('a' <= ch && ch <= 'z') || ('A' <= ch && ch <= 'Z') || ch == '_'
}
func (l *Lexer) readIdent() string {
pos := l.pos
for isLetter(l.ch) {
l.next()
}
return l.input[pos : l.pos]
}
This is a simplified implementation, if you wanna see it whole i recommend you
to check the implementation details.
For now let's see what is happening here.
-
isLetter: This function checks if the given character (ch) is a letter(a-z, A-Z)or an underscore (_). -
readIdent: This method reads an entire identifier (a sequence of letters and underscores) from the input
Now that we have implemented these utility functions, let's finally examine
the second part of the NextToken() implementation.
func (l *Lexer) NextToken() token.Token {
// ... (previous logic for EOF, whitespace, CRLF)
if isLetter(l.ch) {
t.Literal = l.readIdent()
t.Kind = token.LookupIdent(t.Literal)
return t
}
// ... (Handeling numbers)
t.Literal = ""
t.Kind = token.ILLEGAL
return t
} NOTE: Implementing Numbers is pretty similar to the characters implementation.
This completes the core logic for identifying keywords and identifiers. If the
lexer doesn't recognize any token (including keywords, identifiers or numbers), it defaults to returning a token.ILLEGAL. As you may have noticed, we haven't yet defined token.LookupIdent. Let's do that now.
var keywords = map[string]TokenKind{
"GET": GET,
"GETSET": GETSET,
"GETEX": GETEX,
"GETDEL": GETDEL,
"SET": SET,
"DEL": DEL,
"INCR": INCR,
"INCRBY": INCRBY,
"DECR": DECR,
"DECRBY": DECRBY,
// ... (The rest of the Commands and Keywords)
}
func LookupIdent(ident string) TokenKind {
kw, ok := keywords[ident]
if ok {
return kw
}
return IDENT
}
As you can see this function is pretty straight forward, and with this we have
finished our Lexer.
So, after all of this work we can finally implement a Parser.
What is a Command
Do you remember the Case Definition we proposed ? Let me give
you a hint.
Case Definition: Parsing is the process of structuring a
Redis Command in accordance with a given Command Structure.
Structure and Redis Command are highlighted for these a specific reason:
- To create a good Structure we have to define all of the required fields that a Command must have.
- To create a good Redis Command we'll use a function that checks all of the required fields to successfully create a Command.
To achieve the first of these requirements we are going to create a Command Interface which is going to identify all of the possible Commands.
type Node interface {
String() string
TokenLiteral() string
}
type Command interface {
Node
cmdNode()
} Node represents any parseable unit (Command or Expressions).
-
Stringreturns a Human-readable representation of the Command. -
TokenLiteralreturns the Literal of the prefixToken(SET,GET,INCR, etc.)
Command marks Redis Commands nodes (GET, SET, etc.)
-
The
Nodeinterface is embedded intoCommandas Command is also a Node. -
cmdNode()is an empty method that differentiates Commands from Expression (which we will see soon).
Now that we have created our Command interface, we can start
implementing Commands.
NOTE: Through the rest of the article we will just implement the
GET command, but all the Command implementations are similar,
and you can check them here.
So... Let's do it!
Defining GET
GET Command Reminder: The GET Command takes a key as a parameter and returns the value associated to that key.
To implement the Command GET we first have to create a Structure which implements the Command interface, then we will create
a function that takes a Stream of Tokens and returns a GetCommand.
As you can see the GetCommand structure stores all the relevant
information to represent the GET Command Structure.
But it still does not implement the Command interface.
func (gc *GetCommand) cmdNode() {}
func (gc *GetCommand) TokenLiteral() string {
return gc.Token.Literal
}
func (gc *GetCommand) String() string {
var sb strings.Builder
sb.WriteString(gc.Token.Literal)
sb.WriteByte(' ')
sb.WriteString(gc.Key)
return sb.String()
} There are a lot of thing that are happening right here so let's break it down!
-
cmdNode: This is correctly implemented as an empty method to satisfy theCommandinterface, serving its purpose of differentiating Commands from Expressions. -
TokenLiteral: This method returns the literal value of the command's token (e.g.,GET), adhering to theNodeinterface. -
String: This method provides a human-readable representation of the GET command (e.g.,GET mykey), which is crucial for debugging and logging, as required by the Node interface. The use of strings.Builder is a good practice for efficient string concatenation in Go.
Now that our GetCommand satisfies the Command interface. It's time to put it to work.
Implementing a Parser
Now that we have create the Command we are going to Parse (GET) it's time to create the Parser!
The Parser logic is pretty similar to the Token's one, let's see why.
- They both use a two index approach: In the same way that the
lexeruses aposandreadPosthe Parser usescurTokandreadTokto parse the incoming input string. - They both use special functions: To being able to parse a Command, well create a specific function for each command that tries to parse it or throws an error.
NOTE: I'm going to present you a simplified implementation to avoid boilerplate.
Knowing this, let's define the Parser!
-
*lexer.Lexer: The lexer contains theNextToken()method that converts raw input (likeSET foo bar) into a stream of parsed tokens (likeSET,foo,bar). -
curTok: Represents the token being processed right now—like the word your eyes are focused on while reading. Determines the parser’s immediate action (e.g., "Is this a SET or GET?"). -
readTok: Represents the upcoming token, peeked at but not yet consumed. Acts like a "sneak preview" to resolve ambiguities (e.g., seeing if theSETCommand contains an optional argument).
Now that we have our definition of our Parser, it's time to
initialize it!
func (p *Parser) next() {
p.curTok = p.readTok
p.readTok = p.l.NextToken()
}
func New(l *lexer.Lexer) *Parser {
p := &Parser{l: l}
p.next()
p.next()
return p
} The initialization of the Parser is pretty simple, let's take a brief look:
-
next(): This function advances the Tokens by using theNextToken()method of lexer. -
New(): This method takes a*lexer.Lexeras argument and then it initializes thecurTokandreadTokby using thenext()method.
With this we are almost done, we just have to create a function that depending
on the first Token it selects the according parsing function.
Before creating the function that parses the GET command we
have to create a function that Parses all the possible Commands, we will call it
Parse:
func (p *Parser) Parse() (ast.Command, error) {
// ( 1 ) Checks if the Command starts with a Array Representation
if err := p.parseBegCommand(); err != nil {
return nil, err
}
// ( 2 ) Checks if the Command continues with a BulkString
if err := p.skipBulkString(); err != nil {
return nil, err
}
// ( 3 ) Parses the Command
cmd, err := p.parseCommand()
if err != nil {
return nil, err
}
return cmd, nil
} This function can seem tough but it just handles these three key parts:
-
p.parseBegCommand(): Checks if the Command starts with anArrayset of Tokens (e.g.,*2\r\n) if it does, it advances the Tokens till the next set of Tokens ($3\r\n). -
p.skipBulkString(): Checks if the current set of Tokens denotes aBulkString(e.g,$3\r\n), advancing the Tokens as in the last point. -
p.parseCommand(): After all of the executions of the previous functions, this function checks if the currentTokenis a command prefix (e.g,SET,GET) executing its corresponding parsing function (defined down bellow).
We do all of this because in this stage we will get the Command in a RESP format:
*2\r\n -- $3\r\nGET\r\n -- $3\r\nkey\r\n
# | | |
# We Check all of this parts individualy
With this we are almost finished with Parsing, we just have to create the p.parseCommand() and p.parseGetCommand(). So, let's do it!
The definition of p.parseCommand() is actually pretty simple.
func (p *Parser) parseCommand() (ast.Command, error) {
switch p.curTok.Kind {
case token.GET:
return p.parseGetCommand()
...
}
return nil, fmt.Errorf(
"command not supported. got=%d (%q)",
p.curTok.Kind, p.curTok.Literal,
)
}
As you can see the p.parseCommand() returns a ast.Command and a error, then using a switch statement
we determine the corresponding Parsing Function for that respective
Command.
If the Command is not found: we return a "Command not supported"
error.
Knowing this, let's define p.parseGetCommand() and finish this
whole Parsing section.
func (p *Parser) parseGetCommand() (*ast.GetCommand, error) {
getCmd := &ast.GetCommand{Token: p.curTok}
p.next()
if err := p.checkCRLF(); err != nil {
return nil, err
}
key, err := p.parseIdent()
if err != nil {
return nil, err
}
getCmd.Key = key
return getCmd, nil
}
As you can see the parseGetCommand function returns the *ast.GetCommand structure we have defined before and an error.
There are also some special functions we will now see:
func (p *Parser) checkCRLF() error {
// Checks if the current Token is CRLF
if !p.curTokIs(token.CRLF) {
return fmt.Errorf(
"token expected=%d ('CRLF'). got=%d (%s)",
token.CRLF,
p.curTok.Kind,
p.curTok.Literal,
)
}
p.next()
return nil
}
func (p *Parser) parseIdent() (string, error) {
if err := p.skipBulkString(); err != nil {
return "", err
}
if !p.curTokIs(token.IDENT) {
return "", fmt.Errorf(
"token expected=%d ('IDENT'). got=%d (%q)",
token.IDENT,
p.curTok.Kind,
p.curTok.Literal,
)
}
return p.curTok.Literal, nil
} These two functions are widely used in the whole Parser, let's explain them quickly:
-
checkCRLF: Checks if the current Token is\r\n(CRLF) and advances to the next Token, if not it returns a "token expected" error. -
parseIdent: It first uses thep.skipBulkStringfunction (defined bellow) to checks if the following set of Tokens describes aBulkStringand advances the Tokens till the actual text.
Then it checks if the current Token is an Identifier (token.IDENT) that contains the actual key of theGETcommand returning the command if parsed successfully.
Now the only thing left is creating the skipBulkString function
which you may have seen in the last snippets.
func (p *Parser) skipBulkString() error {
if !p.curTokIs(token.BULKSTRING) {
return p.isNotBulkStringErr()
}
p.next()
if !p.curTokIs(token.NUMBER) {
return p.isNotNumberErr()
}
p.next()
if err := p.checkCRLF(); err != nil {
return err
}
return nil
} As you can see this functions just returns a error if some of the operations fail. Let's see what it does:
-
Checks if the
curTokis a$(BulkString) advancing the Tokens. -
Checks if the
curTokis aNumberrepresenting the length of the actual text and then it advances the Tokens with thep.next()method. -
Checks if the
curTokis a\r\n(CRLF) therefore checking if it's a valid BulkString.
This have been a tough section, so let's do a recap.
Parser Recap
So, we have gone a long way, don't worry if you think that you didn't get Parsers, in fact, this is quite a complicated topic.
So let's see how Parsers work in a more graphical way.
Parsers consume Tokens: To be able to Parse complex Command we
use a curTok and nextToken.
As you can see by default we put the curTok and nextTok in the first two Tokens of the Command (in this example GET and userName).
We use the content of this Tokens to fill the GetCommand structure
with its representing data.
You also have to remember that this is the raw representation
of the Command.
In fact the Parser will get something like this.
We need to advance through the Tokens to be able to Parse the whole Command,
so we need a function that advances our 'two-token window' (curTok and nextTok).
The next() method achieves this purpose each time we call it.
After calling next():
As you can see after calling next() now both Tokens have advances
one position.
One we encounter a GET token, we execute a special function
that parses the required fields of the Command, in this case parseGetCommand().
NOTE: As you may have guested all of the words and symbols are Tokens and Parsing is the process of going through them and storing the useful information.
With this, we are almost done with this article, we just need to Evaluate Commands.
Creating Values
This is the part of the article when everything comes together, but before we do that we have to create a way to store the value of our keys. So, let's do it!
All of the Values should have its corresponding kind we will focus on Nil and
String value kinds, let's define the value kinds first and then we will store its values.
type valueKind byte
const (
StringKind valueKind = iota
NilKind
// ... The rest of the Value Kinds
) If you notice this is actually pretty similar to the Token logic, in fact, it is. So, let's break it down.
-
valueKind: Is anEnum(Enumeration) of the available Value Kinds. -
StringKind: Represents theStringValue Kind. -
NilKind: This special kind signifies the absence of a value if a key does not exists we will return aNilVal(a value ofNilKind).
We have mention a Value a couple of times. So, let's define
them too!
As you can see Values are pretty similar to Tokens.
-
Kind(): This method returns thevalueKindof the value. -
String(): This method returns the String representation of the current Value.
As Value is an interface we have to create its corresponding
struct in this case String and NilVal, this structure is going to be the way we are going to store values.
type String struct {
Value string
}
// Returns the String Value stored
func (s *String) String() string { return s.Value }
func (s *String) Kind() valueKind { return StringKind }
type NilVal struct{}
// Retuns the (nil) String
func (n *NilVal) String() string { return "(nil)" }
func (n *NilVal) Kind() valueKind { return NilKind }
As you can see both String and NilVal now
implement the Value interface.
-
String: It stores a string in itsValuefield. -
NilVal: It does not represent any Value asNilrepresents the absence of one.
We are almost done with values! Let's just create a helper function to create
new values easily, we will call it NewValue.
func NewValue(v ast.Expression) Value {
switch v := v.(type) {
case *ast.StringExpr:
return &String{v.String()}
// Creating the rest of the values...
}
return nil
}
Have you notice it? This function takes an Expression as argument
which determines what kind of Value gets returned.
If the given Expression is not supported nil is returned, but
as all the nil values are the same so for that reason we will
create a general nil value.
Have you notice it? This function takes an Expression and we
still have not explore what a Expression does, so let's do it.
Also this function uses the Expression type to decide which Value
should be returned, if the Expression is not supported it returns
nil.
Defining Expressions
Do you remember when we were defining Commands and we briefly
mentioned Expressions, this is the time when we define
them!
Commands like GET receive a key parameter which
can be represented a String. But commands like SET can receive as a parameter any value (Number, String, Bool, etc...), we wrap all of these values as Expressions.
type Node interface {
String() string
TokenLiteral() string
}
type Expression interface {
Node
exprNode()
} Expression is also a Node and it uses the
exprNode() method to differentiate from Commands.
Now that we have defined Expressions let's implement them!
type StringExpr struct {
Token token.Token
}
func (se *StringExpr) exprNode() {}
func (se *StringExpr) TokenLiteral() string {
return se.Token.Literal
}
func (se *StringExpr) String() string {
return se.Token.Literal
}
In this article we will just implement StringExpr as we will
just support strings. So, let's break it down quickly.
-
StringExpr: The string it represents is stored in theTokenfield as it contains the whole string. -
exprNode: This method ensures thatStringExprimplements theExpressioninterface, in the same way asCommand. -
String: This method provides a string representation of the actualExpression.
If you want to know the use of Expressions in Commands i highly recommend you to check out the source code!
Now that we have created our Values it's time to store them!
Storing Values
We are going to take a simple approach when it comes to storing Values, we are going to create a concurrency safe map which maps string keys to Value pointers.
NOTE: This implementation was written before the weak package was released, so keep in mind that now there is a better way to implement the storage of keys.
Knowing this, let's implement our Storage.
As you can see the storage is just a map of Values, but
above it you can see a RWMutex.
A Mutex (mutual exclusion) is one of the most complex and deep
concepts in all of computing, so let's try to explain them quickly.
The Problem of the Five Philosophers and the Limited Forks:
Imagine five brilliant philosophers sitting around a circular table, ready to enjoy some delicious food. Between each philosopher is a single fork. To eat, a philosopher needs two forks: one from their left and one from their right. The problem? There are only five forks for five philosophers, and each fork is a shared resource.
Now, let's say all five philosophers get hungry at the same time
and each reaches for the fork on their left. They all succeed! But now, they're
stuck. Each philosopher has one fork, but they all need a second fork (from their right), which is currently held by their neighbor. No one can eat, and no one can
release their fork because they're all waiting for another – this is a classic
deadlock. They're all holding onto a resource
and waiting indefinitely for another resource that's also held.
Here's how a mutex acts as the polite, orderly "fork manager":
- Acquiring the Mutex (Picking up a Fork): When a philosopher
wants to pick up a fork, they first try to 'lock' its
mutex. If the mutex is unlocked (meaning the fork is available), they successfully acquire themutexand pick up the fork. - Mutual Exclusion Ensured: If the
mutexis already locked (meaning another philosopher is holding that fork), our hungry philosopher must wait until themutexis released. This ensures that only one philosopher can ever hold a specific fork (resource) at a time. - Releasing the Mutex (Putting down a Fork): Once a philosopher
finishes eating, they 'unlock' the
mutexesfor both fork they were holding. This signals to any waiting philosophers that those forks are now available.
By implementing a strategy where philosophers acquire both necessary fork
mutexes before attempting to eat (or acquiring them in a specific
order, or acquiring a global eating permission mutex), we prevent
the deadlock. The mutex ensures that critical sections of code (like picking up
a shared fork) are accessed by only one goroutine (philosopher)
at a time, preventing race conditions and resource contention.
In our Redis clone, the sync.RWMutex acts precisely like these
fork managers, ensuring that when multiple parts of our program
try to read from or write to our Storage map simultaneously,
they do so in an orderly, deadlock-free fashion, just like
our philosophers eventually get to enjoy their food!
Now that we have scratched the surface of Mutexes, we are
ready to finish our Storage!
func (s *Storage) Get(k string) (Value, bool) {
// Locks the map
s.mu.RLock()
val, ok := s.storage[k]
// Unlocks the map
s.mu.RUnlock()
if !ok {
return nil, ok
}
return val, ok
}
As you can see, the Get method is pretty simple, let's break
it down!
-
s.mu.RLock(): This method locks theMutexto get the value from the storage. -
s.mu.RUnlock(): This method unlocks the Mutex, freeing the value. -
At last, if the value exists, it is returned; if not,
nilis returned.
The only thing left is creating a New() function that initializes
Storage.
The New() function creates the storage map;
we don't need to create the Mutex because its zero value is
sufficient for use. With this, we are now ready to evaluate Redis Commands!
NOTE: I highly recommend you to look up at the whole implementation of Storage as it has all of the Command storage logic.
Evaluating Commands
This is the last section of this article, you can't imagine how excited I am!
Evaluating Commands can seem a little bit intimidating, but in
fact is pretty simple. We will first define a Evaluator struct
that contains all of our Evaluation Logic.
type Evaluator struct {
s *storage.Storage
}
func New() *Evaluator {
return &Evaluator{
s: storage.New(),
}
}
As you can see Evaluator uses the
Storage because evaluating a command is just creating and editing
key-values.
func (e *Evaluator) Eval(cmd ast.Command) (string, error) {
switch cmd := cmd.(type) {
case *ast.GetCommand:
return e.evalGetCommand(cmd)
}
return "", fmt.Errorf(
"command not supported for evaluation. got=%T",
cmd,
)
} Evaluator has a method Eval that depending
on the given command (GET, SET, INCR) it executes its corresponding eval function (evalGetCommand()).
Now the only thing left is define the evalGetCommand(), so
let's do it!