
Redis: The Swiss Army Knife of Data


Preface

  1. Introduction
  2. What and Why is Redis?
  3. The Redis Serialization Protocol (RESP)
  4. How does RESP Work?
  5. What is a Parser?
  6. What is a Token?
  7. What is a Lexer?
  8. How to use our Tokens: Building a Lexer
  9. What is a Command?
  10. Defining GET
  11. Implementing a Parser
  12. Parser Recap
  13. Creating Values
  14. Defining Expressions
  15. Storing Values
  16. Evaluating Commands

Introduction

Redis revolutionized the database world by embracing a radical idea: complexity should exist within simplicity. This article serves as your entry point into that philosophy. We will reverse-engineer Redis step by step—starting from a simple TCP socket, implementing the Redis Serialization Protocol (RESP), and finally evaluating core commands—all built in Golang using elegant, minimalist solutions.

Note: 👋 This article is designed for anyone curious about how Redis works under the hood. Instead of overwhelming you with thousands of lines of code, we will focus on the essential algorithms and architectural principles that power this Redis clone.

Want to dive deeper? Check out the full project on GitHub! 🚀

What and Why is Redis?

Before Redis, scaling applications presented a significant bottleneck. As usage soared, traditional infrastructure often struggled under the weight of high-frequency tasks like managing shopping carts, user sessions, and real-time leaderboards. These systems typically relied on slow, disk-bound databases and inconsistent, fragmented caching layers.

Redis provided the breakthrough. It is a blazingly fast, in-memory database built for simplicity: featuring atomic operations, a lightweight protocol (RESP), and the raw speed of RAM. By offloading ephemeral data—such as sessions and cached queries—Redis became the essential engine allowing systems to scale beyond conventional limits. 🌟

The Redis Serialization Protocol (RESP)

Think of computer communication like a professional conference. For participants to exchange ideas effectively, they need a shared language. In the world of Redis, that language is RESP.

In our architecture, the Redis Server acts as the keynote speaker, while the Redis Clients are the participants. For them to understand each other, every interaction—or connection—must follow the strict rules of RESP.

[Diagram: the Redis Server and its Clients exchanging messages over RESP]

How does RESP Work?

Redis commands like GET, SET, and PING aren't just strings; they follow a specific grammar. Let’s look at how a simple PING command is structured in RESP:

The PING Command:

Ping Command In RESP
*1\r\n$4\r\nPING\r\n

At first glance, this looks like "computer gibberish", but it’s actually a highly efficient way to structure a request. In RESP, every command is sent as an Array, even if it only contains a single string.

Here's the breakdown:

Serializing PING:

Element   Type         Description
*1        Array        Tells the server an array with 1 element is coming.
\r\n      CRLF         The Protocol Delimiter. It marks the end of a header or data block.
$4        Bulk String  Tells the server the next item is a string of 4 bytes.
\r\n      CRLF         Separates the length indicator from the actual data.
PING      Data         The actual payload.
\r\n      CRLF         The final terminator for the entire command.
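
If you have a real Redis server running locally, you can verify this byte-for-byte. The sketch below (a standalone program, assuming Redis listens on the default localhost:6379) opens a raw TCP connection, writes the serialized PING, and reads back Redis's +PONG reply:

Sending PING by hand
package main

import (
  "bufio"
  "fmt"
  "net"
)

func main() {
  // Assumes a Redis server is listening on the default port.
  conn, err := net.Dial("tcp", "localhost:6379")
  if err != nil {
    panic(err)
  }
  defer conn.Close()

  // Write the exact bytes we just broke down.
  fmt.Fprint(conn, "*1\r\n$4\r\nPING\r\n")

  // Redis replies with a Simple String: "+PONG\r\n".
  reply, err := bufio.NewReader(conn).ReadString('\n')
  if err != nil {
    panic(err)
  }
  fmt.Print(reply) // +PONG
}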

While PING is a great starting point, most Redis interactions involve data. Let's see how to parse a more complex command like GET.

Get Command Structure
  GET       <key>
   ▲          ▲
Command   BulkString

In RESP, this is represented as an array of two bulk strings:

Serialized GET command
*2\r\n$3\r\nGET\r\n$5\r\nmykey\r\n

At first, this looks significantly more complex, but the logic remains identical. We are simply nesting more data inside our Command Array.

Serializing GET:

Element   Type          Description
*2        Array         Declares an Array with 2 elements (the second element is the key).
\r\n      CRLF          Protocol Delimiter.
$3        Bulk String   Declares the first element is 3 bytes long.
\r\n      CRLF          Protocol Delimiter.
GET       Data          The actual command the Redis instance is going to execute.
\r\n      CRLF          Protocol Delimiter.
$5        Bulk String   Declares that the argument is 5 bytes long.
\r\n      CRLF          Protocol Delimiter.
mykey     Key Argument  The key the GET command looks up in memory. In our example, that key is mykey.
\r\n      CRLF          The final terminator for the entire command.

While PING and GET provide a solid foundation, they only scratch the surface of what we are building.
Here are some examples of other Redis Commands:

Redis commands
SET <key> <value>
GETSET <key> <value>
GETDEL <key>
INCRBY <key> <increment>
INCR <key>

Later in this article, we will dedicate specific sections to implementing the full range of commands. To finish our look at RESP, let's see how the protocol scales with a 3-part command like SET:

Try to serialize this SET Command:
SET userName Alejandro

Give it a try before you see the solution.

Serializing SET:

Element    Type            Description
*3         Array           Declares an Array with 3 elements.
\r\n       CRLF            Protocol Delimiter.
$3         Bulk String     Declares the following Bulk String is 3 bytes long.
\r\n       CRLF            Protocol Delimiter.
SET        Command         The command the Redis instance is going to execute.
\r\n       CRLF            Protocol Delimiter.
$8         Bulk String     Declares that the first argument of our SET command is 8 bytes long.
\r\n       CRLF            Protocol Delimiter.
userName   Key Argument    The key the value is going to be stored under.
\r\n       CRLF            Protocol Delimiter.
$9         Bulk String     Declares that the second argument is 9 bytes long.
\r\n       CRLF            Protocol Delimiter.
Alejandro  Value Argument  The value to be stored.
\r\n       CRLF            The final terminator for the entire command.

It might seem like a long process, but gluing it all together yields the final result.

SET Command Serialized:
*3\r\n$3\r\nSET\r\n$8\r\nuserName\r\n$9\r\nAlejandro\r\n
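
Notice the mechanical pattern: one array header with the element count, then a $<length>\r\n<payload>\r\n pair per element. A small helper (hypothetical, not part of our clone's API, using the fmt and strings packages) can serialize any command this way:

RESP serializer sketch
// serialize turns command parts into their RESP representation.
func serialize(parts ...string) string {
  var sb strings.Builder

  // Array header: the number of elements.
  fmt.Fprintf(&sb, "*%d\r\n", len(parts))

  // One Bulk String per element: byte length, then payload.
  for _, p := range parts {
    fmt.Fprintf(&sb, "$%d\r\n%s\r\n", len(p), p)
  }

  return sb.String()
}

// serialize("SET", "userName", "Alejandro") returns
// "*3\r\n$3\r\nSET\r\n$8\r\nuserName\r\n$9\r\nAlejandro\r\n"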

If you are still confused about how RESP works, I highly recommend consulting the official Redis article about RESP. You can find it here! 🚀

If you're wondering why we're doing all of this, remember that computers communicate with each other in a shared language.
This language has to be efficient and as expressive as possible, which is crucial for optimizing the Redis Parser.

But wait, what is a Parser?

What is a Parser?

Definition: Parsing is the process of structuring a linear representation in accordance with a given grammar.

WHAT? Even if this definition sounds a little complex, it is, in fact, pretty simple.
A Linear Representation can be a sentence of a language, a piece of music, or a Redis Command.
And grammar refers to the structure of a sentence or a command; think of it as a formula for generating different sentences.

So, applied to our use case:
Case Definition: Parsing is the process of structuring a Redis Command in accordance with a given Command Structure.

This is the theory, but in practice, we'll have an input string that we'll try to parse and the output will be a representation of the Redis Command known as an Abstract Syntax Tree (AST).

[Diagram: Parser control flow, from input string through Tokens to the AST]

But we haven't implemented anything yet, and we still don't know what a Token is. So...

What is a Token?

A Token is the smallest atomic unit of a command. Think of them as the fundamental building blocks—like verbs (GET) and nouns (mykey) —that our parser uses to reconstruct a client’s command.

There are several kinds of tokens in our system:

  • Commands: The instructions (e.g., GET, SET, PING).
  • Identifiers: The keys or values (e.g., userName, Alejandro).
  • Numbers: Numeric values used in operations (e.g., 100).
[Diagram: SET userName Alejandro split into a SET token and two Identifier tokens]

As shown in the diagram above, the command SET userName Alejandro is not treated as one long string. Instead, the Lexer identifies three distinct tokens: a specific SET command followed by two Identifiers.

Defining Token Kinds

Now that we’ve defined the concept of a Token, we need a way to categorize them in our code. In Go, we do this using Enums (enumerated constants).

We will create a TokenKind type to represent every possible category our parser might encounter—such as Commands, Identifiers, or Numbers. By using Go's iota keyword, we can efficiently assign unique, incremental values to each kind, starting with technical markers like EOF (End of File) and ILLEGAL (for unknown characters).

Redis Tokens
type TokenKind byte

// Check the source code to see all the Tokens!
const (
  EOF TokenKind = iota // EOF = End Of File
  ILLEGAL

  // Special
  CRLF
  TEXT
  
  // Commands
  keyword_beg
  IDENT
  SET
  GET
  INCR
  INCRBY
  // ...
)

Why use a byte for TokenKind?

You might wonder why we chose a byte instead of a string or a standard int. There are two primary engineering reasons:

  1. Memory Efficiency: Since our Lexer will generate a new instance for every single word or symbol in a command, using a 1-byte type keeps our memory footprint as small as possible.
  2. Sufficient Capacity: A byte (an alias for uint8) can represent up to 256 unique values. Since our Redis clone will only have a few dozen command types and symbols, this is the perfect, lightweight fit for our TokenKind.

Knowing this, we need a container to hold both the TokenKind of the token and the raw data it represents, called the Literal.

Token Struct Definition
type Token struct {
  Kind    TokenKind
  Literal string
}

The Token struct is composed of:

  1. Literal: The raw string value extracted from the input (e.g., GET or mykey).
  2. Kind: The TokenKind (our byte-sized Enum) that tells the parser how to interpret that literal.
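
Snippets later in this article call a token.New constructor. Its minimal form is just a convenience wrapper around the struct literal:

Token constructor
// New builds a Token from its kind and raw literal.
func New(kind TokenKind, literal string) Token {
  return Token{Kind: kind, Literal: literal}
}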

🔥 Fun Fact: Even though I just said that the Token struct should be as lightweight as possible, this is not the most lightweight representation for a Token.
The Go parser uses an amazing technique: instead of storing a piece of text in the struct as we do, it just stores the indices where the piece of text begins and ends. A simple and ingenious trick; if you want to implement it, check the details here.

But now that we have defined our tokens it's time to use them!

What is a Lexer?

Think of the Lexer (or Scanner) as "your eyes" reading through a page.

If you read the phrase "Redis is fast", you don't process it as 13 individual characters and spaces. Instead, you instantly group those characters into three distinct concepts: [Redis], [is], and [fast].

In computer science, this process is known as Lexical Analysis (or "Lexing"). A Lexer performs this exact task for our Commands: it takes an unstructured string of text and transforms it into an organized list of Tokens.

Here's an example. This command is given to the Lexer:

SET command example
SET username Alejandro EX 20

The output that the Lexer produces looks something like this:

Tokens of the SET Command
SET_Tokens = [
  # (Skipping the underlying RESP protocol details for now)
  SET,
  Identifier("username"),
  Identifier("Alejandro"),
  Argument("EX"),
  Integer(20)
]

To achieve this result we will use a Two-Pointer strategy.
You might wonder why we don't just use a single index to walk through the string. While a single pointer can identify a single character, it cannot predict what comes next. By using a Two-Pointer approach, we gain a Lookahead of 1 (k=1):

  1. Multi-byte Patterns: The Redis protocol (RESP) relies heavily on \r\n to terminate commands. With two pointers, our "scout" (readPos) can verify the \n before our "focus" (pos) even leaves the \r.
  2. Zero-Copy Slicing: Instead of copying characters into a temporary buffer as we find them, we simply track the "start" and "end" of a Token. This allows us to use Go’s input[pos:readPos] syntax, which creates a lightweight "view" of the data rather than a memory-heavy copy.

Let's take a look at the Lexer mechanics to see how all of this applies.

The "heartbeat" of our Lexer is a simple rhythm: scanning characters, identifying patterns, and bundling them into structured Tokens. Watching how the pos and readPos pointers coordinate to "snip" out identifiers like username is often more helpful than reading fifty lines of code.
Once you have a feel for the rhythm of the pointers, let's look at how we translate this logic into idiomatic Go code.

How to use our Tokens: Building a Lexer

To transform raw input like SET username Alejandro into a predictable stream of tokens, we need a Lexer (often called a Scanner).

If Tokens are the nouns of our language, the Lexer is the eye that identifies them. It scans the input text and converts it into a structured stream of data that our Parser can actually understand.

The Engine Room: Key Lexer Mechanics

To navigate through strings efficiently, our Lexer tracks its internal state using three specific fields:

  1. pos: The starting index of the character we are currently examining.
  2. readPos: Our "look-ahead" pointer. It sits one step ahead of pos to see what’s coming next (crucial for detecting \r\n sequences).
  3. ch: The actual byte value at our current position.

By maintaining these two pointers, we can perform substring extraction with zero memory waste. To isolate a word like userName, we simply "slice" the input string: input[pos:readPos]

Lexer Initialization
type Lexer struct {
  input   string // Current command as raw text

  pos     int
  readPos int
  ch      byte
}

The Constructor: Initializing the API

To make our Lexer a reliable API, we have to create some helper functions first.

  1. trimInput(): This helper removes trailing whitespace (spaces and tabs) from the end of the command. This prevents the Lexer from processing "ghost" characters at the end of a line.
  2. The New() constructor initializes the lexer by calling next(), which handles character-by-character advancement through the input string. Here's how it works:

Lexer Constructor
func trimInput(input string) string {
  return strings.TrimRightFunc(input, func(r rune) bool {
      return r == ' ' || r == '\t'
  })
}

func New(input string) *Lexer {
  l := &Lexer{
    input:   trimInput(input),
    pos:     0,
    readPos: 0,
  }

  l.next()

  return l
}

Now that we have our New() constructor, we can define next().

next() definition
func (l *Lexer) next() {
  if l.readPos >= len(l.input) {
    l.ch = byte(token.EOF)
    return
  }

  l.ch = l.input[l.readPos]
  l.pos = l.readPos
  l.readPos++
}

The next() method is the internal engine that drives character-by-character advancement through the input string. It handles two critical cases:

  1. End of File (EOF): If readPos reaches or exceeds the input length, ch is set to token.EOF (byte value 0, the ASCII NUL character). This signals that the input string has been fully consumed.
  2. Advancing: Otherwise, ch is updated to the character at readPos. pos then catches up to readPos, and readPos increments to "scout" the next index.

With character advancement handled by next(), we'll now implement NextToken() to scan and return structured tokens from the input.

Note: If you review the code implementation, you'll notice that I have skipped some code. This is because the Lexer is used in both the Serializer and Parser. However, the ideal approach is to create separate implementations for them, since this allows you to leverage the full potential of RESP.

NextToken() First Part Definition
func (l *Lexer) NextToken() token.Token {

  // First Operation
  if l.ch == byte(token.EOF) {
    return token.New(token.EOF, "")
  }

  // Second Operation
  l.skipWhitespaces()

  // Third Operation
  if l.ch == '\r' && l.peekChar() == '\n' {
    l.next()
    return token.New(token.CRLF, "\r\n")
  }

  ...
}

The first thing you can see is that NextToken() returns the token.Token type we defined earlier.
This method also performs three key operations:

  1. If l.ch is token.EOF, this indicates that the input string has been fully consumed. In this case, an empty Token with its Kind set to token.EOF is returned.
  2. Skips any leading white space characters (e.g., spaces, tabs \t, newlines \n).
  3. Using the peekChar() function, which returns the next character without advancing readPos, we check whether the current character is \r and the next is \n (forming \r\n, known as CRLF). This sequence acts as a Token separator, so a CRLF token is returned.
First returned tokens
EmptyToken{
  Kind: token.EOF,
  Literal: "",
}

CRLFToken{
  Kind : token.CRLF,
  Literal : "\r\n",
}
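
NextToken() leans on two small helpers, skipWhitespaces() and peekChar(), that we haven't shown. Here is a minimal sketch consistent with the two-pointer design (the real implementation may differ in detail):

Lexer helper sketches
// skipWhitespaces advances past spaces, tabs, and newlines.
// Note that '\r' is deliberately NOT skipped, so the CRLF
// check in NextToken() still fires.
func (l *Lexer) skipWhitespaces() {
  for l.ch == ' ' || l.ch == '\t' || l.ch == '\n' {
    l.next()
  }
}

// peekChar returns the character at readPos without consuming it.
func (l *Lexer) peekChar() byte {
  if l.readPos >= len(l.input) {
    return byte(token.EOF)
  }

  return l.input[l.readPos]
}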

Now, we need to be able to create Tokens for words like SET, userName, and Alejandro. As you may have noticed, what these all have in common is that they are composed of letters.
To handle the identification of keywords and identifiers more effectively, let's create two utility functions.

Utility Functions
func isLetter(ch byte) bool {
	return ('a' <= ch && ch <= 'z') || ('A' <= ch && ch <= 'Z') || ch == '_'
}

func (l *Lexer) readIdent() string {
  pos := l.pos

  for isLetter(l.ch) {
    l.next()
  }

  return l.input[pos : l.pos]
}

This is a simplified implementation; if you want to see the full version, I recommend checking the implementation details.
For now, let's see what is happening here.

  1. isLetter: This function checks if the given character (ch) is a letter (a-z, A-Z) or an underscore (_).
  2. readIdent: This method reads an entire identifier (a sequence of letters and underscores) from the input.

Now that we have implemented these utility functions, let's finally examine the second part of the NextToken() implementation.

NextToken() Second Part Definition
func (l *Lexer) NextToken() token.Token {
  // ... (previous logic for EOF, whitespace, CRLF)
  var t token.Token // declared here for this excerpt; see the source for the full flow

  if isLetter(l.ch) {
    t.Literal = l.readIdent()
    t.Kind = token.LookupIdent(t.Literal)
    return t
  } 

  // ... (Handling numbers)

  t.Literal = ""
  t.Kind = token.ILLEGAL
  return t
}

NOTE: Implementing Numbers is pretty similar to the identifier implementation; a sketch follows.
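
For reference, here is that number path, mirroring isLetter and readIdent (again simplified; check the source for the full version):

Number utilities sketch
func isDigit(ch byte) bool {
  return '0' <= ch && ch <= '9'
}

// readNumber reads an entire digit sequence from the input.
func (l *Lexer) readNumber() string {
  pos := l.pos

  for isDigit(l.ch) {
    l.next()
  }

  return l.input[pos:l.pos]
}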

This completes the core logic for identifying keywords and identifiers. If the lexer cannot match any known token (keyword, identifier, or number), it defaults to returning token.ILLEGAL. As you may have noticed, we haven't yet defined token.LookupIdent. Let's do that now.

LookupIdent() Definition
var keywords = map[string]TokenKind{
  "GET":    GET,
  "GETSET": GETSET,
  "GETEX":  GETEX,
  "GETDEL": GETDEL,
  "SET":    SET,
  "DEL":    DEL,
  "INCR":   INCR,
  "INCRBY": INCRBY,
  "DECR":   DECR,
  "DECRBY": DECRBY,
  // ... (The rest of the Commands and Keywords)
}

func LookupIdent(ident string) TokenKind {
  kw, ok := keywords[ident]

  if ok {
    return kw
  }

  return IDENT
}

As you can see, this function is pretty straightforward, and with this we have finished our Lexer. Here is a quick, hypothetical look at it in motion:
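
Lexer usage sketch
// A hypothetical driver; '*' and '$' handling lives in the
// elided parts of NextToken().
l := lexer.New("*2\r\n$3\r\nGET\r\n$5\r\nmykey\r\n")

for tok := l.NextToken(); tok.Kind != token.EOF; tok = l.NextToken() {
  fmt.Printf("%d -> %q\n", tok.Kind, tok.Literal)
}
// Emits the framing tokens (*, 2, CRLF, $, 3, ...), then
// GET and the identifier "mykey".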
So, after all of this work we can finally implement a Parser.

What is a Command?

Do you remember the Case Definition we proposed? Let me give you a hint.

Case Definition: Parsing is the process of structuring a Redis Command in accordance with a given Command Structure.

Structure and Redis Command are highlighted for a specific reason:

  1. To create a good Structure we have to define all of the required fields that a Command must have.
  2. To create a good Redis Command we'll use a function that checks all of the required fields to successfully create a Command.

To achieve the first of these requirements, we are going to create a Command interface which identifies all of the possible Commands.

Command Interface
type Node interface {
  String() string
  TokenLiteral() string
}

type Command interface {
  Node

  cmdNode()
}

Node represents any parseable unit (Commands or Expressions).

  1. String returns a human-readable representation of the Command.
  2. TokenLiteral returns the Literal of the prefix Token (SET, GET, INCR, etc.)

Command marks Redis Command nodes (GET, SET, etc.)

  1. The Node interface is embedded into Command, as a Command is also a Node.
  2. cmdNode() is an empty method that differentiates Commands from Expressions (which we will see soon).

Now that we have created our Command interface, we can start implementing Commands.

NOTE: Throughout the rest of the article we will implement just the GET command, but all the Command implementations are similar, and you can check them here.
So... Let's do it!

Defining GET

GET Command Reminder: The GET Command takes a key as a parameter and returns the value associated with that key.

To implement the GET Command, we first have to create a structure which implements the Command interface; then we will create a function that takes a stream of Tokens and returns a GetCommand.

GET Definition
type GetCommand struct {
  Token token.Token
  Key   string
}

As you can see the GetCommand structure stores all the relevant information to represent the GET Command Structure.
But it still does not implement the Command interface.

GET implementation of Command
func (gc *GetCommand) cmdNode() {}

func (gc *GetCommand) TokenLiteral() string {
  return gc.Token.Literal 
}

func (gc *GetCommand) String() string {
  var sb strings.Builder

  sb.WriteString(gc.Token.Literal)
  sb.WriteByte(' ')

  sb.WriteString(gc.Key)

  return sb.String()
}

There are a lot of things happening right here, so let's break it down!

  1. cmdNode: This is correctly implemented as an empty method to satisfy the Command interface, serving its purpose of differentiating Commands from Expressions.
  2. TokenLiteral: This method returns the literal value of the command's token (e.g., GET), adhering to the Node interface.
  3. String: This method provides a human-readable representation of the GET command (e.g., GET mykey), which is crucial for debugging and logging, as required by the Node interface. The use of strings.Builder is a good practice for efficient string concatenation in Go.

Now that our GetCommand satisfies the Command interface, it's time to put it to work.

Implementing a Parser

Now that we have created the Command we are going to parse (GET), it's time to create the Parser!

The Parser logic is pretty similar to the Lexer's; let's see why.

  1. They both use a two-index approach: in the same way the Lexer uses pos and readPos, the Parser uses curTok and readTok to walk the incoming Token stream.
  2. They both use special functions: to be able to parse a Command, we'll create a specific function for each command that either parses it or returns an error.

NOTE: I'm going to present a simplified implementation to avoid boilerplate.

Knowing this, let's define the Parser!

Parser Definition
type Parser struct {
  l *lexer.Lexer

  curTok  token.Token
  readTok token.Token
}

  1. *lexer.Lexer: The lexer contains the NextToken() method that converts raw input (like SET foo bar) into a stream of Tokens (like SET, foo, bar).
  2. curTok: Represents the token being processed right now—like the word your eyes are focused on while reading. Determines the parser’s immediate action (e.g., "Is this a SET or GET?").
  3. readTok: Represents the upcoming token, peeked at but not yet consumed. Acts like a "sneak preview" to resolve ambiguities (e.g., seeing if the SET Command contains an optional argument).

Now that we have our definition of our Parser, it's time to initialize it!

Parser Initialization
func (p *Parser) next() {
  p.curTok = p.readTok
  p.readTok = p.l.NextToken()
}

func New(l *lexer.Lexer) *Parser {
  p := &Parser{l: l}
  p.next()
  p.next()

  return p
}

The initialization of the Parser is pretty simple; let's take a brief look:

  1. next(): This function advances the Tokens: readTok moves into curTok, and a fresh token from the lexer's NextToken() takes its place.
  2. New(): This function takes a *lexer.Lexer as an argument and initializes curTok and readTok by calling next() twice.

With this we are almost done; we just have to create a function that selects the appropriate parsing function depending on the first Token.

Before creating the function that parses the GET command, we have to create a function that parses all the possible Commands; we will call it Parse:

General Parse Function
func (p *Parser) Parse() (ast.Command, error) {
  // ( 1 ) Checks if the Command starts with an Array representation
  if err := p.parseBegCommand(); err != nil {
    return nil, err
  }

  // ( 2 ) Checks if the Command continues with a BulkString
  if err := p.skipBulkString(); err != nil {
    return nil, err
  }

  // ( 3 ) Parses the Command
  cmd, err := p.parseCommand()
  if err != nil {
    return nil, err
  }

  return cmd, nil
}

This function can seem tough, but it just handles these three key parts:

  1. p.parseBegCommand(): Checks if the Command starts with an Array set of Tokens (e.g., *2\r\n); if it does, it advances the Tokens until the next set of Tokens ($3\r\n). A sketch of it follows this list.
  2. p.skipBulkString(): Checks if the current set of Tokens denotes a BulkString (e.g., $3\r\n), advancing the Tokens as in the last point.
  3. p.parseCommand(): After the previous functions have run, it checks if the current Token is a command prefix (e.g., SET, GET), executing its corresponding parsing function (defined below).
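
parseBegCommand() is not shown in this article. Assuming the lexer emits a dedicated token kind for * (called token.ARRAY in this sketch), it could look like this:

Parse Array Header sketch
// A sketch of parseBegCommand, assuming a token.ARRAY kind for '*'.
// It consumes the '*<n>\r\n' header that opens every command.
func (p *Parser) parseBegCommand() error {
  if !p.curTokIs(token.ARRAY) {
    return fmt.Errorf("expected '*'. got=%q", p.curTok.Literal)
  }

  p.next()
  if !p.curTokIs(token.NUMBER) {
    return fmt.Errorf("expected element count. got=%q", p.curTok.Literal)
  }

  p.next()
  return p.checkCRLF()
}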

We do all of this because, at this stage, the Command arrives in RESP format:

Command Structure for Checking
*2\r\n -- $3\r\nGET\r\n -- $3\r\nkey\r\n
#   |            |               |
# We check each of these parts individually

With this we are almost finished with Parsing; we just have to create p.parseCommand() and p.parseGetCommand(). So, let's do it!

The definition of p.parseCommand() is actually pretty simple.

Parse Command Function
func (p *Parser) parseCommand() (ast.Command, error) {
  switch p.curTok.Kind {
  case token.GET:
    return p.parseGetCommand()
  ...
  }

  return nil, fmt.Errorf(
    "command not supported. got=%d (%q)",
    p.curTok.Kind, p.curTok.Literal,
  )
}

As you can see, p.parseCommand() returns an ast.Command and an error; a switch statement then determines the corresponding parsing function for each Command.
If the Command is not found, we return a "command not supported" error.

Knowing this, let's define p.parseGetCommand() and finish this whole Parsing section.

Parse Get Function
func (p *Parser) parseGetCommand() (*ast.GetCommand, error) {
  getCmd := &ast.GetCommand{Token: p.curTok}

  p.next()
  if err := p.checkCRLF(); err != nil {
    return nil, err
  }

  key, err := p.parseIdent()
  if err != nil {
    return nil, err
  }

  getCmd.Key = key

  return getCmd, nil
}

As you can see, the parseGetCommand function returns the *ast.GetCommand structure we defined before, plus an error.
It also relies on some special helper functions, which we will now see:

checkCRLF and parseIdent Functions
func (p *Parser) checkCRLF() error {
  // Checks if the current Token is CRLF
  if !p.curTokIs(token.CRLF) {
    return fmt.Errorf(
      "token expected=%d ('CRLF'). got=%d (%s)",
      token.CRLF,
      p.curTok.Kind,
      p.curTok.Literal,
    )
  }

  p.next()
  return nil
}

func (p *Parser) parseIdent() (string, error) {
  if err := p.skipBulkString(); err != nil {
    return "", err
  }

  if !p.curTokIs(token.IDENT) {
    return "", fmt.Errorf(
      "token expected=%d ('IDENT'). got=%d (%q)",
      token.IDENT,
      p.curTok.Kind,
      p.curTok.Literal,
    )
  }

  return p.curTok.Literal, nil
}

These two functions are widely used throughout the Parser; let's explain them quickly:

  1. checkCRLF: Checks if the current Token is \r\n (CRLF) and advances to the next Token; if not, it returns a "token expected" error.
  2. parseIdent: It first uses p.skipBulkString (defined below) to check if the following set of Tokens describes a BulkString, advancing the Tokens until the actual text.
    Then it checks if the current Token is an Identifier (token.IDENT) containing the actual key of the GET command, returning its literal if parsed successfully.
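
Both snippets above also rely on the tiny curTokIs helper, whose likely shape is:

curTokIs helper
// curTokIs reports whether the current Token has the given kind.
func (p *Parser) curTokIs(k token.TokenKind) bool {
  return p.curTok.Kind == k
}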

Now the only thing left is creating the skipBulkString function which you may have seen in the last snippets.

Skip BulkString Function
func (p *Parser) skipBulkString() error {
  if !p.curTokIs(token.BULKSTRING) {
    return p.isNotBulkStringErr()
  }

  p.next()
  if !p.curTokIs(token.NUMBER) {
    return p.isNotNumberErr()
  }

  p.next()
  if err := p.checkCRLF(); err != nil {
    return err
  }

  return nil
}

As you can see, this function just returns an error if any of the operations fail. Let's see what it does:

  1. Checks if curTok is a $ (BulkString), advancing the Tokens.
  2. Checks if curTok is a Number representing the length of the actual text, then advances the Tokens with the p.next() method.
  3. Checks if curTok is a \r\n (CRLF), therefore verifying it's a valid BulkString.

Checking BulkString
$    --    3    --    \r\n    --    key
|          |            |            |
Bulk     Number        CRLF        CurTok
Str                                Ends Here

This has been a tough section, so let's do a recap.

Parser Recap

So, we have come a long way. Don't worry if you feel you still don't fully get Parsers; this is, in fact, quite a complicated topic.

So let's see how Parsers work in a more graphical way.

Parsers consume Tokens: to be able to parse complex Commands, we use curTok and readTok.

[Diagram: curTok and readTok sitting on the first two Tokens of the Command]

As you can see, by default we place curTok and readTok on the first two Tokens of the Command (in this example, GET and userName).

We use the content of these Tokens to fill the GetCommand structure with its corresponding data.

Get Structure Filled
GetCommand{
  Token: token.New(token.GET, "GET"),
  Key:   "userName",
}

You also have to remember that this is the raw representation of the Command.
In fact, the Parser will receive something like this:

[Diagram: the GET command as the Parser actually receives it, in RESP form]

We need to advance through the Tokens to be able to parse the whole Command, so we need a function that advances our 'two-token window' (curTok and readTok).
The next() method achieves this purpose each time we call it.

After calling next():

[Diagram: curTok and readTok after one call to next()]

As you can see, after calling next() both Tokens have advanced one position.

Once we encounter a GET token, we execute a special function that parses the required fields of the Command, in this case parseGetCommand().

[Diagram: the fully parsed GET command]

NOTE: As you may have guessed, all of the words and symbols are Tokens, and Parsing is the process of going through them and storing the useful information.

With this, we are almost done with the article; we just need to evaluate Commands.

Creating Values

This is the part of the article where everything comes together, but before that, we have to create a way to store the value of our keys. So, let's do it!

Every Value has a corresponding kind; we will focus on the Nil and String kinds. Let's define the value kinds first, and then we will store their values.

Value Kind Definitions
type valueKind byte

const (
  StringKind valueKind = iota
  NilKind
  // ... The rest of the Value Kinds
)

If you noticed that this is actually pretty similar to the Token logic, you're right. So, let's break it down.

  1. valueKind: Is an Enum (Enumeration) of the available Value Kinds.
  2. StringKind: Represents the String Value Kind.
  3. NilKind: This special kind signifies the absence of a value; if a key does not exist, we will return a NilVal (a value of NilKind).

We have mentioned Values a couple of times. So, let's define them too!

Value Definition
type Value interface {
	Kind() valueKind
	String() string
}

As you can see Values are pretty similar to Tokens.

  1. Kind(): This method returns the valueKind of the value.
  2. String(): This method returns the String representation of the current Value.

As Value is an interface, we have to create its concrete structs, in this case String and NilVal; these structures are how we will store values.

Value Interface Implementation
type String struct {
  Value string
}

// Returns the String Value stored
func (s *String) String() string { return s.Value }
func (s *String) Kind() valueKind { return StringKind }

type NilVal struct{}

// Returns the (nil) string
func (n *NilVal) String() string { return "(nil)" }
func (n *NilVal) Kind() valueKind { return NilKind }

As you can see both String and NilVal now implement the Value interface.

  1. String: It stores a string in its Value field.
  2. NilVal: It stores nothing, as Nil represents the absence of a value.

We are almost done with values! Let's just create a helper function to build new values easily; we will call it NewValue.

NewValue definition
func NewValue(v ast.Expression) Value {

  switch v := v.(type) {
  case *ast.StringExpr:
    return &String{v.String()}

  // Creating the rest of the values...
  }

  return nil
}

Have you noticed? This function takes an Expression as its argument, and the Expression's concrete type determines which kind of Value gets returned.
If the given Expression is not supported, nil is returned. And since every nil value is interchangeable, we will create a single, shared nil value, sketched below.

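A minimal sketch of that shared value (it reappears later as storage.Nil):

Shared Nil value
// A single package-level Nil value; every "missing key"
// reply reuses it instead of allocating a new NilVal.
var Nil = &NilVal{}
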
You may also have noticed that this function takes an Expression as its argument, and we still haven't explored what an Expression does. So, let's do that now.

Defining Expressions

Do you remember when we were defining Commands and briefly mentioned Expressions? Now it's time to define them!

Commands like GET receive a key parameter, which can be represented as a String. But commands like SET can receive any value as a parameter (Number, String, Bool, etc.); we wrap all of these values as Expressions.

Expression Definition
type Node interface {
  String() string
  TokenLiteral() string
}

type Expression interface {
  Node

  exprNode()
}

Expression is also a Node, and it uses the exprNode() method to differentiate itself from Commands.

Now that we have defined Expressions let's implement them!

String Expression
type StringExpr struct {
  Token token.Token
}

func (se *StringExpr) exprNode() {}

func (se *StringExpr) TokenLiteral() string {
  return se.Token.Literal 
}

func (se *StringExpr) String() string {
  return se.Token.Literal
}

In this article we will just implement StringExpr, as we will only support strings. So, let's break it down quickly.

  1. StringExpr: The string it represents is stored in the Token field, as its Literal contains the whole string.
  2. exprNode: This method ensures that StringExpr implements the Expression interface, in the same way as Command.
  3. String: This method provides a string representation of the actual Expression.

If you want to see how Expressions are used in Commands, I highly recommend checking out the source code!

Now that we have created our Values it's time to store them!

Storing Values

We are going to take a simple approach when it comes to storing Values: we are going to create a concurrency-safe map from string keys to Values.

NOTE: This implementation was written before the weak package was released, so keep in mind that now there is a better way to implement the storage of keys.

Knowing this, let's implement our Storage.

Storage Definition
type Storage struct {
  mu      sync.RWMutex
  storage map[string]Value
}

As you can see, the storage is just a map of Values, but above it sits an RWMutex.
A Mutex (mutual exclusion) is one of the most complex and deep concepts in all of computing, so let's try to explain it quickly.

The Problem of the Five Philosophers and the Limited Forks:

Imagine five brilliant philosophers sitting around a circular table, ready to enjoy some delicious food. Between each philosopher is a single fork. To eat, a philosopher needs two forks: one from their left and one from their right. The problem? There are only five forks for five philosophers, and each fork is a shared resource.

[Illustration: five philosophers around a circular table, one fork between each pair]

Now, let's say all five philosophers get hungry at the same time and each reaches for the fork on their left. They all succeed! But now, they're stuck. Each philosopher has one fork, but they all need a second fork (from their right), which is currently held by their neighbor. No one can eat, and no one can release their fork because they're all waiting for another – this is a classic deadlock. They're all holding onto a resource and waiting indefinitely for another resource that's also held.

[Illustration: every philosopher holding one fork and waiting for a second: a deadlock]

Here's how a mutex acts as the polite, orderly "fork manager":

  • Acquiring the Mutex (Picking up a Fork): When a philosopher wants to pick up a fork, they first try to 'lock' its mutex. If the mutex is unlocked (meaning the fork is available), they successfully acquire the mutex and pick up the fork.
  • Mutual Exclusion Ensured: If the mutex is already locked (meaning another philosopher is holding that fork), our hungry philosopher must wait until the mutex is released. This ensures that only one philosopher can ever hold a specific fork (resource) at a time.
  • Releasing the Mutex (Putting down a Fork): Once a philosopher finishes eating, they 'unlock' the mutexes for both forks they were holding. This signals to any waiting philosophers that those forks are now available.

By implementing a strategy where philosophers acquire both necessary fork mutexes before attempting to eat (or acquiring them in a specific order, or acquiring a global eating permission mutex), we prevent the deadlock. The mutex ensures that critical sections of code (like picking up a shared fork) are accessed by only one goroutine (philosopher) at a time, preventing race conditions and resource contention.

[Illustration: the deadlock resolved by orderly fork acquisition]

In our Redis clone, the sync.RWMutex acts precisely like these fork managers, ensuring that when multiple parts of our program try to read from or write to our Storage map simultaneously, they do so in an orderly, deadlock-free fashion, just like our philosophers eventually get to enjoy their food!

Now that we have scratched the surface of Mutexes, we are ready to finish our Storage!

Storage Get Method
func (s *Storage) Get(k string) (Value, bool) {
  // Locks the map
  s.mu.RLock()

  val, ok := s.storage[k]

  // Unlocks the map
  s.mu.RUnlock()

  if !ok {
    return nil, ok
  }

  return val, ok
}

As you can see, the Get method is pretty simple, let's break it down!

  1. s.mu.RLock(): Acquires a read lock on the map. Multiple readers can hold it simultaneously, but writers are blocked until they all release it.
  2. s.mu.RUnlock(): Releases the read lock.
  3. At last, if the value exists, it is returned; if not, nil is returned.
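
The write path is symmetric, except it needs the exclusive lock so no reader sees a half-finished update. A simplified sketch of a Set method (the full command logic lives in the Storage implementation):

Storage Set Method sketch
// Set stores a value under the given key while holding
// the write lock, blocking all other readers and writers.
func (s *Storage) Set(k string, v Value) {
  s.mu.Lock()
  defer s.mu.Unlock()

  s.storage[k] = v
}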

The only thing left is creating a New() function that initializes Storage.

Initializing Storage
func New() *Storage {
  return &Storage{
    storage: make(map[string]Value),
  }
}

The New() function creates the storage map; we don't need to create the Mutex because its zero value is sufficient for use. With this, we are now ready to evaluate Redis Commands!

NOTE: I highly recommend looking at the whole implementation of Storage, as it contains all of the Command storage logic.

Evaluating Commands

This is the last section of this article, you can't imagine how excited I am!

Evaluating Commands can seem a little intimidating, but it is in fact pretty simple. We will first define an Evaluator struct that contains all of our evaluation logic.

Evaluator Definition
type Evaluator struct {
  s *storage.Storage
}

func New() *Evaluator {
  return &Evaluator{
    s: storage.New(),
  }
}

As you can see, Evaluator uses the Storage, because evaluating a command is just creating and editing key-value pairs.

Eval method
func (e *Evaluator) Eval(cmd ast.Command) (string, error) {

  switch cmd := cmd.(type) {
  case *ast.GetCommand:
    return e.evalGetCommand(cmd)
  }

  return "", fmt.Errorf(
      "command not supported for evaluation. got=%T",
      cmd,
    )
}

Evaluator has an Eval method that, depending on the given command (GET, SET, INCR), executes its corresponding eval function (evalGetCommand()).
Now the only thing left is to define evalGetCommand(), so let's do it!

Eval Get Command definition
func (e *Evaluator) evalGetCommand(gc *ast.GetCommand) (string, error) {
  val, ok := e.s.Get(gc.Key)

  if !ok {
    return storage.Nil.String(), nil
  }

  return val.String(), nil
}
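
And with that, the circle closes. As a send-off, here is a sketch of the whole pipeline, from raw RESP bytes to an evaluated reply (package names are assumed from this article's layout; check the repository for the real wiring):

End-to-end pipeline sketch
func main() {
  input := "*2\r\n$3\r\nGET\r\n$5\r\nmykey\r\n"

  l := lexer.New(input) // raw bytes -> Tokens
  p := parser.New(l)    // Tokens -> AST

  cmd, err := p.Parse()
  if err != nil {
    log.Fatal(err)
  }

  e := evaluator.New() // AST -> reply

  out, err := e.Eval(cmd)
  if err != nil {
    log.Fatal(err)
  }

  fmt.Println(out) // (nil), since nothing was ever SET
}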