How we communicate with Redis?

Redis follows a traditional Client-Server architecture. One or many Redis Clients send commands like GET, SET, or DEL to a single central Redis Server. The server reads the command from the connection, parses it, evaluates it, and writes the response back to the client.

But how does the server actually understand a command? To a computer, SET userName Alejandro is nothing more than a raw string of characters. Before the server can act on it, it needs to transform it into a meaningful representation. That process is called Parsing.

So, before we can evaluate a Command, we are going to transform it twice into more accessible representations:

Commands transformations

The first transformation is turning our Raw Command into Tokens, which are the atomic units that compose a Command Representation.
The process of turning raw text into meaningful Tokens is called Lexical Analysis, and it's done by a Lexer.

For example, feeding this command to our Lexer:

SET command example
SET userName Alejandro

The Lexer Produces the following:

Tokens of the SET Command
setCommandTokens := []Token{
  {Kind: SET,   Literal: "SET"},
  {Kind: IDENT, Literal: "userName"},
  {Kind: IDENT, Literal: "Alejandro"},
}

Notice that not all Tokens are equal. SET is a command token — it represents the operation to perform. The remaining two are identifier tokens — they carry the arguments. Each Token knows both what kind it is and what it contains.

This is the theory — let's define some tokens!

Defining Tokens

Command tokens

As shown above, SET userName Alejandro is not treated as one long string. The Lexer identifies three distinct Tokens: a command token and two identifier tokens — each with its own TokenKind.

TokenKind is the first entity we will define in our project. Enough theory — it's time to write our first lines of code.

Defining TokenKinds

Defining all the possible Tokens of Redis
type TokenKind byte

// Check the source code to see all the Tokens!
const (
  EOF TokenKind = iota // EOF = End Of File
  ILLEGAL

  // Special
  CRLF
  
  // All Commands have their respective TokenKind
  keyword_beg
  IDENT
  SET
  GET
  INCR
  INCRBY
  ...

As you can see, the first TokenKinds we define are EOF and ILLEGAL. These two play very specific roles in our Lexer:

  • EOF: Stands for End Of File and signals that there is no more input to read.
  • ILLEGAL: represents any character our Lexer doesn't recognize.
Why use a byte for TokenKind?

You might have noticed that we defined TokenKind as a byte. This is a deliberate engineering decision — here's why we chose it over a string or a standard int:

  1. Memory Efficiency: Since our Lexer will generate a new instance for every single word or symbol in a command, using a 1-byte type keeps our memory footprint as small as possible.
  2. Sufficient Capacity: A byte ( an alias for uint8 ) can represent up to 256 unique values. Since our Redis clone will only have a few dozen command types and symbols, this is the perfect, lightweight fit for our TokenKind.

With our TokenKind defined, we now need a container to hold two things: the kind of token it is, and the raw text it was derived from — which as seen in the SET Tokens example we'll call the Literal.

Token Struct Definition
type Token struct {
  Kind    TokenKind
  Literal string
}

🔥 Fun Fact: Even thought i just said that the Token struct should be as lightweight as possible, this is not the most lightweight representation for a Token.

The Golang Parser uses an amazing technique: instead of storing a piece of text in the struct as we do, they just store the indices from where the piece of text beings and ends, a simple and ingenious trick if you want to implement it, Check the details here.

Now the only thing left is creating a constructor for our Token struct. This will make instantiating new Tokens clean and consistent throughout the codebase:

Token New() definition
func New(kind TokenKind, literal string) Token {
  return Token{
    Kind:    kind,
    Literal: literal,
  }
}

In the next chapter, we will build the Lexer — the engine responsible for turning a raw Redis command into exactly these tokens.