Understanding Strings in Go: A Journey through Bytes, Runes, and Unicode

In this blog, we're diving deep into the fascinating world of strings in Go. Strings are a bit more complex than they first appear, and understanding their internal workings can give you a huge advantage in writing efficient and correct Go code. So, let's get into it!
Strings: Logical vs. Physical Representation
Strings are a curious type in Go because they have two natures: logical and physical. The reason is that Go strings conventionally hold Unicode text, and Unicode is a standard that assigns a number (a code point) to characters from virtually every writing system.
In the old days, programming languages used ASCII, which represented characters with 7 bits and essentially only covered American English characters. When we moved to international languages with accent marks and non-Roman scripts like Chinese or Arabic, we needed different techniques to represent those characters. Unicode was developed to handle this, using numbers that are bigger than what fits into a byte.
In Go, we have the concept of a rune, which is roughly what some languages call a wide character. The rune type is an alias for int32, a 32-bit integer big enough to hold any Unicode code point. However, to keep programs efficient, we don't spend four bytes on every character. Instead, we use UTF-8, a variable-length encoding that represents each code point in one to four bytes. Interestingly, UTF-8 was invented at Bell Labs by Ken Thompson and Rob Pike, who later went on to co-create Go.
Physically, a string in Go is a sequence of bytes, conventionally the UTF-8 encoding of Unicode text. The byte type is an alias for uint8, an unsigned 8-bit integer. Logically, those bytes represent a sequence of runes.
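To make the variable-length nature of UTF-8 concrete, here is a minimal sketch that asks the standard library how many bytes a rune needs. The helper name encodedWidth and the four sample characters are my own choices for illustration; each character falls in a different width class.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encodedWidth (a hypothetical helper) reports how many bytes
// UTF-8 needs to encode a single rune.
func encodedWidth(r rune) int {
	return utf8.RuneLen(r)
}

func main() {
	// ASCII letter, accented Latin letter, CJK character, emoji.
	for _, r := range []rune{'a', 'é', '世', '😀'} {
		fmt.Printf("%c U+%04X needs %d byte(s)\n", r, r, encodedWidth(r))
	}
	// Output:
	// a U+0061 needs 1 byte(s)
	// é U+00E9 needs 2 byte(s)
	// 世 U+4E16 needs 3 byte(s)
	// 😀 U+1F600 needs 4 byte(s)
}
```

Plain ASCII text therefore costs one byte per character, and only characters outside that range pay for the extra bytes.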
Let's look at some code to illustrate these concepts:
	package main

	import (
		"fmt"
	)

	func main() {
		s := "élite"
		// %T prints the type of the value, %v prints the value itself.
		fmt.Printf("Type: %T, Value: %v\n", s, s)
		// Output: Type: string, Value: élite
	}
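When you print the string, its UTF-8 bytes are written out unchanged, so a terminal that understands UTF-8 shows the accented text as expected, and %T reports the type as string.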
Understanding Runes and UTF-8 Encoding
Let's dive deeper into the concept of bytes vs. runes. I'll demonstrate this in the playground by creating a string with a non-ASCII character, like the French accented 'é':
	package main

	import (
		"fmt"
	)

	func main() {
		s := "élite"
		fmt.Printf("String: %s\n", s)
		// Output: String: élite

		// Converting to []byte exposes the physical UTF-8 encoding.
		bytes := []byte(s)
		fmt.Printf("Bytes: %v\n", bytes)
		// Output: Bytes: [195 169 108 105 116 101]

		// Converting to []rune exposes the logical code points.
		runes := []rune(s)
		fmt.Printf("Runes: %v\n", runes)
		// Output: Runes: [233 108 105 116 101]

		// len on a string counts bytes, not characters.
		fmt.Printf("Length of string (bytes): %d\n", len(s))
		// Output: Length of string (bytes): 6
		fmt.Printf("Length of runes: %d\n", len(runes))
		// Output: Length of runes: 5
	}
When you run this program, you'll notice the length of the string is 6 bytes, not 5, because the accented 'é' is represented by two bytes in UTF-8. This illustrates the difference between logical characters (runes) and their physical byte representation.
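The same distinction shows up when you walk through a string. As a small sketch: indexing with s[i] gives you one byte, while a range loop decodes UTF-8 and gives you each rune together with the byte offset where it starts. The helper runeOffsets is a name I made up for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeOffsets (a hypothetical helper) returns the byte offset at which
// each rune of s begins, as reported by a range loop.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	s := "élite"

	// Indexing yields a single byte, not the character 'é'.
	fmt.Println(s[0]) // 195

	// range steps rune by rune, so the offset jumps from 0 to 2 after 'é'.
	fmt.Println(runeOffsets(s)) // [0 2 3 4 5]

	// Counting runes without allocating a []rune.
	fmt.Println(utf8.RuneCountInString(s)) // 5
}
```

If all you need is the number of characters, utf8.RuneCountInString avoids the allocation that len([]rune(s)) implies.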
Memory Representation of Strings
In Go, strings are immutable and represented by a string descriptor, which includes a pointer to the data and the length of the string in bytes. This allows Go to handle string operations efficiently without needing a terminating null byte.

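Because strings are immutable, slicing one does not copy any data: each substring gets its own descriptor that points into the same underlying bytes. The example below slices one string into two substrings: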
	package main

	import (
		"fmt"
	)

	func main() {
		s := "hello, world"
		// Slice indices are byte offsets into the string.
		h := s[:5]
		w := s[7:]
		fmt.Printf("Original: %s\n", s)
		// Output: Original: hello, world
		fmt.Printf("Substring 1: %s\n", h)
		// Output: Substring 1: hello
		fmt.Printf("Substring 2: %s\n", w)
		// Output: Substring 2: world
	}
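One way to observe the descriptor indirectly is to measure the size of the string value itself. This is only a sketch, and it assumes the standard Go toolchain, where the descriptor is two machine words (a pointer and a length). The helper descriptorSize is a name invented for this example.

```go
package main

import (
	"fmt"
	"unsafe"
)

// descriptorSize (a hypothetical helper) returns the size of the string
// value itself, that is the descriptor, not the bytes it points to.
func descriptorSize(s string) uintptr {
	return unsafe.Sizeof(s)
}

func main() {
	short := "hi"
	long := "a considerably longer string than the first one"

	// The descriptor has the same size no matter how long the text is.
	fmt.Println(descriptorSize(short) == descriptorSize(long)) // true

	// Assumption: standard toolchain, descriptor = pointer + length.
	word := unsafe.Sizeof(uintptr(0))
	fmt.Println(descriptorSize(short) == 2*word) // true

	// The length stored in the descriptor is what len reports.
	fmt.Println(len(short)) // 2
}
```

This is why passing a string to a function or taking a substring is cheap: only the small descriptor is copied, never the text.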
String Operations: Concatenation and Case Conversion
When you modify a string in Go, you're actually creating a new string. Here’s an example of concatenation and case conversion:
	package main

	import (
		"fmt"
		"strings"
	)

	func main() {
		s := "the quick brown fox"
		// Concatenation builds a new string and rebinds s to it.
		s = s + " jumps over the lazy dog"
		fmt.Println("Concatenated:", s)
		// Output: Concatenated: the quick brown fox jumps over the lazy dog

		// ToUpper returns a new string; s itself is left unchanged.
		upper := strings.ToUpper(s)
		fmt.Println("Uppercase:", upper)
		// Output: Uppercase: THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG
	}
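Since every concatenation creates a new string, building a long string with + inside a loop can allocate repeatedly. A common way around this is strings.Builder, which appends into one growing buffer. The function joinWords below is a hypothetical helper written for this sketch, not something from the standard library.

```go
package main

import (
	"fmt"
	"strings"
)

// joinWords (a hypothetical helper) concatenates words separated by single
// spaces, appending into one strings.Builder buffer instead of creating a
// new string on every step.
func joinWords(words []string) string {
	var b strings.Builder
	for i, w := range words {
		if i > 0 {
			b.WriteByte(' ')
		}
		b.WriteString(w)
	}
	return b.String()
}

func main() {
	fmt.Println(joinWords([]string{"the", "quick", "brown", "fox"}))
	// Output: the quick brown fox
}
```

For this exact task strings.Join(words, " ") does the same job in one call; the Builder version is shown because it generalizes to cases where the pieces are produced one at a time.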
A Simple Search and Replace Tool
To wrap things up, let’s build a simple search and replace tool in Go. This will demonstrate some practical string manipulation.
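	package main

	import (
		"bufio"
		"fmt"
		"os"
		"strings"
	)

	func main() {
		// Expect at least two arguments: the text to find and its replacement.
		if len(os.Args) < 3 {
			fmt.Fprintln(os.Stderr, "Usage: go run main.go <old> <new>")
			os.Exit(1)
		}
		oldStr := os.Args[1]
		newStr := os.Args[2]

		// Read standard input line by line and print each modified line.
		scanner := bufio.NewScanner(os.Stdin)
		for scanner.Scan() {
			line := scanner.Text()
			modifiedLine := strings.ReplaceAll(line, oldStr, newStr)
			fmt.Println(modifiedLine)
		}
		// Report read errors on standard error and exit with a non-zero status.
		if err := scanner.Err(); err != nil {
			fmt.Fprintln(os.Stderr, "Error reading input:", err)
			os.Exit(1)
		}
	}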
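Run this tool from the command line, providing the text to replace and its replacement. Note that strings.ReplaceAll matches substrings, not whole words, so every occurrence of the text is replaced wherever it appears in a line: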
go run main.go "mat" "ed" < input.txt
If your input.txt contains:
mat went to Greece.
mat is a good friend.
The output will be:
ed went to Greece.
ed is a good friend.
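If you want to exercise the same logic without a terminal, one option is to separate it from os.Stdin and os.Stdout. The sketch below is a possible refactoring, not the only way to structure it; replaceLines and replaceText are names I introduced for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
)

// replaceLines (a hypothetical helper) reads r line by line, replaces every
// occurrence of oldStr with newStr, and writes each line to w.
func replaceLines(r io.Reader, w io.Writer, oldStr, newStr string) error {
	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		line := strings.ReplaceAll(scanner.Text(), oldStr, newStr)
		if _, err := fmt.Fprintln(w, line); err != nil {
			return err
		}
	}
	return scanner.Err()
}

// replaceText (also hypothetical) is a convenience wrapper for in-memory input.
func replaceText(input, oldStr, newStr string) (string, error) {
	var out strings.Builder
	err := replaceLines(strings.NewReader(input), &out, oldStr, newStr)
	return out.String(), err
}

func main() {
	in := strings.NewReader("mat went to Greece.\nmat is a good friend.\n")
	if err := replaceLines(in, os.Stdout, "mat", "ed"); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}

	// Substring matching: "mat" inside "matter" is replaced as well.
	got, _ := replaceText("it does not matter\n", "mat", "ed")
	fmt.Print(got) // it does not edter
}
```

Taking an io.Reader and io.Writer keeps the function usable with files, network connections, or in-memory strings, which also makes it straightforward to test.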
Strings in Go are more than just sequences of characters. They involve complex concepts like runes, UTF-8 encoding, and immutability. By understanding these details, you can write more efficient and effective Go code. Keep exploring and happy coding!






