Zig and Strings: A Chill Guide
Hey there! I’m excited to walk you through how strings work in Zig. Whether you’re a seasoned Zig dev or just starting out, this guide has something for you. We’ll go over what I’ve learned about strings in Zig, especially since they’re a bit different from what you might be used to in other languages. This is a long piece of content, so I’ve included a table of contents for easy navigation.
The examples were written with version 0.14.0, and syntax may have changed in later versions.
Table of Contents
- Table of Contents
- What is a String in Zig?
- String Literals and Data Types
- Copy vs. Reference
- Mutability: []const u8 vs []u8
- Memory Management and Allocators
- Unicode and UTF-8 Handling
- Checking for Emptiness
- Iterating Over Strings
- Escape Sequences
- Slicing Strings
- Comparing Strings
- Searching and Indexing
- Prefix and Suffix Checks
- Formatting and Interpolation
- Concatenation
- String Builders and Buffers
- Splitting and Joining
- Trimming
- Replacing Substrings
- Case Conversion
- Padding
- Type Conversion: String and Number
- Error Handling in String Operations
- Wrap Up
What is a String in Zig?
In Zig, a “string” isn’t a special built-in type like in many other languages. Instead, you’ll usually work with []const u8, which is a slice of constant (read-only) bytes. Think of it as a view into a sequence of bytes that, by convention, is encoded in UTF-8. This is different from languages like JavaScript, where strings are immutable sequences of UTF-16 code units. In Zig, you’re much closer to the metal, working directly with bytes.
A string slice in Zig is simply a pointer and a length. It doesn’t own the data or manage memory for you. That part is on you, especially when you need to create or store new strings.
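To make the pointer-plus-length idea concrete, here is a minimal sketch. The byteCount helper is made up for illustration; the point is that a function taking a "string" is really just taking a slice of bytes.

```zig
const std = @import("std");

// Hypothetical helper: a "string" parameter is simply a slice of bytes.
fn byteCount(s: []const u8) usize {
    return s.len;
}

pub fn main() void {
    const s: []const u8 = "héllo";
    // .len counts bytes; 'é' takes two bytes in UTF-8.
    std.debug.print("bytes: {d}\n", .{byteCount(s)});
    // .ptr is the address of the first byte; the slice itself owns nothing.
    std.debug.print("first byte: {c}\n", .{s.ptr[0]});
}
```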
String Literals and Data Types
You can create a string literal in Zig using double quotes, just like you’d expect:
const hello = "Hello, world!";
Zig infers the type of hello as *const [13:0]u8. This is a pointer to a constant array of 13 bytes, with a null terminator (0) at the end. Because string literals have both a known length and a null terminator, they can be easily converted to both slices and null-terminated pointers.
You can get a slice from a literal like this:
const hello: []const u8 = "Hello, world!";
Here, Zig automatically converts the null-terminated array pointer into a slice, conveniently handling the length for you.
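Here is a small sketch of both conversions happening at a function call. The takesSlice and takesCString helpers are hypothetical stand-ins for APIs that expect each form.

```zig
const std = @import("std");

// Hypothetical helper standing in for an API that wants a slice.
fn takesSlice(s: []const u8) usize {
    return s.len;
}

// Hypothetical helper standing in for a C-style API.
fn takesCString(s: [*:0]const u8) usize {
    // std.mem.len walks the bytes until it finds the 0 sentinel.
    return std.mem.len(s);
}

pub fn main() void {
    const hello = "Hello, world!";
    // Shows the literal's real type before any coercion.
    std.debug.print("{s}\n", .{@typeName(@TypeOf(hello))});
    // The same literal coerces to either parameter type.
    std.debug.print("{d} {d}\n", .{ takesSlice(hello), takesCString(hello) });
}
```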
You can also create multi-line string literals:
const multi =
\\This is a
\\multi-line string.
;
In a multi-line string literal, every line starts with \\ and runs to the end of that line. There is no closing delimiter, and no escape sequences are processed. This is super handy for embedding long text blocks without worrying about escaping.
To get the length of a string in bytes, you can use the .len property. Keep in mind this is the number of bytes, not necessarily the number of characters.
const len = hello.len; // 13
For more details, check out the Zig Language Reference for String Literals.
Copy vs. Reference
In a language like JavaScript, strings behave like values: assigning one to a new variable acts as if you got an independent copy. In Zig, a string slice is a pointer and a length. When you assign it, you’re just copying the pointer and length, not the underlying data. Both variables will point to the same bytes in memory.
const a: []const u8 = "hello";
const b = a;
// a and b point to the same data, so their .ptr fields are identical
std.debug.print("a: {*}, b: {*}\n", .{ a.ptr, b.ptr });
// both print the same data address, e.g. a: u8@100ecd498, b: u8@100ecd498
If you need a true copy of the string data, you’ll have to allocate memory and copy the bytes yourself. We’ll get to that in a bit!
Mutability: []const u8 vs []u8
By default, strings in Zig are immutable. A string literal coerces to []const u8, which is a read-only slice of bytes. You can’t change its contents.
If you need to modify a string, you must create a mutable slice, []u8. This is typically done by creating a buffer (an array) and slicing it.
Here’s an example that shows the difference:
const std = @import("std");
pub fn main() !void {
// String literals are immutable.
const immutable_greeting: []const u8 = "Hello";
// immutable_greeting[0] = 'h'; // This would be a compile error.
// To create a mutable string, you need a mutable buffer.
var buffer: [16]u8 = undefined;
// The buffer must be large enough to hold the formatted string, otherwise this will error.
const mutable_slice = try std.fmt.bufPrint(&buffer, "{s}", .{immutable_greeting});
std.debug.print("Before: {s}\n", .{mutable_slice});
// Now we can change the contents of the mutable slice.
mutable_slice[0] = 'h';
std.debug.print("After: {s}\n", .{mutable_slice});
}
In this example, attempting to change immutable_greeting would fail at compile time. However, after copying the string into a buffer, we get a mutable_slice that we can freely modify.
Memory Management and Allocators
Zig doesn’t have a garbage collector like JavaScript or Python. If you need to create new string data (for example, by concatenating strings or formatting them), you have to explicitly allocate memory. Zig handles this with allocators, and the standard library provides a standard std.mem.Allocator interface.
This is a big shift if you’re coming from a garbage-collected language. You are in charge of the memory you allocate. If you don’t free it, you’ll have memory leaks. Here’s how you can allocate a new string on the heap:
const std = @import("std");
pub fn main() !void {
// Set up an allocator for memory management.
var gpa = std.heap.DebugAllocator(.{}){};
defer {
// deinit() returns a std.heap.Check value (.ok or .leak), not a bool.
const check = gpa.deinit();
// This will panic if we leaked memory, which is great for debugging.
if (check == .leak) @panic("Memory leaked!");
}
const allocator = gpa.allocator(); // Get the Allocator interface.
const original = "hello";
// Allocate memory for a copy.
const copy = try allocator.alloc(u8, original.len);
// `defer` is your friend! It ensures the memory is freed when the function exits.
defer allocator.free(copy);
// Copy the bytes from the original string to the new buffer.
std.mem.copyForwards(u8, copy, original);
std.debug.print("Original: {s}, Copy: {s}\n", .{ original, copy });
// Now `copy` is its own string, living on the heap.
}
Choosing the right allocator (like ArenaAllocator for temporary allocations) is part of the fun. Just remember to free what you alloc. For more on this, see std.mem.Allocator and the guide on choosing an allocator.
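As a rough sketch of the arena idea, the example below copies several strings and frees them all with a single deinit. The totalLength helper is invented for this illustration, and allocator.dupe is used as a shortcut for the alloc-then-copy steps shown earlier.

```zig
const std = @import("std");

// Hypothetical helper: makes temporary copies of each word and sums their lengths.
fn totalLength(backing: std.mem.Allocator, words: []const []const u8) !usize {
    var arena = std.heap.ArenaAllocator.init(backing);
    // One deinit frees everything allocated from the arena.
    defer arena.deinit();
    const allocator = arena.allocator();

    var total: usize = 0;
    for (words) |word| {
        // dupe allocates word.len bytes and copies the contents in one call.
        const copy = try allocator.dupe(u8, word);
        total += copy.len;
    }
    return total;
}

pub fn main() !void {
    const n = try totalLength(std.heap.page_allocator, &.{ "foo", "bar", "baz" });
    std.debug.print("total bytes copied: {d}\n", .{n});
}
```

Notice there is no free call per string. That is the trade-off with an arena: less bookkeeping, but nothing is released until the arena itself goes away.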
Unicode and UTF-8 Handling
Zig string literals are UTF-8 encoded, which means a single character can be made up of one or more bytes. The .len property gives you the byte count, not the character count. This is different from JavaScript’s .length, which gives you the number of UTF-16 code units.
Zig trusts you to put valid UTF-8 in a []const u8, but it doesn’t verify it. You might run into issues if you pass invalid UTF-8 to a system that expects it to be valid, or when you’re working with systems that use a different encoding (like UTF-16, which is common for Windows APIs).
You can check if a string is valid UTF-8 using std.unicode.utf8ValidateSlice:
const std = @import("std");
pub fn main() !void {
// This is an overlong (and therefore invalid) two-byte encoding of U+0001.
const s: []const u8 = "\xC0\x81";
const is_valid = std.unicode.utf8ValidateSlice(s);
if (is_valid) {
std.debug.print("The string is valid UTF-8.\n", .{});
} else {
std.debug.print("The string is not valid UTF-8.\n", .{});
}
}
In the Iterating Over Strings section, we’ll see how to properly iterate over Unicode characters.
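To see the byte count and the character count disagree, here is a quick sketch using std.unicode.utf8CountCodepoints. Keep in mind that it counts code points, which is still not quite the same thing as user-perceived characters.

```zig
const std = @import("std");

pub fn main() !void {
    const text = "Hello, 世界!";
    // utf8CountCodepoints returns an error if the bytes are not valid UTF-8.
    const codepoints = try std.unicode.utf8CountCodepoints(text);
    // Each of the two CJK characters takes three bytes.
    std.debug.print("bytes: {d}, code points: {d}\n", .{ text.len, codepoints });
    // Prints "bytes: 14, code points: 10"
}
```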
Checking for Emptiness
This one is pretty straightforward, but worth mentioning. You can check if a string is empty by looking at its length.
if (s.len == 0) {
// the string is empty
}
Iterating Over Strings
Looping over strings is a fundamental task. Here are a few ways to do it in Zig:
const std = @import("std");
pub fn main() !void {
const stdout = std.io.getStdOut().writer();
// Example 1: Loop over a string slice byte by byte
const message = "Hello, Zig!";
try stdout.print("\nExample 1: Looping over bytes in a string:\n", .{});
for (message, 0..) |char, i| {
try stdout.print("Byte at position {d}: '{c}'\n", .{ i, char });
}
// Example 2: Loop over an array of strings
try stdout.print("\nExample 2: Looping over an array of strings:\n", .{});
const words = [_][]const u8{ "apple", "banana", "cherry", "date" };
for (words, 0..) |word, i| {
try stdout.print("Word {d}: {s}\n", .{ i, word });
}
// Example 3: Loop over strings in a slice
try stdout.print("\nExample 3: Looping over strings in a slice:\n", .{});
const colors = &[_][]const u8{ "red", "green", "blue", "yellow" };
for (colors) |color| {
try stdout.print("Color: {s}\n", .{color});
}
// Example 4: Using an iterator with a while loop
try stdout.print("\nExample 4: Using while loop with an iterator:\n", .{});
var iterator = std.mem.splitScalar(u8, "one,two,three,four", ',');
while (iterator.next()) |item| {
try stdout.print("Item: {s}\n", .{item});
}
// Example 5: Using a UTF-8 iterator for proper Unicode handling
try stdout.print("\nExample 5: UTF-8 iteration (Unicode support):\n", .{});
const unicode_text = "Hello, 世界!";
var utf8_iterator = std.unicode.Utf8Iterator{ .bytes = unicode_text, .i = 0 };
var index: usize = 0;
while (utf8_iterator.nextCodepoint()) |codepoint| {
try stdout.print("Unicode codepoint at position {d}: U+{X:0>4}\n", .{ index, codepoint });
index += 1;
}
}
In this example, we cover looping over a string byte by byte, iterating through an array of strings, and using an iterator with while. Most importantly, we show how to handle Unicode correctly with std.unicode.Utf8Iterator.
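If the input might not be valid UTF-8, one option is to go through std.unicode.Utf8View, which validates the bytes before handing out an iterator. The sketch below uses a made-up nthCodepoint helper to fetch the bytes of the n-th code point.

```zig
const std = @import("std");

// Hypothetical helper: returns the bytes of the n-th code point (0-based),
// or null if the string has fewer code points than that.
fn nthCodepoint(s: []const u8, n: usize) !?[]const u8 {
    // Utf8View.init fails with error.InvalidUtf8 on malformed input.
    const view = try std.unicode.Utf8View.init(s);
    var it = view.iterator();
    var i: usize = 0;
    while (it.nextCodepointSlice()) |slice| : (i += 1) {
        if (i == n) return slice;
    }
    return null;
}

pub fn main() !void {
    if (try nthCodepoint("Hello, 世界!", 7)) |cp| {
        std.debug.print("code point 7 is '{s}' ({d} bytes)\n", .{ cp, cp.len });
    }
}
```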
Escape Sequences
Escape sequences let you include special characters in your string literals. Zig supports the usual suspects:
- \n for newline
- \t for tab
- \" for a double quote
- \\ for a backslash
- \xNN for a byte in hex
- \u{NNNN} for a Unicode scalar value in hex
You can find more about this in the Zig Language Reference: Source Encoding.
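Here is a tiny sketch of the hex and Unicode escapes in action. The constant name escaped is just for this example.

```zig
const std = @import("std");

// "\x5A" is the byte 0x5A ('Z'); "\u{69}" is the code point U+0069 ('i').
const escaped = "\x5A\u{69}g";

pub fn main() void {
    std.debug.print("{s}\n", .{escaped}); // Prints "Zig"
    // A code point above U+007F is stored as several UTF-8 bytes.
    std.debug.print("{d}\n", .{"\u{4E16}".len}); // Prints "3"
}
```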
Slicing Strings
You can grab a piece of a string (or any slice) with the [start..end] syntax. The end is exclusive and optional. For example:
const string = "abcdef";
const sub_string = string[1..4]; // "bcd"
This is a zero-cost abstraction—it just creates a new view into the original string without copying any data.
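Slicing past the end of a string is a bug that safe build modes catch with a panic, so it can help to clamp the bounds first. A minimal sketch, with a hypothetical prefix helper:

```zig
const std = @import("std");

// Hypothetical helper: returns at most `max` bytes from the start of `s`.
fn prefix(s: []const u8, max: usize) []const u8 {
    // Clamp the end so the slice never goes out of bounds.
    return s[0..@min(max, s.len)];
}

pub fn main() void {
    const s: []const u8 = "abcdef";
    std.debug.print("{s}\n", .{s[2..]}); // End omitted: prints "cdef"
    std.debug.print("{s}\n", .{prefix(s, 3)}); // Prints "abc"
    std.debug.print("{s}\n", .{prefix(s, 100)}); // Prints "abcdef"
}
```

One caveat: the indices are byte offsets, so cutting at an arbitrary position can split a multi-byte UTF-8 character in half.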
Comparing Strings
So, how do you compare strings? Zig doesn’t have built-in operators like == or < for strings. Instead, you’ll use functions from the standard library, mainly from the std.mem namespace.
You can use std.mem.eql to check if two strings are equal:
const std = @import("std");
pub fn main() !void {
const stdout = std.io.getStdOut().writer();
// Example 1: Basic string equality
try stdout.print("\nExample 1: Basic string equality\n", .{});
const str1 = "hello";
const str2 = "hello";
const str3 = "world";
const str4 = "Hello";
// Use std.mem.eql for string equality
const equal1 = std.mem.eql(u8, str1, str2);
const equal2 = std.mem.eql(u8, str1, str3);
const equal3 = std.mem.eql(u8, str1, str4);
try stdout.print(" \"{s}\" == \"{s}\": {}\n", .{ str1, str2, equal1 });
try stdout.print(" \"{s}\" == \"{s}\": {}\n", .{ str1, str3, equal2 });
try stdout.print(" \"{s}\" == \"{s}\": {}\n", .{ str1, str4, equal3 });
}
//Prints:
// Example 1: Basic string equality
// "hello" == "hello": true
// "hello" == "world": false
// "hello" == "Hello": false
For case-insensitive comparison, std.ascii.eqlIgnoreCase is what you’re looking for:
const std = @import("std");
pub fn main() !void {
const stdout = std.io.getStdOut().writer();
// Example 2: Case-insensitive comparison
try stdout.print("\nExample 2: Case-insensitive comparison\n", .{});
const upper = "HELLO";
const lower = "hello";
// Case-sensitive comparison
const case_sensitive_equal = std.mem.eql(u8, upper, lower);
try stdout.print(" Case-sensitive: \"{s}\" == \"{s}\": {}\n", .{ upper, lower, case_sensitive_equal });
// Case-insensitive comparison
const case_insensitive_equal = std.ascii.eqlIgnoreCase(upper, lower);
try stdout.print(" Case-insensitive: \"{s}\" == \"{s}\": {}\n", .{ upper, lower, case_insensitive_equal });
}
//Prints:
// Example 2: Case-insensitive comparison
// Case-sensitive: "HELLO" == "hello": false
// Case-insensitive: "HELLO" == "hello": true
To sort strings or compare them lexicographically (dictionary order), you can use std.mem.order:
const cmp = std.mem.order(u8, "abc", "abd");
// cmp will be .lt, .eq, or .gt
const std = @import("std");
pub fn main() !void {
const stdout = std.io.getStdOut().writer();
// Example 3: String ordering (lexicographical comparison)
try stdout.print("\nExample 3: String ordering (lexicographical comparison)\n", .{});
const a = "apple";
const b = "banana";
// Use std.mem.order for lexicographical ordering
const ordering = std.mem.order(u8, a, b);
try stdout.print(" \"{s}\" compared to \"{s}\": ", .{ a, b });
switch (ordering) {
.lt => try stdout.print("less than\n", .{}),
.eq => try stdout.print("equal\n", .{}),
.gt => try stdout.print("greater than\n", .{}),
}
}
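Since sorting came up, here is a minimal sketch that plugs std.mem.order into std.mem.sort. The lessThan and sortStrings names are made up for this example, and the ordering is byte-wise, so uppercase ASCII letters sort before lowercase ones.

```zig
const std = @import("std");

// Comparison function in the shape std.mem.sort expects.
fn lessThan(_: void, a: []const u8, b: []const u8) bool {
    return std.mem.order(u8, a, b) == .lt;
}

// Hypothetical helper that sorts a slice of strings in place (byte-wise order).
fn sortStrings(items: [][]const u8) void {
    std.mem.sort([]const u8, items, {}, lessThan);
}

pub fn main() void {
    var words = [_][]const u8{ "cherry", "apple", "banana" };
    sortStrings(&words);
    // Prints apple, banana, cherry on separate lines.
    for (words) |w| std.debug.print("{s}\n", .{w});
}
```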
By now, you’re probably noticing a pattern: many of the string utilities you’d expect are in the std.mem and std.ascii namespaces. Zig’s approach to strings is all about being explicit and efficient, giving you full control over memory and performance. Dive into std.mem and std.ascii to see what else is available.
Searching and Indexing
To check if a string contains a substring or a specific byte, you can use functions in std.mem. For instance, std.mem.indexOf can tell you if a substring exists:
const std = @import("std");
pub fn main() !void {
const stdout = std.io.getStdOut().writer();
const haystack = "The quick brown fox jumps over the lazy dog";
const needle1 = "fox";
const needle2 = "cat";
const contains1 = std.mem.indexOf(u8, haystack, needle1) != null;
const contains2 = std.mem.indexOf(u8, haystack, needle2) != null;
try stdout.print(" \"{s}\" contains \"{s}\": {}\n", .{ haystack, needle1, contains1 });
try stdout.print(" \"{s}\" contains \"{s}\": {}\n", .{ haystack, needle2, contains2 });
}
// Prints:
// "The quick brown fox jumps over the lazy dog" contains "fox": true
// "The quick brown fox jumps over the lazy dog" contains "cat": false
The indexOf function returns an optional index (?usize), so we check if it’s null. Another way to do this is with std.mem.containsAtLeast or containsAtLeastScalar:
const std = @import("std");
pub fn main() !void {
const stdout = std.io.getStdOut().writer();
const haystack = "The quick brown fox jumps over the lazy dog";
const needle1 = "fox";
const needle2 = "cat";
// Using std.mem.containsAtLeast
const contains1 = std.mem.containsAtLeast(u8, haystack, 1, needle1);
const contains2 = std.mem.containsAtLeast(u8, haystack, 1, needle2);
try stdout.print(" \"{s}\" contains \"{s}\": {}\n", .{ haystack, needle1, contains1 });
try stdout.print(" \"{s}\" contains \"{s}\": {}\n", .{ haystack, needle2, contains2 });
}
The results are the same. containsAtLeast checks if the haystack contains at least a certain number of occurrences of the needle.
There are plenty of other useful functions for finding things in strings, like std.mem.lastIndexOf, std.mem.lastIndexOfScalar, std.mem.indexOfPos, and std.mem.indexOfAny. Check out std.mem for all the options.
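As one example of the "last index" family, this sketch pulls a file extension off a name with std.mem.lastIndexOfScalar. The extension helper is hypothetical; for real paths, std.fs.path has dedicated functions.

```zig
const std = @import("std");

// Hypothetical helper: returns the text after the last '.', or null if there is no dot.
fn extension(name: []const u8) ?[]const u8 {
    const dot = std.mem.lastIndexOfScalar(u8, name, '.') orelse return null;
    return name[dot + 1 ..];
}

pub fn main() void {
    std.debug.print("{s}\n", .{extension("archive.tar.gz") orelse "(none)"}); // Prints "gz"
    std.debug.print("{s}\n", .{extension("README") orelse "(none)"}); // Prints "(none)"
}
```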
Prefix and Suffix Checks
Checking if a string starts or ends with a certain substring is a common task. You can use std.mem.startsWith and std.mem.endsWith for this. Here’s an example:
const std = @import("std");
pub fn main() !void {
const stdout = std.io.getStdOut().writer();
const full_string = "hello world";
const substring1 = "hello";
const substring2 = "world";
// Check if the string starts with a prefix
const starts_with1 = std.mem.startsWith(u8, full_string, substring1);
const starts_with2 = std.mem.startsWith(u8, full_string, substring2);
try stdout.print(" \"{s}\" starts with \"{s}\": {}\n", .{ full_string, substring1, starts_with1 });
try stdout.print(" \"{s}\" starts with \"{s}\": {}\n", .{ full_string, substring2, starts_with2 });
// Check if the string ends with a suffix
const ends_with1 = std.mem.endsWith(u8, full_string, substring1);
const ends_with2 = std.mem.endsWith(u8, full_string, substring2);
try stdout.print(" \"{s}\" ends with \"{s}\": {}\n", .{ full_string, substring1, ends_with1 });
try stdout.print(" \"{s}\" ends with \"{s}\": {}\n", .{ full_string, substring2, ends_with2 });
}
// Prints:
// "hello world" starts with "hello": true
// "hello world" starts with "world": false
// "hello world" ends with "hello": false
// "hello world" ends with "world": true
Once you know to look in std.mem, these kinds of tasks become second nature.
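A common follow-up is removing the prefix once you have found it. Here is a small sketch that combines startsWith with slicing; the stripPrefix name is invented for this example.

```zig
const std = @import("std");

// Hypothetical helper: drops `prefix` from the front of `s` if it is there.
fn stripPrefix(s: []const u8, prefix: []const u8) []const u8 {
    if (std.mem.startsWith(u8, s, prefix)) return s[prefix.len..];
    return s;
}

pub fn main() void {
    std.debug.print("{s}\n", .{stripPrefix("https://ziglang.org", "https://")}); // Prints "ziglang.org"
    std.debug.print("{s}\n", .{stripPrefix("ziglang.org", "https://")}); // Prints "ziglang.org"
}
```

Like slicing in general, this returns a view into the original string, so nothing is allocated.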
Formatting and Interpolation
The std.fmt namespace provides powerful functions for formatting values into strings. You’ve already seen this in action with std.debug.print and std.io.getStdOut().writer(). Here’s how you can do string interpolation:
const std = @import("std");
pub fn main() !void {
var gpa = std.heap.DebugAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();
const name = "Zig";
const count = 5;
// Use allocPrint to format into a newly allocated string.
const msg = try std.fmt.allocPrint(allocator, "Hello, {s}! Count: {d}", .{ name, count });
// Don't forget to free the memory!
defer allocator.free(msg);
std.debug.print("{s}\n", .{msg}); // Prints "Hello, Zig! Count: 5"
}
This requires an allocator because the final string size isn’t known at compile time. It’s similar to template literals in JavaScript: Hello, ${name}! Count: ${count}. The format specifiers like {s} (string) and {d} (decimal) tell Zig how to format the values. According to the documentation for fmt.format(), there are several other specifiers you can use:
- x and X: hexadecimal
- s: string
- e: scientific notation
- d: decimal
- b: binary
- o: octal
- c: ASCII character
- u: UTF-8 character
- ?: optional value
- !: error union
- *: pointer address
- any: default format
If you know the maximum size of the string you’re creating, you can use std.fmt.bufPrint to avoid dynamic memory allocation.
const std = @import("std");
pub fn main() !void {
var buffer: [50]u8 = undefined;
const count = 5;
const msg = try std.fmt.bufPrint(&buffer, "Hello, {s}! Count: {d}", .{ "Zig", count });
std.debug.print("{s}\n", .{msg}); // Prints "Hello, Zig! Count: 5"
}
This is a great pattern for creating formatted strings without hitting the allocator. The buffer just needs to be large enough to hold the result.
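What happens when the buffer is too small? As far as I can tell, bufPrint reports error.NoSpaceLeft instead of writing out of bounds. A minimal sketch, with a hypothetical greet helper:

```zig
const std = @import("std");

// Hypothetical helper: formats a greeting into a caller-supplied buffer.
fn greet(buf: []u8, name: []const u8) ![]u8 {
    return std.fmt.bufPrint(buf, "Hello, {s}!", .{name});
}

pub fn main() void {
    var small: [8]u8 = undefined;
    // "Hello, Zig!" needs 11 bytes, so an 8-byte buffer is too small.
    const msg = greet(&small, "Zig") catch |err| {
        std.debug.print("formatting failed: {s}\n", .{@errorName(err)});
        return;
    };
    std.debug.print("{s}\n", .{msg});
}
```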
There are many other cool functions in std.fmt worth exploring. For example, fmtDuration can format a time in nanoseconds into a human-readable string: try stdout.print("Duration: {}\n", .{std.fmt.fmtDuration(90 * std.time.ns_per_s)}); will print Duration: 1m30s.
Concatenation
To join strings together, you can use std.mem.concat. It takes a slice of strings and copies their bytes into a new, single slice.
const std = @import("std");
pub fn main() !void {
var gpa = std.heap.DebugAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();
const a = "foo";
const b = "bar";
const c = "baz";
const parts = &[_][]const u8{ a, b, c };
// Concatenate the slices in `parts` into a new string.
const joined = try std.mem.concat(allocator, u8, parts);
defer allocator.free(joined);
std.debug.print("Joined: {s}\n", .{joined}); // Prints "Joined: foobarbaz"
}
There are also concatMaybeSentinel and concatWithSentinel if you need to add a sentinel byte at the end.
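When every piece is known at compile time, you can skip the allocator entirely: the ++ operator joins arrays (including string literals) and ** repeats them. A quick sketch, with constant names chosen just for this example:

```zig
const std = @import("std");

// Both operands must be known at compile time.
const greeting = "Hello, " ++ "world!";
// ** repeats a compile-time-known array.
const rule = "-" ** 5;

pub fn main() void {
    std.debug.print("{s}\n{s}\n", .{ greeting, rule });
    // Prints "Hello, world!" and then "-----"
}
```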
Performance Tip: Concatenating strings with concat allocates a new buffer every time. If you’re doing a lot of concatenations in a loop, it’s much more efficient to use an ArrayList to build the string incrementally. If you know the final size, you can pre-allocate with try list.ensureTotalCapacity(expected_size) to avoid reallocations.
String Builders and Buffers
For building strings efficiently, especially when you’re appending multiple times, std.ArrayList(u8) is your best friend. It helps minimize reallocations.
const std = @import("std");
pub fn main() !void {
var gpa = std.heap.DebugAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();
// Initialize an ArrayList with an allocator.
var list = std.ArrayList(u8).init(allocator);
// `defer list.deinit()` makes sure the list's internal buffer is freed.
defer list.deinit();
// Append slices or individual bytes.
try list.appendSlice("Hello, ");
try list.appendSlice("world");
try list.append('!'); // ArrayList has no appendByte; append adds a single item.
// Get the resulting slice. The ArrayList still owns the data.
const result: []const u8 = list.items;
std.debug.print("Built string: {s}\n", .{result}); // Prints "Built string: Hello, world!"
}
Check out std.ArrayList for more info.
Performance Note: When building strings with many appends, you can get a nice performance boost by pre-allocating the ArrayList’s capacity if you have a good estimate of the final size. Use try list.ensureTotalCapacity(estimated_size) before you start appending.
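An ArrayList(u8) can also act as a formatting target through its writer(), and toOwnedSlice() hands the finished string to the caller. Here is a sketch against the 0.14-style API used in this guide; the numberedList helper is made up for illustration.

```zig
const std = @import("std");

// Hypothetical helper: builds "1. a\n2. b\n..." and returns an owned string.
fn numberedList(allocator: std.mem.Allocator, items: []const []const u8) ![]u8 {
    var list = std.ArrayList(u8).init(allocator);
    // Only clean up here if we fail part-way; on success the caller owns the bytes.
    errdefer list.deinit();
    for (items, 1..) |item, n| {
        try list.writer().print("{d}. {s}\n", .{ n, item });
    }
    return list.toOwnedSlice();
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;
    const out = try numberedList(allocator, &.{ "split", "join", "trim" });
    defer allocator.free(out);
    std.debug.print("{s}", .{out});
}
```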
Splitting and Joining
To split a string by a single delimiter, use std.mem.splitScalar. It returns an iterator over the resulting slices, and best of all, it doesn’t require any memory allocation.
const std = @import("std");
pub fn main() void {
const s = "a,b,c";
// Split the string `s` by the comma character.
var it = std.mem.splitScalar(u8, s, ',');
while (it.next()) |part| {
// `part` is just a slice (a view) of the original string.
std.debug.print("Part: {s}\n", .{part});
}
// Output:
// Part: a
// Part: b
// Part: c
}
To join a slice of strings with a separator, use std.mem.join. This does require an allocator since it creates a new string.
const std = @import("std");
pub fn main() !void {
var gpa = std.heap.DebugAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();
const parts = &[_][]const u8{"a", "b", "c"};
const separator = ",";
// Join the slices with the separator.
const joined = try std.mem.join(allocator, separator, parts);
defer allocator.free(joined);
std.debug.print("Joined: {s}\n", .{joined}); // Prints "Joined: a,b,c"
}
See std.mem.splitScalar and std.mem.join in the docs. For splitting by multiple characters or a sequence of characters, check out std.mem.splitAny and std.mem.splitSequence.
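One detail worth knowing: splitScalar yields empty slices between adjacent delimiters, while the related std.mem.tokenizeScalar skips them. The sketch below counts the pieces both ways; the two helper names are invented for this example.

```zig
const std = @import("std");

// Hypothetical helper: counts fields, including empty ones.
fn countSplit(s: []const u8) usize {
    var it = std.mem.splitScalar(u8, s, ',');
    var n: usize = 0;
    while (it.next()) |_| n += 1;
    return n;
}

// Hypothetical helper: counts only the non-empty tokens.
fn countTokens(s: []const u8) usize {
    var it = std.mem.tokenizeScalar(u8, s, ',');
    var n: usize = 0;
    while (it.next()) |_| n += 1;
    return n;
}

pub fn main() void {
    const s = "a,,b";
    // split sees "a", "" and "b"; tokenize sees only "a" and "b".
    std.debug.print("split: {d}, tokenize: {d}\n", .{ countSplit(s), countTokens(s) });
}
```

So split is the better fit for formats like CSV where empty fields matter, and tokenize is handy for things like whitespace-separated words.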
Trimming
To remove leading or trailing characters (like whitespace), use std.mem.trim. It returns a new slice that is a view into the original string, so no allocation is needed.
const std = @import("std");
pub fn main() void {
const s = " hello ";
const whitespace = " \t\n\r"; // Characters to trim
// Trim leading and trailing whitespace.
const trimmed = std.mem.trim(u8, s, whitespace);
std.debug.print("Trimmed: '{s}'\n", .{trimmed}); // Prints "Trimmed: 'hello'"
// Trim only from the start.
const trim_start = std.mem.trimLeft(u8, s, whitespace);
std.debug.print("TrimStart: '{s}'\n", .{trim_start}); // Prints "TrimStart: 'hello '"
// Trim only from the end.
const trim_end = std.mem.trimRight(u8, s, whitespace);
std.debug.print("TrimEnd: '{s}'\n", .{trim_end}); // Prints "TrimEnd: ' hello'"
}
trim doesn’t allocate memory. If you need an owned, trimmed string, you’ll have to allocate memory and copy the result yourself.
See std.mem.trim for more.
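If you do need an owned copy of the trimmed text, one way is to trim first and then duplicate the resulting view. A minimal sketch, assuming a made-up trimmedCopy helper:

```zig
const std = @import("std");

// Hypothetical helper: returns a newly allocated copy of `s` without
// surrounding whitespace. The caller must free the result.
fn trimmedCopy(allocator: std.mem.Allocator, s: []const u8) ![]u8 {
    const view = std.mem.trim(u8, s, " \t\r\n");
    return allocator.dupe(u8, view);
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;
    const owned = try trimmedCopy(allocator, "  hello \n");
    defer allocator.free(owned);
    std.debug.print("'{s}'\n", .{owned}); // Prints "'hello'"
}
```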
Replacing Substrings
To replace all occurrences of a substring and get the result as a new string, use std.mem.replaceOwned. This requires an allocator because the length of the resulting string can change. (The similarly named std.mem.replace does not allocate; it writes into a buffer you provide.)
const std = @import("std");
pub fn main() !void {
var gpa = std.heap.DebugAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();
const s = "foo bar foo baz foo";
const old = "foo";
const new = "zig";
// Replace all occurrences of `old` with `new` in a newly allocated string.
const replaced = try std.mem.replaceOwned(u8, allocator, s, old, new);
defer allocator.free(replaced);
std.debug.print("Original: {s}\n", .{s});
std.debug.print("Replaced: {s}\n", .{replaced}); // Prints "Replaced: zig bar zig baz zig"
}
See std.mem.replaceOwned for details.
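When an allocation is not wanted, the buffer-based route is std.mem.replacementSize (to learn how many bytes the result needs) followed by std.mem.replace with an output slice. A sketch under those assumptions; the replaceInto helper is hypothetical.

```zig
const std = @import("std");

// Hypothetical helper: writes the replaced text into `buf` and returns the
// used part, or null if `buf` is too small.
fn replaceInto(buf: []u8, input: []const u8, needle: []const u8, replacement: []const u8) ?[]u8 {
    const needed = std.mem.replacementSize(u8, input, needle, replacement);
    if (needed > buf.len) return null;
    // std.mem.replace returns the number of replacements, which we ignore here.
    _ = std.mem.replace(u8, input, needle, replacement, buf[0..needed]);
    return buf[0..needed];
}

pub fn main() void {
    var buf: [64]u8 = undefined;
    if (replaceInto(&buf, "foo bar foo", "foo", "zig")) |out| {
        std.debug.print("{s}\n", .{out}); // Prints "zig bar zig"
    }
}
```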
Case Conversion
To convert ASCII strings to upper or lower case, you can use functions from std.ascii. These usually require an allocator or a pre-allocated mutable buffer.
const std = @import("std");
pub fn main() !void {
var gpa = std.heap.DebugAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();
const s = "Hello World";
// Allocate a new string for the lowercased version.
const lower = try std.ascii.allocLowerString(allocator, s);
defer allocator.free(lower);
// Allocate a new string for the uppercased version.
const upper = try std.ascii.allocUpperString(allocator, s);
defer allocator.free(upper);
std.debug.print("Original: {s}\n", .{s});
std.debug.print("Lower: {s}\n", .{lower}); // Prints "Lower: hello world"
std.debug.print("Upper: {s}\n", .{upper}); // Prints "Upper: HELLO WORLD"
// You can also modify a string in-place if you have a mutable buffer.
var buffer: [32]u8 = undefined;
const mutable_slice = try std.fmt.bufPrint(&buffer, "Mutable", .{});
// lowerString(output, input) writes the lowercased input into output;
// passing the same slice for both converts it in place.
_ = std.ascii.lowerString(mutable_slice, mutable_slice);
std.debug.print("In-place lower: {s}\n", .{mutable_slice}); // Prints "In-place lower: mutable"
}
A quick heads-up: these functions only work correctly for ASCII characters. For full Unicode case conversion, you’d need a more advanced library.
See std.ascii.allocLowerString and std.ascii.allocUpperString.
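For single characters there are std.ascii.toUpper and std.ascii.toLower, which makes small in-place tweaks easy. Here is a sketch with a made-up capitalize helper (ASCII only, like the rest of std.ascii):

```zig
const std = @import("std");

// Hypothetical helper: uppercases the first byte of a mutable string in place.
fn capitalize(s: []u8) void {
    if (s.len > 0) s[0] = std.ascii.toUpper(s[0]);
}

pub fn main() void {
    // Dereferencing a literal copies it into a mutable array.
    var word = "zig".*;
    capitalize(&word);
    std.debug.print("{s}\n", .{word}); // Prints "Zig"
}
```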
Padding
To pad a string to a certain length, you can use the formatting options in std.fmt. This will require an allocator if you’re using allocPrint.
const std = @import("std");
pub fn main() !void {
var gpa = std.heap.DebugAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();
const s = "42";
// Right-align (left-pad with spaces) to a width of 5.
const padded_right = try std.fmt.allocPrint(allocator, "{s:>5}", .{s});
defer allocator.free(padded_right);
std.debug.print("Padded Right: '{s}'\n", .{padded_right}); // Prints "Padded Right: '   42'"
// Left-align (right-pad with spaces) to a width of 5.
const padded_left = try std.fmt.allocPrint(allocator, "{s:<5}", .{s});
defer allocator.free(padded_left);
std.debug.print("Padded Left: '{s}'\n", .{padded_left}); // Prints "Padded Left: '42   '"
// Pad with zeros instead of spaces.
const padded_zeros = try std.fmt.allocPrint(allocator, "{s:0>5}", .{s});
defer allocator.free(padded_zeros);
std.debug.print("Padded Zeros: '{s}'\n", .{padded_zeros}); // Prints "Padded Zeros: '00042'"
}
You can also pad into a fixed-size buffer with std.fmt.bufPrint if you know the maximum size ahead of time.
For more on this, check out the format specifications section in the std.fmt documentation.
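Here is what the buffer-based version might look like, using center alignment (^) with a custom fill character. The centered helper and the width of 7 are just choices for this sketch.

```zig
const std = @import("std");

// Hypothetical helper: centers `s` in a field of width 7, filled with '*'.
fn centered(buf: []u8, s: []const u8) ![]u8 {
    return std.fmt.bufPrint(buf, "{s:*^7}", .{s});
}

pub fn main() !void {
    var buf: [16]u8 = undefined;
    const out = try centered(&buf, "zig");
    std.debug.print("{s}\n", .{out}); // Prints "**zig**"
}
```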
Type Conversion: String and Number
To parse a string into a number, you can use functions like std.fmt.parseInt or std.fmt.parseFloat. These will return an error if the parsing fails.
const std = @import("std");
pub fn main() !void {
const s_int = "123";
const s_float = "3.14";
const s_invalid = "not a number";
// Parse an integer (base 10)
const n_int = try std.fmt.parseInt(i32, s_int, 10);
std.debug.print("Parsed int: {d}\n", .{n_int}); // Prints "Parsed int: 123"
// Parse a float
const n_float = try std.fmt.parseFloat(f64, s_float);
std.debug.print("Parsed float: {d}\n", .{n_float}); // Prints "Parsed float: 3.14"
// Handle a parsing error
const n_fail = std.fmt.parseInt(i32, s_invalid, 10) catch |err| {
std.debug.print("Failed to parse '{s}': {}\n", .{ s_invalid, err });
// Return a default value or handle the error in another way
return; // Or: return err;
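};
// Only reached if parsing succeeded. Zig rejects unused locals, so use the value.
std.debug.print("Parsed: {d}\n", .{n_fail});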
}
To format a number as a string, you can use std.fmt.allocPrint (which needs an allocator) or std.fmt.bufPrint (which needs a buffer).
const std = @import("std");
pub fn main() !void {
var gpa = std.heap.DebugAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();
const n: i32 = 42;
const f: f32 = 1.618;
// Format an integer to an allocated string
const s_int = try std.fmt.allocPrint(allocator, "{}", .{n});
defer allocator.free(s_int);
std.debug.print("Int as string: {s}\n", .{s_int}); // Prints "Int as string: 42"
// Format a float to an allocated string with 3 decimal places
const s_float = try std.fmt.allocPrint(allocator, "{d:.3}", .{f});
defer allocator.free(s_float);
std.debug.print("Float as string: {s}\n", .{s_float}); // Prints "Float as string: 1.618"
// Format into a fixed-size buffer
var buffer: [10]u8 = undefined;
const s_buf = try std.fmt.bufPrint(&buffer, "{}", .{n});
std.debug.print("Int in buffer: {s}\n", .{s_buf}); // Prints "Int in buffer: 42"
}
For more, see std.fmt.parseInt, std.fmt.parseFloat, std.fmt.allocPrint, and std.fmt.bufPrint.
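The target type and the base both matter when parsing. This sketch uses a hypothetical parsePort helper to show the range check you get for free from the type, plus parsing in other bases:

```zig
const std = @import("std");

// Hypothetical helper: a u16 target means out-of-range values fail with error.Overflow.
fn parsePort(s: []const u8) !u16 {
    return std.fmt.parseInt(u16, s, 10);
}

pub fn main() !void {
    std.debug.print("{d}\n", .{try parsePort("8080")}); // Prints "8080"
    // Base 16 without a prefix.
    std.debug.print("{d}\n", .{try std.fmt.parseInt(u8, "ff", 16)}); // Prints "255"
    // Base 0 auto-detects a prefix such as 0x.
    std.debug.print("{d}\n", .{try std.fmt.parseInt(i32, "0x1F", 0)}); // Prints "31"
}
```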
Error Handling in String Operations
Many of Zig’s string functions return error unions or optionals. Some common errors you’ll encounter are:
- error.OutOfMemory: The allocator couldn’t get enough memory.
- error.InvalidCharacter: You tried to parse a string with invalid characters.
- error.Overflow: You parsed a number that was too big for the type you specified.
- error.NoSpaceLeft: You tried to write past the end of a fixed buffer (for example, with std.fmt.bufPrint).
You have to handle these errors explicitly with try or catch.
const std = @import("std");
pub fn main() !void {
const s = "not a number";
// Option 1: Use `try` to propagate the error up the call stack.
// This would return an error from the main function.
// const value = try std.fmt.parseInt(i32, s, 10);
// Option 2: Use `catch` to handle the error right here.
const n = std.fmt.parseInt(i32, s, 10) catch |err| blk: {
std.debug.print("Parse failed: {s}\n", .{@errorName(err)});
break :blk 0; // Fall back to a default so the rest of the example still runs.
};
std.debug.print("n = {d}\n", .{n});
// For functions that return optionals, use `if` or `orelse`.
const index = std.mem.indexOf(u8, "hello world", "xyz");
if (index) |i| {
std.debug.print("Found at {d}\n", .{i});
} else {
std.debug.print("Not found\n", .{});
}
// Or, you can use `orelse`.
const pos = std.mem.indexOf(u8, "hello world", "xyz") orelse {
std.debug.print("Substring not found\n", .{});
return;
};
// Zig rejects unused locals, so do something with the value.
std.debug.print("Found at {d}\n", .{pos});
}
Proper error handling is a cornerstone of Zig’s design. Good string manipulation code always accounts for these possibilities.
See the Error Handling section in the language documentation for more.
Wrap Up
And that’s a wrap on our tour of strings in Zig! We’ve covered a lot, from the basics of what a []const u8 slice is, to the details of memory management with allocators, and a whole bunch of practical operations like formatting, splitting, and joining.
My goal was to create the kind of guide I wish I had when I was starting out—something to help fellow developers get comfortable with how Zig handles strings. I’m still learning too, so if you see anything that could be improved, have a suggestion, or just want to ask a question, please drop a comment below. Your feedback is incredibly valuable and helps make this guide better for everyone.
If you enjoyed this deep dive and want more content about Zig, be sure to subscribe to my newsletter for future guides and updates. Thanks for reading, and happy coding!