Working with Strings in Zig: A Comprehensive Guide

Strings in Zig work differently from most languages you might be familiar with. There’s no built-in string type, no garbage collector managing memory for you, and you’re working directly with bytes rather than abstract character sequences. This guide covers everything I’ve learned about representing and manipulating strings in Zig, from the basics of slices and memory allocation through to Unicode handling and C interoperability.

The examples were written with Zig 0.14.0, and syntax may have changed in later versions.

What is a String in Zig?

Zig doesn’t have a dedicated string type. If you’re coming from JavaScript, Ruby, or PHP, this might feel strange at first—in those languages, strings are their own special type that the runtime manages for you. In Zig, what you’ll work with instead is []const u8 or []u8, which are byte slices. Think of a slice as a view into a sequence of bytes stored somewhere in memory.

Here’s the key mental shift: a Zig “string” is really just two things bundled together—a pointer to where the bytes live, and a number telling you how many bytes there are. That’s it. There’s no hidden magic, no automatic memory management, and no built-in encoding. The bytes could represent UTF-8 text, binary data, or anything else. Zig doesn’t care.
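
To make the "pointer plus length" idea concrete, here is a minimal sketch that inspects a slice's two fields. The variable name is mine, and the size comparison assumes a typical target where a slice is stored as two machine words.

```zig
const std = @import("std");

pub fn main() void {
    const s: []const u8 = "Zig";

    // A slice exposes exactly two fields: a many-item pointer and a length.
    std.debug.print("ptr: {*}, len: {d}\n", .{ s.ptr, s.len });

    // On typical targets the slice itself is two machine words wide.
    std.debug.print("slice: {d} bytes, usize: {d} bytes\n", .{ @sizeOf([]const u8), @sizeOf(usize) });
}
```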

What makes Zig practical for working with text is that string literals are UTF-8 encoded by default, and the standard library provides helpful functions that assume UTF-8. But here’s the catch: at the type level, Zig makes no guarantees about what’s actually in those bytes. A []const u8 doesn’t own the data it points to—it’s just a window looking at bytes stored somewhere else. If you need to create new strings or keep them around, you’re responsible for allocating and freeing that memory yourself. This is very different from garbage-collected languages where you can create strings freely and let the runtime clean up after you.

String Literals and Data Types

You can create a string literal in Zig using double quotes, just like you’d expect:

const hello = "Hello, world!";

Zig infers the type of hello as *const [13:0]u8. This is a pointer to a constant array of 13 bytes, with a null terminator (0) at the end. Because string literals have both a known length and a null terminator, they can be easily converted to both Slices and Null-Terminated Pointers.

You can get a slice from a literal like this:

const hello: []const u8 = "Hello, world!";

Here, Zig automatically converts the null-terminated array pointer into a slice, conveniently handling the length for you.
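
As a rough illustration of those conversions, the sketch below coerces one literal to a plain slice, a sentinel-terminated slice, and a C-style pointer. The variable names are illustrative only; nothing is copied in any of the three cases.

```zig
const std = @import("std");

pub fn main() void {
    const literal = "Hello, world!";

    // The same literal coerces to each of these without copying any bytes.
    const slice: []const u8 = literal; // pointer + length
    const sentinel_slice: [:0]const u8 = literal; // pointer + length, with a 0 guaranteed at [len]
    const c_ptr: [*:0]const u8 = literal; // pointer only; the length is found by scanning for 0

    std.debug.print("literal type: {s}\n", .{@typeName(@TypeOf(literal))});
    std.debug.print("slice.len = {d}, sentinel_slice.len = {d}\n", .{ slice.len, sentinel_slice.len });
    std.debug.print("std.mem.len(c_ptr) = {d}\n", .{std.mem.len(c_ptr)});
}
```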

You can also create multi-line string literals:

const multi =
    \\This is a
    \\multi-line string.
;

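In a multi-line string literal, every line starts with \\ and runs to the end of that line. There is no closing delimiter, and escape sequences are not processed. The lines are joined with a newline, with no newline after the last one. This is particularly useful for embedding long text blocks without having to escape quotes or backslashes.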

To get the length of a string in bytes, you can use the .len property. Keep in mind this is the number of bytes, not necessarily the number of characters.

const len = hello.len; // 13

For more details, check out the Zig Language Reference for String Literals.

Copy vs. Reference

In a language like JavaScript, strings behave like values: they are immutable, so after you assign a string to a new variable, nothing done through one name can affect the other. In Zig, a string slice is a pointer and a length. When you assign it, you’re just copying the pointer and length, not the underlying data. Both variables will point to the same bytes in memory.

const a: []const u8 = "hello";
const b = a;
// a and b point to the same data
std.debug.print("a.ptr: {*}, b.ptr: {*}\n", .{ a.ptr, b.ptr });
// both hold the same pointer values. e.g. a.ptr: u8@100ecd498, b.ptr: u8@100ecd498

If you need a true copy of the string data, you’ll have to allocate memory and copy the bytes yourself. See Memory Management and Allocators for how to do this.
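
The sharing becomes visible once the underlying bytes are mutable. In this small sketch (the buffer and names are mine, not from the examples above), a write through one slice shows up through the other:

```zig
const std = @import("std");

pub fn main() void {
    var buffer = [_]u8{ 'h', 'e', 'l', 'l', 'o' };

    const a: []u8 = &buffer;
    const b = a; // copies the pointer and the length, not the bytes

    b[0] = 'j'; // write through `b`...
    std.debug.print("a: {s}, b: {s}\n", .{ a, b }); // ...and `a` sees it too: both print "jello"
}
```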

Mutability: []const u8 vs []u8

String literals in Zig are immutable. A literal is a pointer to a constant array, which coerces to []const u8, a read-only slice of bytes. You can’t change its contents.

If you need to modify a string, you must create a mutable slice, []u8. This is typically done by creating a buffer (an array) and slicing it.

Here’s an example that shows the difference:

const std = @import("std");

pub fn main() !void {
    // String literals are immutable.
    const immutable_greeting: []const u8 = "Hello";
    // immutable_greeting[0] = 'h'; // This would be a compile error.

    // To create a mutable string, you need a mutable buffer.
    var buffer: [16]u8 = undefined;
    // The buffer must be large enough to hold the formatted string, otherwise this will error.
    const mutable_slice = try std.fmt.bufPrint(&buffer, "{s}", .{immutable_greeting});

    std.debug.print("Before: {s}\n", .{mutable_slice});

    // Now we can change the contents of the mutable slice.
    mutable_slice[0] = 'h';
    std.debug.print("After: {s}\n", .{mutable_slice});
}

In this example, attempting to change immutable_greeting would fail at compile time. However, after copying the string into a buffer, we get a mutable_slice that we can freely modify.

Memory Management and Allocators

Zig doesn’t have a garbage collector like JavaScript or Python. If you need to create new string data (for example, by concatenating strings or formatting them), you have to explicitly allocate memory. Zig handles this with allocators, all of which are used through the standard library’s std.mem.Allocator interface.

This is a big shift if you’re coming from a garbage-collected language. You are in charge of the memory you allocate. If you don’t free it, you’ll have memory leaks. Here’s how you can allocate a new string on the heap:

const std = @import("std");

pub fn main() !void {
    // Set up an allocator for memory management.
    var gpa = std.heap.DebugAllocator(.{}){};
    // This assertion will panic if we leaked memory, which is great for debugging.
    defer std.debug.assert(gpa.deinit() == .ok);
    const allocator = gpa.allocator(); // Get the Allocator interface.

    const original = "hello";

    // Allocate memory and copy the string in one step with `dupe`.
    const copy = try allocator.dupe(u8, original);

    // Use `defer` to ensure the memory is freed when the function exits.
    defer allocator.free(copy);

    std.debug.print("Original: {s}, Copy: {s}\n", .{ original, copy });
    // Now `copy` is its own string, living on the heap.
}

The dupe function is a convenient shorthand that allocates memory and copies the bytes in one step. If you ever need more control—for example, allocating a buffer of a different size—you can use alloc and @memcpy separately:

const copy = try allocator.alloc(u8, original.len);
defer allocator.free(copy);
@memcpy(copy, original);

Choosing the right allocator (like ArenaAllocator for temporary allocations) is an important consideration. Just remember to free what you alloc. For more on this, see std.mem.Allocator and the guide on choosing an allocator.
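
For short-lived strings, an arena can simplify the bookkeeping. The following is a minimal sketch, assuming std.heap.page_allocator as the backing allocator (any allocator would do): individual allocations are never freed one by one, and a single deinit releases them all.

```zig
const std = @import("std");

pub fn main() !void {
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    // One call frees everything that was allocated from the arena.
    defer arena.deinit();
    const allocator = arena.allocator();

    const first = try allocator.dupe(u8, "hello");
    const second = try std.fmt.allocPrint(allocator, "{s}, world", .{first});

    std.debug.print("{s}\n", .{second}); // Prints "hello, world"
}
```

The trade-off is that memory is held until the arena is torn down, so this pattern suits work with a clear end point, such as handling one request or parsing one file.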

A note on error handling: Throughout this guide, you’ll see try and catch used frequently. Many string operations in Zig can fail, whether because a memory allocation fails, the input is invalid, or a destination buffer is too small. We cover error handling patterns in detail in the Error Handling in String Operations section, but for now, just know that try propagates errors up the call stack, whilst catch lets you handle them locally.

Unicode and UTF-8 Handling

Zig string literals are UTF-8 encoded, and by convention, strings in Zig are expected to contain valid UTF-8. However, it’s important to understand that a []u8 or []const u8 is simply a sequence of bytes—Zig doesn’t enforce any particular encoding at the type level. This means you could store Latin-1, UTF-16, or even binary data in a byte slice. The standard library’s string functions assume UTF-8, so sticking to this convention keeps things consistent.

Because UTF-8 is a variable-width encoding, a single character (or more precisely, a Unicode code point) can be made up of one to four bytes. The .len property gives you the byte count, not the character count. This is different from JavaScript’s .length, which gives you the number of UTF-16 code units.

For example, the string "Hello, 世界!" has 14 bytes but only 10 characters. The two Chinese characters each require 3 bytes in UTF-8, and the remaining eight characters are ASCII, at one byte each.

const greeting = "Hello, 世界!";
std.debug.print("Bytes: {d}\n", .{greeting.len}); // Prints: Bytes: 14

Common Pitfalls

One important thing to watch out for: slicing a string at arbitrary byte positions can split a multi-byte character in half, resulting in invalid UTF-8. This won’t cause a compile error, but it can lead to garbled output or unexpected behaviour when the string is processed later.

const text = "café";
// The 'é' character takes 2 bytes (positions 3 and 4).
// Slicing at position 4 splits the character!
const broken = text[0..4]; // This produces invalid UTF-8
const correct = text[0..5]; // This correctly includes the full 'é'

To safely work with Unicode strings, you should iterate over code points rather than bytes. The Iterating Over Strings section shows how to do this with std.unicode.Utf8View.
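
To see what the bad slice actually contains, this sketch dumps its bytes in hex and asks the standard library how long a sequence the final byte announces. It is only an illustration of the "café" case above.

```zig
const std = @import("std");

pub fn main() !void {
    const text = "café";
    const broken: []const u8 = text[0..4];

    // Print each byte in hex: 63 61 66 c3
    for (broken) |byte| {
        std.debug.print("{x:0>2} ", .{byte});
    }
    std.debug.print("\n", .{});

    // 0xC3 announces a two-byte sequence, but its continuation byte (0xA9) was cut off.
    const last = broken[broken.len - 1];
    const expected_len = try std.unicode.utf8ByteSequenceLength(last);
    std.debug.print("lead byte 0x{x} expects a {d}-byte sequence\n", .{ last, expected_len });
}
```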

Validating UTF-8

Zig trusts you to put valid UTF-8 in a []const u8, but it doesn’t verify it automatically. You might run into issues if you pass invalid UTF-8 to a system that expects it to be valid, or when working with systems that use a different encoding (like UTF-16, which is common for Windows APIs).

You can check if a string is valid UTF-8 using std.unicode.utf8ValidateSlice:

const std = @import("std");

pub fn main() !void {
    // This is an invalid UTF-8 sequence (an overlong two-byte encoding of U+0001).
    const invalid: []const u8 = "\xC0\x81";
    const valid = "Hello, 世界!";

    std.debug.print("Invalid string is valid UTF-8: {}\n", .{std.unicode.utf8ValidateSlice(invalid)}); // false
    std.debug.print("Valid string is valid UTF-8: {}\n", .{std.unicode.utf8ValidateSlice(valid)}); // true
}

Counting Characters

If you need to count the actual number of Unicode code points in a string, you can use std.unicode.utf8CountCodepoints:

const std = @import("std");

pub fn main() !void {
    const text = "Hello, 世界!";
    const byte_count = text.len;
    const codepoint_count = try std.unicode.utf8CountCodepoints(text);

    std.debug.print("Bytes: {d}, Code points: {d}\n", .{ byte_count, codepoint_count });
    // Prints: Bytes: 14, Code points: 10
}

For more on iterating over Unicode characters properly, see the next section on Iterating Over Strings.

Working with UTF-16

If you need to work with UTF-16 (common when dealing with Windows APIs or interfacing with languages like JavaScript that use UTF-16 internally), the approach is similar to UTF-8. You use []u16 or []const u16 slices instead of []u8.

The main difference is that Zig doesn’t have convenient UTF-16 string literals. However, the standard library provides std.unicode.utf8ToUtf16LeStringLiteral to convert UTF-8 literals to UTF-16 at compile time:

const std = @import("std");

pub fn main() void {
    // Convert a UTF-8 string literal to UTF-16 (little-endian) at compile time.
    const utf16_string = std.unicode.utf8ToUtf16LeStringLiteral("Hello, 世界!");

    std.debug.print("UTF-16 length: {d} code units\n", .{utf16_string.len});
    // Prints: UTF-16 length: 10 code units
}

For runtime conversions, you can use std.unicode.utf8ToUtf16Le which writes to a buffer you provide.
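
Here is a minimal sketch of the runtime path. The buffer size of 32 is an arbitrary choice for this example. As far as I can tell, the function does not report a too-small buffer as an error, so size it generously; one u16 per input byte is always enough.

```zig
const std = @import("std");

pub fn main() !void {
    const utf8: []const u8 = "Hello, 世界!";

    // The destination must be large enough; one u16 per input byte is always sufficient.
    var buffer: [32]u16 = undefined;
    const written = try std.unicode.utf8ToUtf16Le(&buffer, utf8);
    const utf16: []const u16 = buffer[0..written];

    std.debug.print("UTF-8 bytes: {d}, UTF-16 code units: {d}\n", .{ utf8.len, utf16.len });
}
```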

Checking for Emptiness

You can check if a string is empty by looking at its length:

if (s.len == 0) {
    // the string is empty
}

Iterating Over Strings

Looping over strings is a fundamental task. Here are a few ways to do it in Zig:

const std = @import("std");

pub fn main() !void {
    const stdout = std.io.getStdOut().writer();

    // Example 1: Loop over a string slice byte by byte
    const message = "Hello, Zig!";
    try stdout.print("\nExample 1: Looping over bytes in a string:\n", .{});

    for (message, 0..) |char, i| {
        try stdout.print("Byte at position {d}: '{c}'\n", .{ i, char });
    }

    // Example 2: Loop over an array of strings
    try stdout.print("\nExample 2: Looping over an array of strings:\n", .{});

    const words = [_][]const u8{ "apple", "banana", "cherry", "date" };
    for (words, 0..) |word, i| {
        try stdout.print("Word {d}: {s}\n", .{ i, word });
    }

    // Example 3: Loop over strings in a slice
    try stdout.print("\nExample 3: Looping over strings in a slice:\n", .{});

    const colors = &[_][]const u8{ "red", "green", "blue", "yellow" };
    for (colors) |color| {
        try stdout.print("Color: {s}\n", .{color});
    }

    // Example 4: Using an iterator with a while loop
    try stdout.print("\nExample 4: Using while loop with an iterator:\n", .{});

    var iterator = std.mem.splitScalar(u8, "one,two,three,four", ',');
    while (iterator.next()) |item| {
        try stdout.print("Item: {s}\n", .{item});
    }

    // Example 5: Using a UTF-8 iterator for proper Unicode handling
    try stdout.print("\nExample 5: UTF-8 iteration (Unicode support):\n", .{});

    const unicode_text = "Hello, 世界!";
    const utf8_view = std.unicode.Utf8View.initUnchecked(unicode_text);
    var utf8_iterator = utf8_view.iterator();
    var index: usize = 0;
    while (utf8_iterator.nextCodepoint()) |codepoint| {
        try stdout.print("Unicode codepoint at position {d}: U+{X:0>4}\n", .{ index, codepoint });
        index += 1;
    }
}

In this example, we cover looping over a string byte by byte, iterating through an array of strings, and using an iterator with while. Most importantly, we show how to handle Unicode correctly with std.unicode.Utf8View.
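
initUnchecked is fine for literals you control, but for input you don’t trust, the checked std.unicode.Utf8View.init is probably the safer choice. A minimal sketch, with made-up inputs:

```zig
const std = @import("std");

pub fn main() void {
    const inputs = [_][]const u8{ "Zig ⚡", "bad \xff byte" };

    for (inputs) |input| {
        // `init` validates the whole slice up front and returns an error for invalid UTF-8.
        const view = std.unicode.Utf8View.init(input) catch {
            std.debug.print("skipping invalid UTF-8 input ({d} bytes)\n", .{input.len});
            continue;
        };

        var it = view.iterator();
        var count: usize = 0;
        while (it.nextCodepoint()) |_| count += 1;
        std.debug.print("\"{s}\": {d} code points\n", .{ input, count });
    }
}
```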

Escape Sequences

Escape sequences let you include special characters in your string literals. Zig supports the usual suspects:

  • \n for newline
  • \t for tab
  • \" for a double quote
  • \\ for a backslash
  • \xNN for a byte in hex
  • \u{NNNN} for a Unicode scalar value in hex

You can find more about this in the Zig Language Reference: Source Encoding.
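
A quick sketch of a few of these escapes in use. The snowman (U+2603) is just a code point I picked because it needs three bytes in UTF-8.

```zig
const std = @import("std");

pub fn main() void {
    const quoted = "say \"hi\"\n"; // escaped quotes plus a trailing newline
    const letter_a = "\x41"; // one byte, 0x41, which is ASCII 'A'
    const snowman = "\u{2603}"; // U+2603, stored as three UTF-8 bytes

    std.debug.print("{s}", .{quoted});
    std.debug.print("{s} ({d} byte)\n", .{ letter_a, letter_a.len });
    std.debug.print("{s} ({d} bytes)\n", .{ snowman, snowman.len });
}
```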

Slicing Strings

You can grab a piece of a string (or any slice) with the [start..end] syntax. The end is exclusive and optional. For example:

const string = "abcdef";
const sub_string = string[1..4]; // "bcd"

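This costs almost nothing: it just creates a new view into the original string without copying any data. Be aware that slicing out of bounds is a compile error when the indices are known at compile time and a runtime panic in safe build modes (Debug and ReleaseSafe) otherwise. In ReleaseFast and ReleaseSmall it is undefined behaviour, so always ensure your indices are valid.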
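
When the indices come from user input, one option is to check them before slicing. The helper below is hypothetical (sliceOrNull is my name, not a standard library function) and simply returns null for an invalid range:

```zig
const std = @import("std");

// Hypothetical helper: returns null instead of panicking when the range is invalid.
fn sliceOrNull(s: []const u8, start: usize, end: usize) ?[]const u8 {
    if (start > end or end > s.len) return null;
    return s[start..end];
}

pub fn main() void {
    const string: []const u8 = "abcdef";

    const tail = string[2..]; // omitting the end means "to the end": "cdef"
    std.debug.print("{s}\n", .{tail});

    if (sliceOrNull(string, 1, 4)) |part| std.debug.print("{s}\n", .{part}); // "bcd"
    if (sliceOrNull(string, 4, 10) == null) std.debug.print("out of range\n", .{});
}
```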

Comparing Strings

Zig doesn’t have built-in operators like == or < for strings. Instead, you’ll use functions from the standard library, mainly from the std.mem namespace.

You can use std.mem.eql to check if two strings are equal:

const std = @import("std");

pub fn main() !void {
    const stdout = std.io.getStdOut().writer();

    // Example 1: Basic string equality
    try stdout.print("\nExample 1: Basic string equality\n", .{});

    const str1 = "hello";
    const str2 = "hello";
    const str3 = "world";
    const str4 = "Hello";

    // Use std.mem.eql for string equality
    const equal1 = std.mem.eql(u8, str1, str2);
    const equal2 = std.mem.eql(u8, str1, str3);
    const equal3 = std.mem.eql(u8, str1, str4);

    try stdout.print("  \"{s}\" == \"{s}\": {}\n", .{ str1, str2, equal1 });
    try stdout.print("  \"{s}\" == \"{s}\": {}\n", .{ str1, str3, equal2 });
    try stdout.print("  \"{s}\" == \"{s}\": {}\n", .{ str1, str4, equal3 });
}

// Prints:
// Example 1: Basic string equality
//  "hello" == "hello": true
//  "hello" == "world": false
//  "hello" == "Hello": false

For case-insensitive comparison, std.ascii.eqlIgnoreCase is what you’re looking for:

const std = @import("std");

pub fn main() !void {
    const stdout = std.io.getStdOut().writer();

    // Example 2: Case-insensitive comparison
    try stdout.print("\nExample 2: Case-insensitive comparison\n", .{});

    const upper = "HELLO";
    const lower = "hello";

    // Case-sensitive comparison
    const case_sensitive_equal = std.mem.eql(u8, upper, lower);
    try stdout.print("  Case-sensitive: \"{s}\" == \"{s}\": {}\n", .{ upper, lower, case_sensitive_equal });

    // Case-insensitive comparison
    const case_insensitive_equal = std.ascii.eqlIgnoreCase(upper, lower);
    try stdout.print("  Case-insensitive: \"{s}\" == \"{s}\": {}\n", .{ upper, lower, case_insensitive_equal });
}

// Prints:
// Example 2: Case-insensitive comparison
//  Case-sensitive: "HELLO" == "hello": false
//  Case-insensitive: "HELLO" == "hello": true

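To sort strings or compare them lexicographically, you can use std.mem.order. It compares byte by byte, so uppercase ASCII letters sort before lowercase ones; this is not a locale-aware dictionary order: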

const std = @import("std");

pub fn main() !void {
    const stdout = std.io.getStdOut().writer();

    // Example 3: String ordering (lexicographical comparison)
    try stdout.print("\nExample 3: String ordering (lexicographical comparison)\n", .{});

    const a = "apple";
    const b = "banana";

    // Use std.mem.order for lexicographical ordering
    const ordering = std.mem.order(u8, a, b);

    try stdout.print("  \"{s}\" compared to \"{s}\": ", .{ a, b });
    switch (ordering) {
        .lt => try stdout.print("less than\n", .{}),
        .eq => try stdout.print("equal\n", .{}),
        .gt => try stdout.print("greater than\n", .{}),
    }
}
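
Building on std.mem.order, here is a rough sketch of sorting an array of strings with std.mem.sort. The comparison function name is mine. Because the comparison is byte-wise, "Banana" lands before "apple".

```zig
const std = @import("std");

// Comparison function for std.mem.sort; the name is illustrative.
fn byteOrderLessThan(_: void, lhs: []const u8, rhs: []const u8) bool {
    return std.mem.order(u8, lhs, rhs) == .lt;
}

pub fn main() void {
    var words = [_][]const u8{ "cherry", "apple", "Banana", "banana" };
    std.mem.sort([]const u8, &words, {}, byteOrderLessThan);

    for (words) |word| std.debug.print("{s}\n", .{word});
    // Byte order puts uppercase ASCII first: Banana, apple, banana, cherry
}
```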

By now, you’re probably noticing a pattern: many of the string utilities you’d expect are in the std.mem and std.ascii modules. Zig’s approach to strings is all about being explicit and efficient, giving you full control over memory and performance. Dive into std.mem and std.ascii to see what else is available.

Searching and Indexing

To check if a string contains a substring or a specific byte, you can use functions in std.mem. For instance, std.mem.indexOf can tell you if a substring exists:

const std = @import("std");

pub fn main() !void {
    const stdout = std.io.getStdOut().writer();

    const haystack = "The quick brown fox jumps over the lazy dog";
    const needle1 = "fox";
    const needle2 = "cat";

    const contains1 = std.mem.indexOf(u8, haystack, needle1) != null;
    const contains2 = std.mem.indexOf(u8, haystack, needle2) != null;

    try stdout.print("  \"{s}\" contains \"{s}\": {}\n", .{ haystack, needle1, contains1 });
    try stdout.print("  \"{s}\" contains \"{s}\": {}\n", .{ haystack, needle2, contains2 });
}

// Prints:
//   "The quick brown fox jumps over the lazy dog" contains "fox": true
//   "The quick brown fox jumps over the lazy dog" contains "cat": false

The indexOf function returns an optional index (?usize), so we check if it’s null. Another way to do this is with std.mem.containsAtLeast or containsAtLeastScalar:

const std = @import("std");

pub fn main() !void {
    const stdout = std.io.getStdOut().writer();

    const haystack = "The quick brown fox jumps over the lazy dog";
    const needle1 = "fox";
    const needle2 = "cat";

    // Using std.mem.containsAtLeast
    const contains1 = std.mem.containsAtLeast(u8, haystack, 1, needle1);
    const contains2 = std.mem.containsAtLeast(u8, haystack, 1, needle2);

    try stdout.print("  \"{s}\" contains \"{s}\": {}\n", .{ haystack, needle1, contains1 });
    try stdout.print("  \"{s}\" contains \"{s}\": {}\n", .{ haystack, needle2, contains2 });
}

The results are the same. containsAtLeast checks if the haystack contains at least a certain number of occurrences of the needle.

There are plenty of other useful functions for finding things in strings, like std.mem.indexOfScalar, std.mem.indexOfAny, std.mem.indexOfPos, std.mem.lastIndexOf, and std.mem.lastIndexOfScalar. Check out std.mem for all the options.
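
The returned index is often more useful than a yes/no answer. As a small sketch (the "name=Zig" input is made up), you can use it to cut a string into two views around a separator:

```zig
const std = @import("std");

pub fn main() void {
    const line: []const u8 = "name=Zig";

    // indexOfScalar returns ?usize, so `if` can unwrap the position directly.
    if (std.mem.indexOfScalar(u8, line, '=')) |pos| {
        const key = line[0..pos];
        const value = line[pos + 1 ..];
        std.debug.print("key: {s}, value: {s}\n", .{ key, value });
    } else {
        std.debug.print("no '=' found\n", .{});
    }
}
```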

Prefix and Suffix Checks

Checking if a string starts or ends with a certain substring is a common task. You can use std.mem.startsWith and std.mem.endsWith for this. Here’s an example:

const std = @import("std");

pub fn main() !void {
    const stdout = std.io.getStdOut().writer();

    const full_string = "hello world";
    const substring1 = "hello";
    const substring2 = "world";

    // Check if the string starts with a prefix
    const starts_with1 = std.mem.startsWith(u8, full_string, substring1);
    const starts_with2 = std.mem.startsWith(u8, full_string, substring2);

    try stdout.print("  \"{s}\" starts with \"{s}\": {}\n", .{ full_string, substring1, starts_with1 });
    try stdout.print("  \"{s}\" starts with \"{s}\": {}\n", .{ full_string, substring2, starts_with2 });

    // Check if the string ends with a suffix
    const ends_with1 = std.mem.endsWith(u8, full_string, substring1);
    const ends_with2 = std.mem.endsWith(u8, full_string, substring2);

    try stdout.print("  \"{s}\" ends with \"{s}\": {}\n", .{ full_string, substring1, ends_with1 });
    try stdout.print("  \"{s}\" ends with \"{s}\": {}\n", .{ full_string, substring2, ends_with2 });
}

// Prints:
//   "hello world" starts with "hello": true
//   "hello world" starts with "world": false
//   "hello world" ends with "hello": false
//   "hello world" ends with "world": true

Once you know to look in std.mem, these kinds of tasks become second nature.

Formatting and Interpolation

The std.fmt namespace provides powerful functions for formatting values into strings. You’ve already seen it in action behind std.debug.print and std.io.getStdOut().writer(). Here’s how you can do string interpolation:

const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.DebugAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    const name = "Zig";
    const count = 5;

    // Use allocPrint to format into a newly allocated string.
    const msg = try std.fmt.allocPrint(allocator, "Hello, {s}! Count: {d}", .{ name, count });
    // Don't forget to free the memory!
    defer allocator.free(msg);

    std.debug.print("{s}\n", .{msg}); // Prints "Hello, Zig! Count: 5"
}

This requires an allocator because the final string size isn’t known at compile time. It’s similar to template literals in JavaScript: Hello, ${name}! Count: ${count}. The format specifiers like {s} (string) and {d} (decimal) tell Zig how to format the values. According to fmt.format(), there are several other specifiers you can use:

  • x and X: hexadecimal
  • s: string
  • e: scientific notation
  • d: decimal
  • b: binary
  • o: octal
  • c: ASCII character
  • u: UTF-8 character
  • ?: optional value
  • !: error union
  • *: pointer address
  • any: default format

If you know the maximum size of the string you’re creating, you can use std.fmt.bufPrint to avoid dynamic memory allocation.

const std = @import("std");

pub fn main() !void {
    var buffer: [50]u8 = undefined;
    const count = 5;
    const msg = try std.fmt.bufPrint(&buffer, "Hello, {s}! Count: {d}", .{ "Zig", count });

    std.debug.print("{s}\n", .{msg}); // Prints "Hello, Zig! Count: 5"
}

This is a great pattern for creating formatted strings without hitting the allocator. The buffer just needs to be large enough to hold the result.
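
If the buffer turns out to be too small, bufPrint returns error.NoSpaceLeft rather than writing past the end. A minimal sketch of handling that locally; the fallback text is just a placeholder of mine:

```zig
const std = @import("std");

pub fn main() void {
    var small: [8]u8 = undefined;

    // "Count: 12345" needs 12 bytes, so it cannot fit into 8.
    const msg: []const u8 = std.fmt.bufPrint(&small, "Count: {d}", .{12345}) catch |err| switch (err) {
        // The only error bufPrint reports is a buffer that is too small.
        error.NoSpaceLeft => "(truncated)",
    };
    std.debug.print("{s}\n", .{msg}); // Prints "(truncated)"
}
```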

There are many other cool functions in std.fmt worth exploring. For example, fmtDuration can format a time in nanoseconds into a human-readable string. try stdout.print("Duration: {}\n", .{std.fmt.fmtDuration(90 * std.time.ns_per_s)}); will print Duration: 1m30s.

Concatenation

To join strings together, you can use std.mem.concat. It takes a slice of strings and copies their bytes into a new, single slice.

const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.DebugAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    const a = "foo";
    const b = "bar";
    const c = "baz";
    const parts = &[_][]const u8{ a, b, c };

    // Concatenate the slices in `parts` into a new string.
    const joined = try std.mem.concat(allocator, u8, parts);
    defer allocator.free(joined);

    std.debug.print("Joined: {s}\n", .{joined}); // Prints "Joined: foobarbaz"
}

There are also concatMaybeSentinel and concatWithSentinel if you need to add a sentinel byte at the end.

Performance Tip: Concatenating strings with concat allocates a new buffer every time. If you’re doing a lot of concatenations in a loop, it’s much more efficient to use an ArrayList to build the string incrementally. If you know the final size, you can pre-allocate with try list.ensureTotalCapacity(expected_size) to avoid reallocations.

String Builders and Buffers

For building strings efficiently, especially when you’re appending multiple times, std.ArrayList(u8) is an excellent choice. It helps minimize reallocations.

const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.DebugAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    // Initialize an ArrayList with an allocator.
    var list = std.ArrayList(u8).init(allocator);
    // `defer list.deinit()` makes sure the list's internal buffer is freed.
    defer list.deinit();

    // Append slices or individual bytes.
    try list.appendSlice("Hello, ");
    try list.appendSlice("world");
    try list.append('!');

    // Get the resulting slice. The ArrayList still owns the data.
    const result: []const u8 = list.items;

    std.debug.print("Built string: {s}\n", .{result}); // Prints "Built string: Hello, world!"
}

Check out std.ArrayList for more info.

Performance Note: When building strings with many appends, you can get a nice performance boost by pre-allocating the ArrayList’s capacity if you have a good estimate of the final size. Use try list.ensureTotalCapacity(estimated_size) before you start appending.
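
Putting these pieces together, the sketch below builds a string in a loop and then hands ownership to the caller with toOwnedSlice. buildCsv is a hypothetical helper, and the capacity estimate of four bytes per item is only a rough guess for the example.

```zig
const std = @import("std");

// Hypothetical helper: builds "0,1,2,..." and hands ownership of the bytes to the caller.
fn buildCsv(allocator: std.mem.Allocator, n: usize) ![]u8 {
    var list = std.ArrayList(u8).init(allocator);
    errdefer list.deinit();

    // Reserve space up front; 4 bytes per item is only a rough guess.
    try list.ensureTotalCapacity(n * 4);

    for (0..n) |i| {
        if (i != 0) try list.append(',');
        try list.writer().print("{d}", .{i});
    }
    // After this call the list is empty and the caller must free the returned slice.
    return list.toOwnedSlice();
}

pub fn main() !void {
    var gpa: std.heap.DebugAllocator(.{}) = .init;
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    const csv = try buildCsv(allocator, 5);
    defer allocator.free(csv);
    std.debug.print("{s}\n", .{csv}); // Prints "0,1,2,3,4"
}
```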

Splitting and Joining

To split a string by a single delimiter, use std.mem.splitScalar. It returns an iterator over the resulting slices, and best of all, it doesn’t require any memory allocation.

const std = @import("std");

pub fn main() void {
    const s = "a,b,c";
    // Split the string `s` by the comma character.
    var it = std.mem.splitScalar(u8, s, ',');
    while (it.next()) |part| {
        // `part` is just a slice (a view) of the original string.
        std.debug.print("Part: {s}\n", .{part});
    }
    // Output:
    // Part: a
    // Part: b
    // Part: c
}

To join a slice of strings with a separator, use std.mem.join. This does require an allocator since it creates a new string.

const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.DebugAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    const parts = &[_][]const u8{"a", "b", "c"};
    const separator = ",";

    // Join the slices with the separator.
    const joined = try std.mem.join(allocator, separator, parts);
    defer allocator.free(joined);

    std.debug.print("Joined: {s}\n", .{joined}); // Prints "Joined: a,b,c"
}

See std.mem.splitScalar and std.mem.join in the docs. For splitting by multiple characters or a sequence of characters, check out std.mem.splitAny and std.mem.splitSequence.
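
To show how these differ, here is a short sketch of both. I’ve also included std.mem.tokenizeAny, which isn’t covered above, because it illustrates a common surprise: the split functions yield empty parts between adjacent delimiters, whereas the tokenize functions skip them.

```zig
const std = @import("std");

pub fn main() void {
    // splitSequence: the delimiter is the whole sequence "::".
    var by_sequence = std.mem.splitSequence(u8, "a::b::c", "::");
    while (by_sequence.next()) |part| std.debug.print("[{s}] ", .{part});
    std.debug.print("\n", .{}); // [a] [b] [c]

    // splitAny: any single byte from the set ",;" acts as a delimiter.
    var by_any = std.mem.splitAny(u8, "a,b;;c", ",;");
    while (by_any.next()) |part| std.debug.print("[{s}] ", .{part});
    std.debug.print("\n", .{}); // [a] [b] [] [c]

    // tokenizeAny is similar, but it skips empty parts between adjacent delimiters.
    var tokens = std.mem.tokenizeAny(u8, "a,b;;c", ",;");
    while (tokens.next()) |part| std.debug.print("[{s}] ", .{part});
    std.debug.print("\n", .{}); // [a] [b] [c]
}
```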

Trimming

To remove leading or trailing characters (like whitespace), use std.mem.trim. It returns a new slice that is a view into the original string, so no allocation is needed.

const std = @import("std");

pub fn main() void {
    const s = "  hello  ";
    const whitespace = " \t\n\r"; // Characters to trim

    // Trim leading and trailing whitespace.
    const trimmed = std.mem.trim(u8, s, whitespace);
    std.debug.print("Trimmed: '{s}'\n", .{trimmed}); // Prints "Trimmed: 'hello'"

    // Trim only from the start.
    const trim_start = std.mem.trimLeft(u8, s, whitespace);
    std.debug.print("TrimStart: '{s}'\n", .{trim_start}); // Prints "TrimStart: 'hello  '"

    // Trim only from the end.
    const trim_end = std.mem.trimRight(u8, s, whitespace);
    std.debug.print("TrimEnd: '{s}'\n", .{trim_end}); // Prints "TrimEnd: '  hello'"
}

trim doesn’t allocate memory. If you need an owned, trimmed string, you’ll have to allocate memory and copy the result yourself.
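
One way to get an owned copy is to combine trim with dupe. trimOwned below is a hypothetical helper of my own, not something the standard library provides:

```zig
const std = @import("std");

// Hypothetical helper: returns a heap-allocated copy of `s` without surrounding whitespace.
fn trimOwned(allocator: std.mem.Allocator, s: []const u8) ![]u8 {
    const trimmed = std.mem.trim(u8, s, " \t\n\r");
    return allocator.dupe(u8, trimmed);
}

pub fn main() !void {
    var gpa: std.heap.DebugAllocator(.{}) = .init;
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    const owned = try trimOwned(allocator, "  hello  ");
    defer allocator.free(owned); // the caller owns the copy
    std.debug.print("'{s}'\n", .{owned}); // Prints "'hello'"
}
```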

See std.mem.trim for more.

Replacing Substrings

To replace all occurrences of a substring, use std.mem.replace. This function writes the result into a destination buffer you provide, so you’ll need to calculate the required size first.

const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.DebugAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    const s = "foo bar foo baz foo";
    const old = "foo";
    const new = "zig";

    // Count occurrences to calculate the new length.
    const count = std.mem.count(u8, s, old);
    const new_len = s.len - (count * old.len) + (count * new.len);

    // Allocate a buffer for the result.
    const output = try allocator.alloc(u8, new_len);
    defer allocator.free(output);

    // Replace all occurrences of `old` with `new`.
    _ = std.mem.replace(u8, s, old, new, output);

    std.debug.print("Original: {s}\n", .{s});
    std.debug.print("Replaced: {s}\n", .{output}); // Prints "Replaced: zig bar zig baz zig"
}

The function returns the number of replacements made, which can be useful if you need to know whether any substitutions occurred.

See std.mem.replace for details.
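
If you’d rather not do the size arithmetic by hand, std.mem also has replacementSize and replaceOwned. This is a minimal sketch of both; I’ve used a replacement of a different length ("zig!") to show that the size calculation is handled for you.

```zig
const std = @import("std");

pub fn main() !void {
    var gpa: std.heap.DebugAllocator(.{}) = .init;
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    const s = "foo bar foo baz foo";

    // replacementSize computes the output length that std.mem.replace would need.
    const needed = std.mem.replacementSize(u8, s, "foo", "zig!");
    std.debug.print("needed: {d} bytes\n", .{needed});

    // replaceOwned sizes, allocates, and fills the result in one call.
    const replaced = try std.mem.replaceOwned(u8, allocator, s, "foo", "zig!");
    defer allocator.free(replaced);
    std.debug.print("{s}\n", .{replaced});
}
```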

Case Conversion

To convert ASCII strings to upper or lower case, you can use functions from std.ascii. These usually require an allocator or a pre-allocated mutable buffer.

const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.DebugAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    const s = "Hello World";

    // Allocate a new string for the lowercased version.
    const lower = try std.ascii.allocLowerString(allocator, s);
    defer allocator.free(lower);

    // Allocate a new string for the uppercased version.
    const upper = try std.ascii.allocUpperString(allocator, s);
    defer allocator.free(upper);

    std.debug.print("Original: {s}\n", .{s});
    std.debug.print("Lower:    {s}\n", .{lower}); // Prints "Lower:    hello world"
    std.debug.print("Upper:    {s}\n", .{upper}); // Prints "Upper:    HELLO WORLD"

    // You can also modify a string in-place if you have a mutable buffer.
    var buffer: [32]u8 = undefined;
    const mutable_slice = try std.fmt.bufPrint(&buffer, "Mutable", .{});
    for (mutable_slice) |*c| c.* = std.ascii.toLower(c.*);
    std.debug.print("In-place lower: {s}\n", .{mutable_slice}); // Prints "In-place lower: mutable"
}

A quick heads-up: these functions only work correctly for ASCII characters. For full Unicode case conversion, you’d need a more advanced library.
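
Here is a small sketch of what that limitation looks like in practice, using std.ascii.upperString with a stack buffer (the buffer size is arbitrary):

```zig
const std = @import("std");

pub fn main() void {
    var buffer: [16]u8 = undefined;

    // upperString writes the converted bytes into `buffer` and returns the used part.
    const upper = std.ascii.upperString(&buffer, "café");

    // The two bytes of 'é' are not ASCII letters, so they are copied through unchanged.
    std.debug.print("{s}\n", .{upper}); // Prints "CAFé"
}
```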

See std.ascii.allocLowerString and std.ascii.allocUpperString.

Padding

To pad a string to a certain length, you can use the formatting options in std.fmt. This will require an allocator if you’re using allocPrint.

const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.DebugAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    const s = "42";

    // Right-align (left-pad with spaces) to a width of 5.
    const padded_right = try std.fmt.allocPrint(allocator, "{s:>5}", .{s});
    defer allocator.free(padded_right);
    std.debug.print("Padded Right: '{s}'\n", .{padded_right}); // Prints "Padded Right: '   42'"

    // Left-align (right-pad with spaces) to a width of 5.
    const padded_left = try std.fmt.allocPrint(allocator, "{s:<5}", .{s});
    defer allocator.free(padded_left);
    std.debug.print("Padded Left:  '{s}'\n", .{padded_left}); // Prints "Padded Left:  '42   '"

    // Pad with zeros instead of spaces.
    const padded_zeros = try std.fmt.allocPrint(allocator, "{s:0>5}", .{s});
    defer allocator.free(padded_zeros);
    std.debug.print("Padded Zeros: '{s}'\n", .{padded_zeros}); // Prints "Padded Zeros: '00042'"
}

You can also pad into a fixed-size buffer with std.fmt.bufPrint if you know the maximum size ahead of time.

For more on this, check out the format specifications section in the std.fmt documentation.
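
As a sketch of the bufPrint variant, the example below pads a string and a number in one format call. The fill characters and widths are arbitrary choices for illustration.

```zig
const std = @import("std");

pub fn main() !void {
    var buffer: [32]u8 = undefined;

    // Fill character, alignment, then width: '_' + '<' + 8 and '0' + '>' + 5.
    const row = try std.fmt.bufPrint(&buffer, "[{s:_<8}|{d:0>5}]", .{ "zig", 42 });
    std.debug.print("{s}\n", .{row}); // Prints "[zig_____|00042]"
}
```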

Type Conversion: String and Number

To parse a string into a number, you can use functions like std.fmt.parseInt or std.fmt.parseFloat. These will return an error if the parsing fails.

const std = @import("std");

pub fn main() !void {
    const s_int = "123";
    const s_float = "3.14";
    const s_invalid = "not a number";

    // Parse an integer (base 10)
    const n_int = try std.fmt.parseInt(i32, s_int, 10);
    std.debug.print("Parsed int: {d}\n", .{n_int}); // Prints "Parsed int: 123"

    // Parse a float
    const n_float = try std.fmt.parseFloat(f64, s_float);
    std.debug.print("Parsed float: {d}\n", .{n_float}); // Prints "Parsed float: 3.14"

    // Handle a parsing error
    _ = std.fmt.parseInt(i32, s_invalid, 10) catch |err| {
        std.debug.print("Failed to parse '{s}': {}\n", .{ s_invalid, err });
        return;
    };
}
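The last argument to parseInt is the base, and the result type decides what counts as overflow. The sketch below uses input strings I made up for illustration. It shows an explicit base, base 0 (which asks parseInt to detect the base from a 0x, 0o, or 0b prefix), and a value that is valid text but too large for the target type.

```zig
const std = @import("std");

pub fn main() !void {
    // Explicit base 16: the digits are read as hexadecimal.
    const hex = try std.fmt.parseInt(u32, "ff", 16);
    std.debug.print("hex: {d}\n", .{hex}); // Prints "hex: 255"

    // Base 0: the base is detected from the "0x" prefix.
    const auto = try std.fmt.parseInt(u32, "0x1A", 0);
    std.debug.print("auto: {d}\n", .{auto}); // Prints "auto: 26"

    // A leading minus sign is accepted for signed types.
    const negative = try std.fmt.parseInt(i8, "-128", 10);
    std.debug.print("negative: {d}\n", .{negative}); // Prints "negative: -128"

    // "256" is well-formed, but it does not fit in a u8.
    if (std.fmt.parseInt(u8, "256", 10)) |value| {
        std.debug.print("value: {d}\n", .{value});
    } else |err| {
        std.debug.print("error: {s}\n", .{@errorName(err)}); // Prints "error: Overflow"
    }
}
```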

To format a number as a string, you can use std.fmt.allocPrint (which needs an allocator) or std.fmt.bufPrint (which needs a buffer).

const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.DebugAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    const n: i32 = 42;
    const f: f32 = 1.618;

    // Format an integer to an allocated string
    const s_int = try std.fmt.allocPrint(allocator, "{}", .{n});
    defer allocator.free(s_int);
    std.debug.print("Int as string: {s}\n", .{s_int}); // Prints "Int as string: 42"

    // Format a float to an allocated string with 3 decimal places
    const s_float = try std.fmt.allocPrint(allocator, "{d:.3}", .{f});
    defer allocator.free(s_float);
    std.debug.print("Float as string: {s}\n", .{s_float}); // Prints "Float as string: 1.618"

    // Format into a fixed-size buffer
    var buffer: [10]u8 = undefined;
    const s_buf = try std.fmt.bufPrint(&buffer, "{}", .{n});
    std.debug.print("Int in buffer: {s}\n", .{s_buf}); // Prints "Int in buffer: 42"
}
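The format specifier also controls the base and padding of the output. Here is a short sketch with a value I chose for illustration (255), using the x, X, and b specifiers plus the fill and width syntax from the padding section. The buffer is reused, so each result is printed before the next bufPrint call overwrites it.

```zig
const std = @import("std");

pub fn main() !void {
    var buffer: [32]u8 = undefined;
    const n: u32 = 255;

    const hex = try std.fmt.bufPrint(&buffer, "{x}", .{n});
    std.debug.print("hex: {s}\n", .{hex}); // Prints "hex: ff"

    const upper_hex = try std.fmt.bufPrint(&buffer, "{X}", .{n});
    std.debug.print("HEX: {s}\n", .{upper_hex}); // Prints "HEX: FF"

    const binary = try std.fmt.bufPrint(&buffer, "{b}", .{n});
    std.debug.print("binary: {s}\n", .{binary}); // Prints "binary: 11111111"

    // Zero-pad a decimal number to six digits.
    const padded = try std.fmt.bufPrint(&buffer, "{d:0>6}", .{n});
    std.debug.print("padded: {s}\n", .{padded}); // Prints "padded: 000255"
}
```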

For more, see std.fmt.parseInt, std.fmt.parseFloat, std.fmt.allocPrint, and std.fmt.bufPrint.

Null-Terminated Strings and C Interop

One of Zig’s strengths is its seamless interoperability with C. Since C strings are null-terminated (they end with a \0 byte), Zig provides special types and functions to work with them.

Sentinel-Terminated Slices

In Zig, a null-terminated string is represented as [:0]const u8. The :0 indicates that the slice has a sentinel value of 0 at the end. This is different from a regular slice []const u8, which only has a pointer and length.

const std = @import("std");

pub fn main() void {
    // String literals are already null-terminated.
    // The type is *const [5:0]u8, which coerces to [:0]const u8.
    const hello: [:0]const u8 = "hello";

    std.debug.print("String: {s}\n", .{hello});
    std.debug.print("Length: {d}\n", .{hello.len}); // 5 (doesn't include the null terminator)

    // You can access the null terminator explicitly.
    std.debug.print("Null terminator: {d}\n", .{hello[hello.len]}); // 0
}
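If the relationship between these types feels abstract, it can help to ask the compiler what a literal actually is. This sketch (variable names are mine) prints the literal's type and then coerces the same literal to the three string-like types you will meet most often. None of the coercions copy any bytes.

```zig
const std = @import("std");

pub fn main() void {
    const literal = "hello";

    // The literal itself is a pointer to a sentinel-terminated array.
    std.debug.print("{s}\n", .{@typeName(@TypeOf(literal))}); // Prints "*const [5:0]u8"

    // It coerces to each of these without copying.
    const as_slice: []const u8 = literal; // pointer + length
    const as_sentinel_slice: [:0]const u8 = literal; // pointer + length, 0 guaranteed after the end
    const as_many_pointer: [*:0]const u8 = literal; // pointer only, length found by scanning for 0

    std.debug.print("{d} {d}\n", .{ as_slice.len, as_sentinel_slice.len }); // Prints "5 5"
    std.debug.print("{c}\n", .{as_many_pointer[0]}); // Prints "h"
}
```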

Converting Between Slice Types

You can convert a sentinel-terminated slice to a regular slice implicitly, since a [:0]const u8 has all the information a []const u8 needs:

const null_terminated: [:0]const u8 = "hello";
const regular_slice: []const u8 = null_terminated; // This works

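Going the other way requires more care, since a regular slice might not be null-terminated. If you have a buffer that contains a terminator somewhere inside it, std.mem.sliceTo scans for it and returns the bytes that come before it. Given a plain buffer like the one here, the result is still a regular slice ([]u8), not a [:0]u8: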

const std = @import("std");

pub fn main() void {
    var buffer: [10]u8 = undefined;
    buffer[0] = 'h';
    buffer[1] = 'i';
    buffer[2] = 0; // null terminator

    // Get a slice up to (but not including) the null terminator.
    const slice = std.mem.sliceTo(&buffer, 0);
    std.debug.print("Slice: {s}, Length: {d}\n", .{ slice, slice.len }); // "hi", 2
}
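When you need an actual [:0]u8 from a buffer you control, one option is the sentinel slicing syntax, buf[start..end :0]. The sketch below wraps it in a helper; terminateAt is a hypothetical name of mine, not a standard library function. The helper writes the terminator itself, so the sentinel claim in the slice expression is always true.

```zig
const std = @import("std");

/// Hypothetical helper: writes a 0 at buf[len] and returns buf[0..len]
/// as a sentinel-terminated slice. Asserts that len < buf.len.
fn terminateAt(buf: []u8, len: usize) [:0]u8 {
    std.debug.assert(len < buf.len);
    buf[len] = 0;
    // The `:0` tells Zig that the byte at index `len` is 0.
    // Safety-checked build modes verify this at runtime.
    return buf[0..len :0];
}

pub fn main() void {
    var buffer: [10]u8 = undefined;
    @memcpy(buffer[0..2], "hi");

    const z = terminateAt(&buffer, 2);
    std.debug.print("{s} {d} {d}\n", .{ z, z.len, z[z.len] }); // Prints "hi 2 0"
}
```

The returned slice can be handed to C via z.ptr, just like a string literal.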

Working with C Functions

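When calling C functions that expect char* or const char*, you’ll need to pass a pointer to the first element of a null-terminated slice. Zig makes this straightforward. Because this example calls into the C standard library, it needs libc linked, for example by passing -lc to zig run: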

const std = @import("std");
const c = @cImport({
    @cInclude("stdio.h");
});

pub fn main() void {
    const message: [:0]const u8 = "Hello from Zig!";

    // Pass to C's puts() function.
    // The .ptr gives us the raw pointer that C expects.
    _ = c.puts(message.ptr);
}

Converting C Strings to Zig Slices

When receiving a string from C code, you’ll typically get a [*:0]const u8 (a many-pointer with a null sentinel) or [*c]const u8 (a C pointer). Use std.mem.span to convert it to a Zig slice:

const std = @import("std");
const c = @cImport({
    @cInclude("stdlib.h");
});

pub fn main() void {
    // Simulate receiving a C string (e.g., from getenv).
    const c_string: [*:0]const u8 = "PATH=/usr/bin";

    // Convert to a Zig slice.
    const zig_slice: [:0]const u8 = std.mem.span(c_string);

    std.debug.print("C string as Zig slice: {s}\n", .{zig_slice});
    std.debug.print("Length: {d}\n", .{zig_slice.len});
}
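One thing the simulation above glosses over is that many C functions, getenv included, return NULL when they have nothing to give you. Here is a minimal sketch of handling that, assuming the pointer has been typed as an optional ?[*:0]const u8. The helper name spanOr and the fallback text are hypothetical.

```zig
const std = @import("std");

/// Hypothetical helper: turns a possibly-null C string into a Zig slice,
/// substituting `default` when the pointer is null.
fn spanOr(ptr: ?[*:0]const u8, default: []const u8) []const u8 {
    const non_null = ptr orelse return default;
    // std.mem.span walks the bytes until it finds the 0 terminator.
    return std.mem.span(non_null);
}

pub fn main() void {
    const present: ?[*:0]const u8 = "/usr/bin";
    const missing: ?[*:0]const u8 = null;

    std.debug.print("{s}\n", .{spanOr(present, "(unset)")}); // Prints "/usr/bin"
    std.debug.print("{s}\n", .{spanOr(missing, "(unset)")}); // Prints "(unset)"
}
```

Checking for null before calling std.mem.span matters because span has to read through the pointer to find the terminator.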

Creating Null-Terminated Strings Dynamically

If you need to create a null-terminated string at runtime (for example, to pass to a C function), you can use allocator.dupeZ:

const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.DebugAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    const original: []const u8 = "dynamic string";

    // Create a null-terminated copy.
    const null_terminated = try allocator.dupeZ(u8, original);
    defer allocator.free(null_terminated);

    std.debug.print("Null-terminated: {s}\n", .{null_terminated});
    std.debug.print("Last byte is null: {}\n", .{null_terminated[null_terminated.len] == 0}); // true
}
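If you are building the string with formatting anyway, you may not need a separate copy step. As a sketch (the path and user name are made-up values), std.fmt.bufPrintZ in the version this guide targets formats into a buffer and appends the terminator for you, returning a [:0]u8. The buffer needs one byte more than the visible text.

```zig
const std = @import("std");

pub fn main() !void {
    var buffer: [64]u8 = undefined;
    const user = "zig";

    // bufPrintZ formats like bufPrint but also writes a trailing 0.
    const path = try std.fmt.bufPrintZ(&buffer, "/home/{s}/.config", .{user});

    std.debug.print("{s} ({d} bytes)\n", .{ path, path.len }); // Prints "/home/zig/.config (17 bytes)"
    std.debug.print("terminated: {}\n", .{path[path.len] == 0}); // Prints "terminated: true"
}
```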

For more details on C interoperability, see the Zig documentation on C Pointers and Sentinel-Terminated Pointers.

Error Handling in String Operations

Many of Zig’s string functions return error unions or optionals. Some common errors you’ll encounter are:

  • error.OutOfMemory: The allocator couldn’t get enough memory.
  • error.InvalidCharacter: You tried to parse a string with invalid characters.
  • error.Overflow: You parsed a number that was too big for the type you specified.
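  • error.NoSpaceLeft: The formatted output didn’t fit in the fixed-size buffer you passed to std.fmt.bufPrint.
  • error.StreamTooLong: A reader hit its size limit before finding the delimiter, for example a line longer than the buffer you read into.

You have to handle errors explicitly with try or catch, and unwrap optionals with if or orelse.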

const std = @import("std");

pub fn main() !void {
    const s = "not a number";

    // Option 1: Use `try` to propagate the error up the call stack.
    // This would return an error from the main function.
    // const n = try std.fmt.parseInt(i32, s, 10);

    // Option 2: Use `catch` to handle the error right here.
    // This logs the failure and falls back to a default value.
    const n = std.fmt.parseInt(i32, s, 10) catch |err| blk: {
        std.debug.print("Parse failed: {s}\n", .{@errorName(err)});
        break :blk 0;
    };
    std.debug.print("Value: {d}\n", .{n}); // Prints "Value: 0"

    // For functions that return optionals, use `if` or `orelse`.
    const index = std.mem.indexOf(u8, "hello world", "xyz");
    if (index) |i| {
        std.debug.print("Found at {d}\n", .{i});
    } else {
        std.debug.print("Not found\n", .{});
    }

    // Or, you can use `orelse`.
    const pos = std.mem.indexOf(u8, "hello world", "world") orelse {
        std.debug.print("Substring not found\n", .{});
        return;
    };
    std.debug.print("Found at position: {d}\n", .{pos});
}
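When different errors deserve different treatment, a switch inside the catch lets you handle each one by name. The following is only a sketch: parsePortOr is a hypothetical helper and the port scenario is my own illustration. It relies on std.fmt.parseInt returning just InvalidCharacter and Overflow, so the switch can be exhaustive without an else branch.

```zig
const std = @import("std");

/// Hypothetical helper: parse a TCP port, falling back to `default`
/// when the text is not a valid u16.
fn parsePortOr(text: []const u8, default: u16) u16 {
    return std.fmt.parseInt(u16, text, 10) catch |err| switch (err) {
        // The switch must cover every error parseInt can return.
        error.InvalidCharacter => default, // e.g. "http"
        error.Overflow => default, // e.g. "70000" does not fit in a u16
    };
}

pub fn main() void {
    std.debug.print("{d}\n", .{parsePortOr("8080", 80)}); // Prints "8080"
    std.debug.print("{d}\n", .{parsePortOr("http", 80)}); // Prints "80"
    std.debug.print("{d}\n", .{parsePortOr("70000", 80)}); // Prints "80"
}
```

Both branches return the same default here, but keeping them separate makes it easy to log or react differently to each case later.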

Error handling is central to Zig’s design: the compiler won’t let you silently discard an error union, so string manipulation code has to decide what happens on each failure path.

See the Error Handling section in the language documentation for more.

Wrap Up

And that’s a wrap on our tour of strings in Zig! We’ve covered a lot, from the basics of what a []const u8 slice is, to the details of memory management with allocators, and a whole bunch of practical operations like formatting, splitting, and joining.

My goal was to create the kind of guide I wish I had when I was starting out—something to help fellow developers get comfortable with how Zig handles strings. I’m still learning too, so if you see anything that could be improved, have a suggestion, or just want to ask a question, please drop a comment below. Your feedback is incredibly valuable and helps make this guide better for everyone.

If you enjoyed this deep dive and want more content about Zig, be sure to subscribe to my newsletter for future guides and updates. Thanks for reading, and happy coding!
