The string Type in Solidity: UTF-8 Text on Chain

By Mudassir Khan — Agentic AI Consultant & AI Systems Architect, Islamabad, Pakistan

Section 01 · Definition

What is the string type in Solidity?

string is the type for variable-length UTF-8 text. It looks familiar to anyone coming from JavaScript or Python, but it is much more limited.

Quick answer

What is string? string is Solidity's dynamic UTF-8 text type. Under the hood it has the same layout as the bytes type, with the convention that the bytes form a valid UTF-8 sequence; that convention is not checked at runtime. Solidity does not let you index a string or read .length on it directly, and it has no character-level operations at all.

solidity
string public name = "MudassirCoin";
string public symbol = "MUDK";
string public baseURI = "ipfs://QmXyz.../";

function setName(string calldata newName) external {
    name = newName;
}

A string declared at contract level lives in storage. A string parameter of an external function can be declared calldata, which is usually the cheapest option because the data is read in place rather than copied. A string built or modified inside a function lives in memory. Choosing the right location is one of the first gas optimisations a beginner learns. The full Solidity guide covers data locations end-to-end.
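As a minimal sketch of the three locations side by side, consider the contract below. The contract name Greeter and its functions are hypothetical and exist only for illustration.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.12;

// Hypothetical contract, for illustration only.
contract Greeter {
    // Storage: persists between transactions and is the most expensive to write.
    string private greeting = "Hello";

    // Calldata: a read-only view of the transaction input, used without copying.
    function setGreeting(string calldata newGreeting) external {
        greeting = newGreeting; // the assignment copies calldata into storage
    }

    // Memory: a temporary value that exists only for the duration of this call.
    function shout() external view returns (string memory) {
        string memory loud = string.concat(greeting, "!");
        return loud;
    }
}
```

Declaring newGreeting as memory instead would also compile, but the compiler would then copy the argument into memory before the storage write.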

Section 02 · Storage cost

Why on-chain strings are expensive

Every additional 32 bytes of string payload is a fresh storage write. Long strings can dominate a contract's gas profile.

Figure: a Solidity string laid out in storage with a length slot and chained data slots. A long string spreads across multiple storage slots, each one paid for at deploy or write time.

A string of up to 31 bytes fits alongside its length in a single slot. Anything longer keeps only the length in that slot, and the data is laid out at a separate location starting at keccak256(slot), in consecutive 32-byte slots. A 200-byte description (200 ASCII characters) therefore occupies seven data slots plus the length slot, eight storage writes in total. At roughly 20 000 gas per fresh slot, before any cold-access surcharge, that is about 160 000 gas just for the text.
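The slot arithmetic can be sketched as a small helper. The library name StringStorageMath and both function names are hypothetical, and the flat 20 000 gas per fresh slot is a simplifying assumption that ignores cold-access surcharges and refunds. The count includes the slot that holds the length.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.12;

// Hypothetical helper, for estimating storage use only.
library StringStorageMath {
    // Number of storage slots a string of `byteLength` bytes occupies.
    function slotsUsed(uint256 byteLength) internal pure returns (uint256) {
        // Short strings share one slot with their length.
        if (byteLength <= 31) {
            return 1;
        }
        // Long strings: one slot for the length, plus ceil(byteLength / 32) data slots.
        uint256 dataSlots = (byteLength + 31) / 32;
        return 1 + dataSlots;
    }

    // Rough cost of writing the string into fresh (previously zero) slots.
    // Assumption: a flat 20 000 gas per slot.
    function freshWriteGas(uint256 byteLength) internal pure returns (uint256) {
        return slotsUsed(byteLength) * 20_000;
    }
}
```

For a 200-byte string, slotsUsed returns 8 (one length slot and seven data slots) and freshWriteGas returns 160 000.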

Rule of thumb for text on chain

If the text is for humans (a description, a long name, marketing copy) — store it off chain on IPFS or Arweave and put only the content hash on chain. If the text is for machines (a 4-byte selector, a 32-byte commitment) — use bytes4 or bytes32 directly. Reach for the string type only when the field is short, named, and shown to wallets directly (a token name or symbol).
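One way to apply the first rule is to keep a single 32-byte hash on chain and leave the text itself off chain. The sketch below is hypothetical: the contract name, the event, and the choice of keccak256 over the raw text bytes are all assumptions, and a real contract would restrict who may call the setter.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.12;

// Hypothetical contract; access control is omitted for brevity.
contract DescriptionAnchor {
    // One 32-byte slot, however long the off-chain text is.
    bytes32 public descriptionHash;

    event DescriptionHashUpdated(bytes32 newHash);

    function setDescriptionHash(bytes32 newHash) external {
        descriptionHash = newHash;
        emit DescriptionHashUpdated(newHash);
    }

    // Assumption: the stored value is keccak256 of the raw text bytes.
    // Lets anyone check that an off-chain document matches the anchor.
    function matches(string calldata text) external view returns (bool) {
        return keccak256(bytes(text)) == descriptionHash;
    }
}
```

Updating the description then costs one storage write regardless of the length of the text.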

Section 03 · What you can and cannot do

The thin Solidity string API

Solidity gives you almost no built-in string functions. Apart from string.concat, the few operations available go through the underlying bytes representation.

solidity
string memory greeting = "Hello, smart contracts!";

// Length — count of bytes (NOT characters when there are non-ASCII chars)
uint256 byteCount = bytes(greeting).length;     // 23

// Equality — compare hashes, never == or !=
bool same = keccak256(bytes(greeting)) == keccak256(bytes("Hello, smart contracts!"));

// Concatenation — string.concat in 0.8.12+
string memory full = string.concat("Hi ", "there!");

// Byte-by-byte access: index the underlying bytes (one byte is not always one character)
bytes1 firstByte = bytes(greeting)[0];          // 0x48 ('H')

Notice how every operation except concatenation goes through bytes(...). That conversion is the only way to inspect a string's contents. Higher-level operations such as substring extraction, case conversion, or regex matching do not exist. Libraries like OpenZeppelin's Strings fill some of the gaps but stay deliberately small because every byte processed costs gas.
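To show what filling one of those gaps looks like, here is a minimal sketch of a byte-level substring helper. The library name, function name, and error are hypothetical. The helper works on bytes, not characters, so cutting through the middle of a multi-byte character produces invalid UTF-8.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.12;

// Hypothetical error, raised when the requested range is invalid.
error BadRange(uint256 startByte, uint256 endByte, uint256 length);

// Hypothetical helper library, for illustration only.
library StringSlice {
    // Returns the bytes in [startByte, endByte) as a new string.
    function substring(
        string memory s,
        uint256 startByte,
        uint256 endByte
    ) internal pure returns (string memory) {
        bytes memory src = bytes(s);
        if (startByte > endByte || endByte > src.length) {
            revert BadRange(startByte, endByte, src.length);
        }

        // Copy byte by byte; gas cost grows linearly with the slice length.
        bytes memory out = new bytes(endByte - startByte);
        for (uint256 i = startByte; i < endByte; i++) {
            out[i - startByte] = src[i];
        }
        return string(out);
    }
}
```

Calling substring("Hello, smart contracts!", 0, 5) returns "Hello".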

Section 04 · Real-world uses

Where string shows up in production code

Three patterns that justify using a string on chain — and one that does not.

solidity
// 1. ERC20 metadata
string public name;        // shown by wallets like MetaMask
string public symbol;      // 3-5 character ticker

// 2. ERC721 base URI (Strings is OpenZeppelin's utility library, which must be imported)
string public baseTokenURI;

function tokenURI(uint256 id) external view returns (string memory) {
    return string.concat(baseTokenURI, Strings.toString(id), ".json");
}

// 3. Revert reason
function withdraw(uint256 amount) external {
    require(balances[msg.sender] >= amount, "insufficient balance");
    // ...
}

// 4. ANTI-PATTERN — long descriptive text on chain
string public description = "This collection celebrates ..."; // 4 KB+ — DON'T

The first three are short, well-defined, and read by tooling. The fourth is the trap: long marketing copy belongs in an IPFS document referenced by tokenURI, not in contract storage.

Section 06 · FAQ

Frequently asked questions

Why can I not write s.length on a string in Solidity?

Because string is UTF-8 and one character can take more than one byte. Solidity refuses to give a misleading number. To get the byte length, write bytes(s).length. To get a true character count you would need to walk the bytes yourself, which is rare in production code.
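Walking the bytes can be sketched as follows. The library and function names are hypothetical, and the code assumes the input is valid UTF-8. It counts Unicode code points, which is not always what a reader perceives as one character (combining marks and some emoji span several code points).

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.12;

// Hypothetical helper library, for illustration only.
library Utf8 {
    // Counts code points, assuming `s` holds valid UTF-8.
    function codePointCount(string memory s) internal pure returns (uint256 count) {
        bytes memory b = bytes(s);
        for (uint256 i = 0; i < b.length; i++) {
            // Continuation bytes have the bit pattern 10xxxxxx.
            // Every other byte starts a new code point.
            if ((uint8(b[i]) & 0xC0) != 0x80) {
                count++;
            }
        }
    }
}
```

For the literal unicode"héllo", bytes(s).length is 6 while codePointCount returns 5, because é takes two bytes in UTF-8.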

How do I compare two strings for equality?

Hash both with keccak256(bytes(a)) == keccak256(bytes(b)). Solidity does not allow the == operator on strings directly. The hash comparison is exact and works for any length.
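A small helper keeps the comparison readable. This is a sketch with hypothetical names; the length check is an optional shortcut that skips hashing when the byte lengths already differ.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.12;

// Hypothetical helper library, for illustration only.
library StringCompare {
    function equal(string memory a, string memory b) internal pure returns (bool) {
        bytes memory x = bytes(a);
        bytes memory y = bytes(b);
        // Different byte lengths can never be equal, so hashing is skipped.
        return x.length == y.length && keccak256(x) == keccak256(y);
    }
}
```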

What is the difference between string and bytes?

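Internally they are the same dynamic byte array. The difference is intent and the operations exposed: string is meant to hold valid UTF-8 text and offers no .length or index access, while bytes is raw binary data and offers both. Converting between them with bytes(s) and string(b) is explicit. Hashing for equality works the same on both, and concatenation has a counterpart for each (string.concat and bytes.concat).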

Are revert reason strings expensive?

Yes, at deploy time and at revert time. Each revert string is embedded in the contract bytecode, and encoding the message at runtime costs additional gas. Solidity 0.8.4 introduced custom errors, which are identified by a 4-byte selector (plus any ABI-encoded arguments) and are cheaper. Use them for any contract that gets meaningful traffic.
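As a sketch of the custom-error style, the withdraw check could be written as below. The contract name Vault, the error name, and its parameters are hypothetical, and the actual transfer of funds is left out.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.12;

// Hypothetical custom error: a 4-byte selector plus two ABI-encoded arguments.
error InsufficientBalance(uint256 available, uint256 requested);

// Hypothetical contract, for illustration only.
contract Vault {
    mapping(address => uint256) public balances;

    function deposit() external payable {
        balances[msg.sender] += msg.value;
    }

    function withdraw(uint256 amount) external {
        uint256 available = balances[msg.sender];
        // Replaces require(..., "insufficient balance") with a typed error.
        if (available < amount) {
            revert InsufficientBalance(available, amount);
        }
        balances[msg.sender] = available - amount;
        // Sending the funds to msg.sender is omitted from this sketch.
    }
}
```

The error arguments also give off-chain tooling structured data to display, which a fixed message string cannot carry.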

Should I store NFT metadata as a string on chain?

Almost never. Store the JSON metadata off chain (IPFS or HTTPS) and put the URL or content hash on chain. The standard way is the ERC721 tokenURI function, which returns a string pointing to the off-chain document. Only fully on-chain SVG NFTs store rendered metadata directly, and they pay heavily in gas.

Written by Mudassir Khan

Agentic AI consultant and AI systems architect based in Islamabad, Pakistan. CEO of Cube A Cloud. 38+ agentic AI launches delivered for global founders and CTOs.
