Symbol vs. String in Ruby
For the longest time, I've postponed digging deeper to find the differences.
I would tell myself I'd eventually find the time to look into it, but there were always other things that felt more important than figuring out this seemingly minute detail.
But as I kept pounding the keyboard, I'd eventually arrive at a crossroads.
Should I use a string here? Or a symbol?
I can't tell you how many times I've just meh-ed over that thought.
But that feeling of uncertainty ends now. If you're curious, as I was, about which one to choose, or why there are both available, or which one is better, then keep reading as I will explain everything you need to know.
Why does Ruby have both symbols and strings?
Even though symbols and strings are used interchangeably in day-to-day code, mostly because developers don't know the difference, they should not be.
Symbols were made to be used as identifiers (i.e., variables, method names, constant names). In contrast, strings were made to be used as data. And they were both optimized for their respective goals.
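You can see this split in how Ruby itself is written: anywhere you name something, you reach for a symbol, while the contents you pass around live in strings. Here's a small, made-up example:
class User
  attr_accessor :name          # :name identifies the methods to define
  def greeting
    "Hello, #{name}!"          # the text itself is data, so it's a string
  end
end
user = User.new
user.name = "Alice"
user.respond_to?(:greeting)    # => true (methods are looked up by symbol)
user.greeting                  # => "Hello, Alice!"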
If you think about it, most of the data that comes into your application, via an HTML form, a JSON response, or what have you, will be strings.
So, the fact that strings were built to be used as data makes a lot of sense.
You can't get a symbol from the outside world. You can only get a symbol if you convert a string to a symbol.
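For example (the params hash below is just a stand-in for data parsed from a request), the conversion has to be explicit:
params = { "name" => "Alice" }       # data from the outside arrives as strings
"name".to_sym                        # => :name  (string to symbol)
:name.to_s                           # => "name" (and back again)
params.transform_keys(&:to_sym)      # => {:name=>"Alice"}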
What is the difference between a symbol and a string?
A string, in Ruby, is a mutable series of characters or bytes.
Symbols, on the other hand, are immutable values, just like the integer 2 is a value.
Mutability is the ability of an object to change: in the case of a string, you can add characters to it or remove characters from it. Immutable, conversely, means that once the object is created, it can never be changed.
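A quick irb session makes the difference concrete (assuming no frozen_string_literal magic comment, which we'll get to below):
s = "foo"
s << "bar"             # strings are mutable, so this appends in place
s                      # => "foobar"
"foo".frozen?          # => false (a plain string literal can still change)
:foo.frozen?           # => true  (symbols are always frozen)
:foo.respond_to?(:<<)  # => false (there is no way to modify a symbol)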
Because symbols are immutable, Ruby doesn't have to allocate more memory for the same symbol. It knows that once it puts the value in memory, it will never be changed, so it can reuse it.
You can easily see this by looking at their object IDs.
# Symbols (same id)
:foo_bar.object_id # => 2386588
:foo_bar.object_id # => 2386588
# Strings (different ids)
"foo_bar".object_id # => 1020
"foo_bar".object_id # => 1040
Which one is faster?
Obviously, since symbols are immutable values, working with them requires less memory. And comparing two symbols boils down to comparing their object IDs, whereas comparing two strings means comparing their contents character by character, so lookups with symbol keys are faster as well.
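If you want to see the symbol-versus-string difference for yourself, here is a minimal benchmark sketch in the same style as the ones further down (it uses the benchmark-ips gem; the exact numbers will depend on your machine and Ruby version):
require 'benchmark/ips'
STRING_HASH = { "foo" => "bar" }
SYMBOL_HASH = { foo: "bar" }
Benchmark.ips do |x|
  x.report("string key") { STRING_HASH["foo"] }
  x.report("symbol key") { SYMBOL_HASH[:foo] }
  x.compare! # prints which report was faster and by how much
end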
So… the answer is symbols. Right?
Well, there's one more thing to take into consideration.
And that is…
Frozen strings
Starting with Ruby 2.1, frozen string literals are deduplicated. When you call .freeze on a string literal, Ruby doesn't need to allocate new memory for a second, identical literal; it just reuses the first one.
"foo_bar".freeze.object_id # => 1060
"foo_bar".freeze.object_id # => 1060
"foo_bar".object_id # => 1100
"foo_bar".object_id # => 1120
But starting with Ruby 2.3, you also have the option of putting a magic comment at the top of your file, which will freeze all the string literals in that file.
# frozen_string_literal: true
puts "foo".object_id
puts "foo".object_id
If you put the code above in a file and execute it with ruby ./yourfile.rb, you will see the same number printed twice. That means both literals resolve to the same object ID (technically, it's just one string object being reused) even though you're not calling .freeze on them.
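You can also confirm that the magic comment is doing the freezing for you by asking the literal directly:
# frozen_string_literal: true
puts "foo".frozen? # prints "true" without an explicit call to .freeze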
So, that obviously helps with memory allocations, but what about lookup speed? Is there a difference between the two when it comes to identifier lookup?
Let's look at a few benchmarks (originally posted in this Gist, but copied here for convenience) and re-run them with Ruby 3.0.
Accessing a hash
require 'benchmark/ips'
FOO = "foo".freeze
HASH = {"foo" => "bar"}
Benchmark.ips do |x|
  x.report("constant") { HASH[FOO] }
  x.report("regular") { HASH["foo"] }
end
constant 15.669M (± 3.5%) i/s - 79.519M in 5.082575s
regular 15.882M (± 1.9%) i/s - 80.402M in 5.064430s
Making a hash
require 'benchmark/ips'
FOO = "foo".freeze
Benchmark.ips do |x|
  x.report("constant") { {FOO => "bar"} }
  x.report("regular") { {"foo" => "bar"} }
end
constant 8.123M (± 1.2%) i/s - 41.199M in 5.072646s
regular 8.189M (± 1.9%) i/s - 41.190M in 5.031931s
Comparisons
require 'benchmark/ips'
FOO = "foo".freeze
Benchmark.ips do |x|
  x.report("constant") { "foo" == FOO }
  x.report("regular") { "foo" == "foo" }
end
constant 12.790M (± 1.4%) i/s - 65.094M in 5.090450s
regular 9.835M (± 1.0%) i/s - 50.155M in 5.100092s
Passing in an argument to a regular method
require 'benchmark/ips'
FOO = "foo".freeze
Benchmark.ips do |x|
  x.report("constant") { "foo".gsub(FOO, "") }
  x.report("regular") { "foo".gsub("foo", "") }
end
constant 1.472M (± 1.3%) i/s - 7.478M in 5.079855s
regular 1.416M (± 1.8%) i/s - 7.203M in 5.086869s
As you can see from these benchmarks, the differences are negligible: the frozen constant only pulls ahead noticeably in the plain comparison case, and even there the gap is modest.
When to use symbols and when to use strings?
It's best to use them as they were intended whenever possible (i.e., use symbols when you want to create an identifier, and use strings for data).
Or to quote the late Jim Weirich…
“If the contents (the sequence of characters) of the object are important, use a string. If the identity of the object is important, use a symbol.” — Jim Weirich (Ruby Cookbook)
Why use symbols as hash keys in Ruby?
Historically, there was a performance benefit of using symbols over strings, but in recent versions of Ruby, that benefit is no longer significant.
But even today, with all the performance improvements to strings, most Ruby developers still use symbols over strings for hash keys.
They read better, and it means using symbols for what they were meant to be: identifiers. A hash key identifies a value; it isn't data in itself.
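As a small illustration, symbol keys also come with their own shorthand hash syntax, which is a big part of why they read better:
# String keys: fine for external data, but noisier to write
user = { "name" => "Alice", "admin" => true }
user["name"]  # => "Alice"
# Symbol keys: the shorthand reads like labeled fields
user = { name: "Alice", admin: true }
user[:name]   # => "Alice"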
Conclusion
I hope you learned a thing or two about the difference between symbols and strings and that you are better equipped to answer the question the next time it pops into your head.