Rails to Phoenix: getting started with Ecto

When using Elixir, it's clear that a lot of attention has been paid to productivity - given their age, tools like mix and iex are incredibly mature. However, tools are only a part of the story; a big piece of the productivity of a language is the amount and quality of the libraries that are available for it.

Like the majority of Rails developers, I use ActiveRecord to work with databases. Although I have my fair share of gripes, it's still invaluable in building web applications. In Elixir, Ecto is the tool you'd most likely use to work with a database. Let's dive in and write the backend for a blog to see how it works.

Getting set up

First, let's create a new project. We'll pass the supervisor flag to mix - we'll see why later:

mix new ecto_blog --sup

To use Ecto, we'll have to add it as a dependency, along with an adapter for our specific database. In your newly created project, edit mix.exs to add the dependencies:

defp deps do
  [{:postgrex, ">= 0.0.0"},
   {:ecto, "~> 1.0"}]
end

Run mix deps.get followed by mix ecto.gen.repo. Ecto generates a new file, repo.ex, and reminds you to add the repo to the supervision tree. Before you forget, go ahead and modify lib/ecto_blog.ex to add the worker.
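
For reference, here is roughly what lib/ecto_blog.ex looks like once the repo is added. This is a sketch that assumes the file layout mix new --sup produced at the time of writing; your generated file may differ slightly:

```elixir
# lib/ecto_blog.ex (assumed layout from `mix new ecto_blog --sup`)
defmodule EctoBlog do
  use Application

  def start(_type, _args) do
    import Supervisor.Spec, warn: false

    children = [
      # Start the repository, and with it the connection pool,
      # under our application's supervisor.
      worker(EctoBlog.Repo, [])
    ]

    opts = [strategy: :one_for_one, name: EctoBlog.Supervisor]
    Supervisor.start_link(children, opts)
  end
end
```

If the repo fails to start, it may be worth checking that :postgrex and :ecto are listed under applications in mix.exs, since Elixir releases of this era did not start dependencies automatically.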

The generator also adds database configuration to config.exs, but you will need to modify the username and password to access your local database. After you do so, you can run mix ecto.create to create the database.
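
The generated configuration should look something like the sketch below. The database name and credentials here are placeholders I've assumed for illustration; use whatever matches your local Postgres setup:

```elixir
# config/config.exs
use Mix.Config

config :ecto_blog, EctoBlog.Repo,
  adapter: Ecto.Adapters.Postgres,
  database: "ecto_blog_repo",  # assumed name, yours may differ
  username: "postgres",        # placeholder, replace with your local user
  password: "postgres",        # placeholder, replace with your local password
  hostname: "localhost"
```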

We've successfully added the necessary dependencies and set up our database - but what's a repository, and why is it supervised?

Repositories

While ActiveRecord centralizes database access, query generation, and validation in your models, Ecto divides these responsibilities into separate modules.

The repository provides an interface to execute queries against a database. Functions like get, get_by, insert, update, and delete mirror ActiveRecord's find, find_by, create, update, and destroy, with the caveat that ActiveRecord methods are executed within the model class or instance, while Ecto expects you to pass a model, query, or changeset to its functions.

Let's create a simple post model to explore further. First, create a migration using mix ecto.gen.migration create_posts. Edit the generated file to look like this:

defmodule EctoBlog.Repo.Migrations.CreatePosts do
  use Ecto.Migration

  def change do
    create table(:posts) do
      add :title, :string, size: 100
      add :content, :text

      timestamps
    end
  end
end

You can run the migration with mix ecto.migrate. Then, create the model, lib/ecto_blog/post.ex, with the following code:

defmodule EctoBlog.Post do
  use Ecto.Model

  schema "posts" do
    field :title
    field :content

    timestamps
  end
end

Working with the migration feels very familiar, but the model looks different. We'll come back to it soon, but for now, let's try using it with our repo. Drop into an iex session by typing iex -S mix.

iex(1)> post = %EctoBlog.Post{title: "Hello", content: "Ecto"}     
iex(2)> {:ok, inserted_post} = EctoBlog.Repo.insert(post)                  

The ActiveRecord equivalent would have been:

irb(main):001:0> post = Post.create(title: "Hello", content: "AR")                  

This exposes a difference between the two libraries. Where ActiveRecord returns a model in all cases, Ecto returns either {:ok, model} or {:error, changeset}. Ecto uses the changeset to perform validations, rather than deal with validations inside the model.
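
Because the result is a tagged tuple, pattern matching is the natural way to branch on it. Here is a minimal sketch; ResultExample and describe/1 are names I made up for illustration and are not part of Ecto:

```elixir
defmodule ResultExample do
  # Hypothetical helper, not part of Ecto: turns the result of a
  # repo call into a human-readable message.

  # Success: the second element is the persisted model.
  def describe({:ok, model}), do: "saved #{model.title}"

  # Failure: the second element is a changeset carrying the errors.
  def describe({:error, changeset}), do: "failed: #{inspect(changeset.errors)}"
end
```

In iex, you could then write EctoBlog.Repo.insert(post) |> ResultExample.describe() and handle both outcomes without an exception in sight.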

We'll come back to changesets later. For now, let's find, update, and delete our post:

iex(3)> post = EctoBlog.Repo.get_by(EctoBlog.Post, title: "Hello")     
iex(4)> changed_post = %{post | title: "Hello!"}
iex(5)> {:ok, updated_post} = EctoBlog.Repo.update(changed_post)
iex(6)> {:ok, deleted_post} = EctoBlog.Repo.delete(updated_post)                   

And the equivalent:

irb(main):002:0> post = Post.find_by(title: "Hello")
irb(main):003:0> post.update(title: "Hello!")
irb(main):004:0> post.destroy       

Ecto's update and delete follow a similar pattern of a possible {:error, changeset} result, while get and get_by will return nil if the post was not found. Ecto also provides get!, get_by!, insert!, update!, and delete!; those familiar with ActiveRecord will correctly guess that these raise exceptions for their error cases instead.

Earlier, Ecto told us we need to add EctoBlog.Repo as a worker to our supervision tree. To understand why, let's take a peek under the hood.

Database connections and concurrency

Typical Ruby applications using ActiveRecord are single-threaded, and when you query the database, your process is blocked until you receive a response from the database.

When using Ruby on Rails, this works out - you typically can't respond to a web request until you've got all the data, and you can start multiple Ruby processes to handle requests concurrently. It's not the most efficient thing in the world, but it can work.

Ecto handles things differently. Each connection to the database is maintained in its own Elixir (not system) process, and these processes are kept in a pool by poolboy*. When you query the database through a repository, Ecto checks out a connection from the pool and the query executes in that process. The calling process waits for the result, but the rest of your application's processes keep running concurrently.

All these processes are managed through a tree of supervisors, and the repository we created sits at the top. That's why we added the repo as a child of our application's supervisor earlier.

[Figure: supervision tree]

Models

We took a brief look at models earlier when we created lib/ecto_blog/post.ex. What stood out immediately is that while ActiveRecord infers a model's schema by querying the database, Ecto's models explicitly define theirs.

Let's build out our database with a table to allow people to rate posts. Run mix ecto.gen.migration create_ratings, and add the following code:

defmodule EctoBlog.Repo.Migrations.CreateRatings do
  use Ecto.Migration

  def change do
    create table(:ratings) do
      add :post_id, :integer
      add :value, :integer

      timestamps
    end

    alter table(:posts) do
      add :average_rating, :decimal
    end
  end
end

Ignore the new average_rating column for now; we'll use it later. Add a file for your new rating model at lib/ecto_blog/rating.ex:

defmodule EctoBlog.Rating do
  use Ecto.Model

  schema "ratings" do
    belongs_to :post, EctoBlog.Post
    field :value, :integer

    timestamps
  end
end

These are two more cases where we have to be more explicit than we would in ActiveRecord: we need to specify a module in our call to belongs_to, and we need to tell Ecto the type of our value field (we didn't before because string is the default type).

Types

The type information actually specifies an Ecto.Type, which controls how data is converted as it comes both to and from the database, as well as how values are represented in queries.
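
To make that concrete, here is a sketch of a custom type written against the Ecto.Type behaviour as it stands in Ecto 1.0 (type/0, cast/1, load/1, and dump/1). EctoBlog.Stars is a hypothetical module of my own, not something this post's code depends on:

```elixir
defmodule EctoBlog.Stars do
  # Hypothetical custom type: a whole-number rating from 1 to 5.
  @behaviour Ecto.Type

  # The underlying database column type.
  def type, do: :integer

  # cast/1 converts external input (for example, form parameters).
  def cast(value) when is_integer(value) and value in 1..5, do: {:ok, value}

  def cast(value) when is_binary(value) do
    case Integer.parse(value) do
      {int, ""} -> cast(int)
      _ -> :error
    end
  end

  def cast(_), do: :error

  # load/1 converts a value coming from the database.
  def load(value) when is_integer(value), do: {:ok, value}

  # dump/1 converts a value on its way to the database.
  def dump(value) when is_integer(value), do: {:ok, value}
end
```

With something like this in place, a schema could declare field :value, EctoBlog.Stars instead of field :value, :integer, and out-of-range input would be rejected when cast.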

While it's great that ActiveRecord infers types and automatically handles conversions for you, Ecto's types allow for greater control. It's worth noting that the upcoming attributes API changes ActiveRecord to work much more like Ecto in this regard.

Associations

Coming from ActiveRecord, belongs_to seems pretty straightforward, but dealing with the associations at runtime is a bit different. Add the other side of the association to the schema defined in lib/ecto_blog/post.ex:

has_many :ratings, EctoBlog.Rating

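Then run mix ecto.migrate to create the ratings table, and drop into iex again. We deleted our first post earlier, so insert a new one first and substitute its id for the 1 used below: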

iex(1)> post = EctoBlog.Repo.get(EctoBlog.Post, 1)
iex(2)> post.ratings

You'll notice iex reports that the ratings field is an Ecto.Association.NotLoaded struct. Unlike ActiveRecord, Ecto never lazy-loads associations for you. If you need the association, you need to ask the Repo to preload it explicitly:

iex(1)> post = EctoBlog.Repo.get(EctoBlog.Post, 1) |> EctoBlog.Repo.preload(:ratings)
iex(2)> post.ratings

If you execute the code above, you'll see the association is loaded and empty.
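
Preloading can also be requested inside the query itself, rather than as a separate step after fetching. A minimal sketch, assuming a post with id 1 exists in your database:

```elixir
import Ecto.Query, only: [from: 2]

# Ask for the post and its ratings in one repo call.
query = from p in EctoBlog.Post,
  where: p.id == 1,
  preload: [:ratings]

post = EctoBlog.Repo.one(query)
post.ratings
```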

Ecto.Model also provides conveniences when creating new records through associations (similar to build on an ActiveRecord association):

iex(3)> {:ok, rating} = Ecto.Model.build(post, :ratings, value: 5) |> EctoBlog.Repo.insert

Changesets

Let's add some validations. To do so, we'll need to construct a changeset for our model. Within lib/ecto_blog/post.ex, add:

def changeset(post, params \\ :empty) do
  post
  |> cast(params, ~w(title), ~w(content))
  |> validate_length(:title, min: 3)
end

The cast function serves a purpose similar to that of ActiveRecord's former attr_accessible/attr_protected macros and the newer strong_parameters: any parameters not found in the two lists will be ignored, which helps deal with potentially malicious inputs.

The second argument, ~w(title), lists the required parameters, while the third, ~w(content), lists the optional ones. Our changeset will not be valid unless a title is provided. The validate_length function works as you'd expect.

Let's test the changeset from within iex. By the way, if you don't want to keep restarting iex, you can reload the module you changed within it by using the r function, for example, r(EctoBlog.Post).

iex(4)> EctoBlog.Post.changeset(post, %{title: ""}).valid?
iex(5)> EctoBlog.Post.changeset(post, %{title: "Learn about Ecto"}).valid?

These return false and true, respectively. Note that even though valid? might remind you of a method on an ActiveRecord model, it's actually just a simple struct field. The changeset also has the familiar errors field with any validation failure messages.

Earlier, we passed models to the repo to insert and update our database - but we can pass changesets as well:

iex(6)> changeset = EctoBlog.Post.changeset(post, %{title: "Learn"}) 
iex(7)> {:ok, updated_post} = EctoBlog.Repo.update(changeset)

Typically, you'd work with a changeset for making modifications to a model via the repo, and you'd work with the model when fetching the data for display.

Callbacks

It would be great if we could maintain an average rating for our posts, rather than having to compute it each time we view a post. Luckily, Ecto models support callbacks.

Ecto's callbacks are defined on the model. They receive a changeset representing the insert/update/delete, and are expected to return a changeset. Unlike ActiveRecord's callbacks, they have no means of aborting the database operation.

We already migrated the average_rating column to posts - but we still need to map the column in the schema section of lib/ecto_blog/post.ex, so add field :average_rating, :decimal there. Then, in lib/ecto_blog/rating.ex, add the callbacks:

after_insert :update_average_rating
after_update :update_average_rating
after_delete :update_average_rating

def update_average_rating(changeset) do
  post_id = get_field(changeset, :post_id)

  average_rating = EctoBlog.Repo.one(
    from r in EctoBlog.Rating,
      select: avg(r.value),
      where: r.post_id == ^post_id
  )

  EctoBlog.Repo.update_all(
    from(p in EctoBlog.Post, where: p.id == ^post_id),
    set: [average_rating: average_rating]
  )

  changeset
end

This callback will execute two queries - one to select the average of all ratings on the newly rated post, and another to update the post with the computed value. Since Ecto runs its callbacks within a transaction (and we modify the rating before updating the average), we're guaranteed to stay in sync.

Wrapping up

We've covered quite a bit of ground. For me, Ecto is a breath of fresh air. It lacks the magic that makes ActiveRecord too mysterious at times, and its separate modules feel much more contained than ActiveRecord::Base.

If you'd like, the code from this post is available on GitHub. In the next post, I'll talk about one of Ecto's most interesting features, its query DSL.


* There are alternative pools available in Ecto as well, but poolboy is the default.