Your Location is: Home > Sql

Postgresql “column must appear in the GROUP BY clause or be used in an aggregate function” and unique field

From: Budapest View: 1583 Mathieu Mahé 

Question

I know this question has been asked a lot, but I don't find an answer as to why I get this error message with a unique field:

Here are my 2 tables and an index:

CREATE TABLE posts (
    id bigint NOT NULL,
    user_id bigint NOT NULL,
    content text
);

CREATE TABLE users (
    id bigint NOT NULL,
    email character varying DEFAULT ''::character varying NOT NULL
)

CREATE UNIQUE INDEX index_users_on_email ON users USING btree (email);

The following sql request:

SELECT posts.content, users.email /*, other aggregate fields not relevant for the question */
   FROM posts
   INNER JOIN users ON posts.user_id = users.id
   /* Other `inner join`s but not relevant for the question */
   GROUP BY posts.id;

give me the error column "users.email" must appear in the GROUP BY clause or be used in an aggregate function.

But the email field is unique (if it changes anything) and a post can only have one user (so one email).

I don't understand why this request is not valid since it's not possible to have multiple values of email per post.

Best answer

You need to add the primary key of the user table to the group by clause to make the query a valid aggregation query:

SELECT p.content, u.email /*, other aggregate fields not relevant for the question */
FROM posts p
INNER JOIN users u ON p.user_id = u.id
/* Other `inner join`s but not relevant for the question */
GROUP BY posts.id, u.id;

Postgres is quite smart about functional dependencies, but not that smart. It understand the concept of functionally-dependent columns, but not across tables. It cannot foresee that a post uniquely refers to a user, even if you have a proper foreign key set up. I don't think that such things is defined in standard ANSI SQL either.