Py学习  »  DATABASE

mysql left join返回意外结果

Supun Kavinda • 5 年前 • 1495 次点击  

我有两张桌子 talk_comments talk_comment_votes .

我运行以下代码来选择, commentId , numberOfUpvotes , whetherUserUpvoted , numberOfDownvotes , whetherUserDownvoted 使用左连接到同一个表。

SELECT c.id, COUNT(v1.id) as upvotes, COUNT(v2.id) as userUpvoted, COUNT(v3.id) as downvotes, COUNT(v4.id) as userDownvoted FROM talk_comments c
    LEFT JOIN talk_comment_votes v1 ON v1.comment_id = c.id AND v1.status = 1 
    LEFT JOIN talk_comment_votes v2 ON v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 AND v2.is_anonymous = 0
    LEFT JOIN talk_comment_votes v3 ON c.id = v3.comment_id AND v3.status = 2
    LEFT JOIN talk_comment_votes v4 ON c.id = v4.comment_id AND v4.status = 2 AND v4.user_id = 1 AND v4.is_anonymous = 0
WHERE c.id = 2 GROUP BY c.id

我的演讲评论投票表中有以下数据

phpMyAdmin database screenshot

因此,根据查询,它应该选择值 2,2,0,1,1 分别是。当我断开这些连接语句并执行查询时,它将返回预期的结果。但是,对于连接,它返回如下内容。

phpMyAdmin table results

我能帮你修一下吗?

谢谢。

Python社区是高质量的Python/Django开发社区
本文地址:http://www.python88.com/topic/43076
 
1495 次点击  
文章 [ 2 ]  |  最新文章 5 年前
spencer7593
Reply   •   1 楼
spencer7593    6 年前

我会使用条件聚合。对单个引用的联接 tall_comment_votes ,然后检查表达式中的条件。

SELECT c.id
     , SUM(IF(v.status = 1                                          ,1,0)) AS upvotes
     , SUM(IF(v.status = 1 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userUpvoted
     , SUM(IF(v.status = 2                                          ,1,0)) AS downvotes
     , SUM(IF(v.status = 2 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userDownvoted
  FROM talk_comments c
  LEFT
  JOIN talk_comment_votes v
    ON v.comment_id = c.id
 WHERE c.id = 2
 GROUP
    BY c.id

这避免了当v1、v2、v3和v4返回多行时部分交叉积的问题。

MySQL IF() 表达式可以替换为更符合ANSI标准的 CASE 表达式,例如

    , SUM(CASE WHEN v.status = 1 THEN 1 ELSE 0 END)  AS upvotes

追随

设置测试用例并遵守执行计划和性能

填充表

CREATE TABLE talk_comments (id INT NOT NULL PRIMARY KEY AUTO_INCREMENT);
CREATE TABLE talk_comment_votes (id INT NOT NULL PRIMARY KEY AUTO_INCREMENT, comment_id  INT UNSIGNED NOT NULL, user_id  INT UNSIGNED NOT NULL, is_anonymous TINYINT(1) UNSIGNED NOT NULL, STATUS TINYINT UNSIGNED, time_ INT UNSIGNED); 
CREATE INDEX talk_comment_votes_IX1 ON talk_comment_votes (comment_id, STATUS, user_id, is_anonymous) ;
INSERT INTO talk_comments (id) VALUES (1),(2),(3);
INSERT INTO talk_comment_votes (id, comment_id, user_id, is_anonymous, STATUS, time_) VALUES (1,2,2,0,1,0),(2,1,1,0,1,0),(3,2,1,0,2,NULL),(4,7,1,0,2,NULL),(5,1,14,1,1,NULL),(6,2,14,1,1,NULL);

查询执行计划

EXPLAIN
SELECT c.id, COUNT(DISTINCT v1.id) AS upvotes, COUNT(DISTINCT v2.id) AS userUpvoted, COUNT(DISTINCT v3.id) AS downvotes, COUNT(DISTINCT v4.id) AS userDownvoted FROM talk_comments c
  LEFT JOIN talk_comment_votes v1 ON v1.comment_id = c.id AND v1.status = 1 
    LEFT JOIN talk_comment_votes v2 ON v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 AND v2.is_anonymous = 0
    LEFT JOIN talk_comment_votes v3 ON c.id = v3.comment_id AND v3.status = 2
    LEFT JOIN talk_comment_votes v4 ON c.id = v4.comment_id AND v4.status = 2 AND v4.user_id = 1 AND v4.is_anonymous = 0
WHERE c.id = 2 GROUP BY c.id
;

EXPLAIN
SELECT c.id
     , SUM(IF(v.status = 1                                          ,1,0)) AS upvotes
     , SUM(IF(v.status = 1 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userUpvoted
     , SUM(IF(v.status = 2                                          ,1,0)) AS downvotes
     , SUM(IF(v.status = 2 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userDownvoted
    FROM talk_comments c
    LEFT
    JOIN talk_comment_votes v
      ON v.comment_id = c.id
   WHERE c.id = 2
   GROUP BY c.id
;

explain的输出

--     id  select_type  table   type    possible_keys           key                     key_len  ref                        rows  Extra        
-- ------  -----------  ------  ------  ----------------------  ----------------------  -------  -----------------------  ------  -------------
--      1  SIMPLE       c       const   PRIMARY                 PRIMARY                 4        const                         1  Using index  
--      1  SIMPLE       v1      ref     talk_comment_votes_IX1  talk_comment_votes_IX1  6        const,const                   2  Using index  
--      1  SIMPLE       v2      ref     talk_comment_votes_IX1  talk_comment_votes_IX1  11       const,const,const,const       1  Using index  
--      1  SIMPLE       v3      ref     talk_comment_votes_IX1  talk_comment_votes_IX1  6        const,const                   1  Using index  
--      1  SIMPLE       v4      ref     talk_comment_votes_IX1  talk_comment_votes_IX1  11       const,const,const,const       1  Using index  



--     id  select_type  table   type    possible_keys           key                     key_len  ref       rows  Extra        
-- ------  -----------  ------  ------  ----------------------  ----------------------  -------  ------  ------  -------------
--      1  SIMPLE       c       const   PRIMARY                 PRIMARY                 4        const        1  Using index  
--      1  SIMPLE       v       ref     talk_comment_votes_IX1  talk_comment_votes_IX1  4        const        3  Using index  

测量性能:

100 executions                        round 1      round 2     round 3
------------------------------------  ----------   ----------  ---------
multiple left join, count(distinct    0.123 secs   0.130 secs  0.125 secs
conditional aggregation sum(if        0.113 secs   0.114 secs  0.111 secs
Supun Kavinda
Reply   •   2 楼
Supun Kavinda    6 年前

我运行了一个基于@spencer7593和@raymondnijland的两个答案的查询基准。

左加入获胜!

1。使用左连接

SELECT c.id, COUNT(DISTINCT v1.id) as upvotes, COUNT(DISTINCT v2.id) as userUpvoted, COUNT(DISTINCT v3.id) as downvotes, COUNT(DISTINCT v4.id) as userDownvoted FROM talk_comments c
  LEFT JOIN talk_comment_votes v1 ON v1.comment_id = c.id AND v1.status = 1 
    LEFT JOIN talk_comment_votes v2 ON v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 AND v2.is_anonymous = 0
    LEFT JOIN talk_comment_votes v3 ON c.id = v3.comment_id AND v3.status = 2
    LEFT JOIN talk_comment_votes v4 ON c.id = v4.comment_id AND v4.status = 2 AND v4.user_id = 1 AND v4.is_anonymous = 0
WHERE c.id = 2 GROUP BY c.id

1000次查询的时间: 0.55000805854797秒


2。使用子查询

SELECT c.id,c.user_id, c.time,c.body, c.reply_to, 
    (SELECT COUNT(v1.id) FROM talk_comment_votes v1 WHERE v1.comment_id = c.id AND v1.status = 1 LIMIT 1) as upvotes, 
    (SELECT COUNT(v2.id) FROM talk_comment_votes v2 WHERE v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 LIMIT 1) as clientUpvoted,
    (SELECT COUNT(v3.id) FROM talk_comment_votes v3 WHERE v3.comment_id = c.id AND v3.status = 2 LIMIT 1) as downvotes, 
    (SELECT COUNT(v4.id) FROM talk_comment_votes v4 WHERE v4.comment_id = c.id AND v4.status = 2 AND v4.user_id = 1 LIMIT 1) as clientDownvoted
      FROM talk_comments c 
           WHERE c.id = 2 GROUP BY c.id

1000次查询的时间: 0.95499300956726秒

三。使用sum,如果

SELECT c.id
     , SUM(IF(v.status = 1                                          ,1,0)) AS upvotes
     , SUM(IF(v.status = 1 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userUpvoted
     , SUM(IF(v.status = 2                                          ,1,0)) AS downvotes
     , SUM(IF(v.status = 2 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userDownvoted
    FROM talk_comments c
    LEFT
    JOIN talk_comment_votes v
      ON v.comment_id = c.id
   WHERE c.id = 2
   GROUP BY c.id

1000次查询的时间: 1.2266919612885秒

谢谢你的回答。