首页 > 数据库技术 > 详细

再战mysql 数据去重

时间:2018-11-13 13:49:49      阅读:146      评论:0      收藏:0      [点我收藏+]

年初时,写过一篇去重的,在小表中还能用用,在大表中真的是效率低下,现在给了一次优化
https://www.cnblogs.com/jarjune/p/8328013.html

继上一篇文章

方法三:

DELIMITER //

DROP PROCEDURE IF EXISTS delete_rows_2;

CREATE PROCEDURE delete_rows_2(IN TABLENAME VARCHAR(50), IN FIELDNAMES VARCHAR(100), IN AUTOFIELD VARCHAR(50))
BEGIN

DECLARE DELETE_TABLE_ROWS_SQL VARCHAR(1000);

SET DELETE_TABLE_ROWS_SQL = CONCAT('
        DELETE 
        FROM 
            ', TABLENAME ,' 
        WHERE 
            (', FIELDNAMES ,') IN ( 
                SELECT ', FIELDNAMES ,' 
                FROM (
                    SELECT 
                        ', FIELDNAMES ,' 
                    FROM 
                        ', TABLENAME ,' 
                    GROUP BY 
                        ', FIELDNAMES ,' 
                    HAVING 
                        COUNT(1) > 1 
                ) t1
            ) 
        AND ', AUTOFIELD ,' NOT IN ( 
            SELECT ', AUTOFIELD ,' 
            FROM (
                SELECT 
                    MAX(', AUTOFIELD ,') ', AUTOFIELD ,' 
                FROM 
                    ', TABLENAME ,'
                GROUP BY 
                    ', FIELDNAMES ,' 
                HAVING 
                    COUNT(1) > 1 
                ) t2
            )
');

SET @DELETE_TABLE_ROWS_SQL = DELETE_TABLE_ROWS_SQL;

PREPARE DELETE_TABLE_ROWS_SQL_PRE FROM @DELETE_TABLE_ROWS_SQL;
EXECUTE DELETE_TABLE_ROWS_SQL_PRE;

END//

DELIMITER ;

CALL delete_rows_1('表名', '字段1,字段2,字段3...', '主键(唯一)字段');

之后发现删除的效率还是挺低,又优化成

方法三(优化):

DELIMITER //

DROP PROCEDURE IF EXISTS delete_rows_2;

CREATE PROCEDURE delete_rows_2(IN TABLENAME VARCHAR(50), IN FIELDNAMES VARCHAR(100), IN AUTOFIELD VARCHAR(50))
BEGIN

DECLARE DELETE_TABLE_ROWS_SQL VARCHAR(1000);

SET DELETE_TABLE_ROWS_SQL = CONCAT('
        DELETE 
        FROM 
            ', TABLENAME ,' 
        WHERE 
            ', AUTOFIELD ,' IN ( 
                SELECT 
                    ', AUTOFIELD ,' 
                FROM
                    (
                    SELECT 
                        ', AUTOFIELD ,' 
                    FROM 
                        ', TABLENAME ,' 
                    WHERE 
                        (', FIELDNAMES ,') IN ( 
                            SELECT 
                                ', FIELDNAMES ,' 
                            FROM 
                                ', TABLENAME ,' 
                            GROUP BY 
                                ', FIELDNAMES ,' 
                            HAVING 
                                COUNT(1) > 1 
                        ) 
                    AND ', AUTOFIELD ,' NOT IN ( 
                        SELECT 
                            MAX(', AUTOFIELD ,') 
                        FROM 
                            ', TABLENAME ,'
                        GROUP BY 
                            ', FIELDNAMES ,' 
                        HAVING 
                            COUNT(1) > 1 
                    ) 
                ) t2 
            ) 
    ');

SET @DELETE_TABLE_ROWS_SQL = DELETE_TABLE_ROWS_SQL;

PREPARE DELETE_TABLE_ROWS_SQL_PRE FROM @DELETE_TABLE_ROWS_SQL;
EXECUTE DELETE_TABLE_ROWS_SQL_PRE;

END//

DELIMITER ;

CALL delete_rows_2('表名', '字段1,字段2,字段3...', '主键字段');

由于上述都要group by 两次,又换了一种思路

方法四

delete t1
FROM
    l_weijij_47 t1,
    (
        SELECT
            f01,
            f02,
            f03,
            MAX(seq_value) seq_value
        FROM
            l_weijij_47
        GROUP BY
            f01,
            f02,
            f03
        HAVING
            COUNT(1) > 1
        ORDER BY NULL
    ) t2
where
    t1.f01 = t2.f01
AND t1.f02 = t2.f02
AND t1.f03 = t2.f03
and t1.seq_value < t2.seq_value

注:group by默认会进行排序,所以要加上order by NULL就避免了排序

综上,方法四是目前在用的去重。

再战mysql 数据去重

原文:https://www.cnblogs.com/jarjune/p/9951573.html

(0)
(0)
   
举报
评论 一句话评论(0
关于我们 - 联系我们 - 留言反馈 - 联系我们:wmxa8@hotmail.com
© 2014 bubuko.com 版权所有
打开技术之扣,分享程序人生!