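Import the user data to be matched into the local database
Open the mysql client from the command line (search online if you are not sure how)
Then run: mysql> source C:/Users/zhudong/Desktop/userdb.sql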
3. Compare and filter users
The comparison script is written. Be sure to run it from the command line:
<?php
// Note: the mysql_* functions were removed in PHP 7.0, so this script needs PHP 5.x.
$link = mysql_connect('localhost', 'root', 'admin', true);
mysql_select_db('csdn', $link);
$handle_username = fopen("E:/records_username.txt", "a");
//$handle_email = fopen("E:/records_email.txt","a");
$username_num = $email_num = $uid = 0;
while ($uid < 2181106) {
    $nextuid = $uid + 10000;
    // Walk the local member table in batches of 10000 uids (upper bound inclusive).
    $query = mysql_query("SELECT * FROM pw_members WHERE uid>'$uid' AND uid<='$nextuid'");
    while ($rt = mysql_fetch_array($query, MYSQL_ASSOC)) {
        $username = $rt['username'];
        $email = $rt['email'];
        // Escape the values so names containing quotes do not break the query.
        $username_sql = mysql_real_escape_string($username, $link);
        $email_sql = mysql_real_escape_string($email, $link);
        $query2 = mysql_query("SELECT * FROM csdn_userdb WHERE username='$username_sql' OR email='$email_sql'");
        while ($rt2 = mysql_fetch_array($query2, MYSQL_ASSOC)) {
            // The local password is an MD5 hash; hash the CSDN password before comparing.
            if ($rt['password'] == md5($rt2['password'])) {
                if ($rt2['username'] == $username) {
                    $username_num++;
                    fwrite($handle_username, 'OWN:'.$rt['uid'].'|'.$rt['username'].'|'.$rt['password'].'|'.$rt['email'].' CSDN:'.$rt2['username'].'|'.$rt2['password'].'|'.$rt2['email']."\r\n");
                    echo 'username_num='.$username_num."\r\n";
                    continue;
                }
                /*
                if ($rt2['email'] == $email) {
                    $email_num++;
                    fwrite($handle_email, 'OWN:'.$rt['uid'].'|'.$rt['username'].'|'.$rt['password'].'|'.$rt['email'].' CSDN:'.$rt2['username'].'|'.$rt2['password'].'|'.$rt2['email']."\r\n");
                    echo 'email_num='.$email_num."\r\n";
                }
                */
            }
        }
        mysql_free_result($query2);
    }
    $uid = $nextuid;
}
?>
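The code above is very clumsy because it is extremely inefficient: with several million rows it takes more than 10 hours to run. How could I forget something as basic as a JOIN query? The corrected version follows: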
<?php
// Note: the mysql_* functions were removed in PHP 7.0, so this script needs PHP 5.x.
$link = mysql_connect('localhost', 'root', 'admin', true);
mysql_select_db('csdn', $link);
$handle_username = fopen("E:/records_username.txt", "a");
$username_num = $uid = 0;
while ($uid < 2181106) { // this number is the largest ID in the user table being compared
    $nextuid = $uid + 10000;
    // One JOIN per batch of 10000 uids replaces the per-member lookup query.
    // The u.username!='' filter drops members that have no row in csdn_userdb.
    $query = mysql_query("SELECT m.uid,m.username,m.password,m.email,u.password as csdn_password,u.email as csdn_email FROM own_members m LEFT JOIN csdn_userdb u USING(username) WHERE m.uid>'$uid' AND m.uid<='$nextuid' AND u.username!=''");
    while ($rt = mysql_fetch_array($query, MYSQL_ASSOC)) {
        // The local password is an MD5 hash; hash the CSDN password before comparing.
        if ($rt['password'] == md5($rt['csdn_password'])) {
            $username_num++;
            fwrite($handle_username, 'OWN:'.$rt['uid'].'|'.$rt['username'].'|'.$rt['password'].'|'.$rt['email'].' CSDN:'.$rt['username'].'|'.$rt['csdn_password'].'|'.$rt['csdn_email']."\r\n");
            echo 'username_num='.$username_num."\r\n";
        }
    }
    mysql_free_result($query);
    $uid = $nextuid;
    echo 'uid='.$uid."\r\n";
}
?>
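Total comparison time: 25 minutes, a huge improvement over the 10+ hours the previous version needed
Total users with duplicate usernames: 34175
Share of all members: 1.7%
A 1.7% share of duplicate users is still fairly serious. I hope this article helps webmasters identify the affected users on their own sites.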
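The batching and the password check that both scripts rely on can be pulled out into small helpers and tried without a database. The following is only a minimal sketch: the function names `uid_batches`, `password_reused`, and `format_record` are hypothetical and do not appear in the original scripts, and it assumes, as the scripts do, that the local table stores an unsalted MD5 hash while the CSDN table stores the plaintext password.

```php
<?php
// Hypothetical helpers for illustration only; not part of the original scripts.

/**
 * Yield [lower, upper] pairs so that each batch covers lower < uid <= upper,
 * mirroring the "uid > $uid AND uid <= $nextuid" range in the JOIN version.
 */
function uid_batches(int $maxUid, int $step = 10000): Generator
{
    for ($lower = 0; $lower < $maxUid; $lower += $step) {
        yield [$lower, min($lower + $step, $maxUid)];
    }
}

/**
 * True when the local MD5 hash equals the MD5 of the leaked plaintext.
 * Assumes the local password column holds an unsalted md5(password).
 */
function password_reused(string $ownMd5, string $leakedPlain): bool
{
    return hash_equals(strtolower($ownMd5), md5($leakedPlain));
}

/**
 * Build one output line in the same layout the scripts write to the record
 * file. $row is assumed to have the columns selected by the JOIN query.
 */
function format_record(array $row): string
{
    return 'OWN:' . $row['uid'] . '|' . $row['username'] . '|' . $row['password'] . '|' . $row['email']
        . ' CSDN:' . $row['username'] . '|' . $row['csdn_password'] . '|' . $row['csdn_email'] . "\r\n";
}

// Usage: three batches cover uids 1 through 25000.
foreach (uid_batches(25000) as [$lower, $upper]) {
    echo "uid > $lower AND uid <= $upper\r\n";
}
```

Capping the last batch at the maximum uid is a small design choice of this sketch; the original loop simply lets the final range run past the largest ID, which returns the same rows.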