Capturing WeChat Official Account Data with Fiddler

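This post records the whole process of using the Fiddler packet-capture tool to capture the requests made by a WeChat official account and then parse the captured data.
Preparation:
Download: Fiddler_5.0.20173.49666_Setup.exe
1. Install Fiddler_5.0.20173.49666_Setup.exe. Installation is straightforward; once opened, it looks like the figure below: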

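2. Generate the certificate file FiddlerRoot.cer
     In the menu bar, choose [Tools] -> [Options] -> [HTTPS] and tick the options shown in the figure below.

    Then click [Actions] and choose to export the root certificate to the desktop.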

3. Install the certificate manually
   The Fiddler directory contains makecert.exe. Create myTest.bat there with the following content:

    makecert.exe -r -ss my -n "CN=DO_NOT_TRUST_FiddlerRoot, O=DO_NOT_TRUST, OU=Created by http://www.fiddler2.com" -sky signature -eku 1.3.6.1.5.5.7.3.1 -h 1 -cy authority -a sha1 -m 120 -b 09/05/2012

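4. Capture the official account data I want
  a. Principle: Fiddler provides one hook method that runs before a request is sent and another that runs once the response has come back:

    OnBeforeRequest(), OnBeforeResponse()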

  b. Configure the capture rules
     Choose [Rules] -> [Customize Rules...], then restart Fiddler to reach the screen shown in the figure below.

     Modify OnBeforeRequest():

    if (oSession.fullUrl.Contains("mp.weixin.qq.com"))
    {
        var fso;
        var file;
        fso = new ActiveXObject("Scripting.FileSystemObject");
        // Output path; change it as needed.
        // 8 = open for appending, true = create the file if missing, true = write Unicode
        file = fso.OpenTextFile("c:\\Sessions.txt", 8, true, true);
        file.WriteLine("Request url: " + oSession.url);
        file.WriteLine("Request header:" + "\n" + oSession.oRequest.headers);
        file.WriteLine("Request body: " + oSession.GetRequestBodyAsString());
        file.WriteLine("\n");
        file.Close();
    }

   Modify OnBeforeResponse():

    if (oSession.fullUrl.Contains("weixin/searchShiFu.php"))
    {
        // Decode the response first so the saved body is not garbled
        oSession.utilDecodeResponse();
        var fso;
        var file;
        fso = new ActiveXObject("Scripting.FileSystemObject");
        // Output path; change it as needed.
        // 8 = open for appending, true = create the file if missing, true = write Unicode
        file = fso.OpenTextFile("d:\\Response.txt", 8, true, true);
        //file.WriteLine("Response code: " + oSession.responseCode);
        file.WriteLine("Response body: " + oSession.GetResponseBodyAsString());
        file.WriteLine("\n");
        file.Close();
    }

   Save and exit, then restart Fiddler; the rules are now active.
5. Parse the captured content
   a. The captured responses are saved to d:\Response.txt. The content looks like this:
  1. Response body: {"nickname":"秦人","totalTimes":33,"todayTimes":2,"total":598,"thisNum":8,"yuanjin":1,"oneSF":[{"juli":"2.1 公里","name":"马帅军","phone":"15529016011","address":"陕西西安未央区建章路","longitude_S":"108.848384","latitude_S":"34.318548","jianjie":"工龄4年。     施工20~25元/卷","s1":"","s2":"","s3":"","sa":"","p1":"","p2":"","p3":"","pa":"","uid":"12527","weixin":"0","headimgurl":""},{"juli":"2.4 公里","name":"张帅","phone":"13571547952","address":"陕西西安施工范围。全。","longitude_S":"108.893215","latitude_S":"34.332735","jianjie":"无妨壁纸25元一卷。长纤,蚕丝等30元一卷。壁画20元一平。壁布10元一平。","s1":"","s2":"","s3":"","sa":"","p1":"","p2":"","p3":"","pa":"","uid":"14235","weixin":"1","headimgurl":"http://wx.qlogo.cn/mmopen/PiajxSqBRaEJkMDHthV4HGnCWtEk7TCTvDOQUId5uvHaOZkzxN8nRJv8C7YicFia8KibNhvyjW..."},{"juli":"2.9 公里","name":"黄师傅","phone":"13289380958","address":"西安西安市","longitude_S":"108.894098","latitude_S":"34.305800","jianjie":"工龄:6年,本人从事墙纸粘贴行业已有6年,积累了丰富的墙纸施工方面的经验和技术能对各种高中","s1":"","s2":"","s3":"","sa":"","p1":"","p2":"","p3":"","pa":"","uid":"2248","weixin":"0","headimgurl":""},{"juli":"3 公里","name":"尚俊","phone":"18802920027","address":"陕西西安未央区汉城街办西查村","longitude_S":"108.900890","latitude_S":"34.331328","jianjie":"6年工龄,无纺纸20其他25。团队6人,随时准备24小时为您服务。","s1":"","s2":"","s3":"","sa":"","p1":"","p2":"","p3":"","pa":"","uid":"13814","weixin":"0","headimgurl":""},{"juli":"3.2 公里","name":"刘小虎","phone":"18710629117","address":"陕西咸阳武功县普集镇令新村","longitude_S":"108.904032","latitude_S":"34.316132","jianjie":"","s1":"","s2":"","s3":"","sa":"","p1":"","p2":"","p3":"","pa":"","uid":"11609","weixin":"0","headimgurl":""},{"juli":"3.6 公里","name":"张跃武","phone":"13772014639","address":"陕西西安莲湖区莲湖区邓家村小学","longitude_S":"108.880513","latitude_S":"34.292046","jianjie":"贴了6年壁纸,工费是根据纸的材料而定,合作搭档两人,","s1":"","s2":"","s3":"","sa":"","p1":"","p2":"","p3":"","pa":"","uid":"5794","weixin":"0","headimgurl":""},{"juli":"3.6 
公里","name":"小何","phone":"15291480050","address":"陕西西安凤城四路","longitude_S":"108.909152","latitude_S":"34.324399","jianjie":"","s1":"","s2":"","s3":"","sa":"","p1":"","p2":"","p3":"","pa":"","uid":"13726","weixin":"0","headimgurl":""},{"juli":"3.6 公里","name":"魏师傅","phone":"15291814440","address":"西安西安市","longitude_S":"108.904098","latitude_S":"34.306800","jianjie":"工龄:7年,专业","s1":"","s2":"","s3":"","sa":"","p1":"","p2":"","p3":"","pa":"","uid":"2241","weixin":"0","headimgurl":""}]}  

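   My goal is to extract the name, phone and address fields from the response data above.
   Note: the generated Response.txt is encoded as UCS-2 little endian by default; the corresponding charset name in Java is UTF-16LE.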
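
The encoding note can be illustrated with a small sketch. It assumes the file starts with a byte-order mark, which is typical for text files written in Unicode mode by Scripting.FileSystemObject, and that every record begins with the "Response body: " prefix written by the hook above. The class and method names (ResponseLogDecoder, decode, extractPayloads) are hypothetical and are not part of the original code.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ResponseLogDecoder {

    // Prefix written by the OnBeforeResponse() hook in front of each JSON body.
    private static final String PREFIX = "Response body: ";

    /** Decodes UTF-16LE bytes and drops a leading byte-order mark if present. */
    public static String decode(byte[] utf16leBytes) {
        String text = new String(utf16leBytes, StandardCharsets.UTF_16LE);
        // Java's UTF-16LE decoder keeps the BOM as the character U+FEFF.
        if (!text.isEmpty() && text.charAt(0) == '\uFEFF') {
            text = text.substring(1);
        }
        return text;
    }

    /** Returns the JSON payload of every line that starts with the prefix. */
    public static List<String> extractPayloads(String text) {
        List<String> payloads = new ArrayList<>();
        for (String line : text.split("\\r?\\n")) {
            if (line.startsWith(PREFIX)) {
                payloads.add(line.substring(PREFIX.length()));
            }
        }
        return payloads;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the bytes of Response.txt (BOM + one record + blank lines).
        String sample = "\uFEFFResponse body: {\"oneSF\":[]}\r\n\n\r\n";
        byte[] bytes = sample.getBytes(StandardCharsets.UTF_16LE);
        System.out.println(extractPayloads(decode(bytes))); // prints [{"oneSF":[]}]
    }
}
```

Matching on the prefix rather than cutting a fixed number of characters means blank separator lines and a stray BOM are skipped instead of being handed to the JSON parser.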

   b. Parse Response.txt and write the name, phone and address fields to D:\Handle_Response.txt
     I implemented the parsing in Java; the code is as follows:

    package com.wang.readText;

    import java.io.BufferedReader;
    import java.io.BufferedWriter;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.InputStreamReader;
    import java.io.OutputStreamWriter;
    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;

    import com.alibaba.fastjson.JSONArray;
    import com.alibaba.fastjson.JSONObject;

    // UserInfo is a bean exposing getName(), getPhone() and getAddress();
    // its source is not included in this post.
    public class ReadTextUtils {

        private static final List<UserInfo> resultList = new ArrayList<>();
        private static final String SRC_PATH = "D:/Response.txt";
        private static final String OUT_PATH = "D:/Handle_Response.txt";
        // Prefix written by OnBeforeResponse() in front of each JSON body
        private static final String PREFIX = "Response body: ";

        public static void main(String[] args) {
            // Fall back to the default paths when no arguments are given
            String srcPath = args.length > 0 ? args[0] : SRC_PATH;
            String outPath = args.length > 1 ? args[1] : OUT_PATH;
            readTxtContent(srcPath);
            writeTxtContent(outPath);
        }

        public static void readTxtContent(String srcPath) {
            /* Read the data; the file written by Fiddler is UTF-16LE */
            try (BufferedReader br = new BufferedReader(new InputStreamReader(
                    new FileInputStream(srcPath), StandardCharsets.UTF_16LE))) {
                String lineTxt;
                while ((lineTxt = br.readLine()) != null) {
                    // Skip blank lines; ignore anything before the prefix (such as a BOM)
                    int start = lineTxt.indexOf(PREFIX);
                    if (start < 0) {
                        continue;
                    }
                    JSONObject object = JSONObject.parseObject(lineTxt.substring(start + PREFIX.length()));
                    // "oneSF" holds the record list; skip responses where it is not an array
                    Object oneSF = object.get("oneSF");
                    if (oneSF instanceof JSONArray) {
                        String jsonArrayString = ((JSONArray) oneSF).toJSONString();
                        List<UserInfo> userList = JSONArray.parseArray(jsonArrayString, UserInfo.class);
                        resultList.addAll(userList);
                        System.out.println("--read line data count---" + userList.size());
                    }
                }
            } catch (Exception e) {
                System.err.println("read errors :" + e);
            }
        }

        /* Remove duplicates, keyed by phone number */
        public static List<UserInfo> quchongfu() {
            HashMap<String, UserInfo> userMap = new HashMap<>();
            for (UserInfo userInfo : resultList) {
                userMap.put(userInfo.getPhone(), userInfo);
            }
            return new ArrayList<>(userMap.values());
        }

        public static void writeTxtContent(String outPath) {
            /* Write the output as UTF-8 */
            try (BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(
                    new FileOutputStream(outPath), StandardCharsets.UTF_8))) {
                bw.write("name\t\t    phone\t\t\t    address");
                bw.newLine();
                for (UserInfo userInfo : quchongfu()) {
                    bw.write(userInfo.getName() + "\t\t " + userInfo.getPhone() + "\t\t " + userInfo.getAddress());
                    bw.newLine();
                }
            } catch (Exception e) {
                System.err.println("write errors :" + e);
            }
        }
    }
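
The ReadTextUtils code refers to a UserInfo type whose source the post does not show. The following is a minimal sketch of what such a bean might look like, assuming it only needs the three fields the post extracts (name, phone, address). The wrapper class UserInfoSketch and the method dedupeByPhone are hypothetical names. The de-duplication shown here is a variant of quchongfu(), not a copy of it: it uses a LinkedHashMap with putIfAbsent, so the first record for each phone number is kept and input order is preserved, whereas a plain HashMap with put keeps the last record in unspecified order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class UserInfoSketch {

    /** Hypothetical bean holding only the three extracted fields. */
    public static class UserInfo {
        private String name;
        private String phone;
        private String address;

        public UserInfo() { } // no-arg constructor, as bean-style JSON mappers usually expect

        public UserInfo(String name, String phone, String address) {
            this.name = name;
            this.phone = phone;
            this.address = address;
        }

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public String getPhone() { return phone; }
        public void setPhone(String phone) { this.phone = phone; }
        public String getAddress() { return address; }
        public void setAddress(String address) { this.address = address; }
    }

    /** Keeps the first record seen for each phone number, in input order. */
    public static List<UserInfo> dedupeByPhone(List<UserInfo> users) {
        Map<String, UserInfo> byPhone = new LinkedHashMap<>();
        for (UserInfo user : users) {
            byPhone.putIfAbsent(user.getPhone(), user);
        }
        return new ArrayList<>(byPhone.values());
    }

    public static void main(String[] args) {
        // Made-up placeholder records, not taken from the captured data.
        List<UserInfo> users = Arrays.asList(
                new UserInfo("A", "000-1", "addr 1"),
                new UserInfo("B", "000-2", "addr 2"),
                new UserInfo("A again", "000-1", "addr 3"));
        for (UserInfo user : dedupeByPhone(users)) {
            System.out.println(user.getName() + "\t" + user.getPhone() + "\t" + user.getAddress());
        }
    }
}
```

Preserving input order keeps the output file in the same order as the captured responses, which makes it easier to compare against Response.txt.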

   The output text looks like the figure below:

  6. One-click data processing:
     a. Package the ReadTextUtils class as an executable jar.

     b. Write a simple runReadTxt.bat file with the following content:

    @echo off
    echo -----------read infos-----------
    java -jar "%cd%\readTxt.jar" "%cd%\Response.txt" "%cd%\Handle_Response.txt"
    echo ---------------finish!!!-----------------------------
    PAUSE

     c. That completes the one-click tool. Note: running the jar requires a Java runtime to be installed.

     I hope this helps. Corrections are welcome!

Source: https://blog.csdn.net/huaairen/article/details/79243760
