Collecting the URL list from ChinaHR (中華英才網) (11.20)

Author: 小文  Published: 2010-11-17 19:56 Wednesday  Category: Software training

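When you search for jobs on ChinaHR, you will notice that the result URL keeps changing. Starting from a page such as https://searchjob.chinahr.com/ , entering a keyword and other search conditions returns the listings.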

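Obtaining the list of result-page URLs is not easy, however. Analysis shows that the query in the URL is processed with urlencode, base64 encoding, and similar methods, so it cannot be collected directly. This session only covers how to analyze the URL.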


Tools needed:

Fiddler, Firefox (with the Firebug add-on).


Reference code (the query string after each decoding step):

 

Page 2 (curPage=2):
base64_decode
occIDList%3D604%252c600%26occParentIDList%3D604%252c600%26myLocIDList%3D30000%26myLocParentIDList%3D30000%26companyTypeIDList%3D%26isInterView%3D1%26from%3Dsearch%26IsModel%3Dfalse%26positionName%3D%25u9879%25u76ee%26prj%3Dwww1%26sj%3D1%26curPage%3D2%26pageSize%3D20%26recordCount%3D2000%26orderField%3D%26order%3DDESC
urldecode
occIDList=604%2c600&occParentIDList=604%2c600&myLocIDList=30000&myLocParentIDList=30000&companyTypeIDList=&isInterView=1&from=search&IsModel=false&positionName=%u9879%u76ee&prj=www1&sj=1&curPage=2&pageSize=20&recordCount=2000&orderField=&order=DESC
urldecode
occIDList=604,600&occParentIDList=604,600&myLocIDList=30000&myLocParentIDList=30000&companyTypeIDList=&isInterView=1&from=search&IsModel=false&positionName=项目&prj=www1&sj=1&curPage=2&pageSize=20&recordCount=2000&orderField=&order=DESC
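
The three steps above (base64_decode, urldecode, urldecode) can be sketched in Python as follows. This is only a minimal sketch: the function names `unescape_unicode` and `decode_token` are made up for illustration, and it assumes the token is standard base64, which the post does not show. Note that the second urldecode has to handle the non-standard `%uXXXX` form (as in `%u9879%u76ee`), which `urllib.parse.unquote` leaves untouched, so it is decoded separately with a regular expression.

```python
import base64
import re
from urllib.parse import unquote


def unescape_unicode(text: str) -> str:
    """Decode %uXXXX sequences first, then ordinary %XX escapes."""
    text = re.sub(
        r"%u([0-9a-fA-F]{4})",
        lambda m: chr(int(m.group(1), 16)),
        text,
    )
    return unquote(text)


def decode_token(token: str) -> str:
    """Apply base64_decode, urldecode, urldecode in that order.

    Assumption: `token` is standard base64 of an ASCII string.
    """
    step1 = base64.b64decode(token).decode("ascii")  # base64_decode
    step2 = unquote(step1)                           # first urldecode
    return unescape_unicode(step2)                   # second urldecode
```

Passing the base64 value captured with Fiddler or Firebug to `decode_token` should yield the readable query string, in which `curPage` is the page number.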




Page 1 (curPage=1):

occIDList%3D604%252c600%26occParentIDList%3D604%252c600%26myLocIDList%3D30000%26myLocParentIDList%3D30000%26companyTypeIDList%3D%26isInterView%3D1%26from%3Dsearch%26IsModel%3Dfalse%26positionName%3D%25u9879%25u76ee%26prj%3Dwww1%26sj%3D1%26curPage%3D1%26pageSize%3D20%26recordCount%3D2000%26orderField%3D%26order%3DDESC

occIDList=604%2c600&occParentIDList=604%2c600&myLocIDList=30000&myLocParentIDList=30000&companyTypeIDList=&isInterView=1&from=search&IsModel=false&positionName=%u9879%u76ee&prj=www1&sj=1&curPage=1&pageSize=20&recordCount=2000&orderField=&order=DESC

occIDList=604%2c600&occParentIDList=604%2c600&myLocIDList=30000&myLocParentIDList=30000&companyTypeIDList=&isInterView=1&from=search&IsModel=false&positionName=%e9%a1%b9%e7%9b%ae&prj=www1

occIDList%3d604%2c600%26occParentIDList%3d604%2c600%26myLocIDList%3d30000%26myLocParentIDList%3d30000%26companyTypeIDList%3d%26isInterView%3d1%26from%3dsearch%26IsModel%3dfalse%26positionName%3d%e9%a1%b9%e7%9b%ae%26prj%3dwww1%26curPage%3d1%26pageSize%3d20%26recordCount%3d2000%26orderField%3d%26order%3dDesc%26sj%3d1


occIDList=604,600&occParentIDList=604,600&myLocIDList=30000&myLocParentIDList=30000&companyTypeIDList=&isInterView=1&from=search&IsModel=false&positionName=项目&prj=www1&sj=1&curPage=1&pageSize=20&recordCount=2000&orderField=&order=DESC
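
Since pages 1 and 2 differ only in `curPage`, a URL list could in principle be generated by running the chain in reverse for each page number. The sketch below is an assumption-laden illustration, not the site's confirmed algorithm: `encode_value`, `encode_query`, and `build_token` are hypothetical names, the inner encoding is modeled on the `%u9879` / `%2c` form seen in the samples (some samples above use the UTF-8 form `%e9%a1%b9` instead), and how the resulting token is placed in the final URL is not shown in the post.

```python
import base64
from urllib.parse import quote


def encode_value(value: str) -> str:
    """Inner layer: keep ASCII letters/digits and '-_.', write other ASCII
    as lowercase %xx and non-ASCII as lowercase %uxxxx (as in the samples)."""
    out = []
    for ch in value:
        if ch.isascii() and (ch.isalnum() or ch in "-_."):
            out.append(ch)
        elif ord(ch) < 128:
            out.append("%{:02x}".format(ord(ch)))
        else:
            out.append("%u{:04x}".format(ord(ch)))
    return "".join(out)


def encode_query(params):
    """Join key=value pairs, then urlencode the whole string once more."""
    inner = "&".join(k + "=" + encode_value(v) for k, v in params)
    return quote(inner, safe="")  # '=' -> %3D, '&' -> %26, '%' -> %25


def build_token(params):
    """Outer layer: base64 of the doubly encoded query (assumed standard base64)."""
    return base64.b64encode(encode_query(params).encode("ascii")).decode("ascii")


def tokens_for_pages(base_params, first, last):
    """Yield one token per page by appending curPage to the base parameters."""
    for page in range(first, last + 1):
        yield build_token(list(base_params) + [("curPage", str(page))])
```

The parameter order and the remaining fields (`pageSize`, `recordCount`, `order`, and so on) would need to be copied from a captured request, since the server may be sensitive to them.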
 


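Comments:

jianzhang5555
2011-01-06 15:20
Rather speechless...
大哥
2010-11-27 17:56
Could you explain this more clearly?
小文
2010-11-27 18:57
@大哥: This was a training session that has already taken place. The material has not been written up yet.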

Last updated: 2017-05-09 01:05:54
