[
  {
    "path": ".gitignore",
    "content": ".idea/\n.idea\n*_result.txt\n.DS_Store\n*.txt\n"
  },
  {
    "path": "README.md",
    "content": "# YiSpider\r\nA distributed spider platform\r\n\r\n## 介绍\r\n\r\n一款分布式爬虫平台，帮助你更好的管理和开发爬虫。\r\n内置一套爬虫定义规则（模版），可使用模版快速定义爬虫，也可当作框架手动开发爬虫\r\n\r\n## 计划\r\n- [x] 增加了更多例子。\r\n- [x] 内置实现了基于redis的调度器。\r\n- [ ] 正在准备管理网页端部分的制作，敬请期待。\r\n\r\n## 架构\r\n\r\n目前框架分为2个部分:  \r\n#### 1.爬虫部分（spider节点）:\r\n\r\n内部结构参考python scrapy框架，主要由 schedule,page process,pipline 4个部分组成，单个爬虫单独调度器，单独上下文管理,目前内置2中pipline的方式，控制台和文件,节点信息注册在etcd上用于manage节点发现。 \r\n\r\n- `core`:负责爬虫生命周期、上下文的管理，负责爬虫的运行。\r\n- `schedule`:负责爬虫请求的调度。(基于 channel 或 redis 的调度器)\r\n- `process`：负责请求结果的处理。\r\n- `pipline`： 结果的输出输出到不同渠道,如控制台，文件，消息队列，数据库等等\r\n- `register`：负责服务的注册（目前只支持etcd)\r\n- `http`: 提供一些http接口\r\n\r\n#### 2.管理部分（manage节点）:  \r\n负责spider节点的管理，用etcd进行spider节点的发现。通过http与spider节点通讯。\r\n\r\n\r\n## 开始使用\r\n\r\n### 例子\r\nexample-spider包内有大量实例\r\n- 哔哩哔哩\r\n- 嘀哩嘀哩\r\n- 豆瓣电影\r\n- 好奇心日报\r\n- 京东\r\n- 穷游\r\n- 糗百\r\n- 推库\r\n- 网易云音乐\r\n\r\n### 请求介绍\r\n\r\n初始请求（Request）Url有2种语法糖方式,用于简便易用：\r\n#### 1. http://xxx/xxx/{begin-end,offset}\r\n```\r\nstart = 0 20 40 ... 10000\r\nurl = https://movie.douban.com/j/new_search_subjects?sort=T&range=0,10&tags=&start={0-10000,20}\r\n```\r\n#### 2. http://xxx/xxx/{aa|bb|cc}\r\n```\r\nstart = 0 20 40 60\r\nurl = https://movie.douban.com/j/new_search_subjects?sort=T&range=0,10&tags=&start={0|20|40|60}\r\n\r\n```\r\n#### 3.http://www.dilidili.wang{$href}  (AddQueue特有)\r\n\r\n```\r\n如果 href = \"/abc\" (href是process解析出的参数)\r\nurl = http://www.dilidili.wang{$href}\r\nurl = http://www.dilidili.wang/abc\r\nurl = https://movie.douban.com/j/new_search_subjects?sort=T&range=0,10&tags=&start={0-$count,20}\r\n等等\r\n\r\n```\r\n\r\n\r\n### 实例\r\n\r\n#### 1. 
JSON template\r\n```\r\n# Call via the HTTP API\r\ncurl -d '{\"id\":\"douban-movie\",\"Name\":\"douban-movie\",\"request\":[{\"url\":\"https://movie.douban.com/j/new_search_subjects?sort=T\\u0026range=0,10\\u0026tags=\\u0026start={0-100,20}\",\"method\":\"get\",\"type\":\"\",\"data\":null,\"header\":null,\"cookies\":{\"url\":\"\",\"data\":\"\"},\"process_name\":\"movie\"}],\"process\":[{\"name\":\"movie\",\"reg_url\":null,\"type\":\"json\",\"template_rule\":{\"Rule\":null},\"json_rule\":{\"Rule\":{\"casts\":\"casts\",\"cover\":\"cover\",\"id\":\"id\",\"node\":\"array|data\",\"rate\":\"rate\",\"star\":\"star\",\"title\":\"title\",\"url\":\"url\"}},\"add_queue\":null}],\"pipline\":\"file\",\"depth\":0,\"end_count\":0}' \"http://127.0.0.1:7774/task/addAndRun\"\r\n```\r\n\r\nDouban Movie template\r\n```\r\n {\r\n    \"id\": \"douban-movie\",\r\n    \"Name\": \"douban-movie\",\r\n    \"request\": [\r\n        {\r\n            \"url\": \"https://movie.douban.com/j/new_search_subjects?sort=T&range=0,10&tags=&start={0-10,20}\",\r\n            \"method\": \"get\",\r\n            \"process_name\": \"movie\"\r\n        }\r\n    ],\r\n    \"process\": [\r\n        {\r\n            \"name\": \"movie\",\r\n            \"type\": \"json\",\r\n            \"json_rule\": {\r\n                \"Rule\": {\r\n                    \"casts\": \"casts\",\r\n                    \"cover\": \"cover\",\r\n                    \"id\": \"id\",\r\n                    \"node\": \"array|data\",\r\n                    \"rate\": \"rate\",\r\n                    \"star\": \"star\",\r\n                    \"title\": \"title\",\r\n                    \"url\": \"url\"\r\n                }\r\n            },\r\n            \"add_queue\": null\r\n        }\r\n    ],\r\n    \"pipline\": \"file\",\r\n    \"depth\": 0,\r\n    \"end_count\": 0\r\n}\r\n\r\n```\r\nDilidili template\r\n``` \r\n   {\r\n    \"id\": \"dilidili\",\r\n    \"Name\": \"dilidili\",\r\n    \"request\": [\r\n        {\r\n            \"url\": 
\"http://www.dilidili.wang/{gaoxiao|kehuan|yundong|danmei|zhiyuxi|luoli|zhenren|zhuangbi|youxi|tuili|qingchun|kongbu|jizhan|rexue|qingxiaoshuo|maoxian|hougong|qihuan|tongnian|lianai|meishaonv|lizhi|baihe|paomianfan|yinv}/\",\r\n            \"method\": \"get\",\r\n            \"process_name\": \"animelist\"\r\n        }\r\n    ],\r\n    \"process\": [\r\n        {\r\n            \"name\": \"animelist\",\r\n            \"type\": \"template\",\r\n            \"template_rule\": {\r\n                \"Rule\": {\r\n                    \"content\": \"text|dd div\",\r\n                    \"desc\": \"text|dd p\",\r\n                    \"href\": \"attr.href|dt a\",\r\n                    \"img\": \"attr.src|dt a img\",\r\n                    \"node\": \"array|.anime_list dl\",\r\n                    \"title\": \"text|dd h3 a\"\r\n                }\r\n            },\r\n            \"add_queue\": [\r\n                {\r\n                    \"url\": \"http://www.dilidili.wang{href}\",\r\n                    \"method\": \"get\",\r\n                    \"process_name\": \"animeinfo\"\r\n                }\r\n            ]\r\n        },\r\n        {\r\n            \"name\": \"animeinfo\",\r\n            \"type\": \"template\",\r\n            \"template_rule\": {\r\n                \"Rule\": {\r\n                    \"episode\": \"texts|.time_con .swiper-slide .clear li a em\",\r\n                    \"episode-link\": \"attrs.href|.time_con .swiper-slide .clear li a\",\r\n                    \"title\": \"text|.detail dl dd h1\"\r\n                }\r\n            },\r\n            \"add_queue\": [\r\n                {\r\n                    \"url\": \"{episode-link}\",\r\n                    \"method\": \"get\",\r\n                    \"process_name\": \"episodeinfo\"\r\n                }\r\n            ]\r\n        },\r\n        {\r\n            \"name\": \"episodeinfo\",\r\n            \"reg_url\": null,\r\n            \"type\": \"template\",\r\n            \"template_rule\": 
{\r\n                \"Rule\": {\r\n                    \"player\": \"attr.src|.player_main iframe\",\r\n                    \"title\": \"text|#intro2 h1\",\r\n                    \"url\": \"attr.href|link[rel=\\\"canonical\\\"]\"\r\n                }\r\n            },\r\n            \"add_queue\": null\r\n        }\r\n    ],\r\n    \"pipline\": \"file\",\r\n    \"depth\": 0,\r\n    \"end_count\": 0\r\n}\r\n```\r\n\r\n#### 2. Template defined in code\r\nDouban Movie\r\n```\r\npackage main\r\n\r\nimport (\r\n\t\"YiSpider/spider/model\"\r\n\t\"YiSpider/spider\"\r\n\tspider2 \"YiSpider/spider/spider\"\r\n)\r\n\r\nfunc main(){\r\n\r\n\ttask := &model.Task{\r\n\t\tId:\"douban-movie\",\r\n\t\tName:\"douban-movie\",\r\n\t\tRequest:[]*model.Request{\r\n\t\t\t{\r\n\t\t\t\tMethod:\"get\",\r\n\t\t\t\tUrl:\"https://movie.douban.com/j/new_search_subjects?sort=T&range=0,10&tags=&start={0-10000,20}\",\r\n\t\t\t\tProcessName:\"movie\",\r\n\t\t\t},\r\n\t\t},\r\n\t\tProcess: []model.Process{\r\n\t\t\t{\r\n\t\t\t\tName:\"movie\",\r\n\t\t\t\tType:\"json\",\r\n\t\t\t\tJsonRule:model.JsonRule{\r\n\t\t\t\t\tRule:map[string]string{\r\n\t\t\t\t\t\t\"node\":\"array|data\",\r\n\t\t\t\t\t\t\"rate\":\"rate\",\r\n\t\t\t\t\t\t\"star\":\"star\",\r\n\t\t\t\t\t\t\"id\":\"id\",\r\n\t\t\t\t\t\t\"url\":\"url\",\r\n\t\t\t\t\t\t\"title\":\"title\",\r\n\t\t\t\t\t\t\"cover\":\"cover\",\r\n\t\t\t\t\t\t\"casts\":\"casts\",\r\n\t\t\t\t\t},\r\n\t\t\t\t},\r\n\t\t\t},\r\n\t\t},\r\n\t\tPipline:\"file\",\r\n\t}\r\n\r\n\tapp := spider.New()\r\n\tapp.AddSpider(spider2.InitWithTask(task))\r\n\tapp.Run()\r\n}\r\n```\r\nDilidili anime\r\n```\r\npackage main\r\n\r\nimport (\r\n\t\"YiSpider/spider/model\"\r\n\t\"YiSpider/spider\"\r\n\tspider2 \"YiSpider/spider/spider\"\r\n)\r\n\r\nfunc main(){\r\n\r\n\ttask := 
&model.Task{\r\n\t\tId:\"dilidili\",\r\n\t\tName:\"dilidili\",\r\n\t\tRequest:[]*model.Request{\r\n\t\t\t{\r\n\t\t\t\tMethod:\"get\",\r\n\t\t\t\tUrl:\"http://www.dilidili.wang/{gaoxiao|kehuan|yundong|danmei|zhiyuxi|luoli|zhenren|zhuangbi|youxi|tuili|qingchun|kongbu|jizhan|rexue|qingxiaoshuo|maoxian|hougong|qihuan|tongnian|lianai|meishaonv|lizhi|baihe|paomianfan|yinv}/\",\r\n\t\t\t\tProcessName:\"animelist\",\r\n\t\t\t},\r\n\t\t},\r\n\t\tProcess: []model.Process{\r\n\t\t\t{\r\n\t\t\t\tName:\"animelist\",\r\n\t\t\t\tType:\"template\",\r\n\t\t\t\tTemplateRule:model.TemplateRule{\r\n\t\t\t\t\tRule:map[string]string{\r\n\t\t\t\t\t\t\"node\":\"array|.anime_list dl\",\r\n\t\t\t\t\t\t\"img\":\"attr.src|dt a img\",\r\n\t\t\t\t\t\t\"title\":\"text|dd h3 a\",\r\n\t\t\t\t\t\t\"href\":\"attr.href|dt a\",\r\n\t\t\t\t\t\t\"content\":\"text|dd div\",\r\n\t\t\t\t\t\t\"desc\":\"text|dd p\",\r\n\t\t\t\t\t},\r\n\t\t\t\t},\r\n\t\t\t\tAddQueue:[]*model.Request{\r\n\t\t\t\t\t{\r\n\t\t\t\t\t\tMethod:      \"get\",\r\n\t\t\t\t\t\tUrl:         \"http://www.dilidili.wang{$href}\",\r\n\t\t\t\t\t\tProcessName: \"animeinfo\",\r\n\t\t\t\t\t},\r\n\t\t\t\t},\r\n\t\t\t},\r\n\t\t\t{\r\n\t\t\t\tName:\"animeinfo\",\r\n\t\t\t\tType:\"template\",\r\n\t\t\t\tTemplateRule:model.TemplateRule{\r\n\t\t\t\t\tRule:map[string]string{\r\n\t\t\t\t\t\t\"episode\":\"texts|.time_con .swiper-slide .clear li a em\",\r\n\t\t\t\t\t\t\"title\":\"text|.detail dl dd h1\",\r\n\t\t\t\t\t\t\"episode-link\":\"attrs.href|.time_con .swiper-slide .clear li a\",\r\n\t\t\t\t\t},\r\n\t\t\t\t},\r\n\t\t\t\tAddQueue:[]*model.Request{\r\n\t\t\t\t\t{\r\n\t\t\t\t\t\tMethod:      \"get\",\r\n\t\t\t\t\t\tUrl:         \"{$episode-link}\",\r\n\t\t\t\t\t\tProcessName: 
\"episodeinfo\",\r\n\t\t\t\t\t},\r\n\t\t\t\t},\r\n\t\t\t},\r\n\t\t\t{\r\n\t\t\t\tName:\"episodeinfo\",\r\n\t\t\t\tType:\"template\",\r\n\t\t\t\tTemplateRule:model.TemplateRule{\r\n\t\t\t\t\tRule:map[string]string{\r\n\t\t\t\t\t\t\"url\":\"attr.href|link[rel=\\\"canonical\\\"]\",\r\n\t\t\t\t\t\t\"title\":\"text|#intro2 h1\",\r\n\t\t\t\t\t\t\"player\":\"attr.src|.player_main iframe\",\r\n\t\t\t\t\t},\r\n\t\t\t\t},\r\n\t\t\t},\r\n\t\t},\r\n\r\n\t\tPipline:\"file\",\r\n\t}\r\n\r\n\r\n\tapp := spider.New()\r\n\tapp.AddSpider(spider2.InitWithTask(task))\r\n\tapp.Run()\r\n\r\n}\r\n```\r\n\r\n3. 纯代码编写\r\n```\r\ntype Movies struct {\r\n\tDatas []Movie `json:\"data\"`\r\n}\r\ntype Movie struct {\r\n\tRate  string   `json:\"rate\"`\r\n\tStart string   `json:\"start\"`\r\n\tId    string   `json:\"id\"`\r\n\tUrl   string   `json:\"url\"`\r\n\tTitle string   `json:\"title\"`\r\n\tCover string   `json:\"cover\"`\r\n\tCasts []string `json:\"casts\"`\r\n}\r\n\r\ntype PageProcess struct{}\r\n\r\nfunc (p *PageProcess) Process(context model.Context) (*model.Page, error) {\r\n\tmovies := Movies{}\r\n\tif err := json.Unmarshal(context.Body, &movies); err != nil {\r\n\t\treturn nil, err\r\n\t}\r\n\tpage := &model.Page{}\r\n\tfor _, movie := range movies.Datas {\r\n\t\tpage.AddResult(movie)\r\n\t}\r\n\treturn page, nil\r\n}\r\n\r\nfunc main() {\r\n\tsp := &spider2.Spider{}\r\n\tsp.Name = \"douban-movie-code\"\r\n\tsp.Id = \"douban-movie-code\"\r\n\tsp.Requests = []*model.Request{\r\n\t\t{\r\n\t\t\tMethod:      \"get\",\r\n\t\t\tUrl:         \"https://movie.douban.com/j/new_search_subjects?sort=T&range=0,10&tags=&start={0-10000,20}\",\r\n\t\t\tProcessName: \"movie\",\r\n\t\t},\r\n\t}\r\n\tsp.AddProcess(\"movie\", &PageProcess{})\r\n\tsp.Pipline = file.NewFilePipline(\"./\")\r\n\r\n\tapp := spider.New()\r\n\tapp.AddSpider(sp)\r\n\tapp.Run()\r\n}\r\n\r\n```\r\n"
  },
  {
    "path": "example-spider/bilibili/conf.json",
    "content": "{\n  \"name\":\"bilibili_spider\",\n  \"version\":\"0.01\",\n  \"work_num\": 10,\n  \"max_wait_num\":12000,\n  \"http_addr\":\"127.0.0.1:7774\",\n  \"etcd\":[\"http://127.0.0.1:2379\"],\n\n  \"schedule\":\"redis\",\n  \"redis_addr\":\"127.0.0.1:6379\",\n\n  \"mysql\":\"root:123456@tcp(127.0.0.1:3306)/auto_db?charset=utf8\"\n\n}"
  },
  {
    "path": "example-spider/bilibili/main.go",
    "content": "package main\n\nimport (\n\t\"YiSpider/spider\"\n\t\"YiSpider/spider/model\"\n\tspider2 \"YiSpider/spider/spider\"\n)\n\nfunc main() {\n\n\ttask := &model.Task{\n\t\tId:   \"bilibili\",\n\t\tName: \"bilibili\",\n\t\tRequest: []*model.Request{\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         \"http://bangumi.bilibili.com/web_api/season/index_global?page={1-147,1}&page_size=20&version=0&is_finish=0&start_year=0&tag_id=&index_type=1&index_sort=0&quarter=0\",\n\t\t\t\tProcessName: \"animelist\",\n\t\t\t},\n\t\t},\n\t\tProcess: []model.Process{\n\t\t\t{\n\t\t\t\tName: \"animelist\",\n\t\t\t\tType: \"json\",\n\t\t\t\tJsonRule: model.JsonRule{\n\t\t\t\t\tRule: map[string]string{\n\t\t\t\t\t\t\"node\":        \"array|result.list\",\n\t\t\t\t\t\t\"img\":         \"cover\",\n\t\t\t\t\t\t\"favorites\":   \"favorites\",\n\t\t\t\t\t\t\"title\":       \"title\",\n\t\t\t\t\t\t\"total_count\": \"total_count\",\n\t\t\t\t\t\t\"update_time\": \"update_time\",\n\t\t\t\t\t\t\"url\":         \"url\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t\tAddQueue: nil,\n\t\t\t},\n\t\t},\n\n\t\tPipline: \"mysql\",\n\t}\n\n\tapp := spider.New()\n\tapp.AddSpider(spider2.InitWithTask(task))\n\tapp.Run()\n\n}\n\n/*\n   dilidili json\n\n   {\n    \"id\": \"dilidili\",\n    \"Name\": \"dilidili\",\n    \"request\": [\n        {\n            \"url\": \"http://www.dilidili.wang/{gaoxiao|kehuan|yundong|danmei|zhiyuxi|luoli|zhenren|zhuangbi|youxi|tuili|qingchun|kongbu|jizhan|rexue|qingxiaoshuo|maoxian|hougong|qihuan|tongnian|lianai|meishaonv|lizhi|baihe|paomianfan|yinv}/\",\n            \"method\": \"get\",\n            \"type\": \"\",\n            \"data\": null,\n            \"header\": null,\n            \"cookies\": {\n                \"url\": \"\",\n                \"data\": \"\"\n            },\n            \"process_name\": \"animelist\"\n        }\n    ],\n    \"process\": [\n        {\n            \"name\": \"animelist\",\n            \"reg_url\": null,\n            \"type\": 
\"template\",\n            \"template_rule\": {\n                \"Rule\": {\n                    \"content\": \"text|dd div\",\n                    \"desc\": \"text|dd p\",\n                    \"href\": \"attr.href|dt a\",\n                    \"img\": \"attr.src|dt a img\",\n                    \"node\": \"array|.anime_list dl\",\n                    \"title\": \"text|dd h3 a\"\n                }\n            },\n            \"json_rule\": {\n                \"Rule\": null\n            },\n            \"add_queue\": [\n                {\n                    \"url\": \"http://www.dilidili.wang{href}\",\n                    \"method\": \"get\",\n                    \"type\": \"\",\n                    \"data\": null,\n                    \"header\": null,\n                    \"cookies\": {\n                        \"url\": \"\",\n                        \"data\": \"\"\n                    },\n                    \"process_name\": \"animeinfo\"\n                }\n            ]\n        },\n        {\n            \"name\": \"animeinfo\",\n            \"reg_url\": null,\n            \"type\": \"template\",\n            \"template_rule\": {\n                \"Rule\": {\n                    \"episode\": \"texts|.time_con .swiper-slide .clear li a em\",\n                    \"episode-link\": \"attrs.href|.time_con .swiper-slide .clear li a\",\n                    \"title\": \"text|.detail dl dd h1\"\n                }\n            },\n            \"json_rule\": {\n                \"Rule\": null\n            },\n            \"add_queue\": [\n                {\n                    \"url\": \"{episode-link}\",\n                    \"method\": \"get\",\n                    \"type\": \"\",\n                    \"data\": null,\n                    \"header\": null,\n                    \"cookies\": {\n                        \"url\": \"\",\n                        \"data\": \"\"\n                    },\n                    \"process_name\": \"episodeinfo\"\n                
}\n            ]\n        },\n        {\n            \"name\": \"episodeinfo\",\n            \"reg_url\": null,\n            \"type\": \"template\",\n            \"template_rule\": {\n                \"Rule\": {\n                    \"player\": \"attr.src|.player_main iframe\",\n                    \"title\": \"text|#intro2 h1\",\n                    \"url\": \"attr.href|link[rel=\\\"canonical\\\"]\"\n                }\n            },\n            \"json_rule\": {\n                \"Rule\": null\n            },\n            \"add_queue\": null\n        }\n    ],\n    \"pipline\": \"file\",\n    \"depth\": 0,\n    \"end_count\": 0\n}\n\n{\"id\":\"dilidili\",\"Name\":\"dilidili\",\"request\":[{\"url\":\"http://www.dilidili.wang/{gaoxiao|kehuan|yundong|danmei|zhiyuxi|luoli|zhenren|zhuangbi|youxi|tuili|qingchun|kongbu|jizhan|rexue|qingxiaoshuo|maoxian|hougong|qihuan|tongnian|lianai|meishaonv|lizhi|baihe|paomianfan|yinv}/\",\"method\":\"get\",\"type\":\"\",\"data\":null,\"header\":null,\"cookies\":{\"url\":\"\",\"data\":\"\"},\"process_name\":\"animelist\"}],\"process\":[{\"name\":\"animelist\",\"reg_url\":null,\"type\":\"template\",\"template_rule\":{\"Rule\":{\"content\":\"text|dd div\",\"desc\":\"text|dd p\",\"href\":\"attr.href|dt a\",\"img\":\"attr.src|dt a img\",\"node\":\"array|.anime_list dl\",\"title\":\"text|dd h3 a\"}},\"json_rule\":{\"Rule\":null},\"add_queue\":[{\"url\":\"http://www.dilidili.wang{href}\",\"method\":\"get\",\"type\":\"\",\"data\":null,\"header\":null,\"cookies\":{\"url\":\"\",\"data\":\"\"},\"process_name\":\"animeinfo\"}]},{\"name\":\"animeinfo\",\"reg_url\":null,\"type\":\"template\",\"template_rule\":{\"Rule\":{\"episode\":\"texts|.time_con .swiper-slide .clear li a em\",\"episode-link\":\"attrs.href|.time_con .swiper-slide .clear li a\",\"title\":\"text|.detail dl dd 
h1\"}},\"json_rule\":{\"Rule\":null},\"add_queue\":[{\"url\":\"{episode-link}\",\"method\":\"get\",\"type\":\"\",\"data\":null,\"header\":null,\"cookies\":{\"url\":\"\",\"data\":\"\"},\"process_name\":\"episodeinfo\"}]},{\"name\":\"episodeinfo\",\"reg_url\":null,\"type\":\"template\",\"template_rule\":{\"Rule\":{\"player\":\"attr.src|.player_main iframe\",\"title\":\"text|#intro2 h1\",\"url\":\"attr.href|link[rel=\\\"canonical\\\"]\"}},\"json_rule\":{\"Rule\":null},\"add_queue\":null}],\"pipline\":\"file\",\"depth\":0,\"end_count\":0}\n\n*/\n"
  },
  {
    "path": "example-spider/dilidili/conf.json",
    "content": "{\n  \"name\":\"dilidili_spider\",\n  \"version\":\"0.01\",\n  \"work_num\": 100,\n  \"max_wait_num\":1000000,\n  \"http_addr\":\"127.0.0.1:7774\",\n  \"etcd\":[\"http://127.0.0.1:2379\"],\n\n  \"schedule\":\"redis\",\n  \"redis_addr\":\"127.0.0.1:6379\"\n}"
  },
  {
    "path": "example-spider/dilidili/main.go",
    "content": "package main\n\nimport (\n\t\"YiSpider/spider\"\n\t\"YiSpider/spider/model\"\n\tspider2 \"YiSpider/spider/spider\"\n)\n\nfunc main() {\n\n\ttask := &model.Task{\n\t\tId:   \"dilidili\",\n\t\tName: \"dilidili\",\n\t\tRequest: []*model.Request{\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         \"http://www.dilidili.wang/{gaoxiao|kehuan|yundong|danmei|zhiyuxi|luoli|zhenren|zhuangbi|youxi|tuili|qingchun|kongbu|jizhan|rexue|qingxiaoshuo|maoxian|hougong|qihuan|tongnian|lianai|meishaonv|lizhi|baihe|paomianfan|yinv}/\",\n\t\t\t\tProcessName: \"animelist\",\n\t\t\t},\n\t\t},\n\t\tProcess: []model.Process{\n\t\t\t{\n\t\t\t\tName: \"animelist\",\n\t\t\t\tType: \"template\",\n\t\t\t\tTemplateRule: model.TemplateRule{\n\t\t\t\t\tRule: map[string]string{\n\t\t\t\t\t\t\"node\":    \"array|.anime_list dl\",\n\t\t\t\t\t\t\"img\":     \"attr.src|dt a img\",\n\t\t\t\t\t\t\"title\":   \"text|dd h3 a\",\n\t\t\t\t\t\t\"href\":    \"attr.href|dt a\",\n\t\t\t\t\t\t\"content\": \"text|dd div\",\n\t\t\t\t\t\t\"desc\":    \"text|dd p\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t\tAddQueue: []*model.Request{\n\t\t\t\t\t{\n\t\t\t\t\t\tMethod:      \"get\",\n\t\t\t\t\t\tUrl:         \"http://www.dilidili.wang{$href}\",\n\t\t\t\t\t\tProcessName: \"animeinfo\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t},\n\t\t\t{\n\t\t\t\tName: \"animeinfo\",\n\t\t\t\tType: \"template\",\n\t\t\t\tTemplateRule: model.TemplateRule{\n\t\t\t\t\tRule: map[string]string{\n\t\t\t\t\t\t\"episode\":      \"texts|.time_con .swiper-slide .clear li a em\",\n\t\t\t\t\t\t\"title\":        \"text|.detail dl dd h1\",\n\t\t\t\t\t\t\"episode-link\": \"attrs.href|.time_con .swiper-slide .clear li a\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t\tAddQueue: []*model.Request{\n\t\t\t\t\t{\n\t\t\t\t\t\tMethod:      \"get\",\n\t\t\t\t\t\tUrl:         \"{$episode-link}\",\n\t\t\t\t\t\tProcessName: \"episodeinfo\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t},\n\t\t\t{\n\t\t\t\tName: \"episodeinfo\",\n\t\t\t\tType: 
\"template\",\n\t\t\t\tTemplateRule: model.TemplateRule{\n\t\t\t\t\tRule: map[string]string{\n\t\t\t\t\t\t\"url\":    \"attr.href|link[rel=\\\"canonical\\\"]\",\n\t\t\t\t\t\t\"title\":  \"text|#intro2 h1\",\n\t\t\t\t\t\t\"player\": \"attr.src|.player_main iframe\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t},\n\t\t},\n\n\t\tPipline: \"file\",\n\t}\n\n\tapp := spider.New()\n\tapp.AddSpider(spider2.InitWithTask(task))\n\tapp.Run()\n\n}\n\n/*\n   dilidili json\n\n   {\n    \"id\": \"dilidili\",\n    \"Name\": \"dilidili\",\n    \"request\": [\n        {\n            \"url\": \"http://www.dilidili.wang/{gaoxiao|kehuan|yundong|danmei|zhiyuxi|luoli|zhenren|zhuangbi|youxi|tuili|qingchun|kongbu|jizhan|rexue|qingxiaoshuo|maoxian|hougong|qihuan|tongnian|lianai|meishaonv|lizhi|baihe|paomianfan|yinv}/\",\n            \"method\": \"get\",\n            \"type\": \"\",\n            \"data\": null,\n            \"header\": null,\n            \"cookies\": {\n                \"url\": \"\",\n                \"data\": \"\"\n            },\n            \"process_name\": \"animelist\"\n        }\n    ],\n    \"process\": [\n        {\n            \"name\": \"animelist\",\n            \"reg_url\": null,\n            \"type\": \"template\",\n            \"template_rule\": {\n                \"Rule\": {\n                    \"content\": \"text|dd div\",\n                    \"desc\": \"text|dd p\",\n                    \"href\": \"attr.href|dt a\",\n                    \"img\": \"attr.src|dt a img\",\n                    \"node\": \"array|.anime_list dl\",\n                    \"title\": \"text|dd h3 a\"\n                }\n            },\n            \"json_rule\": {\n                \"Rule\": null\n            },\n            \"add_queue\": [\n                {\n                    \"url\": \"http://www.dilidili.wang{href}\",\n                    \"method\": \"get\",\n                    \"type\": \"\",\n                    \"data\": null,\n                    \"header\": null,\n                  
  \"cookies\": {\n                        \"url\": \"\",\n                        \"data\": \"\"\n                    },\n                    \"process_name\": \"animeinfo\"\n                }\n            ]\n        },\n        {\n            \"name\": \"animeinfo\",\n            \"reg_url\": null,\n            \"type\": \"template\",\n            \"template_rule\": {\n                \"Rule\": {\n                    \"episode\": \"texts|.time_con .swiper-slide .clear li a em\",\n                    \"episode-link\": \"attrs.href|.time_con .swiper-slide .clear li a\",\n                    \"title\": \"text|.detail dl dd h1\"\n                }\n            },\n            \"json_rule\": {\n                \"Rule\": null\n            },\n            \"add_queue\": [\n                {\n                    \"url\": \"{episode-link}\",\n                    \"method\": \"get\",\n                    \"type\": \"\",\n                    \"data\": null,\n                    \"header\": null,\n                    \"cookies\": {\n                        \"url\": \"\",\n                        \"data\": \"\"\n                    },\n                    \"process_name\": \"episodeinfo\"\n                }\n            ]\n        },\n        {\n            \"name\": \"episodeinfo\",\n            \"reg_url\": null,\n            \"type\": \"template\",\n            \"template_rule\": {\n                \"Rule\": {\n                    \"player\": \"attr.src|.player_main iframe\",\n                    \"title\": \"text|#intro2 h1\",\n                    \"url\": \"attr.href|link[rel=\\\"canonical\\\"]\"\n                }\n            },\n            \"json_rule\": {\n                \"Rule\": null\n            },\n            \"add_queue\": null\n        }\n    ],\n    \"pipline\": \"file\",\n    \"depth\": 0,\n    \"end_count\": 
0\n}\n\n{\"id\":\"dilidili\",\"Name\":\"dilidili\",\"request\":[{\"url\":\"http://www.dilidili.wang/{gaoxiao|kehuan|yundong|danmei|zhiyuxi|luoli|zhenren|zhuangbi|youxi|tuili|qingchun|kongbu|jizhan|rexue|qingxiaoshuo|maoxian|hougong|qihuan|tongnian|lianai|meishaonv|lizhi|baihe|paomianfan|yinv}/\",\"method\":\"get\",\"type\":\"\",\"data\":null,\"header\":null,\"cookies\":{\"url\":\"\",\"data\":\"\"},\"process_name\":\"animelist\"}],\"process\":[{\"name\":\"animelist\",\"reg_url\":null,\"type\":\"template\",\"template_rule\":{\"Rule\":{\"content\":\"text|dd div\",\"desc\":\"text|dd p\",\"href\":\"attr.href|dt a\",\"img\":\"attr.src|dt a img\",\"node\":\"array|.anime_list dl\",\"title\":\"text|dd h3 a\"}},\"json_rule\":{\"Rule\":null},\"add_queue\":[{\"url\":\"http://www.dilidili.wang{href}\",\"method\":\"get\",\"type\":\"\",\"data\":null,\"header\":null,\"cookies\":{\"url\":\"\",\"data\":\"\"},\"process_name\":\"animeinfo\"}]},{\"name\":\"animeinfo\",\"reg_url\":null,\"type\":\"template\",\"template_rule\":{\"Rule\":{\"episode\":\"texts|.time_con .swiper-slide .clear li a em\",\"episode-link\":\"attrs.href|.time_con .swiper-slide .clear li a\",\"title\":\"text|.detail dl dd h1\"}},\"json_rule\":{\"Rule\":null},\"add_queue\":[{\"url\":\"{episode-link}\",\"method\":\"get\",\"type\":\"\",\"data\":null,\"header\":null,\"cookies\":{\"url\":\"\",\"data\":\"\"},\"process_name\":\"episodeinfo\"}]},{\"name\":\"episodeinfo\",\"reg_url\":null,\"type\":\"template\",\"template_rule\":{\"Rule\":{\"player\":\"attr.src|.player_main iframe\",\"title\":\"text|#intro2 h1\",\"url\":\"attr.href|link[rel=\\\"canonical\\\"]\"}},\"json_rule\":{\"Rule\":null},\"add_queue\":null}],\"pipline\":\"file\",\"depth\":0,\"end_count\":0}\n\n*/\n"
  },
  {
    "path": "example-spider/douban-movie/conf.json",
    "content": "{\n  \"name\":\"sohu_spider\",\n  \"version\":\"0.01\",\n  \"work_num\": 50,\n  \"max_wait_num\":4096,\n  \"http_addr\":\"127.0.0.1:7774\",\n\n  \"mysql\":\"root:123456@tcp(127.0.0.1:3306)/auto_db?charset=utf8\"\n}"
  },
  {
    "path": "example-spider/douban-movie/main.go",
    "content": "package main\n\nimport (\n\t\"YiSpider/spider\"\n\t\"YiSpider/spider/model\"\n\tspider2 \"YiSpider/spider/spider\"\n)\n\nfunc main() {\n\n\ttask := &model.Task{\n\t\tId:   \"douban-movie\",\n\t\tName: \"douban-movie\",\n\t\tRequest: []*model.Request{\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         \"https://movie.douban.com/j/new_search_subjects?sort=T&range=0,10&tags=&start={0-10000,20}\",\n\t\t\t\tProcessName: \"movie\",\n\t\t\t},\n\t\t},\n\t\tProcess: []model.Process{\n\t\t\t{\n\t\t\t\tName: \"movie\",\n\t\t\t\tType: \"json\",\n\t\t\t\tJsonRule: model.JsonRule{\n\t\t\t\t\tRule: map[string]string{\n\t\t\t\t\t\t\"node\":  \"array|data\",\n\t\t\t\t\t\t\"rate\":  \"rate\",\n\t\t\t\t\t\t\"star\":  \"star\",\n\t\t\t\t\t\t\"id\":    \"id\",\n\t\t\t\t\t\t\"url\":   \"url\",\n\t\t\t\t\t\t\"title\": \"title\",\n\t\t\t\t\t\t\"cover\": \"cover\",\n\t\t\t\t\t\t\"casts\": \"casts\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t},\n\t\t},\n\t\tPipline:\"mysql\",\n\t}\n\n\tapp := spider.New()\n\tapp.AddSpider(spider2.InitWithTask(task))\n\tapp.Run()\n}\n\n/*\n douban-movie json\n\n {\n    \"id\": \"douban-movie\",\n    \"Name\": \"douban-movie\",\n    \"request\": [\n        {\n            \"url\": \"https://movie.douban.com/j/new_search_subjects?sort=T&range=0,10&tags=&start={0-10,20}\",\n            \"method\": \"get\",\n            \"type\": \"\",\n            \"data\": null,\n            \"header\": null,\n            \"cookies\": {\n                \"url\": \"\",\n                \"data\": \"\"\n            },\n            \"process_name\": \"movie\"\n        }\n    ],\n    \"process\": [\n        {\n            \"name\": \"movie\",\n            \"reg_url\": null,\n            \"type\": \"json\",\n            \"template_rule\": {\n                \"Rule\": null\n            },\n            \"json_rule\": {\n                \"Rule\": {\n                    \"casts\": \"casts\",\n                    \"cover\": \"cover\",\n                    \"id\": 
\"id\",\n                    \"node\": \"array|data\",\n                    \"rate\": \"rate\",\n                    \"star\": \"star\",\n                    \"title\": \"title\",\n                    \"url\": \"url\"\n                }\n            },\n            \"add_queue\": null\n        }\n    ],\n    \"pipline\": \"file\",\n    \"depth\": 0,\n    \"end_count\": 0\n}\n\ncurl -d '{\"id\":\"douban-movie\",\"Name\":\"douban-movie\",\"request\":[{\"url\":\"https://movie.douban.com/j/new_search_subjects?sort=T\\u0026range=0,10\\u0026tags=\\u0026start={0-100,20}\",\"method\":\"get\",\"type\":\"\",\"data\":null,\"header\":null,\"cookies\":{\"url\":\"\",\"data\":\"\"},\"process_name\":\"movie\"}],\"process\":[{\"name\":\"movie\",\"reg_url\":null,\"type\":\"json\",\"template_rule\":{\"Rule\":null},\"json_rule\":{\"Rule\":{\"casts\":\"casts\",\"cover\":\"cover\",\"id\":\"id\",\"node\":\"array|data\",\"rate\":\"rate\",\"star\":\"star\",\"title\":\"title\",\"url\":\"url\"}},\"add_queue\":null}],\"pipline\":\"file\",\"depth\":0,\"end_count\":0}' \"http://127.0.0.1:7774/task/addAndRun\"\n\n\n*/\n"
  },
  {
    "path": "example-spider/douban-movie-code/conf.json",
    "content": "{\n  \"name\":\"douban_spider\",\n  \"version\":\"0.01\",\n  \"work_num\": 10,\n  \"max_wait_num\":4096,\n  \"http_addr\":\"127.0.0.1:7774\",\n  \"etcd\":[\"http://127.0.0.1:2379\"],\n\n  \"schedule\":\"redis\",\n  \"redis_addr\":\"127.0.0.1:6379\"\n}"
  },
  {
    "path": "example-spider/douban-movie-code/main.go",
    "content": "package main\n\nimport (\n\t\"YiSpider/spider\"\n\t\"YiSpider/spider/model\"\n\t\"YiSpider/spider/pipline/file\"\n\tspider2 \"YiSpider/spider/spider\"\n\t\"encoding/json\"\n)\n\ntype Movies struct {\n\tDatas []Movie `json:\"data\"`\n}\ntype Movie struct {\n\tRate  string   `json:\"rate\"`\n\tStart string   `json:\"start\"`\n\tId    string   `json:\"id\"`\n\tUrl   string   `json:\"url\"`\n\tTitle string   `json:\"title\"`\n\tCover string   `json:\"cover\"`\n\tCasts []string `json:\"casts\"`\n}\n\ntype PageProcess struct{}\n\nfunc (p *PageProcess) Process(context model.Context) (*model.Page, error) {\n\tmovies := Movies{}\n\tif err := json.Unmarshal(context.Body, &movies); err != nil {\n\t\treturn nil, err\n\t}\n\tpage := &model.Page{}\n\tfor _, movie := range movies.Datas {\n\t\tpage.AddResult(movie)\n\t}\n\treturn page, nil\n}\n\nfunc main() {\n\tsp := &spider2.Spider{}\n\tsp.Name = \"douban-movie-code\"\n\tsp.Id = \"douban-movie-code\"\n\tsp.Requests = []*model.Request{\n\t\t{\n\t\t\tMethod:      \"get\",\n\t\t\tUrl:         \"https://movie.douban.com/j/new_search_subjects?sort=T&range=0,10&tags=&start={0-10000,20}\",\n\t\t\tProcessName: \"movie\",\n\t\t},\n\t}\n\tsp.AddProcess(\"movie\", &PageProcess{})\n\tsp.Pipline = file.NewFilePipline(\"./\")\n\n\tapp := spider.New()\n\tapp.AddSpider(sp)\n\tapp.Run()\n}\n"
  },
  {
    "path": "example-spider/empty/conf.json",
    "content": "{\n  \"name\":\"sohu_spider\",\n  \"version\":\"0.01\",\n  \"work_num\": 200,\n  \"max_wait_num\":12000,\n  \"http_addr\":\"127.0.0.1:7774\",\n  \"etcd\":[\"http://127.0.0.1:2379\"]\n}"
  },
  {
    "path": "example-spider/empty/main.go",
    "content": "package main\n\nimport (\n\t\"YiSpider/spider\"\n)\n\nfunc main() {\n\tapp := spider.New()\n\tapp.Run()\n}\n"
  },
  {
    "path": "example-spider/haoqi/conf.json",
    "content": "{\n  \"name\":\"haoqi_spider\",\n  \"version\":\"0.01\",\n  \"work_num\": 100,\n  \"max_wait_num\":12000,\n  \"http_addr\":\"127.0.0.1:7774\",\n  \"etcd\":[\"http://127.0.0.1:2379\"]\n}"
  },
  {
    "path": "example-spider/haoqi/main.go",
    "content": "package main\n\nimport (\n\t\"YiSpider/spider\"\n\t\"YiSpider/spider/model\"\n\tspider2 \"YiSpider/spider/spider\"\n)\n\nfunc main() {\n\n\ttask := &model.Task{\n\t\tId:   \"haoqi\",\n\t\tName: \"haoqi\",\n\t\tRequest: []*model.Request{\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         \"http://www.qdaily.com/categories/categorymore/{1-54,1}/1509942163.json\",\n\t\t\t\tProcessName: \"articles\",\n\t\t\t},\n\t\t},\n\t\tProcess: []model.Process{\n\t\t\t{\n\t\t\t\tName: \"articles\",\n\t\t\t\tType: \"json\",\n\t\t\t\tJsonRule: model.JsonRule{\n\t\t\t\t\tRule: map[string]string{\n\t\t\t\t\t\t\"node\":     \"array|data.feeds\",\n\t\t\t\t\t\t\"datatype\": \"datatype\",\n\t\t\t\t\t\t\"image\":    \"image\",\n\t\t\t\t\t\t\"post\":     \"post\",\n\t\t\t\t\t\t\"type\":     \"type\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t\tAddQueue: nil,\n\t\t\t},\n\t\t\t{\n\t\t\t\tName: \"articles\",\n\t\t\t\tType: \"json\",\n\t\t\t\tJsonRule: model.JsonRule{\n\t\t\t\t\tRule: map[string]string{\n\t\t\t\t\t\t\"node\":     \"nil|data\",\n\t\t\t\t\t\t\"last_key\": \"last_key\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t\tAddQueue: []*model.Request{\n\t\t\t\t\t{\n\t\t\t\t\t\tMethod:      \"get\",\n\t\t\t\t\t\tUrl:         \"http://www.qdaily.com/categories/categorymore/18/{$last_key}.json\",\n\t\t\t\t\t\tProcessName: \"articles\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t},\n\t\t},\n\n\t\tPipline: \"file\",\n\t}\n\n\tapp := spider.New()\n\tapp.AddSpider(spider2.InitWithTask(task))\n\tapp.Run()\n\n}\n\n/*\n   dilidili json\n\n   {\n    \"id\": \"dilidili\",\n    \"Name\": \"dilidili\",\n    \"request\": [\n        {\n            \"url\": \"http://www.dilidili.wang/{gaoxiao|kehuan|yundong|danmei|zhiyuxi|luoli|zhenren|zhuangbi|youxi|tuili|qingchun|kongbu|jizhan|rexue|qingxiaoshuo|maoxian|hougong|qihuan|tongnian|lianai|meishaonv|lizhi|baihe|paomianfan|yinv}/\",\n            \"method\": \"get\",\n            \"type\": \"\",\n            \"data\": null,\n            \"header\": null,\n           
 \"cookies\": {\n                \"url\": \"\",\n                \"data\": \"\"\n            },\n            \"process_name\": \"animelist\"\n        }\n    ],\n    \"process\": [\n        {\n            \"name\": \"animelist\",\n            \"reg_url\": null,\n            \"type\": \"template\",\n            \"template_rule\": {\n                \"Rule\": {\n                    \"content\": \"text|dd div\",\n                    \"desc\": \"text|dd p\",\n                    \"href\": \"attr.href|dt a\",\n                    \"img\": \"attr.src|dt a img\",\n                    \"node\": \"array|.anime_list dl\",\n                    \"title\": \"text|dd h3 a\"\n                }\n            },\n            \"json_rule\": {\n                \"Rule\": null\n            },\n            \"add_queue\": [\n                {\n                    \"url\": \"http://www.dilidili.wang{href}\",\n                    \"method\": \"get\",\n                    \"type\": \"\",\n                    \"data\": null,\n                    \"header\": null,\n                    \"cookies\": {\n                        \"url\": \"\",\n                        \"data\": \"\"\n                    },\n                    \"process_name\": \"animeinfo\"\n                }\n            ]\n        },\n        {\n            \"name\": \"animeinfo\",\n            \"reg_url\": null,\n            \"type\": \"template\",\n            \"template_rule\": {\n                \"Rule\": {\n                    \"episode\": \"texts|.time_con .swiper-slide .clear li a em\",\n                    \"episode-link\": \"attrs.href|.time_con .swiper-slide .clear li a\",\n                    \"title\": \"text|.detail dl dd h1\"\n                }\n            },\n            \"json_rule\": {\n                \"Rule\": null\n            },\n            \"add_queue\": [\n                {\n                    \"url\": \"{episode-link}\",\n                    \"method\": \"get\",\n                    \"type\": \"\",\n    
                \"data\": null,\n                    \"header\": null,\n                    \"cookies\": {\n                        \"url\": \"\",\n                        \"data\": \"\"\n                    },\n                    \"process_name\": \"episodeinfo\"\n                }\n            ]\n        },\n        {\n            \"name\": \"episodeinfo\",\n            \"reg_url\": null,\n            \"type\": \"template\",\n            \"template_rule\": {\n                \"Rule\": {\n                    \"player\": \"attr.src|.player_main iframe\",\n                    \"title\": \"text|#intro2 h1\",\n                    \"url\": \"attr.href|link[rel=\\\"canonical\\\"]\"\n                }\n            },\n            \"json_rule\": {\n                \"Rule\": null\n            },\n            \"add_queue\": null\n        }\n    ],\n    \"pipline\": \"file\",\n    \"depth\": 0,\n    \"end_count\": 0\n}\n\n{\"id\":\"dilidili\",\"Name\":\"dilidili\",\"request\":[{\"url\":\"http://www.dilidili.wang/{gaoxiao|kehuan|yundong|danmei|zhiyuxi|luoli|zhenren|zhuangbi|youxi|tuili|qingchun|kongbu|jizhan|rexue|qingxiaoshuo|maoxian|hougong|qihuan|tongnian|lianai|meishaonv|lizhi|baihe|paomianfan|yinv}/\",\"method\":\"get\",\"type\":\"\",\"data\":null,\"header\":null,\"cookies\":{\"url\":\"\",\"data\":\"\"},\"process_name\":\"animelist\"}],\"process\":[{\"name\":\"animelist\",\"reg_url\":null,\"type\":\"template\",\"template_rule\":{\"Rule\":{\"content\":\"text|dd div\",\"desc\":\"text|dd p\",\"href\":\"attr.href|dt a\",\"img\":\"attr.src|dt a img\",\"node\":\"array|.anime_list dl\",\"title\":\"text|dd h3 a\"}},\"json_rule\":{\"Rule\":null},\"add_queue\":[{\"url\":\"http://www.dilidili.wang{href}\",\"method\":\"get\",\"type\":\"\",\"data\":null,\"header\":null,\"cookies\":{\"url\":\"\",\"data\":\"\"},\"process_name\":\"animeinfo\"}]},{\"name\":\"animeinfo\",\"reg_url\":null,\"type\":\"template\",\"template_rule\":{\"Rule\":{\"episode\":\"texts|.time_con .swiper-slide .clear 
li a em\",\"episode-link\":\"attrs.href|.time_con .swiper-slide .clear li a\",\"title\":\"text|.detail dl dd h1\"}},\"json_rule\":{\"Rule\":null},\"add_queue\":[{\"url\":\"{episode-link}\",\"method\":\"get\",\"type\":\"\",\"data\":null,\"header\":null,\"cookies\":{\"url\":\"\",\"data\":\"\"},\"process_name\":\"episodeinfo\"}]},{\"name\":\"episodeinfo\",\"reg_url\":null,\"type\":\"template\",\"template_rule\":{\"Rule\":{\"player\":\"attr.src|.player_main iframe\",\"title\":\"text|#intro2 h1\",\"url\":\"attr.href|link[rel=\\\"canonical\\\"]\"}},\"json_rule\":{\"Rule\":null},\"add_queue\":null}],\"pipline\":\"file\",\"depth\":0,\"end_count\":0}\n\n*/\n"
  },
  {
    "path": "example-spider/jingdong/conf.json",
    "content": "{\n  \"name\":\"jingdong_spider\",\n  \"version\":\"0.01\",\n  \"work_num\": 128,\n  \"max_wait_num\":100000000,\n  \"http_addr\":\"127.0.0.1:7774\",\n  \"etcd\":[\"http://127.0.0.1:2379\"]\n}"
  },
  {
    "path": "example-spider/jingdong/main.go",
    "content": "package main\n\nimport (\n\t\"YiSpider/spider\"\n\t\"YiSpider/spider/model\"\n\tspider2 \"YiSpider/spider/spider\"\n\t\"fmt\"\n)\n\nfunc main() {\n\n\tgoodsType := `电子书刊|电子书|网络原创|数字杂志|多媒体图书|音像|音乐|影视|教育音像|英文原版|少儿|商务投资|英语学习与考试|文学|传记|励志|文艺|小说|文学|青春文学|传记|艺术|少儿|少儿|0-2岁|3-6岁|7-10岁|11-14岁|人文社科|历史|哲学|国学|政治/军事|法律|人文社科|心理学|文化|社会科学|经管励志|经济|金融与投资|管理|励志与成功|生活|生活|健身与保健|家庭与育儿|旅游|烹饪美食|科技|工业技术|科普读物|建筑|医学|科学与自然|计算机与互联网|电子通信|教育|中小学教辅|教育与考试|外语学习|大中专教材|字典词典|港台图书|艺术/设计/收藏|经济管理|文化/学术|少儿|其他|工具书|杂志/期刊|套装书|手机通讯|手机|对讲机|运营商|合约机|选号中心|装宽带|办套餐|手机配件|移动电源|电池/移动电源|蓝牙耳机|充电器/数据线|苹果周边|手机耳机|手机贴膜|手机存储卡|充电器|数据线|手机保护套|车载配件|iPhone 配件|手机电池|创意配件|便携/无线音响|手机饰品|拍照配件|手机支架|大 家 电|平板电视|空调|冰箱|洗衣机|家庭影院|DVD/电视盒子|迷你音响|冷柜/冰吧|家电配件|功放|回音壁/Soundbar|Hi-Fi专区|电视盒子|酒柜|厨卫大电|燃气灶|油烟机|热水器|消毒柜|洗碗机|厨房小电|料理机|榨汁机|电饭煲|电压力锅|豆浆机|咖啡机|微波炉|电烤箱|电磁炉|面包机|煮蛋器|酸奶机|电炖锅|电水壶/热水瓶|电饼铛|多用途锅|电烧烤炉|果蔬解毒机|其它厨房电器|养生壶/煎药壶|电热饭盒|生活电器|取暖电器|净化器|加湿器|扫地机器人|吸尘器|挂烫机/熨斗|插座|电话机|清洁机|除湿机|干衣机|收录/音机|电风扇|冷风扇|其它生活电器|生活电器配件|净水器|饮水机|个护健康|剃须刀|剃/脱毛器|口腔护理|电吹风|美容器|理发器|卷/直发器|按摩椅|按摩器|足浴盆|血压计|电子秤/厨房秤|血糖仪|体温计|其它健康电器|计步器/脂肪检测仪|五金家装|电动工具|手动工具|仪器仪表|浴霸/排气扇|灯具|LED灯|洁身器|水槽|龙头|淋浴花洒|厨卫五金|家具五金|门铃|电气开关|插座|电工电料|监控安防|电线/线缆|摄影摄像|数码相机|单电/微单相机|单反相机|摄像机|拍立得|运动相机|镜头|户外器材|影棚器材|冲印服务|数码相框|数码配件|存储卡|读卡器|滤镜|闪光灯/手柄|相机包|三脚架/云台|相机清洁/贴膜|机身附件|镜头附件|电池/充电器|移动电源|数码支架|智能设备|智能手环|智能手表|智能眼镜|运动跟踪器|健康监测|智能配饰|智能家居|体感车|其他配件|智能机器人|无人机|影音娱乐|MP3/MP4|智能设备|耳机/耳麦|便携/无线音箱|音箱/音响|高清播放器|收音机|MP3/MP4配件|麦克风|专业音频|苹果配件|电子教育|学生平板|点读机/笔|早教益智|录音笔|电纸书|电子词典|复读机|虚拟商品|延保服务|杀毒软件|积分商品|家纺|桌布/罩件|地毯地垫|沙发垫套/椅垫|床品套件|被子|枕芯|床单被罩|毯子|床垫/床褥|蚊帐|抱枕靠垫|毛巾浴巾|电热毯|窗帘/窗纱|布艺软饰|凉席|灯具|台灯|节能灯|装饰灯|落地灯|应急灯/手电|LED灯|吸顶灯|五金电器|筒灯射灯|吊灯|氛围照明|生活日用|保暖防护|收纳用品|雨伞雨具|浴室用品|缝纫/针织用品|洗晒/熨烫|净化除味|家装软饰|相框/照片墙|装饰字画|节庆饰品|手工/十字绣|装饰摆件|帘艺隔断|墙贴/装饰贴|钟饰|花瓶花艺|香薰蜡烛|创意家居|宠物生活|宠物主粮|宠物零食|医疗保健|家居日用|宠物玩具|出行装备|洗护美容|电脑整机|笔记本|超极本|游戏本|平板电脑|平板电脑配件|台式机|服务器/工作站|笔记本配件|一体机|电脑配件|CPU|主板|显卡|硬盘|SSD固态硬盘|内存|机箱|电源|显示器|刻录机/光驱|散热器|声卡/扩展卡|装机配件|组装电脑|外设产品|移动硬盘|U盘|鼠标|键盘|鼠标垫|摄像头|手写板|硬盘盒|插座|线缆|UPS电源|电脑工具|游戏设备|电玩|电脑清洁|网络仪表仪器|游戏设备|游戏机|游戏耳机|手柄/方向盘|游戏软件|游戏周边|网络产品|路由器|网卡|交换机|网络存储|4G/3G上网|网络盒子|网络配件|办公设备|投影机|投影配件|多功能
一体机|打印机|传真设备|验钞/点钞机|扫描设备|复合机|碎纸机|考勤机|收款/POS机|会议音频视频|保险柜|装订/封装机|安防监控|办公家具|白板|文具/耗材|硒鼓/墨粉|墨盒|色带|纸类|办公文具|学生文具|财会用品|文件管理|本册/便签|计算器|笔类|画具画材|刻录碟片/附件|服务产品|上门安装|延保服务|维修保养|电脑软件|京东服务|烹饪锅具|炒锅|煎锅|压力锅|蒸锅|汤锅|奶锅|锅具套装|煲类|水壶|火锅|刀剪菜板|菜刀|剪刀|刀具套装|砧板|瓜果刀/刨|多功能刀|厨房配件|保鲜盒|烘焙/烧烤|饭盒/提锅|储物/置物架|厨房DIY/小工具|水具酒具|塑料杯|运动水壶|玻璃杯|陶瓷/马克杯|保温杯|保温壶|酒杯/酒具|杯具套装|餐具|餐具套装|碗/碟/盘|筷勺/刀叉|一次性用品|果盘/果篮|酒店用品|自助餐炉|酒店餐具|酒店水具|茶具/咖啡具|整套茶具|茶杯|茶壶|茶盘茶托|茶叶罐|茶具配件|茶宠摆件|咖啡具|其他|清洁用品|纸品湿巾|衣物清洁|清洁工具|驱虫用品|家庭清洁|皮具护理|一次性用品|面部护肤|洁面|乳液面霜|面膜|剃须|套装|精华|眼霜|卸妆|防晒|防晒隔离|T区护理|眼部护理|精华露|爽肤水|身体护理|沐浴|润肤|颈部|手足|纤体塑形|美胸|套装|精油|洗发护发|染发/造型|香薰精油|磨砂/浴盐|手工/香皂|洗发|护发|染发|磨砂膏|香皂|口腔护理|牙膏/牙粉|牙刷/牙线|漱口水|套装|女性护理|卫生巾|卫生护垫|私密护理|脱毛膏|其他|洗发护发|洗发|护发|染发|造型|假发|套装|美发工具|脸部护理|香水彩妆|香水|底妆|腮红|眼影|唇部|美甲|眼线|美妆工具|套装|防晒隔离|卸妆|眉笔|睫毛膏|女装|T恤|衬衫|针织衫|雪纺衫|卫衣|马甲|连衣裙|半身裙|牛仔裤|休闲裤|打底裤|正装裤|小西装|短外套|风衣|毛呢大衣|真皮皮衣|棉服|羽绒服|大码女装|中老年女装|婚纱|打底衫|旗袍/唐装|加绒裤|吊带/背心|羊绒衫|短裤|皮草|礼服|仿皮皮衣|羊毛衫|设计师/潮牌|男装|衬衫|T恤|POLO衫|针织衫|羊绒衫|卫衣|马甲/背心|夹克|风衣|毛呢大衣|仿皮皮衣|西服|棉服|羽绒服|牛仔裤|休闲裤|西裤|西服套装|大码男装|中老年男装|唐装/中山装|工装|真皮皮衣|加绒裤|卫裤/运动裤|短裤|设计师/潮牌|羊毛衫|内衣|文胸|女式内裤|男式内裤|睡衣/家居服|塑身美体|泳衣|吊带/背心|抹胸|连裤袜/丝袜|美腿袜|商务男袜|保暖内衣|情侣睡衣|文胸套装|少女文胸|休闲棉袜 
|大码内衣|内衣配件|打底裤袜|打底衫|秋衣秋裤|情趣内衣|洗衣服务|服装洗护|服饰配件|太阳镜|光学镜架/镜片|围巾/手套/帽子套装|袖扣|棒球帽|毛线帽|遮阳帽|老花镜|装饰眼镜|防辐射眼镜|游泳镜|女士丝巾/围巾/披肩|男士丝巾/围巾|鸭舌帽|贝雷帽|礼帽|真皮手套|毛线手套|防晒手套|男士腰带/礼盒|女士腰带/礼盒|钥匙扣|遮阳伞/雨伞|口罩|耳罩/耳包|假领|毛线/布面料|领带/领结/领带夹|钟表|男表|瑞表|女表|国表|日韩表|欧美表|德表|儿童手表|智能手表|闹钟|座钟挂钟|钟表配件|流行男鞋|商务休闲鞋|正装鞋|休闲鞋|凉鞋/沙滩鞋|男靴|功能鞋|拖鞋/人字拖|雨鞋/雨靴|传统布鞋|鞋配件|帆布鞋|增高鞋|工装鞋|定制鞋|时尚女鞋|高跟鞋|单鞋|休闲鞋|凉鞋|女靴|雪地靴|拖鞋/人字拖|踝靴|筒靴|帆布鞋|雨鞋/雨靴|妈妈鞋|鞋配件|特色鞋|鱼嘴鞋|布鞋/绣花鞋|马丁靴|坡跟鞋|松糕鞋|内增高|防水台|奶粉|婴幼奶粉|孕妈奶粉|营养辅食|益生菌/初乳|米粉/菜粉|果泥/果汁|DHA|宝宝零食|钙铁锌/维生素|清火/开胃|面条/粥|尿裤湿巾|婴儿尿裤|拉拉裤|婴儿湿巾|成人尿裤|喂养用品|奶瓶奶嘴|吸奶器|暖奶消毒|儿童餐具|水壶/水杯|牙胶安抚|围兜/防溅衣|辅食料理机|食物存储|洗护用品|宝宝护肤|洗发沐浴|奶瓶清洗|驱蚊防晒|理发器|洗澡用具|婴儿口腔清洁|洗衣液/皂|日常护理|座便器|童车童床|婴儿推车|餐椅摇椅|婴儿床|学步车|三轮车|自行车|电动车|扭扭车|滑板车|婴儿床垫|寝居服饰|婴儿外出服|婴儿内衣|婴儿礼盒|婴儿鞋帽袜|安全防护|家居床品|睡袋/抱被|爬行垫|妈妈专区|妈咪包/背婴带|产后塑身|文胸/内裤|防辐射服|孕妈装|孕期营养|孕妇护肤|待产护理|月子装|防溢乳垫|童装童鞋|套装|上衣|裤子|裙子|内衣/家居服|羽绒服/棉服|亲子装|儿童配饰|礼服/演出服|运动鞋|皮鞋/帆布鞋|靴子|凉鞋|功能鞋|户外/运动服|安全座椅|提篮式|安全座椅|增高垫|潮流女包|钱包|手拿包|单肩包|双肩包|手提包|斜挎包|钥匙包|卡包/零钱包|精品男包|男士钱包|男士手包|卡包名片夹|商务公文包|双肩包|单肩/斜挎包|钥匙包|功能箱包|电脑包|拉杆箱|旅行包|旅行配件|休闲运动包|拉杆包|登山包|妈咪包|书包|相机包|腰包/胸包|礼品|火机烟具|礼品文具|军刀军具|收藏品|工艺礼品|创意礼品|礼盒礼券|鲜花绿植|婚庆节庆|京东卡|美妆礼品|礼品定制|京东福卡|古董文玩|奢侈品|箱包|钱包|服饰|腰带|太阳镜/眼镜框|配件|鞋靴|饰品|名品腕表|高档化妆品|婚庆|婚嫁首饰|婚纱摄影|婚纱礼服|婚庆服务|婚庆礼品/用品|婚宴|进口食品|饼干蛋糕|糖果/巧克力|休闲零食|冲调饮品|粮油调味|牛奶|地方特产|其他特产|新疆|北京|山西|内蒙古|福建|湖南|四川|云南|东北|休闲食品|休闲零食|坚果炒货|肉干肉脯|蜜饯果干|糖果/巧克力|饼干蛋糕|无糖食品|粮油调味|米面杂粮|食用油|调味品|南北干货|方便食品|有机食品|饮料冲调|饮用水|饮料|牛奶乳品|咖啡/奶茶|冲饮谷物|蜂蜜/柚子茶|成人奶粉|食品礼券|月饼|大闸蟹|粽子|卡券|茗茶|铁观音|普洱|龙井|绿茶|红茶|乌龙茶|花草茶|花果茶|养生茶|黑茶|白茶|其它茶|时尚饰品|项链|手链/脚链|戒指|耳饰|毛衣链|发饰/发卡|胸针|饰品配件|婚庆饰品|黄金|黄金吊坠|黄金项链|黄金转运珠|黄金手镯/手链/脚链|黄金耳饰|黄金戒指|K金饰品|K金吊坠|K金项链|K金手镯/手链/脚链|K金戒指|K金耳饰|金银投资|投资金|投资银|投资收藏|银饰|银吊坠/项链|银手镯/手链/脚链|银戒指|银耳饰|足银手镯|宝宝银饰|钻石|裸钻|钻戒|钻石项链/吊坠|钻石耳饰|钻石手镯/手链|翡翠玉石|项链/吊坠|手镯/手串|戒指|耳饰|挂件/摆件/把件|玉石孤品|水晶玛瑙|项链/吊坠|耳饰|手镯/手链/脚链|戒指|头饰/胸针|摆件/挂件|彩宝|琥珀/蜜蜡|碧玺|红宝石/蓝宝石|坦桑石|珊瑚|祖母绿|葡萄石|其他天然宝石|项链/吊坠|耳饰|手镯/手链|戒指|铂金|铂金项链/吊坠|铂金手镯/手链/脚链|铂金戒指|铂金耳饰|木手串/把件|小叶紫檀|黄花梨|沉香木|金丝楠|菩提|其他|橄榄核/核桃|檀香|珍珠|珍珠项链|珍珠吊坠|珍珠耳饰|珍珠手链|珍珠戒指|珍珠胸针|维修保养|机油|正时皮带|添加剂|汽车喇叭|防冻液|汽车玻璃|滤清器|火花塞|减震器|柴机油/辅助油|雨刷|车灯|后视镜|轮胎|轮毂|刹车片/盘|维修配件|蓄电池|底盘装甲/护板|贴膜|汽修工具|改装配件|车载电器|导航仪|安全预警仪|行车记录仪|倒车雷达|蓝牙设备|车载影音|净化器|电源|智能驾驶|车载电台|车载电器配件|
吸尘器|智能车机|冰箱|汽车音响|车载生活电器|美容清洗|车蜡|补漆笔|玻璃水|清洁剂|洗车工具|镀晶镀膜|打蜡机|洗车配件|洗车机|洗车水枪|毛巾掸子|汽车装饰|脚垫|座垫|座套|后备箱垫|头枕腰靠|方向盘套|香水|空气净化|挂件摆件|功能小件|车身装饰件|车衣|安全自驾|安全座椅|胎压监测|防盗设备|应急救援|保温箱|地锁|摩托车|充气泵|储物箱|自驾野营|摩托车装备|汽车服务|清洗美容|功能升级|保养维修|油卡充值|车险|加油卡|ETC|驾驶培训|赛事改装|赛事服装|赛事用品|制动系统|悬挂系统|进气系统|排气系统|电子管理|车身强化|赛事座椅|运动鞋包|跑步鞋|休闲鞋|篮球鞋|板鞋|帆布鞋|足球鞋|乒羽网鞋|专项运动鞋|训练鞋|拖鞋|运动包|运动服饰|羽绒服|棉服|运动裤|夹克/风衣|卫衣/套头衫|T恤|套装|乒羽网服|健身服|运动背心|毛衫/线衫|运动配饰|骑行运动|折叠车|山地车/公路车|电动车|其他整车|骑行服|骑行装备|平衡车|垂钓用品|鱼竿鱼线|浮漂鱼饵|钓鱼桌椅|钓鱼配件|钓箱鱼包|其它|游泳用品|泳镜|泳帽|游泳包防水包|女士泳衣|男士泳衣|比基尼|其它|户外鞋服|冲锋衣裤|速干衣裤|滑雪服|羽绒服/棉服|休闲衣裤|抓绒衣裤|软壳衣裤|T恤|户外风衣|功能内衣|军迷服饰|登山鞋|雪地靴|徒步鞋|越野跑鞋|休闲鞋|工装鞋|溯溪鞋|沙滩/凉拖|户外袜|户外装备|帐篷/垫子|睡袋/吊床|登山攀岩|户外配饰|背包|户外照明|户外仪表|户外工具|望远镜|旅游用品|便携桌椅床|野餐烧烤|军迷用品|救援装备|滑雪装备|极限户外|冲浪潜水|健身训练|综合训练器|其他大型器械|哑铃|仰卧板/收腹机|其他中小型器材|瑜伽舞蹈|甩脂机|踏步机|武术搏击|健身车/动感单车|跑步机|运动护具|体育用品|羽毛球|乒乓球|篮球|足球|网球|排球|高尔夫|台球|棋牌麻将|轮滑滑板|其他|适用年龄|0-6个月|6-12个月|1-3岁|3-6岁|6-14岁|14岁以上|遥控/电动|遥控车|遥控飞机|遥控船|机器人|轨道/助力|毛绒布艺|毛绒/布艺|靠垫/抱枕|娃娃玩具|芭比娃娃|卡通娃娃|智能娃娃|模型玩具|仿真模型|拼插模型|收藏爱好|健身玩具|炫舞毯|爬行垫/毯|户外玩具|戏水玩具|动漫玩具|电影周边|卡通周边|网游周边|益智玩具|摇铃/床铃|健身架|早教启智|拖拉玩具|积木拼插|积木|拼图|磁力棒|立体拼插|DIY玩具|手工彩泥|绘画工具|情景玩具|创意减压|减压玩具|创意玩具|乐器|钢琴|电子琴/电钢琴|吉他/尤克里里|打击乐器|西洋管弦|民族管弦乐器|乐器配件|电脑音乐|工艺礼品乐器|口琴/口风琴/竖笛|手风琴||机票|国内机票|酒店|国内酒店|酒店团购|旅行|度假|景点|租车|火车票|旅游团购|充值|手机充值|游戏|游戏点卡|QQ充值|票务|电影票|演唱会|话剧歌剧|音乐会|体育赛事|舞蹈芭蕾|戏曲综艺|产地直供|水果|苹果|橙子|奇异果/猕猴桃|车厘子/樱桃|芒果|蓝莓|火龙果|葡萄/提子|柚子|香蕉|牛油果|梨|菠萝/凤梨|桔/橘|柠檬|草莓|桃/李/杏|更多水果|水果礼盒/券|猪牛羊肉|牛肉|羊肉|猪肉|内脏类|海鲜水产|鱼类|虾类|蟹类|贝类|海参|海产干货|其他水产|海产礼盒|禽肉蛋品|鸡肉|鸭肉|蛋类|其他禽类|冷冻食品|水饺/馄饨|汤圆/元宵|面点|火锅丸串|速冻半成品|奶酪黄油|熟食腊味|熟食|腊肠/腊肉|火腿|糕点|礼品卡券|饮品甜品|冷藏果蔬汁|冰激凌|其他`\n\n\ttask := &model.Task{\n\t\tId:      \"jingdong\",\n\t\tName:    \"jingdong\",\n\t\tRequest: []*model.Request{\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         fmt.Sprintf(\"https://search.jd.com/Search?keyword={%s}&enc=utf-8&page={1-5,1}\", goodsType),\n\t\t\t\tProcessName: \"jingdong-list\",\n\t\t\t},\n\t\t},\n\t\tProcess: []model.Process{\n\t\t\t{\n\t\t\t\tName: \"jingdong-list\",\n\t\t\t\tType: \"template\",\n\t\t\t\tTemplateRule: model.TemplateRule{\n\t\t\t\t\tRule: map[string]string{\n\t\t\t\t\t\t\"node\": 
       \"array|.gl-item\",\n\t\t\t\t\t\t\"img\":         \"attr.src|.err-product\",\n\t\t\t\t\t\t\"price\":       \"text|.p-price strong i\",\n\t\t\t\t\t\t\"goods_name\":  \"text|.p-name a em\",\n\t\t\t\t\t\t\"desc\":        \"text|.p-name a i\",\n\t\t\t\t\t\t\"comment_num\": \"text|.p-commit strong a\",\n\t\t\t\t\t\t\"shop_addr\":   \"attr.href|.curr-shop\",\n\t\t\t\t\t\t\"shop_name\":   \"attr.title|.curr-shop\",\n\t\t\t\t\t\t\"goods_id\":    \"attr.data-sku|.J_focus\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t\tAddQueue: []*model.Request{\n\t\t\t\t\t{\n\t\t\t\t\t\tMethod:      \"get\",\n\t\t\t\t\t\tUrl:         \"https://sclub.jd.com/comment/productPageComments.action?productId={$goods_id}&score=0&sortType=5&page=0&pageSize=10\",\n\t\t\t\t\t\tProcessName: \"jingdong-comment-first\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t},\n\t\t\t{\n\t\t\t\t//\n\t\t\t\tName: \"jingdong-comment-first\",\n\t\t\t\tType: \"json\",\n\t\t\t\tJsonRule: model.JsonRule{\n\t\t\t\t\tRule: map[string]string{\n\t\t\t\t\t\t\"max_page\": \"maxPage\",\n\t\t\t\t\t\t\"id\":       \"productCommentSummary.productId\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t\tAddQueue: []*model.Request{\n\t\t\t\t\t{\n\t\t\t\t\t\tMethod:      \"get\",\n\t\t\t\t\t\tUrl:         \"https://sclub.jd.com/comment/productPageComments.action?productId={$id}&score=0&sortType=5&page={0-$max_page,1}&pageSize=10\",\n\t\t\t\t\t\tProcessName: \"jingdong-comments\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t},\n\t\t\t{\n\t\t\t\tName: \"jingdong-comments\",\n\t\t\t\tType: \"json\",\n\t\t\t\tJsonRule: model.JsonRule{\n\t\t\t\t\tRule: map[string]string{\n\t\t\t\t\t\t\"node\":            \"array|comments\",\n\t\t\t\t\t\t\"comment_id\":      \"id\",\n\t\t\t\t\t\t\"content\":         \"content\",\n\t\t\t\t\t\t\"create_time\":     \"creationTime\",\n\t\t\t\t\t\t\"image_count\":     \"imageCount\",\n\t\t\t\t\t\t\"isMobile\":        \"isMobile\",\n\t\t\t\t\t\t\"productColor\":    \"productColor\",\n\t\t\t\t\t\t\"productSize\":     
\"productSize\",\n\t\t\t\t\t\t\"productId\":       \"referenceId\",\n\t\t\t\t\t\t\"score\":           \"score\",\n\t\t\t\t\t\t\"replyCount\":      \"replyCount\",\n\t\t\t\t\t\t\"usefulVoteCount\": \"usefulVoteCount\",\n\t\t\t\t\t\t\"userClient\":      \"userClient\",\n\t\t\t\t\t\t\"userClientShow\":  \"userClientShow\",\n\t\t\t\t\t\t\"userLevelId\":     \"userLevelId\",\n\t\t\t\t\t\t\"userLevelName\":   \"userLevelName\",\n\t\t\t\t\t\t\"userProvince\":    \"userProvince\",\n\t\t\t\t\t\t\"nickname\":        \"nickname\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t},\n\t\t\t{\n\t\t\t\tName: \"jingdong-type\",\n\t\t\t\tType: \"template\",\n\t\t\t\tTemplateRule: model.TemplateRule{\n\t\t\t\t\tRule: map[string]string{\n\t\t\t\t\t\t\"type\": \"texts|.items a\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t},\n\t\t},\n\n\t\tPipline: \"file\",\n\t}\n\n\tapp := spider.New()\n\tapp.AddSpider(spider2.InitWithTask(task))\n\tapp.Run()\n}\n"
  },
  {
    "path": "example-spider/qiongyou/conf.json",
    "content": "{\n  \"name\":\"qiongyou_spider\",\n  \"version\":\"0.01\",\n  \"work_num\": 5,\n  \"max_wait_num\":12000,\n  \"http_addr\":\"127.0.0.1:7774\",\n  \"etcd\":[\"http://127.0.0.1:2379\"],\n\n  \"schedule\":\"redis\",\n  \"redis_addr\":\"127.0.0.1:6379\"\n}"
  },
  {
    "path": "example-spider/qiongyou/main.go",
    "content": "package main\n\nimport (\n\t\"YiSpider/spider\"\n\t\"YiSpider/spider/model\"\n\tspider2 \"YiSpider/spider/spider\"\n)\n\nfunc main() {\n\n\ttask := &model.Task{\n\t\tId:   \"qiongyou\",\n\t\tName: \"qiongyou\",\n\t\tRequest: []*model.Request{\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         \"http://plan.qyer.com/search_0_0_0_0_0_1_{1-134,1}/\",\n\t\t\t\tProcessName: \"qiongyou-list\",\n\t\t\t},\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         \"http://plan.qyer.com/search_0_0_0_0_0_2_{1-134,1}/\",\n\t\t\t\tProcessName: \"qiongyou-list\",\n\t\t\t},\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         \"http://plan.qyer.com/search_0_0_0_0_0_3_{1-134,1}/\",\n\t\t\t\tProcessName: \"qiongyou-list\",\n\t\t\t},\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         \"http://plan.qyer.com/search_0_0_0_0_0_4_{1-134,1}/\",\n\t\t\t\tProcessName: \"qiongyou-list\",\n\t\t\t},\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         \"http://plan.qyer.com/search_0_0_0_0_0_5_{1-134,1}/\",\n\t\t\t\tProcessName: \"qiongyou-list\",\n\t\t\t},\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         \"http://plan.qyer.com/search_0_0_0_0_0_6_{1-134,1}/\",\n\t\t\t\tProcessName: \"qiongyou-list\",\n\t\t\t},\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         \"http://plan.qyer.com/search_0_0_0_0_0_7_{1-134,1}/\",\n\t\t\t\tProcessName: \"qiongyou-list\",\n\t\t\t},\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         \"http://plan.qyer.com/search_0_0_0_0_0_8_{1-134,1}/\",\n\t\t\t\tProcessName: \"qiongyou-list\",\n\t\t\t},\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         \"http://plan.qyer.com/search_0_0_0_0_0_9_{1-134,1}/\",\n\t\t\t\tProcessName: \"qiongyou-list\",\n\t\t\t},\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         \"http://plan.qyer.com/search_0_0_0_0_0_10_{1-134,1}/\",\n\t\t\t\tProcessName: \"qiongyou-list\",\n\t\t\t},\n\t\t\t{\n\t\t\t\tMethod:      
\"get\",\n\t\t\t\tUrl:         \"http://plan.qyer.com/search_0_0_0_0_0_11_{1-134,1}/\",\n\t\t\t\tProcessName: \"qiongyou-list\",\n\t\t\t},\n\t\t},\n\t\tProcess: []model.Process{\n\t\t\t{\n\t\t\t\tName: \"qiongyou-list\",\n\t\t\t\tType: \"template\",\n\t\t\t\tTemplateRule: model.TemplateRule{\n\t\t\t\t\tRule: map[string]string{\n\t\t\t\t\t\t\"node\":     \"array|.items\",\n\t\t\t\t\t\t\"img\":      \"attr.src|.plan-cover\",\n\t\t\t\t\t\t\"time\":     \"text|.fontYaHei dt\",\n\t\t\t\t\t\t\"title\":    \"text|.fontYaHei dd\",\n\t\t\t\t\t\t\"day\":      \"text|.day strong\",\n\t\t\t\t\t\t\"tag\":      \"text|.tag strong\",\n\t\t\t\t\t\t\"plan\":     \"text|.plan p\",\n\t\t\t\t\t\t\"author\":   \"text|.name\",\n\t\t\t\t\t\t\"read_num\": \"text|.number .icon1\",\n\t\t\t\t\t\t\"xx_num\":   \"text|.number .icon2\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t},\n\t\t},\n\n\t\tPipline: \"file\",\n\t}\n\n\tapp := spider.New()\n\tapp.AddSpider(spider2.InitWithTask(task))\n\tapp.Run()\n}\n"
  },
  {
    "path": "example-spider/qiubai/conf.json",
    "content": "{\n  \"name\":\"qiubai_spider\",\n  \"version\":\"0.01\",\n  \"work_num\": 1,\n  \"max_wait_num\":4096,\n  \"http_addr\":\"127.0.0.1:7773\",\n  \"etcd\":[\"http://127.0.0.1:2379\"]\n}"
  },
  {
    "path": "example-spider/qiubai/main.go",
    "content": "package main\n\nimport (\n\t\"YiSpider/spider\"\n\t\"YiSpider/spider/model\"\n\tspider2 \"YiSpider/spider/spider\"\n)\n\nfunc main() {\n\n\ttask := &model.Task{\n\t\tId:   \"qiubai\",\n\t\tName: \"qiubai\",\n\t\tRequest: []*model.Request{\n\t\t\t{\n\t\t\t\tMethod: \"get\",\n\t\t\t\tUrl:    \"https://www.qiushibaike.com\",\n\t\t\t},\n\t\t},\n\t\tProcess: []model.Process{\n\t\t\t{\n\t\t\t\tRegUrl: []string{\n\t\t\t\t\t\"/.*?/page/[0-9]+\",\n\t\t\t\t\t\"/hot/|/imgrank/|/text/|/history/|/pic/|/textnew/\",\n\t\t\t\t},\n\t\t\t\tType: \"template\",\n\t\t\t\tTemplateRule: model.TemplateRule{\n\t\t\t\t\tRule: map[string]string{\n\t\t\t\t\t\t\"node\":        \"array|.article\",\n\t\t\t\t\t\t\"url\":         \"attr.href|.contentHerf\",\n\t\t\t\t\t\t\"author\":      \"attr.alt|.author a img\",\n\t\t\t\t\t\t\"content\":     \"text|.content span\",\n\t\t\t\t\t\t\"like_num\":    \"text|.stats-vote i\",\n\t\t\t\t\t\t\"comment_num\": \"text|.stats-comments a i\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t},\n\t\t},\n\t\tPipline: \"file\",\n\t}\n\n\tapp := spider.New()\n\tapp.AddSpider(spider2.InitWithTask(task))\n\tapp.Run()\n}\n"
  },
  {
    "path": "example-spider/qiubai/sem_test.go",
    "content": "package main\n\nimport (\n\t\"fmt\"\n\t\"os\"\n\t\"strings\"\n\t\"time\"\n\n\t\"github.com/tebeka/selenium\"\n)\n\nfunc Example() {\n\t// running).\n\tconst (\n\t\tseleniumPath    = \"vendor/selenium-server-standalone-3.4.jar\"\n\t\tgeckoDriverPath = \"vendor/geckodriver-v0.18.0-linux64\"\n\t\tport            = 8080\n\t)\n\topts := []selenium.ServiceOption{\n\t\tselenium.StartFrameBuffer(),           // Start an X frame buffer for the browser to run in.\n\t\tselenium.GeckoDriver(geckoDriverPath), // Specify the path to GeckoDriver in order to use Firefox.\n\t\tselenium.Output(os.Stderr),            // Output debug information to STDERR.\n\t}\n\tselenium.SetDebug(true)\n\tservice, err := selenium.NewSeleniumService(seleniumPath, port, opts...)\n\tif err != nil {\n\t\tpanic(err) // panic is used only as an example and is not otherwise recommended.\n\t}\n\tdefer service.Stop()\n\n\t// Connect to the WebDriver instance running locally.\n\tcaps := selenium.Capabilities{\"browserName\": \"firefox\"}\n\twd, err := selenium.NewRemote(caps, fmt.Sprintf(\"http://localhost:%d/wd/hub\", port))\n\tif err != nil {\n\t\tpanic(err)\n\t}\n\tdefer wd.Quit()\n\n\t// Navigate to the simple playground interface.\n\tif err := wd.Get(\"http://play.golang.org/?simple=1\"); err != nil {\n\t\tpanic(err)\n\t}\n\n\t// Get a reference to the text box containing code.\n\telem, err := wd.FindElement(selenium.ByCSSSelector, \"#code\")\n\tif err != nil {\n\t\tpanic(err)\n\t}\n\t// Remove the boilerplate code already in the text box.\n\tif err := elem.Clear(); err != nil {\n\t\tpanic(err)\n\t}\n\n\t// Enter some new code in text box.\n\terr = elem.SendKeys(`\n\t\tpackage main\n\t\timport \"fmt\"\n\t\tfunc main() {\n\t\t\tfmt.Println(\"Hello WebDriver!\\n\")\n\t\t}\n\t`)\n\tif err != nil {\n\t\tpanic(err)\n\t}\n\n\t// Click the run button.\n\tbtn, err := wd.FindElement(selenium.ByCSSSelector, \"#run\")\n\tif err != nil {\n\t\tpanic(err)\n\t}\n\tif err := btn.Click(); err != nil 
{\n\t\tpanic(err)\n\t}\n\n\t// Wait for the program to finish running and get the output.\n\toutputDiv, err := wd.FindElement(selenium.ByCSSSelector, \"#output\")\n\tif err != nil {\n\t\tpanic(err)\n\t}\n\n\tvar output string\n\tfor {\n\t\toutput, err = outputDiv.Text()\n\t\tif err != nil {\n\t\t\tpanic(err)\n\t\t}\n\t\tif output != \"Waiting for remote server...\" {\n\t\t\tbreak\n\t\t}\n\t\ttime.Sleep(time.Millisecond * 100)\n\t}\n\n\tfmt.Printf(\"%s\", strings.Replace(output, \"\\n\\n\", \"\\n\", -1))\n\n}\n"
  },
  {
    "path": "example-spider/ttkb/conf.json",
    "content": "{\n  \"name\":\"ttkb-author\",\n  \"version\":\"0.01\",\n  \"work_num\": 50,\n  \"max_wait_num\":200000,\n  \"http_addr\":\"127.0.0.1:7774\",\n  \"etcd\":[\"http://127.0.0.1:2379\"],\n\n  \"mysql\":\"root:123456@tcp(127.0.0.1:3306)/auto_db?charset=utf8\"\n\n}"
  },
  {
    "path": "example-spider/ttkb/main.go",
    "content": "package main\n\nimport (\n\t\"YiSpider/spider\"\n\t\"YiSpider/spider/model\"\n\tspider2 \"YiSpider/spider/spider\"\n\t\"fmt\"\n)\n\nfunc main() {\n\ttypes := \"daily_timeline|kb_video_news|kb_news_bagua|kb_news_qipa|kb_photo_news|kb_news_tech|kb_news_finance|location|kb_news_world|kb_news_movie|kb_news_gaojidi|kb_news_wealth|kb_photo_gif|kb_news_sports|kb_news_mil|kb_news_history|kb_news_nba|kb_news_car|kb_news_chaobao|kb_news_laugh|kb_news_pet|kb_news_science|kb_news_baby|kb_news_astro|kb_news_sex|kb_news_beauty|kb_news_house|kb_news_share|kb_news_rock|kb_news_tfboys|kb_news_augury|kb_news_photography|kb_news_lottery|kb_news_cate|kb_news_julebu|kb_news_travel|kb_news_idea|kb_news_lol|kb_news_erciyuan|kb_news_space|kb_news_game|kb_news_iphone|kb_news_esport|kb_news_health|kb_news_outfit|kb_news_furnishing|kb_news_workout|kb_news_soup|kb_news_run|kb_news_fishing|kb_news_buddism|kb_news_diet|kb_news_football|kb_news_tennis|kb_news_tea|kb_news_yoga|kb_news_plaything|kb_news_watch\"\n\t//types := \"daily_timeline|kb_video_news|kb_news_bagua|kb_news_qipa|kb_photo_news|kb_news_tech|kb_news_finance|location|kb_news_world|kb_news_movie|kb_news_gaojidi|kb_news_wealth|kb_photo_gif|kb_news_sports|kb_news_mil|kb_news_history|kb_news_nba|kb_news_car|kb_news_chaobao|kb_news_laugh|kb_news_pet|kb_news_science|kb_news_baby|kb_news_astro|kb_news_sex|kb_news_beauty|kb_news_house|kb_news_share|kb_news_rock|kb_news_tfboys|kb_news_augury|kb_news_photography|kb_news_lottery|kb_news_cate|kb_news_julebu|kb_news_travel|kb_news_idea|kb_news_lol|kb_news_erciyuan|kb_news_space|kb_news_game|kb_news_iphone|kb_news_esport|kb_news_health|kb_news_outfit|kb_news_furnishing|kb_news_workout|kb_news_soup|kb_news_run|kb_news_fishing|kb_news_buddism|kb_news_diet|kb_news_football|kb_news_tennis|kb_news_tea|kb_news_yoga|kb_news_plaything|kb_news_watch\"\n\n\ttask := &model.Task{\n\t\tId:   \"ttkb-author\",\n\t\tName: \"ttkb-author\",\n\t\tRequest: []*model.Request{\n\t\t\t{\n\t\t\t\tMethod: 
     \"get\",\n\t\t\t\tUrl:         fmt.Sprintf(`http://r.cnews.qq.com/getSubNewsChlidInterest?patchver=4511&mid=fd248c13ee1ce793495484e4cf3250f8ebbb475a&devid=860046037899335&store=60009&screen_height=1920&apptype=android&origin_imei=860046037899335&hw=OnePlus_ONEPLUSA3000&appver=25_areading_4.5.11&appversion=4.5.11&uid=bfa0a264a6547298&screen_width=1080&sceneid=&android_id=bfa0a264a6547298&last_id=20171207A03G7J00&ssid=GeeyueTech_5G&forward=0&IronThroneBuildTime=1512716487405&omgid=e0f7a4180378ba4e5ee80b0820ef5a1744ca0010211815&IronThroneRelBuildTime=415047497&refreshType=normal&qqnetwork=wifi&last_time=&bottom_id=20171207A0BFU500&top_time=1512631500&currentTab=kuaibao&top_id=20171207C0HX4500&is_wap=0&omgbizid=b03081d3f5806f45b65904d08cfad6bc77130080211815&page={1-1000,1}&imsi=460019017167485&lastRefreshTime=&IronThroneRelExecTime=415047499&muid=49887860909485482&activefrom=icon&cachedCount=20&direction=0&sessionid=&chRefreshTimes=0&chlid={%s}&bottom_time=1512603257&IronThroneExecTime=1512716487407&qn-sig=284d6905ece4010e0ebd89dce072b5ee&qn-rid=6e63ca4d-1285-47ee-b95d-0bb49da3ce03`,types),\n\t\t\t\tProcessName: \"ttkblist\",\n\t\t\t},\n\t\t},\n\t\tProcess: []model.Process{\n\t\t\t{\n\t\t\t\tName: \"ttkblist\",\n\t\t\t\tType: \"json\",\n\t\t\t\tJsonRule: model.JsonRule{\n\t\t\t\t\tRule: map[string]string{\n\t\t\t\t\t\t\"node\":   \"array|newslist\",\n\t\t\t\t\t\t\"chlid\":    \"chlid\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t\tAddQueue:[]*model.Request{\n\t\t\t\t\t{\n\t\t\t\t\t\tMethod:  \"get\",\n\t\t\t\t\t\tUrl :    \"http://r.cnews.qq.com/getSubItem?chlid={$chlid}\",\n\t\t\t\t\t\tProcessName: \"author\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t},\n\t\t\t{\n\t\t\t\tName: \"author\",\n\t\t\t\tType: \"json\",\n\t\t\t\tJsonRule: model.JsonRule{\n\t\t\t\t\tRule: map[string]string{\n\t\t\t\t\t\t//\"node\":   \"\",\n\t\t\t\t\t\t\"chlid\":    
\"channelInfo.chlid\",\n\t\t\t\t\t\t\"chlname\":\"channelInfo.chlname\",\n\t\t\t\t\t\t\"desc\":\"channelInfo.desc\",\n\t\t\t\t\t\t\"subCount\":\"channelInfo.subCount\",\n\t\t\t\t\t\t\"uin\":\"channelInfo.uin\",\n\t\t\t\t\t\t\"intro\":\"channelInfo.intro\",\n\t\t\t\t\t\t\"recommend\":\"channelInfo.recommend\",\n\t\t\t\t\t\t\"followCount\":\"channelInfo.followCount\",\n\t\t\t\t\t\t\"readCount\":\"channelInfo.readCount\",\n\t\t\t\t\t\t\"shareCount\":\"channelInfo.shareCount\",\n\t\t\t\t\t\t\"colCount\":\"channelInfo.colCount\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t},\n\t\t},\n\n\t\tPipline: \"mysql\",\n\t}\n\n\tapp := spider.New()\n\tapp.AddSpider(spider2.InitWithTask(task))\n\tapp.Run()\n}\n"
  },
  {
    "path": "example-spider/ttkb-author/conf.json",
    "content": "{\n  \"name\":\"ttkb-author\",\n  \"version\":\"0.01\",\n  \"work_num\": 50,\n  \"max_wait_num\":210000,\n  \"http_addr\":\"127.0.0.1:7774\",\n  \"etcd\":[\"http://127.0.0.1:2379\"],\n\n  \"mysql\":\"root:123456@tcp(127.0.0.1:3306)/auto_db?charset=utf8\"\n\n}"
  },
  {
    "path": "example-spider/ttkb-author/main.go",
    "content": "package main\n\nimport (\n\t\"YiSpider/spider\"\n\t\"YiSpider/spider/model\"\n\tspider2 \"YiSpider/spider/spider\"\n)\n\nfunc main() {\n\n\ttask := &model.Task{\n\t\tId:   \"ttkb-author\",\n\t\tName: \"ttkb-author\",\n\t\tRequest: []*model.Request{\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         `http://r.cnews.qq.com/getSubItem?chlid={6000000-6200000,1}`,\n\t\t\t\tProcessName: \"ttkb-author\",\n\t\t\t},\n\t\t},\n\t\tProcess: []model.Process{\n\t\t\t{\n\t\t\t\tName: \"ttkb-author\",\n\t\t\t\tType: \"json\",\n\t\t\t\tJsonRule: model.JsonRule{\n\t\t\t\t\tRule: map[string]string{\n\t\t\t\t\t\t\"chlid\":       \"channelInfo.chlid\",\n\t\t\t\t\t\t\"chlname\":     \"channelInfo.chlname\",\n\t\t\t\t\t\t\"desc\":        \"channelInfo.desc\",\n\t\t\t\t\t\t\"subCount\":    \"channelInfo.subCount\",\n\t\t\t\t\t\t\"uin\":         \"channelInfo.uin\",\n\t\t\t\t\t\t\"intro\":       \"channelInfo.intro\",\n\t\t\t\t\t\t\"recommend\":   \"channelInfo.recommend\",\n\t\t\t\t\t\t\"followCount\": \"channelInfo.followCount\",\n\t\t\t\t\t\t\"readCount\":   \"channelInfo.readCount\",\n\t\t\t\t\t\t\"shareCount\":  \"channelInfo.shareCount\",\n\t\t\t\t\t\t\"colCount\":    \"channelInfo.colCount\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t},\n\t\t},\n\n\t\tPipline: \"mysql\",\n\t}\n\n\tapp := spider.New()\n\tapp.AddSpider(spider2.InitWithTask(task))\n\tapp.Run()\n}\n"
  },
  {
    "path": "example-spider/tuiku/conf.json",
    "content": "{\n  \"name\":\"tuiku_spider\",\n  \"version\":\"0.01\",\n  \"work_num\": 2,\n  \"max_wait_num\":12000,\n  \"http_addr\":\"127.0.0.1:7774\",\n  \"etcd\":[\"http://127.0.0.1:2379\"]\n}"
  },
  {
    "path": "example-spider/tuiku/main.go",
    "content": "package main\n\nimport (\n\t\"YiSpider/spider\"\n\t\"YiSpider/spider/model\"\n\tspider2 \"YiSpider/spider/spider\"\n)\n\nfunc main() {\n\n\ttask := &model.Task{\n\t\tId:   \"tuiku\",\n\t\tName: \"tuiku\",\n\t\tRequest: []*model.Request{\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         \"http://www.tuicool.com/ah/0/{1-100,1}?lang=1\",\n\t\t\t\tProcessName: \"tuikulist\",\n\t\t\t},\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         \"http://www.tuicool.com/ah/101000000/{1-100,1}?lang=1\",\n\t\t\t\tProcessName: \"tuikulist\",\n\t\t\t},\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         \"http://www.tuicool.com/ah/101040000/{1-100,1}?lang=1\",\n\t\t\t\tProcessName: \"tuikulist\",\n\t\t\t},\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         \"http://www.tuicool.com/ah/101050000/{1-100,1}?lang=1\",\n\t\t\t\tProcessName: \"tuikulist\",\n\t\t\t},\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         \"http://www.tuicool.com/ah/20/{1-100,1}?lang=1\",\n\t\t\t\tProcessName: \"tuikulist\",\n\t\t\t},\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         \"http://www.tuicool.com/ah/108000000/{1-100,1}?lang=1\",\n\t\t\t\tProcessName: \"tuikulist\",\n\t\t\t},\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         \"http://www.tuicool.com/ah/114000000/{1-100,1}?lang=1\",\n\t\t\t\tProcessName: \"tuikulist\",\n\t\t\t},\n\t\t},\n\t\tProcess: []model.Process{\n\t\t\t{\n\t\t\t\tName: \"tuikulist\",\n\t\t\t\tType: \"template\",\n\t\t\t\tTemplateRule: model.TemplateRule{\n\t\t\t\t\tRule: map[string]string{\n\t\t\t\t\t\t\"node\":   \"array|.list_article_item\",\n\t\t\t\t\t\t\"img\":    \"attr.src|.article_thumb_image img\",\n\t\t\t\t\t\t\"title\":  \"text|.title a\",\n\t\t\t\t\t\t\"author\": \"text|.tip span:nth-child(1)\",\n\t\t\t\t\t\t\"time\":   \"text|.tip span:nth-child(3)\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t},\n\t\t},\n\n\t\tPipline: \"file\",\n\t}\n\n\tapp := 
spider.New()\n\tapp.AddSpider(spider2.InitWithTask(task))\n\tapp.Run()\n}\n"
  },
  {
    "path": "example-spider/wangyi-music/conf.json",
    "content": "{\n  \"name\":\"music_spider\",\n  \"version\":\"0.01\",\n  \"work_num\": 100,\n  \"max_wait_num\":100000,\n  \"http_addr\":\"127.0.0.1:7775\",\n\n  \"mysql\":\"root:123456@tcp(127.0.0.1:3306)/auto_db?charset=utf8\"\n}"
  },
  {
    "path": "example-spider/wangyi-music/main.go",
    "content": "package main\n\nimport (\n\t\"YiSpider/spider\"\n\t\"YiSpider/spider/model\"\n\tspider2 \"YiSpider/spider/spider\"\n\t\"fmt\"\n\t\"strings\"\n)\n\nfunc main() {\n\tmusicType := `华语| 欧美| 日语| 韩语| 粤语| 小语种|\n\t流行| 摇滚| 民谣| 电子| 舞曲| 说唱| 轻音乐| 爵士| 乡村| R&B/Soul| 古典| 民族| 英伦| 金属| 朋克| 蓝调| 雷鬼| 世界音乐| 拉丁| 另类/独立| New Age| 古风| 后摇| Bossa Nova|\n\t清晨| 夜晚| 学习| 工作| 午休| 下午茶| 地铁| 驾车| 运动| 旅行| 散步| 酒吧|\n\t怀旧| 清新| 浪漫| 性感| 伤感| 治愈| 放松| 孤独| 感动| 兴奋| 快乐| 安静| 思念|\n\t影视原声| ACG| 校园| 游戏| 70后| 80后| 90后| 网络歌曲| KTV| 经典| 翻唱| 吉他| 钢琴| 器乐| 儿童| 榜单| 00后|`\n\n\tmusicType = strings.Replace(musicType, \" \", \"\", -1)\n\tmusicTypes := strings.Split(musicType, \"|\")\n\treqs := []*model.Request{}\n\n\tfor _, ty := range musicTypes {\n\t\treqs = append(reqs, &model.Request{\n\t\t\tMethod:      \"get\",\n\t\t\tUrl:         fmt.Sprintf(\"http://music.163.com/discover/playlist/?order=hot&cat=%s&limit=35&offset={0-1440,35}\", ty),\n\t\t\tProcessName: \"music-list\",\n\t\t})\n\t}\n\ttask := &model.Task{\n\t\tId:      \"music-list\",\n\t\tName:    \"music-list\",\n\t\tRequest: reqs,\n\t\tProcess: []model.Process{\n\t\t\t{\n\t\t\t\tName: \"music-list\",\n\t\t\t\tType: \"template\",\n\t\t\t\tTemplateRule: model.TemplateRule{\n\t\t\t\t\tRule: map[string]string{\n\t\t\t\t\t\t\"node\":       \"array|.m-cvrlst li\",\n\t\t\t\t\t\t\"img\":        \"attr.src|.u-cover img\",\n\t\t\t\t\t\t\"music_addr\": \"attr.href|.u-cover a\",\n\t\t\t\t\t\t\"title\":      \"attr.title|.u-cover a\",\n\t\t\t\t\t\t\"play_num\":   \"text|.nb\",\n\t\t\t\t\t\t\"author\":     \"text|.nm\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t\tAddQueue: []*model.Request{\n\t\t\t\t\t{\n\t\t\t\t\t\tMethod:      \"get\",\n\t\t\t\t\t\tUrl:         \"http://music.163.com{$music_addr}\",\n\t\t\t\t\t\tProcessName: \"music-detail\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t},\n\t\t\t{\n\t\t\t\tName: \"music-detail\",\n\t\t\t\tType: \"template\",\n\t\t\t\tTemplateRule: model.TemplateRule{\n\t\t\t\t\tRule: map[string]string{\n\t\t\t\t\t\t\"img\":         \"attr.src|.u-cover 
img\",\n\t\t\t\t\t\t\"title\":       \"text|.f-ff2\",\n\t\t\t\t\t\t\"play_num\":    \"text|#play-count\",\n\t\t\t\t\t\t\"author\":      \"text|.s-fc7\",\n\t\t\t\t\t\t\"like_num\":    \"text|.u-btni-fav i\",\n\t\t\t\t\t\t\"share_num\":   \"text|.u-btni-share i\",\n\t\t\t\t\t\t\"comment_num\": \"text|#cnt_comment_count\",\n\t\t\t\t\t\t\"desc\":        \"text|#album-desc-dot\",\n\t\t\t\t\t\t\"time\":        \"text|.time\",\n\t\t\t\t\t\t\"music_count\": \"text|#playlist-track-count\",\n\t\t\t\t\t\t\"id\":          \"attr.data-rid|#content-operation\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t},\n\t\t},\n\t}\n\n\tapp := spider.New()\n\tapp.AddSpider(spider2.InitWithTask(task))\n\tapp.Run()\n}\n"
  },
  {
    "path": "example-spider/wangyi-music/music/conf.json",
    "content": "{\n  \"name\":\"music_spider\",\n  \"version\":\"0.01\",\n  \"work_num\": 100,\n  \"max_wait_num\":100000,\n  \"http_addr\":\"127.0.0.1:7775\",\n  \"etcd\":[\"http://127.0.0.1:2379\"],\n\n  \"mysql\":\"root:123456@tcp(127.0.0.1:3306)/auto_db?charset=utf8\"\n}"
  },
  {
    "path": "example-spider/woshipm/conf.json",
    "content": "{\n  \"name\":\"woshipm_spider\",\n  \"version\":\"0.01\",\n  \"work_num\": 100,\n  \"max_wait_num\":12000,\n  \"http_addr\":\"127.0.0.1:7774\",\n  \"etcd\":[\"http://127.0.0.1:2379\"]\n\n}"
  },
  {
    "path": "example-spider/woshipm/main.go",
    "content": "package main\n\nimport (\n\t\"YiSpider/spider\"\n\t\"YiSpider/spider/model\"\n\tspider2 \"YiSpider/spider/spider\"\n)\n\nfunc main() {\n\n\ttask := &model.Task{\n\t\tId:   \"woshipm\",\n\t\tName: \"woshipm\",\n\t\tRequest: []*model.Request{\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         \"http://www.woshipm.com/category/pd/page/{1-588,1}\",\n\t\t\t\tProcessName: \"woshipm-list\",\n\t\t\t},\n\t\t},\n\t\tProcess: []model.Process{\n\t\t\t{\n\t\t\t\tName: \"woshipm-list\",\n\t\t\t\tType: \"template\",\n\t\t\t\tTemplateRule: model.TemplateRule{\n\t\t\t\t\tRule: map[string]string{\n\t\t\t\t\t\t\"node\":     \"array|.postlist-item\",\n\t\t\t\t\t\t\"img\":      \"attr.src|.post-img a img\",\n\t\t\t\t\t\t\"time\":     \"text|.stream-list-meta time\",\n\t\t\t\t\t\t\"title\":    \"text|.post-title a\",\n\t\t\t\t\t\t\"author\":   \"text|.author a\",\n\t\t\t\t\t\t\"des\": \"text|.des\",\n\t\t\t\t\t\t\"read_num\":   \"text|.post-meta-items span:nth-child(1)\",\n\t\t\t\t\t\t\"collect_num\":   \"text|.post-meta-items span:nth-child(2)\",\n\t\t\t\t\t\t\"like_num\":   \"text|.post-meta-items span:nth-child(3)\",\n\t\t\t\t\t},\n\t\t\t\t},\n\t\t\t},\n\t\t},\n\n\t\tPipline: \"file\",\n\t}\n\n\tapp := spider.New()\n\tapp.AddSpider(spider2.InitWithTask(task))\n\tapp.Run()\n}\n"
  },
  {
    "path": "manage/conf.json",
    "content": "{\n  \"name\":\"yi_spider_manage\",\n  \"version\":\"0.01\",\n  \"discover\":\"etcd\",\n  \"http_addr\":\"127.0.0.1:7778\",\n  \"etcd\":[\"http://127.0.0.1:2379\"]\n}"
  },
  {
    "path": "manage/config/config.go",
    "content": "package config\n\nimport (\n\t\"YiSpider/manage/logger\"\n\t\"encoding/json\"\n\t\"io/ioutil\"\n\t\"os\"\n)\n\nvar ConfigI *Config\n\ntype Config struct {\n\tName    string `json:\"name\"`\n\tVersion string `json:\"version\"`\n\n\tDiscover string   `json:\"discover\"`\n\tHttpAddr string   `json:\"http_addr\"`\n\tEtcd     []string `json:\"etcd\"`\n}\n\nfunc InitConfig() error {\n\tvar file *os.File\n\tvar bytes []byte\n\tvar err error\n\n\tif file, err = os.OpenFile(\"./conf.json\", os.O_RDONLY, 0666); err != nil {\n\t\treturn err\n\t}\n\n\tif bytes, err = ioutil.ReadAll(file); err != nil {\n\t\treturn err\n\t}\n\n\tConfigI = &Config{}\n\tif err = json.Unmarshal(bytes, ConfigI); err != nil {\n\t\treturn err\n\t}\n\n\tlogger.Info(\"init success \", *ConfigI)\n\treturn nil\n}\n"
  },
  {
    "path": "manage/discover/discover.go",
    "content": "package discover\n\nimport (\n\t\"YiSpider/manage/config\"\n\t\"YiSpider/manage/discover/etcd\"\n\t\"YiSpider/manage/model\"\n)\n\ntype Discover interface {\n\tGetNodes() map[string]*model.Node\n\tStart() error\n}\n\nvar DiscoverI Discover\n\nfunc InitDiscover() error {\n\tvar err error\n\tswitch config.ConfigI.Discover {\n\tcase \"etcd\":\n\t\tDiscoverI, err = etcd.NewCluster(config.ConfigI.Etcd)\n\t\tif err != nil {\n\t\t\treturn err\n\t\t}\n\t\tDiscoverI.Start()\n\t}\n\treturn nil\n}\n\nfunc GetNodes() map[string]*model.Node {\n\tif DiscoverI != nil {\n\t\treturn DiscoverI.GetNodes()\n\t}\n\treturn nil\n}\n"
  },
  {
    "path": "manage/discover/etcd/etcd.go",
    "content": "package etcd\n\nimport (\n\t\"encoding/json\"\n\t\"time\"\n\n\t\"YiSpider/manage/logger\"\n\t\"YiSpider/manage/model\"\n\t\"fmt\"\n\t\"github.com/coreos/etcd/client\"\n\t\"golang.org/x/net/context\"\n\t\"log\"\n)\n\ntype Cluster struct {\n\tnodes   map[string]*model.Node\n\tKeysAPI client.KeysAPI\n}\n\nfunc NewCluster(endpoints []string) (*Cluster, error) {\n\tcfg := client.Config{\n\t\tEndpoints:               endpoints,\n\t\tTransport:               client.DefaultTransport,\n\t\tHeaderTimeoutPerRequest: time.Second,\n\t}\n\n\tetcdClient, err := client.New(cfg)\n\tif err != nil {\n\t\tlogger.Error(\"Error: cannot connec to etcd:\", err)\n\t\treturn nil, err\n\t}\n\n\tmaster := &Cluster{\n\t\tnodes:   make(map[string]*model.Node),\n\t\tKeysAPI: client.NewKeysAPI(etcdClient),\n\t}\n\treturn master, nil\n}\n\nfunc (c *Cluster) Start() error {\n\tgo c.WatchWorkers()\n\tfmt.Println(\"Master Start ...\")\n\treturn nil\n}\n\nfunc (c *Cluster) GetNodes() map[string]*model.Node {\n\tfmt.Println(\"c.nodes\", c.nodes)\n\treturn c.nodes\n}\n\nfunc (c *Cluster) addWorker(info *model.WorkerInfo) {\n\tnode := &model.Node{\n\t\tIsHealth:   true,\n\t\tIP:         info.IP,\n\t\tName:       info.Name,\n\t\tCPU:        info.CPU,\n\t\tMetaData:   info.MetaData,\n\t\tSpiderData: info.SpiderData,\n\t}\n\tc.nodes[node.Name] = node\n}\n\nfunc (c *Cluster) updateWorker(info *model.WorkerInfo) {\n\tc.addWorker(info)\n}\n\nfunc unmarshal(node *client.Node) *model.WorkerInfo {\n\tlogger.Info(node.Value)\n\tinfo := &model.WorkerInfo{}\n\terr := json.Unmarshal([]byte(node.Value), info)\n\tif err != nil {\n\t\tlogger.Error(err)\n\t}\n\treturn info\n}\n\nfunc (c *Cluster) WatchWorkers() {\n\tapi := c.KeysAPI\n\twatcher := api.Watcher(\"spiders/\", &client.WatcherOptions{\n\t\tRecursive: true,\n\t})\n\tfor {\n\t\tres, err := watcher.Next(context.Background())\n\t\tif err != nil {\n\t\t\tlogger.Error(\"Error watch workers:\", err)\n\t\t\tbreak\n\t\t}\n\t\tif res.Action == \"expire\" 
{\n\t\t\tinfo := unmarshal(res.PrevNode)\n\t\t\tlogger.Info(\"Expire worker \", info.Name)\n\t\t\tmember, ok := c.nodes[info.Name]\n\t\t\tif ok {\n\t\t\t\tmember.IsHealth = false\n\t\t\t}\n\t\t} else if res.Action == \"set\" {\n\t\t\tinfo := unmarshal(res.Node)\n\t\t\tif _, ok := c.nodes[info.Name]; ok {\n\t\t\t\tlogger.Info(\"Update worker \", info.Name)\n\t\t\t\tc.updateWorker(info)\n\t\t\t} else {\n\t\t\t\tlogger.Info(\"Add worker \", info.Name)\n\t\t\t\tc.addWorker(info)\n\t\t\t}\n\t\t} else if res.Action == \"delete\" {\n\t\t\tinfo := unmarshal(res.Node)\n\t\t\tlog.Println(\"Delete worker \", info.Name)\n\t\t\tdelete(c.nodes, info.Name)\n\t\t}\n\t}\n}\n"
  },
  {
    "path": "manage/discover/file/file.go",
    "content": "package file\n\nfunc init() {\n\n}\n"
  },
  {
    "path": "manage/discover/zookeeper/zookeeper.go",
    "content": "package zookeeper\n\nfunc init() {\n\n}\n"
  },
  {
    "path": "manage/http/controller.go",
    "content": "package http\n\nimport (\n\t\"YiSpider/manage/model\"\n\t\"encoding/json\"\n\t\"io/ioutil\"\n\t\"net/http\"\n\t\"net/url\"\n)\n\nvar errorMethod = []byte(\"{\\\"code\\\":\\\"400\\\",\\\"msg\\\":\\\"not support method\\\"}\")\nvar errorQuery = []byte(\"{\\\"code\\\":\\\"400\\\",\\\"msg\\\":\\\"error url parmas\\\"}\")\nvar errorBody = []byte(\"{\\\"code\\\":\\\"400\\\",\\\"msg\\\":\\\"error get body\\\"}\")\nvar errorJson = []byte(\"{\\\"code\\\":\\\"400\\\",\\\"msg\\\":\\\"error get Json\\\"}\")\nvar commonSuccess = []byte(\"{\\\"code\\\":\\\"200\\\",\\\"msg\\\":\\\"success\\\"}\")\n\nfunc AddTask(w http.ResponseWriter, req *http.Request) {\n\tif req.Method != \"POST\" {\n\t\tw.Write(errorMethod)\n\t\treturn\n\t}\n\tbody, err := ioutil.ReadAll(req.Body)\n\tif err != nil {\n\t\tw.Write([]byte(err.Error()))\n\t\treturn\n\t}\n\ttask := &model.Task{}\n\terr = json.Unmarshal(body, task)\n\tif err != nil {\n\t\tw.Write([]byte(err.Error()))\n\t\treturn\n\t}\n\tdata, err := AddTaskS(task)\n\tif err != nil {\n\t\tw.Write([]byte(err.Error()))\n\t\treturn\n\t}\n\n\tw.Write(data)\n}\n\nfunc StopTask(w http.ResponseWriter, req *http.Request) {\n\tif req.Method != \"GET\" {\n\t\tw.Write(errorMethod)\n\t\treturn\n\t}\n\n\tqueryMap, err := url.ParseQuery(req.URL.RawQuery)\n\tif err != nil {\n\t\tw.Write(errorQuery)\n\t\treturn\n\t}\n\tname := queryMap.Get(\"name\")\n\n\tdata, err := StopTaskS(name)\n\tif err != nil {\n\t\tw.Write([]byte(err.Error()))\n\t\treturn\n\t}\n\n\tw.Write(data)\n}\n\nfunc RunTask(w http.ResponseWriter, req *http.Request) {\n\tif req.Method != \"GET\" {\n\t\tw.Write(errorMethod)\n\t\treturn\n\t}\n\n\tqueryMap, err := url.ParseQuery(req.URL.RawQuery)\n\tif err != nil {\n\t\tw.Write(errorQuery)\n\t\treturn\n\t}\n\tname := queryMap.Get(\"name\")\n\n\tdata, err := RunTaskS(name)\n\tif err != nil {\n\t\tw.Write([]byte(err.Error()))\n\t\treturn\n\t}\n\n\tw.Write(data)\n}\n\nfunc EndTask(w http.ResponseWriter, req *http.Request) {\n\tif req.Method 
!= \"GET\" {\n\t\tw.Write(errorMethod)\n\t\treturn\n\t}\n\n\tqueryMap, err := url.ParseQuery(req.URL.RawQuery)\n\tif err != nil {\n\t\tw.Write(errorQuery)\n\t\treturn\n\t}\n\tname := queryMap.Get(\"name\")\n\n\tdata, err := EndTaskS(name)\n\tif err != nil {\n\t\tw.Write([]byte(err.Error()))\n\t\treturn\n\t}\n\n\tw.Write(data)\n}\n\nfunc ListTask(w http.ResponseWriter, req *http.Request) {\n\tif req.Method != \"GET\" {\n\t\tw.Write(errorMethod)\n\t\treturn\n\t}\n\n\tqueryMap, err := url.ParseQuery(req.URL.RawQuery)\n\tif err != nil {\n\t\tw.Write(errorQuery)\n\t\treturn\n\t}\n\tname := queryMap.Get(\"name\")\n\n\tdata, err := ListTaskS(name)\n\tif err != nil {\n\t\tw.Write([]byte(err.Error()))\n\t\treturn\n\t}\n\n\tw.Write(data)\n}\n\nfunc ListNode(w http.ResponseWriter, req *http.Request) {\n\tif req.Method != \"GET\" {\n\t\tw.Write(errorMethod)\n\t\treturn\n\t}\n\n\tdata, err := ListNodesS()\n\tif err != nil {\n\t\tw.Write([]byte(err.Error()))\n\t\treturn\n\t}\n\n\tw.Write(data)\n}\n"
  },
  {
    "path": "manage/http/request.go",
    "content": "package http\n\nimport (\n\t\"bytes\"\n\t\"encoding/json\"\n\t\"fmt\"\n\t\"golang.org/x/net/publicsuffix\"\n\t\"io/ioutil\"\n\t\"net/http\"\n\t\"net/http/cookiejar\"\n\t\"time\"\n)\n\nvar ClientI *http.Client\n\nfunc init() {\n\tClientI = MakeClient(nil)\n}\n\nfunc makeCookiejar() http.CookieJar {\n\tcookiejarOptions := cookiejar.Options{\n\t\tPublicSuffixList: publicsuffix.List,\n\t}\n\tjar, _ := cookiejar.New(&cookiejarOptions)\n\n\treturn jar\n}\n\nfunc MakeClient(transport http.RoundTripper) *http.Client {\n\treturn &http.Client{Jar: makeCookiejar(), Transport: transport, Timeout: 5 * time.Second}\n}\n\nfunc Get(url string) ([]byte, error) {\n\tres, err := DoRequest(\"GET\", url, nil)\n\tif err != nil {\n\t\treturn []byte{}, err\n\t}\n\tvar body []byte\n\tif body, err = ioutil.ReadAll(res.Body); err != nil {\n\t\treturn []byte{}, err\n\t}\n\tfmt.Println(\"GET\", url, \" =>\", string(body))\n\n\treturn body, nil\n}\n\nfunc Post(url string, data interface{}) ([]byte, error) {\n\tdataJ, err := json.Marshal(data)\n\tif err != nil {\n\t\treturn []byte{}, err\n\t}\n\tfmt.Println(\"Request:\", string(dataJ))\n\tres, err := DoRequest(\"POST\", url, dataJ)\n\tif err != nil {\n\t\treturn []byte{}, err\n\t}\n\tvar body []byte\n\tif body, err = ioutil.ReadAll(res.Body); err != nil {\n\t\treturn []byte{}, err\n\t}\n\n\treturn body, nil\n}\n\nfunc DoRequest(method string, url string, data []byte) (resp *http.Response, err error) {\n\treq, err := http.NewRequest(method, url, bytes.NewBuffer(data))\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\treq.Header.Set(\"Content-Type\", \"application/json\")\n\treturn ClientI.Do(req)\n}\n"
  },
  {
    "path": "manage/http/server.go",
    "content": "package http\n\nimport (\n\t\"YiSpider/manage/config\"\n\t\"YiSpider/manage/logger\"\n\t\"net/http\"\n)\n\nfunc InitHttpServer() {\n\n\thttp.HandleFunc(\"/task/add\", AddTask)\n\thttp.HandleFunc(\"/task/run\", RunTask)\n\thttp.HandleFunc(\"/task/stop\", StopTask)\n\thttp.HandleFunc(\"/task/end\", EndTask)\n\thttp.HandleFunc(\"/tasks\", ListTask)\n\thttp.HandleFunc(\"/nodes\", ListNode)\n\n\terr := http.ListenAndServe(config.ConfigI.HttpAddr, nil)\n\tif err != nil {\n\t\tlogger.Error(\"ListenAndServe fail:\", err)\n\t}\n}\n"
  },
  {
    "path": "manage/http/service.go",
    "content": "package http\n\nimport (\n\t\"YiSpider/manage/discover\"\n\t\"YiSpider/manage/model\"\n\t\"YiSpider/manage/strategy\"\n\t\"encoding/json\"\n\t\"fmt\"\n)\n\nfunc AddTaskS(task *model.Task) ([]byte, error) {\n\tnode := strategy.GetNode()\n\treturn Post(getUrl(node.IP, \"/task/add\"), task)\n}\n\nfunc RunTaskS(name string) ([]byte, error) {\n\tnode := strategy.GetNode()\n\treturn Get(getUrl(node.IP, \"/task/run?name=\"+name))\n}\n\nfunc StopTaskS(name string) ([]byte, error) {\n\tnode := strategy.GetNode()\n\treturn Get(getUrl(node.IP, \"/task/stop?name=\"+name))\n}\n\nfunc EndTaskS(name string) ([]byte, error) {\n\tnode := strategy.GetNode()\n\treturn Get(getUrl(node.IP, \"/task/end?name=\"+name))\n}\n\nfunc ListTaskS(name string) ([]byte, error) {\n\tfmt.Println(\"name\", name, \"nodes\", discover.GetNodes())\n\tnode := discover.GetNodes()[name]\n\treturn Get(getUrl(node.IP, \"/tasks\"))\n}\n\nfunc ListNodesS() ([]byte, error) {\n\tnodes := discover.GetNodes()\n\treturn json.Marshal(nodes)\n}\n\nfunc getUrl(ip string, path string) string {\n\turl := fmt.Sprintf(\"http://%s%s\", ip, path)\n\treturn url\n}\n"
  },
  {
    "path": "manage/logger/logger.go",
    "content": "package logger\n\nimport \"fmt\"\n\nfunc Info(v ...interface{}) {\n\tfmt.Println(v)\n}\n\nfunc Debug(v ...interface{}) {\n\tfmt.Println(v)\n}\n\nfunc Warn(v ...interface{}) {\n\tfmt.Println(v)\n}\n\nfunc Error(v ...interface{}) {\n\tfmt.Println(v)\n}\n"
  },
  {
    "path": "manage/main.go",
    "content": "package main\n\nimport (\n\t\"YiSpider/manage/config\"\n\t\"YiSpider/manage/discover\"\n\t\"YiSpider/manage/http\"\n\t\"YiSpider/manage/logger\"\n)\n\nfunc main() {\n\n\tvar err error\n\n\tif err = config.InitConfig(); err != nil {\n\t\tlogger.Info(err.Error())\n\t\treturn\n\t}\n\n\tdiscover.InitDiscover()\n\n\thttp.InitHttpServer()\n\n}\n"
  },
  {
    "path": "manage/model/node_info.go",
    "content": "package model\n\ntype Node struct {\n\tIsHealth   bool                   `json:\"is_health\"`\n\tIP         string                 `json:\"ip\"`\n\tName       string                 `json:\"name\"`\n\tCPU        int                    `json:\"cpu\"`\n\tMetaData   map[string]string      `json:\"metadata\"`\n\tSpiderData map[string]*SpiderData `json:\"spider_data\"`\n}\n\ntype WorkerInfo struct {\n\tName       string                 `json:\"name\"`\n\tIP         string                 `json:\"ip\"`\n\tCPU        int                    `json:\"cpu\"`\n\tMetaData   map[string]string      `json:\"metadata\"`\n\tSpiderData map[string]*SpiderData `json:\"spider_data\"`\n}\n\ntype SpiderData struct {\n\tDownloadFailCount int32 `json:\"download_fail_count\"`\n\tDownloadCount     int32 `json:\"download_count\"`\n\tUrlNum            int32 `json:\"url_num\"`\n\tWaitUrlNum        int   `json:\"wait_url_num\"`\n\tCrawlerResultNum  int32 `json:\"crawler_result_num\"`\n}\n"
  },
  {
    "path": "manage/model/task.go",
    "content": "package model\n\ntype Task struct {\n\tId     string `json:\"id\"`\n\tName   string `json:\"name\"`\n\tUrl    string `json:\"url\"`\n\tHost   string `json:\"host\"`\n\tMethod string `json:\"method\"`\n\n\tHeader  map[string]string `json:\"header\"`\n\tCookies Cookies           `json:\"cookies\"`\n\tProxys  []string          `json:\"proxys\"`\n\n\tRequestBody RequestBody `json:\"request_body\"`\n\n\tProcess Process `json:\"process\"`\n\n\tDepth    int `json:\"depth\"`\n\tEndCount int `json:\"end_count\"`\n\n\tPipline string `json:\"pipline\"`\n}\n\ntype RequestBody struct {\n\tType string            `json:\"type\"` // json urlencode form\n\tData map[string]string `json:\"data\"`\n}\n\ntype Cookies struct {\n\tUrl  string `json:\"url\"`\n\tData string `json:\"data\"`\n}\n\ntype Process struct {\n\tUrl          string\n\tRegUrl       []string\n\tType         string       `json:\"type\"` // template json self_process\n\tTemplateRule TemplateRule `json:\"template_rule\"`\n\tJsonRule     JsonRule     `json:\"json_rule\"`\n}\n\ntype TemplateRule struct {\n\tRule map[string]string\n}\n\ntype JsonRule struct {\n\tRule map[string]interface{}\n}\n"
  },
  {
    "path": "manage/schedule/request.go",
    "content": "package schedule\n\nimport (\n\t\"bytes\"\n\t\"encoding/json\"\n\t\"fmt\"\n\t\"golang.org/x/net/publicsuffix\"\n\t\"io/ioutil\"\n\t\"net/http\"\n\t\"net/http/cookiejar\"\n\t\"testing\"\n\t\"time\"\n)\n\nvar ClientI *http.Client\n\nfunc init() {\n\tClientI = MakeClient(nil)\n}\n\nfunc makeCookiejar() http.CookieJar {\n\tcookiejarOptions := cookiejar.Options{\n\t\tPublicSuffixList: publicsuffix.List,\n\t}\n\tjar, _ := cookiejar.New(&cookiejarOptions)\n\n\treturn jar\n}\n\nfunc MakeClient(transport http.RoundTripper) *http.Client {\n\treturn &http.Client{Jar: makeCookiejar(), Transport: transport, Timeout: 60 * time.Second}\n}\n\nfunc Get(url string) ([]byte, error) {\n\tres, err := DoRequest(\"GET\", url, nil)\n\tif err != nil {\n\t\treturn []byte{}, err\n\t}\n\tvar body []byte\n\tif body, err = ioutil.ReadAll(res.Body); err != nil {\n\t\treturn []byte{}, err\n\t}\n\tfmt.Println(\"GET\", url, \" =>\", string(body))\n\n\treturn body, nil\n}\n\nfunc Post(url string, data interface{}) ([]byte, error) {\n\tdataJ, err := json.Marshal(data)\n\tif err != nil {\n\t\treturn []byte{}, err\n\t}\n\tfmt.Println(\"Request:\", string(dataJ))\n\tres, err := DoRequest(\"POST\", url, dataJ)\n\tif err != nil {\n\t\treturn []byte{}, err\n\t}\n\tvar body []byte\n\tif body, err = ioutil.ReadAll(res.Body); err != nil {\n\t\treturn []byte{}, err\n\t}\n\n\treturn body, nil\n}\n\nfunc DoRequest(method string, url string, data []byte) (resp *http.Response, err error) {\n\treq, err := http.NewRequest(method, url, bytes.NewBuffer(data))\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\treq.Header.Set(\"Content-Type\", \"application/json\")\n\treturn ClientI.Do(req)\n}\n"
  },
  {
    "path": "manage/schedule/schedule.go",
    "content": "package schedule\n\nimport (\n\t\"YiSpider/manage/discover\"\n\t\"YiSpider/manage/model\"\n\t\"YiSpider/manage/strategy\"\n\t\"fmt\"\n)\n\nfunc AddTask(task *model.Task) ([]byte, error) {\n\tnode := strategy.GetNode()\n\treturn Post(getUrl(node.IP, \"/task/add\"), task)\n}\n\nfunc RunTask(name string) ([]byte, error) {\n\tnode := strategy.GetNode()\n\treturn Get(getUrl(node.IP, \"/task/run?name=\"+name))\n}\n\nfunc StopTask(name string) ([]byte, error) {\n\tnode := strategy.GetNode()\n\treturn Get(getUrl(node.IP, \"/task/stop?name=\"+name))\n}\n\nfunc EndTask(name string) ([]byte, error) {\n\tnode := strategy.GetNode()\n\treturn Get(getUrl(node.IP, \"/task/end?name=\"+name))\n}\n\nfunc ListTask(name string) ([]byte, error) {\n\tnode := discover.GetNodes()[name]\n\treturn Get(getUrl(node.IP, \"/task/list\"))\n}\n\nfunc getUrl(ip string, path string) string {\n\turl := fmt.Sprintf(\"http://%s:7777%s\", ip, path)\n\treturn url\n}\n"
  },
  {
    "path": "manage/strategy/rand_strategy.go",
    "content": "package strategy\n\nimport (\n\t\"YiSpider/manage/discover\"\n\t\"YiSpider/manage/model\"\n)\n\nfunc GetNode() *model.Node {\n\tnodes := discover.GetNodes()\n\tfor _, node := range nodes {\n\t\treturn node\n\t}\n\treturn nil\n}\n"
  },
  {
    "path": "manage/task/task.go",
    "content": "package task\n\nfunc init() {\n\n}\n"
  },
  {
    "path": "spider/boot.go",
    "content": "package spider\n\nimport (\n\t\"YiSpider/spider/config\"\n\t\"YiSpider/spider/core\"\n\t\"YiSpider/spider/http\"\n\t\"YiSpider/spider/register/etcd\"\n\t\"YiSpider/spider/spider\"\n)\n\ntype Boot struct {\n\tengine *core.Engine\n}\n\nfunc init() {\n\tvar err error\n\n\tif err = config.InitConfig(); err != nil {\n\t\tpanic(err)\n\t}\n}\n\nfunc New() *Boot {\n\ts := &Boot{}\n\ts.engine = core.New()\n\treturn s\n}\n\nfunc (s *Boot) AddSpider(spider *spider.Spider) *core.Engine {\n\treturn s.engine.AddSpider(spider)\n}\n\nfunc (s *Boot) Run() {\n\n\ts.engine.Run()\n\n\tif len(config.ConfigI.Etcd) > 0{\n\t\tworker := etcd.NewWorker(config.ConfigI.Name, config.ConfigI.HttpAddr, config.ConfigI.Etcd)\n\t\tgo worker.HeartBeat()\n\t}\n\n\thttp.InitHttpServer()\n}\n"
  },
  {
    "path": "spider/common/encode.go",
    "content": "package common\n\nimport (\n\t\"fmt\"\n\n\t\"strings\"\n\n\t\"github.com/saintfish/chardet\"\n\t\"golang.org/x/text/encoding\"\n\t\"golang.org/x/text/encoding/charmap\"\n\t\"golang.org/x/text/encoding/japanese\"\n\t\"golang.org/x/text/encoding/korean\"\n\t\"golang.org/x/text/encoding/simplifiedchinese\"\n\t\"golang.org/x/text/encoding/traditionalchinese\"\n\t\"golang.org/x/text/encoding/unicode\"\n\t\"golang.org/x/text/transform\"\n)\n\nvar (\n\tcharsetDetector  = chardet.NewTextDetector()\n\tcharsetDetectors = map[string]encoding.Encoding{\n\t\t\"Big5\":         traditionalchinese.Big5,\n\t\t\"EUC-JP\":       japanese.EUCJP,\n\t\t\"EUC-KR\":       korean.EUCKR,\n\t\t\"GB-18030\":     simplifiedchinese.GB18030,\n\t\t\"ISO-2022-JP\":  japanese.ISO2022JP,\n\t\t\"ISO-8859-5\":   charmap.ISO8859_5,\n\t\t\"ISO-8859-6\":   charmap.ISO8859_6,\n\t\t\"ISO-8859-7\":   charmap.ISO8859_7,\n\t\t\"ISO-8859-8\":   charmap.ISO8859_8,\n\t\t\"ISO-8859-8-I\": charmap.ISO8859_8I,\n\t\t\"KOI8-R\":       charmap.KOI8R,\n\t\t\"Shift_JIS\":    japanese.ShiftJIS,\n\t\t\"UTF-16BE\":     unicode.UTF16(unicode.BigEndian, unicode.UseBOM),\n\t\t\"UTF-16LE\":     unicode.UTF16(unicode.LittleEndian, unicode.UseBOM),\n\t\t\"windows-1251\": charmap.Windows1251,\n\t\t\"windows-1252\": charmap.Windows1252,\n\t\t\"windows-1253\": charmap.Windows1253,\n\t\t\"windows-1254\": charmap.Windows1254,\n\t\t\"windows-1255\": charmap.Windows1255,\n\t\t\"windows-1256\": charmap.Windows1256,\n\t}\n)\n\nfunc ToUtf8(html []byte) ([]byte, error) {\n\tr, err := charsetDetector.DetectBest(html)\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\tif strings.ToLower(r.Charset) == strings.ToLower(\"UTF-8\") || strings.ToLower(r.Charset) == strings.ToLower(\"ISO-8859-1\") || strings.ToLower(r.Charset) == strings.ToLower(\"Big5\") {\n\t\treturn html, nil\n\t}\n\n\tt, ok := charsetDetectors[r.Charset]\n\tif !ok {\n\t\treturn nil, fmt.Errorf(\n\t\t\t\"could not find charset decoder for 
`%s`\",\n\t\t\tr.Charset)\n\t}\n\n\thtml, _, err = transform.Bytes(t.NewDecoder(), html)\n\treturn html, err\n}\n"
  },
  {
    "path": "spider/common/prase_req.go",
    "content": "package common\n\nimport (\n\t\"YiSpider/spider/model\"\n\t\"encoding/json\"\n\t\"regexp\"\n\t\"strconv\"\n\t\"strings\"\n)\n\nfunc PraseReq(reqs []*model.Request, ctx map[string]interface{}) []*model.Request {\n\tresultsReqs := []*model.Request{}\n\tfor _, req := range reqs {\n\t\tresults, ok := isRuleReq(req, ctx)\n\t\tif ok {\n\t\t\tresultsReqs = append(resultsReqs, results...)\n\t\t} else {\n\t\t\tresultsReqs = append(resultsReqs, req)\n\t\t}\n\t}\n\treturn resultsReqs\n}\n\nfunc FindRule(text string) [][]string {\n\treg := regexp.MustCompile(`{([^}]+)}`)\n\treturn reg.FindAllStringSubmatch(text, -1)\n}\n\nfunc isRuleReq(req *model.Request, ctx map[string]interface{}) ([]*model.Request, bool) {\n\treqs := []*model.Request{req}\n\toutReqs := []*model.Request{}\n\tfinalReqs := []*model.Request{}\n\tisMatch := false\n\n\trules := FindRule(req.Url)\n\tif len(rules) > 0 {\n\t\tisMatch = true\n\t} else {\n\t\treturn nil, false\n\t}\n\n\tif ctx != nil {\n\t\treqs, isMatch = PraseParamCtx(req, rules, ctx)\n\t}\n\tfor _, r := range reqs {\n\t\toutReqs = append(outReqs, PraseOffset(r)...)\n\t}\n\n\tfor _, r := range outReqs {\n\t\tfinalReqs = append(finalReqs, PraseOr(r)...)\n\t}\n\n\tif isMatch {\n\t\treturn finalReqs, true\n\t}\n\n\treturn finalReqs, isMatch\n}\n\n// http://xxxxxxxx.com/abc/{begin-end,offset}/   example:{1-400,10}\nfunc PraseOffset(req *model.Request) []*model.Request {\n\treqs := []*model.Request{}\n\toutrReqs := []*model.Request{}\n\n\trules := FindRule(req.Url)\n\tif len(rules) <= 0 {\n\t\treturn []*model.Request{req}\n\t}\n\n\tvar begin, end, offset int\n\tvar rule string\n\tfor _,rulee :=range rules{\n\t\trule = rulee[1]\n\t\tsp := strings.Split(rule, \",\")\n\n\t\tif len(sp) != 2 {\n\t\t\tcontinue\n\t\t}\n\n\t\trs := strings.Split(sp[0], \"-\")\n\n\t\tvar err error\n\t\tbegin, err = strconv.Atoi(rs[0])\n\t\tend, err = strconv.Atoi(rs[1])\n\t\toffset, err = strconv.Atoi(sp[1])\n\t\tif err != nil {\n\t\t\tcontinue\n\t\t}\n\t\tif 
offset == 0 {\n\t\t\tcontinue\n\t\t}\n\n\t\tbreak\n\t}\n\n\tif begin == 0 && end == 0 && offset == 0{\n\t\treturn []*model.Request{req}\n\t}\n\n\tfor i := begin; i <= end; i = i + offset {\n\t\turl := strings.Replace(req.Url, \"{\"+rule+\"}\", strconv.Itoa(i), 1)\n\t\treq := &model.Request{Url: url, Method: req.Method, ContentType: req.ContentType, Data: req.Data, Header: req.Header, Cookies: req.Cookies, ProcessName: req.ProcessName}\n\t\treqs = append(reqs, req)\n\t}\n\n\tfor _, r := range reqs {\n\t\toutrReqs = append(outrReqs, PraseOffset(r)...)\n\t}\n\n\treturn outrReqs\n}\n\n// http://xxxxxxxx.com/abc/{id1|id2|id3}/\nfunc PraseOr(req *model.Request) []*model.Request {\n\treqs := []*model.Request{}\n\toutrReqs := []*model.Request{}\n\n\trules := FindRule(req.Url)\n\tif len(rules) <= 0 {\n\t\treturn []*model.Request{req}\n\t}\n\truleArray := rules[0]\n\trule := ruleArray[1]\n\tsp := strings.Split(rule, \"|\")\n\tif len(sp) < 2 {\n\t\treturn []*model.Request{req}\n\t}\n\n\tfor _, word := range sp {\n\t\turl := strings.Replace(req.Url, \"{\"+rule+\"}\", word, 1)\n\t\tr := &model.Request{Url: url, Method: req.Method, ContentType: req.ContentType, Data: req.Data, Header: req.Header, Cookies: req.Cookies, ProcessName: req.ProcessName}\n\t\treqs = append(reqs, r)\n\t}\n\n\tfor _, r := range reqs {\n\t\toutrReqs = append(outrReqs, PraseOr(r)...)\n\t}\n\n\treturn outrReqs\n}\n\n// http://xxxxxxxx.com/abc/{name}/{id}/\nfunc PraseParamCtx(req *model.Request, rules [][]string, ctx map[string]interface{}) ([]*model.Request, bool) {\n\treqs := []*model.Request{}\n\treqUrl := req.Url\n\n\tcount := strings.Count(reqUrl, \"$\")\n\tif count <= 0 {\n\t\treturn []*model.Request{req}, false\n\t}\n\n\tfor ctxName, ruleUrl := range ctx {\n\t\turlArray, ok := ruleUrl.([]string)\n\t\tif ok {\n\t\t\tfor _, url := range urlArray {\n\t\t\t\tu := strings.Replace(reqUrl, \"{$\"+url+\"}\", string(url), -1)\n\t\t\t\tu = strings.Replace(reqUrl, \"$\"+url, string(url), -1)\n\t\t\t\tr := 
&model.Request{Url: u, Method: req.Method, ContentType: req.ContentType, Data: req.Data, Header: req.Header, Cookies: req.Cookies, ProcessName: req.ProcessName}\n\t\t\t\tif newCount := strings.Count(u, \"$\"); newCount != count {\n\t\t\t\t\treqUrl = u\n\t\t\t\t\tcount = newCount\n\t\t\t\t\tif count == 0 {\n\t\t\t\t\t\treqs = append(reqs, r)\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t}\n\t\t}\n\n\t\turlStr, ok := ruleUrl.(string)\n\t\tif ok {\n\t\t\turl := strings.Replace(reqUrl, \"{$\"+ctxName+\"}\", string(urlStr), -1)\n\t\t\turl = strings.Replace(url, \"$\"+ctxName, string(urlStr), -1)\n\t\t\tr := &model.Request{Url: url, Method: req.Method, ContentType: req.ContentType, Data: req.Data, Header: req.Header, Cookies: req.Cookies, ProcessName: req.ProcessName}\n\t\t\tif newCount := strings.Count(url, \"$\"); newCount != count {\n\t\t\t\treqUrl = url\n\t\t\t\tcount = newCount\n\t\t\t\tif count == 0 {\n\t\t\t\t\treqs = append(reqs, r)\n\t\t\t\t}\n\t\t\t}\n\t\t}\n\n\t\turlNumber, ok := ruleUrl.(json.Number)\n\t\tif ok {\n\t\t\turl := strings.Replace(reqUrl, \"{$\"+ctxName+\"}\", string(urlNumber), -1)\n\t\t\turl = strings.Replace(url, \"$\"+ctxName, string(urlNumber), -1)\n\t\t\tr := &model.Request{Url: url, Method: req.Method, ContentType: req.ContentType, Data: req.Data, Header: req.Header, Cookies: req.Cookies, ProcessName: req.ProcessName}\n\n\t\t\tif newCount := strings.Count(url, \"$\"); newCount != count {\n\t\t\t\treqUrl = url\n\t\t\t\tcount = newCount\n\t\t\t\tif count == 0 {\n\t\t\t\t\treqs = append(reqs, r)\n\t\t\t\t}\n\t\t\t}\n\t\t}\n\n\t\turlInt, ok := ruleUrl.(int)\n\t\tif ok {\n\t\t\turl := strings.Replace(reqUrl, \"{$\"+ctxName+\"}\", strconv.Itoa(urlInt), -1)\n\t\t\turl = strings.Replace(url, \"$\"+ctxName, strconv.Itoa(urlInt), -1)\n\t\t\tr := &model.Request{Url: url, Method: req.Method, ContentType: req.ContentType, Data: req.Data, Header: req.Header, Cookies: req.Cookies, ProcessName: req.ProcessName}\n\t\t\tif newCount := strings.Count(url, \"$\"); newCount != 
count {\n\t\t\t\treqUrl = url\n\t\t\t\tcount = newCount\n\t\t\t\tif count == 0 {\n\t\t\t\t\treqs = append(reqs, r)\n\t\t\t\t}\n\t\t\t}\n\t\t}\n\n\t}\n\n\treturn reqs, true\n}\n"
  },
  {
    "path": "spider/common/prase_req_test.go",
    "content": "package common\n\nimport (\n\t\"YiSpider/spider/model\"\n\t\"fmt\"\n\t\"testing\"\n)\n\nfunc TestPraseOffset(t *testing.T) {\n\treqs := []*model.Request{\n\t\t{\n\t\t\tMethod:      \"get\",\n\t\t\tUrl:         \"https://movie.douban.com/j/new_search_subjects?sort=T&range=0,10&tags=&start={0-10,1}&last={1|2|3}&page={0-5,1}\",\n\t\t\tProcessName: \"movie\",\n\t\t},\n\t}\n\tresults := PraseReq(reqs, nil)\n\tfor _, result := range results {\n\t\tfmt.Println(result)\n\t}\n}\n\nfunc TestPraseOr(t *testing.T) {\n\treqs := []*model.Request{\n\t\t{\n\t\t\tMethod:      \"get\",\n\t\t\tUrl:         \"https://movie.douban.com/j/new_search_subjects?sort=T&range=0,10&tags=&start={0|20|40}&page={1|2|3}\",\n\t\t\tProcessName: \"movie\",\n\t\t},\n\t}\n\tresults := PraseReq(reqs, nil)\n\tfor _, result := range results {\n\t\tfmt.Println(result)\n\t}\n}\n\nfunc TestPraseParamCtx(t *testing.T) {\n\treqs := []*model.Request{\n\t\t{\n\t\t\tMethod:      \"get\",\n\t\t\tUrl:         \"https://sclub.jd.com/comment/productPageComments.action?productId={$id}&score=0&sortType=5&page={0-$max_page,1}&pageSize=10\",\n\t\t\tProcessName: \"movie\",\n\t\t},\n\t}\n\tresults := PraseReq(reqs, map[string]interface{}{\n\t\t\"id\":       13123123,\n\t\t\"max_page\": 10,\n\t})\n\tfor _, result := range results {\n\t\tfmt.Println(result)\n\t}\n}\n\nfunc TestFindRule(t *testing.T) {\n\turl := `\"https://movie.douban.com/j/new_search_subjects?sort=T&url={1-$count,1}&tags=\"`\n\n\tresults := FindRule(url)\n\tfor _, result := range results {\n\t\tfmt.Println(result)\n\t}\n}\n\nfunc TestOrAndOffset(t *testing.T){\n\tgoodsType := 
`电子书刊|电子书|网络原创|数字杂志|多媒体图书|音像|音乐|影视|教育音像|英文原版|少儿|商务投资|英语学习与考试|文学|传记|励志|文艺|小说|文学|青春文学|传记|艺术|少儿|少儿|0-2岁|3-6岁|7-10岁|11-14岁|人文社科|历史|哲学|国学|政治/军事|法律|人文社科|心理学|文化|社会科学|经管励志|经济|金融与投资|管理|励志与成功|生活|生活|健身与保健|家庭与育儿|旅游|烹饪美食|科技|工业技术|科普读物|建筑|医学|科学与自然|计算机与互联网|电子通信|教育|中小学教辅|教育与考试|外语学习|大中专教材|字典词典|港台图书|艺术/设计/收藏|经济管理|文化/学术|少儿|其他|工具书|杂志/期刊|套装书|手机通讯|手机|对讲机|运营商|合约机|选号中心|装宽带|办套餐|手机配件|移动电源|电池/移动电源|蓝牙耳机|充电器/数据线|苹果周边|手机耳机|手机贴膜|手机存储卡|充电器|数据线|手机保护套|车载配件|iPhone 配件|手机电池|创意配件|便携/无线音响|手机饰品|拍照配件|手机支架|大 家 电|平板电视|空调|冰箱|洗衣机|家庭影院|DVD/电视盒子|迷你音响|冷柜/冰吧|家电配件|功放|回音壁/Soundbar|Hi-Fi专区|电视盒子|酒柜|厨卫大电|燃气灶|油烟机|热水器|消毒柜|洗碗机|厨房小电|料理机|榨汁机|电饭煲|电压力锅|豆浆机|咖啡机|微波炉|电烤箱|电磁炉|面包机|煮蛋器|酸奶机|电炖锅|电水壶/热水瓶|电饼铛|多用途锅|电烧烤炉|果蔬解毒机|其它厨房电器|养生壶/煎药壶|电热饭盒|生活电器|取暖电器|净化器|加湿器|扫地机器人|吸尘器|挂烫机/熨斗|插座|电话机|清洁机|除湿机|干衣机|收录/音机|电风扇|冷风扇|其它生活电器|生活电器配件|净水器|饮水机|个护健康|剃须刀|剃/脱毛器|口腔护理|电吹风|美容器|理发器|卷/直发器|按摩椅|按摩器|足浴盆|血压计|电子秤/厨房秤|血糖仪|体温计|其它健康电器|计步器/脂肪检测仪|五金家装|电动工具|手动工具|仪器仪表|浴霸/排气扇|灯具|LED灯|洁身器|水槽|龙头|淋浴花洒|厨卫五金|家具五金|门铃|电气开关|插座|电工电料|监控安防|电线/线缆|摄影摄像|数码相机|单电/微单相机|单反相机|摄像机|拍立得|运动相机|镜头|户外器材|影棚器材|冲印服务|数码相框|数码配件|存储卡|读卡器|滤镜|闪光灯/手柄|相机包|三脚架/云台|相机清洁/贴膜|机身附件|镜头附件|电池/充电器|移动电源|数码支架|智能设备|智能手环|智能手表|智能眼镜|运动跟踪器|健康监测|智能配饰|智能家居|体感车|其他配件|智能机器人|无人机|影音娱乐|MP3/MP4|智能设备|耳机/耳麦|便携/无线音箱|音箱/音响|高清播放器|收音机|MP3/MP4配件|麦克风|专业音频|苹果配件|电子教育|学生平板|点读机/笔|早教益智|录音笔|电纸书|电子词典|复读机|虚拟商品|延保服务|杀毒软件|积分商品|家纺|桌布/罩件|地毯地垫|沙发垫套/椅垫|床品套件|被子|枕芯|床单被罩|毯子|床垫/床褥|蚊帐|抱枕靠垫|毛巾浴巾|电热毯|窗帘/窗纱|布艺软饰|凉席|灯具|台灯|节能灯|装饰灯|落地灯|应急灯/手电|LED灯|吸顶灯|五金电器|筒灯射灯|吊灯|氛围照明|生活日用|保暖防护|收纳用品|雨伞雨具|浴室用品|缝纫/针织用品|洗晒/熨烫|净化除味|家装软饰|相框/照片墙|装饰字画|节庆饰品|手工/十字绣|装饰摆件|帘艺隔断|墙贴/装饰贴|钟饰|花瓶花艺|香薰蜡烛|创意家居|宠物生活|宠物主粮|宠物零食|医疗保健|家居日用|宠物玩具|出行装备|洗护美容|电脑整机|笔记本|超极本|游戏本|平板电脑|平板电脑配件|台式机|服务器/工作站|笔记本配件|一体机|电脑配件|CPU|主板|显卡|硬盘|SSD固态硬盘|内存|机箱|电源|显示器|刻录机/光驱|散热器|声卡/扩展卡|装机配件|组装电脑|外设产品|移动硬盘|U盘|鼠标|键盘|鼠标垫|摄像头|手写板|硬盘盒|插座|线缆|UPS电源|电脑工具|游戏设备|电玩|电脑清洁|网络仪表仪器|游戏设备|游戏机|游戏耳机|手柄/方向盘|游戏软件|游戏周边|网络产品|路由器|网卡|交换机|网络存储|4G/3G上网|网络盒子|网络配件|办公设备|投影机|投影配件|多功能一体机|打印机|传真设备|验钞/点钞机|扫描设备|复合机|碎纸机|考勤机|收款/POS机|会议音频视频|保险柜|装订/封装机|安防监控|办公家具|白板|文具/耗材|硒鼓/墨粉|墨盒|色带|纸类|办公文具|学生文具|财会用品|文件管理|本册/便签|计算器|笔类|画具画材|刻录碟片/附件|服务产品|上门安装|延保服务|维修保养|电脑软件|京东服务|烹饪锅具|炒锅
|煎锅|压力锅|蒸锅|汤锅|奶锅|锅具套装|煲类|水壶|火锅|刀剪菜板|菜刀|剪刀|刀具套装|砧板|瓜果刀/刨|多功能刀|厨房配件|保鲜盒|烘焙/烧烤|饭盒/提锅|储物/置物架|厨房DIY/小工具|水具酒具|塑料杯|运动水壶|玻璃杯|陶瓷/马克杯|保温杯|保温壶|酒杯/酒具|杯具套装|餐具|餐具套装|碗/碟/盘|筷勺/刀叉|一次性用品|果盘/果篮|酒店用品|自助餐炉|酒店餐具|酒店水具|茶具/咖啡具|整套茶具|茶杯|茶壶|茶盘茶托|茶叶罐|茶具配件|茶宠摆件|咖啡具|其他|清洁用品|纸品湿巾|衣物清洁|清洁工具|驱虫用品|家庭清洁|皮具护理|一次性用品|面部护肤|洁面|乳液面霜|面膜|剃须|套装|精华|眼霜|卸妆|防晒|防晒隔离|T区护理|眼部护理|精华露|爽肤水|身体护理|沐浴|润肤|颈部|手足|纤体塑形|美胸|套装|精油|洗发护发|染发/造型|香薰精油|磨砂/浴盐|手工/香皂|洗发|护发|染发|磨砂膏|香皂|口腔护理|牙膏/牙粉|牙刷/牙线|漱口水|套装|女性护理|卫生巾|卫生护垫|私密护理|脱毛膏|其他|洗发护发|洗发|护发|染发|造型|假发|套装|美发工具|脸部护理|香水彩妆|香水|底妆|腮红|眼影|唇部|美甲|眼线|美妆工具|套装|防晒隔离|卸妆|眉笔|睫毛膏|女装|T恤|衬衫|针织衫|雪纺衫|卫衣|马甲|连衣裙|半身裙|牛仔裤|休闲裤|打底裤|正装裤|小西装|短外套|风衣|毛呢大衣|真皮皮衣|棉服|羽绒服|大码女装|中老年女装|婚纱|打底衫|旗袍/唐装|加绒裤|吊带/背心|羊绒衫|短裤|皮草|礼服|仿皮皮衣|羊毛衫|设计师/潮牌|男装|衬衫|T恤|POLO衫|针织衫|羊绒衫|卫衣|马甲/背心|夹克|风衣|毛呢大衣|仿皮皮衣|西服|棉服|羽绒服|牛仔裤|休闲裤|西裤|西服套装|大码男装|中老年男装|唐装/中山装|工装|真皮皮衣|加绒裤|卫裤/运动裤|短裤|设计师/潮牌|羊毛衫|内衣|文胸|女式内裤|男式内裤|睡衣/家居服|塑身美体|泳衣|吊带/背心|抹胸|连裤袜/丝袜|美腿袜|商务男袜|保暖内衣|情侣睡衣|文胸套装|少女文胸|休闲棉袜 |大码内衣|内衣配件|打底裤袜|打底衫|秋衣秋裤|情趣内衣|洗衣服务|服装洗护|服饰配件|太阳镜|光学镜架/镜片|围巾/手套/帽子套装|袖扣|棒球帽|毛线帽|遮阳帽|老花镜|装饰眼镜|防辐射眼镜|游泳镜|女士丝巾/围巾/披肩|男士丝巾/围巾|鸭舌帽|贝雷帽|礼帽|真皮手套|毛线手套|防晒手套|男士腰带/礼盒|女士腰带/礼盒|钥匙扣|遮阳伞/雨伞|口罩|耳罩/耳包|假领|毛线/布面料|领带/领结/领带夹|钟表|男表|瑞表|女表|国表|日韩表|欧美表|德表|儿童手表|智能手表|闹钟|座钟挂钟|钟表配件|流行男鞋|商务休闲鞋|正装鞋|休闲鞋|凉鞋/沙滩鞋|男靴|功能鞋|拖鞋/人字拖|雨鞋/雨靴|传统布鞋|鞋配件|帆布鞋|增高鞋|工装鞋|定制鞋|时尚女鞋|高跟鞋|单鞋|休闲鞋|凉鞋|女靴|雪地靴|拖鞋/人字拖|踝靴|筒靴|帆布鞋|雨鞋/雨靴|妈妈鞋|鞋配件|特色鞋|鱼嘴鞋|布鞋/绣花鞋|马丁靴|坡跟鞋|松糕鞋|内增高|防水台|奶粉|婴幼奶粉|孕妈奶粉|营养辅食|益生菌/初乳|米粉/菜粉|果泥/果汁|DHA|宝宝零食|钙铁锌/维生素|清火/开胃|面条/粥|尿裤湿巾|婴儿尿裤|拉拉裤|婴儿湿巾|成人尿裤|喂养用品|奶瓶奶嘴|吸奶器|暖奶消毒|儿童餐具|水壶/水杯|牙胶安抚|围兜/防溅衣|辅食料理机|食物存储|洗护用品|宝宝护肤|洗发沐浴|奶瓶清洗|驱蚊防晒|理发器|洗澡用具|婴儿口腔清洁|洗衣液/皂|日常护理|座便器|童车童床|婴儿推车|餐椅摇椅|婴儿床|学步车|三轮车|自行车|电动车|扭扭车|滑板车|婴儿床垫|寝居服饰|婴儿外出服|婴儿内衣|婴儿礼盒|婴儿鞋帽袜|安全防护|家居床品|睡袋/抱被|爬行垫|妈妈专区|妈咪包/背婴带|产后塑身|文胸/内裤|防辐射服|孕妈装|孕期营养|孕妇护肤|待产护理|月子装|防溢乳垫|童装童鞋|套装|上衣|裤子|裙子|内衣/家居服|羽绒服/棉服|亲子装|儿童配饰|礼服/演出服|运动鞋|皮鞋/帆布鞋|靴子|凉鞋|功能鞋|户外/运动服|安全座椅|提篮式|安全座椅|增高垫|潮流女包|钱包|手拿包|单肩包|双肩包|手提包|斜挎包|钥匙包|卡包/零钱包|精品男包|男士钱包|男士手包|卡包名片夹|商务公文包|双肩包|单肩/斜挎包|钥匙包|功能箱包|电脑包|拉杆箱|旅行包|旅行配件|休闲运动包|拉杆包|登山包|妈咪包|书包|相机包|腰包/胸包|礼品|火机烟具|礼品文具|军刀军具|收藏品|工艺礼品|创意礼品|礼盒礼券|鲜花绿植|婚庆节庆|京东卡|美妆礼品|礼品定制|京东福卡|古董文玩|奢侈品|箱包|钱包|服饰|腰带|太阳镜/眼镜框|配件|鞋靴|饰品|名品腕表|高档化妆品|婚庆|婚嫁
首饰|婚纱摄影|婚纱礼服|婚庆服务|婚庆礼品/用品|婚宴|进口食品|饼干蛋糕|糖果/巧克力|休闲零食|冲调饮品|粮油调味|牛奶|地方特产|其他特产|新疆|北京|山西|内蒙古|福建|湖南|四川|云南|东北|休闲食品|休闲零食|坚果炒货|肉干肉脯|蜜饯果干|糖果/巧克力|饼干蛋糕|无糖食品|粮油调味|米面杂粮|食用油|调味品|南北干货|方便食品|有机食品|饮料冲调|饮用水|饮料|牛奶乳品|咖啡/奶茶|冲饮谷物|蜂蜜/柚子茶|成人奶粉|食品礼券|月饼|大闸蟹|粽子|卡券|茗茶|铁观音|普洱|龙井|绿茶|红茶|乌龙茶|花草茶|花果茶|养生茶|黑茶|白茶|其它茶|时尚饰品|项链|手链/脚链|戒指|耳饰|毛衣链|发饰/发卡|胸针|饰品配件|婚庆饰品|黄金|黄金吊坠|黄金项链|黄金转运珠|黄金手镯/手链/脚链|黄金耳饰|黄金戒指|K金饰品|K金吊坠|K金项链|K金手镯/手链/脚链|K金戒指|K金耳饰|金银投资|投资金|投资银|投资收藏|银饰|银吊坠/项链|银手镯/手链/脚链|银戒指|银耳饰|足银手镯|宝宝银饰|钻石|裸钻|钻戒|钻石项链/吊坠|钻石耳饰|钻石手镯/手链|翡翠玉石|项链/吊坠|手镯/手串|戒指|耳饰|挂件/摆件/把件|玉石孤品|水晶玛瑙|项链/吊坠|耳饰|手镯/手链/脚链|戒指|头饰/胸针|摆件/挂件|彩宝|琥珀/蜜蜡|碧玺|红宝石/蓝宝石|坦桑石|珊瑚|祖母绿|葡萄石|其他天然宝石|项链/吊坠|耳饰|手镯/手链|戒指|铂金|铂金项链/吊坠|铂金手镯/手链/脚链|铂金戒指|铂金耳饰|木手串/把件|小叶紫檀|黄花梨|沉香木|金丝楠|菩提|其他|橄榄核/核桃|檀香|珍珠|珍珠项链|珍珠吊坠|珍珠耳饰|珍珠手链|珍珠戒指|珍珠胸针|维修保养|机油|正时皮带|添加剂|汽车喇叭|防冻液|汽车玻璃|滤清器|火花塞|减震器|柴机油/辅助油|雨刷|车灯|后视镜|轮胎|轮毂|刹车片/盘|维修配件|蓄电池|底盘装甲/护板|贴膜|汽修工具|改装配件|车载电器|导航仪|安全预警仪|行车记录仪|倒车雷达|蓝牙设备|车载影音|净化器|电源|智能驾驶|车载电台|车载电器配件|吸尘器|智能车机|冰箱|汽车音响|车载生活电器|美容清洗|车蜡|补漆笔|玻璃水|清洁剂|洗车工具|镀晶镀膜|打蜡机|洗车配件|洗车机|洗车水枪|毛巾掸子|汽车装饰|脚垫|座垫|座套|后备箱垫|头枕腰靠|方向盘套|香水|空气净化|挂件摆件|功能小件|车身装饰件|车衣|安全自驾|安全座椅|胎压监测|防盗设备|应急救援|保温箱|地锁|摩托车|充气泵|储物箱|自驾野营|摩托车装备|汽车服务|清洗美容|功能升级|保养维修|油卡充值|车险|加油卡|ETC|驾驶培训|赛事改装|赛事服装|赛事用品|制动系统|悬挂系统|进气系统|排气系统|电子管理|车身强化|赛事座椅|运动鞋包|跑步鞋|休闲鞋|篮球鞋|板鞋|帆布鞋|足球鞋|乒羽网鞋|专项运动鞋|训练鞋|拖鞋|运动包|运动服饰|羽绒服|棉服|运动裤|夹克/风衣|卫衣/套头衫|T恤|套装|乒羽网服|健身服|运动背心|毛衫/线衫|运动配饰|骑行运动|折叠车|山地车/公路车|电动车|其他整车|骑行服|骑行装备|平衡车|垂钓用品|鱼竿鱼线|浮漂鱼饵|钓鱼桌椅|钓鱼配件|钓箱鱼包|其它|游泳用品|泳镜|泳帽|游泳包防水包|女士泳衣|男士泳衣|比基尼|其它|户外鞋服|冲锋衣裤|速干衣裤|滑雪服|羽绒服/棉服|休闲衣裤|抓绒衣裤|软壳衣裤|T恤|户外风衣|功能内衣|军迷服饰|登山鞋|雪地靴|徒步鞋|越野跑鞋|休闲鞋|工装鞋|溯溪鞋|沙滩/凉拖|户外袜|户外装备|帐篷/垫子|睡袋/吊床|登山攀岩|户外配饰|背包|户外照明|户外仪表|户外工具|望远镜|旅游用品|便携桌椅床|野餐烧烤|军迷用品|救援装备|滑雪装备|极限户外|冲浪潜水|健身训练|综合训练器|其他大型器械|哑铃|仰卧板/收腹机|其他中小型器材|瑜伽舞蹈|甩脂机|踏步机|武术搏击|健身车/动感单车|跑步机|运动护具|体育用品|羽毛球|乒乓球|篮球|足球|网球|排球|高尔夫|台球|棋牌麻将|轮滑滑板|其他|适用年龄|0-6个月|6-12个月|1-3岁|3-6岁|6-14岁|14岁以上|遥控/电动|遥控车|遥控飞机|遥控船|机器人|轨道/助力|毛绒布艺|毛绒/布艺|靠垫/抱枕|娃娃玩具|芭比娃娃|卡通娃娃|智能娃娃|模型玩具|仿真模型|拼插模型|收藏爱好|健身玩具|炫舞毯|爬行垫/毯|户外玩具|戏水玩具|动漫玩具|电影周边|卡通周边|网游周边|益智玩具|摇铃/床铃|健身架|早教启智|拖拉玩具|积木拼插|积木|拼图|磁力棒|立体拼插|DIY玩具|手工彩泥|绘画工具|情景玩具|创意减压|减压玩具|创意玩具|乐器|钢琴|电子琴/电钢琴|吉他/尤克里里|打击乐器|西洋管弦|民族管弦乐器|乐器配件|电脑音乐|工艺礼品乐器|口琴
/口风琴/竖笛|手风琴||机票|国内机票|酒店|国内酒店|酒店团购|旅行|度假|景点|租车|火车票|旅游团购|充值|手机充值|游戏|游戏点卡|QQ充值|票务|电影票|演唱会|话剧歌剧|音乐会|体育赛事|舞蹈芭蕾|戏曲综艺|产地直供|水果|苹果|橙子|奇异果/猕猴桃|车厘子/樱桃|芒果|蓝莓|火龙果|葡萄/提子|柚子|香蕉|牛油果|梨|菠萝/凤梨|桔/橘|柠檬|草莓|桃/李/杏|更多水果|水果礼盒/券|猪牛羊肉|牛肉|羊肉|猪肉|内脏类|海鲜水产|鱼类|虾类|蟹类|贝类|海参|海产干货|其他水产|海产礼盒|禽肉蛋品|鸡肉|鸭肉|蛋类|其他禽类|冷冻食品|水饺/馄饨|汤圆/元宵|面点|火锅丸串|速冻半成品|奶酪黄油|熟食腊味|熟食|腊肠/腊肉|火腿|糕点|礼品卡券|饮品甜品|冷藏果蔬汁|冰激凌|其他`\n\n\treqs :=  []*model.Request{\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         fmt.Sprintf(\"https://search.jd.com/Search?keyword={%s}&enc=utf-8&page={1-5,1}\", goodsType),\n\t\t\t\tProcessName: \"jingdong-list\",\n\t\t\t},\n\t}\n\tresults := PraseReq(reqs, nil)\n\tfmt.Println(len(results))\n\tfor _, result := range results {\n\t\tfmt.Println(result)\n\t}\n\n\t//daily_timeline|kb_video_news|kb_news_bagua|kb_news_qipa|kb_photo_news|kb_news_tech|kb_news_finance|location|kb_news_world|kb_news_movie|kb_news_gaojidi|kb_news_wealth|kb_photo_gif|kb_news_sports|kb_news_mil|kb_news_history|kb_news_nba|kb_news_car|kb_news_chaobao|kb_news_laugh|kb_news_pet|kb_news_science|kb_news_baby|kb_news_astro|kb_news_sex|kb_news_beauty|kb_news_house|kb_news_share|kb_news_rock|kb_news_tfboys|kb_news_augury|kb_news_photography|kb_news_lottery|kb_news_cate|kb_news_julebu|kb_news_travel|kb_news_idea|kb_news_lol|kb_news_erciyuan|kb_news_space|kb_news_game|kb_news_iphone|kb_news_esport|kb_news_health|kb_news_outfit|kb_news_furnishing|kb_news_workout|kb_news_soup|kb_news_run|kb_news_fishing|kb_news_buddism|kb_news_diet|kb_news_football|kb_news_tennis|kb_news_tea|kb_news_yoga|kb_news_plaything|kb_news_watch\n\n}\n\nfunc TestOrAndOffset_2(t *testing.T) {\n\ttypes := 
\"daily_timeline|kb_video_news|kb_news_bagua|kb_news_qipa|kb_photo_news|kb_news_tech|kb_news_finance|location|kb_news_world|kb_news_movie|kb_news_gaojidi|kb_news_wealth|kb_photo_gif|kb_news_sports|kb_news_mil|kb_news_history|kb_news_nba|kb_news_car|kb_news_chaobao|kb_news_laugh|kb_news_pet|kb_news_science|kb_news_baby|kb_news_astro|kb_news_sex|kb_news_beauty|kb_news_house|kb_news_share|kb_news_rock|kb_news_tfboys|kb_news_augury|kb_news_photography|kb_news_lottery|kb_news_cate|kb_news_julebu|kb_news_travel|kb_news_idea|kb_news_lol|kb_news_erciyuan|kb_news_space|kb_news_game|kb_news_iphone|kb_news_esport|kb_news_health|kb_news_outfit|kb_news_furnishing|kb_news_workout|kb_news_soup|kb_news_run|kb_news_fishing|kb_news_buddism|kb_news_diet|kb_news_football|kb_news_tennis|kb_news_tea|kb_news_yoga|kb_news_plaything|kb_news_watch\"\n\n\treqs:= []*model.Request{\n\t\t\t{\n\t\t\t\tMethod:      \"get\",\n\t\t\t\tUrl:         fmt.Sprintf(`http://r.cnews.qq.com/getSubNewsChlidInterest?patchver=4511&mid=fd248c13ee1ce793495484e4cf3250f8ebbb475a&devid=860046037899335&store=60009&screen_height=1920&apptype=android&origin_imei=860046037899335&hw=OnePlus_ONEPLUSA3000&appver=25_areading_4.5.11&appversion=4.5.11&uid=bfa0a264a6547298&screen_width=1080&sceneid=&android_id=bfa0a264a6547298&last_id=20171207A03G7J00&ssid=GeeyueTech_5G&forward=0&IronThroneBuildTime=1512716487405&omgid=e0f7a4180378ba4e5ee80b0820ef5a1744ca0010211815&IronThroneRelBuildTime=415047497&refreshType=normal&qqnetwork=wifi&last_time=&bottom_id=20171207A0BFU500&top_time=1512631500&currentTab=kuaibao&top_id=20171207C0HX4500&is_wap=0&omgbizid=b03081d3f5806f45b65904d08cfad6bc77130080211815&page={1-100,1}&imsi=460019017167485&lastRefreshTime=&IronThroneRelExecTime=415047499&muid=49887860909485482&activefrom=icon&cachedCount=20&direction=0&sessionid=&chRefreshTimes=0&chlid={%s}&bottom_time=1512603257&IronThroneExecTime=1512716487407&qn-sig=284d6905ece4010e0ebd89dce072b5ee&qn-rid=6e63ca4d-1285-47ee-b95d-0bb49da3ce03`,types),\
n\t\t\t\tProcessName: \"ttkblist\",\n\t\t\t},\n\t}\n\n\tresults := PraseReq(reqs, nil)\n\tfmt.Println(len(results))\n\t//for _, result := range results {\n\t//\tfmt.Println(result)\n\t//}\n}\n\n"
  },
  {
    "path": "spider/conf.json",
    "content": "{\n  \"name\":\"qiubai_spider\",\n  \"version\":\"0.01\",\n  \"work_num\": 1,\n  \"max_wait_num\":4096,\n\n  \"http_addr\":\"127.0.0.1:7775\",\n  \"etcd\":[\"http://127.0.0.1:2379\"],\n\n  \"schedule\":\"redis\",\n  \"redis_addr\":\"127.0.0.1:6379\"\n}"
  },
  {
    "path": "spider/config/config.go",
    "content": "package config\n\nimport (\n\t\"YiSpider/spider/logger\"\n\t\"encoding/json\"\n\t\"io/ioutil\"\n\t\"os\"\n)\n\nvar ConfigI *Config\n\ntype Config struct {\n\tName       string `json:\"name\"`\n\tVersion    string `json:\"version\"`\n\tWorkNum    int    `json:\"work_num\"`\n\tMaxWaitNum int    `json:\"max_wait_num\"`\n\n\tHttpAddr     string   `json:\"http_addr\"`\n\tRedisAddr    string   `json:\"redis_addr\"`\n\tScheduleMode string   `json:\"schedule\"`\n\tEtcd         []string `json:\"etcd\"`\n\n\tMysql        string `json:\"mysql\"`\n}\n\nfunc InitConfig() error {\n\n\tvar file *os.File\n\tvar bytes []byte\n\tvar err error\n\n\tif file, err = os.OpenFile(\"conf.json\", os.O_RDONLY, 0666); err != nil {\n\t\treturn err\n\t}\n\n\tif bytes, err = ioutil.ReadAll(file); err != nil {\n\t\treturn err\n\t}\n\n\tConfigI = &Config{}\n\tif err = json.Unmarshal(bytes, ConfigI); err != nil {\n\t\treturn err\n\t}\n\n\tlogger.Info(\"init success \", *ConfigI)\n\treturn nil\n}\n"
  },
  {
    "path": "spider/core/engine.go",
    "content": "package core\n\nimport (\n\t\"YiSpider/spider/spider\"\n\t\"fmt\"\n\t\"github.com/kataras/go-errors\"\n\t\"sync\"\n)\n\nvar engineI *Engine\nvar once sync.Once\n\nfunc New() *Engine {\n\tonce.Do(func() {\n\t\tengineI = &Engine{spiders: make(map[string]*SpiderRuntime)}\n\t})\n\treturn engineI\n}\n\nfunc GetEnine() *Engine {\n\treturn engineI\n}\n\ntype Engine struct {\n\tspiders map[string]*SpiderRuntime\n}\n\nfunc (m *Engine) AddSpider(spider *spider.Spider) *Engine {\n\tspiderRuntime := NewSpiderRuntime()\n\tspiderRuntime.SetSpider(spider)\n\tm.spiders[spider.Name] = spiderRuntime\n\treturn m\n}\n\nfunc (m *Engine) RunTask(name string) error {\n\ts, ok := m.spiders[name]\n\tif !ok {\n\t\treturn errors.New(fmt.Sprintf(\"Task [%s] is not exist\", name))\n\t}\n\ts.Run()\n\treturn nil\n}\n\nfunc (m *Engine) StopTask(name string) error {\n\ts, ok := m.spiders[name]\n\tif !ok {\n\t\treturn errors.New(fmt.Sprintf(\"Task [%s] is not exist\", name))\n\t}\n\ts.Stop()\n\treturn nil\n}\n\nfunc (m *Engine) EndTask(name string) error {\n\ts, ok := m.spiders[name]\n\tif !ok {\n\t\treturn errors.New(fmt.Sprintf(\"Task [%s] is not exist\", name))\n\t}\n\ts.Exit()\n\treturn nil\n}\n\nfunc (m *Engine) ListTask() []*spider.Spider {\n\tspiders := []*spider.Spider{}\n\tfor _, s := range m.spiders {\n\t\tspiders = append(spiders, s.spider)\n\t}\n\treturn spiders\n}\n\nfunc (m *Engine) GetTaskMetas() map[string]*TaskMeta {\n\tmetas := map[string]*TaskMeta{}\n\tfor name, s := range m.spiders {\n\t\tmetas[name] = s.TaskMeta\n\t}\n\treturn metas\n}\n\nfunc (m *Engine) Run() {\n\tfor _, s := range m.spiders {\n\t\ts.Run()\n\t}\n}\n"
  },
  {
    "path": "spider/core/runtime.go",
    "content": "package core\n\nimport (\n\t\"YiSpider/spider/config\"\n\t\"YiSpider/spider/downloader\"\n\t\"YiSpider/spider/logger\"\n\t\"YiSpider/spider/model\"\n\t\"YiSpider/spider/schedule\"\n\t//\"time\"\n\t\"YiSpider/spider/common\"\n\t\"YiSpider/spider/process\"\n\t\"YiSpider/spider/spider\"\n\t\"io/ioutil\"\n\t\"net/http\"\n\t\"sync\"\n\t\"sync/atomic\"\n\t\"YiSpider/spider/pipline/mysql\"\n)\n\nconst Default_WorkNum = 1\n\ntype SpiderRuntime struct {\n\tsync.Mutex\n\tworkNum  int\n\tschedule schedule.Schedule\n\tspider   *spider.Spider\n\n\tstopSign    bool\n\trecoverChan chan int\n\n\tTaskMeta *TaskMeta\n}\n\ntype TaskMeta struct {\n\tDownloadFailCount int32 `json:\"download_fail_count\"`\n\tDownloadCount     int32 `json:\"download_fail_count\"`\n\n\tUrlNum           int32 `json:\"url_num\"`\n\tWaitUrlNum       int   `json:\"wait_url_num\"`\n\tCrawlerResultNum int32 `json:\"crawler_result_num\"`\n}\n\nfunc NewSpiderRuntime() *SpiderRuntime {\n\n\tworkNum := config.ConfigI.WorkNum\n\tif workNum == 0 {\n\t\tworkNum = Default_WorkNum\n\t}\n\n\ts := &SpiderRuntime{}\n\ts.workNum = workNum\n\ts.schedule = schedule.GetSchedule(config.ConfigI)\n\ts.recoverChan = make(chan int)\n\tmeta := &TaskMeta{}\n\tmeta.WaitUrlNum = 0\n\tmeta.UrlNum = int32(0)\n\tmeta.DownloadCount = int32(0)\n\tmeta.DownloadFailCount = int32(0)\n\tmeta.CrawlerResultNum = int32(0)\n\n\ts.TaskMeta = meta\n\n\tif len(config.ConfigI.Mysql) > 0{\n\t\tmysql.InitMysql(config.ConfigI.Mysql)\n\t}\n\n\treturn s\n}\n\nfunc (s *SpiderRuntime) SetSpider(spider *spider.Spider) {\n\ts.spider = spider\n}\n\nfunc (s *SpiderRuntime) GetSpider() *spider.Spider {\n\treturn s.spider\n}\n\nfunc (s *SpiderRuntime) Run() {\n\tif s.stopSign {\n\t\ts.recoverChan <- 1\n\t\treturn\n\t}\n\ts.schedule.PushMuti(s.spider.GetRequests())\n\n\tfor i := 0; i < s.workNum; i++ {\n\t\tgo s.worker()\n\t}\n}\n\nfunc (s *SpiderRuntime) Stop() {\n\ts.stopSign = true\n}\n\nfunc (s *SpiderRuntime) worker() {\n\tcontext := 
model.Context{}\n\n\tfor {\n\t\tif s.stopSign {\n\t\t\t_, ok := <-s.recoverChan\n\t\t\ts.stopSign = false\n\t\t\tif !ok {\n\t\t\t\tgoto exit\n\t\t\t}\n\t\t}\n\n\t\treq, ok := s.schedule.Pop()\n\t\tif !ok {\n\t\t\tgoto exit\n\t\t}\n\t\tif req == nil {\n\t\t\tlogger.Info(\"schedule is empty\")\n\t\t\tcontinue\n\t\t}\n\n\t\tatomic.AddInt32(&s.TaskMeta.DownloadCount, 1)\n\t\tresponse, err := s.download(req)\n\t\tif err != nil {\n\t\t\tlogger.Error(err.Error())\n\t\t\tatomic.AddInt32(&s.TaskMeta.DownloadFailCount, 1)\n\t\t\tcontinue\n\t\t}\n\t\t// download returns (nil, nil) for unsupported methods; skip the request instead of dereferencing nil.\n\t\tif response == nil {\n\t\t\tlogger.Error(\"unsupported request method:\", req.Method)\n\t\t\tatomic.AddInt32(&s.TaskMeta.DownloadFailCount, 1)\n\t\t\tcontinue\n\t\t}\n\n\t\tbody, err := ioutil.ReadAll(response.Body)\n\t\tif err != nil {\n\t\t\t// Close the body before skipping so the connection is not leaked.\n\t\t\tresponse.Body.Close()\n\t\t\tlogger.Error(err.Error())\n\t\t\tcontinue\n\t\t}\n\n\t\tcontext.Clear()\n\t\tcontext.Body, err = common.ToUtf8(body)\n\t\tif err != nil {\n\t\t\tcontext.Body = body\n\t\t}\n\t\tcontext.Request = response.Request\n\t\tcontext.Header = response.Header\n\n\t\tps, ok := s.spider.Process[req.ProcessName]\n\t\tif !ok {\n\t\t\tresponse.Body.Close()\n\t\t\tlogger.Info(\"process not found! 
please call SetProcess|SetTask\")\n\t\t\tbreak\n\t\t}\n\t\tfor _, p := range ps {\n\t\t\tpage, err := processWrapper(p, context)\n\t\t\tif err != nil {\n\t\t\t\tlogger.Info(\"Process fail|\", err.Error())\n\t\t\t\tcontinue\n\t\t\t}\n\t\t\tif page == nil {\n\t\t\t\tlogger.Info(\"Process page is nil\")\n\t\t\t\tcontinue\n\t\t\t}\n\t\t\ts.TaskMeta.WaitUrlNum = s.schedule.Count()\n\n\t\t\tif page.Urls != nil && len(page.Urls) > 0 {\n\t\t\t\tatomic.AddInt32(&s.TaskMeta.UrlNum, int32(len(page.Urls)))\n\t\t\t\tgo func() {\n\t\t\t\t\ts.schedule.PushMuti(page.Urls)\n\t\t\t\t}()\n\t\t\t}\n\n\t\t\tif page.ResultCount > 0 {\n\n\t\t\t\tatomic.AddInt32(&s.TaskMeta.CrawlerResultNum, int32(page.ResultCount))\n\n\t\t\t\ts.spider.Pipline.ProcessData(page.Result, s.spider.Name, req.ProcessName)\n\t\t\t}\n\t\t}\n\n\t\tresponse.Body.Close()\n\t}\n\nexit:\n\tlogger.Info(s.spider.Name, \"worker close\")\n}\nfunc processWrapper(p process.Process, context model.Context) (*model.Page, error) {\n\tdefer func() {\n\t\tif err := recover(); err != nil {\n\t\t\tlogger.Error(err)\n\t\t}\n\t}()\n\n\tpage, err := p.Process(context)\n\treturn page, err\n}\n\nfunc (s *SpiderRuntime) download(req *model.Request) (*http.Response, error) {\n\t//time.Sleep(1*time.Second)\n\tswitch req.Method {\n\tcase \"get\":\n\t\treturn downloader.Get(req.ProcessName, req.Url)\n\tcase \"post\":\n\t\treturn downloader.PostJson(req.ProcessName, req.Url, req.Data)\n\t}\n\n\treturn nil, nil\n}\n\nfunc (s *SpiderRuntime) Exit() {\n\ts.schedule.Close()\n\tclose(s.recoverChan)\n}\n"
  },
  {
    "path": "spider/downloader/request.go",
    "content": "package downloader\n\nimport (\n\t\"YiSpider/spider/logger\"\n\t\"bytes\"\n\t\"encoding/json\"\n\t\"errors\"\n\t\"fmt\"\n\t\"golang.org/x/net/publicsuffix\"\n\t\"net/http\"\n\t\"net/http/cookiejar\"\n\t\"sync\"\n\t\"time\"\n)\n\nvar Clients map[string]*http.Client\nvar lock sync.RWMutex\n\nfunc init() {\n\tClients = make(map[string]*http.Client)\n}\n\nfunc makeCookiejar() http.CookieJar {\n\tcookiejarOptions := cookiejar.Options{\n\t\tPublicSuffixList: publicsuffix.List,\n\t}\n\tjar, _ := cookiejar.New(&cookiejarOptions)\n\n\treturn jar\n}\n\nfunc makeClient(transport http.RoundTripper, jar http.CookieJar) *http.Client {\n\treturn &http.Client{Jar: jar, Transport: transport, Timeout: 60 * time.Second}\n}\n\nfunc Get(taskId string, url string) (*http.Response, error) {\n\tres, err := doRequest(taskId, \"GET\", url, nil)\n\tif err != nil {\n\t\tlogger.Info(\"Download fail doRequest,url:\", url, \"err:\", err)\n\t\treturn nil, err\n\t}\n\tlogger.Info(\"GET\", url, \" =>\", res.StatusCode)\n\tif res.StatusCode >= 400 {\n\t\treturn nil, errors.New(fmt.Sprintf(\"download fail,url %s, StatusCode %d\", url, res.StatusCode))\n\t}\n\treturn res, nil\n}\n\nfunc PostJson(taskId string, url string, data interface{}) (*http.Response, error) {\n\tdataJ, err := json.Marshal(data)\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\tres, err := doRequest(taskId, \"POST\", url, dataJ)\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\tlogger.Info(\"POST\", url, \"=>\", res.StatusCode)\n\tif res.StatusCode >= 400 {\n\t\treturn nil, errors.New(fmt.Sprintf(\"download fail, StatusCode %d\", res.StatusCode))\n\t}\n\treturn res, nil\n}\n\nfunc doRequest(id string, method string, url string, data []byte) (resp *http.Response, err error) {\n\treq, err := http.NewRequest(method, url, bytes.NewBuffer(data))\n\tif err != nil {\n\t\treturn nil, err\n\t}\n\t//req.Header.Set(\"Content-Type\", \"application/json\")\n\tclient := getClient(id)\n\tif client == nil {\n\t\tclient = 
makeClient(nil, makeCookiejar())\n\t\tsetClient(id, client)\n\t}\n\treturn client.Do(req)\n}\n\nfunc setClient(id string, client *http.Client) {\n\tlock.Lock()\n\tdefer lock.Unlock()\n\tClients[id] = client\n}\n\nfunc getClient(id string) *http.Client {\n\tlock.RLock()\n\tdefer lock.RUnlock()\n\tclient := Clients[id]\n\treturn client\n}\n"
  },
  {
    "path": "spider/downloader/request_test.go",
    "content": "package downloader\n\nimport \"testing\"\n\nfunc TestGet(t *testing.T) {\n\tif _, err := Get(\"baidu\", \"http://www.hao123.com\"); err != nil {\n\t\tt.Fatal(err)\n\t}\n}\n\nfunc TestPostJson(t *testing.T) {\n\n}\n"
  },
  {
    "path": "spider/http/server.go",
    "content": "package http\n\nimport (\n\t\"YiSpider/spider/config\"\n\t\"YiSpider/spider/core\"\n\t\"YiSpider/spider/model\"\n\tspider2 \"YiSpider/spider/spider\"\n\t\"encoding/json\"\n\t\"io/ioutil\"\n\t\"log\"\n\t\"net/http\"\n\t\"net/url\"\n)\n\nvar errorMethod = []byte(\"{\\\"code\\\":\\\"400\\\",\\\"msg\\\":\\\"not support method\\\"}\")\nvar errorQuery = []byte(\"{\\\"code\\\":\\\"400\\\",\\\"msg\\\":\\\"error url parmas\\\"}\")\nvar errorJson = []byte(\"{\\\"code\\\":\\\"400\\\",\\\"msg\\\":\\\"error prase json \\\"}\")\nvar errorReadBody = []byte(\"{\\\"code\\\":\\\"400\\\",\\\"msg\\\":\\\"error read body\\\"}\")\nvar commonSuccess = []byte(\"{\\\"code\\\":\\\"200\\\",\\\"msg\\\":\\\"success\\\"}\")\n\nfunc AddTask(w http.ResponseWriter, req *http.Request) {\n\tif req.Method != \"POST\" {\n\t\tw.Write(errorMethod)\n\t\treturn\n\t}\n\tbody, err := ioutil.ReadAll(req.Body)\n\tif err != nil {\n\t\tw.Write(errorReadBody)\n\t\treturn\n\t}\n\tspider := &model.Task{}\n\terr = json.Unmarshal(body, spider)\n\tif err != nil {\n\t\tw.Write(errorJson)\n\t\treturn\n\t}\n\terr = core.GetEnine().AddSpider(spider2.InitWithTask(spider)).RunTask(spider.Name)\n\tif err != nil {\n\t\tw.Write([]byte(err.Error()))\n\t\treturn\n\t}\n\tw.Write(commonSuccess)\n\treturn\n}\n\nfunc StopTask(w http.ResponseWriter, req *http.Request) {\n\tif req.Method != \"GET\" {\n\t\tw.Write(errorMethod)\n\t\treturn\n\t}\n\n\tqueryMap, err := url.ParseQuery(req.URL.RawQuery)\n\tif err != nil {\n\t\tw.Write(errorQuery)\n\t\treturn\n\t}\n\tname := queryMap.Get(\"name\")\n\tif err := core.GetEnine().StopTask(name); err != nil {\n\t\tw.Write([]byte(err.Error()))\n\t\treturn\n\t}\n\tw.Write(commonSuccess)\n\treturn\n}\n\nfunc RunTask(w http.ResponseWriter, req *http.Request) {\n\tif req.Method != \"GET\" {\n\t\tw.Write(errorMethod)\n\t\treturn\n\t}\n\n\tqueryMap, err := url.ParseQuery(req.URL.RawQuery)\n\tif err != nil {\n\t\tw.Write(errorQuery)\n\t\treturn\n\t}\n\tname := queryMap.Get(\"name\")\n\tif 
err := core.GetEnine().RunTask(name); err != nil {\n\t\tw.Write([]byte(err.Error()))\n\t\treturn\n\t}\n\tw.Write(commonSuccess)\n\treturn\n}\n\nfunc EndTask(w http.ResponseWriter, req *http.Request) {\n\tif req.Method != \"GET\" {\n\t\tw.Write(errorMethod)\n\t\treturn\n\t}\n\n\tqueryMap, err := url.ParseQuery(req.URL.RawQuery)\n\tif err != nil {\n\t\tw.Write(errorQuery)\n\t\treturn\n\t}\n\tname := queryMap.Get(\"name\")\n\tif err := core.GetEnine().EndTask(name); err != nil {\n\t\tw.Write([]byte(err.Error()))\n\t\treturn\n\t}\n\tw.Write(commonSuccess)\n\treturn\n}\n\nfunc ListTask(w http.ResponseWriter, req *http.Request) {\n\tif req.Method != \"GET\" {\n\t\tw.Write(errorMethod)\n\t\treturn\n\t}\n\n\ttasks := core.GetEnine().ListTask()\n\tdatas, err := json.Marshal(tasks)\n\tif err != nil {\n\t\tw.Write([]byte(err.Error()))\n\t\treturn\n\t}\n\tw.Write(datas)\n\treturn\n}\n\nfunc InitHttpServer() {\n\thttp.HandleFunc(\"/task/addAndRun\", AddTask)\n\thttp.HandleFunc(\"/task/run\", RunTask)\n\thttp.HandleFunc(\"/task/stop\", StopTask)\n\thttp.HandleFunc(\"/task/end\", EndTask)\n\thttp.HandleFunc(\"/tasks\", ListTask)\n\n\terr := http.ListenAndServe(config.ConfigI.HttpAddr, nil)\n\tif err != nil {\n\t\tlog.Fatal(\"ListenAndServe fail:\", err)\n\t}\n\n}\n"
  },
  {
    "path": "spider/logger/logger.go",
    "content": "package logger\n\nimport \"fmt\"\n\nfunc Info(v ...interface{}) {\n\tfmt.Println(v)\n}\n\nfunc Debug(v ...interface{}) {\n\tfmt.Println(v)\n}\n\nfunc Warn(v ...interface{}) {\n\tfmt.Println(v)\n}\n\nfunc Error(v ...interface{}) {\n\tfmt.Println(v)\n}\n"
  },
  {
    "path": "spider/model/context.go",
    "content": "package model\n\nimport \"net/http\"\n\ntype Context struct {\n\tBody    []byte\n\tRequest *http.Request\n\tHeader  http.Header\n}\n\nfunc (c *Context) Clear() {\n\tc.Body = nil\n\tc.Request = nil\n\tc.Header = nil\n}\n"
  },
  {
    "path": "spider/model/page.go",
    "content": "package model\n\ntype Page struct {\n\tResult      []map[string]interface{}\n\tResultCount int\n\tUrls        []*Request\n}\n\nfunc (p *Page) AddUrl(req *Request) {\n\tif p.Urls == nil {\n\t\tp.Urls = []*Request{}\n\t}\n\tp.Urls = append(p.Urls, req)\n}\n\nfunc (p *Page) AddUrls(req []*Request) {\n\tif p.Urls == nil {\n\t\tp.Urls = []*Request{}\n\t}\n\tp.Urls = append(p.Urls, req...)\n}\n\nfunc (p *Page) AddResult(value map[string]interface{}) {\n\tif p.Result == nil {\n\t\tp.Result = []map[string]interface{}{}\n\t}\n\tp.Result = append(p.Result, value)\n\tp.ResultCount++\n}\n"
  },
  {
    "path": "spider/model/task.go",
    "content": "package model\n\nimport (\n\t\"encoding/json\"\n)\n\ntype Task struct {\n\tId   string `json:\"id\"`\n\tName string `jsonTask:\"name\"`\n\n\tRequest []*Request `json:\"request\"`\n\tProcess []Process  `json:\"process\"`\n\tPipline string     `json:\"pipline\"`\n\n\tDepth    int `json:\"depth\"`\n\tEndCount int `json:\"end_count\"`\n}\n\ntype Request struct {\n\tUrl         string            `json:\"url\"`\n\tMethod      string            `json:\"method\"`\n\tContentType string            `json:\"type\"` // json urlencode form\n\tData        map[string]string `json:\"data\"`\n\tHeader      map[string]string `json:\"header\"`\n\tCookies     Cookies           `json:\"cookies\"`\n\tProcessName string            `json:\"process_name\"`\n}\n\nfunc (r *Request) Write() ([]byte, error) {\n\treturn json.Marshal(r)\n}\n\nfunc (r *Request) Read(b []byte) error {\n\treturn json.Unmarshal(b, r)\n}\n\ntype Cookies struct {\n\tUrl  string `json:\"url\"`\n\tData string `json:\"data\"`\n}\n\ntype Process struct {\n\tName         string       `json:\"name\"`\n\tRegUrl       []string     `json:\"reg_url\"`\n\tType         string       `json:\"type\"` // template json self_process\n\tTemplateRule TemplateRule `json:\"template_rule\"`\n\tJsonRule     JsonRule     `json:\"json_rule\"`\n\tAddQueue     []*Request   `json:\"add_queue\"` //  http://www.baidu.com/{name}/{ctx}\n}\n\ntype TemplateRule struct {\n\tRule map[string]string\n}\n\ntype JsonRule struct {\n\tRule map[string]string\n}\n"
  },
  {
    "path": "spider/pipline/console/console.go",
    "content": "package console\n\nimport (\n\t\"encoding/json\"\n\t\"fmt\"\n)\n\ntype ConsolePipline struct {\n}\n\nfunc NewConsolePipline() *ConsolePipline {\n\treturn &ConsolePipline{}\n}\n\nfunc (c *ConsolePipline) ProcessData(v []map[string]interface{}, taskName string, processName string) {\n\tbytes, _ := json.Marshal(v)\n\tfmt.Println(\"Pipline :\", string(bytes))\n}\n"
  },
  {
    "path": "spider/pipline/file/file.go",
    "content": "package file\n\nimport (\n\t\"YiSpider/spider/logger\"\n\t\"encoding/json\"\n\t\"fmt\"\n\t\"os\"\n\t\"time\"\n)\n\ntype FilePipline struct {\n\troot  string\n\tfiles map[string]*os.File\n}\n\nfunc NewFilePipline(root string) *FilePipline {\n\treturn &FilePipline{root: root, files: make(map[string]*os.File)}\n}\n\nfunc (c *FilePipline) ProcessData(v []map[string]interface{}, taskName string, processName string) {\n\n\tfile, ok := c.files[processName]\n\tif !ok {\n\t\tvar f *os.File\n\t\tvar err error\n\n\t\tpath := fmt.Sprintf(\"%s%s-%s.txt\", c.root, taskName, processName)\n\t\tif f, err = os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0666); err != nil {\n\t\t\tlogger.Error(\"FilePipline Open File fail, path =\", path, err)\n\t\t\treturn\n\t\t}\n\t\tf.WriteString(fmt.Sprintf(\"========= Task : %s =============\\n\", taskName))\n\t\tf.WriteString(fmt.Sprintf(\"======= Task Begin : %s =============\\n\", time.Now()))\n\n\t\tc.files[processName] = f\n\t\tfile = f\n\t}\n\n\tfor _, value := range v {\n\t\tdata, err := json.Marshal(value)\n\t\tif err != nil {\n\t\t\tlogger.Error(\"FilePipline json.Marshal fail, v = \", v)\n\t\t\treturn\n\t\t}\n\t\tfile.WriteString(string(data) + \"\\n\")\n\t}\n\tlogger.Info(\"File Pipline write. Count:\", len(v))\n\n\treturn\n}\n\nfunc (c *FilePipline) Close() {\n\tfor _, f := range c.files {\n\t\tf.Close()\n\t}\n}\n"
  },
  {
    "path": "spider/pipline/mysql/dbModel.go",
    "content": "package mysql\n\nimport (\n\t\"fmt\"\n\t\"strings\"\n\t\"reflect\"\n\t\"encoding/json\"\n)\n\ntype Field struct {\n\tName string\n\tPk   bool\n\tValue interface{}\n}\n\nfunc (f *Field) Sql() string{\n\tvar sql string\n\tswitch f.Value.(type) {\n\tcase string:\n\t\tsql = fmt.Sprintf(\"\\n `%s` varchar(255) NULL DEFAULT '' \",f.Name)\n\tcase int:\n\t\tsql = fmt.Sprintf(\"\\n `%s` integer NULL DEFAULT 0 \",f.Name)\n\tcase int32:\n\t\tsql = fmt.Sprintf(\"\\n `%s` integer NULL DEFAULT 0 \",f.Name)\n\tcase int64:\n\t\tsql = fmt.Sprintf(\"\\n `%s` integer NULL DEFAULT 0 \",f.Name)\n\tcase float64:\n\t\tsql = fmt.Sprintf(\"\\n `%s` float NULL DEFAULT 0.0 \",f.Name)\n\tcase float32:\n\t\tsql = fmt.Sprintf(\"\\n `%s` float NULL DEFAULT 0.0 \",f.Name)\n\tdefault:\n\t\tsql = fmt.Sprintf(\"\\n `%s` varchar(255) NULL DEFAULT '' \",f.Name)\n\t}\n\n\tif f.Pk{\n\t\tsql = fmt.Sprintf(\"\\n `%s` integer AUTO_INCREMENT PRIMARY KEY\",f.Name)\n\t}\n\n\tsql += \",\"\n\n\treturn sql\n}\n\ntype DBModel struct {\n\tName string\n\tFields []Field\n}\n\nfunc (d *DBModel) TableSql() string{\n\tsql := fmt.Sprintf(\"CREATE TABLE IF NOT EXISTS `%s` (\",d.Name)\n\tfor _,field := range d.Fields{\n\t\tsql += field.Sql()\n\t}\n\tsql = sql[:len(sql)-1]\n\tsql += \"\\n ) ENGINE=InnoDB DEFAULT CHARSET=utf8;\"\n\n\treturn sql\n}\n\nfunc (d *DBModel) InsertSql() string{\n\tsql := fmt.Sprintf(\"INSERT `%s` SET \",d.Name)\n\tfor i:= 1;i< len(d.Fields);i++{\n\t\tsql += fmt.Sprintf(\"`%s`=?,\",d.Fields[i].Name)\n\t}\n\tsql = sql[:len(sql)-1]\n\treturn sql\n}\n\nfunc (d *DBModel) InsertArgs() []interface{}{\n\targs := []interface{}{}\n\tfor i:= 1;i< len(d.Fields);i++{\n\t\trv := reflect.ValueOf(d.Fields[i].Value)\n\t\tswitch rv.Kind(){\n\t\tcase reflect.Array:\n\t\tcase reflect.Slice:\n\t\t\tbytes,_ := json.Marshal(d.Fields[i].Value)\n\t\t\targs = append(args,string(bytes))\n\t\tdefault:\n\t\t\targs = append(args,rv.String())\n\n\t\t}\n\t}\n\treturn args\n}\n\nfunc NewDBModel(name string,m 
map[string]interface{}) *DBModel{\n\n\tdbModel := &DBModel{Name:name,Fields:[]Field{}}\n\tdbModel.Fields = append(dbModel.Fields,Field{Name:strings.ToLower(\"FffId\"),Pk:true,Value:1})\n\tfor k,v := range m{\n\t\tdbModel.Fields = append(dbModel.Fields,Field{Name:strings.ToLower(k),Pk:false,Value:v})\n\t}\n\n\treturn dbModel\n}\n\n"
  },
  {
    "path": "spider/pipline/mysql/mysql.go",
    "content": "package mysql\n\nimport \"database/sql\"\nimport (\n\t_ \"github.com/go-sql-driver/mysql\"\n\t\"github.com/astaxie/beego\"\n\t\"github.com/astaxie/beego/orm\"\n)\n\nvar DB *sql.DB\n\nfunc InitMysql(mysql string) {\n\torm.RegisterDriver(\"mysql\", orm.DRMySQL)\n\torm.RegisterDataBase(\"default\", \"mysql\", mysql)\n\torm.RegisterModel(&C{})\n\torm.RunSyncdb(\"default\", false, true)\n}\n\ntype C struct {\n\tId int\n}\n\nfunc CreateTable(m *DBModel) error{\n\to := orm.NewOrm()\n\t_,err := o.Raw(m.TableSql()).Exec()\n\tif err != nil{\n\t\treturn err\n\t}\n\tbeego.Info(\"创建表 \",m.Name,\" 成功 【完成】\")\n\treturn nil\n}\n\nfunc Add(m *DBModel) error{\n\to := orm.NewOrm()\n\t_,err := o.Raw(m.InsertSql(),m.InsertArgs()...).Exec()\n\tif err != nil{\n\t\treturn err\n\t}\n\tbeego.Info(\"插入数据成功\")\n\treturn nil\n}\n\n"
  },
  {
    "path": "spider/pipline/mysql/mysqlPipline.go",
    "content": "package mysql\n\nimport (\n\t\"sync\"\n)\n\ntype MysqlPipline struct {\n\tsync.Once\n}\n\nfunc NewMysqlPipline() *MysqlPipline {\n\treturn &MysqlPipline{}\n}\n\nfunc (c *MysqlPipline) ProcessData(v []map[string]interface{}, taskName string, processName string) {\n\tfor _,m :=range v{\n\t\tdbModel := NewDBModel(processName,m)\n\n\t\tCreateTable(dbModel)\n\t\tAdd(dbModel)\n\t}\n}\n"
  },
  {
    "path": "spider/pipline/nsq/nsq.go",
    "content": "package nsq\n"
  },
  {
    "path": "spider/pipline/pipline.go",
    "content": "package pipline\n\ntype Pipline interface {\n\tProcessData(v []map[string]interface{}, taskName string, processName string)\n}\n"
  },
  {
    "path": "spider/process/filter/repoat_filter.go",
    "content": "package filter\n\nimport (\n\t\"YiSpider/spider/model\"\n\t\"sync\"\n)\n\nvar CuckooFilter map[string]int\nvar lock sync.RWMutex\n\nfunc init() {\n\tCuckooFilter = make(map[string]int)\n}\n\nfunc RepeatFilter(url string, process *model.Process) bool {\n\tsign := url\n\tif ok := get(sign); ok {\n\t\treturn false\n\t}\n\tput(sign)\n\treturn true\n}\n\nfunc get(name string) bool {\n\tlock.RLock()\n\tdefer lock.RUnlock()\n\t_, ok := CuckooFilter[name]\n\treturn ok\n}\n\nfunc put(name string) {\n\tlock.Lock()\n\tdefer lock.Unlock()\n\tCuckooFilter[name] = 1\n}\n"
  },
  {
    "path": "spider/process/filter/repoat_filter_test.go",
    "content": "package filter\n\nimport (\n\t\"YiSpider/spider/model\"\n\t\"fmt\"\n\t\"testing\"\n)\n\nfunc TestRepeatFilter(t *testing.T) {\n\ttask := &model.Task{\n\t\tId:     \"qiiubai\",\n\t\tName:   \"qiubai\",\n\t\tMethod: \"get\",\n\t\tHost:   \"https://www.qiushibaike.com\",\n\t\tUrl:    \"https://www.qiushibaike.com\",\n\t\tProcess: model.Process{\n\t\t\tUrl: \"https://www.qiushibaike.com\",\n\t\t\tRegUrl: []string{\n\t\t\t\t\"/.*?/page/[0-9]+\",\n\t\t\t},\n\t\t\tType: \"template\",\n\t\t\tTemplateRule: model.TemplateRule{\n\t\t\t\tRule: map[string]string{\n\t\t\t\t\t\"node\":        \"array|.article\",\n\t\t\t\t\t\"url\":         \"attr.href|.contentHerf\",\n\t\t\t\t\t\"author\":      \"attr.alt|.author a img\",\n\t\t\t\t\t\"content\":     \"text|.content span\",\n\t\t\t\t\t\"like_num\":    \"text|.stats-vote i\",\n\t\t\t\t\t\"comment_num\": \"text|.stats-comments a i\",\n\t\t\t\t},\n\t\t\t},\n\t\t},\n\t\tPipline: \"file\",\n\t}\n\n\tfmt.Println(RepeatFilter(\"/8hr/\", task))\n\tfmt.Println(RepeatFilter(\"/8hr/page/3/\", task))\n\tfmt.Println(RepeatFilter(\"/8hr/page/4/\", task))\n\tfmt.Println(RepeatFilter(\"/8hr/page/5/\", task))\n\tfmt.Println(RepeatFilter(\"/8hr/page/13/\", task))\n\tfmt.Println(RepeatFilter(\"/8hr/page/3/\", task))\n}\n"
  },
  {
    "path": "spider/process/filter/url_filter.go",
    "content": "package filter\n\nimport (\n\t\"YiSpider/spider/model\"\n\t\"regexp\"\n)\n\nfunc Filter(url string, process *model.Process) bool {\n\tif len(url) == 0 {\n\t\treturn false\n\t}\n\n\tcheck := false\n\tfor _, regUrl := range process.RegUrl {\n\t\treg := regexp.MustCompile(regUrl)\n\t\tmatch := reg.MatchString(url)\n\t\tif match {\n\t\t\tcheck = true\n\t\t\tbreak\n\t\t}\n\t}\n\n\tif check == false {\n\t\treturn false\n\t}\n\n\treturn RepeatFilter(url, process)\n}\n"
  },
  {
    "path": "spider/process/filter/url_filter_test.go",
    "content": "package filter\n\nimport (\n\t\"YiSpider/spider/model\"\n\t\"fmt\"\n\t\"testing\"\n)\n\nfunc TestFilter(t *testing.T) {\n\ttask := &model.Task{\n\t\tId:     \"qiiubai\",\n\t\tName:   \"qiubai\",\n\t\tMethod: \"get\",\n\t\tHost:   \"https://www.qiushibaike.com\",\n\t\tUrl:    \"https://www.qiushibaike.com\",\n\t\tProcess: model.Process{\n\t\t\tUrl: \"https://www.qiushibaike.com\",\n\t\t\tRegUrl: []string{\n\t\t\t\t\"/.*?/page/[0-9]+\",\n\t\t\t},\n\t\t\tType: \"template\",\n\t\t\tTemplateRule: model.TemplateRule{\n\t\t\t\tRule: map[string]string{\n\t\t\t\t\t\"node\":        \"array|.article\",\n\t\t\t\t\t\"url\":         \"attr.href|.contentHerf\",\n\t\t\t\t\t\"author\":      \"attr.alt|.author a img\",\n\t\t\t\t\t\"content\":     \"text|.content span\",\n\t\t\t\t\t\"like_num\":    \"text|.stats-vote i\",\n\t\t\t\t\t\"comment_num\": \"text|.stats-comments a i\",\n\t\t\t\t},\n\t\t\t},\n\t\t},\n\t\tPipline: \"file\",\n\t}\n\n\tfmt.Println(Filter(\"/8hr/\", task))\n\tfmt.Println(Filter(\"/8hr/page/3/\", task))\n\tfmt.Println(Filter(\"/8hr/page/4/\", task))\n\tfmt.Println(Filter(\"/8hr/page/5/\", task))\n\tfmt.Println(Filter(\"/8hr/page/13/\", task))\n\tfmt.Println(Filter(\"/8hr/page/3/\", task))\n}\n"
  },
  {
    "path": "spider/process/json-process/json_process.go",
    "content": "package json_process\n\nimport (\n\t\"YiSpider/spider/model\"\n)\n\ntype JsonProcess struct {\n\tjsonProcess *model.Process\n}\n\nfunc NewJsonProcess(jsonProcess *model.Process) *JsonProcess {\n\treturn &JsonProcess{jsonProcess: jsonProcess}\n}\n\nfunc (j *JsonProcess) Process(context model.Context) (*model.Page, error) {\n\treturn JsonRuleProcess(j.jsonProcess, context)\n}\n"
  },
  {
    "path": "spider/process/json-process/json_rule.go",
    "content": "package json_process\n\nimport (\n\t\"YiSpider/spider/common\"\n\t\"YiSpider/spider/logger\"\n\t\"YiSpider/spider/model\"\n\tsimplejson \"github.com/bitly/go-simplejson\"\n\t\"strings\"\n)\n\nfunc JsonRuleProcess(process *model.Process, context model.Context) (*model.Page, error) {\n\treturn Process(process, context)\n}\n\nfunc Process(process *model.Process, context model.Context) (*model.Page, error) {\n\tjsonRule := process.JsonRule.Rule\n\tpage := &model.Page{}\n\n\tsJson, err := simplejson.NewJson(context.Body)\n\tif err != nil {\n\t\tlogger.Error(\"NewDocumentFromReader fail,\", err)\n\t\treturn nil, err\n\t}\n\n\tresultType := \"map\"\n\trootSel := []string{}\n\n\tv, ok := jsonRule[\"node\"]\n\n\tif ok {\n\t\tcontentInfo := strings.Split(v, \"|\")\n\t\tresultType = contentInfo[0]\n\t\tselStr := contentInfo[1]\n\t\trootSel = strings.Split(selStr, \".\")\n\t}\n\n\tif resultType == \"array\" {\n\t\tfor _, name := range rootSel {\n\t\t\tsJson = sJson.Get(name)\n\t\t}\n\t\trootNode, err := sJson.Array()\n\t\tif err != nil {\n\t\t\tlogger.Error(\"Json fail,\", err)\n\t\t\treturn nil, err\n\t\t}\n\t\tif len(rootNode) >= 0 {\n\t\t\tfor _, node := range rootNode {\n\t\t\t\tnodeMap, ok := node.(map[string]interface{})\n\t\t\t\tif !ok {\n\t\t\t\t\tcontinue\n\t\t\t\t}\n\t\t\t\tdata := map[string]interface{}{}\n\t\t\t\tfor key, value := range jsonRule {\n\t\t\t\t\tif key == \"node\" {\n\t\t\t\t\t\tcontinue\n\t\t\t\t\t}\n\t\t\t\t\tdata[key] = nodeMap[value]\n\t\t\t\t}\n\t\t\t\tif len(process.AddQueue) > 0 {\n\t\t\t\t\tpage.AddUrls(common.PraseReq(process.AddQueue, data))\n\t\t\t\t}\n\t\t\t\tpage.AddResult(data)\n\t\t\t}\n\t\t}\n\t}\n\n\tif resultType == \"map\" {\n\n\t\tresult := map[string]interface{}{}\n\n\t\tfor _, name := range rootSel {\n\t\t\tsJson = sJson.Get(name)\n\t\t}\n\n\t\tif err != nil {\n\t\t\tlogger.Error(\"Json fail,\", err)\n\t\t\treturn nil, err\n\t\t}\n\n\t\tfor key, value := range jsonRule {\n\t\t\tvalueSel := 
[]string{}\n\t\t\tvalueSel = strings.Split(value, \".\")\n\t\t\tvalueNode := *sJson\n\t\t\tfor _, name := range valueSel {\n\t\t\t\tvalueNode = *valueNode.Get(name)\n\t\t\t}\n\t\t\tresult[key] = valueNode.Interface()\n\t\t}\n\n\t\tif len(process.AddQueue) > 0 {\n\t\t\tpage.AddUrls(common.PraseReq(process.AddQueue, result))\n\t\t}\n\t\tpage.AddResult(result)\n\t}\n\n\tif resultType == \"nil\" {\n\n\t\tresult := map[string]interface{}{}\n\n\t\tfor _, name := range rootSel {\n\t\t\tsJson = sJson.Get(name)\n\t\t}\n\t\trootNode, err := sJson.Map()\n\n\t\tif err != nil {\n\t\t\tlogger.Error(\"Json fail,\", err)\n\t\t\treturn nil, err\n\t\t}\n\n\t\tfor key, value := range jsonRule {\n\t\t\tresult[key] = rootNode[value]\n\t\t}\n\t\tpage.Urls = []*model.Request{}\n\t\tif len(process.AddQueue) > 0 {\n\t\t\tpage.AddUrls(common.PraseReq(process.AddQueue, result))\n\t\t}\n\t}\n\treturn page, nil\n}\n"
  },
  {
    "path": "spider/process/process.go",
    "content": "package process\n\nimport (\n\t\"YiSpider/spider/model\"\n)\n\ntype Process interface {\n\tProcess(context model.Context) (*model.Page, error)\n}\n"
  },
  {
    "path": "spider/process/template-process/template_process.go",
    "content": "package template_process\n\nimport (\n\t\"YiSpider/spider/model\"\n)\n\ntype TemplateProcess struct {\n\ttempProcess *model.Process\n}\n\nfunc NewTemplateProcess(tempProcess *model.Process) *TemplateProcess {\n\treturn &TemplateProcess{tempProcess: tempProcess}\n}\n\nfunc (t *TemplateProcess) Process(context model.Context) (*model.Page, error) {\n\treturn TemplateRuleProcess(t.tempProcess, context)\n\n}\n"
  },
  {
    "path": "spider/process/template-process/template_rule.go",
    "content": "package template_process\n\nimport (\n\t\"YiSpider/spider/common\"\n\t\"YiSpider/spider/logger\"\n\t\"YiSpider/spider/model\"\n\t\"YiSpider/spider/process/filter\"\n\t\"bytes\"\n\t\"encoding/json\"\n\t\"fmt\"\n\t\"github.com/PuerkitoBio/goquery\"\n\turl2 \"net/url\"\n\t\"strings\"\n)\n\nfunc TemplateRuleProcess(process *model.Process, context model.Context) (*model.Page, error) {\n\tpage := &model.Page{}\n\n\trule := process.TemplateRule.Rule\n\n\tdoc, err := goquery.NewDocumentFromReader(bytes.NewBuffer(context.Body))\n\tif err != nil {\n\t\tlogger.Error(\"NewDocumentFromReader fail,\", err)\n\t\treturn nil, err\n\t}\n\n\tif len(process.RegUrl) > 0 {\n\t\tdoc.Find(\"a\").Each(func(i int, sel *goquery.Selection) {\n\t\t\thref, _ := sel.Attr(\"href\")\n\t\t\thref = getComplateUrl(context.Request.URL, href)\n\t\t\tif filter.Filter(href, process) {\n\t\t\t\tpage.AddUrl(&model.Request{Url: href, Method: \"get\"})\n\t\t\t}\n\t\t})\n\t}\n\n\tresultType := \"map\"\n\trootSel := \"\"\n\n\tv, ok := rule[\"node\"]\n\tif ok {\n\t\tcontentInfo := strings.Split(v, \"|\")\n\t\tresultType = contentInfo[0]\n\t\trootSel = contentInfo[1]\n\t}\n\n\tif resultType == \"array\" {\n\n\t\tdoc.Find(rootSel).Each(func(i int, s *goquery.Selection) {\n\t\t\tdata := getMapFromDom(rule, s)\n\t\t\tif data == nil {\n\t\t\t\treturn\n\t\t\t}\n\t\t\tif len(process.AddQueue) > 0 {\n\t\t\t\tpage.AddUrls(common.PraseReq(process.AddQueue, data))\n\t\t\t}\n\t\t\tpage.AddResult(data)\n\t\t})\n\t}\n\n\tif resultType == \"map\" {\n\t\tdata := getMapFromDom(rule, doc.Selection)\n\t\tif len(process.AddQueue) > 0 {\n\t\t\tpage.AddUrls(common.PraseReq(process.AddQueue, data))\n\t\t}\n\t\tpage.AddResult(data)\n\t}\n\n\treturn page, nil\n}\n\nfunc getMapFromDom(rule map[string]string, node *goquery.Selection) map[string]interface{} {\n\n\tresult := make(map[string]interface{})\n\n\tisNull := true\n\n\tfor key, value := range rule {\n\n\t\tif key == \"node\" {\n\t\t\tcontinue\n\t\t}\n\n\t\trules := 
strings.Split(value, \"|\")\n\t\tValueType := strings.Split(rules[0], \".\")\n\n\t\tif len(rules) < 2 {\n\t\t\tcontinue\n\t\t}\n\n\t\ts := node.Find(rules[1])\n\t\tswitch ValueType[0] {\n\t\tcase \"text\":\n\t\t\tresult[key] = s.Text()\n\t\tcase \"html\":\n\t\t\tresult[key], _ = s.Html()\n\t\tcase \"attr\":\n\t\t\tif len(ValueType) < 2 {\n\t\t\t\tcontinue\n\t\t\t}\n\t\t\tresult[key], _ = s.Attr(ValueType[1])\n\t\tcase \"texts\":\n\t\t\tarr := []string{}\n\t\t\ts.Each(func(i int, sel *goquery.Selection) {\n\t\t\t\ttext := sel.Text()\n\t\t\t\tarr = append(arr, text)\n\t\t\t})\n\t\t\tj, _ := json.Marshal(arr)\n\t\t\tresult[key] = string(j)\n\t\tcase \"htmls\":\n\t\t\tarr := []string{}\n\t\t\ts.Each(func(i int, sel *goquery.Selection) {\n\t\t\t\thtml, _ := s.Html()\n\t\t\t\tarr = append(arr, html)\n\t\t\t})\n\t\t\tj, _ := json.Marshal(arr)\n\t\t\tresult[key] = string(j)\n\t\tcase \"attrs\":\n\t\t\tarr := []string{}\n\t\t\tattr := \"\"\n\t\t\ts.Each(func(i int, sel *goquery.Selection) {\n\t\t\t\tif len(ValueType) >= 2 {\n\t\t\t\t\tattr, _ = sel.Attr(ValueType[1])\n\t\t\t\t\tarr = append(arr, attr)\n\t\t\t\t}\n\t\t\t})\n\t\t\tresult[key] = arr\n\t\tdefault:\n\t\t\tresult[key] = \"\"\n\t\t}\n\t\tres, ok := result[key].(string)\n\t\tif ok || len(res) != 0 {\n\t\t\tisNull = false\n\t\t}\n\t}\n\n\tif isNull == true {\n\t\treturn nil\n\t}\n\n\treturn result\n}\n\nfunc getComplateUrl(url *url2.URL, href string) string {\n\n\tif strings.HasPrefix(href, \"/\") {\n\t\tnewHref := fmt.Sprintf(\"%s://%s%s\", url.Scheme, url.Host, href)\n\t\treturn newHref\n\t}\n\n\tnewHref := fmt.Sprintf(\"%s://%s/%s\", url.Scheme, url.Host, href)\n\treturn newHref\n}\n"
  },
  {
    "path": "spider/process/template-process/template_rule_test.go",
    "content": "package template_process\n\nimport (\n\t\"encoding/json\"\n\t\"fmt\"\n\t\"github.com/PuerkitoBio/goquery\"\n\t\"testing\"\n)\n\nfunc TestTemplateProcess(t *testing.T) {\n\t//doc,err := goquery.NewDocument(\"https://www.qiushibaike.com/\")\n\t//if err != nil{\n\t//\tt.Fatal(\"open url fail \",err)\n\t//}\n\t//html,err := doc.Html()\n\t//if err != nil{\n\t//\tt.Fatal(\"get html fail \",err)\n\t//}\n\t//\n\t//\n\t//rule := map[string]string{\n\t//\t\"node\":\"array|.article\",\n\t//\t\"url\":\"attr.href|.contentHerf\",\n\t//\t\"author\":\"attr.alt|.author a img\",\n\t//\t\"content\":\"text|.content span\",\n\t//\t\"like_num\":\"text|.stats-vote i\",\n\t//\t\"comment_num\":\"text|.stats-comments a i\",\n\t//}\n\t//\n\t//result,_ := TemplateRuleProcess(rule,[]byte(html))\n\t//data,_ := json.Marshal(result)\n\t//fmt.Println(\"Result :\",string(data))\n}\n"
  },
  {
    "path": "spider/register/etcd/etcd.go",
    "content": "package etcd\n\nimport (\n\t\"encoding/json\"\n\t\"log\"\n\t\"runtime\"\n\t\"time\"\n\n\t\"YiSpider/spider/core\"\n\t\"github.com/coreos/etcd/client\"\n\t\"golang.org/x/net/context\"\n)\n\ntype Worker struct {\n\tName    string\n\tIP      string\n\tKeysAPI client.KeysAPI\n}\n\ntype WorkerInfo struct {\n\tName       string                 `json:\"name\"`\n\tIP         string                 `json:\"ip\"`\n\tCPU        int                    `json:\"cpu\"`\n\tMetaData   map[string]string      `json:\"metadata\"`\n\tSpiderData map[string]*SpiderData `json:\"spider_data\"`\n}\ntype SpiderData struct {\n\tDownloadFailCount int32 `json:\"download_fail_count\"`\n\tDownloadCount     int32 `json:\"download_count\"`\n\n\tUrlNum           int32 `json:\"url_num\"`\n\tWaitUrlNum       int   `json:\"wait_url_num\"`\n\tCrawlerResultNum int32 `json:\"crawler_result_num\"`\n}\n\nfunc NewWorker(name, IP string, endpoints []string) *Worker {\n\tcfg := client.Config{\n\t\tEndpoints:               endpoints,\n\t\tTransport:               client.DefaultTransport,\n\t\tHeaderTimeoutPerRequest: time.Second,\n\t}\n\n\tetcdClient, err := client.New(cfg)\n\tif err != nil {\n\t\tlog.Fatal(\"Error: cannot connec to etcd:\", err)\n\t}\n\n\tw := &Worker{\n\t\tName:    name,\n\t\tIP:      IP,\n\t\tKeysAPI: client.NewKeysAPI(etcdClient),\n\t}\n\treturn w\n}\n\nfunc (w *Worker) HeartBeat() {\n\tapi := w.KeysAPI\n\n\tfor {\n\t\tinfo := &WorkerInfo{\n\t\t\tName:       w.Name,\n\t\t\tIP:         w.IP,\n\t\t\tCPU:        runtime.NumCPU(),\n\t\t\tSpiderData: getSpiderData(),\n\t\t}\n\n\t\tkey := \"spiders/\" + w.Name\n\t\tvalue, _ := json.Marshal(info)\n\n\t\t_, err := api.Set(context.Background(), key, string(value), &client.SetOptions{\n\t\t\tTTL: time.Second * 15,\n\t\t})\n\t\tif err != nil {\n\t\t\tlog.Println(\"Error update workerInfo:\", err)\n\t\t}\n\t\ttime.Sleep(time.Second * 5)\n\t}\n}\n\nfunc getSpiderData() map[string]*SpiderData {\n\tdatas := 
make(map[string]*SpiderData)\n\tmetas := core.GetEnine().GetTaskMetas()\n\tfor name, meta := range metas {\n\t\tdata := &SpiderData{}\n\t\tdata.CrawlerResultNum = meta.CrawlerResultNum\n\t\tdata.DownloadFailCount = meta.DownloadFailCount\n\t\tdata.DownloadCount = meta.DownloadCount\n\t\tdata.WaitUrlNum = meta.WaitUrlNum\n\t\tdata.UrlNum = meta.UrlNum\n\t\tdatas[name] = data\n\t}\n\treturn datas\n}\n"
  },
  {
    "path": "spider/schedule/schedule.go",
    "content": "package schedule\n\nimport (\n\t\"YiSpider/spider/config\"\n\t\"YiSpider/spider/model\"\n)\n\ntype Schedule interface {\n\tPush(req *model.Request)\n\tPushMuti(reqs []*model.Request)\n\tPop() (*model.Request, bool)\n\tCount() int\n\tClose()\n}\n\nvar (\n\tscheduleMap = make(map[string]func(*config.Config) Schedule)\n)\n\nfunc RegisterSchedule(name string, builder func(*config.Config) Schedule) {\n\tscheduleMap[name] = builder\n}\n\nfunc GetSchedule(c *config.Config) Schedule {\n\tschedule := scheduleMap[c.ScheduleMode]\n\tif schedule == nil {\n\t\treturn scheduleMap[\"chan\"](c)\n\t}\n\treturn schedule(c)\n}\n"
  },
  {
    "path": "spider/schedule/schedule_chan.go",
    "content": "package schedule\n\nimport (\n\t\"YiSpider/manage/logger\"\n\t\"YiSpider/spider/common\"\n\t\"YiSpider/spider/config\"\n\t\"YiSpider/spider/model\"\n)\n\ntype ChanSchedule struct {\n\twaitQueue chan *model.Request\n}\n\nfunc NewChanSchedule(config *config.Config) Schedule {\n\tschedule := &ChanSchedule{}\n\tschedule.waitQueue = make(chan *model.Request, config.MaxWaitNum)\n\treturn schedule\n}\n\nfunc (d *ChanSchedule) Push(req *model.Request) {\n\tpraseReqs := common.PraseReq([]*model.Request{req}, nil)\n\tfor _, req := range praseReqs {\n\t\tlogger.Info(\"Push Url:\", req.Url, req.ProcessName, len(d.waitQueue))\n\t\td.waitQueue <- req\n\t}\n}\n\nfunc (d *ChanSchedule) PushMuti(reqs []*model.Request) {\n\tpraseReqs := common.PraseReq(reqs, nil)\n\tfor _, req := range praseReqs {\n\t\tlogger.Info(\"Push Url:\", req.Url, req.ProcessName, len(d.waitQueue))\n\t\td.waitQueue <- req\n\t}\n}\n\nfunc (d *ChanSchedule) Pop() (*model.Request, bool) {\n\treq, ok := <-d.waitQueue\n\tlogger.Info(\"Pop Url:\", req.Url, req.ProcessName, len(d.waitQueue))\n\treturn req, ok\n}\n\nfunc (d *ChanSchedule) Count() int {\n\treturn len(d.waitQueue)\n}\n\nfunc (d *ChanSchedule) Close() {\n\tclose(d.waitQueue)\n}\n\nfunc init() {\n\tRegisterSchedule(\"chan\", NewChanSchedule)\n}\n"
  },
  {
    "path": "spider/schedule/schedule_chan_test.go",
    "content": "package schedule\n\nimport (\n\t\"testing\"\n)\n\nfunc TestInitDownloader(t *testing.T) {\n\t//s := NewSchedule(4)\n\t//s.Push(&model.Task{Id:\"hao123\",Url:\"http://www.hao123.com\",Method:\"get\"})\n\t//task,ok := s.Pop()\n\t//if !ok{\n\t//\tt.Fatal()\n\t//}\n\t//fmt.Println(task)\n}\n"
  },
  {
    "path": "spider/schedule/schedule_redis.go",
    "content": "package schedule\n\nimport (\n\t\"YiSpider/spider/common\"\n\t\"YiSpider/spider/config\"\n\t\"YiSpider/spider/logger\"\n\t\"YiSpider/spider/model\"\n\t\"github.com/garyburd/redigo/redis\"\n\t\"time\"\n)\n\ntype RedisSchedule struct {\n\tname    string\n\taddress string\n\tpool    *redis.Pool\n}\n\nfunc NewRedisSchedule(config *config.Config) Schedule {\n\tschedule := &RedisSchedule{}\n\tschedule.address = config.RedisAddr\n\tschedule.name = config.Name\n\tschedule.connect()\n\n\treturn schedule\n}\n\nfunc (r *RedisSchedule) connect() {\n\tr.pool = &redis.Pool{\n\t\tMaxIdle:     10,\n\t\tIdleTimeout: 240 * time.Second,\n\t\tDial:        func() (redis.Conn, error) { return redis.Dial(\"tcp\", r.address) },\n\t}\n\n\tgo r.CronCount(1)\n}\n\nfunc (r *RedisSchedule) Push(req *model.Request) {\n\tconn := r.pool.Get()\n\tdefer conn.Close()\n\n\tpraseReqs := common.PraseReq([]*model.Request{req}, nil)\n\tfor _, req := range praseReqs {\n\t\tlogger.Info(\"Push Url:\", req.Url, req.ProcessName)\n\t\tbody, err := req.Write()\n\t\tif err != nil {\n\t\t\tlogger.Info(\"Push Url:\", err.Error())\n\t\t\tcontinue\n\t\t}\n\t\t_, err = conn.Do(\"LPUSH\", r.name, body)\n\t\tif err != nil {\n\t\t\tlogger.Info(\"Push Url:\", err.Error())\n\t\t\tcontinue\n\t\t}\n\t}\n}\n\nfunc (r *RedisSchedule) PushMuti(reqs []*model.Request) {\n\tconn := r.pool.Get()\n\tdefer conn.Close()\n\n\tpraseReqs := common.PraseReq(reqs, nil)\n\tfor _, req := range praseReqs {\n\t\tlogger.Info(\"Push Url:\", req.Url, req.ProcessName)\n\t\tbody, err := req.Write()\n\t\tif err != nil {\n\t\t\tlogger.Info(\"Push Url:\", err.Error())\n\t\t\tcontinue\n\t\t}\n\t\t_, err = conn.Do(\"LPUSH\", r.name, body)\n\t\tif err != nil {\n\t\t\tlogger.Info(\"Push Url:\", err.Error())\n\t\t\tcontinue\n\t\t}\n\t}\n}\n\nfunc (r *RedisSchedule) Pop() (*model.Request, bool) {\n\tconn := r.pool.Get()\n\tdefer conn.Close()\n\n\tvalue, err := redis.ByteSlices(conn.Do(\"BRPOP\", r.name, 5))\n\tif err != nil 
{\n\t\tlogger.Info(\"Pop Url: \", err.Error())\n\t\treturn nil, true\n\t}\n\n\treq := &model.Request{}\n\tif err := req.Read(value[1]); err != nil {\n\t\tlogger.Info(\"Pop Url: \", err.Error())\n\t\treturn nil, true\n\t}\n\n\tlogger.Info(\"Pop Url:\", req.Url, req.ProcessName)\n\treturn req, true\n}\n\nfunc (r *RedisSchedule) Count() int {\n\tconn := r.pool.Get()\n\tdefer conn.Close()\n\n\tvalue, err := redis.Int(conn.Do(\"LLEN\", r.name))\n\tif err != nil {\n\t\tlogger.Info(\"Count  \", err.Error())\n\t\treturn -1\n\t}\n\treturn value\n}\n\nfunc (r *RedisSchedule) Close() {\n\tr.pool.Close()\n}\n\nfunc (r *RedisSchedule) CronCount(flushTime int) {\n\tticker := time.NewTicker(time.Second * time.Duration(flushTime))\n\tgo func() {\n\t\tfor range ticker.C {\n\t\t\tlogger.Info(\"RedisSchedule Count:\", r.Count())\n\t\t}\n\t}()\n}\n\nfunc init() {\n\tRegisterSchedule(\"redis\", NewRedisSchedule)\n}\n"
  },
  {
    "path": "spider/schedule/schedule_redis_test.go",
    "content": "package schedule\n\nimport (\n\t\"testing\"\n\n\t\"YiSpider/spider/config\"\n\t\"YiSpider/spider/model\"\n)\n\nfunc TestRedisSchedule_Push(t *testing.T) {\n\ts := NewRedisSchedule(&config.Config{RedisAddr: \"127.0.0.1:6379\"})\n\ts.Push(&model.Request{Url: \"www.bai123.com\", Method: \"get\", Header: map[string]string{\"a\": \"b\"}})\n}\n\nfunc TestRedisSchedule_Pop(t *testing.T) {\n\ts := NewRedisSchedule(&config.Config{Name: \"qiongyou_spider\", RedisAddr: \"127.0.0.1:6379\"})\n\tfor i := 0; i < 100; i++ {\n\t\tgo s.Pop()\n\t}\n}\n"
  },
  {
    "path": "spider/spider/spider.go",
    "content": "package spider\n\nimport (\n\t\"YiSpider/spider/model\"\n\t\"YiSpider/spider/pipline\"\n\t\"YiSpider/spider/pipline/console\"\n\t\"YiSpider/spider/pipline/file\"\n\t\"YiSpider/spider/process\"\n\t\"YiSpider/spider/process/json-process\"\n\t\"YiSpider/spider/process/template-process\"\n\t\"YiSpider/spider/pipline/mysql\"\n\t\"YiSpider/spider/config\"\n)\n\ntype Spider struct {\n\tId   string\n\tName string\n\n\tDepth    int\n\tEndCount int\n\n\tRequests []*model.Request\n\n\tProcess map[string][]process.Process\n\tPipline pipline.Pipline\n}\n\nfunc (s *Spider) GetPipline() pipline.Pipline {\n\treturn s.Pipline\n}\n\nfunc (s *Spider) GetProcess(name string) []process.Process {\n\treturn s.Process[name]\n}\n\nfunc (s *Spider) GetRequests() []*model.Request {\n\treturn s.Requests\n}\n\nfunc (s *Spider) AddProcess(name string, p process.Process) {\n\tif s.Process == nil {\n\t\ts.Process = make(map[string][]process.Process)\n\t}\n\tprocesss, ok := s.Process[name]\n\tif !ok {\n\t\tps := []process.Process{}\n\t\ts.Process[name] = append(ps, p)\n\t} else {\n\t\tprocesss = append(processs, p)\n\t}\n}\n\nfunc InitWithTask(task *model.Task) *Spider {\n\ts := &Spider{}\n\ts.Id = task.Id\n\ts.Name = task.Name\n\ts.Depth = task.Depth\n\ts.EndCount = task.EndCount\n\ts.Requests = task.Request\n\n\ts.Process = make(map[string][]process.Process)\n\n\tfor i, p := range task.Process {\n\t\tswitch p.Type {\n\t\tcase \"template\":\n\t\t\tprocesss, ok := s.Process[p.Name]\n\t\t\tif !ok {\n\t\t\t\tprocesss = []process.Process{}\n\t\t\t\ts.Process[p.Name] = processs\n\t\t\t}\n\t\t\ts.Process[p.Name] = append(processs, template_process.NewTemplateProcess(&task.Process[i]))\n\t\tcase \"json\":\n\t\t\tprocesss, ok := s.Process[p.Name]\n\t\t\tif !ok {\n\t\t\t\tprocesss = []process.Process{}\n\t\t\t\ts.Process[p.Name] = processs\n\t\t\t}\n\t\t\ts.Process[p.Name] = append(processs, json_process.NewJsonProcess(&task.Process[i]))\n\t\t}\n\t}\n\n\tswitch task.Pipline {\n\tcase 
\"console\":\n\t\ts.Pipline = console.NewConsolePipline()\n\tcase \"file\":\n\t\ts.Pipline = file.NewFilePipline(\"./\")\n\tcase \"mysql\":\n\t\ts.Pipline = mysql.NewMysqlPipline()\n\n\tdefault:\n\t\tif len(config.ConfigI.Mysql) > 0{\n\t\t\ts.Pipline = mysql.NewMysqlPipline()\n\t\t}else{\n\t\t\ts.Pipline = file.NewFilePipline(\"./\")\n\t\t}\n\t}\n\n\n\treturn s\n}\n"
  },
  {
    "path": "storage/conf.json",
    "content": "{\n  \"name\":\"yi_spider_storage\",\n  \"version\":\"0.01\"\n}"
  },
  {
    "path": "storage/config/config.go",
    "content": "package config\n\nimport (\n\t\"YiSpider/storage/logger\"\n\t\"encoding/json\"\n\t\"io/ioutil\"\n\t\"os\"\n)\n\nvar ConfigI *Config\n\ntype Config struct {\n\tName    string `json:\"name\"`\n\tVersion string `json:\"version\"`\n}\n\nfunc InitConfig() error {\n\tvar file *os.File\n\tvar bytes []byte\n\tvar err error\n\n\tif file, err = os.OpenFile(\"./storage/conf.json\", os.O_RDONLY, 0666); err != nil {\n\t\treturn err\n\t}\n\n\tif bytes, err = ioutil.ReadAll(file); err != nil {\n\t\treturn err\n\t}\n\n\tConfigI = &Config{}\n\tif err = json.Unmarshal(bytes, ConfigI); err != nil {\n\t\treturn err\n\t}\n\n\tlogger.Info(\"init success \", *ConfigI)\n\treturn nil\n}\n"
  },
  {
    "path": "storage/db/elasticsearch/elasticsearch.go",
    "content": "package elasticsearch\n\nfunc init() {\n\n}\n"
  },
  {
    "path": "storage/db/hbase/hbase.go",
    "content": "package hbase\n"
  },
  {
    "path": "storage/db/mysql/mysql.go",
    "content": "package mysql\n"
  },
  {
    "path": "storage/logger/logger.go",
    "content": "package logger\n\nimport \"fmt\"\n\nfunc Info(v ...interface{}) {\n\tfmt.Println(v)\n}\n\nfunc Debug(v ...interface{}) {\n\tfmt.Println(v)\n}\n\nfunc Warn(v ...interface{}) {\n\tfmt.Println(v)\n}\n\nfunc Error(v ...interface{}) {\n\tfmt.Println(v)\n}\n"
  },
  {
    "path": "storage/main.go",
    "content": "package main\n\nimport (\n\t\"YiSpider/storage/config\"\n\t\"YiSpider/storage/logger\"\n)\n\nfunc main() {\n\n\tvar err error\n\n\tif err = config.InitConfig(); err != nil {\n\t\tlogger.Info(err.Error())\n\t\treturn\n\t}\n\n}\n"
  }
]