6. How can I prevent Baiduspider from crawling my site?
[COLOR=#333333][FONT=Arial]Baiduspider follows the robots.txt protocol. You can prevent Baiduspider from crawling your entire site, or specific content on it, by specifying rules in robots.txt. Please note that once you do this, the pages you block will not appear in Baidu search results or in any other search results provided by Baidu. For details on setting up a robots.txt file, please see How to create a robots.txt.
You can set different rules for different user-agents. (Please note that Baiduspider-video does not currently support these rules.) If you want to block all of Baidu's user-agents, you can simply block Baiduspider.
The following robots.txt rules block all crawling by Baidu:
User-agent: Baiduspider
Disallow: /
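The blocking rule above can be sanity-checked locally before it is deployed. The following is a minimal sketch using Python's standard urllib.robotparser module. The helper name is_allowed, the constant BLOCK_BAIDU, and the example.com URL are made up for illustration. Python's parser uses its own user-agent matching, which may not be identical to how Baiduspider itself interprets robots.txt, so treat the result as a rough check only.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted above, one directive per line.
BLOCK_BAIDU = """\
User-agent: Baiduspider
Disallow: /
"""


def is_allowed(rules: str, user_agent: str, url: str) -> bool:
    """Return True if Python's robots.txt parser lets user_agent fetch url.

    Hypothetical helper for illustration; it reflects the matching logic of
    urllib.robotparser, not necessarily Baidu's own crawler.
    """
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/some/page.html"  # placeholder URL
    for agent in ("Baiduspider", "Baiduspider-image", "Googlebot"):
        print(agent, is_allowed(BLOCK_BAIDU, agent, url))
```

Under Python's parser, the single Baiduspider group also matches Baiduspider-image, while crawlers with unrelated names are unaffected because the file contains no rules for them.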
The following robots.txt rules block Baiduspider in general while allowing Baiduspider-image to crawl the /image/ directory:
User-agent: Baiduspider
Disallow: /

User-agent: Baiduspider-image
Allow: /image/
[/FONT][/COLOR] Please note that pages crawled by Baiduspider-cpro are not added to the index, and Baiduspider-cpro operates under an agreement made with customers. As a result, Baiduspider-cpro does not follow the records set in robots.txt. If you have concerns about Baiduspider-cpro, please contact union1@baidu.com. Likewise, pages crawled by Baiduspider-ads are not added to the index, and Baiduspider-ads operates under an agreement made with customers, so it does not follow the records set in robots.txt either. If you have concerns about Baiduspider-ads, please contact your customer service representative.