To start, we need to talk about the ways we can control robots. There are three primary ones: robots.txt, meta robots, and, well, the nofollow tag, which is a little less about controlling bots.
Robots.txt lives at yoursite.com/robots.txt. It tells crawlers what they should and shouldn’t access, but it doesn’t always get respected by Google and Bing. So a lot of folks say, “hey, disallow this,” and then suddenly see those URLs popping up in the results and wonder what’s going on. Look, Google and Bing oftentimes think that they just know better. They think that maybe you’ve made a mistake. They think, “hey, there are a lot of links pointing to this content, there are a lot of people visiting and caring about this content, maybe you didn’t intend for us to block it.” The more specific you get about an individual URL, the better they usually are about respecting it. The less specific you are, meaning the more you use wildcards or say “everything behind this entire big directory,” the worse they are about believing you.
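To make the specific-URL versus whole-directory distinction concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules and paths are made up for illustration, and the result only shows what a crawler that follows the file to the letter would do. It says nothing about how Google or Bing actually decide to treat those URLs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one very specific URL and one entire directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/report.html
Disallow: /archive/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that we already have in memory."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A single, explicitly named URL.
specific_allowed = parser.can_fetch("*", "https://yoursite.com/private/report.html")
# Anything behind the broad directory rule.
directory_allowed = parser.can_fetch("*", "https://yoursite.com/archive/2010/post.html")
# A URL that no rule mentions.
other_allowed = parser.can_fetch("*", "https://yoursite.com/blog/post.html")

print(specific_allowed, directory_allowed, other_allowed)
```

A strictly compliant crawler skips both disallowed URLs. The point above is that the engines tend to trust the narrow, single-URL rule more than the broad directory rule.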
Meta robots is a little different. It lives in the head section of individual pages, so you can only control a single page with a meta robots tag. It tells the engines whether or not they should keep a page in the index and whether they should follow the links on that page. It’s usually a lot more respected, because it’s at an individual-page level; Google and Bing tend to believe you about the meta robots tag.
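As a rough illustration of what the tag carries, the sketch below reads a meta robots tag out of a page with Python's standard `html.parser`. The sample page and the helper names (`MetaRobotsParser`, `read_meta_robots`) are hypothetical, and real crawlers apply more rules than this.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated, e.g. "noindex, follow".
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def read_meta_robots(html: str) -> set:
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page: keep it out of the index, but follow its links.
PAGE = '<html><head><meta name="robots" content="noindex, follow"></head><body></body></html>'

directives = read_meta_robots(PAGE)
may_index = "noindex" not in directives
may_follow_links = "nofollow" not in directives
print(sorted(directives), may_index, may_follow_links)
```

The two booleans map to the two questions the tag answers: whether the page stays in the index, and whether the links on it get followed.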
And then there’s the nofollow tag, which lives on an individual link on a page. It doesn’t tell engines where to crawl or not to crawl. All it’s saying is whether you editorially vouch for the page being linked to, and whether you want to pass PageRank and link equity metrics to that page.
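Since nofollow is a per-link signal, a small sketch can show where it sits. This one lists each link on a hypothetical page and whether it carries `rel="nofollow"`. The class name `LinkRelParser` and the sample markup are assumptions for illustration only.

```python
from html.parser import HTMLParser


class LinkRelParser(HTMLParser):
    """Records (href, is_nofollow) for every <a href="..."> on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href is None:
            return
        # rel is a space-separated token list, e.g. "nofollow noopener".
        rel_tokens = (attr.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_tokens))


# Hypothetical page with one vouched-for link and one nofollowed link.
PAGE = (
    '<p><a href="https://example.com/trusted">trusted</a> '
    '<a href="https://example.com/unvetted" rel="nofollow noopener">unvetted</a></p>'
)

link_parser = LinkRelParser()
link_parser.feed(PAGE)
print(link_parser.links)
```

Nothing here blocks a crawler from reaching either URL. The flag only marks which links the page is not vouching for.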
If you want something truly removed, unable to be seen in search results, you can’t just disallow a crawler. You have to say meta “noindex,” and you have to let them crawl it.
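The catch here is that a noindex only works if the crawler is allowed to fetch the page and see it. A minimal sketch of that check, assuming you already have the robots.txt text and the page's meta robots directives in hand (the function name `noindex_can_take_effect` and the sample rules are hypothetical):

```python
from urllib.robotparser import RobotFileParser


def noindex_can_take_effect(robots_txt, url, meta_directives, user_agent="*"):
    """True only if the page carries noindex AND a crawler may fetch it.

    If robots.txt blocks the URL, a compliant crawler never requests the
    page, so it never gets to read the noindex directive.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    crawlable = parser.can_fetch(user_agent, url)
    has_noindex = "noindex" in {d.strip().lower() for d in meta_directives}
    return crawlable and has_noindex


URL = "https://yoursite.com/old-page.html"

# Hypothetical robots.txt that blocks the page: the noindex is never seen.
BLOCKING = "User-agent: *\nDisallow: /old-page.html\n"
# Hypothetical robots.txt that leaves the page crawlable.
OPEN = "User-agent: *\nDisallow: /tmp/\n"

blocked_result = noindex_can_take_effect(BLOCKING, URL, ["noindex", "follow"])
open_result = noindex_can_take_effect(OPEN, URL, ["noindex", "follow"])
print(blocked_result, open_result)
```

Only the second configuration gives the engines a chance to read the noindex and drop the page.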
So this creates some complications. Robots.txt can be great if we’re trying to save crawl bandwidth, but it isn’t necessarily ideal for preventing a page from being shown in the search results. I would not recommend, by the way, that you do what we think Twitter recently tried to do, where they tried to canonicalize www and non-www by saying “Google, don’t crawl the www version of twitter.com.” What you should be doing is using rel=canonical or a 301.
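Here is a rough sketch of the two options just mentioned, a 301 redirect and a rel=canonical tag, for folding a www host into a non-www host. The host `example.com`, the choice of non-www as the canonical version, and both function names are assumptions for illustration. The sketch ignores ports and is not tied to any particular web server.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical choice: the non-www host is the canonical one.
CANONICAL_HOST = "example.com"


def canonical_redirect(url):
    """Return (301, target) if the www host should redirect, else None."""
    parts = urlsplit(url)
    if parts.hostname != "www." + CANONICAL_HOST:
        return None
    target = urlunsplit(
        (parts.scheme, CANONICAL_HOST, parts.path, parts.query, parts.fragment)
    )
    return 301, target


def canonical_link_tag(url):
    """Build the rel=canonical tag pointing at the canonical host."""
    parts = urlsplit(url)
    href = urlunsplit((parts.scheme, CANONICAL_HOST, parts.path or "/", parts.query, ""))
    return f'<link rel="canonical" href="{href}">'


redirect = canonical_redirect("https://www.example.com/page?x=1")
no_redirect = canonical_redirect("https://example.com/page?x=1")
tag = canonical_link_tag("https://www.example.com/page")
print(redirect, no_redirect, tag)
```

Either way, both versions stay crawlable, which lets the engines see the signal and consolidate the two hosts instead of being told to stay away from one of them.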
Meta robots can allow crawling and link-following while disallowing indexation, which is great. It does require crawl budget, but you can still conserve indexing.
The nofollow tag, generally speaking, is not particularly useful for controlling bots or conserving indexation.