The robots.txt file is the first file bots and crawlers read to determine which pages of a site they may visit. Used well, it helps search engines concentrate on your quality pages, supports search engine optimization, and keeps private areas of your site out of crawlers' reach (note, though, that the file itself is publicly readable, so it is not a way to hide truly sensitive data).
COMMON DIRECTIVES IN THE ROBOTS.TXT FILE:
User-agent: * – the section that follows applies to all robots and crawlers.
Allow: / – all files under the root directory may be crawled.
Disallow: / – no bot or crawler is allowed to crawl or visit any page.
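Putting these directives together, a minimal robots.txt that lets all crawlers visit everything except one directory might look like this (the /private/ path is a hypothetical example, not a Magento path):

```
# Applies to all crawlers
User-agent: *
# Block one hypothetical private area
Disallow: /private/
# Everything else may be crawled
Allow: /
```

Directives are grouped under a User-agent line, and a crawler follows the group that matches it best.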
Did you know you don’t need to create the robots.txt file manually and upload it to the server if you have Magento 2? Instead, you can do all this from the admin panel.
Log in to your Magento 2 admin panel.
In the left navigation bar, click Content > Configuration (under the Design tab), then click the Edit link next to your website.
Now, from the tabs, choose “Search Engine Robots.”
Select the “default” or the specific store where you want to create the robots.txt file.
Default Robots configuration – before choosing a value, let’s explain what each option tells search engines:
INDEX, FOLLOW – You are telling crawlers to index your pages and to follow the links on them.
NOINDEX, FOLLOW – You are telling crawlers not to index your pages but still to follow the links on them; crawlers will keep coming back, so they will pick it up if you change this value in the future.
INDEX, NOFOLLOW – this way, you tell crawlers to index your pages but not to follow the links on them.
NOINDEX, NOFOLLOW – crawlers should neither index your pages nor follow their links. This is a good option if your website is not launched yet and a very BAD choice once your website is live.
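Whichever option you pick, Magento emits it as a robots meta tag in the head of each rendered page. With NOINDEX, NOFOLLOW selected, the page source will contain roughly:

```
<meta name="robots" content="NOINDEX,NOFOLLOW"/>
```

This is a quick way to verify in the browser (View Source) that your setting actually took effect on the storefront.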
After choosing the appropriate option, click “Reset to Default” to load the default robots.txt data into the edit box. You can then add your custom instructions and click SAVE CONFIG.
Here are the recommended settings for the default robots.txt file for Magento 2:
User-agent: *

# Directories
Disallow: /app/
Disallow: /pub/
Disallow: /bin/
Disallow: /dev/
Disallow: /lib/
Disallow: /phpserver/
Disallow: /pkginfo/
Disallow: /report/
Disallow: /setup/
Disallow: /update/
Disallow: /var/
Disallow: /vendor/

# Paths (clean URLs)
Disallow: /index.php/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
Disallow: /catalogsearch/
Disallow: /checkout/
Disallow: /control/
Disallow: /contacts/
Disallow: /customer/
Disallow: /customize/
Disallow: /newsletter/
Disallow: /review/
Disallow: /sendfriend/
Disallow: /wishlist/

# Files
Disallow: /composer.json
Disallow: /composer.lock
Disallow: /CONTRIBUTING.md
Disallow: /CONTRIBUTOR_LICENSE_AGREEMENT.html
Disallow: /COPYING.txt
Disallow: /Gruntfile.js
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /nginx.conf.sample
Disallow: /package.json
Disallow: /php.ini.sample
Disallow: /RELEASE_NOTES.txt

# Do not index pages that are sorted or filtered
Disallow: /*?*product_list_mode=
Disallow: /*?*product_list_order=
Disallow: /*?*product_list_limit=
Disallow: /*?*product_list_dir=

# Do not index session IDs
Disallow: /*?SID=
Disallow: /*?
Disallow: /*.php$

# CVS, SVN directories and dump files
Disallow: /*.CVS
Disallow: /*.zip$
Disallow: /*.svn$
Disallow: /*.idea$
Disallow: /*.sql$
Disallow: /*.tgz$
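You can sanity-check rules like these before deploying them, for example with Python's standard-library robots.txt parser. One caveat: urllib.robotparser only does simple path-prefix matching, so wildcard rules such as /*?SID= cannot be tested this way; the sketch below therefore uses only the plain prefix rules (the example.com URLs are placeholders):

```python
from urllib.robotparser import RobotFileParser

# A prefix-only excerpt of the rules above, parsed from a string
# instead of being fetched from a live site.
rules = """\
User-agent: *
Disallow: /app/
Disallow: /checkout/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# A regular catalog page is allowed...
print(parser.can_fetch("*", "https://example.com/some-category.html"))  # True
# ...while internal and checkout paths are blocked.
print(parser.can_fetch("*", "https://example.com/app/etc/env.php"))     # False
print(parser.can_fetch("*", "https://example.com/checkout/cart/"))      # False
```

For the wildcard and $-anchored patterns, crawl testers that implement Google's matching rules are a better fit than the standard library.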
Let us know if this article was helpful. And if you have any suggestions, tips, etc., don’t hesitate to shoot us an email at [email protected]