Create a robots.txt file automatically in Magento 2

The robots.txt file is the first file that bots and crawlers read to learn which pages and links they may crawl. A well-tuned file steers crawlers toward your quality pages, which helps search engine optimization, and keeps them out of areas that have no place in search results. Keep in mind that robots.txt is publicly readable and is only a request to well-behaved crawlers, so it cannot protect secret or sensitive data on its own.

Some directives and their meanings in a robots.txt file:
User-agent: * - the rules that follow apply to all robots.
Allow: / - every file under the root directory may be crawled.
Disallow: / - no bot or crawler may crawl or visit any page.
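To see how these directives behave, here is a minimal sketch that uses Python's standard urllib.robotparser module. The helper name is_allowed, the sample rule sets and the example.com URLs are assumptions made for illustration. Note that this parser only does simple prefix matching, so the sketch sticks to plain paths without wildcards.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical rule sets, one per directive described above.
ALLOW_ALL = "User-agent: *\nAllow: /\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
BLOCK_CHECKOUT = "User-agent: *\nDisallow: /checkout/\n"

print(is_allowed(ALLOW_ALL, "https://example.com/checkout/cart/"))       # True
print(is_allowed(BLOCK_ALL, "https://example.com/"))                     # False
print(is_allowed(BLOCK_CHECKOUT, "https://example.com/checkout/cart/"))  # False
print(is_allowed(BLOCK_CHECKOUT, "https://example.com/shoes.html"))      # True
```

A path that matches no rule is allowed by default, which is why the last check passes.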

Did you know that with Magento 2 you do not need to create the robots.txt file manually and upload it to the server? You can do all of this from the admin.

Let's configure Magento to generate a robots.txt file for your store.

Log in to your Magento 2 admin. From the left navigation, click Content > Configuration (under the Design heading). Now click the Edit button next to your website.

Now, from the sections on the page, choose “Search Engine Robots”.

Select the “default” scope or the specific store for which you want to create the robots.txt file.

Default configuration for robots – let me explain some basic but important details about these values. INDEX or NOINDEX tells crawlers whether a page may appear in search results, and FOLLOW or NOFOLLOW tells them whether they may follow the links on that page.

INDEX, FOLLOW – you are telling crawlers to index the page and to follow its links. This is the normal choice for a live store.

NOINDEX, FOLLOW – you are telling crawlers not to index the page, but they may still follow its links to discover other pages.

INDEX, NOFOLLOW – you are telling crawlers to index the page but not to follow the links on it.

NOINDEX, NOFOLLOW – crawlers should neither index the page nor follow its links. This is a good option if your site has not launched yet and a very BAD option once it has.
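The four values are really two independent switches, one for indexing and one for link following. The sketch below makes that explicit. The function name parse_robots_directive is hypothetical and is not part of Magento; it only illustrates how such a value can be read.

```python
def parse_robots_directive(value: str) -> dict:
    """Split a value such as "NOINDEX, FOLLOW" into two boolean switches."""
    tokens = {token.strip().upper() for token in value.split(",")}
    return {
        # May the page appear in search results?
        "index": "NOINDEX" not in tokens,
        # May the crawler follow the links found on the page?
        "follow": "NOFOLLOW" not in tokens,
    }


for option in ("INDEX, FOLLOW", "NOINDEX, FOLLOW",
               "INDEX, NOFOLLOW", "NOINDEX, NOFOLLOW"):
    print(option, "->", parse_robots_directive(option))
```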

After you choose the appropriate option, click “Reset to Default” to load the default robots.txt instructions into the text box.

You can now add your custom instructions and click SAVE CONFIG.

Here are the recommended settings for the default robots.txt file in Magento 2:

User-agent: *
# Directories
Disallow: /app/
Disallow: /pub/
Disallow: /bin/
Disallow: /dev/
Disallow: /lib/
Disallow: /phpserver/
Disallow: /pkginfo/
Disallow: /report/
Disallow: /setup/
Disallow: /update/
Disallow: /var/
Disallow: /vendor/
# Paths (clean URLs)
Disallow: /index.php/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
Disallow: /catalogsearch/
Disallow: /checkout/
Disallow: /control/
Disallow: /contacts/
Disallow: /customer/
Disallow: /customize/
Disallow: /newsletter/
Disallow: /review/
Disallow: /sendfriend/
Disallow: /wishlist/
# Files
Disallow: /composer.json
Disallow: /composer.lock
Disallow: /CONTRIBUTING.md
Disallow: /CONTRIBUTOR_LICENSE_AGREEMENT.html
Disallow: /COPYING.txt
Disallow: /Gruntfile.js
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /nginx.conf.sample
Disallow: /package.json
Disallow: /php.ini.sample
Disallow: /RELEASE_NOTES.txt
# Do not index pages that are sorted or filtered.
Disallow: /*?*product_list_mode=
Disallow: /*?*product_list_order=
Disallow: /*?*product_list_limit=
Disallow: /*?*product_list_dir=
# Do not index session ID
Disallow: /*?SID=
Disallow: /*?
Disallow: /*.php$
# CVS, SVN directory and dump files
Disallow: /*.CVS
Disallow: /*.zip$
Disallow: /*.svn$
Disallow: /*.idea$
Disallow: /*.sql$
Disallow: /*.tgz$
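
Several of the rules above rely on the * wildcard and the $ end-of-URL anchor, which major search engines support. If you want to check which URLs a given rule would catch before saving it, a rough sketch like the following can help. The function names and sample paths are assumptions for illustration, and the matching is a simplified approximation of how crawlers interpret these patterns, not an official implementation.

```python
import re


def rule_to_regex(rule: str):
    """Translate a robots.txt path rule into a compiled regular expression."""
    anchored = rule.endswith("$")          # a trailing $ means "URL must end here"
    body = rule[:-1] if anchored else rule
    # Escape each literal piece and let every * match any run of characters.
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + pattern + ("$" if anchored else ""))


def rule_matches(rule: str, path: str) -> bool:
    """Return True if the rule applies to the URL path (query string included)."""
    return rule_to_regex(rule).match(path) is not None


print(rule_matches("/checkout/", "/checkout/cart/"))        # True
print(rule_matches("/*?SID=", "/shoes.html?SID=abc123"))    # True
print(rule_matches("/*.php$", "/index.php"))                # True
print(rule_matches("/*.php$", "/index.php?id=1"))           # False
print(rule_matches("/*?", "/shoes.html"))                   # False
```

Matching is case-sensitive, so it is worth testing rules against paths exactly as they appear on your store.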

Let me know how you liked this article. Also, if you have any suggestions or tips, don't hesitate to send us an email at contact@bizspice.com