Friday, December 25, 2015

SEO: HTTPS Over HTTP

On its Webmaster Central Blog, Google announced its preference for HTTPS over the HTTP protocol. This was bound to happen, since the search company is quite conscious of visitor privacy and security. Simply put, HTTPS encrypts messages before they are sent and decrypts them on arrival, while still carrying them over the underlying HTTP protocol.

HTTPS is a secure protocol that carries a connection encrypted by TLS (Transport Layer Security). The main objective is to protect the visitor's interests: preventing malicious packet injection and ad inserts, guarding against eavesdropping, and preserving privacy. Data exchange therefore takes place in a secured environment for the benefit of the user.

Earlier, this protocol was mainly used for electronic payment systems over the WWW or Web. It was also preferred by corporate portals and e-mail providers that needed a high degree of confidentiality.

For this type of data exchange to work, both the browser software and the certificate authorities must be trustworthy. Some well-known certificate authorities are Symantec, Comodo, GoDaddy and GlobalSign.

The encryption layer itself must also be trustworthy and provide real security. HTTPS can be called reasonably secure, and it does prevent tampering and many malicious attacks, but it is not a hundred percent foolproof.

Keeping these benefits in mind, the search engine company will give preference to HTTPS over HTTP, which is not as secure.

In this case the default port used is 443 and the URL begins with https:// instead of http://.
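
To see this in practice, here is a minimal sketch in Python (standard library only) that opens a certificate-verified TLS connection on port 443; example.com is just a placeholder host.

import http.client
import ssl

# The default context verifies the server certificate against trusted authorities
context = ssl.create_default_context()

conn = http.client.HTTPSConnection("example.com", 443, context=context)
conn.request("GET", "/")
response = conn.getresponse()
print(response.status, response.reason)   # e.g. 200 OK if the handshake and request succeed
conn.close()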

For more information, visit: HTTPS

Read the Google Webmaster Blog for more information on HTTPS.

robots.txt: What It Is

Search engines visit websites, blogs and other online portals to scan and then store, or cache, the information they find. This is then used to rank them according to relevance. On a large portal, the whole website may be indexed according to the search engines' criteria. Most search engines use the indexed web pages collectively to rank the site. This means pages that are not important, or that hold information not meant for the engines, are also indexed.

In order to specify to the robots which pages to crawl and index and which to ignore, a common protocol called robots.txt is used. A plain-text file named robots.txt is uploaded to the server's root directory along with the pages. The robots visit this file first and crawl the pages accordingly. But remember, this protocol may not be adhered to by some robots, especially those with malicious intent. All the popular search companies, however, adhere to this public standard.

The file contains one or more directives; for example, the following two lines tell all robots to stay away from the entire site:

User-agent: *
Disallow: /

More examples are given below (a small script for testing such rules follows the list):

To block the entire site, use a forward slash.

Disallow: /


To block a directory and everything in it, follow the directory name with a forward slash.

Disallow: /junk-directory/

To block a page, list the page.

Disallow: /private_file.html

To remove a specific image from Google Images, add the following:

User-agent: Googlebot-Image
Disallow: /images/dogs.jpg

To remove all images on your site from Google Images:

User-agent: Googlebot-Image
Disallow: /

To block files of a specific file type (for example, .gif), use the following:

User-agent: Googlebot
Disallow: /*.gif$
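
As promised above, here is a small sketch using Python's standard urllib.robotparser module to test how such rules apply before relying on them; example.com and the paths are placeholders, and note that this parser understands the basic directives rather than every search-engine extension.

from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()   # fetches and parses the live robots.txt file

# True means the named user agent may fetch the URL, False means it is disallowed
print(rp.can_fetch("Googlebot-Image", "https://example.com/images/dogs.jpg"))
print(rp.can_fetch("*", "https://example.com/private_file.html"))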

More information can be had here: robots.txt

Also visit robotstxt.org

This information can also be given in a meta robots tag, which has to be present on every page it applies to. Another method is to insert an X-Robots-Tag header; this is placed in the HTTP response header and can be applied server-wide, so it works for all pages, including non-HTML files. Care should be taken that the directives are correct and that no important page is barred. This also applies to CMS-based portals.
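
As a rough illustration, the sketch below uses Python's built-in http.server module to serve a page that carries both a meta robots tag in its HTML and an X-Robots-Tag response header; the port number and handler name are arbitrary choices for the example.

from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"""<html>
<head><meta name="robots" content="noindex, nofollow"></head>
<body>This page asks search engines not to index it.</body>
</html>"""

class NoIndexHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        # Header-level equivalent of the meta robots tag; also works for non-HTML files
        self.send_header("X-Robots-Tag", "noindex")
        self.end_headers()
        self.wfile.write(PAGE)

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), NoIndexHandler).serve_forever()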

You have to regularly check the directives so that no wrong entries creep in over time. There are a number of tools that can help you keep things in line.
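
One simple way to run such a check is a script that verifies your important pages are neither disallowed in robots.txt nor marked noindex in their response headers; the sketch below assumes a placeholder domain and a hand-picked list of paths.

import urllib.request
from urllib.robotparser import RobotFileParser

SITE = "https://example.com"                  # placeholder domain
IMPORTANT = ["/", "/products/", "/blog/"]     # pages that must stay indexable

rp = RobotFileParser()
rp.set_url(SITE + "/robots.txt")
rp.read()

for path in IMPORTANT:
    url = SITE + path
    blocked_by_robots = not rp.can_fetch("Googlebot", url)
    headers = urllib.request.urlopen(url).headers
    blocked_by_header = "noindex" in (headers.get("X-Robots-Tag") or "")
    if blocked_by_robots or blocked_by_header:
        print("WARNING:", url, "is blocked from indexing")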
 

Wednesday, December 23, 2015

More About Accelerated Mobile Pages

As mentioned in my earlier post, these are stripped-down HTML pages that download fast. This development also highlights the limitations of current web design and development technology: the inability to handle heavily loaded pages points towards a design limitation. There are also elements that may be necessary, or fit their role, yet are nevertheless not suitable for display on mobiles or smartphones. AMP should be able to take care of that. Well, time will tell.

AMP pages use what is termed "diet HTML". Author-written JavaScript is not allowed, but an off-the-shelf script library can be used; among other things, this enables images to load only when you scroll to the point where they appear. This may also be incorporated into many platforms in time to come.

There are a lot of restrictions when building AMP pages. If the rules are not followed, the page will fail the Google validation test. The tool for this test is built into Google's Chrome browser.

These pages are designed for fast download and readability, not for interactivity. Thus they can be easily cached, preloaded and pre-rendered. This is amazing, and in time to come Google will cache AMP versions of regular pages and present them to visitors almost instantly.

Further information can be had from:

ampproject.org – The main project web page, where you'll find a technical intro, tutorial, GitHub repository, and more
dis.tl/amp-pages – Further information on AMPs and how they work