I previously wrote a pessimistic post titled “The Myths of SEO”, where I laid out a number of complaints about “SEO Experts”. It was based on past experiences of having to implement knee-jerk SEO demands in reaction to whatever single blog post a decision-maker had just read. Coupled with a recent comment made to me that “responsive design is bad for SEO”, I thought I was about to be placed in another situation where I would need to implement immeasurable changes that may or may not benefit the system, when I would rather focus my attention on other tasks. Well, my own knee-jerk reaction to those comments captured only one side of the SEO situation.
This is the post that puts a positive spin on SEO practices!
I attended a meeting with people who I thought were “SEO experts”, but in actuality they were SEO experts (no sarcastic quotes around the title), and they provided beautiful proof of their claims in the form of videos of Google employees speaking at conferences, along with information about how Googlebot crawls pages and how they are rendered. After the two-hour meeting, I realized that I had some misconceptions about SEO, and that it isn’t all made-up claims.
Let’s start with why SEO is important:

- Your site moves up in the search results.
- If your site moves up, you get more clicks (many users never go past the first page of search results).
- More clicks mean more visitors hitting your site, and the potential for more money goes up.
So, to facilitate getting more money, we need to be conscious of how GoogleBot handles the site.
Google and Yahoo make up 98% of the search engine traffic we receive. I don’t remember the specifics of what was said in the meeting, but it sounds like Yahoo Japan uses Googlebot, so we only need to focus on satisfying Googlebot for all of our SEO needs. Bing doesn’t have a strong presence in Japan, and Yandex and Baidu are more of a nuisance than a benefit.
Our service is an SPA (because reasons), and that comes with a set of complications for crawlers. Googlebot crawls a site’s source looking for <a href> tags (it doesn’t matter if the tags come from static HTML or are injected by JS, but they must be <a href>), and Googlebot renders the site at a later time, when there are free CPU cycles. This means that without pre-rendering, the crawling process becomes a lot longer: the page is hit, rendered later, links are extracted, and the cycle repeats. It can be several days before your page is rendered.
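To make that concrete, here is a small sketch (the URL and markup are made up): Googlebot can discover the first link below, but not the second, because only the first is an actual <a> tag with an href.

```html
<!-- Discoverable: a real <a> tag with an href attribute -->
<a href="/products/123">Product 123</a>

<!-- Not discoverable: there is no href for Googlebot to follow -->
<span onclick="location.href='/products/123'">Product 123</span>
```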
And here’s a knowledge dump from the meeting:
Googlebot is currently rendering pages with Chrome 41, which is a bit old now. It does not support some current features like ES6, or storage APIs like sessionStorage and localStorage (which means Flux and Vuex stores become problematic). There are plans to use a newer version of Chrome as of December 2018, but it is unclear if this will come to fruition or not.
In regards to HTML: semantic HTML is important (placing content in the tags which make sense for that content). Heading tags should descend in order. All img elements should have an “alt” attribute, even if it is empty.
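A minimal sketch of what that looks like (the file names and content here are made up):

```html
<article>
  <h1>Page title</h1>
  <section>
    <!-- h2 follows h1; don't skip heading levels -->
    <h2>Section heading</h2>
    <p>Content goes here.</p>
    <!-- Decorative image: alt is present but empty -->
    <img src="divider.png" alt="">
    <!-- Meaningful image: alt describes the content -->
    <img src="team-photo.jpg" alt="Our team at the 2018 conference">
  </section>
</article>
```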
URLs should be normalized and should avoid ‘#’. It sounds like any URL containing a ‘#’ is completely ignored, which makes sense for URLs that jump to an id on a page, but some SPA routers will stick a ‘#’ in the URL out of the box, meaning a URL may look like “https://mysite.com/#/about”. Surprise! None of your pages are being indexed!
If the page is using pagination, it needs to be made clear to Googlebot by having <link rel="prev"> and <link rel="next"> as the links for navigating it.
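For example, page 2 of a hypothetical paginated listing (the URLs are invented) would declare its neighbors in the head like this:

```html
<head>
  <!-- On https://mysite.com/items?page=2 -->
  <link rel="prev" href="https://mysite.com/items?page=1">
  <link rel="next" href="https://mysite.com/items?page=3">
</head>
```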
Lazy-loaded images don’t get loaded. Any lazy-loaded image will need to have a fallback which is wrapped in <noscript> tags.
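Assuming a typical lazy-loading library that reads the real source from a data-src attribute (the class and file names here are illustrative), the fallback looks something like this:

```html
<!-- Lazy-loaded image for users; Googlebot won't trigger the JS that swaps data-src in -->
<img class="lazyload" data-src="hero.jpg" alt="Hero image">
<!-- Plain fallback for crawlers and users without JS -->
<noscript>
  <img src="hero.jpg" alt="Hero image">
</noscript>
```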
To avoid Googlebot hitting duplicate content and hurting your SEO, it’s important to tell the bot not to crawl pages that are duplicated content. This is done with a canonical tag in the header which points to the source document (<link rel="canonical" href="" />).
Canonical tags are used frequently on non-responsive sites to indicate that the PC and SP (smartphone) versions of a page are the same content.
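For instance, a hypothetical standalone smartphone page (the domains are made up) could point back to the PC version as its canonical source:

```html
<!-- In the <head> of https://sp.mysite.com/about -->
<link rel="canonical" href="https://mysite.com/about" />
```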
Click-to-load patterns (like infinite scroll) typically aren’t triggered. Googlebot doesn’t interact with things like that, or with tabbed content. In cases like these, the content should be preloaded and its visibility toggled with CSS, or the content should simply use different URLs.
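A minimal sketch of the preload-and-toggle approach (the class names are made up; a real implementation would switch the hidden class with JS when a tab is clicked):

```html
<!-- Both tabs are in the DOM from the start, so the content is crawlable -->
<div id="tab-1" class="tab">First tab content, present in the source.</div>
<div id="tab-2" class="tab hidden">Second tab content, also present.</div>
<style>
  .hidden { display: none; }
</style>
```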
Page rendering can be checked directly with Google’s Search Console. It’s prudent to check both the PC and smartphone HTTP responses and rendered versions by using the “Fetch as Google” option in the Search Console.
g.co/MobileFriendly can also be used to show which resources were blocked from Googlebot.
Oh no, what if we have a SPA?
We (sigh, I) have our (second sigh, my) work cut out for ourselves (third sigh, myself) in regards to rendering the SPA. There are a couple options available:
- Switch to server side rendering
- Prerender the entire site (Hybrid rendering)
- Prerender some of the site (Hybrid rendering)
- Create a server specifically for Googlebot with the prerendered site (Dynamic rendering)
I’m going with prerendering all of the pages that appear for users who have not logged in, which is getting tough because our API checks the origin of all calls. So, the prerendering of certain pages will need to spoof that (because we build the site on several different domains). I’d like to go into detail about the pros and cons of an SPA for a large site in another post, but for now I’ll just say “Plan for prerendering”.
I’d like to reiterate that the meeting was great and it was refreshing to meet people actually knowledgeable about SEO practices. When I pressed for information about KPIs on SEO, the meeting facilitators were honest in saying that measuring the benefits of SEO is not possible, but we still need to follow the best practices set by Google in order to get the site listed higher in search results. It was refreshing to get a “We don’t know, but we’ve done what Google said in the past with good results…” instead of a “Do as I say because I’m an expert”.
I was curious about the importance of image filenames, and they said filenames are not important (which I am skeptical about, but they stressed the importance of alt attributes, and Google may use AI to identify the contents of pictures…).
I asked about their information sources, and it sounds like it all comes from Google documentation, Google’s blog(?), and conference presentations from Google employees. Straight from the horse’s mouth. So don’t just be another “SEO expert” who gets information from horrible blogs like this one. Be an SEO expert who confirms all of their information directly from the sources.
2018/10/17 UPDATE! Check out this great guide written by Google! https://static.googleusercontent.com/media/www.google.com/ja//insidesearch/howsearchworks/assets/searchqualityevaluatorguidelines.pdf