I've been reading different articles about what elements of a page Google indexes with an eye toward whether they index content that's added to the page via the JavaScript document.write() method. Not getting a conclusive answer, I decided to do my own test.

Why was I interested? Well, with all the "Web 2.0" technologies that rely on JavaScript (in the form of AJAX) to populate a page with content, it's important to know how Google treats that content and whether it ends up searchable. If it's not searchable, it's not having an impact on search-driven traffic.

The test page had three pairs of nonsense words that, at the time of its creation, generated no hits in a Google search. Two were placed in the page via straight HTML. Two were placed in the page via a JavaScript that was part of the document. Two were placed in the page via a JavaScript on a different server that was sourced from within the page (<script type="text/javascript" language="javascript" src="URL to script on other server">).
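For reference, a test page along those lines might look roughly like this. The word pairs below are invented stand-ins (only "zonkdogfology" is confirmed later in the article), and the remote script URL is hypothetical:

```html
<!-- Pair 1: straight HTML, visible to any crawler that fetches the page -->
<p>zonkdogfology plifferquandle</p>

<!-- Pair 2: written by a script embedded in the document itself -->
<script type="text/javascript">
  document.write('<p>glorpswizzle frindlewomp</p>');
</script>

<!-- Pair 3: written by a script hosted on a different server -->
<script type="text/javascript" language="javascript"
        src="http://other-server.example.com/testwords.js"></script>
```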

The page was linked from a sitewide footer to ensure that Google found it, and was posted and linked on the evening of March 7th. Google alerts were set up for one word from each pair so Google would notify me by e-mail when it spotted a page containing those words.

An alert came in late in the evening of March 10th for "zonkdogfology", one of the words in the first pair (part of the straight HTML). By the time I got online in the early afternoon of March 11th, the page was part of the Google index, and a search for the word turned it up as the sole result.

I then searched for each of the six words at Google.

  • The two HTML words both generated a search result that included the page.
  • The two words inserted by a JavaScript in the page generated no search results.
  • The two words inserted by a remotely sourced JavaScript generated no search results.

Now, it's too early to say conclusively that Google will never index the JavaScript-generated content; their search/indexing algorithms could always change. I'll continue to monitor the situation over the next two weeks to give Google time for any secondary processing and distribution to all their datacenters. It's worth noting, though, that at least in the immediate term, content added to your pages via JavaScript document.write() statements is not searchable in Google.

GOING FORWARD: Over the next two weeks, I'll be watching to see two things. First, does the indexing change so this page shows up in searches for the four JavaScripted words? And second, how long does it take for MSN and Yahoo to pick up the page and how do they treat it?

Stay tuned.

Addendum: People have been asking why you'd want to index dynamic JavaScripted content... Look at the dozens of comments on this article. They're all going to be indexed by Google because they're included server-side. That has some value. Comments don't just enhance the user experience; they add indexable content to your page and can organically increase your keyword density.

If you're using an AJAX-powered comment module, particularly a remotely hosted one like JS-Kit, then it's important to know what you're getting and what you're losing. Yes, you may be adding functionality to your page easily and enhancing the user experience, but if the comments don't get indexed, you lose all that juicy keywordy goodness.

Granted, I didn't do a heavily AJAXed test with DOM nodes and other constructs. I decided to start with the simplest construct: document.write(). I may do other tests in the future, but this was a good place to start. In both instances of the JavaScript-inserted words, the words were included in the scripts as discrete strings. If Google merely indexed the page and made the script text part of the searchable index, the two words from the script hardcoded into the page would become searchable. If it read the remote script and indexed it in the same manner, we might see the last two words turn up either on the test page or with the remote script itself as a hit.
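To make the "discrete strings" point concrete, here's a sketch of what the embedded script amounted to. The document stub below is only there so the snippet runs outside a browser, and the words are invented placeholders, not the actual test words:

```javascript
// Tiny stand-in for the browser's document object so this sketch
// runs anywhere; the real test page used the browser's own document.
const document = {
  html: "",
  write(markup) { this.html += markup; }
};

// The nonsense words sit in the script as discrete string literals.
// A crawler that indexed the raw script text would "see" them without
// executing anything; one that only indexes the served HTML sees them
// only after the script actually runs.
document.write("<p>glorpswizzle frindlewomp</p>");
```

Until the script executes, the words exist only inside the string literal; that's why indexing the script text and indexing the rendered page give different results.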

88 Responses to “Does Google Index Dynamic JavaScripted Content?”
  1. whiteside says:

    Greg is right, normal people don't use document.write. Actually, I believe it's essentially deprecated. Creating DOM nodes dynamically is the best-supported way to fill in AJAX content, followed closely by innerHTML.
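    For readers who haven't used the alternatives whiteside mentions, the two approaches look roughly like this. The fakeElement helper is just a stand-in for real DOM elements so the sketch runs outside a browser; in a page you'd use document.createElement and document.getElementById instead:

```javascript
// Minimal stand-in for a DOM element so this runs outside a browser.
function fakeElement(tag) {
  return {
    tag: tag,
    children: [],
    innerHTML: "",
    appendChild: function (child) { this.children.push(child); return child; }
  };
}

var comments = fakeElement("div"); // imagine <div id="comments">

// Approach 1: create nodes and append them. Unlike document.write,
// this is safe at any point after the target element exists;
// document.write called after the page has loaded wipes the document.
var p = fakeElement("p");
p.innerHTML = "a comment fetched via XMLHttpRequest";
comments.appendChild(p);

// Approach 2: assign markup to innerHTML.
comments.innerHTML += "<p>another comment</p>";
```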

    Finally, all content on your page should be fully readable with javascript disabled (without any extra code if you coded your server-side code and DHTML properly). Obviously functional javascript applications, like GoogleMaps, are exempted.

    I don't have much sympathy for "dynamic" sites that aren't indexed properly because of poor implementation and js hacks.

  2. [...] Probably something most of us figured but Greg Bulmash did some test to answer the question, Does Google index dynamic javascripted content. The test page had three pairs of nonsense words that, at the time of its creation, generated no hits in a Google search. Two were placed in the page via straight HTML. Two were placed in the page via a JavaScript that was part of the document. Two were placed in the page via a JavaScript on a different server that was sourced from within the page. [...]

  3. [...] In the new age of Web 2.0 development, often times you come across sites that are beautiful to look at and have some fairly complicated AJAX interfaces. Tools like Scriptaculous and JQuery almost make it too easy to incorporate this type of functionality into your site. However, today I read a post from Brain Handles where they experimented with dynamic JavaScript content, specifically adding text to the page via document.write commands, to see whether Google parses the content. The results are quite interesting from an SEO standpoint… [...]

  4. Chris Stark says:

    Even if Google does eventually index those words, I think it is still interesting from a search engine optimization standpoint that those words are not indexed as quickly as the others, or even on the same time frame for that matter.

  5. Sean says:

    I seem to remember that the Googlebot does not enable client-side JavaScript. If so, content that is generated via JavaScript upon loading is never seen by the Googlebot at all.

  6. Henry Rose says:

    [...] Here’s an interesting article discussing whether the googlebots are able to index content published to a page using javascript. [...]

  7. chicken says:

    Yea, JavaScript sucks. Developers should really cut back on using it.
    It has become some type of showmanship to produce CMS/blog software
    that depends on CSS and DHTML.

    It's really stupid and really ugly stuff.

    Just take a look at Drupal's website;
    even their own main pages have positioning problems.

    It's cute to have dropdown menus and pages that expand
    to fit a browser's width,
    but seriously, when you have stacks of boxes on the page
    with content in them, they're all jumbled up unless you
    open your browser full screen.

    I mean, why even take time to design that?
    Let's all go back to HTML 1.0 and do away with everything
    that has become a good design standard.

    DHTML and JavaScript should be restricted to situations
    where there is no other way to do things,
    not only because of cross-browser problems
    but just because that's good design.

    Just because you know how to fire up a GUI editor
    and put a hundred DHTML boxes on your page
    doesn't mean you know how to design an attractive site
    that functions well.

    • An interesting argument as to why developers should abandon the use of JavaScript. However, if you look at the research, you'd know that the majority of site users don't like to (and in most cases won't) scroll through the contents of a page, but they will click.

      JavaScript sites allow developers to provide a rich and entertaining experience; it's not just about making the site look attractive. But as with most non-xhtml elements, this shouldn't be overdone. Developers should also give their users the choice of whether to take part in these features or just view the xhtml content. As for the cross-browser problems, this again comes down to bad developers not taking the time to implement features that either function correctly across browsers or offer alternatives for when they don't.

      I must also point out that I strongly disagree with chicken's proposal to remove all non-xhtml elements, and must ask whether he/she ever engages in online shopping, takes part in social networking or social bookmarking, watches video on YouTube, or any other form of web experience that would not be possible if we used just xhtml 1.0. For one thing, chicken would not have been able to leave his/her comment on this page.

      Non-xhtml elements provide the user with a truly interactive experience; as such, it's about time search engines stopped being so restrictive and looked for ways to index such activities, especially when they use these processes in their own products (e.g. FeedBurner). A return to just xhtml would be a return to Web 1.0, and that's not something I'd wish for.

  8. ME says:

    You will never see js output in Google's cache, because Google will never run the javascript on your pages. Here's why:

    indexing speed/cost.... google can't really afford to run everyone's slow ass javascript...it would slow down indexing a LOT. Javascript is slow as hell to begin with due to the fact that it's interpreted...add the fact that most js code out there is crap and can't be trusted to not go into infinite loops, etc... and there's no way google can afford to execute your js.

    security .... google doesn't want to run random javascript code on their servers...who knows what that code might try to do. Sure you can sandbox the javascript engine, but unknown exploits could make google very vulnerable.

  9. Google's indexing is based on the text it actually touches. When it indexes, it crawls the available content, which happens to be the HTML text. As for content written by JavaScript via document.write, Google ignores it, presumably because it may not be a core part of the crawled page, and even if it is, it may be something else entirely, perhaps another link, that isn't relevant to the search keyword.

  10. George Lee's blog: A way to profit from Google and Slashdot

    Just saw a Slashdot story: Googlebot and Document.write. Someone was curious whether Google would actually index javascript document...

  11. John Quays says:

    As a proud owner of a zonkdog, I'm glad to see that I'm not alone in my love of the study of zonkdogs.

    Perhaps we could meet up and discuss the finer points of zonkdogfology sometime.

    I have some interesting slides.

  12. [...] An Omega-level geek has done the test. Here's the link. [...]

  13. You should do a similar test, but more about content on submit: when someone clicks a link and the content is put into the page, as this is how most 'Web 2.0' sites work. See if Google reads that content. Also, does the Google search crawler/spider/bot have JavaScript parsing ability?

  14. [...] Brain Handles’ Greg Bulmash experiments with this long-asked question. [...]

  15. raz says:

    This is why fallback on AJAX matters: if the content is generated server-side anyway, why not link to it the traditional way and only use JavaScript to fetch the content when JavaScript is available?

    Going back to the article, Google will index content even on AJAX-based sites if they provide a fallback for it.

    Fallback matters; not every device/browser supports JavaScript.
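    The fallback pattern raz describes can be sketched like this; the URL, id, and loadCommentsViaAjax helper are all hypothetical, and the point is only the shape of the markup: a real href for crawlers and script-less browsers, with JavaScript taking over only when it's present:

```html
<!-- Crawlers and no-JS browsers follow the href and get the content;
     JS-capable browsers fetch it in place instead. -->
<a href="/comments.html" id="comments-link">Show comments</a>
<script type="text/javascript">
  document.getElementById("comments-link").onclick = function () {
    loadCommentsViaAjax(this.href); // hypothetical XMLHttpRequest helper
    return false; // cancel the normal navigation
  };
</script>
```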

  16. aaron says:

    We use a really complex templating engine and actually have two versions of the website, one for the user (which is laced with web2.0 goodness) and another for spiders which, in lieu of ajax just has the content rendered out for indexing...

    works a treat!!!

  17. wallis says:

    just for the record,

    I wouldn't ever expect the likes of Google to index JavaScript content. It's up to developers to ensure their applications are as compatible as possible and are indexed according to their standards. Expecting a company like Google to "get with the times" is quite an ask, really!

  18. [...] No, there is a reason for this post. I stumbled across an article on Slashdot that asks "Does Google Index Dynamic javascript content?", posted by Greg from brainhandles.com. In this post, the author made a test page with a few different nonsense words on it which generated no hits on Google. Unfortunately I feel his test will now be somewhat invalidated, mainly due to buggers such as myself putting his nonsense words into Google. [...]

  19. [...] Back in March, I did an experiment on whether Google indexes content inserted into your page with JavaScript. Weeks later, the results are conclusive… no. The words inserted into the test page via JavaScript were never indexed. Lots of pages talking about the odd words I used come up in searches for them, but the page I created doesn’t come up in the Google results. [...]

  20. Google will probably not change how they treat JavaScripted content because they "have to" or because good content might be hidden there.

    For years, they were not indexing frames - as if it was just technologically impossible. Javascripted content is dangerous for them to run, but also may contain a large amount of SEO spam.

  22. barneyrubble says:

    There are absolutely instances where Google does indeed index javascript content. This is most obvious in backlinks. If a site has a URL string to your site in a script tag, Google WILL index it, and can actually count it as a link to your site.

    I won't promote any sites here, but this can be easily verified with a few "link:" lookups using a few SEO websites as your target.

  23. [...] I have found an article with an experiment on javascript generated content (AJAX) indexing. As I already knew the Google does not index such content. [...]

  24. That's really bad news.

    As for dynamic content which needs indexing, a simple example is reducing source size with table template exporters.

    For instance, I have a JS script which is given a matrix of values and calculates "row/colspanness" automatically and conditionally outputs styles and the table via doc.write.

    That saves some bandwidth, and it's WAY better than specifying layout and structure directly in code.

  25. marvin says:

    It's OK if search engines can't execute script to display the link from document.write(), but I was wondering: do they pick up links found inside the JS source code?

    Like ($link="http://www.domain.com")

  26. I think it's fair to say that DHTML web applications shouldn't be indexed like documents. They have a potentially infinite number of states, and the information in those states isn't really aligned with the semantics of searching the web, which is geared toward finding documents.

  27. [...] a discussion on the issue of Java and Google, along with a [...]

  28. Dan says:

    Both of the words that have been generated by Javascript on the local server *are* being found by a Google search now. I just read this article now for the first time so thought I'd point it out. I wonder if the page just needs to be marked as 'good' before it's allowed or something similar?

  29. Greg Bulmash says:

    @Dan,

    I went and checked, and they are showing up now. Furthermore, for the one I checked, the pages linking to my test page did not use that word to link to it.

    I'll need to do a second test to see if this is perhaps a change in Googlebot's default behavior or if you're correct and it's that Googlebot can/does do it, but the page has to pass some tests which may take a while.

  30. Kalata says:

    I try to use JavaScript with a Flash movie that pulls content from plain HTML files. Since the Flash resides on index.html, I need a way to change the non-Flash content so that Google sees and indexes it. I use a technique called SWFAddress (http://asual.com/swfaddress/), which provides deep linking for Flash using anchors, and I was trying to make JavaScript look at the URL and anchor, then load content depending on the anchor link (i.e., if the user, in this case Google, doesn't have Flash installed). I guess the only way to make this happen while preserving Google-friendliness would be to use a server-side language like PHP. Any ideas?

  31. joe says:

    Thanks for the article. I've been using JS-Kit comments on a limited basis, and the issue of how this works with the Google spider just occurred to me. I want the comments to be spidered, because adding fresh, additional content to a page is helpful. In the case of JS-Kit (added by remotely hosted JavaScript), this does not appear to happen.

  32. Ken Ray says:

    I know that they are scanning JS to some degree. I'm not sure about other browsers, but in Chrome, if a website has malicious JS, that website will be labeled as malicious in the Google index. Google even breaks down what is considered malicious.

    Now, I'm not sure if the indexer is actually scanning questionable JS on the fly or if that JS file has been reported to the index. With all the JavaScript engine upgrades in Chrome, I would think Google would index dynamic sites more thoroughly.

  33. Doug says:

    Thanks, Greg. Great article, and it's still relevant 3 years later (2010)!

    Now I'm wondering if, after 3 years, Google is finally indexing document.write. Probably not.

    I bet the trend shifted to performing "document.write" into CSS div positions.

  34. zonkdogfology has 1450 results now and zonkdogfology.com is a registered domain name - way to go!

  35. [...] http://www.brainhandles.com/2007/03/11/does-google-index-dynamic-javascripted-content/ This entry was posted in Uncategorized. Bookmark the permalink. ← Collateral Damage of [...]
