This is the first part in a series, following my thoughts on using GWT in SEO'able web applications. The other parts in the series are part 2 and part 3.
GWT is a superb framework for developing complex, componentized html & javascript widgets. You can have your cake and eat it:
- Drop down into native javascript as and when you feel the need
- Integrate easily with native js components and libraries
- Use GWT components relatively easily from native javascript
- Create super-condensed, fast, platform-specific code, easily
- All the benefits of Java's static type-system, packages, and tooling to manage and refactor your code
It isn't so great when:
- You need to expose the content of your site for search-engines to index (The SEO Problem).
- You want to leverage the html and css skills of your UI designers, and to be able to generate more flexible layouts without requiring a re-compile (The Design Problem).
The SEO Problem
To a search engine, a GWT app looks like one big, dense lump of javascript. Nothing to see here, move along. It's a similar problem for any web-app that uses ajax to collect data from the server, but GWT magnifies it because the entire application tends to be delivered as compiled javascript, whereas many other ajax technologies involve some server-side content rendering that makes the site at least partially visible to crawlers.
Google have a recommendation for getting around the SEO problems of ajax applications, which entails a special url form and the creation of "html snapshots" - effectively a parallel, ajax-disabled site that the crawler can index. This seems to me a workable but irritating solution that involves a lot of extra work just to let a search engine crawl the site. It's effectively a Google-approved cloaking method. It also isn't clear to me whether any search engines other than Google support this approach.
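To make the "special url form" a bit more concrete, here is a rough sketch of the mapping as I understand it from Google's description: a "pretty" url carrying its ajax state after `#!` is requested by the crawler with that state moved into an `_escaped_fragment_` query parameter, and the server answers that request with the html snapshot. The class name, method names and example urls are my own inventions for illustration, and the escaping rule is a simplified reading of the scheme rather than a reference implementation.

```java
import java.nio.charset.StandardCharsets;

/** Sketch of the "#!" to "_escaped_fragment_" url mapping; illustrative only. */
final class CrawlableUrl {

    /** Maps a "pretty" #! url to the url a crawler would request; other urls pass through. */
    static String toCrawlerUrl(String prettyUrl) {
        int hashBang = prettyUrl.indexOf("#!");
        if (hashBang < 0) {
            return prettyUrl; // no ajax state in the url, nothing to map
        }
        String base = prettyUrl.substring(0, hashBang);
        String state = prettyUrl.substring(hashBang + 2);
        // Append to an existing query string if there is one.
        String separator = base.contains("?") ? "&" : "?";
        return base + separator + "_escaped_fragment_=" + escape(state);
    }

    /** Percent-escapes bytes that would be ambiguous inside a query string. */
    static String escape(String fragment) {
        StringBuilder out = new StringBuilder();
        for (byte b : fragment.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean special = c <= 0x20 || c == '#' || c == '%'
                    || c == '&' || c == '+' || c >= 0x7F;
            if (special) {
                out.append(String.format("%%%02X", c));
            } else {
                out.append((char) c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints http://example.com/app?_escaped_fragment_=news/page=2%26sort=date
        System.out.println(toCrawlerUrl("http://example.com/app#!news/page=2&sort=date"));
    }
}
```

The point of the sketch is how much machinery sits behind that one mapping: every url it produces needs a server-side handler that can render the matching snapshot.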
The Design Problem
Don't get me wrong, it's not as bad as all that. You can, of course, leverage your UI/UX designer's talents when building GWT apps. They can produce designs that the GWT developers base their components on, and with UIBinder the html fragments produced by a designer can be used in large chunks. But there is always some disconnect between what the designers produce and what the application actually outputs - usually because a developer is translating the designer's work into GWT components.
In retrospect, having built a number of "monolithic" GWT applications, it seems to me that what we're missing is a way to step back just a little from the "GWT does everything" mind-set: to use GWT where it is best suited, and something a little more flexible where GWT can be too restrictive. For example, when laying out high-level components on a page, it would be an advantage to escape the restriction of compiling that page layout into js, and instead work at the level of straightforward declarative markup.
Introducing "GWT-Activated Pages"
How can we solve these two issues? One idea I've been toying with is to use GWT for progressive enhancement of simple html + css. It goes as follows:
Rather than try to build two almost parallel versions of your application (one for SEO, one for real users), why not build one with a layered approach that allows graceful degradation for browsers with javascript disabled (of which search-engines could be considered a sub-set)?
The base-layer that non-javascript browsers would render, and which search-engines would see, would be generated by some typical server-side technology - php, jsp, struts, jfaces, ... take your pick. This would build a "wireframe" of your page, giving it a basic shape and layout, and filling in some starter content. The markup would ideally be meaningful, in the sense that headings would appear in <h1> tags to indicate that they are headings, rather than to give them any particular styling.
This base layer would be something that designers could work on directly, including any and all css styling.
The second layer would be a set of GWT widgets that "activate" or progressively enhance the page, by scanning the DOM for certain signs that denote activatable sections of markup. When the base page loads, GWT widgets search for elements to bind themselves to. When a widget finds such an element it binds to it and "activates" it. Activation could mean anything from completely changing the html markup, to binding event-listeners, to handling interaction with ajax data loading from the server.
Here's a simple example "base" layer:
<html>
  <body>
    <h1>Page Header</h1>
    <ol class="gwt-navigation-widget">
      <!-- the listitems are generated server side -->
      <li><a href="..">Home</a></li>
      <li><a href="..">News</a></li>
      <li><a href="..">Videos</a></li>
      <li><a href="..">Photos</a></li>
      <li><a href="..">About</a></li>
    </ol>
    <ol class="gwt-news-ticker-widget">
      <!-- the listitems are generated server side -->
      <li>News story 1</li>
      <li>News story 2</li>
      ...
      <!-- this last listitem gives a link that a search engine can follow to get more data -->
      <li><a href="..">older stories</a></li>
    </ol>
    <script type="text/javascript" language="javascript" src="widgets/navigation-widget.js"></script>
    <script type="text/javascript" language="javascript" src="widgets/news-ticker-widget.js"></script>
  </body>
</html>
Notice the elements with css class-names prefixed with "gwt-". These are the signs that our gwt widgets will be looking for in order to know which elements they should activate.
As you probably guessed, the navigation widget would detect any elements with a class-name matching "gwt-navigation-widget", while the news-ticker would search for "gwt-news-ticker-widget".
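The scan itself could look something like the sketch below. To keep it runnable outside a browser I've used the JDK's own `org.w3c.dom` classes as a stand-in for the browser DOM, so this is not GWT's DOM API; inside a GWT module the same loop would run from the module's entry point against the live page. `ActivationScan`, `register`, `activateAll` and the `data-activated` attribute are all hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical activation scan; the JDK DOM stands in for the browser DOM. */
final class ActivationScan {

    // Marker class-name -> code that "activates" a matching element.
    private final Map<String, Consumer<Element>> activators = new LinkedHashMap<>();

    void register(String markerClass, Consumer<Element> activator) {
        activators.put(markerClass, activator);
    }

    /** Visits every element once and returns the markers that were activated, in page order. */
    List<String> activateAll(Document page) {
        // Snapshot first: the NodeList is live, and activators may change the tree.
        NodeList live = page.getElementsByTagName("*");
        List<Element> snapshot = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++) {
            snapshot.add((Element) live.item(i));
        }
        List<String> activated = new ArrayList<>();
        for (Element element : snapshot) {
            List<String> classes =
                    Arrays.asList(element.getAttribute("class").trim().split("\\s+"));
            for (Map.Entry<String, Consumer<Element>> entry : activators.entrySet()) {
                if (classes.contains(entry.getKey())) {
                    entry.getValue().accept(element);
                    activated.add(entry.getKey());
                }
            }
        }
        return activated;
    }

    /** Parses well-formed markup; a real page would already be in the browser's DOM. */
    static Document parse(String markup) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
        } catch (Exception e) {
            throw new IllegalArgumentException("markup must be well-formed", e);
        }
    }

    public static void main(String[] args) {
        Document page = parse("<html><body><h1>Page Header</h1>"
                + "<ol class=\"gwt-navigation-widget\"><li>Home</li></ol>"
                + "<ol class=\"gwt-news-ticker-widget\"><li>News story 1</li></ol>"
                + "</body></html>");
        ActivationScan scan = new ActivationScan();
        scan.register("gwt-navigation-widget", e -> e.setAttribute("data-activated", "navigation"));
        scan.register("gwt-news-ticker-widget", e -> e.setAttribute("data-activated", "news-ticker"));
        // prints [gwt-navigation-widget, gwt-news-ticker-widget]
        System.out.println(scan.activateAll(page));
    }
}
```

Elements without a marker class are simply left alone, which is what keeps the base layer usable when no script runs at all.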
OK, so what do we get for our troubles? Well, several things potentially:
- One request to the server to get our initial page full of data (rather than multiple widgets requesting async loading of little chunks of data)
- A page that contains the data and is search-engine friendly, allowing pages deep within your app to be indexed by search-engines
- A very clear separation of widgets and page layout, allowing you more flexibility to change the page layout without a GWT re-compile
- Flexibility in dividing work between designers and developers:
  - designers can focus on the design-heavy html and css work, and the overall page layout
  - developers can focus on interaction with the server, complex widget behaviour, etc.
When a widget activates an element, it would typically do two things:
- Examine the content of the element - this will very likely be the source of its initial configuration and/or data-set, and also might include some information about how to load more content, as in the news-ticker example whose last <li> is a link to "older stories". I'm sure it would be a good idea to make this even more explicit, but like I said this is supposed to be a simple example :)
- Replace or modify the content of the element - perhaps the widget displays a very complicated UI, so it removes the html and replaces it with something nifty that it generates, or maybe it just adds some decoration in the form of small visible changes, or perhaps it binds a bunch of event handlers to do neat tricks like adding gesture handling for touch-screen users.
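Here is a small sketch of examining and then modifying an element, using the news-ticker as the example. Again the JDK's `org.w3c.dom` classes stand in for the browser DOM, and everything named here (`NewsTickerActivation`, `TickerConfig`, the `data-` attributes, the `/news/older` url) is made up for illustration. The rule "a list item containing a link points at more stories" is my assumption about how the base markup would be read.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical news-ticker activation: read the server-rendered list, then enhance it. */
final class NewsTickerActivation {

    /** What the widget learned from the base markup. */
    static final class TickerConfig {
        final List<String> stories = new ArrayList<>();
        String olderStoriesUrl; // stays null when the markup offers no "more data" link
    }

    /** Step 1: examine the element's content for initial data and the follow-up link. */
    static TickerConfig examine(Element ticker) {
        TickerConfig config = new TickerConfig();
        NodeList items = ticker.getElementsByTagName("li");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            NodeList links = item.getElementsByTagName("a");
            if (links.getLength() > 0) {
                // Assumption: an item that contains a link points at more stories.
                config.olderStoriesUrl = ((Element) links.item(0)).getAttribute("href");
            } else {
                config.stories.add(item.getTextContent().trim());
            }
        }
        return config;
    }

    /** Step 2: modify the element; this sketch only decorates it rather than replacing it. */
    static void enhance(Element ticker, TickerConfig config) {
        ticker.setAttribute("data-story-count", String.valueOf(config.stories.size()));
        if (config.olderStoriesUrl != null) {
            ticker.setAttribute("data-more-url", config.olderStoriesUrl);
        }
    }

    /** Parses well-formed markup and returns its root element. */
    static Element parseRoot(String markup) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("markup must be well-formed", e);
        }
    }

    public static void main(String[] args) {
        Element ticker = parseRoot("<ol class=\"gwt-news-ticker-widget\">"
                + "<li>News story 1</li><li>News story 2</li>"
                + "<li><a href=\"/news/older\">older stories</a></li></ol>");
        TickerConfig config = examine(ticker);
        enhance(ticker, config);
        // prints [News story 1, News story 2] more: /news/older
        System.out.println(config.stories + " more: " + config.olderStoriesUrl);
    }
}
```

A real widget would go on to fetch `olderStoriesUrl` via ajax when the user scrolls or clicks, but the important part is that its starting state came entirely from markup a crawler can also read.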
In my simple example I showed the scripts being loaded separately, just for clarity, but I'm sure you wouldn't want to load each widget as a separate script - that would lose a good chunk of GWT's advantage. Rather, the whole widget-set could be loaded as one script, cached forever, and used all over.
Now, if you want to see an example of GWT-activated pages at work, just take a look at my older post on rendering 3D Rubik's Cubes with GWT and HTML5 Canvas, where the Rubik's cubes are rendered by a gwt widget that "activates" a <div> element containing the configuration for the cube.
OK, but what are the down-sides? Here are a few:
- The compiler no longer has visibility across the whole UI.
- Messaging between components becomes more difficult (but not impossible). This has its advantages too - it forces low coupling. Messaging via OpenAjax Hub or similar would be worth considering.
- It's more work than a straight-out GWT UI, and many would ask why bother with GWT at all if you need SEO (in my view that depends on your skill-set and the complexity of the components you're building).
- I'm sure that there are others which I'm currently blind to ... I need to try to build some more complex and interesting examples to find these out.
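On the messaging point, the kind of thing I have in mind is a tiny topic-based hub that widgets publish to and subscribe on, so that no widget needs a reference to any other. The sketch below is a hypothetical, in-memory version to show the shape of the idea; it is not the OpenAjax Hub API, and the class name, topic name and string payloads are assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Minimal topic-based hub; hypothetical, and not the OpenAjax Hub API. */
final class WidgetHub {

    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    void subscribe(String topic, Consumer<String> listener) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(listener);
    }

    /** Delivers the payload to every listener on the topic; returns how many received it. */
    int publish(String topic, String payload) {
        // Copy so a listener may subscribe during delivery without breaking iteration.
        List<Consumer<String>> listeners =
                new ArrayList<>(subscribers.getOrDefault(topic, List.of()));
        for (Consumer<String> listener : listeners) {
            listener.accept(payload);
        }
        return listeners.size();
    }

    public static void main(String[] args) {
        WidgetHub hub = new WidgetHub();
        // The news ticker reacts to navigation without knowing the navigation widget exists.
        hub.subscribe("navigation.selected",
                section -> System.out.println("ticker: load stories for " + section));
        // prints ticker: load stories for News
        hub.publish("navigation.selected", "News");
    }
}
```

Widgets that are compiled and loaded separately would need the hub to live somewhere they can all reach, which is exactly where a shared javascript-level hub becomes attractive.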
tl;dr ?