Stuff I'll forget if I don't write it down: SEO

My Blog has moved to Github Pages

Tuesday, 15 February 2011

Progressive Enhancement with GWT, part 3

Read about my GWT.Progressive library on my new blog

This is the third part in a series, following my thoughts on using GWT in SEO'able web applications. The other parts in the series are part 1 and part 2.

In my previous posts I described an idea for progressive enhancement using GWT - "activating" server-generated html, to combine GWT goodness with an SEO friendly server-generated website, and my findings after some initial trials.

One of the problems I described in that second post was that it would be very difficult to work with these widgets if nested widgets could not be automatically (or at least easily) bound to fields of the enclosing widget.

After a little playing around and learning about GWT Generators I now have what seems like a nice solution, using a Generator to do almost all of the donkey work. Think of it like UiBinder, but with the templates provided at runtime (courtesy of the server). Here's an example class that automatically binds sub-widgets - an Image in this case - to a field of that class:

public class MyWidget extends Widget {
   
    interface MyActivator extends ElementActivator<MyWidget> {}
    private static MyActivator activator = GWT.create(MyActivator.class);
   
    @RuntimeUiField(tag="img", cssClass="small") Image small;
   
    public MyWidget(Element anElement) {
        // this will set our element and bind our image field.
        setElement(activator.activate(this, anElement));
   
        // now we can play with our fields.
        small.addClickHandler(new ClickHandler() {
            public void onClick(ClickEvent aEvent) {
                Window.alert("clicked!");
            }
        });
    }   
}

This class will bind onto any html that has an image tag with class "small" somewhere in its inner-html, for example:

<div> <!-- Say our MyWidget is bound here -->
  <div>
    <span>
      <img class="small" src="/images/image.jpg"> <!-- will be bound to our Image widget -->
    </span>
  </div>
</div>

Anyone familiar with UiBinder will recognize the pattern I've used for the "activator":

Extend an interface with no methods

interface MyActivator extends ElementActivator<MyWidget> {}

GWT.create() an instance of that interface

private static MyActivator activator = GWT.create(MyActivator.class);

Then use it to initialize your widget

setElement(activator.activate(this, anElement));

The nice thing about this is we can automatically bind as many widgets as we like onto various sites within the inner-html of our current widget's element. It doesn't mess with the structure (unless you explicitly do so after the binding is done for you), and you can have as much other html within the elements as you like - it will just be left alone, which gives your designers the flexibility to change the layout quite a lot without necessarily needing to re-compile your GWT code.

Currently I have my generator set up to allow your widgets to bind to a choice of tag-name or css-class or both, for example:

// bind to the first <div> found by breadth-first search of child elements
@RuntimeUiField(tag="div") Label label;

// bind to first element with class="my-widget" found by breadth-first search
@RuntimeUiField(cssClass="my-widget") Label label;

// bind to first <div> with class="my-widget" found by breadth-first search
@RuntimeUiField(tag="div", cssClass="my-widget") Label label;
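To make the matching rule concrete, here is a minimal plain-Java sketch of a breadth-first search for the first descendant that matches a tag name, a css class, or both. It is not the generated GWT code: DomNode is a hypothetical stand-in for a DOM element, and FieldMatcher.findFirst is an invented name, with null meaning "don't care" for either criterion.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for a DOM element (not GWT's Element class).
final class DomNode {
    final String tag;
    final List<String> cssClasses;
    final List<DomNode> children = new ArrayList<>();

    DomNode(String tag, String... cssClasses) {
        this.tag = tag;
        this.cssClasses = Arrays.asList(cssClasses);
    }

    // Appends a child and returns this node, so trees can be built inline.
    DomNode add(DomNode child) {
        children.add(child);
        return this;
    }
}

// Hypothetical helper illustrating the lookup rule described in the post.
final class FieldMatcher {
    // Returns the first descendant of root (root itself is not considered)
    // that matches, visiting the tree level by level; null if none matches.
    // A null tag or a null cssClass means that criterion is ignored.
    static DomNode findFirst(DomNode root, String tag, String cssClass) {
        Queue<DomNode> queue = new ArrayDeque<>(root.children);
        while (!queue.isEmpty()) {
            DomNode node = queue.remove();
            boolean tagMatches = tag == null || tag.equalsIgnoreCase(node.tag);
            boolean classMatches = cssClass == null || node.cssClasses.contains(cssClass);
            if (tagMatches && classMatches) {
                return node;
            }
            // Not a match: its children are examined after everything already queued.
            queue.addAll(node.children);
        }
        return null;
    }
}
```

Because the search goes level by level, a shallow match wins over a deeper one that appears earlier in the markup, which is worth knowing when a designer rearranges the html.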

Notice in my examples so far I'm binding standard GWT widgets onto the nested elements. This works for the elements I've used in these examples because they all have a

public static Type wrap(Element anElement)

method which allows those widgets to be bound onto elements that are already attached to the DOM.

It is also possible to bind widgets of your own making in one of two ways:

Create a wrap method like

public static MyWidget wrap(Element anElement)

Create a single-argument public Constructor that accepts an Element as its argument.
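The two options can be illustrated with a small plain-Java sketch. GWT client code has no runtime reflection, so a Generator would have to make this choice at compile time; reflection is used here only to make the selection rule concrete. FakeElement, the two widget classes and ReflectiveBinder are all hypothetical names, and preferring wrap over the constructor is an assumption rather than something the library is known to do.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical stand-in for a DOM element (not GWT's Element class).
final class FakeElement {
    final String tag;

    FakeElement(String tag) {
        this.tag = tag;
    }
}

// Option 1: a public static wrap(Element) factory method.
final class WrapStyleWidget {
    final FakeElement element;

    private WrapStyleWidget(FakeElement element) {
        this.element = element;
    }

    public static WrapStyleWidget wrap(FakeElement element) {
        return new WrapStyleWidget(element);
    }
}

// Option 2: a public single-argument constructor that accepts the element.
final class ConstructorStyleWidget {
    final FakeElement element;

    public ConstructorStyleWidget(FakeElement element) {
        this.element = element;
    }
}

final class ReflectiveBinder {
    // Creates a widget of the given type for the element, trying the static
    // wrap method first (assumed order) and then the constructor.
    static <T> T bind(Class<T> type, FakeElement element) {
        try {
            Method wrap = findStaticWrap(type);
            if (wrap != null) {
                return type.cast(wrap.invoke(null, element));
            }
            return type.getConstructor(FakeElement.class).newInstance(element);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                    type.getName() + " cannot be bound to an element", e);
        }
    }

    // Returns a public static wrap(FakeElement) method that yields the widget
    // type, or null if the type does not declare a usable one.
    private static Method findStaticWrap(Class<?> type) {
        try {
            Method m = type.getMethod("wrap", FakeElement.class);
            boolean usable = Modifier.isStatic(m.getModifiers())
                    && type.isAssignableFrom(m.getReturnType());
            return usable ? m : null;
        } catch (NoSuchMethodException e) {
            return null;
        }
    }
}
```

A type that offers neither option is rejected with an IllegalArgumentException, which mirrors the idea that only types meeting one of the two criteria can be bound.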

Activate-able widgets can be nested within other such widgets - with no limits that I am aware of so far - and it is also possible to assign nested widgets to a List field in the enclosing widget, like this:

@RuntimeUiField(tag="img") List<Image> images;

This will search recursively for any <img> tags inside the enclosing widget's element and bind them all to Image widgets that will be added to the List. The current limitations here are that the List must be declared either as List or ArrayList, and parameterized with a concrete type that meets the criteria defined above (i.e. has a static wrap(Element) method, or a single-arg constructor that takes an Element as the argument).

A remaining question is how to bind the outer-most Widget. Currently I'm doing that using the DOM scanning code I wrote during earlier experiments and which I'm also using in the automatic scanning process set up by the Generator. For example to find the outer-most widgets and kick off the binding process I have something like this in my EntryPoint:

public void onModuleLoad() {
    List<MyWidget> _myWidgets = new ArrayList<MyWidget>();
    for (Element _e : Elements.getByCssClass("outer-most-widget")) {
        // keep a reference to each activated widget so we can use it below
        _myWidgets.add(new MyWidget(_e));
    }
    // do stuff with our widgets ...
}

I think of this as very similar to the RootPanel situation - "normal" GWT apps kick off by getting a RootPanel (the body tag) or RootPanels (looked up by id), to which everything else is added. It would be nice to hide away some of that scanning code inside a "top-level" widget - much like RootPanel does for the normal case. I can imagine this might look something like:

public void onModuleLoad() {
    Page _page = Page.activate();
    _page.doStuffWithWidgets();
    // ...
}

I still have lots of things to figure out and questions to answer, for example:

What's the performance like when binding many hundreds of widgets?
How will this really work when I make ajax requests for more data? (should I make ajax requests for html snippets which I add to the DOM and then bind onto, or switch to json for ajax requests and make my widgets able to replicate themselves from an initial html template?)
What's the best way to divide labour between developers and designers, and for them to organize their interaction? (Ideally I'd like there to be something of a cycle between them, where the designer can rough-out a page design, agree the componentisation with the developer, the developer knocks out some components and a build which the designer can use to activate their static designs, add fidelity, work on other pages with the same components, etc).
Where is the sweet-spot between creating high-fidelity html server-side and decorating it client-side using GWT? Should the GWT components really be just for adding dynamism, or is it a good idea to use them to build additional html sweetness? - I mean the server could dish out html that is more of a model than a view (just enough "view" to satisfy SEO), and the GWT layer acts as a client-side controller and view (SOFEA/TSA with a nod to SEO).

I'll try to keep posting as I work things out.

This probably belongs in a separate post, but with reference to that last point on TSA (Thin Server Architecture) - the working group list the following points to define the concept:

1. Do not use server-side templating to create the web page.
2. Use a classical and simple client-server model, where the client runs in the browser.
3. Separate concerns using protocol between client and server and get a much more efficient and less costly development model.

I'm right behind them on (2) and (3), and also on (1) for "enterprise" apps where SEO is a non-goal. However, for an app that needs SEO, (1) is a deal-breaker, so I'd offer this alternative 1st rule instead:

Use server-side templating to produce a model for the client to consume which minimally satisfies the needs of SEO.

Sunday, 13 February 2011

Progressive Enhancement with GWT, part 2

This is the second part in a series, following my thoughts on using GWT in SEO'able web applications. The other parts in the series are part 1 and part 3.

Since my earlier post, I spent a little time (only a few hours really, so far) trying a few things out. Here's a smattering of things I learned...

Scanning for elements and binding widgets onto them is easy. Making those widgets behave just like widgets in any normal GWT app needs a little more work.

Who's the daddy?

One big problem to get around is that normally GWT widgets are attached via a hierarchy of other widgets (parents) leading back to the RootPanel, whereas when you bind onto some arbitrary element that is already on the page you don't get this hierarchy for free.

When widgets are added to a parent widget some magic happens to set up things like the eventing system. Without that magic you can add as much event-handling plumbing as you like, but it won't work because your widget isn't wired into the eventing system.

Actually getting around this is not all that difficult. Simply invoking onAttach() will wire up your widget, though it's a little unpleasant to have to do that.

Another problem with the lack of hierarchy is, well, there's no hierarchy. Things that you would normally do in GWT widgets - like adding, removing or replacing child widgets - get a little trickier. If you want to use the technique recursively (and why wouldn't you?), you need to allow widgets to bind to elements inside other widgets without causing them to be removed from and re-attached to the DOM, but crucially you still need to add them as 'logical' children of the parent widget, otherwise the parent knows nothing about the child widgets and can't do any of those "normal" operations with them.

To do that there are two problems to overcome:

1. The parent needs to have the children added to it, so that the set of child widgets is known and available for manipulation (say by extending ComplexPanel and using the getChildren() method).
2. Some of the child widgets might need hard, typed references in the parent widget to allow direct manipulation of the child widget - just like in a "normal" GWT widget you would keep a reference to the Button you added in the constructor in order that you can bind ClickHandlers to it or toggle its enabled-ness.

Point 1 is easily solved - any widget that wants to play this way needs to support adding other widgets without triggering an attachment to the current element. When you add a normal child widget to a normal parent widget, the child is detached from its current parent - logically and physically - so that its html element is actually inserted into the DOM under the parent's element. This is not what we want when binding onto a template - we just want the logical attachment step, so we need to support an add method something like:

public void logicalAdd(Widget aWidget) {
    getChildren().add(aWidget);
    adopt(aWidget);
}

I've yet to try to solve point 2. So far I've built:

Tools to help with scanning for elements to bind to, and then binding the right widget.
Plumbing to allow recursively binding widgets with logical hierarchy intact (point 1 above).
An example that binds widgets recursively - an outer container, an inner container, and a bunch of widgets inside that are manipulated by the inner container.

I'll try to update the post with an example at some point. Meanwhile my next challenge is to solve point 2 such that widget developers can build their widgets in a fairly typical GWT way.

As an aside, I lay awake for a while last night pondering the ability to give designers a client-side templating system, where they can write the html for a component once (declaring it to be a template, which may include recursive binding points for GWT-activated widgets) and then re-use it elsewhere within their html by reference to the template. I'm sure this would be possible, though its utility might extend only to mock-ups.

Sunday, 30 January 2011

Progressive Enhancement with GWT, part 1

This is the first part in a series, following my thoughts on using GWT in SEO'able web applications. The other parts in the series are part 2 and part 3.

GWT is a superb framework for developing complex, componentized html & javascript widgets. You can have your cake and eat it:

Drop down into native javascript as and when you feel the need
Integrate easily with native js components and libraries
Use GWT components relatively easily from native javascript
Create super-condensed, fast, platform-specific code, easily
All the benefits of Java's static type-system, packages, and tooling to manage and refactor your code

It is very tempting to go for an all-out GWT user-interface, which is great if what you want is a super-snazzy Rich-Internet-Application which packs down into very small and extremely cacheable js bundles that fetch data asynchronously from the server, and you don't mind re-compiling your user-interface to make even a very small change.

It isn't so great when:

You need to expose the content of your site for search-engines to index (The SEO Problem).
You want to leverage the html and css skills of your UI designers, and to be able to generate more flexible layouts without requiring a re-compile (The Design Problem).

The SEO Problem

To a search engine, GWT apps just look like a big fat lump of dense javascript. Nothing to see here, move along. It's a similar problem for any web-app that uses ajax to collect data from the server, but the problem is magnified with GWT due to the fact that the entire application tends to present as a large lump of dense javascript, whereas many other ajax technologies typically involve some amount of server-side content rendering that can make the site at least partially visible to crawlers.

Google have a recommendation for how to get around the problems of SEO for ajax applications, which entails a special url form and the creation of "html snapshots" - effectively a parallel, ajax-disabled site that the crawler can index. This seems to me to be a workable but irritating solution that involves doing a lot of extra work just to allow a search engine to crawl the site. It's effectively just a Google-approved cloaking method. Also it isn't clear to me whether any search engines other than Google support this approach.

The Design Problem

Don't get me wrong, it's not as bad as all that. You can, of course, leverage your UI/UX designer's talents when building GWT apps. They can produce designs that the GWT developers base their components on, and with UiBinder the html fragments produced by a designer can be used in large chunks, but there is always some disconnect between what the designers produce and what is actually output by the application - usually because there is a developer translating the designer's work into GWT components.

In retrospect, having built a number of "monolithic" GWT applications, it seems to me that what we're missing is a way to step back just a little from the "GWT does everything" mind-set, and instead to leverage GWT where it is best suited, and something a little more flexible where GWT can be too restrictive - for example when laying out high level components on a page it would be advantageous to be able to escape from the restrictions of having to compile that page layout into js, and instead work at the level of straight-forward declarative markup.

Introducing "GWT-Activated Pages"

How can we solve these two issues? One idea I've been toying with is to use GWT for progressive enhancement of simple html + css. It goes as follows:

Rather than try to build two almost parallel versions of your application (one for SEO, one for real users), why not build one with a layered approach that allows graceful degradation for browsers with javascript disabled (of which search-engines could be considered a sub-set)?

The base-layer that non-javascript browsers would render, and which search-engines would see, would be generated by some typical server-side technology - php, jsp, struts, jfaces, ... take your pick. This would build a "wireframe" of your page, giving it a basic shape and layout, and filling in some starter content. The markup would ideally be meaningful, in the sense that headings would appear in <h1> tags to indicate that they are headings, rather than to give them any particular styling.

This base layer would be something that designers could work on directly, including any and all css styling.

The second layer would be a set of GWT widgets that "activate" or progressively enhance the page, by scanning the DOM for certain signs that denote activateable sections of markup. When the base page loads, GWT widgets search for elements to bind themselves to. When a widget finds such an element it binds to it and "activates" it. Activation could mean anything from completely changing the html markup, to binding event-listeners, to handling interaction with ajax data loading from the server.

Here's a simple example "base" layer:

<html>
  <body>
   <h1>Page Header</h1>
   <ol class="gwt-navigation-widget">
   
   <li><a href="..">Home</a></li>
   <li><a href="..">News</a></li>
   <li><a href="..">Videos</a></li>
   <li><a href="..">Photos</a></li>
   <li><a href="..">About</a></li>
   </ol>
   <ol class="gwt-news-ticker-widget">
   
   <li>News story 1</li>
   <li>News story 2</li>
   ...
   
   <li><a href="..">older stories</a></li>
   </ol>

   <script type="text/javascript" language="javascript" src="widgets/navigation-widget.js"></script>
   <script type="text/javascript" language="javascript" src="widgets/news-ticker-widget.js"></script>
  </body>
</html>

Notice the elements with css class-names prefixed with "gwt-". These are the signs that our gwt widgets will be looking for in order to know which elements they should activate.

As you probably guessed, the navigation widget would detect any elements with a class-name matching "gwt-navigation-widget", while the news-ticker will search for "gwt-news-ticker-widget".

OK, so what do we get for our troubles? Well, several things potentially:

One request to the server to get our initial page full of data (rather than multiple widgets requesting async loading of little chunks of data)
A page that contains the data and is search-engine friendly, allowing pages deep within your app to be indexed by search-engines
A very clear separation of widgets and page layout, allowing you more flexibility to change the page layout without a GWT re-compile
Flexibility in dividing work between designers and developers:
- designers can focus on the design-heavy html and css work, and the overall page layout
- developers can focus on interaction with the server, complex widget behaviour, etc.

Upon finding an element to bind to, the widget would:

Examine the content of the element - this will very likely be the source of its initial configuration and/or data-set, and also might include some information about how to load more content, as in the news-ticker example whose last <li> is a link to "older stories". I'm sure it would be a good idea to make this even more explicit, but like I said this is supposed to be a simple example :)
Replace or modify the content of the element - perhaps the widget displays a very complicated UI, so it removes the html and replaces it with something nifty that it generates, or maybe it just adds some decoration in the form of small visible changes, or perhaps it binds a bunch of event handlers to do neat tricks like adding gesture handling for touch-screen users.

It seems that it would be perfectly possible to bind widgets inside widgets in this way, although at some point you would probably want to control the binding order such that an outer widget binds before an inner one. One way to do this might be to have a page-activator that seeks elements that want to be activated, and which all widgets are registered with. This could be described nicely as a Pattern.
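As a rough illustration of the page-activator idea, the plain-Java sketch below keeps a registry of binders keyed by css class and walks the tree parent-first, so an outer widget is always bound before any widget nested inside it. PageNode stands in for a DOM element, and PageActivator and WidgetBinder are hypothetical names, not part of GWT.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a DOM element carrying at most one css class.
final class PageNode {
    final String cssClass;
    final List<PageNode> children = new ArrayList<>();

    PageNode(String cssClass) {
        this.cssClass = cssClass;
    }

    // Appends a child and returns this node, so trees can be built inline.
    PageNode add(PageNode child) {
        children.add(child);
        return this;
    }
}

// Hypothetical callback that builds the widget for one matching element.
interface WidgetBinder {
    Object bind(PageNode element);
}

final class PageActivator {
    private final Map<String, WidgetBinder> binders = new HashMap<>();

    // Every widget type registers the css class it wants to activate.
    void register(String cssClass, WidgetBinder binder) {
        binders.put(cssClass, binder);
    }

    // Walks the page and returns the widgets in the order they were bound.
    List<Object> activate(PageNode root) {
        List<Object> widgets = new ArrayList<>();
        visit(root, widgets);
        return widgets;
    }

    // Pre-order traversal: a node is bound before its descendants are visited.
    private void visit(PageNode node, List<Object> widgets) {
        WidgetBinder binder = binders.get(node.cssClass);
        if (binder != null) {
            widgets.add(binder.bind(node));
        }
        for (PageNode child : node.children) {
            visit(child, widgets);
        }
    }
}
```

In a real GWT version the binder would presumably hand the element to a widget constructor, and the outer widget could then take over binding its own children; this sketch only shows the ordering.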

In my simple example I showed the scripts being loaded separately, just for clarity, but I'm sure you wouldn't want to load each widget as a separate script - that would lose a good chunk of GWT's advantage. Rather, the whole widget-set could be loaded as one script, cached forever, and used all over.

Now, if you want to see an example of GWT-activated pages at work, just take a look at my older post on rendering 3D Rubik's Cubes with GWT and HTML5 Canvas, where the Rubik's cubes are rendered by a gwt widget that "activates" a <div> element containing the configuration for the cube.

OK, but what are the down-sides? Here's a few...

Compiler no longer has visibility across the whole UI.
Messaging between components becomes more difficult (but not impossible). This has its advantages too - it forces low coupling. Messaging via OpenAjax Hub or similar would be worth considering.
It's more work than a straight-out GWT UI, and many would argue why bother to use GWT at all if you need SEO (depends on your skill-set and the complexity of the components you're building in my view).
I'm sure that there are others which I'm currently blind to ... I need to try to build some more complex and interesting examples to find these out.

tl;dr ?