BBC Radio Labs

Archives for March 2009

Coyopa's guts


James Cridland | 10:16 UK time, Monday, 23 March 2009

Coyopa, as any regular Radio Labs reader will know, is the new system for encoding BBC national radio stations for the iPlayer and internet media devices - both simulcast and on-demand. It's been running well in production since November, and is now producing all listen-again and most simulcast streams across our services.

If you're deeply interested in the technology we use, here's a quick delve into rather more detail about Coyopa. This isn't for the faint-hearted, particularly if your tolerance for TLAs is low. (What's a TLA? A Three Letter Acronym. LOL. FTW!)

We have used broadcast-engineering principles to implement the new encoding system: resilient system design with backups, multiple power supplies, and keeping the digital audio linear all the way to the encoder.

For simulcast - live streaming - we take the audio directly from the broadcast chain which feeds DSat/DTT. This is passed through an audio router (for resilience) and then directly into an AES3 sound card. The sound card implements protection limiting in a DSP and carries out other functions such as detecting audio silence (failure alarms). Up to four radio station versions (eg "Radio 2 UK") are encoded on each machine.

For Listen Again, the process is divided into two functions. Linear audio is recorded from the broadcast chain for each radio station, non-stop, and kept for up to seven days. Producers can schedule their radio programmes to be available for Listen Again by selecting the times/days and repeat patterns (rather like a professional version of a consumer PVR). The system will let you schedule "in the past" too, using its internal store of audio. Producers can also select which regions will hear the audio (UK/International) and allow programmes to be automatically published after the show (some programmes have to be edited before being allowed through for compliance or rights reasons). There are different rights restrictions for download (podcast) files, which Coyopa will begin producing later this year, and for streaming files. Producers can also re-edit the programme, particularly the start and end times, at any time to tidy up the audio that listeners hear.

When a programme is ready for encoding, it is sent to a set of sixteen encoding machines, allocated according to the current workload: Radio 3's Through the Night (nearly six hours) takes longer to encode than a fifteen-minute news bulletin, after all. Up to four files can be sent for encoding for each radio programme (UK/International, Streamed/Download), and the encoding system knows which formats/bitrates to use for encoding each file. After encoding, the many different encoded files are passed into a large store, which then clones the files to our content delivery partners.

The schedule of radio programmes compounds the encoding workload by having many programmes across the radio stations ending at about the same time; nevertheless, the whole process is designed to be automatic and fast.

The server hardware uses HP blade systems, where each server is a plug-in module, with up to 16 in a single (very heavy) chassis. Six power supplies share the load of the whole rack, fed from two different sources of mains power. These servers are used because, as well as offering the necessary performance, they are reliable, easy to maintain and allow a very high packing density. Each server has two four-core processors, which are needed to achieve the throughput for Listen Again encoding.

As you can see, Coyopa is rather more than a little machine with a sound-card!

I hope all that was interesting to some; and I'm indebted to my colleague David for working on this blog post with me.

Designing for your least able user

Michael Smethurst | 20:00 UK time, Monday, 16 March 2009

Usability, accessibility and search engine optimisation from an information architect's perspective

A few reasons why we make websites how we make websites.


Another dull presentation...

...in black and white with too much text and no pictures. I apologise - PowerPoint was never a key skill. If it's of any use please feel free to take it and tart it up as necessary.

Following an understandable complaint about the use of Flash in a post about accessibility I've added an S5 HTML version here. Also a note that there's nothing in the presentation that isn't in the post.


Some egg sucking

Way, way back in the day was the internet. It provided a way for machines to talk to machines across the globe. Software engineers no longer had to care about the wires or the packets - all the hard work was done for them. It changed the focus of development away from cables and back to machines.

In 1990-91 Tim Berners-Lee took the internet and added 2 new components. He took SGML, stripped it down and invented HTML. And he took academic theories of hypertext and invented HTTP and URIs. The result was the World Wide Web and it changed the focus from a network of machines to a web of documents and links.

All this is explained far, far better, and with greater emphasis on the future, by the man himself.

All about pages

So the web has always been about 2 things: pages and links. It's pretty obvious but it's something we often lose track of. If you've ever worked on the web just think about all the time and effort we put into pages: wireframes, visual design, photoshop comps, semantic markup, CSS, flash components, 'widgets'... And compare that with the time and effort we put into URI schemas and link design.

In many ways it's understandable. It's far easier to get people to engage with things they can easily picture. And if they engage they sign-off. And if they sign-off we get paid. Unfortunately URI schemas and link designs are not by nature particularly engaging or picturesque.

But URIs and links are more, not less, important than page design. If you get your URIs right and your pages look shoddy you can always come back and make them nicer. But if you make nice pages and get the URIs wrong you've got a big rebuild job and you lose the persistence of your URIs. And persistent URIs are vital for both user experience and search engines.

To use a tortuous analogy a well designed website should be like a cubist painting: the spaces between things are as important as the things themselves - where things in this case means pages and spaces means links. Sorry!

The early days of web search

Back in the days of first generation Yahoo! and its contemporaries, web search was all about pages. Crawlers crawled the web and indexed the content and metadata of each page. Results were ranked according to keyword density and the content of meta elements.

Pretty soon people realised that they could spam the search engines by including hidden data in pages. The meta keywords and description elements were designed to allow publishers to include metadata to describe the page but some publishers abused them to include popular search terms with no relevance to the page content.

Search engines have the same attitude to being fooled as the rest of us: they don't much like it. If a search engine allows itself to be spammed, users will no longer trust its results and will go elsewhere. And since modern search organisations are really more in the advertising business than the search business, getting spammed could be fatal to their bottom line. So they started to ignore meta keywords and description elements.

The major metric for search result ranking became keyword density on pages. Which was fine whilst the web was small. But the web isn't small any more: last year Google announced it had found a trillion unique URLs. As we see increasing content syndication via feed hubs and friend feeds and reaggregators, how do search engines differentiate and rank the ever increasing list of results?

What Google did

To solve the problem Google went back to the 17th Century and some founding principles of modern science: citation and peer review. To differentiate between results with near identical keyword density they also ranked pages by how cited, and therefore how influential, they were. Which for the web meant they counted the number of inbound links. The assumption is that the more a page is linked to (cited) the higher its relevance. This was the foundation of the famous PageRank algorithm. If you're in the mood for deep maths you might want to do some background reading - if you're not then really don't.
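For the curious, the heart of the idea is compact enough to quote. In one common formulation (a simplified sketch, not anything Google runs in production), the PageRank of page A is:

PR(A) = (1 - d)/N + d * ( PR(T1)/C(T1) + ... + PR(Tn)/C(Tn) )

where T1...Tn are the pages that link to A, C(T) is the number of outbound links on page T, N is the total number of pages and d is a damping factor, conventionally around 0.85. In other words, a page's rank comes from the rank of the pages that cite it, divided up amongst everything they cite.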

So Google brought URIs and HTTP into the world of search. What had been all about pages and content became about pages AND links - the 2 key components of the web working in tandem.

Web 1.0 > Web 2.0 - from link rot to persistence

The hypertext of academia had a few things that TimBL left out of the web. One of these was bidirectional linking. In academic hypertext theory if document A linked to document B then document B would also link to document A.

The lack of bidirectional linking removes an obvious overhead from web design and maintenance. It means that pages don't have to know anything about each other. It also means it's perfectly possible to link from one document to another document that doesn't (yet) exist. A broken link is a poor experience but it doesn't break the web. This permissive attitude to linking is probably one key to the rapid growth of the web - it just made life easier.

Unfortunately it also leads to the problem of link rot. Link rot happens when document A links to document B and document B is then removed, moved or changes meaning. Again the web doesn't break but the link does. Link rot is the enemy of search engines and the enemy of any organisation that hopes to make its content findable.

Back in page-focussed web 1.0 days this happened all the time. If web 2.0 was about anything, it was the resurrection of HTTP and URIs as key components of the web. Blog permalinks, blog outbound links, social bookmarking and wikis all reestablished the primacy of HTTP, URIs and links as the backbone of the web.

Web 2.0 > Web 3.0 > onwards - from documents to things

So what happens next? Some people talk about web 3.0, some about the Semantic Web, TimBL talks about Linked Data - but it's all pretty much the same thing. Instead of a network of machines or a web of documents we're starting to move into a world of linked things, or at least a web of data about things. And to build that we first need a web of identifiers - or URIs. Those things might be you or your friend or a great TV programme or a favourite music artist. But the key to identifying them on the web is to give them an HTTP URI and make that URI stable and persistent - cos cool URIs don't change.

Probably the first place to feel the effects of this will be you and your social network. Tired of having to constantly re-enter your friends into social network sites? If you (not a document about you but actually you) have a URI, and each of your friends has a URI, and your social network can be expressed as links between these URIs, there's no need for any more data entry. The technology can take more than one form - whichever way, the concept remains the same.
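As one illustration (my example, not necessarily the technology the original links pointed to), XFN expresses a social network as nothing more than rel attributes on ordinary HTML links, which any crawler that understands the convention can reassemble:

<!-- A hypothetical personal homepage at http://example.org/me -->
<a href="http://example.org/me" rel="me">This is me</a>
<a href="http://alice.example.com/" rel="friend met">Alice</a>
<a href="http://bob.example.net/" rel="colleague">Bob</a>
<!-- A crawler that understands XFN can rebuild the social graph
from these links without anyone re-entering their friends. -->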

The future of search (is semantic?)

As clever as Google et al are (and they are very clever) search is still something of a brute force exercise. If I search for an ambiguous term the engines don't know which of its several meanings I'm after: the film, the TV season, the book or the band. (In this case a document about the band is ranked first, which is how it should be :-) .) But that doesn't help people who aren't fans of the Salford-based band in question.

So influence doesn't always map to authority - particularly when terms are ambiguous. To differentiate between films and seasons and books and bands we need to publish content as data that machines (in this case search bots) can understand. In other words we need Linked Data on the Semantic Web.

It's still early days for Linked Data and search engines. Most of the major players seem to be dipping their toes in the water. Yahoo!'s SearchMonkey is probably the most high profile effort to date: it indexes microformats and RDFa within HTML documents to extract semantic information. I dare say few SEO consultancies will advise you to add microformats or RDFa or full fat RDF to your site just yet - but those days may be coming.
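To make that concrete, here's a minimal, hypothetical hCard microformat (the name and URL are invented). Nothing changes for the person reading the page; a semantics-aware indexer gets structured data for free:

<!-- A contact block marked up as an hCard -->
<div class="vcard">
  <a class="url fn" href="http://www.example.com/">Joe Bloggs</a>,
  <span class="org">Example Radio</span>
</div>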

Why this matters

When you're buried in the midst of a large project it's sometimes difficult to focus on anything but the implementation details. Your energy is expended on your site and you forget to consider how that site stitches into the rest of the web. How many user testing sessions have you sat through where the participant is shown a browser open at the site homepage and asked to find things? That's why you've spent so much time and effort designing it.

But in real life how many of your journeys start at a homepage and how many start at Google? Every day 8 million users arrive on a BBC page via a search engine. 1 million of those come via BBC search, the rest via Google, Yahoo! etc.

So search is important. The prettiest, most useful site in the world is no use if your potential users can't find your pages. And the easiest way to find things on the web is via search.

Now I'm not saying homepages aren't important - just that the time and energy we spend making them is sometimes disproportionate to their value to the web and therefore to users.

So what can you do?

There are several routes open if you want to optimise your site for search engines. Some of them are vaguely dubious:

  1. You can increase the keyword density of your pages. This often raises objections from journalists and editorial staff who don't like their copy style interfered with. And those objections are often dismissed by techy types and SEO consultants as creative whimsy. But it's not whimsy. You often see sites that have been SEOed to within an inch of their lives, with keywords repeated ad nauseam. How many times do you have to read Prime Minister Gordon Brown before you understand that Gordon Brown is Prime Minister? If you keep repeating keywords you run the risk of making your content unpleasant to read and therefore less useful. And if it's less useful people won't pass it on to friends or link to it. And without links your PageRank suffers. It's a vicious circle. But from an IA perspective the editorial staff have a point. The usual style guides still apply - write for your intended audience and try to use words they'd use. But don't repeat yourself unnecessarily just to up your PageRank.
  2. You can add keywords to URIs. Google tells us they take no notice of this but no-one seems sure of what impact it has on Yahoo! etc. If you do choose to do this consider what happens if those keywords change over time. Can you honour your old URIs? Without too many redirects? Search engines see redirects as a potential spam mechanism - Google for example will only follow one 301. If you can't honour your old URIs then links to your pages will break and all your hard won search engine juice will leak away.

Other techniques are less controversial:

  1. You can provide XML sitemaps (there's a sketch of one below). These let you exercise some control over how and when your site is crawled and indexed. Before a search bot crawls your site it first checks the sitemap to determine what's changed, what's expected to change frequently and what hasn't changed since its last visit. This lets you point the bot at new pages and updated pages and makes sure these are crawled and indexed first.
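As a rough sketch (the URL and dates are invented), a minimal sitemap following the sitemaps.org protocol looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/programmes/b00abcde</loc>
    <lastmod>2009-03-16</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>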

But the majority of techniques are more about URIs and links than pages:

  1. If you've read this far then you've probably guessed the first recommendation is to spend time designing your URI schema. Ensure that you can guarantee the persistence of URIs against all (predictable) eventualities. Don't sacrifice persistence for the sake of readability. If your pages move your search engine juice goes out the window.
  2. If you decide that readability is a prerequisite for you or your organisation and if you plan to publish a lot of pages you'll probably need to allow for editorial intervention in setting these labels. You'll also need to build an admin tool to allow this intervention to take place. Which means you'll need to build the cost of employing these people and building the tool into your project.
  3. If for some unforeseen reason your URIs do have to change, spend time getting your redirects right. Sometimes there are redirects on your site that you're so used to you forget they exist. For instance if I link to Zane Lowe's site the first redirect will be to /zanelowe and the second redirect will be to /radio1/zanelowe/. Since Google will only pass PageRank for one redirect, the first of these links will pass no PageRank to Zane Lowe. Since even the addition / omission of trailing slashes will usually cause a redirect, getting this wrong could lead to a serious leakage of search engine juice.
  4. Never expose your technology stack in your URIs - no .shtml, /cgi-bin/, .php, /struts/, .jsp etc. Technology is likely to change over time - when it does you don't want your URIs to change with it. As a side benefit, keeping your technology out of your URIs also gives fewer clues to hackers.
  5. On the subject of security, there was an MSDN article a while back discussing ways to obscure your URIs and query strings to make them harder to tamper with. If you did choose to do this you'd be kissing your search engine findability goodbye.
  6. Never include session IDs in your URIs. This is one of the options discussed in the MSDN article above. But it's also commonly used as a means of tracking users across a site. When you visit a site you're given a key that persists for the duration of your browser session. This is then used in every subsequent URI to track your journey. But since every user gets a different key for every visit it means you can't link to, bookmark or email a link to a friend. Which means search engines can't see or index the page. It's a technique you still see on plenty of big sites. Which means that a URI which works for me won't work for you and won't work for search engines. And next time I open my browser it won't work for me either. Nice.
  7. Only use https where you have to. https is http's more secure cousin. It should be used when you want users to submit sensitive information to your site. However, pages served with https aren't indexed by search engines so you don't want to use it for plain content pages. The BBC jobs site gets it right with application submissions, which are https. Unfortunately it also uses https for everything else: search results, job listings, job descriptions... Another reason why BBC jobs don't turn up in search engines.
  8. One of the first rules of information architecture is: don't expose your internal organisation structures in your public interface. It's still something that often happens and can take many forms: using labels that only your business units understand, reflecting your management structures in your site structure etc. The most pernicious examples usually happen when you think of your website as a set of stand-alone, self-sufficient products. The web really doesn't lend itself to this shrink-wrap mentality. The net result is often the creation of multiple pages / URIs across your site that talk about the same thing. In general your site should tend towards one page / URI per concept. When you get multiple pages about the same thing some will inevitably end up unmaintained and go stale. This all results in confused users. It also results in confused search engines and the splitting of your PageRank across multiple URIs. It's better to have one page with 10 inbound links than 10 pages with one inbound link.
  9. If for some reason you do end up with multiple pages about the same concept at least make sure there are links between them. Decide which one is the canonical page - the one you want to see turn up in search results. And add a link element identifying that canonical page to the others (there's a markup sketch after this list). If search engines find many similar pages they'll try to squash down the result set into one page. Telling them which page is canonical helps them to make the right decision.
  10. Connecting up your site on a data and interface level and breaking down the content silos results in a more usable, more search engine friendly experience. The first step is to agree on what you model, check your understanding with users and agree on identifiers. Once you've done this new linking opportunities arise, new user journeys become possible and you can slice and dice one set of data by many, many others. The more content aggregations you make, the more user journeys, the more links for search engines to get their teeth into. As an example, if part of your site is about programmes and some programmes contain recipes and another part of your site is about food and contains recipe pages then link from the programme episode page to the appropriate recipes and link from the recipe pages to the episode they were featured in. It's simple in principle - the key to good user experience and good SEO is to get your infrastructure and piping right.
  11. If we're agreed on one page per concept we should also agree on one concept per page. There'll probably always be pressure from marketing types to include lots of cross-promotion links from content page to content page. Which is fine in principle. In practice it can lead to pages that have more adverts than content. This waters down their keyword clustering and can also be confusing for users - what is this page and where am I? If you connect up your data you can start to build contextual links between related pages and minimise the need for clumsy advertising. Think Wikipedia, not right hand nav.
  12. You can encourage people to link to you by making every nugget of content addressable at a persistent URI. The analogy here is with Twitter. Every tweet, no matter how trivial, has its own URI. Which means when someone does say something interesting people can link to it. And because every tweet links back to its tweeter, it's all more links for search engines to chew on.
  13. Remember you don't have to mint your own identifiers. If you can, use common web-scale identifiers for the concepts you're interested in. It makes it easier for other sites to link to you if you share a common currency.
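Going back to item 9, here's a hedged sketch of what a canonical link element looks like (the URIs are invented). It sits in the head of each duplicate page and points at the page you want search engines to treat as definitive:

<!-- On http://www.example.com/food/recipes/lemon_tart?src=promo, a duplicate -->
<head>
  <link rel="canonical" href="http://www.example.com/food/recipes/lemon_tart" />
</head>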

The final tip is the most important: MAKE GOOD CONTENT!!! If your content is interesting, relevant or funny people will want to bookmark it, cite it and share it with friends. If it isn't they won't.

Share the love

So I've talked about inbound links - what about outbound links? According to a strict interpretation of the PageRank algorithm if inbound links are a tap pouring lovely search engine juice into your page then outbound links are an unplugged leak splurging it back out again.

There's an argument that it shouldn't work like that: the web in general and search engines in particular thrive on the density of links. So far I've found no evidence either way on this one but maybe I haven't looked in the right places - or maybe the right places weren't SEOed :-) If you know better (or you work at a search engine company) maybe you'd like to leave a comment. In the meantime we're working with search engine companies to get to the bottom of this - when we understand more I'll update this post.

Either way, if you want to make the web a better place, make links. If you find an article you like you could bookmark it in your browser. But if you do that only you benefit. If you delicious it or blog about it or whatever it is you do then your social network also benefits. And the links from delicious etc all count towards the PageRank of the article so it becomes more findable and the publisher benefits too. It's worth noting that the higher the PageRank of your page, the higher the PageRank its links pass on.

So if you think this post has been worth reading then please (social) bookmark it or blog it. Or if you think it's all rubbish blog it anyway but add a rel="nofollow" attribute to your link back.

Which brings us nicely to rel="nofollow". It's a way to link to something without passing PageRank. And since links are often seen as leaky, many publishers choose to add it to all outbound links. Indeed some publishers are so convinced that links mean leaks they even add rel="nofollow" to their own internal navigation - things like Terms and Conditions and Privacy Policies. It's a practice called PageRank sculpting and it verges on the paranoid - and it's also pretty pointless.
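For completeness, this is what a nofollow link looks like (the URL is made up). The link still works for users; it just asks search engines not to pass PageRank along it:

<a href="http://dubious.example.com/" rel="nofollow">a page I'd rather not endorse</a>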

rel="nofollow" is also commonly used on sites that accept user content. In order to stop people using links in comments to leach PageRank from the hosting site, publishers often add rel="nofollow" attributes to all links in comments. Twitter is one example amongst many - every link in a tweet is automatically made nofollow. The trouble with nofollow is that if everyone used it, PageRank would die and web search would die with it. So go easy on the rel nofollows or you might break the web.

Some other things you can do

'Widgets' and APIs

'Widgets' and open data APIs allow users to take your content / data and reuse it in their own sites and applications. It seems counterintuitive to suggest that if your content can be found everywhere it'll be more findable on your site. Again it all comes back to links.

In the case of widgets they almost always come with links back to the content of the source site. The presentation at the top of this post for instance comes with 3 links back to slideshare. Every one of those links makes the slideshare content more findable.

Unfortunately it's not always so simple. The slideshare widget at the top of this page displays (mainly) static data. Which means it can be rendered in HTML with a Flash movie for the actual presentation. Because the links back to slideshare are plain HTML links they all contribute to PageRank. In many cases widgets need to display dynamic data. To do this they often use JavaScript and / or Flash to render themselves. In which case the usual Flash and JavaScript problems emerge.

A much better way to encourage links back to your content is to provide an open data API. Opening your data with APIs allows other people to take it, mash it up with other data sources and make things you've not thought of or didn't have the time to implement. The Twitter API is a great example. The site itself is stripped down to perfection. It does exactly what it needs to do and no more. But the API allows other people to take the data and use it in new and imaginative ways. Some of these are standalone applications that work on desktops and mobile phones. But others are websites that can be crawled by search engines. And they all link back to Twitter, passing search engine juice as they go.

So opening your content / data for reuse can make your site more findable and drive traffic back to you. Everybody wins.

One web

Once you've modelled your data, given everything a URI and provided as many aggregations and user journeys as possible it would be silly to dilute your user's attention and links by providing the same content at a different set of URIs. But we still do this all the time with special sites for mobile and other non-desktop devices.

For now it's not too much of a problem. Most other devices don't have the rich web of support sites you get on the desktop web, and device-specific sites are still more walled gardens than their desktop cousins, which are more tightly woven into the web. But as mobile support increases there's no reason to suppose that the complex ecosystem of support sites (social bookmark tools, for example) won't evolve with it.

The iPhone in particular already raises many of these problems. It's quite capable of rendering standard desktop sites and integrating with social bookmark tools. But we often create a separate iPhone version at a different URI to take advantage of the iPhone's swishy JavaScript page transitions etc.

Clearly different devices call for different content prioritisation, different user journeys and different interaction patterns. But they don't need their own set of URIs. It's better to use content negotiation and device detection to return a representation of your content appropriate to the user's device. A single set of URIs means your users' attention and links aren't split, and increases the search engine juice to your pages.

If content negotiation / device detection is too much work you need to decide which representation (usually desktop web) is canonical and mark it / link to it as such.

Erm, user generated content

So we all know that 'user generated content' is a much demeaned term. And I'm afraid I'm going to demean it further. Sorry.

Way back in 1995 Nicholas Negroponte wrote a book called Being Digital. In it he discussed what broadcasters would have to do to make their content findable in a digital world. He talked about bits (the content as digital data) and bits about bits (data about the content). (I guess bits about bits are what we now call metadata but I think I still prefer bits about bits.) On the subject of bits about bits he said:

[..] we need those bits that describe the narrative with keywords, data about the content, and forward and backward references. [..] These will be inserted by humans aided by machines, at the time of release (like closed caption today) or later (by viewers and commentators).

Emphasis my own. Remember this was well before everyone talked about social media, before Google existed and before most people had even heard of the web.

Nowadays we're all used to sites that ask us to log in and rate and tag and comment on content. This might seem cynical but in many cases (although not the BBC of course) site publishers invite these interactions not because they're interested in what you have to say but because it's a valuable source of additional data about their content. And this data can be used to make new aggregations (most popular; tag clouds for you, for your friends, for everyone; latest comments etc).

From an SEO perspective publishers benefit twice. Firstly search engines have more text and keywords to chew on without requiring much editorial intervention / expense. And secondly more aggregations means more links into content pages, more user journeys and more journeys for search engines. Every one of those inbound links pushes up the PageRank of the aggregated page.

Personalisation

There are 2 ways to personalise a site. The first is to change the content of your existing pages according to the instructions and behaviour (implicit and explicit) of the logged in user. So a page about a TV programme might include a list of your friends who've watched that programme. It's a good way to make your site feel lived in but if this is all you do you will sacrifice valuable social recommendation and search engine goodness.

The first problem with personalising only on existing content pages is that only you can see this data as presented - your friends and friends of your friends can't. So you sacrifice valuable recommendation from outside the user's immediate social graph.

The second problem with content page personalisation is that search engines can't see it either. Search engine bots can't register and can't log in. Which means that all your development work and all the work your users put in consuming and annotating your content can't be seen by search engines.

The answer again lies in URIs and links. In this case you should treat each user as a primary data object and give them a persistent URI. Make links from users, through their attention data, to your content and from your content to your users. Obviously you should ask your users before you expose their attention data - how much is made visible to the web should always be under their control.

There are already sites out there that do personalisation properly along these lines.

Accessibility, SEO and karma

This is probably the only section that's pertinent to the title of this post. But since I quite like the title I'll stick with it - even if it is non search engine optimised.

We've long accepted that accessibility is not an optional extra. Neither is it something you can just stitch over your site when all other development is done.

The same is true of SEO. And many of the rules we follow to make sites accessible will also make them more search engine friendly. So even if you don't design for accessibility because you know you should, self interest should take you in that direction anyway. It won't be as good for your karma but it will have the same effect.

Accessibility is not a set of boxes to be ticked - it needs to be baked into your whole design, build and testing ethos. Building an accessible site is no use unless it's also usable. And even a usable site is pointless unless it's useful. Giving things persistent URIs, connecting your data, building non-siloed sites and providing new journeys across and out of your site all help to make your site usable and useful so maybe it's all connected...

Plain English

Always write in plain English or French or Welsh or [insert your chosen language here]. Use language your intended audience will understand. If you overcomplicate your text you run the risk of confusing users. The risk is doubled for users with cognitive disabilities. This doesn't mean you have to write tabloid style - as always, write to be understood.

If you stick to the language of your users chances are your chosen words will also be words they search for. Which will help with your search engine friendliness.

If your website has its own search functionality the standard SEO advice is to check your search logs to see what users are searching for and tailor your language appropriately. However, Google tell us that they already perform a certain amount of term association, so if your site says 'TV listings' and a user searches for 'tv guide' they'll find your content anyway. There's really no need to cramp your writing style so long as you keep things clear.

Alt attributes

Probably too obvious to mention. People with visual disabilities struggle with images. If your page includes images you should include an alt attribute to describe each image. Search engine bots spend their lives chewing through pages like pacman on a diet of links. They can detect the presence of an image but not decipher what's depicted. So they need alt attributes too. Sometimes images are used for purely decorative purposes. In that case an empty alt attribute can be the right choice - a description is only needed where the image adds meaning to the document.
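A quick sketch (the filenames and description are invented): a meaningful image gets a descriptive alt attribute, a purely decorative one gets an empty alt so screen readers and search bots can skip it:

<!-- This image carries meaning, so describe it -->
<img src="/images/studio.jpg" alt="The presenter in the Radio 1 studio" />

<!-- This one is purely decorative, so give it an empty alt -->
<img src="/images/corner_flourish.gif" alt="" />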

Semantic HTML

Screen readers struggle with old style table layouts. It's best to keep your markup stripped down and simple. Get your document design right and use semantic HTML.

For now search engines only really care about headings. Text found in h1s, h2s, h3s etc will be given extra weight. But you might as well go the whole hog. Separating out document design into semantic HTML and visual design into CSS will make your site easier to maintain and update. Go easy on those definition lists tho...
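As a rough illustration (the programme name, class and stylesheet path are invented), the same content marked up semantically, with all of the visual treatment left to a stylesheet:

<head>
  <!-- Fonts, colours and layout all live in the stylesheet, not the markup -->
  <link rel="stylesheet" type="text/css" href="/styles/programmes.css" />
</head>
<body>
  <!-- Headings carry extra weight with search engines and give screen readers structure -->
  <h1>The Breakfast Show</h1>
  <h2>Monday's episode</h2>
  <p>The team wake up the nation with three hours of music and chat.</p>
  <ul class="episode-links">
    <li><a href="/programmes/breakfast/">Programme homepage</a></li>
  </ul>
</body>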

Hidden content

Screen readers are pretty inconsistent in their support for CSS content hiding. The majority of modern screen readers will ignore content hidden with display:none or visibility:hidden but will still read out content hidden by positioning it offscreen. Whether this is intentional or because screen readers are still catching up with modern CSS design techniques is unclear. Offscreen position hiding is often used in image replacement of titles. Whilst screen readers will still read the offscreen text, the replacement images aren't rescaled in most browsers so still have accessibility issues. Remember - more people have bad eyesight and need to increase font size than use screen readers.
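For reference, a sketch of the kind of offscreen image replacement being described (the class name and image path are invented). The heading text is pushed off screen and a background image is shown in its place:

<!-- In the stylesheet (or the head of the page): -->
<style type="text/css">
  /* The heading text is shoved far off screen, so screen readers still read it... */
  h1.masthead {
    text-indent: -9999px;
    background: url(/images/masthead.png) no-repeat;
    width: 300px;
    height: 60px;
  }
  /* ...but the background image won't rescale when users bump up the font size. */
</style>

<!-- In the body: -->
<h1 class="masthead">Radio Labs</h1>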

If you've got this far you'll know it's possible to write a very dull article. It's also possible to add search friendly but non-contextual keywords to a dull article and hide them with CSS. Like meta keywords of old, search engines see any hidden content as a potential attempt to spam them. So hidden content is penalised. Keeping content visible will help accessibility and help your SEO.

Forms

Designing accessible forms has been the subject of much discussion. It's usually framed in terms of screen reader users. But complex forms are confusing for all users and doubly confusing for users with cognitive disabilities. Sometimes forms are unavoidable (for search, for user signup etc) but where possible always provide routes to content that don't require form filling.
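A small, hypothetical sketch of that advice (the paths are invented): keep the search form, but also offer plain links that users and crawlers alike can follow to the same content:

<!-- The form is there for people who want to search... -->
<form action="/search" method="get">
  <input type="text" name="q" />
  <input type="submit" value="Search" />
</form>

<!-- ...but crawlable links reach the same content without it -->
<ul>
  <li><a href="/programmes/a-z/a">Programmes beginning with A</a></li>
  <li><a href="/programmes/a-z/b">Programmes beginning with B</a></li>
</ul>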

Search engine bots can't fill in forms. Which means they meet forms and refuse like a small horse at a large fence. Put simply search bots can't search. Sometimes we're asked why we make topic pages and don't just add the extra semantics into search results. And part of the reason is that search engines can see topic pages and the links from them - but they can't see Ö÷²¥´óÐã search results. Getting site search right is important. But it won't reward you with any more search engine juice.

Not to pick on the BBC jobs site too much, but there's no way to browse to jobs - which means that even if the job pages were indexable (which they're not) search engines wouldn't be able to find them in the first place.

So for the sake of accessibility and SEO never use forms when links will do (or at least try to provide both). The only exception is when the action at the end of the link is destructive - you don't want Google (or anyone else's crawler) deleting your data.

Flash

Use of Flash obviously has its place on a modern website. If you want to deliver streaming video or audio it's the obvious choice. But overuse of Flash can lead to accessibility problems. Flash can be made accessible, with tabbed navigation and keyboard shortcuts, but you need to put the work in. If you choose to use Flash for your main site navigation you're making a lot of work for yourself if you also want your site to be even approaching accessible.

Even if you use HTML for site navigation and Flash for audio-visual content there'll be knock-on effects for accessibility. Because Flash doesn't scale, if your movie contains lots of text or carries its message via moving images and video you'll make life difficult for users with visual disabilities. If it carries its message via audio you'll make life difficult for users with hearing disabilities.

The best way to make content locked up in Flash files accessible is to provide an HTML transcript.

Modern search engines are starting to be able to look inside Flash files and index the text they find - so take care when you're adding abusive comments :-) So is Flash still incompatible with SEO?

If you use it as your primary navigation the answer is absolutely yes. The other day I was looking at a site that sold trainers. I spotted a pair that I rather liked. Normally I'd have bookmarked the page in delicious and come back later but in this case the whole site was rendered in Flash. Which meant there was no page to bookmark. Which means the company lost not only a potential sale but also one tiny drop of search engine juice. Add that up across many potential users and the net effect is fewer sales and less findability for their products. So use Flash sparingly.

If you lock up a lot of your content in Flash the answer is still yes. Flash is primarily a visual, time based medium and search bots don't have much visual acumen. When we say they can index text based content in Flash files there are 2 caveats:

  • if the text is entered as a bitmap there's nothing a search engine can do to pull it apart,
  • if the semantic structure of the text is a product of the movie's timeline no search engine will be able to stitch this together.

The best way to make content locked up in Flash files search engine friendly is to provide an HTML transcript.

JavaScript and AJAX

JavaScript and AJAX can be an accessibility disaster zone. It's almost impossible to keep screen reader users updated when the page changes state. The result is confusion, confusion leads to frustration and your users go elsewhere. The best approach is to design and build your site as plain old HTML and then progressively enhance it by layering JavaScript and AJAX over the top. Always test your site with JavaScript turned on and off to make sure it works in all modes.

In search engine terms what goes for Flash goes for JavaScript and Ajax. Used appropriately it can make your user experience more dynamic and interactions flow more freely. Occasionally (although less these days than when it first appeared) it's used to render a whole site. Which means that URIs are not exposed to users or to the web. Now most search engines don't process JavaScript so can't fight their way through to your content. And even if they could because individual URIs are not exposed there'd be no pages to index. It also means that users can't (social) bookmark or blog your pages which cuts down the number of inbound links and reduces your search engine juice. Again the best approach is plain HTML first, with JavaScript and AJAX layered over the top. Even so JavaScript and AJAX should be used sparingly. If your site degrades gracefully you still need to expose individual page URIs to users so they can link to them.
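A hedged sketch of what 'plain HTML first, script layered over the top' means in practice (the URI and the loadClipsInline function are hypothetical):

<!-- The link works as a plain HTML link first, so the URI is visible to users and bots... -->
<a id="clips-link" href="/programmes/b00abcde/clips">View clips</a>

<script type="text/javascript">
  // ...and the script is layered on top. If JavaScript is off, the link still works,
  // the page is still bookmarkable and search engines still see the URI.
  document.getElementById('clips-link').onclick = function () {
    loadClipsInline();  // hypothetical function that fetches the clips via AJAX
    return false;       // suppress the full page load only when script is running
  };
</script>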

Link titles

Finally when screen readers encounter a link they'll read out "link - link title" where "link title" is the text found between the opening <a> tag and the closing </a> tag or the title attribute on the <a> tag if it has one. This means that if you use:

For more on {important key words} click <a href='..'>here</a>

users will hear "link - here" which isn't very informative. If instead you use:

<a href='..'>More on {important key words}</a>

users will hear "link - More on {important key words}" which is much more useful.

For search engines link titles are almost as important as link density and keyword density. There's little point peppering your documents with search keywords if you don't make your links descriptive. So again accessibility and SEO will both benefit if you make the titles of your links as descriptive of the link target as possible.

It's probably worth pointing out that you can only control the link titles on pages you publish. The rest of the web is free to link to your content with any label they see fit. A while back a number of bloggers decided to link the words 'miserable failure' to George W Bush's biography on the White House site. For a little while the top result in Google for 'miserable failure' was that biography. It's called Google bombing and there's nothing you can do about it.

And the story doesn't end with the coming of Obama. With the change of administration the White House webmaster permanently redirected the Bush biography page to the new Obama biography page. Which meant that, for a while, a search for 'miserable failure' turned up the new president instead. They've fixed the problem since but there are 2 lessons in this. The first is to beware of semantic drift. The 43rd president was not the 44th president; last year's Glastonbury Festival was not the same as this year's Glastonbury Festival. Every time you encounter a new concept you need to mint a new URI. The second lesson is be very careful with your redirects...

Finally

Since I seem to have spectacularly failed to write a pithy blog post I guess a few more paragraphs won't hurt...

In summary, your PageRank is outside your control. How well your site fares in search engines is pretty much at the discretion of the web. The best thing you can do is make lots of your own links and encourage other people to link to you. There are of course other options if you want to artificially inflate your search ranking (mainly keyword clustering) but...

...Google et al are cleverer than we are. They employ the best graduates from the best universities in the world. If a rival publisher makes a better page than you but your page gets a higher search ranking, users will find a new search engine that returns better results. And Google etc would lose their business. The clever people aren't about to let that happen.

So from an IA perspective the best advice is to keep things simple. Design and build with the established tools of the web: HTTP, URIs, HTML, CSS. If you make a site that's usable and accessible for people, chances are it'll be usable and accessible for search bots too. Search engines are only trying to reward good behaviour and good content. Don't make anyone's life harder than it has to be...

Magazines are made of pages, websites are made of links.

Multimedia meets radio


James Cridland | 15:02 UK time, Monday, 16 March 2009

Recently, I was lucky enough to be able to attend a conference run by the EBU (the European Broadcasting Union). They do a good amount of knowledge sharing - and this event was a shining example.

Jonas Woost from Last.fm and Steve Purdham from We7 discussed digital music business models.

Then, a fascinating few sessions around "introducing your new favourite artists". The BBC has "BBC Introducing", but the BBC isn't the only public service broadcaster introducing their audience to new music. Dominik Born, project manager for mx3.ch, spoke about their service. It lets new bands upload their songs, but also allows people to grab widgets (and a nice iPhone app) to listen to them - based on music genre (so I could add all new trad-jazz bands to my website, for example). What was interesting is that mx3.ch is a separately branded service for many of the public service broadcasters in Switzerland (who don't share any common brand, or, indeed, language). Nicely done.

And then we heard from Steve Pratt from Canada's CBC Radio 3 (strapline: "Breaking New Sound"). CBC Radio 3 is the "worst radio station in the world", he said - it's programmed entirely against the rules. Music you've never heard before, chosen by the audience, and very few big hits - yet it works fantastically well, merged together with a set of podcasts and a great-looking website. CBC Radio 3 allows bands to upload their favourite songs, and then gives them a player for their own websites... a neat idea, giving bands a good incentive to take part (and covering their bandwidth costs). The radio station plays music which isn't owned by a record company, so the programmes are also fully available as a nicely chapterised podcast, too. Users can register and be given recommendations, program their own playlists (some of which get on-air, from what I could tell), and they get their own page on the website too. This is really clever, really far-reaching stuff, and I was hugely impressed at seeing it.

Two more neat things in the final session of the day ("delivering innovative services"). First Henrik Heide, editor at Denmark's DR, showed off their new personalised radio player which goes live in a few weeks. DR offers a bunch of music stations (about 15, from memory), and the idea is that you listen to those non-stop music stations... until one of your favourite programmes is on the air, in which case the non-stop music station gracefully fades out and is replaced with the live radio programme. Once your favourite programme has finished, it fades your music stream back up again. Really nicely done.

And then it was the turn of Gerhard Zienczyk, Head of International Relations for the German radio broadcaster WDR. They have a problem - they don't have all the music rights that they need to offer every radio programme on-demand. And they certainly can't supply their programmes for download onto your iPhone. So... they've found a novel way round it - they get you to record the programmes yourself. Their radio recorder application is a free download from their site, which records WDR radio services (based on the download of an EPG). It records the 128k MP3 stream, imports the resulting full programme into iTunes, and lets you get the entire thing in a DRM-free file which you can then listen to on your iPhone out and about. A clever (and visually beautiful) way around a legal licensing issue. Neatly done.

Adam Bowie from the UK's Absolute Radio kicked off the second day, talking us through the rebrand of the station. This was a similar presentation to the one Clive Dickens gave last year, but with additional information about what the station's done since. I also learnt that their management blog is the first thing their staff members see whenever they log into their computers - neat idea.

Next, a nice man talked about the online research he's done, and discussed some radio drama with a good accompanying website.

Then it was the turn of Brett Spencer from BBC Radio 5 Live. Brett showed off the visualised radio trial we ran earlier in the year; showed a nicely put together video about how the station made their Wimbledon coverage an interactive thing; and finally discussed their football player (you'll find this at bbc.co.uk/widgets).

Then, a bit of an odd thing from one of the French broadcasters - lots of exciting French drama, sound effects, odd bouncing eggs and incomprehensible things; and more mobile apps from Sweden's SR - including, yes, yet another iPhone app. They're even advertising their podcasts on Spotify, and they've paid for an entire tube train to be coloured in their SR branding too, under the idea that SR podcasts "make getting to work faster". Really nice to see people promoting podcasting, and not simply relying on the radio station and word of mouth. Incidentally, SR's iPhone app took significant time to get agreed - over six months, I believe.

Finally, Jonathan gave another witty and clever presentation. Brilliantly, he started his first slide with "Slide 1 of 2,879" on the bottom-right. He believes that there's a great future for community radio in many areas of the world, but does say, starkly, that radio will die out in Asia. "From shouting to sharing" is his nicely observed theme of how technology is changing things.

Jonathan discussed many of the radio stations that he works for in Africa. Really interesting in terms of funding, and how mobile phone SIM cards are used as currency; indeed, mobile needs to be integrated much more closely into radio broadcasters' work. He uses RFID tags to automatically record university lectures for a local radio station ("this is the card that turns the lights on"); and has a novel idea for metadata - "you must say what this lecture is about in the first three sentences otherwise you won't get paid": excellent social behaviour! He says that kids don't like broadcasters' websites - "most broadcasters' websites look finished": they'd much rather be involved more.

It was a good couple of days - some very useful new contacts, some inspiring discussions, and some neat ideas for Radio at the Edge later this year (November 9th, mark your diaries now). Particularly interesting were the additional monitors showing the @mmradio Twitter account - someone in the audience writing up interesting quotes from the speakers (though no attempt at conversation with the twitterverse).

This is an edited, rather more polite, version of a post from my own blog.
