Technology + Creativity at the 主播大秀 Feed
Technology, innovation, engineering, design, development. The home of the 主播大秀's digital services.
/blogs/internet

主播大秀 World Service: Migrating 31 million readers and an 83% improvement in page performance
Wed, 25 Nov 2020 10:47:40 +0000
/blogs/internet/entries/a98f6952-4051-4b8f-8d27-35b3db69a839
Chris Hinds

The 主播大秀 World Service publishes news stories in over 40 languages globally. Stories are written by journalists around the world in their native language instead of using translations. World Service covers everything from local to global news and content is delivered in multiple formats, including text, video and audio.

As mentioned in Moving 主播大秀 Online to the cloud, the frontend and many of the backend services powering the World Service websites were previously written mostly in PHP and hosted in 主播大秀-owned data centres. Over the last couple of years, teams within 主播大秀 Online have been working tirelessly to migrate their services to the cloud, and the 主播大秀 World Service has nearly completed this transition.

Over the past 12 months we’ve migrated our pages, which are spread across 41 discrete sites, from a legacy PHP monolith to a new React-based application. This application is called Simorgh, an open source, isomorphic single page application developed by the World Service languages team.

What do we mean by Single page application and Isomorphic?

Single Page Application (SPA)

A single page application (SPA) is a web application that works solely in the browser, removing the need to refresh or reload the page. This creates an outstanding user experience that feels close to that of a native mobile application. Some common services you may use on a daily basis make use of this technology, including Gmail, GitHub, Facebook and Google Maps.

Isomorphic

An isomorphic (sometimes referred to as “universal”) app is a web app that can run on both the server and the client. The idea is that the first request to a web page is rendered on the web server, which delivers server-side-rendered HTML to the reader’s browser. Once the rendered page reaches the client and the JavaScript has been downloaded and parsed, the browser is able to take control and treat subsequent page views as a single page application. In React this is handled by the hydrate function, which “hydrates” the client-side DOM with the data that was used to render the page on the server. In most cases the reader does not notice this phase, because React compares the server-side render with the client-side render and for the most part these are identical. From this point on, however, React is in control of page rendering in the browser.
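The server-render-then-hydrate flow described above can be sketched without React. This is a simplified illustration rather than Simorgh's code: `ArticleData`, `renderArticle`, `serverRespond` and `hydrateOnClient` are hypothetical names, and a plain string comparison stands in for the work React does when it attaches to server-rendered markup.

```typescript
// Hypothetical data shape for a page; not taken from Simorgh.
interface ArticleData {
  headline: string;
  paragraphs: string[];
}

// Shared ("isomorphic") render function: the same code runs on server and client.
// HTML escaping is omitted to keep the sketch short.
function renderArticle(data: ArticleData): string {
  const body = data.paragraphs.map((p) => `<p>${p}</p>`).join("");
  return `<article><h1>${data.headline}</h1>${body}</article>`;
}

// Server: render the HTML and embed the data that produced it,
// so the client can reuse exactly the same input.
function serverRespond(data: ArticleData): { html: string; embeddedData: string } {
  return { html: renderArticle(data), embeddedData: JSON.stringify(data) };
}

// Client: re-render from the embedded data and check that the result
// matches the markup the server sent.
function hydrateOnClient(serverHtml: string, embeddedData: string): { matches: boolean } {
  const data = JSON.parse(embeddedData) as ArticleData;
  return { matches: renderArticle(data) === serverHtml };
}

const response = serverRespond({ headline: "Hello", paragraphs: ["First", "Second"] });
const hydration = hydrateOnClient(response.html, response.embeddedData);
```

Because both sides call the same render function with the same data, `hydration.matches` is true here, which mirrors why readers rarely notice the hand-over from server to client.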

Simorgh (The React SPA built by the 主播大秀 World Service)

Simorgh is the rendering platform built by the 主播大秀 World Service web team using the technologies described above. What made Simorgh challenging to build wasn’t the technology we used, but the specific requirements of 主播大秀 World Service. When building Simorgh to replace our dated PHP solution we had to bear in mind the following:

  • Performance: The websites must be as performant as possible. Many of our readers use lower-end smartphones or feature phones on networks with low bandwidth, high data costs and patchy coverage.
  • Accessibility: The 主播大秀 aims to provide a fully accessible web platform, so that anyone can access our websites using any assistive technology.
  • Support for multiple languages: 主播大秀 World Service currently supports 41 different languages. Each language site has its own editorial team, and from the outside these are seen as 41 separate websites.
  • Huge volumes of traffic: The World Service currently serves 31m weekly readers. Despite sitting behind many different caching and routing layers, Simorgh renders on average 1 million unique pages per day, with an average of 11 million daily renders across the 41 languages.
  • First-class AMP support: We offer AMP variants of all supported pages. This allows us to move away from the previously separate AMP rendering system, which was built on an internal Ruby-based framework for static rendering.

We are planning to post a dedicated writeup on the history of Simorgh and the technologies chosen in the near future so keep an eye out for that.

So where are we today?

Simorgh currently supports twelve different page types:

These pages may appear visually similar, but our internal content management system treats them as different page types. One of the biggest advantages we have seen by rebuilding our platform is the ability to reuse code as much as possible. Many of these pages share the same code and React components. These components are located in our open source React component library (Storybook).

We knew that we wanted to focus on improving web performance for the World Service, but this was difficult with the previous PHP platform. So one of the main goals of the migration was to build a platform that would enable rapid prototyping, allowing us to make changes quickly and shorten the feedback loop on those pages.

Web performance wasn’t the number one goal when we created Simorgh, but as we followed best practice in developing the new platform, we did see vast improvements compared to the old one. Some of this can be attributed to early design decisions such as no blocking JS, minimal layout shift and server-side rendering.

We released pages in batches, grouped by language, and the first language we released onto the new platform saw huge performance gains in many areas:

  • Lighthouse performance score saw a 224% increase, from 24 to 94
  • Lighthouse best practice score saw a 27% increase, from 79 to 100
  • Total number of requests dropped by 85%, from 112 to 17
  • Blocking JS requests dropped by 100%, from 9 to 0
  • JS requests dropped by 79%
  • Total page weight is now 60% smaller than before
  • JS size dropped by 61%
  • DOM Content Loaded is 85% faster, at just 0.4s, down from 2.6s
  • Visually complete time dropped by 62%, to just 1.8s from the previous 4.7s

These performance metrics were captured using SpeedCurve, comparing the same URL on the old platform and the new.
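The percentages in the list can be checked with the usual relative-change formula. The sketch below is only an arithmetic check on the before and after values quoted above; `percentChange` is a name introduced here. Most figures reproduce to within rounding. The exception is the Lighthouse performance score: 24 to 94 works out to roughly 292% by this formula, so the quoted 224% may rest on a different baseline.

```typescript
// Relative change from an old value to a new value, as a percentage.
// Negative results are reductions, positive results are increases.
function percentChange(before: number, after: number): number {
  return ((after - before) / before) * 100;
}

const requestCount = percentChange(112, 17);      // about -84.8
const domContentLoaded = percentChange(2.6, 0.4); // about -84.6
const visuallyComplete = percentChange(4.7, 1.8); // about -61.7
const bestPracticeScore = percentChange(79, 100); // about +26.6
const performanceScore = percentChange(24, 94);   // about +291.7
```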

So as you can see, we have already made a great improvement in our frontend web performance, but we won’t stop here. A large proportion of the 主播大秀 World Service audience are on slower 2G and 3G networks and use lower-end, budget-friendly Android handsets or feature phones. In some of our supported regions network coverage is patchy at best, and some readers may only have network access whilst travelling to work or whilst at work. We must continue to make improvements in every way we can to make our pages some of the most accessible web pages in the news category, both in terms of accessibility requirements and performance.

This video demonstrates the performance improvement between the old and new platforms.

Post migration improvements

Since the migration we have already released a number of new features that aim to improve performance. Perhaps the most notable was the lazy loading of social embeds (Tweets, Instagram posts and YouTube videos).

Social embeds are often a key part of telling a story. We have found that many of our journalists add a number of social embeds to each page. For instance, one language always embeds 2–3 YouTube videos at the bottom of each story. When looking into the performance metrics for these pages we noticed that upwards of 500KB of JS (more than the entire Simorgh application) was being loaded by YouTube, and that some of this JS was actually blocking the rendering of our page while it was being parsed. In one extreme example the Time to Interactive (TTI) was 12s.

This content had to be on the page as it was part of the onward journeys experience. However, not every reader would scroll down the page to where these embeds were rendered, so they shouldn't have to download the extra JavaScript or use the extra data allowance when they may never interact with these social embeds.

The Solution?

Lazy loading of third-party content. We already do this with any images that are outside the viewport, so why not for social embeds? A quick pull request later and we were lazy loading social embeds: no new library, no JS size increase, just an already existing feature of the platform. Soon after releasing we saw a wide variety of results, as these depended on where the social embeds were in the story and how many a given story had.
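The core decision behind lazy loading can be sketched as a pure function. The post does not say which mechanism Simorgh uses, so everything here is an assumption for illustration: `EmbedPlaceholder`, `embedsToLoad` and the 200-pixel margin are hypothetical. In a browser, the same decision is commonly delegated to an `IntersectionObserver` with a root margin.

```typescript
// Hypothetical model of an embed placeholder on the page (positions in CSS pixels).
interface EmbedPlaceholder {
  id: string;
  top: number;     // distance from the top of the document
  loaded: boolean; // whether the third-party script has already been requested
}

// Return the ids of embeds that should start loading: those not yet loaded
// whose top edge is within `margin` pixels of the bottom of the viewport.
function embedsToLoad(
  embeds: EmbedPlaceholder[],
  scrollY: number,
  viewportHeight: number,
  margin = 200
): string[] {
  const threshold = scrollY + viewportHeight + margin;
  return embeds.filter((e) => !e.loaded && e.top <= threshold).map((e) => e.id);
}

const storyEmbeds: EmbedPlaceholder[] = [
  { id: "tweet-1", top: 900, loaded: false },
  { id: "youtube-1", top: 4200, loaded: false },
  { id: "youtube-2", top: 4800, loaded: false },
];

// At the top of the page only the first embed is close enough to load.
const atTop = embedsToLoad(storyEmbeds, 0, 800);
// Further down, the first YouTube embed qualifies too; the second is still too far away.
const furtherDown = embedsToLoad(storyEmbeds, 3500, 800);
```

A reader who never scrolls past the first screen would only ever trigger `tweet-1`, so the YouTube scripts at the bottom of the story are never downloaded for them.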

In most cases we saw a 10–15% improvement in TTI, as well as reducing, if not eliminating, the render-blocking time. Where I was most impressed, though, was in the story mentioned earlier: we had taken the TTI from 12s down to 6s. 6s is still a long time; however, this was a story with many different social embeds and in some ways a worst-case scenario. In any case, a 50% improvement from just a few lines of code is phenomenal. This kind of change would not have been possible, at least not so quickly, on the previous platform.

How are we monitoring web performance?

Now that the migration is complete, we are in a position to start making more improvements to web performance and changes to the platform. Before we can make many meaningful improvements to the application we need to be able to monitor web performance.

There are two common ways of monitoring web performance:

Synthetic Testing

Synthetic testing is great for catching regressions during the development lifecycle. We use Lighthouse, SpeedCurve and WebPageTest to measure our web page performance.

RUM (Real User Monitoring)

RUM testing is a method of capturing performance metrics from our users. RUM is generally more expensive in comparison to synthetic testing, however it provides a vital look into how real users are experiencing our site.

We use a combination of synthetic testing and RUM for Simorgh. During development, Lighthouse runs on every pull request and feature branch. Lighthouse tests a subset of pages and for the most part looks at the Accessibility, PWA and Best Practices audits.

Lighthouse is also used in our continuous delivery pipeline. After we deploy to the test environment, we run Lighthouse against the environment and can choose to fail the build if the audits fail. This same test will then also run against the live environment once the deployment is complete.
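A pipeline gate of this kind boils down to comparing category scores against minimums. The sketch below assumes scores on the 0 to 1 scale that Lighthouse uses in its JSON report; the thresholds in `minimumScores` and the function name `failedAudits` are hypothetical, as the post does not state the values used.

```typescript
// Lighthouse reports category scores on a 0 to 1 scale.
type CategoryScores = Record<string, number>;

// Hypothetical minimum scores; the real thresholds are not given in the post.
const minimumScores: CategoryScores = {
  accessibility: 1.0,
  "best-practices": 0.9,
  pwa: 0.5,
};

// Return one message per category that is missing or below its minimum.
function failedAudits(scores: CategoryScores, minimums: CategoryScores): string[] {
  const failures: string[] = [];
  for (const category of Object.keys(minimums)) {
    const required = minimums[category] ?? 0;
    const actual = scores[category] ?? 0; // a missing category counts as 0
    if (actual < required) {
      failures.push(`${category}: ${actual} is below ${required}`);
    }
  }
  return failures;
}

const failures = failedAudits(
  { accessibility: 1, "best-practices": 0.86, pwa: 0.7 },
  minimumScores
);
// A pipeline step could fail the build whenever this is true.
const buildShouldFail = failures.length > 0;
```

Running the same check against the test environment and then against live, as described above, means the thresholds live in one place.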

SpeedCurve runs daily tests against a smaller subset of URLs. SpeedCurve is a tool that essentially wraps around WebPageTest and Lighthouse, providing a fantastic UI on top of those underlying tools. These tests give us an insight into performance of our pages from different regions around the world.

Core Web Vitals

A recent initiative from Google is the Core Web Vitals. The idea behind these metrics is that they are a way to measure and monitor the user experience of your site. Google collects the metrics itself from popular sites and publishes them as the CrUX dataset (Chrome User Experience Report). The Web Vitals metrics include Time to First Byte, First Input Delay and Cumulative Layout Shift.

Through a new package in our component library we are now able to collect these same metrics ourselves if the user has opted into performance tracking via the 主播大秀 cookies settings page.
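The opt-in rule can be sketched as a small gate in front of whatever sends the metrics. This is not the package mentioned above: `buildReport`, `VitalReport` and the `hasOptedIn` flag are hypothetical names. The rating thresholds are the ones Google publishes for Largest Contentful Paint, First Input Delay and Cumulative Layout Shift.

```typescript
type MetricName = "LCP" | "FID" | "CLS";
type Rating = "good" | "needs improvement" | "poor";

// Google's published thresholds: [upper bound for "good", upper bound for "needs improvement"].
// LCP and FID are in milliseconds; CLS is unitless.
const thresholds: Record<MetricName, [number, number]> = {
  LCP: [2500, 4000],
  FID: [100, 300],
  CLS: [0.1, 0.25],
};

function rate(metric: MetricName, value: number): Rating {
  const [good, needsImprovement] = thresholds[metric];
  if (value <= good) return "good";
  if (value <= needsImprovement) return "needs improvement";
  return "poor";
}

interface VitalReport {
  metric: MetricName;
  value: number;
  rating: Rating;
}

// Only build a report when the reader has opted in to performance tracking.
// `hasOptedIn` is a hypothetical flag standing in for the cookie-settings check.
function buildReport(hasOptedIn: boolean, metric: MetricName, value: number): VitalReport | null {
  if (!hasOptedIn) return null;
  return { metric, value, rating: rate(metric, value) };
}

const sent = buildReport(true, "LCP", 3100);    // rated "needs improvement"
const skipped = buildReport(false, "CLS", 0.3); // null: nothing is collected
```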

Compared with procuring a third-party tool, this is a cost-effective way for us to collect real user metrics. RUM is very important for us in the 主播大秀 World Service, as our users are situated all around the world, use different devices with different capabilities and connect over a wide variety of networks. Getting this sort of test coverage with synthetic testing alone would be impossible. This new data will allow us to start making informed production decisions about where we need to improve the web pages, directly affecting our readers’ experience.

Screenshot of our custom RUM solution reporting on Web Vitals

We hope to publish a dedicated post in the near future about how we collect and use Web Vitals.

Conclusion

It’s been a busy period for many teams at the 主播大秀 this year but we are seeing the light at the end of the tunnel. The World Service migration has been a great success thus far. We have migrated to a modern platform that is open source, and faster than ever both in terms of product/feature iteration and web performance.

Our journey has only just begun. Simorgh represents a new beginning for the 主播大秀 World Service, and we will continue to improve the performance and accessibility of our news web pages for our global audiences.

An international update on 主播大秀 Sounds and 主播大秀 iPlayer Radio
Tue, 22 Sep 2020 08:49:50 +0000
/blogs/internet/entries/166dfcba-54ec-4a44-b550-385c2076b36b
Lloyd Shepherd

Here in the UK, 主播大秀 Sounds is now one of the most popular audio apps available, and just had its best quarter ever. We’re seeing around 3.5 million people a week listening to the 主播大秀 on Sounds across the app, website, connected TVs and voice-activated devices – more than we’ve ever seen before.

There are also millions of people who watch, listen and read 主播大秀 content outside the UK – we recently announced the 主播大秀 had reached an all-time record global audience of 468 million. Until now, 主播大秀 Sounds has only been available for those international listeners through the website, and people who’ve wanted to listen to the 主播大秀 through an app have had to use the old international version of the iPlayer Radio app. Now though, the time has come to turn off the old international iPlayer Radio app and transition those users over to 主播大秀 Sounds.

From today, the 主播大秀 Sounds app will begin to be made available to download on international app stores, available for devices capable of running iOS 11, Android 5 or Amazon OS 5 or above. People who currently use the iPlayer Radio app outside the UK will now begin seeing messages encouraging them to download the 主播大秀 Sounds app and switch over. At some point in the near future, the iPlayer Radio app will stop working, and listeners will need to get the 主播大秀 Sounds app to continue listening.

This change won’t affect the content international listeners can enjoy. They’ll still be able to listen to all their favourite 主播大秀 programmes, and the wide range of 主播大秀 radio stations will be available to stream live, including the World Service. Popular radio programmes like The Archers, Desert Island Discs, and Radio 5’s Elis James and John Robins show will be available to listen to on-demand, as well as the wide range of hit podcasts from the 主播大秀, including Grounded with Louis Theroux, That Peter Crouch Podcast and George the Poet’s fantastic Have You Heard George’s Podcast?

主播大秀 Sounds will now also offer international listeners a personalised experience they didn’t get with iPlayer Radio, receiving tailored recommendations based on what they’ve been enjoying and the ability to pick up where they left off with their favourite shows. International listeners can also stream 主播大秀 Sounds to their Chromecast device, and enjoy the improved experience for Android Auto and Apple CarPlay, which offers a larger choice of content from the dashboard than the old iPlayer Radio app did.

These changes won’t impact people within the UK, and we’re excited that international listeners will now be able to get the much-improved experience 主播大秀 Sounds offers.

The 主播大秀 Sounds TV app

Here in the UK, we’re bringing the 主播大秀 Sounds TV app to more devices. Since first launching in March on Virgin Media and YouView, the Sounds TV app has also come to some Sony and Samsung TVs – giving people with those devices another way to listen to their favourite music, radio and podcasts from the 主播大秀. From today, we’ll begin rolling out the TV app on Roku streaming devices, and we’ll be bringing it to more devices in the coming weeks.

We hope you enjoy this update, and that our international listeners enjoy the new Sounds experience. We’ll be bringing more updates in the future to make Sounds even better, and we’ll keep you posted.

Update: Since we last told you about the 主播大秀 Sounds on TV app, we’ve made some updates to the app and launched it on several new devices. The Sounds TV app now has search functionality, and viewers can now access their personalised My Sounds section on the TV app experience as well as on the mobile and web versions of Sounds.

We’ve also launched the 主播大秀 Sounds TV app on a number of new devices, including Now TV, Roku, Freesat, LG, Freeview Play, and Google TV, and as of today we’re bringing it to all Amazon Fire TV devices too, including sticks and smart TVs. Anyone with those devices will be able to find and add the app, and enjoy all of our best live and on-demand radio, music mixes, and fantastic podcasts like the new series of Grounded with Louis Theroux, Bex Smith’s women’s football podcast The Players and iconic rapper Eve’s new podcast Constantly Evolving.

Update #2: The 主播大秀 Sounds TV app is now also available on Sky Q. You can find the app on the Sky Q screen, or just say “Launch 主播大秀 Sounds” into your Sky Q voice remote, and easily start listening to your favourite music, radio and podcasts from the 主播大秀.

Leveraging the Tor Network to circumvent blocking of 主播大秀 News content
Wed, 30 Oct 2019 08:18:33 +0000
/blogs/internet/entries/936e460a-03b3-41db-be96-a6f2f27934e6
Abdallah al-Salmi

主播大秀 News and Tor logos

The 主播大秀 World Service's news content became available on the Tor network last week in a move that attracted wide media attention.

The decision to go ahead with setting this service up came at a time when 主播大秀 News is either blocked or restricted in several parts of the world.

For example, in Egypt, Iran and China, our audiences are finding it either impossible or difficult to access our content without the use of a circumvention tool, such as a VPN.

The Tor network is an overlay network on the internet, which provides increased security and is resistant to blocking.

The 主播大秀 is not the first leading organisation to have a direct presence on the Tor network. Facebook, for one, is already there: the implementation of the social media platform on the network was built by Facebook engineer Alec Muffett, who later left Facebook and subsequently assisted others in setting up their own.

As a result of his experiences, Alec created the Enterprise Onion Toolkit (EOTK), which makes it easier for any organisation to set themselves up on the Tor Network.

With help from the 主播大秀 Online Technology Group, Alec prototyped a solution based on the EOTK for the 主播大秀 World Service. The 主播大秀 has an unusually complex domain name configuration, and the prototype proved that the EOTK could handle this complexity well.

The implementation for the 主播大秀 was carried out by the Open Technology Fund (OTF), and Alec continues to be a key contributor. The OTF is one of the leading Internet freedom organisations in the world, and has gained prominence through funding and vetting numerous information security and internet freedom projects.

Why an Onion service?

From a technical standpoint, the Tor network is a subset of the internet we know and use every day, and is accessed by users using a modified browser. The key feature of the Tor network is that it is fully encrypted. That’s to say, it hides the location of users, and the protocol it uses is continuously updated to maintain resistance to blocking.

This explains why it is a strong solution for the problem of internet censorship and secure communications, and why it is used by people who live in muzzled media environments.

Users can already access the 主播大秀 on the Tor Browser to circumvent blocking. The user’s connection enters the Tor Network in one country, runs through at least three servers, then exits the Tor Network to the 主播大秀 website from another country. While successful in circumventing blocking, this route is exposed to censors who might monitor, or even tamper with, activity on the exit server, where traffic is no longer protected by Tor’s encryption.

An alternative, called an Onion service, uses the Tor Network’s own address scheme where domain names end with “.onion”. In this case, traffic is directed to a dedicated node on the Tor Network for that service.

This allows the traffic between the Tor Network and the content provider to travel a trustworthy path. This also removes the risk associated with exit nodes.

An additional benefit is that the routing within the Tor Network is simplified when using an Onion service. This provides a much higher performance, which is especially noticeable when watching video.

The Tor Browser is available for Windows, Mac and Linux computers, as well as Android phones. Alternative browsers, such as Brave or the Onion Browser (for iPhones), can also be used. These browsers can be used for both .onion and classic URLs.

The 主播大秀 homepage, with a URL accessed through the Tor network.

Is there different 主播大秀 content on the Tor network?

The 主播大秀 content on the Tor network is not different from that which is accessible to our international audiences under normal conditions.

The experience is similar to being in Ireland or the East Coast in the USA for example. Users will be able to access World Service radio, TV and websites in over 40 languages, as well as the news in English.

Content which is not available internationally, such as 主播大秀 iPlayer, will continue to be unavailable on the 主播大秀 Onion service. Users within the UK appear as international users when they use the 主播大秀 Onion service.

Technical risks

An aspect of setting up an Onion service for the 主播大秀 was the question of whether technical 主播大秀 assets will be placed on the Tor Network or whether the Onion service needs to be technically trusted by the 主播大秀 in any way.

Onion services served over HTTPS require their own server certificates, and the certificate for the 主播大秀 Onion domain is separate from other 主播大秀 certificates. This allows users to trust that they are actually reaching 主播大秀 content.

The Onion service has to rewrite all of the URLs in order to make the 主播大秀 site work inside the Tor Network. It is therefore essential that the Onion service is operated securely and by a trusted team.
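The URL rewriting step can be illustrated with a small hostname substitution. The post does not describe how the EOTK implements this, so the sketch below only shows the idea: `hostMap`, `rewriteUrls` and both onion hostnames are hypothetical placeholders, not real service addresses.

```typescript
// Hypothetical mapping from regular hostnames to onion hostnames.
// The onion names are placeholders, not real service addresses.
const hostMap: Record<string, string> = {
  "www.bbc.com": "www.bbcexample.onion",
  "static.example.org": "static.bbcexample.onion",
};

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Rewrite every https://<mapped host> reference in an HTML fragment so that
// links and assets keep the reader inside the Tor Network.
function rewriteUrls(html: string, hosts: Record<string, string>): string {
  let output = html;
  for (const host of Object.keys(hosts)) {
    const onionHost = hosts[host] ?? host;
    // The lookahead stops "www.bbc.com" from matching inside a longer hostname.
    const pattern = new RegExp(`https://${escapeRegExp(host)}(?=[/"'\\s]|$)`, "g");
    output = output.replace(pattern, `https://${onionHost}`);
  }
  return output;
}

const original = '<a href="https://www.bbc.com/news">News</a> <img src="https://static.example.org/a.png">';
const rewritten = rewriteUrls(original, hostMap);
```

Because every page passes through a step like this, whoever operates it can alter what readers see, which is why the service has to be run securely and by a trusted team.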

The work done by the EOTK platform does not involve placing any 主播大秀 assets on the Tor network itself. Neither does it need to be provisioned with any passwords or certificates to access 主播大秀 systems. To the 主播大秀, it appears like a normal group of international users.

Content on the Tor network is therefore proxied through the Onion service and there is no additional web hosting commitment.

The 主播大秀's duty of care

Some countries, such as Russia, China and the UAE, have passed laws to regulate the sale and distribution of tools such as VPNs.

For example, the UAE prohibits the use of VPNs to access illegal content. However, 主播大秀 content is not illegal in the UAE.

The promotion of the Onion site by the different 主播大秀 services will include clear warnings that users should be aware of their legal environments and should not use it if it might put them or those close to them under any risk or danger.

Information controls then and now

Controls placed by governments on access to information and trusted news are not new at all.

During the Cold War, some governments used to jam the shortwave radio broadcasts of the 主播大秀 World Service to stop their populations from listening to 主播大秀. Then, the 主播大秀 circumvented these measures by providing new frequencies or changing frequency values to confuse jammers.

These controls are now moving on to the internet. At a time when online information controls are growing, the 主播大秀 World Service continues to pursue its mission by providing an additional online news presence on the Tor Network.

can be accessed at  (Link updated October 2021)

Shifting gear with the World Service
Mon, 26 Mar 2018 15:16:48 +0000
/blogs/internet/entries/f1181b9d-77e8-485a-bad1-5daa394517be
Robin Pembrooke

Just over a year ago I took this photo of Fran Unsworth and Jonathan Chapman standing in the middle of an empty concrete shell of an office floor in Nairobi, where we had just signed a lease for our new bureau in Kenya. It was exciting to think of what could be done with the space, albeit with a daunting level of work needed in Kenya, Nigeria and India to realise the potential of the investment in the World Service through the World 2020 programme.

Our new Nairobi bureau as it was this time last year

A year of building

A year on, thanks to the hard work of multiple teams across the 主播大秀, we have three brand new bureaus in Delhi, Lagos and Nairobi staffed with hundreds of new journalists broadcasting and publishing in 12 new languages across Africa and Asia. We have also updated or provisioned new facilities in another 50 sites overseas as part of the 2020 expansion programme. It means the 主播大秀 is now broadcasting and publishing in over 40 languages ranging from Pidgin to Korean and our new bureau in Lagos will be launched next week.

I’ve been fortunate to visit three of our new sites in the last 12 months and meet some of the incredible people running those new services. These are really diverse teams, as we publish in 3 or 4 different regional languages from each office.

Part of the Delhi editorial team

In Delhi I watched a brand new team of journalists make news bulletins in Hindi, Telugu, and Tamil one after another; something they do every night. An amazingly complex task, especially as the teams had only been working together at the 主播大秀 for 6 months.

The News bulletin production team in Delhi working in Hindi, Telugu, and Tamil

There are very few organisations anywhere in the world capable of mobilising this level of broadcast and digital infrastructure, let alone the scale of quality journalism in so many different languages. It is one of the things that makes working in technology in the 主播大秀 uniquely interesting and rewarding.

Whilst we’re proud of what we’ve achieved this year, we’re only just getting started.

Although we’ve launched the 12 new websites, created countless programmes in 3 new major bureaus, and hired hundreds of journalists, we’ve only just begun to transform how the World Service works. The bulk of our audience remains on TV and Radio; to be sustainable we have to deliver engaging digital services that local audiences find relevant and valuable to their lives.

Adjusting our approach

We have a big goal to achieve: the 主播大秀 as a whole is aiming to reach 500M weekly users by 2022, and World Service expansion is one of the drivers of growth that we hope will bring an additional 80M users. To get there we’re changing our approach to building digital products and our ways of working.

The launch of 13 new language sites was the starting point of our digital plans for the World Service

In the last year, we had to constantly manage risk to achieve the deadlines we’ve hit. To meet tight delivery schedules for the new websites and building fit-out, we often had to rein in the scope of our ambition for our digital products at initial launch. Similarly, we had to be 100% sure that the studios and production systems would work from day one, and so we deployed existing combinations of production tools, albeit now virtualised and IP capable, rather than experimenting with completely new technologies.

We’re now turning our focus to a more agile, data and performance driven phase of delivery.

In the near term that will focus on speed of digital performance and user engagement:

  • our content doesn’t load quickly enough on mobile; we want to ensure all our sites load as quickly as possible in all our WS markets, even on slower mobile connections.
  • we also have a lot of users who come to us via search and social, read one article and then leave again. We’re experimenting with ways to increase the number of users who click on one more thing to read or watch.

World Service Dev Team together with some of the developers from our partner Andela

In parallel with the core delivery our News Labs team have been able to experiment with new tools that leverage the latest in language technology. One of those tools, called “Stitch”, started as a rapid prototype created in a weekend by a tech-savvy journalist. The News Labs team scaled it rapidly to a point where it is used to create over 1,000 videos a week. It’s helped us publish 4 times as much digital video a week as we did before the World 2020 programme started.

With our journalists, we’ll continue rolling out improvements to the tools and dashboards they use, letting them connect better with each other so they can better connect with audiences.

We’ll start blogging soon about some of the new prototypes of technology we’re using to transform how we plan, gather, edit and publish the news across the World Service and the rest of 主播大秀 News.

If you are interested in joining the work we are doing to expand and transform the World Service do get in touch. We are looking for fantastic front end and frameworks developers and data scientists to

World Service expansion continues with new language websites launch
Mon, 25 Sep 2017 13:47:17 +0000
/blogs/internet/entries/85535a67-ebcd-464a-b320-fb03fa92a75f
Neil Doughty

Last week 主播大秀 News launched three 主播大秀 News websites, with Tigrinya, Amharic and Afaan Oromoo becoming the 31st, 32nd and 33rd languages to join the World Service. Neil Doughty is the digital Executive Product Owner for the W2020 World Service expansion project, and explains some of the editorial and technical considerations behind the new websites.

The three launches are part of phase two of the World 2020 programme’s digital launches, alongside 主播大秀 News in Pidgin, which launched on August 21st, and following the launch of 主播大秀 News in Thai in November 2016.

Our full list of World 2020 languages is: Thai, Pidgin, Amharic, Tigrinya, Afaan Oromoo, Korean, Marathi, Telugu, Punjabi, Gujarati, Igbo, Yoruba.

The three new services will provide people in Ethiopia and Eritrea - often called the Horn of Africa - and diaspora from those countries with fair, impartial news content. These sites are what the 主播大秀 World Service refers to as ‘services of need’ rather than ‘services of want’.

Access to news for people in the Horn of Africa is not easy, and when there is news available it’s hard to understand how much to trust it. These new services will help people in Ethiopia and Eritrea know more about what's going on inside their country, and the rest of the world.

I'm the Executive Product Owner for the digital part of the programme, which means making sure the sites are ready to launch on time is my responsibility.

In many ways the sites are similar to the existing World Service sites (including Thai and Pidgin), but they do have some new features.

The new W2020 sites are the first from 主播大秀 News to be served over HTTPS by default, which we consider to be an important move for a number of reasons.

  • It makes our users more secure, and as Ethiopia and Eritrea are ranked in the bottom 30 (of 180) countries in the world for press freedom, this is important.
  • HTTPS means it’s much harder for anyone to know the specific pages a user is looking at; the hostname - www.bbc.com - can be detected, but not which specific World Service site or page within that site.
  • HTTPS makes it harder for a man-in-the-middle attack to take place. In such an attack, entities can intercept and alter data being passed between two systems - often a user and the website they’re accessing. Serving our sites over HTTPS means users can be confident that the content they are consuming is what the 主播大秀 published and hasn’t been altered by anyone else before they accessed it.
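The second point in the list above, that an observer can see the hostname but not the page, can be made concrete with a small sketch. `visibleToObserver` and the example path are hypothetical, and the model is simplified: it assumes the hostname leaks (for example through DNS lookups or the TLS handshake) while everything after it is encrypted under HTTPS.

```typescript
interface ObserverView {
  hostname: string;
  path: string | null; // null when the path is hidden from the observer
}

// Split a URL into the parts a network observer could read.
function visibleToObserver(url: string): ObserverView {
  const match = /^(https?):\/\/([^\/?#]+)([^#]*)/.exec(url);
  if (match === null) {
    throw new Error(`Not an http(s) URL: ${url}`);
  }
  const scheme = match[1] ?? "";
  const hostname = match[2] ?? "";
  const pathAndQuery = match[3] ?? "";
  // Over plain HTTP the whole request is readable; over HTTPS only the hostname is.
  const path = scheme === "http" ? (pathAndQuery || "/") : null;
  return { hostname, path };
}

// The path "/amharic/news-123" is a made-up example.
const overHttps = visibleToObserver("https://www.bbc.com/amharic/news-123");
const overHttp = visibleToObserver("http://www.bbc.com/amharic/news-123");
```

Here `overHttps.path` is `null` while `overHttp.path` is `"/amharic/news-123"`. An observer who cannot see the path can only block by hostname.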

Sites on HTTPS have the potential to be faster for users. Sites served over HTTPS can make use of something called HTTP/2, which allows page content or assets to be delivered to users’ devices in a more efficient way. The 主播大秀 as an organisation is already doing work to implement HTTP/2, so launching sites over HTTPS means some of the resources needed to construct a web page are being served over HTTP/2.

Also, these new services are launching in the Horn of Africa, where a combination of poor connectivity and data costs can cause problems. HTTPS allows the 主播大秀 to start thinking about Progressive Web Apps (PWAs), which may help mitigate some of these issues in future.

Google has said that sites which are served over HTTPS will gain a boost in its search algorithm, so as services for new users, that’s a no-brainer.

Later this year we expect the Chrome browser (the most-used among World Service audiences) to use the words Not Secure if a site allows a user to enter data (a search box) and is on HTTP. These services are new, and we want people to understand they can trust them.

So, if there are all these great reasons, why isn't all of 主播大秀 News and World Service jumping straight to HTTPS?

In short, existing sites, particularly in the UK, have a lot of dependencies outside of News, such as media players and legacy content formats and those are tough things to work through.

These new sites don't have all of those elements, and with no legacy content marooned on HTTP to concern ourselves with, launching on HTTPS is less complex.

That said, not all is rosy in the HTTPS garden…

When a site is served over HTTPS a single page within that site cannot be blocked. If a foreign government wants to block a single controversial page, they will not be able to block that page on its own - they will have to restrict access to the entire hostname. In this case that would be www.bbc.com.

Now, some people may argue that offering a site over HTTP, which therefore isn’t blocked, is better for users as they have access to some content, but that raises the possibility that the content they are seeing has been altered in some way and they might be unaware this has happened.

Ultimately knowing that they can trust content from the 主播大秀 and knowing they can do so without anyone being able to find out which 主播大秀 pages they’ve been looking at is probably of more value.

Come and help transform the 主播大秀 World Service Tue, 03 Jan 2017 08:01:00 +0000 /blogs/internet/entries/76fad89b-2bca-45af-9a4a-349ae6b40c26 /blogs/internet/entries/76fad89b-2bca-45af-9a4a-349ae6b40c26 Robin Pembrooke Robin Pembrooke

If you’re looking for a new challenge in 2017, there are a number of fantastic new opportunities to join the team working to digitally transform the 主播大秀 World Service. 

The World Service is an extraordinary and unique organisation; it is the world’s largest international broadcaster, reaching over 320 million people every week on Radio, TV and Online in .

In November we announced the - which presents a huge, yet complex, technical challenge to launch new broadcast and digital services in a further 11 languages. The programme is called World 2020; we will be establishing new broadcast infrastructure, and updating our news gathering and production processes as more journalists are hired in Africa, Asia, Russia and the Middle East.

As the service expands we are looking to digitally transform how we operate and deliver content to users. There is a massive shift in consumption of news to mobile phones, and this means we need to create and publish our content in new ways, both on 主播大秀 websites and apps, as well as social media platforms that are being used by audiences in the countries in which we are expanding.

We already have a fantastic team of designers, developers, product managers and testers who have expertise in deploying digital products responsively in 27 languages, but we are looking to expand the team to allow us to meet the new challenges of the World 2020 programme.

The existing 主播大秀 Arabic responsive website

New projects will include scaling our publishing solutions for messenger based platforms, and taking some of the recent innovations prototyped by into full production.

If you have a passion for news, for languages, technology and digital innovation, and a belief in the impact we can have together through the World Service, we would love to hear from you. We have vacancies in the following roles:

-
-
-
- Testers
- UX Designers
- Business Analysts

The roles are based mainly in London in New Broadcasting House, co-located with many of the journalists creating content and driving the editorial innovations that are part of the World 2020 programme.

We will be running a first round of interviews in January, so do get in touch with your CV if you would like to be considered for one of the roles. Send emails to Laura.Rowley@bbc.co.uk.

This is a great opportunity to join a team that is embarked on a unique global programme of change. We look forward to hearing from you.

A virtual voice-over tool for multilingual journalists Thu, 21 Apr 2016 12:59:00 +0000 /blogs/internet/entries/159cf8a2-faba-427f-9cbc-a09f3d04449a /blogs/internet/entries/159cf8a2-faba-427f-9cbc-a09f3d04449a 主播大秀 News Labs 主播大秀 News Labs

Computer assisted translation and voice synthesis

 is driven by the latest language technology: text-to-speech voice synthesis and computer assisted translation. This joint effort between 主播大秀 News Labs and 主播大秀 Digital Development has produced an innovative tool called ALTO which assists multilingual journalists in reversioning news video content. The pilot service, which offers experimental video clips, is now available on and .

ALTO combines a number of cutting edge language technologies to allow a single language journalist to generate multilingual voice-overs for a video story and script. The script is first pre-translated using machine translation (think Google Translate), the results of which are post-edited by the language journalist. This process is generally referred to as computer-assisted translation. Post-editing is not only necessary for linguistic reasons, but mainly because of the 主播大秀’s editorial requirements which don’t leave any room for unedited, fully automated machine translation.

In the second step, the language journalist converts the translated script into a computer generated voice track. This is done by using off the shelf text-to-speech technology (TTS) provided via cloud services. The TTS voices are generated through unit selection synthesis – also known as concatenative speech synthesis – which makes them sound more ‘natural’. What this means is that each of the TTS voices was once a real voice of a person whose utterances were recorded and then segmented into tiny units (phones, morphemes etc.). When a new track is produced, these segments are joined into new utterances – i.e. synthesised.

Selecting a synthetic voice from the dropdown menu

Occasionally, the synthetic voices mispronounce words – mostly names of people and places (‘proper nouns’). In this case, the language journalist tweaks the spelling of the word to make the voice pronounce it correctly, or at least as correctly as possible. ALTO also uses Speech Synthesis Markup Language (SSML) to help our journalists insert pauses between words and sentences. This makes the new audio sound more intelligible.
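
As an illustration of how pauses can be expressed in markup, here is a minimal sketch that wraps sentences in SSML and adds a `<break>` after each one. The `<speak>`, `<s>` and `<break time="…"/>` elements come from the W3C SSML specification; the function name, the 400 ms default and the output layout are assumptions made for this example and do not describe how ALTO itself builds its markup. Real TTS engines may also require extra attributes on `<speak>`.

```python
from xml.sax.saxutils import escape


def with_pauses(sentences, pause_ms=400):
    """Join sentences into an SSML document with a pause after each one."""
    body = "".join(
        # escape() protects characters such as & and < inside the script text.
        f'<s>{escape(text)}</s><break time="{pause_ms}ms"/>'
        for text in sentences
    )
    return f"<speak>{body}</speak>"


# Hypothetical two-sentence script.
print(with_pauses(["Floods hit the region.", "Thousands were evacuated."]))
```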

The journalists can choose from at least 2 voices in their language. This allows them to create a voice-over with different gender voices to be as close to the original audio as possible.

In the final step, the new audio is automatically attached to the video file. First, the original audio track is stripped from the video file. Then the new audio, containing the TTS voice audio files, is stitched to the video, and finally the original audio track is re-attached to the video at a lower audio level, i.e. it is ducked automatically. The stitching process takes less than a minute for a short 30-45 second clip.
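
The ducking step can be sketched as a simple mix in which the original track is attenuated by a fixed amount before the voice-over is added on top. This is only an illustration: the function name, the -12 dB default and the representation of audio as lists of float samples are assumptions, and a production tool would operate on real audio files rather than Python lists.

```python
def duck_and_mix(original, voice_over, duck_db=-12.0):
    """Mix a voice-over on top of an attenuated ("ducked") original track.

    Both tracks are lists of float samples in the range [-1.0, 1.0].
    """
    gain = 10 ** (duck_db / 20)  # convert decibels to a linear amplitude factor
    length = max(len(original), len(voice_over))
    mixed = []
    for i in range(length):
        background = original[i] * gain if i < len(original) else 0.0
        voice = voice_over[i] if i < len(voice_over) else 0.0
        # Clamp so the sum never leaves the valid sample range.
        mixed.append(max(-1.0, min(1.0, background + voice)))
    return mixed
```

A fixed gain like this is the simplest form of ducking; it takes no account of how loud the original track is at any given moment.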

We are now in the process of developing a more flexible, dynamic auto-ducking of the original audio track. This is to accommodate the fact that the audio tracks vary in their dynamic range and, ideally, should be fine-tuned when mixed with different TTS voices.

For more information, you can also watch a video explaining the 'virtual voice-over' news service . 

Language technology innovation: reversioning international videos with virtual voice-overs Mon, 21 Dec 2015 09:30:00 +0000 /blogs/internet/entries/d6e20681-9c82-4b92-8dea-1333a7ddf3d2 /blogs/internet/entries/d6e20681-9c82-4b92-8dea-1333a7ddf3d2 主播大秀 News Labs 主播大秀 News Labs

and our digital development teams have launched an online pilot service that offers continuous video play in different languages to our global audiences.

In this joint effort with the and we are trying out the latest language technology: machine translation and computer-generated voices. The Japanese version of the pilot, available on , went live on 15 December and . 

To help our language editors reversion videos more efficiently we are developing a production tool that amalgamates existing technologies and allows a single editor to generate multiple voice-overs on top of an existing video package and script. This allows our audiences to watch the same video stream in different languages. In this pilot service we begin with offering Japanese and Russian in the new year, with more languages to follow.

Language technology has come a long way. In recent years, along with the development of portable smart devices, we have seen this technology improve so fast that we now take it for granted. Features like auto-correction, predictive text, speech recognition, machine translation - and voice synthesis on smartphone personal assistants – are now commonplace.

We want to push the boundaries by creating new tools for our production workflow and innovate the way we work. Pioneering such technology in video production confirms the UK's place at the forefront of digital innovation.

Language technology has always been of great interest to the 主播大秀 World Service. According to James Montgomery, Digital Development Director for 主播大秀 News: "The 主播大秀 has some of the best original journalism in the world, with correspondents around the globe. Technology like this means we can bring more of our international journalism to more people." This is especially relevant with the coming to World Service to pay for its expansion in the next few years.

This pilot is the latest in a series of digital innovations from the 主播大秀’s international news services. These include the international launch of the latest 主播大秀 News App, the use of messaging apps to provide specific or emergency news services, the Facebook-only 'pop-up' service in Thai; innovative filming from the 主播大秀 'hexacopter' as well as hackathons in Africa.

主播大秀 Minute CatchUP: Global news in just 60 seconds Wed, 11 Nov 2015 12:41:31 +0000 /blogs/internet/entries/951051ee-8ad1-465e-bf35-6de3d03dba5b /blogs/internet/entries/951051ee-8ad1-465e-bf35-6de3d03dba5b Marlon Parker Marlon Parker

is a player that sits on your web browser and provides a 60 second news bulletin every half an hour. It’s the first pilot to come out of the Connected Studio, World Service partnership programme and is available to try now on . Marlon Parker, who created the pilot, explains the development process from concept to launch.

主播大秀 Minute CatchUP was conceptualised by RLabs at a development studio in Cape Town. The two-day event featured 12 teams formed of a diverse group of local technology and design experts, with a brief to devise and develop solutions that could assist with the distribution of 主播大秀 audio content using digital tools.

Teams developed solutions on how to distribute 主播大秀 audio content

is a social enterprise and innovation hub that provides educational programmes, technology and entrepreneurial support to community members. Most of its innovation hubs are based in peri-urban areas and rural towns. CatchUP was created by a group of young people who are part of the RLabs team and who have a passion for using technology for social change. The opportunity to be part of the event was an incredible experience, and the innovation process from idea to pilot allowed the team to develop and build something that could potentially be scaled across Africa.

The CatchUP team always knew that the best way to develop their solution would be to keep it simple, with a clear focus in mind. The team believed that the offering would be the perfect content to share, especially to a young audience who would just need a quick update on the latest news. The widget that was developed changed as the team started exploring what would be the most meaningful way to share the audio content.

The 主播大秀 Minute CatchUp widget provides quick audio content that is easy to share

Although the CatchUP team had done some basic user-testing on other projects prior to the 主播大秀 Minute CatchUP pilot, the experience with (the user-testing company) allowed us to learn how to gain a deeper insight into how the widget would be used. After multiple testing environments and different users giving feedback, the 主播大秀 Minute CatchUP widget was refined to reflect this valuable feedback. The CatchUP team also had the opportunity to explore new technologies as part of the pilot using web components and . This technology enables an easy integration into any mobile and web browser.

A highlight was working with the incredible 主播大秀 Connected Studio team. They supported CatchUp from the beginning; from forming the idea, through development and on to the pilot launch on 主播大秀 Taster. They were not only supportive for the delivery of the objectives of the pilot but also equipped the CatchUP team with skills that will be valuable for our future development.

You can try  now on 主播大秀 Taster and find out more about the 主播大秀 World Service and 主播大秀 Connected Studio partnership . 

Digital News Discovery and the launch of Google AMP Wed, 07 Oct 2015 14:05:18 +0000 /blogs/internet/entries/d4ba29b1-7d01-4ccf-a964-b152e4c2f8b1 /blogs/internet/entries/d4ba29b1-7d01-4ccf-a964-b152e4c2f8b1 Robin Pembrooke Robin Pembrooke

More users are discovering and reading 主播大秀 News on an ever widening selection of platforms and news aggregation services. For the 主播大秀 World Service, a high proportion of our traffic has always been through partner destinations that we syndicate to, but the last two years have seen an explosion in the volume of our content discovered and consumed inside Facebook, YouTube, and Twitter amongst others.

For digital publishers, whether large or small, the increased audience reach this delivers is welcome, but it has an impact on how we measure and understand our audience, and for 主播大秀 News outside the UK, how we monetise that content on third party platforms. Each of the platforms has tended to have its own technical specification or publishing tools, which can drive up our costs of delivering news in a multiplicity of different formats.

It’s in the context of this increasingly complex distribution ecosystem that we welcome the announcement this week of the new Google Accelerated Mobile Pages (AMP) initiative as a significant and positive development for digital news.

What is Google AMP and why have we been involved?

Google AMP is a new, open and shared web standard for publishing content pages, optimised for mobile consumption. Google have announced the initiative today and the 主播大秀 together with around 20 global publishers have been involved in the design and definition of the approach.

The key benefit to users is that pages on mobiles will load much quicker than before, particularly in markets with slow connectivity, due to a simplified approach to both coding and caching of pages. With over 60% of traffic to 主播大秀 News coming from mobiles or tablets, optimising this performance is crucial, particularly for events such as the General Election where we saw over 85% of traffic coming in on mobile devices in the morning after as final results came in.

It also means that content can be previewed and discovered more easily than before in Google search results, or other products such as Twitter or Pinterest. From a publisher point of view this means reduced levels of effort to make our content discoverable in a wider variety of destinations.

The new Google Accelerated Mobile Pages previewed in search results

Innovating through collaboration

The new standard has been developed with genuine and thorough collaboration between Google News and product development teams from publishers across the industry. In a matter of weeks the specification and format have evolved rapidly as we’ve shared and discussed requirements while experimenting with differing approaches to using the new standard. It’s far from perfect yet and is still a work in progress, but we have very high hopes for what it will deliver for our users.

Now that it is announced, it will be an open standard available for anyone to use, be they a large publisher like ourselves, a small specialist publication or an individual blogger.

Google have said that the plan is for this to be an open standard that can be used by all browsers, search engines or services. The fact that other content discovery services such as Twitter and Pinterest are involved from the launch of the service is a very encouraging sign of that open collaboration.

As the 主播大秀 we have always aimed to be as open as possible with our approach to innovation and product development. Our 主播大秀 News Labs team publish all of our projects here . Similarly all of our thinking about the use of metadata in digital publishing is completely open for everyone to use - .

The approach of establishing a common industry wide standard for publishing content to mobile web browsers is a welcome development that we’re happy to support.

主播大秀 collaboration to build multilingual media monitoring system Thu, 01 Oct 2015 13:22:41 +0000 /blogs/internet/entries/7b65ffff-1897-4d72-a6bb-fdd6998aaa82 /blogs/internet/entries/7b65ffff-1897-4d72-a6bb-fdd6998aaa82 主播大秀 News Labs 主播大秀 News Labs

Teams from across the world meet for the project organised by 主播大秀 Connected Studio, the World Service and News Labs

and  are partnering up with the University of Edinburgh, UCL, Deutsche Welle and others to build an automated media monitoring tool. We want to build an automated system that will not only monitor hundreds of international TV channels, radio stations, online articles and social media; this system will also be able to observe trends and detect news stories and events in several languages.

Research groups from across the globe are joining up to help build this platform; Priberam (Portugal), LETA (Latvia), Idiap (Switzerland) and the Qatar Computing Research Institute. The project SUMMA (Scalable Understanding of Multilingual Media) has been granted over €6 million by the EU as part of the .

It all began during a event held in December 2014, which was organised by 主播大秀 Connected Studio, the World Service and News Labs as part of the . The focus was on Language Technology, including speech recognition, machine translation and voice synthesis. Teams came from across the world, including Qatar, Bulgaria, Latvia and Scotland – and not long after that, the idea for SUMMA was conceived (over a cup of tea!) to build a multi-lingual media monitoring platform.

Monitoring the international news media is of critical importance, and not just to the 主播大秀, but also to news agencies and journalists and many industrial sectors, including advertising, finance and sports. Monitoring the global media, spotting trends, tracking people in the news and identifying differences in reporting on the same events is a crucial activity for organisations with a global outlook.

The aim of SUMMA is to significantly improve this process through the creation of a platform to automate the analysis of media streams across many languages; to aggregate and distil the content, to automatically create rich knowledge bases and to provide visualisations to cope with this deluge of data.

The scale of the task is increasing massively year-on-year because of the rapidly growing number of internet broadcast and text portals and the increasing number of broadcast media sources. In March 2015, had access to over 13,000 sources. The European Media Monitor at the European Commission’s Joint Research Centre ingests textual material from over 10,000 RSS feeds and HTML pages, 3,750 key news portals world-wide, plus 20 commercial news feeds - in up to 60 languages.

A recent analysis by the Arab Advisors Group determined that there were 658 fully launched and operational Free-to-Air satellite channels targeting Arabic countries in May 2013 - an increase of 600% since 2004, with over 100 extra channels in the previous year. This is a tremendous volume of data and current approaches to monitoring, in particular for audio and video content, simply cannot cope because current media monitoring systems are severely limited in terms of the number of streams that may be simultaneously monitored, the support needed for multiple languages, the ability to ingest and process multiple media types and the richness of the automatic analysis that they supply.

Partners working together at the #newsHack event

主播大秀 Monitoring undertakes one of the most advanced, comprehensive and large scale media monitoring operations, providing news and information from media sources around the world. Around 300 people monitor TV, Radio, internet and Social Media sources to detect trends and changing media behaviour and to flag breaking news events. Media monitoring journalists look for emerging themes – political, societal and economic – and aim to anticipate certain stories and events. The expertise of monitoring analysts and journalists is required to understand a change in behaviour of particular media sources.

The media landscape has become too large to maintain the traditional approach. In SUMMA we shall address this through the development of a scalable platform for intelligent media monitoring featuring:

  • multilingual stream processing including speech recognition, machine translation and story identification
  • the automated construction of knowledge bases based on entity and relation extraction
  • natural language understanding including deep semantic parsing, summarization and sentiment detection
  • rich visualisations based on multiple views (eg topic, person or timeline)

For the latest news on the SUMMA project, please visit the .

主播大秀 Connected Studio Nairobi hack: Why the 主播大秀 needs to innovate outside of the UK Mon, 02 Mar 2015 15:43:03 +0000 /blogs/internet/entries/fe799f4a-0048-49c5-a6c6-de6d73185cb0 /blogs/internet/entries/fe799f4a-0048-49c5-a6c6-de6d73185cb0 Dmitry Shishkin Dmitry Shishkin

Facilitators and participants discuss the Connected Studio: World Service Africa - Nairobi event

主播大秀’s global news services are reaching 265 million people every week on TV, radio and digital platforms. It’s a great number, without a doubt. Another great number is 500 million – the size of the international audience we are asked to reach by 主播大秀’s centenary in 2022.

The majority of the growth will come from digital, and the areas of rapid growth are going to be in the developing world, mainly in Africa and Asia. The proliferation of news sources, local, national, regional and international, together with the massive penetration of mobile devices and explosion of social platforms in these parts of the planet, means that competition for audiences’ attention has never been greater.

Ericsson forecasts that the number of mobile subscriptions in sub-Saharan Africa will rise from 635m in 2014 to 930m in 2019. Mobile data traffic is expected to rise 20-fold between 2013 and 2019 – twice the average world rate. Smartphones at less than $50 are coming, where devices were $200 two years ago. The future is coming closer so fast that we need to be even faster to prepare for fundamental shifts in our relationship with the audience.

The - that has a diverse portfolio of news in 28 languages and a long and rich heritage of covering the globe - appreciates the challenge and knows that the task facing it is massive.

We understand that in order to be successful we need to be constantly experimenting with the way we tell stories, but also with the way we are reaching new audiences.

And this is where comes in, working alongside and the World Service as the 主播大秀’s innovation and technical outreach arm, as this is an absolutely crucial area to focus on if the World Service is to achieve this massive target. It has been clear to us all along that in order to be successful from the technical point of view in markets we don’t necessarily know a lot about, we need to engage the local tech communities in order to develop our ideas, prove hunches and dispel technical myths. UK-facing solutions won’t necessarily work in Africa.

This is where the idea came in, which got the local and international media interested.

Africa has a thriving tech scene, where lots of international companies engage with local startups to better understand the market. We picked Kenya as the place for the first ever non-UK 主播大秀 Connected Studio event as we thought that East Africa would give us a great opportunity to build from.

Firstly, we do a lot of content in English and Swahili; secondly, East Africa is in a different stage of technical development than, say, South Africa, and findings in Nairobi could be easily scaled to, or applied in West Africa or India, for example, and thirdly, Nairobi is the biggest 主播大秀 hub in Africa, so our own internal editorial and organisational infrastructure was there if we needed it.

The event itself went very smoothly. We partnered with a , that helped bring the 13 teams together who took part in the event. Each team was given access to 主播大秀 content feeds, and 主播大秀 staff were on hand to help explain or clarify different aspects of contestants’ ideas. We also had audience representatives attending, so hackers were able to run their ideas and shape them with users who they were building for. The brief was simple but wide; how can you take 主播大秀 editorial content and package it in a way that will be engaging, exciting, interesting and relevant to younger Africans using social media and mobile phones?

The results exceeded our expectations. We have, with two others we’d like to continue talking to. I can’t go into details of what exactly is going to be built just yet, but I can’t wait to start. Our kick-off meetings are in early March and we are fully committed to actually ship something at the end of the process. We started the process with audiences’ needs in mind and we still hold those as the most important part of the puzzle.

We are learning all the time, and we are busy planning the next 主播大秀 hack event in Africa, which will take place in April in Cape Town, South Africa. We are iterating on the approach we took in Nairobi, with the next event based upon mobiles and audio. 主播大秀 World Service has a wide content mix, from the latest 主播大秀 Minute aimed at younger audiences to long form documentary series, like the latest Richer World season. We want to try and see how our audio content could be plugged into the lives of local communities.

Technical engagement with Africa is only one aspect of a wider plan to connect with African audiences better. January saw the launch of the Africa edition of the bbc.com front page, making the overall offer more relevant to local needs. More editorial innovation, both on bbc pages, but also on six social media platforms, including 主播大秀 Africa accounts on SoundCloud and Instagram, is going to come through in the Spring.

主播大秀’s digital commitment to Africa is strong, and we are there for the long term.

13 teams took part in Nairobi hack event, and we continue talking to four of them

#newsHACK III: the winners Fri, 19 Dec 2014 13:40:09 +0000 /blogs/internet/entries/8a50d986-1864-3787-b48e-c53966e4620e /blogs/internet/entries/8a50d986-1864-3787-b48e-c53966e4620e Basile Simon Basile Simon

I'm Basile, hacker-journalist with over in Euston, and I'm just recovering from the , which took place on December 15 and 16 in London.

50 participants came from all over the world to hack with us around language technologies. The Qatar Computing Research Institute (QCRI) sent a team over, as well as Germany's international broadcaster Deutsche Welle, Bulgarian semantic tech company Ontotext – some other participants came from Latvia and Portugal.

Matt Shearer, head of News Labs, kicking off the hacking

The event took place in a fantastic startup incubator literally 100 yards away from Tower Bridge, and was run by the fantastic Connected Studio team, driven by our friends from the World Service, and supported by the News Labs.

Our call for collaboration during the event's kick-off proved very fruitful, as the 13 teams reshuffled into 11 after some unplanned collaborations. Staff from 主播大秀 Monitoring, 主播大秀 Weather, Travel, News, Location Services, R&D IRFS, and World Service joined other teams to work on some great projects.

The Winners

Among all these projects, the judges had to pick winners:

1- Best in Show: Qatar Computing Research Institute

Translating 主播大秀 Arabic video into English, including subtitles, voice and using Speaker Diarisation to change gender of speech synthesis alongside the changes in the voiceover gender.

2- Best practical innovation & Closest to launch: 主播大秀 Voice

Tom Collins, Owain Lewis & Darren Lucas from The 主播大秀 Weather, Travel News and Location teams demonstrated a great voice-controlled 主播大秀 app.

3- Best speech synthesis hack - Cereproc and Red Bee Media

The team demonstrated the process to go from subtitling to speech synthesis, with really good, realistic voices. Also demo'ed in 4 languages during the presentation!

4- Best entity extraction - Ontotext (proud providers of our LDP software)

They hacked a great solution for Russian & Arabic entity extraction, using DBPedia, Freebase, and then some clever statistical tricks to get around scarcity of data in those languages in the reference sets.

5- Best end-to-end multilanguage tools - “Global Vox” by Edinburgh University

They demonstrated some brilliant work in all areas of the chain - starting from Audio, through segmentation, transcription, summarisation, NER, translation, to speech synthesis.

6- Most perplexing and steepest learning curve for judges (unplanned award) - Andreas from UCL

For Multilingual Entity Relation Extraction and Knowledge Base (using cutting edge machine learning and NLP, across languages). He quickly outlined some cutting edge machine learning approaches which were "quite simple" and "still in under review".

Great Outcomes

In between the stressful two days of serious hacking, we managed to kick off at least three informal Language Tech collaborations.

We'd like to see more of UCL and Latvia-Priberam's project of a multilingual knowledge base for global journalism research tools. Also, Cereproc and their Speech Synthesis will hopefully allow audio delivery of articles, subtitles, captions, and audio description for new language services.

Finally, Edinburgh University really impressed us. There's almost too much to write, as they demonstrated their handling of the whole chain.

We also hope to work with QCRI on some "Arabic language audio processing" topics, perhaps in partnership with 主播大秀 Monitoring.

Team Ontotext

It was a fantastic success and we have kicked off some key relationships for our future Language Technology work - a key part of reaching another 250 million globally.

Serious collaboration and hacking at newsHack III

Special thanks to Connected Studio for running another dazzlingly good #newsHACK event, to World Service for driving this new international partnership with us, and thanks to the News Labs team for supporting and networking with the future partners.

Basile Simon is Hacker Journalist, 主播大秀 News Labs

Crowdsourcing the World Service radio archive: an experiment from 主播大秀 R&D Tue, 24 Sep 2013 09:52:17 +0000 /blogs/internet/entries/44ab6780-3db5-368f-9547-010cd90f6df2 /blogs/internet/entries/44ab6780-3db5-368f-9547-010cd90f6df2 Tristan Ferne Tristan Ferne

The World Service Archive prototype allows you to search, browse and listen to over 36,000 radio programmes from the 主播大秀 World Service archive spanning the past 45 years. For a limited time you can explore this archive and help us improve it by validating and adding topic tags that describe the programmes. You just need an email address to register.

I work for the IRFS team in 主播大秀 R&D, and for the last few months we have been running an experiment on how to put a large media archive online using a combination of algorithms and people. With your help we aim to comprehensively and accurately tag this collection of 主播大秀 programmes. This video explains how:

A guide to using 主播大秀 R&D's World Service Archive prototype.

The 主播大秀, and many other organisations, have massive archives of TV and radio but it is expensive to put them online in a navigable, findable way so we are researching ways to make it cheaper and easier. R&D is pioneering ways of generating metadata for programmes automatically using innovative algorithms that can "listen to" and tag programmes with topics and speakers.

The World Service Archive prototype is an experiment to apply this research to a real-world archive, and then to improve the results using crowdsourcing. We want to learn about how good the algorithms are, whether and how people tag, and how to combine algorithms with people.

The archive problem

Digitising programmes in the archive

Between 2005 and 2008 the 主播大秀 World Service digitised the contents of its recorded programme library. This included programmes archived from the English-language radio services over the past 45 years - over 50,000 programmes (of which 36,000 factual programmes were available to put online - see footnote) covering a wide range of subjects, including weekly African news reporting.

The digitisation project was a great success but the metadata was of limited quality and quantity. Metadata is data to describe digital media items and without it content is hard to find and navigate around. So although we might have had a programme title and broadcast date we didn't really know what each programme was about without listening to every one - or indeed know the shape and contents of the whole archive. In 2012 the World Service and engineers from 主播大秀 Research & Development joined together to demonstrate a way to put massive media archives online using a combination of computers and people. We thought we could use advanced algorithms to listen to all the programmes in the archive, automatically generate metadata, use this data to put it online and then ask listeners to validate and improve it.

R&D's solution

We ran the audio through an automated speech-to-text process, which generated fairly noisy transcripts with lots of errors. So we used robust algorithms that we had developed for this purpose to extract key topics from each programme, ensuring that each topic is unambiguous and linked to the web. In total we created around 1 million topics, about 20 per programme. You can read more about the technology we used.

Although the results were fairly good, the automatically generated topic tags for programmes were often wrong - the computers aren't really listening to and understanding the programmes. But we thought that these automatically generated tags, together with the original metadata, were good enough to design and build a browsable and searchable website for the archive. Listeners could use this online prototype and help improve it by validating the automatically generated data and adding their own - "crowdsourcing" the final part of the problem.

Registered users of the experiment can now search, browse and listen to the programmes in the archive, vote on whether automatically generated tags are correct, add their own tags and even correct errors in the programme titles and synopses.
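To make the idea of combining algorithms with people a little more concrete, here is a minimal sketch in Python of how a machine-generated tag might be blended with listener votes. It is only an illustration, not the prototype's actual code: the `Tag` class, the `machine_score` field, the `prior_weight` value and the 0.5 threshold are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Tag:
    """A topic tag on a programme (hypothetical structure for illustration)."""

    topic: str
    machine_score: float = 0.0  # assumed algorithm confidence between 0 and 1
    up_votes: int = 0           # listeners who said the tag is correct
    down_votes: int = 0         # listeners who said the tag is wrong

    def vote(self, agrees: bool) -> None:
        """Record one listener vote on this tag."""
        if agrees:
            self.up_votes += 1
        else:
            self.down_votes += 1

    def combined_score(self, prior_weight: float = 2.0) -> float:
        """Blend the machine score with listener votes.

        The machine score counts as `prior_weight` pseudo-votes, so a
        handful of real votes soon outweighs it. The weighting itself is
        an assumption, not a value taken from the experiment.
        """
        total = self.up_votes + self.down_votes + prior_weight
        return (self.up_votes + prior_weight * self.machine_score) / total


def accepted_tags(tags, threshold=0.5):
    """Return the topics whose combined score reaches the threshold."""
    return sorted(t.topic for t in tags if t.combined_score() >= threshold)
```

In this sketch a tag the algorithm was unsure about can still be accepted once enough listeners confirm it, and a tag the algorithm was confident about can be rejected if listeners vote it down.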

Progress so far

Programmes tagged over time

So far, users of the prototype have listened to around 12,000 of the 36,000 programmes that are available and tagged or edited about 7,000 of these. This has generated over 70,000 individual metadata "edits" (votes, new tags etc). We've even had some dedicated listeners send us recordings of programmes that were missing from the archive. We are currently analysing the data so far to see how good the tags are by comparing professional archivists, listeners and our algorithms. We're also interested in what the most common tags are, what kind of tags are added by people (are they more often people, events or places?) and which kinds of programmes are most popular.

We want your help!

Could you help us do more? The World Service archive is being made available online for a limited time while we conduct this experiment and we want to get as much data as possible. Try finding the oldest programme we've got.

Footnote: Some of the audio in this experiment is unavailable due to rights considerations. This mainly affects programmes with drama, readings, comedy performances, music or sport. Although you cannot listen to these programmes they all retain a page in the prototype that describes them. The original digitised archive only contained pre-recorded programmes, so there are no news bulletins present.

Tristan Ferne is Executive Producer, IRFS, 主播大秀 R&D

US Elections: Mobile design on 主播大秀 News Fri, 18 Jan 2013 10:10:46 +0000 /blogs/internet/entries/a7069815-539e-3008-82ce-9a56cccc3f6c /blogs/internet/entries/a7069815-539e-3008-82ce-9a56cccc3f6c Helene Sears Helene Sears

Recently I had the pleasure of working on the US Elections coverage, a subject I'm especially passionate about as an American living in London.

My team produces a huge range of infographics that accompany our daily online news stories and we also work on larger stories such as this.

Four years ago I followed the elections closely on the 主播大秀 and getting the chance to design for them this time around has been incredible.

US Elections on mobile

A lot of people worked very hard on this project but the team's greatest achievement (though there are many I'm proud of) has to be our mobile design success.

The design brief for the 'news story that brings the web to its knees'

The design challenge was to create an engaging and informative experience for the US Election results that would work across all mobile, tablet and desktop platforms and be consistent with TV.

The main focus of this was showing national and state-by-state presidential and congressional votes in an easy to understand yet visually exciting way.

US Elections has been described as "the news story that brings the web to its knees". It's a big worldwide story that only happens once every four years, which means technology has moved on so much that the last version can't just be dusted off and reused.

For this election there was no question that we had to create a great experience on our mobile site.

From the beginning of the creative process mobile was on the agenda and by presenting mobile designs at every team catch up we were able to make sure it never became an afterthought.

Designing mobile first is hugely beneficial, not just because it means there will be a mobile solution but also because it means the content has to have a clear hierarchy and progressively enhance.

Designing responsive mobile results pages

This approach not only fits into the development (as the screen enlarges more content is added thus the smallest devices get the lightest pages) but it also fits into people's mental models.

People expect a streamlined view on mobile, which often relies on a data connection, and a more in-depth solution on their desktops, which have greater screen real estate and usually a better connection.

The page that required the most design effort was the results page. There was a long list of requirements from detailed state results to congressional overviews.

I started by creating a post-it note board of everything the page must, should and could do and then began wireframing the modules.

Helene wireframes the results page

We initially experimented with versions of the results page that didn't have a map but we found it was a reference people were so accustomed to that we couldn't drop it.

The map is tricky because the size of a state is often not representative of how many votes it has. Montana is massive yet only has three votes, so a map can be dominated by one party's colour when the opponent actually won.

I decided to introduce a bar to the top of the page to give a more accurate snapshot of the election score.
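As a rough illustration of why a bar gives a fairer picture than a map, the sketch below sizes each segment by electoral votes rather than by land area. It is a hypothetical example in Python, not the code used on the site: the function names, the text rendering and the default width are assumptions, while the totals of 538 electoral votes and 270 needed to win are the standard figures.

```python
TOTAL_ELECTORAL_VOTES = 538
VOTES_TO_WIN = 270


def bar_segments(left_votes, right_votes, width=54):
    """Split a bar of `width` units into (left, undeclared, right) segments.

    Each candidate's segment is proportional to electoral votes won so far,
    so a large state with few votes takes up very little of the bar.
    """
    if left_votes < 0 or right_votes < 0:
        raise ValueError("vote counts cannot be negative")
    if left_votes + right_votes > TOTAL_ELECTORAL_VOTES:
        raise ValueError("more votes than exist in the electoral college")
    left = round(width * left_votes / TOTAL_ELECTORAL_VOTES)
    right = round(width * right_votes / TOTAL_ELECTORAL_VOTES)
    # Rounding both sides up could overflow the bar, so cap the right side.
    right = min(right, width - left)
    return left, width - left - right, right


def render_bar(left_votes, right_votes, width=54):
    """Draw the bar as text: '#' for one candidate, '=' for the other."""
    left, undeclared, right = bar_segments(left_votes, right_votes, width)
    return "#" * left + "." * undeclared + "=" * right


def has_won(votes):
    """A candidate wins on reaching 270 electoral votes."""
    return votes >= VOTES_TO_WIN
```

With a width of 538, a state like Montana contributes just 3 units to the bar, however much of the map it covers.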

As we moved forward we realized we wanted US Elections to have a singular home.

Previously the index, results and live updates pages had all been separate. By tabbing all the content under the 'scoreboard' banner we were able to ensure users wouldn't miss all that was on offer, or they could choose to quickly check the latest tally if they just wanted an overview.

This score snapshot above in-depth tabs was especially effective on mobile.

State icons

I also wanted to give the site a bit of character so, along with designer Nina Monet, we gave each and every state an icon. These were things that the state was famous for and, although only a small detail on the page, it was fun to read people's reactions to them. [N.B links to external site with strong language]

Taking the designs worldwide

As this story is of interest worldwide we worked closely with World Service. Designers Nour Saab and Charlotte Thornton adapted our designs for over 20 languages.

Designing for multiple languages poses difficult challenges. For example, words in Russian are often much longer and don't fit the spacing allocated for the English words, and Arabic reads right to left so the whole page has to be flipped.

Charlotte spent many days sitting with the developer - the ideal way to work I believe - making sure the design and development were aligned.

The results page in Arabic

Final thoughts

The benefit of this project was that regardless of the user's device or language there was a consistent US Elections home with a 'scoreboard' overview above the tabbed index, results and live page.

Although not fully responsive, it's close. The mobile site has a clear relationship with the desktop. We had 16 million unique visitors to the site, of which 30% were on mobile, proving our mobile first approach was worthwhile.

This project had the support of the Visual Journalism team and I'd like to send a big thank you to everyone who helped make it a reality. It would be great to hear what you thought of the site.

Helene Sears is the editorial designer for UX&D.
