Technology + Creativity at the BBC | Technology, innovation, engineering, design, development. The home of the BBC's digital services.

HTTPS is easy, just turn it on… | Thu, 04 Nov 2021 14:51:30 +0000 | Neil Craig

The HTTPS padlock icon on bbc.co.uk

Back in early 2015, I'd just started working at the BBC and whilst getting to know who's who and what's what, I discovered to my surprise that large parts of our main websites (www.bbc.co.uk and www.bbc.com) were only available over plaintext HTTP. My immediate thought was, "Well, here's something I can get stuck into immediately - how hard can it be to get to 100% HTTPS?".

We're now about six years on, and we're only just finishing the full migration to HTTPS. All you need for HTTPS is a vaguely modern CDN/traffic manager/server and a TLS cert plus a few changes to your HTML, right? Yeah, wouldn't it be great if things were that simple!

Setting the scene

At this stage, it probably helps to rewind a little to a circa 2015 context. It'd been about two years since the , which had helped to squash any remaining doubts as to the necessity of HTTPS. Around this time, the major web browsers began signalling their intentions to gradually ramp up the pressure on website operators to serve websites over HTTPS by restricting access to sensitive APIs to HTTPS contexts and also via changes in user interface (UI) indicators. Various platforms and services such as  were created or came to prominence, making it cheaper, easier, and faster to get TLS certificates and securely serve web content. The direction of travel for the web was clear - HTTPS was gradually replacing HTTP as the default transport, but we were nowhere near as far along the road as we are now.

The BBC websites share a common public 'web edge' traffic management service. The web edge is similar to a CDN in that it handles TLS termination (as well as routing, caching and so on), but behind it there are individual stacks managed by our Product teams - these form our various Product websites. It's fair to say that in 2015 the proportion of our Product teams' websites served over HTTPS was quite mixed - as was true of much of the internet.

Our web edge already offered HTTPS 'for free' to our Product teams in 2015. To migrate to HTTPS, our Product teams had to do the engineering work for compatibility of their websites and opt-in to an 'HTTPS allowlist' - otherwise, our web edge would force their traffic to HTTP.
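As a purely illustrative sketch of that opt-in behaviour (this is not the BBC's traffic-manager code or configuration; the path prefixes and function names are hypothetical), the decision logic amounted to something like this:

```python
# Purely illustrative - not the BBC's actual traffic-manager code or config.
# Products opted in to an "HTTPS allowlist"; anything not on the list was
# forced back to plaintext HTTP by the web edge.
HTTPS_ALLOWLIST = ("/sport", "/account")  # hypothetical path prefixes

def edge_decision(path: str, scheme: str) -> str:
    """Return what the web edge would do with a request, pre-migration."""
    if scheme == "https" and not path.startswith(HTTPS_ALLOWLIST):
        return "redirect to http://"   # product has not opted in yet
    return "serve as requested"

print(edge_decision("/sport/football", "https"))  # serve as requested
print(edge_decision("/weather", "https"))         # redirect to http://
```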

Raising the issue

The first formal thing I did towards 100% HTTPS was to present at a forum attended by most of the Product architects, to raise the issue and highlight why we should migrate to HTTPS and what was soon going to change in Chrome's UI signals:

 

Screengrab of a presentation slide which highlights the drivers for HTTPS adoption.

The web browser UI signalling changes for plaintext HTTP were pretty new at the time and not as widely communicated as they are now. The planned UI changes were a really useful driver for our Product teams since they were a concrete change that would have direct user impact - something to galvanise the need for action and a timeline to work to. Our Product teams were, of course, all more than a little bit aware of HTTPS and those teams who hadn't migrated already intended to migrate as time allowed. However, this helped a few of them with a business case to make the change, and the discussion helped bring HTTPS further into the general conversation.

The h2 carrot

'HTTP/2 all the things!' meme on a presentation slide.

A year or so on from my initial presentation about HTTPS, we began to think in more detail about providing HTTP/2 (h2) on our web edge since the support in web browsers and servers/services was mature enough by then. We did the requisite planning work, the usual comms to our teams and then rolled h2 out. We had a bit of an issue with this, and there was a fair bit of work involved but before long, all our HTTPS web pages were also available over h2 - an added carrot to the teams who'd not yet migrated to HTTPS.

Product migrations

Our Product teams have done the bulk of work in migrating websites to HTTPS on their individual stacks. As well as in-place updates, there has been some major re-platforming work which is moving our Product websites on to new, HTTPS native platforms such as Web Core for Public Service, Simorgh for World Service and new, dedicated platforms for iPlayer and Sounds.

I didn't get involved specifically in any of these Product migrations, aside from the odd conversation and friendly badgering, so whilst it was a lot of work and absolutely vital, it's relatively well-understood work. So, for the remainder of this blog post, I'll focus more on the aspects of our migration that were perhaps less obvious (and often really quite awkward).

Content retention - 'The Archive'

The BBC has a content retention mandate which states:

13.3.8 Unless content is specifically made available only for a limited time period, there is a presumption that material published online will become part of a permanently accessible archive and should be preserved in as complete a state as possible.

During a re-platforming in circa 2013-14, the decision was taken to archive (rather than migrate to the new CMS) a lot of the older content, which our retention mandate demands we keep online. The archive was produced via a crawler which saved web pages to online object storage as flat HTML and asset files. We ended up with somewhere in excess of 150 million archived web pages across hundreds of retired Products - all of which were captured as plaintext HTTP.

 

An archived web page: Sing.

Accurately migrating this many wildly differing static pages to HTTPS is not simple. Some quick maths and thinking-through eliminated the option of writing a crawler to run through the archive and update the HTML, JavaScript and CSS in-place - it's too risky, slow and expensive. Instead, I used our comprehensive access logging/analysis system to check that clients supported the CSP upgrade-insecure-requests directive, then trialled allowing HTTPS on a section of the archive whilst adding that header to instruct clients to automagically upgrade HTTP links and asset loads to HTTPS.

The trial worked well, so we gradually rolled this out and monitored the effects via access logs and CSP violation reporting.
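For illustration, the mechanism in question is the Content-Security-Policy upgrade-insecure-requests directive. A minimal sketch of the kind of headers involved is below; this is not our exact edge configuration, and the reporting endpoint is a placeholder:

```python
# Minimal sketch, not the BBC's exact configuration. The first header asks
# browsers to rewrite embedded http:// sub-resource loads to https:// before
# fetching them; the report-only policy surfaces anything that would still be
# insecure, via a hypothetical reporting endpoint.
archive_headers = {
    "Content-Security-Policy": "upgrade-insecure-requests",
    "Content-Security-Policy-Report-Only":
        "default-src https: 'unsafe-inline' 'unsafe-eval'; "
        "report-uri https://reporting.example.com/csp",  # placeholder endpoint
}
```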

Robots.txt and friends

The final major hurdle we encountered was in serving global static assets - robots.txt, sitemaps, 3rd party authentication files and the like. We were still using our previous-generation traffic managers to host global static assets, and the configuration was unexpectedly coupled to our HTTPS allowlist logic. That wasn't a problem in itself, but it meant that when I asked one of our ops teams to remove the HTTPS allowlist, the serving of these static assets broke. Time for a rethink.

Our 24/7 support/ops team valiantly stepped in to build and run a new service that solved two problems in one: migrating the routing of global assets to our new traffic managers, and doing so in a single-scheme fashion.

Removing the HTTPS allowlist

Once the robots.txt (et al.) problem was solved, we could finally remove the HTTPS allowlist, which meant that all content on www.bbc.co.uk was available over HTTPS. That was a really key step in this whole process.

HSTS

Once we had all our content on www.bbc.co.uk available over HTTPS, we began rolling out HSTS, which instructs browsers to silently upgrade any plaintext HTTP links they come across for www.bbc.co.uk to HTTPS. So that we can gain confidence and revert in a reasonable time if there are problems, we'll gradually increase the max-age on our HSTS header as follows (a sketch of the header at each stage appears after the list):

  • Set to 10 seconds, then wait for 1 day (basic test for major issues)
  • Set to 600 seconds (10 minutes), then wait for 2 days (covers most page-to-page navigations)
  • Set to 3600 seconds (1 hour), then wait for 4 days (also covers most iPlayer/Sounds durations)
  • Set to 86400 seconds (1 day), then wait for 14 days (covers frequent users day-to-day)
  • Set to 2592000 seconds (30 days), then wait for 6 months (covers most users)
  • Set to 31536000 seconds (1 year)
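For reference, here is a sketch of the Strict-Transport-Security header value at each stage of that ramp (max-age only; whether to add includeSubDomains and preload is a separate decision, touched on below):

```python
# Sketch of the Strict-Transport-Security values for the ramp-up described
# above. Soak times are the waits listed in the plan; includeSubDomains and
# preload are deliberately left out here - adding them is a separate decision.
ramp = [
    (10,        "1 day"),
    (600,       "2 days"),
    (3600,      "4 days"),
    (86400,     "14 days"),
    (2592000,   "6 months"),
    (31536000,  "final value"),
]
for max_age, soak in ramp:
    print(f"Strict-Transport-Security: max-age={max_age}   # soak: {soak}")
```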

To de-risk HSTS, as well as all the work above, progressing HSTS through our pre-live environments and some theoretical analysis, we used the Chrome net-internals facility to locally add HSTS for www.bbc.co.uk.

Assuming the HSTS rollout goes to plan, we'll look into HSTS preloading for www.bbc.co.uk to avoid the ToFU (Trust on First Use) issue.

What's left to do?

Having jumped over most of the hurdles in our way, the last few jobs to do right now are:

  1. Use the "force HTTPS" feature in our traffic managers in conjunction with the already deployed CSP "upgrade insecure requests" on our archived web pages to ensure archived pages and their assets are loaded over HTTPS.
  2. Inform our Product teams that they can opt in to using the "force HTTPS" feature and therefore remove their own HTTP → HTTPS redirects in their origin services.
  3. Migrate the remaining couple of websites on www.bbc.com which are still plaintext HTTP, then roll out HSTS on www.bbc.com as well.
Checking your password against data breaches | Wed, 19 Aug 2020 08:51:22 +0000 | Marc Burrows

In the BBC Account team, we’re constantly trying to find better ways to keep your account data safe. Part of this includes making sure you choose a secure password.

When BBC Account was launched in 2015, we followed the OWASP authentication guidelines on password security. These suggested:

  • a minimum of 8 characters
  • a mixture of letters, numbers or symbols
  • not matching the first part of your email address.

These guidelines were a good starting point. But over time, more security insights have become available.

A new password checker

We’ve just released a new feature for when users register for a BBC account or reset their password. It checks the chosen password against a large list of passwords previously exposed in data breaches elsewhere on the internet, which are in the public domain. We can then recommend changing the entered password if it appears in that list.

The feature came about as a prototype in 2019, with two software engineers working together during ‘10% time’: a day every two weeks where the engineering team works on ideas to improve our engineering processes, come up with new features, and create quick proofs of concept.

The password checker uses the Pwned Passwords service from Have I Been Pwned (HIBP), created by web security expert Troy Hunt.

Why do we check passwords?

We’ve put together this password checker so you know if your password isn’t as safe as you might think. It means you can choose a different password on the BBC – and importantly, stop using it on other websites and services too.

The most common way hackers access an individual’s accounts is by exploiting the fact that many people reuse the same password across their accounts. If you use the same password for every website, or even a similar password with a different ending, then your information is only as safe as the least secure of those websites.

Does this mean you’re sharing my password with someone?

No, and we will never do so. The process of how this works can be seen in the diagram below, with further explanation afterwards.


  1. You enter a new password on the registration or forgotten password form
  2. We take this entered password and hash it using the SHA-1 cryptographic hash function (to find out more about why this is used, please read about the HIBP service)
  3. From this hash, we extract the first 5 characters to create a “prefix”, and keep the rest as a “suffix”
  4. The 5-character prefix is sent to the HIBP Pwned Passwords API
  5. The API returns a list of 800-1,000 “suffixes” of fully hashed passwords from data breaches that match the prefix we sent
  6. Once we have received this list, we search the results for the presence of our original suffix and can then inform you whether the entered password has been part of a breach (a minimal code sketch of this flow follows below)
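As a minimal, self-contained sketch of that flow (this is not the BBC Account implementation; it simply calls the public Pwned Passwords range API using the requests library):

```python
import hashlib
import requests  # any HTTP client would do

def password_in_breach(password: str) -> bool:
    """k-anonymity check against the Pwned Passwords API: only the first five
    hex characters of the SHA-1 hash ever leave this machine."""
    sha1 = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    prefix, suffix = sha1[:5], sha1[5:]
    resp = requests.get(f"https://api.pwnedpasswords.com/range/{prefix}", timeout=5)
    resp.raise_for_status()
    # Each line of the response is "<hash-suffix>:<number-of-times-seen>"
    return any(line.split(":")[0] == suffix for line in resp.text.splitlines())

# Example (never log real passwords):
# print(password_in_breach("password123"))  # almost certainly True
```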

What should I do if the password I’ve used was in a breach?

Our first recommendation would be for you to change the entered password, and use something different. If you already use that same password elsewhere, we’d recommend you change it in those places as well, making sure you use a different password for every website.

We’d also recommend using a password manager to help remember all your passwords for you. on why password managers are beneficial in keeping your data safe and secure.

What’s next?

If we find that the feature works well, we could well integrate it further. We’re always looking to improve and would welcome feedback on this feature to help guide our future plans.

Around the world with TLS 1.0 (Part 2) | Mon, 18 May 2020 08:27:23 +0000 | Neil Craig

Following a discussion asking about experiences with disabling TLS1.0 and 1.1, I committed to writing an update on my late 2018 blog post. This is that update.

I’ll keep this post brief and aim to keep the comparisons pretty direct. If you haven’t already, I’d recommend reading Part 1 for context and methodology. Let’s dive in…

Global view

First of all, I looked at our “global view” of TLS usage. This covers TLS usage on www.bbc.co.uk and www.bbc.com from every country we served:

November 2018 (original) data

| TLS Version | Number of requests | Percentage |
| ----------- | ------------------ | ---------- |
| TLSv1.2     | 2,002,516,373      | 97.96%     |
| TLSv1.1     | 4,529,764          | 0.22%      |
| TLSv1.0     | 37,160,210         | 1.82%      |

February 2020 data

Context: We have two traffic edges currently (one of which replaced the traffic edge in the 2018 data), one for UK and mainland Europe (which supports TLS1.3), another for “rest of world” (which does not yet support TLS1.3)

UK, mainland Europe & “rest of world”:

| TLS Version | Number of requests | Percentage |
| ----------- | ------------------ | ---------- |
| TLSv1.3     | 1,163,496,361      | 48.45%     |
| TLSv1.2     | 1,218,683,970      | 50.75%     |
| TLSv1.1     | 901,942            | 0.04%      |
| TLSv1.0     | 18,164,567         | 0.76%      |

This shows a ~68% reduction in TLS1.0 usage globally over the 15 months or so between the two datasets. That’s pretty significant and is more than I had expected.

Incidentally, if we look exclusively at our UK/mainland Europe traffic edge (where TLS1.3 is enabled) we see ~69% TLS1.3 — so the adoption rate is strong:

| TLS Version | Number of requests | Percentage |
| ----------- | ------------------ | ---------- |
| TLSv1.3     | 1,163,496,361      | 69.07%     |
| TLSv1.2     | 506,655,701        | 30.08%     |
| TLSv1.1     | 500,971            | 0.03%      |
| TLSv1.0     | 13,879,940         | 0.82%      |

Per-Country view

Let’s examine how TLS1.0 usage has changed on a country-by-country basis. Again, we’ll find the percentage of HTTPS requests which used TLS1.0 for countries which made ≥ 10,000 HTTPS requests over 3 days. I’ll represent this as a comparison view for simplicity:

| Country | Num requests (Nov. 2018) | % TLS 1.0 (Nov. 2018) | Num requests (Feb. 2020) | % TLS 1.0 (Feb. 2020) | % reduction |
| --- | --- | --- | --- | --- | --- |
| Bosnia and Herzegovina | 35,031 | 100.00% | 418,582 | 0.90% | 99.10% |
| China | 2,261,506 | 86.93% | 2,549,943 | 19.79% | 77.24% |
| Montenegro | 28,712 | 48.74% | 193,059 | 0.61% | 98.76% |
| Croatia | 113,948 | 43.75% | 1,210,262 | 7.79% | 82.19% |
| Uganda | 150,225 | 34.48% | 1,619,262 | 6.22% | 81.95% |
| Honduras | 97,644 | 29.55% | 916,586 | 6.77% | 77.10% |
| Ethiopia | 180,473 | 26.04% | 2,186,672 | 6.67% | 74.38% |
| Democratic Republic of the Congo | 12,775 | 25.67% | 138,347 | 3.80% | 85.20% |
| Nigeria | 1,224,923 | 25.13% | 9,621,049 | 8.08% | 67.84% |
| Cote d'Ivoire | 14,717 | 23.68% | 170,716 | 8.11% | 65.74% |
| Myanmar | 164,751 | 21.25% | 2,333,043 | 1.53% | 92.80% |
| Hungary | 175,327 | 20.20% | 4,042,959 | 0.15% | 99.24% |
| Cameroon | 11,618 | 15.02% | 217,951 | 6.87% | 54.29% |
| Tanzania | 76,469 | 14.93% | 4,874,370 | 7.17% | 51.95% |
| Somalia | 189,509 | 12.98% | 1,236,812 | 2.58% | 80.12% |
| Sudan | 16,273 | 12.93% | 517,011 | 6.73% | 47.92% |
| Mozambique | 10,348 | 12.39% | 228,480 | 3.31% | 73.28% |
| Taiwan | 195,132 | 11.01% | 5,991,350 | 3.68% | 66.55% |
| Zambia | 29,070 | 10.41% | 902,829 | 2.36% | 77.31% |
| Morocco | 32,932 | 10.04% | 1,998,655 | 2.81% | 72.03% |
| Uzbekistan | 17,135 | 9.38% | 1,270,560 | 2.46% | 73.74% |
| Japan | 489,215 | 9.15% | 14,841,878 | 1.33% | 85.44% |
| Hong Kong | 426,542 | 8.97% | 368,286 | 2.43% | 72.91% |
| Algeria | 24,760 | 8.97% | 78,643 | 5.59% | 37.65% |
| Romania | 62,019 | 8.79% | 52,821 | 1.78% | 79.75% |
| Zimbabwe | 19,253 | 8.15% | 12,272 | 1.90% | 76.69% |
| Egypt | 52,061 | 7.60% | 189,551 | 2.72% | 64.21% |
| Turkey | 234,372 | 7.32% | 185,453 | 1.56% | 78.69% |
| Philippines | 94,536 | 6.95% | 81,734 | 2.09% | 69.93% |
| Ghana | 44,913 | 6.71% | 24,535 | 1.09% | 83.76% |
| Belarus | 28,211 | 6.68% | 9,250 | 0.73% | 89.07% |
| Kenya | 73,939 | 6.39% | 48,674 | 1.31% | 79.50% |
| Nepal | 38,569 | 6.00% | 9,477 | 0.36% | 94.00% |
| Bulgaria | 27,659 | 5.96% | 5,952 | 0.36% | 93.96% |
| Malawi | 15,501 | 5.85% | 8,170 | 2.03% | 65.30% |
| Jordan | 13,419 | 5.73% | 9,279 | 0.74% | 87.09% |
| Indonesia | 119,720 | 5.40% | 63,831 | 0.98% | 81.85% |
| Ukraine | 86,505 | 5.35% | 66,016 | 0.62% | 88.41% |
| Republic of Korea | 83,370 | 5.33% | 42,123 | 0.98% | 81.61% |
| Saudi Arabia | 79,834 | 5.21% | 108,438 | 1.54% | 70.44% |
| Mean reduction | | | | | 76.97% |

 

This shows even more significant reductions in TLS1.0 usage in some countries, with a mean reduction of ~77%.

Some interesting observations from these data:

  • Hungary has both the largest reduction (99.24%) and the lowest percentage (0.15%) usage of TLS1.0
  • Algeria saw the smallest reduction in TLS1.0 usage, at 37.65%
  • China has the highest percentage usage of TLS1.0 at 19.79%

Let’s update our view for the UK and USA against the 2018 data:

| Country | Num requests (Nov. 2018) | % TLS 1.0 (Nov. 2018) | Num requests (Feb. 2020) | % TLS 1.0 (Feb. 2020) | % reduction |
| --- | --- | --- | --- | --- | --- |
| Great Britain | 23,778,043 | 1.43% | 9,288,530 | 0.71% | 51% |
| USA | 2,373,620 | 1.47% | 1,557,219 | 0.40% | 72% |

This is interesting in its own right: both the UK and USA show smaller reductions than the mean from the “2018 worst offenders” list above (albeit only a little smaller for the USA). This is perhaps because the UK and USA have a smaller base of real users on TLS1.0, with more of the usage being “is the internet working” checks running on old platforms, corporate proxies and the like (we seem to be used for lots of these sorts of tests, which is hopefully a compliment!).

It’s worth updating the list of countries which have the largest percentage usage of TLS1.0 — the list above was the “worst of” 2018. Here are the top 10 countries with the highest percentage of TLS1.0 usage in Feb. 2020:

| Country | Number of requests | Percentage of TLS 1.0 usage |
| --- | --- | --- |
| United States Minor Outlying Islands | 389,725,509 | 100.00% |
| Antarctica | 4,979,351 | 100.00% |
| Kosovo | 276,524 | 100.00% |
| Niue | 12,758,637 | 100.00% |
| American Samoa | 5,063,507 | 100.00% |
| Christmas Island | 1,633,591 | 100.00% |
| Mayotte | 8,590,803 | 100.00% |
| Svalbard and Jan Mayen | 998,549 | 99.99% |
| Pitcairn Islands | 425,550 | 99.98% |
| Tuvalu | 5,770,681 | 99.98% |

Yikes, lots of countries with 100% (rounded to 2 DP) TLS1.0 usage. It seems that most of these countries are relatively small (in comparison to the “worst offenders” in 2018) so maybe the above is the result of one or a few legacy systems in each country/territory.

Clients

As in 2018, it’s useful to know what is making all these TLS1.0 requests. The table below is slightly improved over the 2018 data (please see the original post for info). These data are global and show the top 10 by “Operating system” and “User Agent” fields which are parsed from the User Agent request header as a normalisation stage:

| Operating system | User Agent | Percentage of TLS 1.0 usage |
| --- | --- | --- |
| Unknown | Unknown | 38.02% |
| Android 4.2.2 | Android Browser 4 | 2.54% |
| Windows 7 | IE 7 | 2.30% |
| Android 4.4.4 | Unknown | 2.03% |
| Windows 7 | IE 9 | 2.02% |
| Android 4.4.2 | Android Browser 4 | 1.97% |
| Android 2.3.6 | Android Browser 4 | 1.93% |
| Mac OS 10.11.6 | Chrome 53 | 1.85% |
| Windows 8 | Firefox 16 | 1.80% |
| Unknown | WebKit 533 | 1.77% |

“Unknown” means that the parser library doesn’t know what the Operating System / User Agent is — either because it’s uncommon or ancient! What we see here are very outdated Operating Systems and User Agents — essentially these seem to be combinations of:

  • Old Operating Systems with old TLS stacks and User Agents which use the Operating System TLS stack
  • Old User Agents with old TLS stacks which don’t use the (sometimes more modern) Operating System TLS stack

The top 10 User Agents whose Operating system and User Agent are both unknown are:

  • Nokia6280/2.0 (03.60) Profile/MIDP-2.0 Configuration/CLDC-1.1
  • CITRIXRECEIVER
  • <empty>
  • Mozilla/5.0 (compatible; Genieo/1.0 http://www.genieo.com/webfilter.html)
  • SGOS/6.7.3.9 (S400–30; Proxy Edition)
  • Mozilla/5.0 (compatible; PRTG Network Monitor (www.paessler.com); Windows)
  • Dorado WAP-Browser/1.0.0
  • Mozilla/4.0 (ISA Server Connectivity Check)
  • ProxySG Appliance
  • WinampMPEG/2.00

So yep, as expected: generally ancient User Agents and the usual suspects. Most notably, we now appear to have fewer “real” (as in “used by people”) User Agents negotiating TLS1.0, leaving a higher proportion of TLS1.0 usage from what appear to be automated systems. This makes sense if you consider the changes in Operating Systems over the 15-month span between my two datasets — Windows 10, for instance, has gone from around 38% to 57% (desktop) market share (largely replacing Windows 7) and brings with it a much more modern TLS stack. Similarly, many users will have upgraded mobile phones, tablets and other devices.
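If you want to do a similar normalisation of raw User-Agent strings yourself, the open-source ua-parser ecosystem is one option. Here is a rough sketch using the third-party user-agents Python package, which is not necessarily what we use internally:

```python
# Rough sketch using the third-party "user-agents" package (pip install user-agents).
# One way to derive "Operating system" and "User Agent" fields from a raw
# User-Agent header; not necessarily the library we use in our pipeline.
from user_agents import parse

raw = "Mozilla/5.0 (Windows NT 6.1; rv:26.0) Gecko/20100101 Firefox/26.0"
ua = parse(raw)
os_field = f"{ua.os.family} {ua.os.version_string}".strip()             # e.g. "Windows 7"
browser_field = f"{ua.browser.family} {ua.browser.version_string}".strip()  # e.g. "Firefox 26.0"
print(os_field or "Unknown", "/", browser_field or "Unknown")
```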

Conclusion

TLS1.0 has seen a significant reduction in usage (around 77% for our audiences) over the 15 months since I wrote the original blog post, but usage of TLS1.0 in some geographies remains stubbornly high. The trend is clear, though: TLS1.0 usage is absolutely on the wane and whilst the long tail of this usage will undoubtedly drag on for years, usage patterns are moving in the right direction (at least in our audience).

We operate with a single edge configuration (in terms of TLS) around the world, so we need to decide when the right time is to remove TLS1.0 (and 1.1) support, balancing the security risks against the hard cut-off for users. Something we have put thought into is a mechanism for warning our audience about breaking changes like this. We’re not there yet, but I’d like us to have a deprecation process which informs the end user and, ideally, shows them a workable upgrade path so they can continue to use our services if they so choose.

Let me know if you have questions or would like more detail on anything shown here and I’ll do my best to get you the information; please leave a comment below or get in touch.

Leveraging the Tor Network to circumvent blocking of News content | Wed, 30 Oct 2019 08:18:33 +0000 | Abdallah al-Salmi

News and Tor logos

The World Service's news content became available on the Tor network last week in a move that attracted wide media attention.

The decision to go ahead with setting this service up came at a time when BBC News is either blocked or restricted in several parts of the world.

For example, in Egypt, Iran and China, our audiences are finding it either impossible or difficult to access our content without the use of a circumvention tool, such as a VPN.

The Tor network is an overlay network on the internet, which provides increased security and is resistant to blocking.

The BBC is not the first leading organisation to have a direct presence on the Tor network; Facebook's implementation of its social media platform on the network was built by Facebook engineer Alec Muffett, who later left Facebook and subsequently assisted other organisations with similar projects.

As a result of his experiences, Alec created the Enterprise Onion Toolkit (EOTK), which makes it easier for any organisation to set themselves up on the Tor Network.

With help from the BBC Online Technology Group, Alec prototyped a solution based on the EOTK for the World Service. The BBC has an unusually complex domain name configuration, and the prototype proved that the EOTK could handle this complexity well.

The implementation for the BBC was carried out by the Open Technology Fund (OTF), and Alec continues to be a key contributor. The OTF is one of the leading internet freedom organisations in the world, which has found prominence through funding and vetting numerous information security and internet freedom projects.

Why an Onion service?

From a technical standpoint, the Tor network is a subset of the internet we know and use every day, and is accessed by users using a modified browser. The key feature of the Tor network is that it is fully encrypted. That’s to say, it hides the location of users, and the protocol it uses is continuously updated to maintain resistance to blocking.

This explains why it is a strong solution to the problem of internet censorship and secure communications, and why it is valued by people who live in muzzled media environments.

Users can already access the BBC's websites through the Tor Browser to circumvent blocking. The user’s connection enters the Tor Network in one country, runs through at least three servers, then exits the Tor Network to the website from another country. While successful in circumventing blocking, this route is exposed to censors who might monitor activity on the last exit server, which is unencrypted, or even tamper with it.

An alternative, called an Onion service, uses the Tor Network’s own address scheme where domain names end with “.onion”. In this case, traffic is directed to a dedicated node on the Tor Network for that service.

This allows the traffic between the Tor Network and the content provider to travel a trustworthy path. This also removes the risk associated with exit nodes.

An additional benefit is that the routing within the Tor Network is simplified when using an Onion service. This provides a much higher performance, which is especially noticeable when watching video.

The Tor Browser is available for Windows, Mac and Linux computers, as well as Android phones. Alternative browsers, such as Brave or the Onion Browser (for iPhones), can also be used. These browsers can be used for both .onion and classic URLs.

The homepage, with a URL accessed through the Tor network.

Is there different content on the Tor network?

The content on the Tor network is not different from that which is accessible to our international audiences under normal conditions.

The experience is similar to being in Ireland or on the East Coast of the USA, for example. Users will be able to access World Service radio, TV and websites in over 40 languages, as well as the news in English.

Content which is not available internationally, such as iPlayer, will continue to be unavailable on the Onion service. Users within the UK appear as international users when they use the Onion service.

Technical risks

One aspect of setting up an Onion service for the BBC was the question of whether technical assets would be placed on the Tor Network, or whether the Onion service would need to be technically trusted by the BBC in any way.

Onion services are HTTPS-based and therefore require their own server certificates, and the certificate for the Onion domain is separate from other certificates. This allows users to trust that they are actually reaching BBC content.

The Onion service has to rewrite all of the URLs in order to make the site work inside the Tor Network. It is therefore essential that the Onion service is operated securely and by a trusted team.

The work done by the EOTK platform does not involve placing any assets on the Tor network itself. Neither does it need to be provisioned with any passwords or certificates to access BBC systems. To the BBC, it appears like a normal group of international users.

Content on the Tor network is therefore proxied through the Onion service and there is no additional web hosting commitment.

The BBC's duty of care

Some countries, such as Russia, China and the UAE, have passed laws to regulate the sale and distribution of tools such as VPNs.

For example, the UAE prohibits the use of VPNs to access illegal content. However, BBC content is not illegal in the UAE.

The promotion of the Onion site by the different BBC services will include clear warnings that users should be aware of their legal environment and should not use it if it might put them, or those close to them, at any risk or in danger.

Information controls then and now

Controls placed by governments on access to information and trusted news are not new at all.

During the Cold War, some governments used to jam the shortwave radio broadcasts of the World Service to stop their populations from listening to the BBC. The BBC circumvented these measures by providing new frequencies or changing frequency values to confuse jammers.

These controls are now moving on to the internet. At a time when online information controls are growing, the World Service continues to pursue its mission by providing an additional online news presence on the Tor Network.

BBC News on the Tor network can be accessed via the Onion service address (link updated October 2021)

Navigating the data ecosystem technology landscape | Tue, 03 Sep 2019 12:46:36 +0000 | Hannes Ricklefs, Max Leonard

Credit: Jasmine Cox

Want to message your Facebook friends on Twitter? Move your purchased music from iTunes to Amazon? Get Netflix recommendations based on your iPlayer history? Well, currently you can’t.

Many organisations are built on data, but the vast majority of the leading players in this market are structured as vertically integrated walled gardens, with few (if any) meaningful interfaces to any outside services. There are a great number of reasons for this, but regardless of whether they are intentional or technological happenstance (or a mixture of both), there is a rapidly growing movement of GDPR-supercharged technologists who are putting forward decentralised and open alternatives to the household names of today. For the BBC in particular, these new ways of approaching data are well aligned with our public service ethos and commitment to treating data in the most ethical way possible.

Refining how the BBC uses data, both personal and public, is critical if we are to create a truly personalised BBC in the near term, and essential if we want to remain relevant in the coming decades. Our Chief Technology and Product Officer Matthew Postgate recently spoke about the BBC’s role within data-led services, in which he outlined some of the work we have been doing in this respect to ensure the BBC and other public service organisations are not absent from new and emerging data economies.

Alongside focused technical research projects like the BBC Box, we have been mapping the emerging players, technologies and data ecosystems to further inform the BBC’s potential role in this emerging landscape. Our view is that such an ecosystem is made up of the following core capabilities: identity, data management (storage, access and processing), data semantics, and the developer experience, which are currently handled wholesale in traditional vertical services. A first step for us is hence to ascertain which of these core capabilities can realistically be deployed in a federated, decentralised future, and which implementations currently exist to practically facilitate this.

Identity, a crucial component of the data ecosystem, proves that users are who they say they are, providing a true digital identity. Furthermore, we expect standard account features, such as authentication and sharing options via unique access tokens that could enable users to get insights or to share data, to be part of any offering. We found that identity, in the context of proving a user’s identity, was not provided by any of the solutions we investigated. Standard account features were present, ranging from platform-specific implementations, to decentralised identifier approaches via WebID, to blockchain-based distributed ledger approaches. As we strongly believe it is important to prove a user is who they say they are, at this point we would look to integrate solutions that specialise in this domain.

Data management can be further broken down into 3 areas:

  1. Data usage and access involves providing integration of data sources with an associated permission and authorisation model. Users should have complete governance of their data and its usage by data services. Strong data security controls and progressive disclosure of data are key here. Given our investigation is based around personal data stores (PDS) and time-series sensor/IoT device data platforms to capture personal, public and open data, providing access and controls around sharing of data was a fundamental capability of all offerings. All of them provided significant granularity and transparency to users about what data is being stored, its source and its usage by external services.
  2. Data storage must provide high protection guarantees of users’ data, encrypted in transit and at rest, giving users complete control and transparency of data lifecycle management. Again, this is a fundamental requirement, such that storage is either a core offering of any platform or outsourced to external services that store data in strongly encrypted formats.
  3. Data processing mechanisms allow users to bring “algorithms” to their data, combined with a strong contract-based exchange of data. Users are in control and understand what insights algorithms and services derive from their data. These might include aspects such as the creation of reports, the creation and execution of machine learning models, and other capabilities that reinforce the user’s control over how their personal data is used to generate insights. Through contract- and authorisation-based approaches, users have complete audit trails of any processing performed, which provides transparency over how data is utilised by services, whilst continuously being able to detect suspicious or unauthorised data access. Our investigations found that processing of data is either supported through SDKs that heavily specify the workflow for data processing, or not provisioned at all, leaving developers to create their own solution.

Data model and semantics refers to the mechanisms that describe (schemas, ontologies) and maintain the data domains inside the ecosystem, which is essential for extensibility and interoperability. Our investigations found this being approached across a wide spectrum:

  1. no provision at all, requiring developers to come to their own conclusions about the best way to proceed
  2. using open standards such as schema.org and modelling data around linked data and RDF
  3. completely proprietary definitions around schemas within the system.

Finally the developer experience is key. It requires a set of software development tools to enable engineers to develop features and experiences as well as being able to implement unique value propositions required by services. This is the strongest and most consistent area across all our findings.

In summary our investigations have shown that there is no one solution that provides all of our identified and required capabilities. Crucially the majority of the explored end user solutions are still commercially orientated, such that they either make money from subscribers or through associated services.

So, with the number of start-ups, software projects and standards that meet these capabilities snowballing, where might the BBC fit into this increasingly crowded new world?

We believe that the BBC has a role to play in all of these capabilities, and that doing so would enhance our existing public service offering: to inform, educate and entertain. A healthy ecosystem requires multiple tenants and solution providers, all adhering to core values such as transparency, interoperability and extensibility. Only then will users be able to freely and independently move or share their data between providers, which would enable purposeful collaboration and fair competition towards delivering value to audiences, society and industry.

The BBC was incorporated at the dawn of the radio era to counteract the unbridled free-for-all that often comes with any disruptive technology, and its remit to shape standards and practices for the good of the UK and its population stands today. With a scale, reach and purpose that is unique to the BBC, it is strongly congruent with our public service duty to help drive policy, standards and access rights to ensure that the riches on offer in these new ecosystems are not co-opted solely for the downward pursuit of profit, and remain accessible for the benefit of all.

Looking at the BBC's role in data-led services | Wed, 19 Jun 2019 08:20:47 +0000 | Matthew Postgate

It’s been a busy time in my team over the last few months – with updates to Sounds and iPlayer, 5G trials in Orkney (and in London), UHD trials for the FA Cup, Doctor Who launching in Virtual Reality and those teams behind the scenes that keep our broadcast and online services going day-after-day.

But one area that keeps on coming up when I’m out and about speaking at conferences, or at meetings with our partners - is the question of data – how we use it, how we share it and its potential to help us understand the world around us.

Not a week goes by without stories about data. There are negative stories, about data being used to target you with specific messages or sell you more, or leaks of personal data to third-parties. But there are also positive stories, like using big data to help reduce carbon emissions or helping the justice system work better.

This has made me think about the BBC’s role in this new ‘data economy’ – and what that should be.

How we use your data

At the BBC, we use data to make what we provide you, our viewers, listeners and readers, even better. It helps us tailor our products and services to be more about you – recommending programmes or content we think you might like, or alerting you to the fact your favourite sports team has just scored (or lost a match). We also use it to ensure we’re making something for all audiences – and to help find gaps when we commission programmes and services.

But is there more that we could be doing to ensure data is used for good – that the data you give organisations is not just used for commercial gain but is used in a way that helps you and potentially your wider community? We think, potentially, yes.

That’s why we’ve started to work with teams here at the BBC, and with other partners, on specific projects to help us identify what public service value we can bring to these new markets driven by data.

To be clear – we’re experimenting at this stage, and we will learn what works, what people might like and what areas we think the BBC can help with, as we go along. We’re particularly interested in learning about how organisations can share data to get new insights and how people can safely move their data around. And we know that when it comes to data, people are rightly concerned about privacy, safety and security. That’s why these trials will start small and controlled, so participants will have signed up knowing clearly what data is being used, why, and how.

So what have we been up to?

Late last year, the DCMS published work which looked at the potential of personal data portability to stimulate innovation and competition in the UK. It found that the ability to safely and securely move personal data around could unlock huge economic and societal gains, but that there are big practical issues (both in the way organisations share data and in how consumers use it) to resolve first.

Following this (and with DCMS, the ICO and the CDEI as observers), we’re involved in two controlled trials of data sharing by 25 individuals. These trials test how it could practically be possible to put a person in control of the data they share about themselves with other companies, and what concerns this brings up.

The first trial is cross-sector, with the participants signing up to share data from a range of commercial companies – as well as the BBC. You can find out more about that trial.

The second looks at bringing together data from media providers into a data store (or what we’re calling internally a Box) to improve people’s experiences when watching or listening to programmes. Bill has blogged about this here.

What’s next?

Over the coming months, we’ll continue looking at this area – with more experiments and closed trials.

We’ll be sharing more about what we learn – and look at what value the BBC can bring you – ensuring this market develops in a way that maximises the huge potential benefits of data and shares them as widely as possible.

I'll be in touch.

Around the world with TLS 1.0 (Part 1) | Fri, 30 Nov 2018 11:36:00 +0000 | Neil Craig

Recently, TLS 1.3 (RFC 8446) was published and soon afterwards, Apple, Google, Microsoft and Mozilla coordinated their announcements that they intend to remove TLS 1.0 and TLS 1.1 from new versions of their primary web browsers.

Removing TLS 1.0 and TLS 1.1 in newer web browsers is a good step forward, which I hope will drive up the number of websites and services offering TLS 1.2 and TLS 1.3.

Some of the above announcements provided statistics on TLS 1.0 and TLS 1.1 usage in modern browsers, since it’ll be modern browsers from which TLS 1.0 and TLS 1.1 are removed. The numbers I saw stated in the announcements (TLS 1.0 at 1.1% and TLS 1.1 at 0.1% usage) looked much lower than some I had seen in our per-geography data — likely because their data is globally aggregated.

“Best check our data to see how it’s looking, eh?” I thought…so I did.

Methodology

I put in some work earlier this year to make it easier to use the HTTP access log data from traffic management services. We now have an automated ingestion pipeline which takes the access logs from their existing AWS S3 storage buckets, verifies, parses, enriches and transforms them before loading them into Google BigQuery (in a GDPR-compliant manner, of course). The net result is that we can now perform SQL queries across all of our traffic management layer’s access logs in a very short timeframe. This has been a game-changer in my opinion, we’re using the data to discover all sorts of things we never knew about usage of our services.
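As an example of the sort of query this pipeline enables, here is a hedged sketch using the BigQuery Python client; the project, dataset, table and column names are hypothetical rather than our real schema:

```python
# Hedged sketch: hypothetical table/column names, not our real schema.
# Counts HTTPS requests per negotiated TLS version and country in BigQuery.
from google.cloud import bigquery

client = bigquery.Client()
sql = """
    SELECT country, tls_version, COUNT(*) AS requests
    FROM `example-project.access_logs.edge_requests`   -- hypothetical table
    WHERE scheme = 'https'
      AND request_date BETWEEN '2018-11-10' AND '2018-11-13'
    GROUP BY country, tls_version
    ORDER BY requests DESC
"""
for row in client.query(sql).result():
    print(row.country, row.tls_version, row.requests)
```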

The data I used for this particular study shows HTTPS (only, not HTTP) requests to www.bbc.co.uk and www.bbc.com from November 10th-13th 2018 — a total of just over 2 billion requests from 250 countries (including country: “unknown”).

Global view

First of all, I looked at our “global view” of TLS usage. This covers TLS usage on www.bbc.co.uk and www.bbc.com from every country we served:

| TLS Version | Number of requests | Percentage |
| ----------- | ------------------ | ---------- |
| TLSv1.2     | 2,002,516,373      | 97.96%     |
| TLSv1.1     | 4,529,764          | 0.22%      |
| TLSv1.0     | 37,160,210         | 1.82%      |

So whilst our global aggregate view of TLS usage differs a little from e.g. the Firefox metrics, it’s not vastly different.

Per-Country view

As I mentioned earlier, the main purpose of this study was to look at how TLS usage varies by geography, as a contrast to the global view for our audience. My query counted the number of HTTPS requests and grouped them by the negotiated TLS version and also by the country (using the IANA name) from which the request originated. I then filtered out countries with less than 10,000 requests as they’re probably less reliable, statistically. Since the result set is pretty large, I then filtered to only include countries which have greater than 5% of TLS 1.0 usage. The results are as follows (ordered from highest to lowest percentage of TLS 1.0 usage):

| Country | Number of requests | Percentage of TLS 1.0 usage |
| --- | --- | --- |
| Bosnia and Herzegovina | 35,031 | 100.00% |
| China | 2,261,506 | 86.93% |
| Montenegro | 28,712 | 48.74% |
| Croatia | 113,948 | 43.75% |
| Uganda | 150,225 | 34.48% |
| Honduras | 97,644 | 29.55% |
| Ethiopia | 180,473 | 26.04% |
| Democratic Republic of the Congo | 12,775 | 25.67% |
| Nigeria | 1,224,923 | 25.13% |
| Cote d'Ivoire | 14,717 | 23.68% |
| Myanmar | 164,751 | 21.25% |
| Hungary | 175,327 | 20.20% |
| Cameroon | 11,618 | 15.02% |
| Tanzania | 76,469 | 14.93% |
| Somalia | 189,509 | 12.98% |
| Sudan | 16,273 | 12.93% |
| Mozambique | 10,348 | 12.39% |
| Taiwan | 195,132 | 11.01% |
| Zambia | 29,070 | 10.41% |
| Morocco | 32,932 | 10.04% |
| Uzbekistan | 17,135 | 9.38% |
| Japan | 489,215 | 9.15% |
| Hong Kong | 426,542 | 8.97% |
| Algeria | 24,760 | 8.97% |
| Romania | 62,019 | 8.79% |
| Zimbabwe | 19,253 | 8.15% |
| Egypt | 52,061 | 7.60% |
| Turkey | 234,372 | 7.32% |
| Philippines | 94,536 | 6.95% |
| Ghana | 44,913 | 6.71% |
| Belarus | 28,211 | 6.68% |
| Kenya | 73,939 | 6.39% |
| Nepal | 38,569 | 6.00% |
| Bulgaria | 27,659 | 5.96% |
| Malawi | 15,501 | 5.85% |
| Jordan | 13,419 | 5.73% |
| Indonesia | 119,720 | 5.40% |
| Ukraine | 86,505 | 5.35% |
| Republic of Korea | 83,370 | 5.33% |
| Saudi Arabia | 79,834 | 5.21% |

 

It’s pretty clear that there are very significant differences across the world in TLS 1.0 usage from country to country. We’ll dig into this in a little bit more detail in a moment but I should just mention for now that the data from China might be inaccurate as (to the best of my knowledge), www.bbc.co.uk and www.bbc.com are currently blocked in China (following our migration to HTTPS) so this could well be proxied/VPN’d traffic rather than traffic direct from users.

It’s interesting to make a comparison with the two countries which make up our largest user-base by request count:

| Country | Number of requests | Percentage of TLS 1.0 usage |
| --- | --- | --- |
| Great Britain | 23,778,043 | 1.43% |
| USA | 2,373,620 | 1.47% |

These data show what you’d probably guess: they’re similar to each other and just a little bit below the global values.

Clients

The next most obvious question is perhaps “what is making all these TLS 1.0 requests?”. The global most popular 20 (from over 90,000) user agents are:

| User Agent | Browser/OS | Number of requests |
| --- | --- | --- |
| Mozilla/5.0 (Windows NT 6.1; rv:26.0) Gecko/20100101 Firefox/26.0 | Firefox 26 / Windows 7 | 2,243,786 |
| CITRIXRECEIVER | Citrix receiver | 1,762,744 |
| Nokia6280/2.0 (03.60) Profile/MIDP-2.0 Configuration/CLDC-1.1 | Nokia model 6280 | 1,054,119 |
| HTTPClient/3.4.0 (Linux; Android 4.0.3; KFTT Build/IML74K) | HTTPClient / Android 4 | 1,045,245 |
| Mozilla/5.0 (compatible; Genieo/1.0 ) | Firefox / Genio search addon | 892,453 |
| SGOS/6.7.3.9 (S400-30; Proxy Edition) | Symantec SGOS (proxy) | 479,035 |
| Dorado WAP-Browser/1.0.0 | Dorado | 468,775 |
| Mozilla/4.0 (ISA Server Connectivity Check) | Microsoft ISA server (proxy) | 453,251 |
| Mozilla/6.0 (Windows NT 6.2; WOW64; rv:16.0.1) Gecko/20121011 Firefox/16.0.1 | Firefox 16 / Windows 8 | 439,674 |
| Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.78.2 (KHTML, like Gecko) Version/6.1.6 Safari/537.78.2 | Safari 6 / OSX 10.7 | 387,037 |
| HTTPClient/4.0.0 (Linux; Android 4.4.4; SM-T560 Build/KTU84P.T560XXU0APL1) | HTTPClient / Android 4 | 369,724 |
| Mozilla/5.0 | Possibly Bluecoat (proxy) | 342,443 |
| Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/534.59.10 (KHTML, like Gecko) Version/5.1.9 Safari/534.59.10 | Safari 5 / OSX 10.6 | 302,765 |
| Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0) | IE 9 / Windows 7 | 291,046 |
| Mozilla/4.0 | Possibly Bluecoat (proxy) | 247,455 |
| ProxySG Appliance | Symantec SGOS (proxy) | 246,598 |
| Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; GTB7.5; EasyBits GO v1.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; yie8) | Yahoo IE 7 / Windows XP | 242,294 |
| HTTPClient/4.0.0 (Linux; Android 4.4.2; SM-T310 Build/KOT49H.T310XXSBQB4) | HTTPClient / Android 4 | 218,775 |
| MediaCAT/4.5.1 | ? | 216,015 |
| HTTPClient/3.4.0 (Linux; Android 4.3; GT-I9300 Build/JSS15J.I9300XXUGMK6) | HTTPClient / Android 4 | 213,459 |

So we can see that there are some old desktop web browsers, some feature phones, some proxies and some HTTP libraries, mostly running on older Android versions (mainly Android 2 and 4). Further down the list there are lots more HTTP libraries and web browsers running on Android 2 and 4. We can compare this global view with the 20 most popular user agents from Bosnia and Herzegovina:

| User Agent | Browser/OS | Number of requests |
| --- | --- | --- |
| Mozilla/6.0 (Windows NT 6.2; WOW64; rv:16.0.1) Gecko/20121011 Firefox/16.0.1 | Firefox 16 / Windows 8 | 18,137 |
| Mozilla/5.0 (compatible; Genieo/1.0 ) | Firefox / Genio search addon | 1,344 |
| Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US) | IE 9 / Windows ? | 1,251 |
| Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.15 (KHTML, like Gecko) Chrome/24.0.1295.0 Safari/537.15 | Chrome 24 / Windows 8 | 1,240 |
| Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0) | IE 10 / Windows 7 | 1,229 |
| Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; Media Center PC 6.0; InfoPath.3; MS-RTC LM 8; Zune 4.7) | IE 9 / Windows 7 | 1,228 |
| Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1309.0 Safari/537.17 | Chrome 24 / OSX 10.8 | 1,213 |
| Mozilla/5.0 (Windows NT 6.2; Win64; x64; rv:16.0.1) Gecko/20121011 Firefox/16.0.1 | Firefox 16 / Windows 8 | 1,213 |
| Mozilla/5.0 (Windows NT 6.1; rv:15.0) Gecko/20120716 Firefox/15.0a2 | Firefox 15 / Windows 7 | 1,194 |
| Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.14 (KHTML, like Gecko) Chrome/24.0.1292.0 Safari/537.14 | Chrome 24 / Windows 8 | 1,174 |
| Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 7.1; Trident/5.0) | IE 9 / Windows ? | 1,171 |
| Mozilla/5.0 (Windows NT 6.2; WOW64; rv:16.0.1) Gecko/20121011 Firefox/16.0.1 | Firefox 16 / Windows 8 | 1,168 |
| Mozilla/5.0 (Windows; U; MSIE 9.0; WIndows NT 9.0; en-US)) | IE 10 / Windows ? | 1,156 |
| Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0) | IE 10 / Windows 7 | 1,143 |
| HTTPClient/3.4.0 (Linux; Android 4.1.2; LG-E440 Build/JZO54K) | HTTPClient / Android 4 | 164 |
| Mozilla/5.0 (Linux; U; Android 4.1.2; fr-fr; LG-E610 Build/JZO54K) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30 | Webkit ? / Android 4 | 132 |
| Mozilla/5.0 (Linux; U; Android 4.2.2; hr-hr; TPC-71203G Build/JDQ39) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30 | Webkit ? / Android 4 | 125 |
| HTTPClient/3.4.0 (Linux; Android 4.4.4; E2105 Build/24.0.A.5.14) | HTTPClient / Android 4 | 98 |
| Mozilla/5.0 (Macintosh; Intel Mac OS X 10_5_8) AppleWebKit/534.50.2 (KHTML, like Gecko) Version/5.0.6 Safari/533.22.3 | Safari 5 / OSX 10.5 | 54 |
| Mozilla/5.0 (Linux; U; Android 4.3; en-; SGH-T999 Build/JSS15J) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30 | Webkit ? / Android 4 | 46 |

Here we see fewer HTTP libraries and no feature phones or proxies, but a greater proliferation of old desktop web browsers, notably lots of Chrome 24 (2013) and Firefox 15 (2012) & 16 (2012). There’s lots of old Android (especially v4, ~2013) in both result sets. Of course, the User-Agent HTTP header is completely spoofable, so there may be some inaccuracies.

The concentration of client release years is interesting though - I wonder why 2012 and 2013 are so common? It doesn’t seem to be tied directly to a TLS version change, since TLS 1.0 arrived in 1999, TLS 1.1 in 2006 and TLS 1.2 in 2008 (though it was revised in 2011). Answers on a postcard (or in a comment), please!

Is there anything we can do to reduce the TLS 1.0 usage?

The short answer, sadly, is “not really”. The longer answer involves waiting for the natural reduction in older Android versions as the devices running those OS’s fail and are replaced, hopefully with something which supports better crypto! The complication to this is in geographies which are not so wasteful as most “western” economies. In India, for example, older devices are much more frequently repaired than in the “west”, often by local repair agents whose skill and ingenuity can keep devices running for much longer than they do elsewhere.

What about TLS 1.1?

As it is for the rest of the industry, our TLS 1.1 usage is much, much lower than TLS 1.0 and TLS 1.2. This is typically because most user agents/Clients which support TLS 1.1 also support TLS 1.2, so TLS 1.1 doesn’t get a big slice of the action. Our data shows no countries with over 10,000 requests in the 3 days of data which also have TLS 1.1 usage above 1%.

Recommendations

Whichever metric(s) you’re looking at, ensure that you don’t just look at the global/overall aggregated numbers, which often mask large regional/subset variations. The constituent communities of your audience often differ significantly, so it’s really important to understand how that affects your data and therefore your decision-making process.

The same goes for percentages versus absolute numbers — for example, 0.2% of a large number of users is, in absolute numbers, still a lot of users. Don’t discount seemingly small fractions of a large user base without checking how many people that fraction represents!
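To make that concrete with round numbers from this post:

```python
# 0.2% of ~2 billion HTTPS requests over 3 days is still a lot of requests.
total_requests = 2_000_000_000
small_fraction = 0.002
print(f"{int(total_requests * small_fraction):,}")  # 4,000,000
```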

P.S. Thanks to my children, Polly and Stanley, for the illustrations. I couldn’t find any suitable pictures so they drew some for me.

Personalisation: Is there a price for convenience? | Fri, 19 Oct 2018 12:52:34 +0000 | Sinead O'Brien

Sinead O'Brien, Technology Strategy & Architecture's Lead Project Manager for Transformation Delivery, shares her insights from this month's Machine Learning Fireside Chat.

As more and more of our intimate data is collected, is there a price of convenience? If so, what is it, and is it worth paying? The decidedly thought-provoking discussion at last week’s sold-out ‘Machine Learning Fireside Chats presents: The Price of Convenience’ was hosted by Ahmed Razek of the BBC Blue Room.

The provocation…
There is an increasingly fine line between personalised services and invasive services. Do people understand that they’re trading their personal data for these services? Are they aware of the risks? Do they care?

On the panel…
Maxine Mackintosh, a PhD student whose research involves mining medical records for new predictors of dementia. She is passionate about understanding how we might make better use of routinely collected data to improve our cognitive health.

Also on the stellar line-up was Josh Cowls, Research Associate in Data Ethics at The Alan Turing Institute, and a doctoral researcher at the Digital Ethics Lab, Oxford Internet Institute. Josh's research agenda centres on decision-making in the digital era, with a particular focus on the social and ethical impact of big data and AI and its intersection with public opinion and policy-making.

The third guest speaker for the evening, Martin Goodson, is a Chief Scientist and CEO, and a specialist in natural language processing, the computational understanding of human language.

The discourse…

Maxine kicked off the conversation with a rather hard-hitting statement, that we misunderstand what “health data” really means. When we discuss health data, it is presumed that we refer to data that is collected when we interact with the health system – our medical records. Incorrect. That is “sickness data”. Health data refers to search data, the information captured when we Google, for example travel, which indicates how healthy we are.

Maxine is a member of an independent review panel, which looks to build trust through radical transparency. She argued that we cannot expect the NHS or academia to afford the computational power required to get things right; therefore we have to work alongside the large corporations. Corporates can play an innovating role, but they (and not just DeepMind) should not be enabled to profit from our data. We, the citizens, own the data. The government has a regulatory role to play in protecting society.

Martin spoke further to the tensions between privacy and innovation. If we are too private with our data, there will be less innovation. He argued that the privileged in society are more likely to benefit from AI in terms of convenience, and that data needs to work for people and for society. Misuses of machine-learning-based systems that have led to cruel justice outcomes were pointed to as an example of the negative impact on the less privileged in society.

The panel then moved on to the topic of ethics, and the sudden surge of interest in the ethical perspective. Ahmed asked if there is a risk of “ethics-washing”: using an ethical defence to side-step issues such as privacy, autonomy and agency. The general consensus amongst the panel was that the UK is in a good place to be setting the agenda; Europe has a long tradition of establishing human liberties. But we need to be ethical and enable innovation at the same time.

The panel argued that unless citizens are personally affected by data breaches, they don’t really understand the repercussions. The public perspective depends as much on when and how you ask as on whom you ask. We don’t need to teach kids to code; we need to teach young people to think about how coding affects them and why control of their data may be important.

Maxine highlighted that NHS users are automatically opted in to their depersonalised confidential patient information being used for research and planning by the NHS, as well as by commercial and academic partners. NHS data isn’t perfect, but it does have scale, and there are huge benefits for population-level research. Health data was likened to taxes: a societal contract. Informed decision-making is important. I am happy to share my data in this scenario. Would you opt out of giving your health data?

The discussion closed with a last thought-provoking question: "Can we put data solely in the hands of non-profits?". The panel argued that our health and justice systems need to be able to engage with organisations commercially. And sufficient profit is needed to run these organisations. The panel concluded that we need to define what we mean by “reasonable profit” in this sense.

For more details about upcoming events, visit .

News on HTTPS Mon, 09 Jul 2018 12:14:00 +0000 /blogs/internet/entries/b0807897-7c07-44eb-8d5f-3b2d081a3951 /blogs/internet/entries/b0807897-7c07-44eb-8d5f-3b2d081a3951 James Donohue James Donohue

A few weeks ago the News website finished transitioning to HTTPS. The green padlock you should now see next to the web address is probably the biggest publicly visible technical change to the site since it relocated from news.bbc.co.uk in 2011. Even so, a question we’re often asked is “why did it take so long?”

Before answering that, it’s worth remembering why HTTPS (or more accurately, TLS) has come to be seen as a must-have feature for all web applications. In the early days, secure technologies such as SSL were largely the preserve of e-commerce websites. The padlock assured the user of both the site’s authenticity and the encryption of their credit card details in transit. The use of these technologies has expanded in recent years, with campaigns promoting the adoption of HTTPS across the board. Meanwhile, browser vendors such as Google are moving to identify sites that do not use HTTPS as ‘Not secure’. Clearly, changes to the web landscape and user expectations mean that universal HTTPS is here to stay.

As a public service, we have to ensure that News is available to the widest possible audience, regardless of device, browser or use of assistive technology, and we champion the ideal of graceful degradation of service as far as possible. But in a climate of anxiety around fake news, it’s vital that users are able to determine that articles have not been tampered with and that their browsing history is private to them. HTTPS achieves both of these: it makes it far more difficult for ISPs to track which articles and videos you’re looking at, or to selectively suppress individual pieces of content. We’ve seen cases outside the UK, on some of our World Service sites, where foreign governments have tried to do this.

Our plan for migrating the News website was relatively straightforward, built on extensive groundwork already done to move World Service sites (such as ) to HTTPS. Until recently, anyone accessing News over HTTPS was redirected (‘downgraded’) to HTTP. This changed in March when we enabled access via both protocols and began an iterative process of chasing down a multitude of bugs, while we worked on updating links, feeds and metadata to reflect the new address. Colleagues in bureaux around the world helped us detect access issues in different geographical areas early (we discovered, for example, that in India a government-mandated network block initially made the site totally inaccessible).

At the same time, we compared the page load performance of real users across HTTP and HTTPS, which revealed that many of those on HTTPS received a slower experience, due to the relatively large number of domains our assets are served from and the overhead of negotiating multiple TLS connections. To balance out this impact, we decided to extend the project to include some performance improvements to the site. Our final step was to reapply the redirect in the other direction, ‘upgrading’ HTTP users to HTTPS in sections (though even here we had to proceed with caution, initially making the redirect temporary in case it had to be reversed).
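A minimal sketch of that kind of staged ‘upgrade’ redirect is below. It is illustrative only, not the actual traffic-management configuration: the section paths, hostname fallback and the PERMANENT flag are assumptions, but it shows the key idea of upgrading HTTP requests section by section and keeping the redirect temporary (302) until the change is proven safe.

```python
# Sketch of an HTTP -> HTTPS "upgrade" redirect, applied per section and kept
# temporary (302) at first so it can be reversed if problems appear.
# Section paths and the PERMANENT flag are illustrative, not the real config.

from urllib.parse import urlunsplit
from wsgiref.simple_server import make_server

UPGRADED_SECTIONS = ("/news/technology", "/news/science_and_environment")
PERMANENT = False  # flip to True (301) only once the rollout is proven safe


def app(environ, start_response):
    path = environ.get("PATH_INFO", "/")
    is_https = environ.get("wsgi.url_scheme") == "https"

    if not is_https and path.startswith(UPGRADED_SECTIONS):
        # Rebuild the same URL with an https scheme and redirect to it.
        location = urlunsplit(
            ("https", environ.get("HTTP_HOST", "www.example.com"), path,
             environ.get("QUERY_STRING", ""), "")
        )
        status = "301 Moved Permanently" if PERMANENT else "302 Found"
        start_response(status, [("Location", location)])
        return [b""]

    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"page served over " + ("https" if is_https else "http").encode()]


if __name__ == "__main__":
    make_server("", 8000, app).serve_forever()
```

Keeping the redirect temporary costs a little cacheability, but it means the change can be rolled back per section without browsers remembering a permanent redirect.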

There were other challenges. The work had to be fitted around major events that place restrictions on our platforms, including a Royal Wedding and local elections in the UK.

Many of the bugs mentioned above fall into the class of ‘mixed content’, where the browser detects non-HTTPS assets being loaded on an otherwise secure page. This is a particular challenge for News due to the site’s long and complex history, since almost every page published since the site launched in 1997 is still available. Though it appears externally to be a single website, it is really a patchwork of technical architectures, mainly because of differing requirements. Our election coverage demands real-time updates combined with scalability to cope with huge traffic levels, while one-off interactives need a flexibility and richness of experience that goes beyond our standard templates.

Over the last twenty years, publishing systems for content on News pages have come and gone, having been replaced or made obsolete. Although newer content is published through dynamic web applications that can be readily modified, what lies beneath sometimes resembles layers of sedimentary rock. In practice, this means that tracking down historical mixed content and working out how to change it is not always straightforward. We developed our own ‘crawler’ to help us find such problems and had to come up with some crafty workarounds to address the most inaccessible bugs; a number of these tasks are still in progress. We also have a major ongoing project to convert some older audiovisual content to a format that can be delivered securely, but this will take time.
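As an illustration of the approach (and not the actual tool), a mixed-content crawler can be as simple as fetching each page and flagging subresource references that still use plain http://. The tag/attribute table and reporting format below are assumptions for the sketch.

```python
# Minimal mixed-content checker: fetch each HTML page and report subresources
# (images, scripts, link/iframe/media sources) that are still requested over
# plain http://. Illustrative only; rel attributes etc. are not inspected.

import re
import sys
import urllib.request
from html.parser import HTMLParser

MIXED_ATTRS = {"img": "src", "script": "src", "link": "href",
               "iframe": "src", "audio": "src", "video": "src", "source": "src"}


class MixedContentParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        wanted = MIXED_ATTRS.get(tag)
        if not wanted:
            return
        for name, value in attrs:
            if name == wanted and value and re.match(r"(?i)http://", value):
                self.findings.append((tag, value))


def check_page(url):
    with urllib.request.urlopen(url, timeout=10) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    parser = MixedContentParser()
    parser.feed(html)
    return parser.findings


if __name__ == "__main__":
    for page in sys.argv[1:]:
        for tag, asset in check_page(page):
            print(f"{page}: insecure <{tag}> -> {asset}")
```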

Even then, some mixed content just cannot be fixed economically, and one or two errors will remain. Such pages still work, with the occasional browser warning, similar to how News pages from the late 90s . We confined our efforts to content available on www.bbc.co.uk, leaving older domains as a historical record. We think users would rather we spent more of our time on building the future of the website.

News is now only available over HTTPS, and the padlock (combined with the web address) hopefully gives users of the site confidence that what they read and watch was published by the and is private to them. We hope you agree it was worth the wait.

Privacy by design Wed, 13 Jun 2018 14:18:00 +0000 /blogs/internet/entries/3e35ce9a-a8ac-49aa-9925-741c30738184 /blogs/internet/entries/3e35ce9a-a8ac-49aa-9925-741c30738184 Adam Bailin Adam Bailin

This little button means a lot to us.

When it’s switched on it allows you to view personalised content recommendations based on your historic activity data.

I work for the Analytics Services team in Cardiff, and we’ve been working on what happens when you switch personalisation off.

The decided that when you switch off personalisation, it shouldn’t link any activity data to your account. In fact, we think that it shouldn’t ever be possible to tie activity data back to you. This page expresses it well:

“Data about how you use the will be anonymous. For instance, we’d be able to see that someone looked at a particular story on News, but we wouldn’t be able to tell if it was you.”

That’s actually quite a difficult software engineering problem to solve. The way most analytics tracking systems work is by using cookies or some other persistent identifier precisely to be able to tie together a user’s activity across multiple sessions. To get around that, we reset a user’s analytics identifier whenever they switch personalisation off.
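A minimal sketch of that idea is below, assuming a purely random identifier and an in-memory session object; the class and method names are illustrative rather than the real analytics system. The point is that the identifier carries no account information and is rotated at the moment personalisation is switched off, so earlier and later activity cannot be joined.

```python
# Sketch of a privacy-preserving analytics identifier: random, never derived
# from the account, and rotated when personalisation is switched off.
# AnalyticsSession and its methods are illustrative names, not the real system.

import secrets


class AnalyticsSession:
    def __init__(self):
        self.analytics_id = self._new_id()
        self.personalisation_enabled = False

    @staticmethod
    def _new_id():
        return secrets.token_hex(16)  # random, carries no account information

    def set_personalisation(self, enabled: bool):
        if self.personalisation_enabled and not enabled:
            # Switching personalisation off: rotate the identifier so events
            # recorded from now on cannot be joined to earlier ones.
            self.analytics_id = self._new_id()
        self.personalisation_enabled = enabled

    def track(self, event: str) -> dict:
        # Events carry only the rotating analytics ID, never the account ID.
        return {"id": self.analytics_id, "event": event}


if __name__ == "__main__":
    session = AnalyticsSession()
    session.set_personalisation(True)
    before = session.track("page_view")["id"]
    session.set_personalisation(False)
    after = session.track("page_view")["id"]
    assert before != after  # activity before and after opt-out can't be linked
```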

How it's normally done

Normally, when you sign in or out of a service your analytics identifier will persist. This means that organisations can attribute data to you as an individual even when you’re signed out or have chosen not to receive a personalised experience.

How we've designed it

We’ve designed our analytics ID differently, with privacy in mind. We’ve designed it so that when you sign in to your account and disable personalisation, we can’t attribute activity back to you as an individual.

We’ve designed privacy into our analytics.

Putting users in control of their data

The General Data Protection Regulation, or GDPR for short, is one of the biggest changes to data privacy law in recent years. It is designed to put you in control of how your information is collected and used by organisations.

This change to our analytics services is a small example of how we are designing privacy as a feature to put users in control of their data. It is part of a wider opportunity to enable much greater control over how your data is collected, what you share and with whom. And in turn to drive a more relevant and nuanced personalisation of the ’s services.

We see privacy not just as an exercise in legal compliance but as an opportunity to deliver greater value for users.

My Life, My Data, #MyTomorrow Thu, 24 May 2018 14:17:03 +0000 /blogs/internet/entries/969b8e44-cec7-40ef-ab39-c63d710f890b /blogs/internet/entries/969b8e44-cec7-40ef-ab39-c63d710f890b Chris Sizemore Chris Sizemore

Tomorrow(’s World) starts today.

Personal data is one of the most important issues we’re all grappling with these days, but it can all feel so confusing and abstract that we tend to dismiss it with a mixture of 😕, 🤯, ¯\_(ツ)_/¯.

What is personal data anyhow? It’s data about you, but is it your friend, or your enemy? Often when it comes to personal data, a sense of dystopia can understandably prevail - we’re being spied on, manipulated, and our civil liberties are under threat. Meanwhile, services that collect data about us have become so useful and seemingly indispensable (and those Buzzfeed quizzes ain’t gonna do themselves).

In a world where it can often feel that things are constantly happening to us, how can we influence our own futures in a positive, active way?

It’s in this context that the General Data Protection Regulation (GDPR) kicks into effect this Friday, 25 May 2018 - and it’s a game-changer for people living in the UK and Europe when it comes to their rights to privacy in the digital age and what they can actively do with data about themselves.

To help audiences explore the implications of this monumental intervention, Tomorrow’s World is launching “My Life, My Data, #MyTomorrow”, a campaign about people and communities shaping the future by taking control of the data they create. The campaign starts today and runs for the following week.

Using short films, interactive experiences, and conversations across social media, Tomorrow's World will help explain just how exciting and important personal data is.

Highlights of the “My Life, My Data, #MyTomorrow” campaign include:

“Future Values, or A Short Ride in an Intelligent Machine” Data about you is helping drive the development of artificial intelligence. “Future Values, or A Short Ride in an Intelligent Machine”, which launches today on Taster, will send you into a “what if?” future built atop your own data. You’ll exchange banter with the artificial intelligence behind a driverless taxi, and discover the guiding principles that make up your own deepest values. Experience “Future Values, or A Short Ride in an Intelligent Machine”, a pilot built on cutting-edge conversational storytelling technology.

“My Life, My Data” A short film (5 minutes) hosted by Leila Johnston and Alex Lathbridge takes us on a fast-paced journey to explore what personal data is and how it is currently used. Brought to life with animations by Jamie Squire, the short film reminds us of the quid pro quo that comes with access to digital services such as Facebook, Google and Instagram: nothing comes for free.

“Instagram Chatbot” To illustrate how data about you is driving the development of machine learning, our “Instagram Chatbot” will analyse your next Instagram post and compare it with thousands of others to estimate how people will react to it. The chatbot will then create a unique, personalised short video explaining why, illustrating the power of personal data.

“Donate Your Data Day” A short film (10 minutes) that imagines a not-too-distant future where we can donate data about ourselves en masse, using a click-to-donate button on our mobile phones. Galvanised around a gently satirical global ‘Donate Your Data Day’, the audience can decide what data to donate and who to donate it to, showing their support by becoming a data donor with their own Data Donor Card.

Tomorrow’s World presenters Leila Johnston and Alex Lathbridge

“Meet the Personal Data Superheroes” A special episode of the Tomorrow’s World podcast explores the work of people making a difference when it comes to our data rights. Meanwhile, on social media, presenters, journalists, and influencers - including Radio 1 Newsbeat’s Tina Daheley, C’s Katie Thistleton and Click’s Spencer Kelly - discuss the importance of the data we’re creating every day. Listen here.

"How Do You Feel About Your Data? A Survey" - a timely research project from Tomorrow's World partners The Open University, on their new nQuire citizen science platform. How much personal information are you happy to share? What should companies be allowed to do with the data you create? Audiences can contribute to this short survey and give their views on data protection. .

The “My Life, My Data, #MyTomorrow” campaign will make you smile and think, and ask you to reconsider your preconceptions about the relationship between us, the data we create, and the companies, governments, and other organisations that use that data.

So what comes next?

This is just the beginning, and there's much more to do. People are recognising their rights and starting to take greater control of the data they create. They are actively helping create a future their grandchildren will want to live in. We can all join in.

Together, we're creating .

 about the Tomorrow’s World partnership between Science Museum Group, Wellcome, The Open University, the Royal Society, and the .

Your data matters Tue, 22 May 2018 14:04:00 +0000 /blogs/internet/entries/7c605523-8df3-4dcb-bf58-7c64aa0b59a5 /blogs/internet/entries/7c605523-8df3-4dcb-bf58-7c64aa0b59a5 Julie Foster Julie Foster

Last , we updated everyone on our plans to make the more personalised and relevant to you. We can give you more of what you love when we understand you better, and also make sure that as a public service, we make something for everyone.

We now have over 15 million people with accounts who have used the ’s websites and apps in the last month. What’s more, they are using websites and apps more than people who are not signed in: 64% of account users visit online more than two days per week, compared to 46% of all users. And people signed in with an account spend an additional hour per week on websites and apps compared to people who are not signed in.

Your personal data is helping power this transformation. We can’t provide you with a meaningful personal or tailored experience without this information, but it is ultimately your data. And your data matters.

The General Data Protection Regulation, or GDPR for short, comes into force in the next week. It makes sure that businesses clearly explain why they collect your personal data, and how they use it. It is an evolution of the Data Protection Act, and gives you new and important rights.

As we’ve said before, we’ve built our new account system with GDPR in mind, but we’re always reviewing our processes, technology and governance.

We use your personal data for different reasons, and it’s important that we are transparent with you about why we collect and use this data. Our site spells out, in plain English, what we will (and importantly won’t) do with your data. It can also help you exercise your GDPR rights, such as changing some of your details in Settings.

How have we prepared for GDPR?

For starters, you should not need to be a rocket scientist to know your rights. We’ve updated our policies to make them even more transparent and clear.

What are my rights?

We’ve created a new section in Using The all about GDPR to help you understand what your rights are, how you can exercise them with the and get help.
We’ve also innovated and developed technology with data privacy at the heart of what we create. Below are a couple of examples of the kind of work we’ve done to prepare for GDPR.

Privacy for children

We want to help you make sure that your child can only watch programmes, read comments and upload their creations in a space that is age appropriate and suitable to them. For this reason, we’ve developed a way for parents or guardians to register their child which is simple and easy to do, but more importantly is safe and secure for the entire family. We want your children to get great experiences on websites and apps, and play our part to help protect them.

Privacy by design

This little button means a lot to us.

We really want you to have a personalised experience, like picking up where you left off watching a show, getting recommendations on programmes you might like or getting notifications about your favourite football team. But you have the right to turn these features off if you don’t want them. Our analytics services team has worked hard to develop a technical solution that does this easily while ensuring your privacy.

What’s next?

We have some fantastic events coming this summer, from the to FIFA World Cup  and Wimbledon. Your data is helping us learn what you like, so we can make sure you get the best out of this summer, and improve our services for you in the future.

Enabling Secure HTTP for Online - update Wed, 20 Dec 2017 10:01:00 +0000 /blogs/internet/entries/a6604322-99a9-4272-860c-f78e667e18e3 /blogs/internet/entries/a6604322-99a9-4272-860c-f78e667e18e3 Paul Tweedy Paul Tweedy

Back in July 2016, I published a blog post called , about our plans to roll out HTTPS across our online products, and the particular challenges we were facing. Over a year has elapsed since then, and we’d planned to be more or less complete by now, so how are we doing?

Overall, what we said still holds true — retrofitting HTTPS onto an existing, ever-changing estate of web services at scale is the exact opposite of a straightforward task in practice. However, we’ve made some really good progress. All the enabling work at the traffic management layer is complete, and now products can roll out HTTPS in such a way as to avoid impact on their existing roadmaps.

(For a great read on how Online is composed of multiple products and technology bases, and some of the complexity that brings, read ).

We had a tentative 12-month timeframe back in 2016, and in that time the UK page, TV, Music, Children’s (C and CBeebies), iPlayer, Education, and many World Service sites, such as World Service Radio, have all become HTTPS-only.

A really important achievement has been the roll-out of HTTPS to our AV streaming services across desktop, mobile and connected devices. We have adopted a slow & steady approach quite deliberately here as there is a huge variance in HTTPS support across all the devices that iPlayer is supported on (some don’t work at all, or perform sufficiently poorly that HTTPS gives a bad playback experience), but we are well on our way and the chances are that when you next stream iPlayer content to your device, you’re doing so over a completely secure stream. Lloyd Wallis has written a detailed post about all the achievements and challenges .

Also, our mobile applications teams have been working hard to secure all backend service calls from our native mobile applications like iPlayer Radio, in line with emerging mobile security standards such as Apple’s App Transport Security.

TLS, the security standard that underpins HTTPS, is also an important enabler for the HTTP/2 protocol, another important future-looking standard for us from a performance perspective.

So, despite the enormity of the task, we’ve made great progress, and we’ll continue to work to make HTTPS the default wherever possible across Online. Within Design & Engineering we believe that we owe our audiences the confidence that when they access Online, they’re doing so in the safest and most trusted manner possible, wherever they are.

Enabling Secure HTTP for Online  - audio and video Mon, 18 Dec 2017 10:19:00 +0000 /blogs/internet/entries/eb4fdb3a-fa91-49ad-bb71-bbe82dab2bd3 /blogs/internet/entries/eb4fdb3a-fa91-49ad-bb71-bbe82dab2bd3 Lloyd Wallis Lloyd Wallis

Last year, Paul Tweedy blogged about our plans for enabling secure HTTP for Online. Once our Online Technology Group had paved the way for web pages to be served over HTTPS, the next step was to do the same for our Audio/Video (AV) media distribution.

Before we go into more detail: many have asked on other blog posts where exactly HTTPS is, as most of the website is still only available on HTTP. Paul’s blog post explained that our platform is now capable of HTTPS, allowing products such as the page, News and iPlayer to enable it for their websites. Many product teams have investigated enabling HTTPS but have either yet to prioritise the necessary work or decided that, until now, there were too many limitations. The biggest limitation for many sites has been that they play AV content, and this would still be HTTP (or even its predecessor, RTMP) — breaking the “secure” padlock in browser address bars when using the Flash player, and leaving us unable to use the HTML5 player on HTTPS at all.

Enabling HTTPS media delivery, as well as the increasing number of products with personalisation at their core, will be a big driver for products moving over. Since the end of March 2017, we’ve been able to offer HTTPS everywhere it’s needed, and over the last few weeks we have started upgrading users to HTTPS.

The AV HTTPS project was broken into phases:

  • Enabling HTTPS at the CDN edge
  • Rolling out HTTPS where it is preventing use of the HTML5 player (SMP)
  • Rolling out HTTPS
  • Enabling HTTPS to our origin AV services
  • A/B testing HTTPS in our HTML5 player

When we started, our streaming services looked like this:

Enabling HTTPS at the CDN edge

We currently use up to three CDNs to distribute AV content — two 3rd party and our own, BIDI. Our 3rd party providers both “support” HTTPS, so we expected this to be straightforward. BIDI required some development work, but as it is ultimately nginx under the hood, it also wasn’t expected to be too complex.

Actually being ready to make our first HTTPS request for a media asset took around four months.

We took the decision with BIDI to only use ECDSA certificates. Whilst this limits the platforms our in-house CDN can support, we thought it was not worth the complexity of supporting older technologies, specifically RSA, and chose to rely on our 3rd party CDN providers instead for these users, who are currently around 5% of our total. Our core media players for most platforms automatically fail over between CDNs when a problem with one is encountered.

The had to migrate all of its AV distribution to new hostnames and, in the case of one CDN, an entirely new product, adding the additional complexity of needing to rebuild and retest the configuration.

New hostnames and a new platform meant that our CDNs would have cold caches. In normal usage the majority (>97%) of requests for media are served straight from their caches, which is something we rely upon. Once moved to the new configurations, all of the requests would have to come back to our origin cache, Radix, and then potentially to the canonical dynamic AV packaging origin. We couldn’t just flip a switch and have hundreds of gigabits per second of user traffic suddenly come straight back to our packaging origin; nor did we want to do a big-bang move in the case of the new CDN product. So we broke our distribution configuration into 40 chunks (for example, DASH iPlayer on desktop for CDN A) and migrated one chunk per deployment window. Including cases where we had to roll back and forward as we found issues, this part took three months.
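For illustration only, the sketch below shows the shape of that phased plan: the distribution configuration is split into protocol/platform/CDN chunks, and one chunk is migrated per deployment window so caches warm gradually. The chunk names and schedule are made up; the real migration used around 40 chunks.

```python
# Illustrative phased-migration plan: split the distribution configuration
# into small chunks and migrate one chunk per deployment window, so cold
# caches fill gradually instead of all origin traffic arriving at once.

from itertools import product

PROTOCOLS = ["HLS", "DASH"]
PLATFORMS = ["desktop", "mobile", "connected-tv", "radio"]
CDNS = ["cdn-a", "cdn-b", "in-house"]

# 2 x 4 x 3 = 24 chunks here; the real migration used about 40.
chunks = [f"{proto}/{platform}/{cdn}"
          for proto, platform, cdn in product(PROTOCOLS, PLATFORMS, CDNS)]


def migration_plan(chunks, windows_per_week=3):
    """Group chunks by week, one chunk per deployment window."""
    for week, start in enumerate(range(0, len(chunks), windows_per_week), 1):
        yield week, chunks[start:start + windows_per_week]


if __name__ == "__main__":
    for week, batch in migration_plan(chunks):
        print(f"week {week}: migrate {', '.join(batch)}")
```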

Rolling out HTTPS

At this stage, we turned on HTTPS for our Worldwide commercial offerings — as HTTPS-only products, they had to be served using our Flash player until this point. We were also rushing to meet Apple’s deadline requiring HTTPS for all traffic from apps in its App Store — but when this deadline was extended, we backed off a little.

Enabling HTTPS to our origins

Now we had HTTPS available at the CDN edge, but we still used HTTP between the CDNs and our on-demand (OD) origins, known as a “scheme downgrade”. This was a necessary step for us to meet the initial timescales — the same team looks after both BIDI and Radix, and with BIDI being the priority, enabling HTTPS on Radix was left until later. We moved those over a period of a few weeks, again in small parts to prevent overwhelming our services from cold caches.

A/B Testing in the HTML5 player

By now, HTML5 was our default player for most browsers, so it made up the majority of our desktop AV traffic. We added a new dial we control to the player, enabling it to prefer HTTPS, even when embedded on an HTTP page. This gave us a few benefits:

  • Starting to warm up the CDNs to HTTPS, so we could see they performed as expected. At the start of this project, HTTPS distribution capacity with our CDNs was a concern.
  • Analysis of client performance compared to HTTP.

HTTPS preference started at 1%, and we gradually ramped it up so that by the end of March, we were running at preferring HTTPS for 50% of playback sessions, although the actual split of playback sessions was closer to 55% HTTP, 45% HTTPS. A couple of weeks of tweaking, and we saw global MPEG-DASH playback sessions split at around 50%.
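The sketch below illustrates one way such a percentage dial can work (the real player mechanism may differ): each playback session is hashed into a stable 0–99 bucket, and sessions whose bucket falls below the configured percentage prefer HTTPS, so ramping from 1% to 50% is just a configuration change.

```python
# Sketch of a percentage "dial": hash each playback session into a stable
# bucket and let sessions below the configured percentage prefer HTTPS,
# even when embedded on an HTTP page. Hashing scheme is illustrative.

import hashlib


def prefers_https(session_id: str, https_percentage: int) -> bool:
    """Deterministically place a session into the HTTPS-preferring group."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable 0-99 bucket
    return bucket < https_percentage


if __name__ == "__main__":
    # Ramping the dial: 1% to start, later 50% of playback sessions.
    for dial in (1, 10, 50):
        sample = sum(prefers_https(f"session-{n}", dial) for n in range(10_000))
        print(f"dial={dial}% -> {sample / 100:.1f}% of 10,000 test sessions")
```

Because the bucketing is deterministic, a given session keeps the same protocol preference across requests, which keeps the comparison between HTTP and HTTPS cohorts clean.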

Next step was to analyse anonymous performance data we receive from the player. There were a few hypotheses about how HTTPS delivery would perform:

  • High latency connections would experience elevated error rates due to the multiple TCP round-trips required to establish a TLS (<1.3) connection
  • Certain UK mobile ISPs could see reduced error rates, as they have previously been known to intentionally alter the content served, sometimes breaking it
  • Conversely, these mobile ISPs are also often high-latency connections, and also have transparent carrier-grade proxies, so the reduced cacheability of content could degrade performance
  • As a result of the increased number of round-trips, average bitrate would reduce slightly and rebuffering events would increase slightly
  • Also conversely, the increased integrity could result in fewer media download errors, thus reducing rebuffering
  • Reduced error rates in countries with restricted access to the internet

In summary, we weren’t entirely sure what to expect.

“High latency connections would experience elevated error rates due to the multiple TCP round-trips required to establish a TLS (<1.3) connection”

Proving this one in itself is difficult. However, , which is the country which had the highest HTTPS error rate in our trial.

“Certain UK mobile ISPs could see reduced error rates, as they have previously been known to intentionally alter the content served, sometimes breaking it”

One UK mobile ISP, which we have observed performing incredibly novel alterations to our content in the past, experienced a 30% reduction in errors for clients using HTTPS. In fact, every major UK mobile ISP had a reduction in errors for HTTPS clients, although most to a lesser degree.

“As a result of the increased round-trips, average bitrate would reduce slightly and rebuffering events would increase slightly. Conversely, the increased integrity could result in fewer media download errors, thus reducing rebuffering”

This has proven true overall — whilst there are fewer media download errors, overall rebuffering has increased and average bitrate has dropped, although only by a small fraction.

“Reduced error rates in countries with restricted access to the internet”

Countries such as China, Russia, Saudi Arabia and Turkey are seeing reduced error rates when using HTTPS.

Other observations

  • The impact on fixed-line broadband services varies quite substantially from one ISP to another. Some perform better, others worse.
  • Traffic that looks like it could potentially be proxy, VPN or small “one-man” ISP services has a substantial increase in failures with HTTPS. This is possibly due to poor TLS performance and configuration in these networks.

A note on pre-HTTP streaming technologies

Some of our older AV content is only available on the web in a streaming format called RTMP. Unfortunately, this can’t easily be moved to HTTPS, as it’s not an HTTP-based protocol in the first place. To resolve this, we are in the early stages of a huge undertaking to re-encode as much of our archive as we have source material for. In the meantime, on some HTTPS sites, older clips may present an error message when you try to play them, warning that the green ‘secure’ lock in your browser will be broken if you play the clip.

Summary

The now uses HTTPS 50% of the time when delivering media to HTML5 responsive web users on HTTP pages. In addition to that:

  • iPlayer on mobile devices and responsive web is now HTTPS-only
  • iPlayer Radio on the iOS app is now HTTPS-only
  • Around 10% of IPTV iPlayer traffic is HTTPS
  • Other responsive web services such as Music have moved to HTTPS, with more to follow
  • All World 2020 News sites (e.g. ) are launching HTTPS-only

We hope that this work will enable more products to finally take the leap into enabling HTTPS for their parts of Online. However, there are still other concerns, most notably ensuring that where we do use HTTPS, we use modern cipher suites and protocols, so that when we say it’s secure, it actually is. Sites with large amounts of traffic, especially World Service news sites, need to balance access to information on cheap or old devices and browsers against the desire for the security HTTPS offers. For example, 5% of our Pashto users currently wouldn’t be able to support TLS 1.2, a standard that has existed since 2008.

Next, we’re looking towards making HTTPS AV perform as well as or better than HTTP in the average case, or to explaining which market areas perform worse and why. We’re also currently testing HTTP/2 for media distribution. Whilst earlier investigations were less promising, the technology has changed, and we believe that at the very least changes to the minimum size HPACK (compression of HTTP headers in HTTP/2) operates on could reduce compression/decompression overhead and have some beneficial impact on performance. So far our testing is in line with this.

Enabling Secure HTTP for Online Thu, 14 Jul 2016 10:39:00 +0000 /blogs/internet/entries/f6f50d1f-a879-4999-bc6d-6634a71e2e60 /blogs/internet/entries/f6f50d1f-a879-4999-bc6d-6634a71e2e60 Paul Tweedy Paul Tweedy

The ’s role as a trusted destination on the Web makes it extremely important that we present a service to our audience that is trusted, secure and protects their privacy. Paul Tweedy, Lead Technical Architect in the Online Technology Group, explains how browsing across Online will become more secure with the release of a new update.

One of our roles in the Online Technology Group of Design & Engineering is to ensure Online is secure at the point of access, in a constantly-evolving world of Web standards, security vulnerabilities, hackers and malware. We work with the software engineering and operational teams across the division who build our services to achieve this.

In mid-April, we made a small but significant step towards ensuring the security and safety of our audience: we enabled HTTPS (secure HTTP) on the domestic page and on the Travel News and mobile Weather sites, with more to come.

Technical Context

The World Wide Web was designed during the relative infancy of the Internet, when the number of users was small and access was relatively closed. One of the Web’s initial strengths was that it was simple to implement and debug; security was an afterthought.

HTTPS has been around since 1996 but, for the first 15 years of the Web, it was mainly used to selectively secure the transmission of sensitive information, e.g. credit card details, for e-commerce and similar use cases. This was usually because the cost of implementing HTTPS was high - both in financial terms and in technical complexity - which also made it prohibitively expensive to serve the high traffic peaks that occur when major news or sport stories break, as the often has to.

Behind the scenes, Online has been using HTTPS for many years, allowing its components to be hosted anywhere without the need to trust any intervening network, and using X.509 client certificates to authenticate each other.

However, it’s taken many years of steady technical progress for it to become practical for large web sites to move all of their public traffic to HTTPS - not just for e-commerce purposes, but to:

  • Provide server authenticity - users have a measure of trust that they are actually connecting to the sites they intend to.
  • Protect users against common hacking attacks - man-in-the-middle, DNS hijacking, etc.
  • Keep up with stricter Web and browser security standards.
  • Allow use of modern Web standards such as HTTP/2.
  • Allow users to “sign in” to sites for a personalised experience, and for their data to be transported securely.

So, why didn’t we just enable HTTPS years ago, like some other large web sites? The real challenges for us were:

  • Enabling HTTPS via our CDN partners, to maintain functionality at times when we deliver Online through a CDN, required a host of technical & contractual changes.
  • The CPU overhead of TLS encryption has historically been significant. We’ve done a lot of work behind the scenes to improve both the software and hardware layers to minimise the load impact of TLS whilst also improving security.
  • We required several internal software changes: for example, sites didn’t always make use of scheme-relative URLs, so development work was needed to make them scheme-agnostic or HTTPS-only (see the sketch after this list).
  • As a public service broadcaster, we have to be as inclusive as possible when it comes to device support, but this often proves difficult. Devices such as smart TVs and Internet radios are key to the availability of services such as iPlayer, but bring unique challenges for managing HTTPS configurations, and many don’t support HTTPS at all!
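As a sketch of the scheme-relative URL work mentioned in the list above (an illustration, not the actual tooling we used), hard-coded http:// references in markup can be rewritten so the browser reuses whichever scheme the page itself was loaded over:

```python
# Illustrative rewrite of hard-coded http:// asset references to
# scheme-relative URLs, so the same markup works on HTTP and HTTPS pages.
# The regex and the choice of scheme-relative output are assumptions.

import re

HTTP_ASSET = re.compile(r'(?i)\b(src|href)=(["\'])http://')


def make_scheme_relative(html: str) -> str:
    """Turn src="http://..." / href="http://..." into scheme-relative URLs."""
    return HTTP_ASSET.sub(r"\1=\2//", html)


if __name__ == "__main__":
    snippet = '<img src="http://static.example.com/logo.png" alt="logo">'
    print(make_scheme_relative(snippet))
    # -> <img src="//static.example.com/logo.png" alt="logo">
```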

Therefore before this year, HTTPS was limited to a small set of functions on sites, such as signing in to iD, online voting, and other areas which dealt with personalised data.

Earlier in 2016, the Chromium development team decided to restrict certain powerful in-browser features to secure origins, preventing access to them on ‘insecure’ (non-HTTPS) web pages. In practice, this meant that key features of certain products, such as the location-finding feature within the page, Travel News and Weather sites, would stop working if we didn’t enable HTTPS for those services.

Within our department, we had been preparing for this future for some time, lobbying technology providers to support current standards, updating TLS configurations and deprecating older versions. However, this brought the need to make HTTPS work for these products forward.

We set ourselves the deadline of mid-April to roll out this first phase of HTTPS delivery and through working with the product engineering teams, our operational colleagues and our CDN partners, we achieved the required support in our infrastructure to allow our product teams to migrate to HTTPS as they need to.

What this means for you

Now, when you access the page or Travel News sites, you might see a green padlock appear in your browser’s address bar. This means that you can trust that the connection between you and the servers is secure, keeping your data safer in transit and protecting your privacy. It also gives us confidence that the content provided by the remains intact and unaltered on its journey to your devices, no matter where you happen to be.

As sites roll out support for HTTPS across the year, you will see the green padlock more and more as you browse Online, even if you aren't signed in with ID. Our aim is to make secure browsing the default experience for most of our audience by the end of 2016.

There are always practical limitations to site-wide technical changes, and HTTPS Everywhere is no different. Sites and content we consider ‘archival’ that involve no signing in or personalisation, such as the News Online archive on news.bbc.co.uk, will remain HTTP-only. This is due to the cost we’d incur processing tens of millions of old files to rewrite internal links to HTTPS when balanced against the benefit.

Next steps & challenges

There’s plenty more to do over the coming months. We’ll be focussing on:

  • Gradual roll-out of HTTPS as and when products are ready – aiming to be HTTPS across Online within the next 12 months.
  • Evaluating audio/video delivery via HTTPS.
  • Support for the next version of the HTTP standard, HTTP/2.
  • Optimising HTTPS/TLS for performance, particularly on low-powered devices such as smartphones.
  • Examining compatibility with old devices, particularly consumer electronics, that can’t be easily updated.

Big thanks go to Andrew Hutson, Craig Taylor and Neil Craig, all architects in my team who are tracking the standards and making this happen. Thanks also to the page and Travel product teams, who we worked with closely to achieve this important first phase.

Finally, this is an exciting area, and we’re always recruiting - so if you’re an engineer or architect with a strong interest and/or expertise in HTTP(S), Internet security and secure content delivery, take a look at two roles open now for a  and a . 
