Too many websites
Too little of interest

30.03.2004

e-Delivery Team
Alan Mather

Pages per site

• Using Google to spider and count the pages of all 3,233 .gov.uk sites …
   - About 90% of sites have fewer than 2,000 pages
   - Less than 1% of sites have more than 20,000 pages

High count with less than 50 pages – many redirects (where domain has changed or is not active, e.g. direct.gov.uk), also sites that use ASP or frames making it impossible for Google to spider behind first page

And only a few sites have huge page counts (between 20,000 and 100,000) … including ir.gov.uk, dh, scotland, ons, hmso

But, notwithstanding inability to spider some sites, it looks clear that the vast bulk of .gov sites have less than 2000 pages

[Chart: Gov.UK - Web Site Page Counts. Left axis: 0 to 2000; right axis: %age of total site count, 0% to 100%; horizontal axis: page count bands, e.g. 0<x<50, 50<x<1000 etc]

Page Count                  <50    <1000    <2000    <3000    <4000    <5000   <10000   <20000  <100000
Site Count                 1891      738      251      120       63      113        4       28       25
% of total                  58%      81%      89%      93%      95%      98%      98%      99%     100%
Cumulative Site Count      1891     2629     2880     3000     3063     3176     3180     3208     3233

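The cumulative rows in the table can be rechecked from the per-band site counts. The following is a minimal Python sketch, not the method used to build the slide; the function name `cumulative_distribution` is an illustrative assumption, and the figures are copied from the table above.

```python
# Page-count bands and site counts, copied from the table above.
bands = ["<50", "<1000", "<2000", "<3000", "<4000",
         "<5000", "<10000", "<20000", "<100000"]
site_counts = [1891, 738, 251, 120, 63, 113, 4, 28, 25]


def cumulative_distribution(counts):
    """Return (running_total, percent_of_total) for each band, in order."""
    total = sum(counts)
    rows = []
    running = 0
    for count in counts:
        running += count
        # Percentages are rounded to whole numbers, as on the slide.
        rows.append((running, round(100 * running / total)))
    return rows


rows = cumulative_distribution(site_counts)
for band, (running, percent) in zip(bands, rows):
    print(f"{band:>8}  {running:>5}  {percent:>3}%")
```

Run against the slide's figures, the running totals end at 3233 and the rounded percentages agree with the "% of total" row.
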
The Google Data - Raw

• The Google data shows:
   - More than 80% of the content (in pages) is found in around 10% of the total count of sites
   - There are huge numbers of very small sites (per Google), although that may be because Google is unable to spider or does not cover all sites through the entire hierarchy
   - Still, errors in Google indexing are likely to be consistent across the entire population of .gov sites, making the shape of the graph likely ok

[Chart: Google's site sizes. Left axis: site size, 0 to 100000; right axis: 0% to 100%]

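The "80% of the content in 10% of the sites" observation is a concentration measure that can be computed from any list of per-site page counts. The sketch below is an illustration only: the per-site Google figures are not reproduced in this deck, so `example_sizes` is made-up data and `share_held_by_top` is a hypothetical helper name.

```python
def share_held_by_top(page_counts, top_fraction):
    """Fraction of all pages held by the largest `top_fraction` of sites."""
    ordered = sorted(page_counts, reverse=True)
    # Always include at least one site, even for very small fractions.
    top_n = max(1, round(len(ordered) * top_fraction))
    return sum(ordered[:top_n]) / sum(ordered)


# Hypothetical page counts for ten sites (NOT the Google data from the slide).
example_sizes = [40_000, 3_000, 2_000, 1_500, 1_000, 900, 700, 500, 300, 100]

share = share_held_by_top(example_sizes, 0.10)
print(f"Top 10% of sites hold {share:.0%} of pages")
```

With real per-site counts in place of the example list, the same call would show whether the concentration claim holds.
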
Counting Servers

• Checking on the servers operating behind the websites in .gov.uk
   - Over 1,200 running Apache
   - And more than 1,500 running Microsoft IIS
• These figures don’t include servers that may be configured but not active for, e.g. resilience. They also don’t include servers further down the infrastructure stack, e.g. running content applications or other code
• Naturally, each of these servers is likely to be accompanied by firewall and storage configurations
• At a conservative cost of £10,000 per server, the total cost of this infrastructure alone is over £29,000,000

Family                 Server string           Count    Family total
Apache                                                          1209
                       Apache                    186
                       Apache/1.3.26             274
                       Apache/1.3.27             282
                       Apache/1.3.28              62
                       Apache/1.3.29              99
                       Apache/2.0.40              25
                       Apache/2.0.45               1
                       Apache/2.0.46              32
                       other Apache              248
Microsoft-IIS                                                   1547
                       IIS/4.0                   377
                       IIS/5.0                  1103
                       IIS/6.0                    65
                       other IIS                   2
Lotus-Domino           Lotus-Domino                              109
Netscape-Enterprise    Netscape-Enterprise                        74

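The "over £29,000,000" figure follows from the family totals and the £10,000 per-server estimate. A minimal Python sketch of that arithmetic, assuming the four family totals above are the full set being costed (the name `infrastructure_cost` is illustrative):

```python
# Family totals, copied from the server table above.
server_counts = {
    "Apache": 1209,
    "Microsoft-IIS": 1547,
    "Lotus-Domino": 109,
    "Netscape-Enterprise": 74,
}

# The slide's "conservative" per-server cost, in pounds.
COST_PER_SERVER_GBP = 10_000


def infrastructure_cost(counts, cost_per_server):
    """Return (total number of servers, total cost) for the given counts."""
    total_servers = sum(counts.values())
    return total_servers, total_servers * cost_per_server


total_servers, total_cost = infrastructure_cost(server_counts, COST_PER_SERVER_GBP)
print(f"{total_servers} servers, £{total_cost:,}")
```

This gives 2,939 servers and £29,390,000, which matches the slide's rounded claim. Firewalls, storage and inactive resilience servers are outside this sum, as the slide notes.
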
Cost of Websites (Benchmarking)

[Chart: sites plotted along a cost scale marked 250k, 500k, 750k, 1.0m, 1.25m, 1.5m, 1.75m, 2.0m, 2.25m, 2.5m, 3.0m. Labels shown: OFT, HMT, TheRegister.com, DWP, The Pension Service, JC+, JC+ (development), DH, DfT, ONS, Worktrain, Worktrain (development), Business.gov?, DfES, Large Quasi-Public Sector (fully Loaded)]

Figures drawn from recent PQ (and, unless stated, include only hosting charges and not development or development support)

Not on Record
•dti
•IR
•HMCE
•Home Office
•DEFRA
•ODPM

Characteristics of .gov.uk sites

• More than 3200 sites
• More than 2.5 million documents
• Inconsistent - five different look and feels
• Unreliable - Poor uptime
• Huge - up to 100,000 pages
• Complex - Nine levels deep
• More than three navigation designs
• 100s of broken links
• Some parts of the site not linked to others ‘orphan content’
• More than 200 URLs per dept
• More than 300 authors
• Slow - download time more than one minute

Looking For The Right Thing?

• Using Internet search engines in an effort to find “the right thing” can be challenging. The search terms below were entered, with the results restricted to the “.gov.uk” domain only

Search term                     Results (30/03/2004)
Disability Living Allowance                   14,700
Child Tax Credit                               5,790
Carers Allowance                                 915
Working Family Tax Credit                        546
Attendance Allowance                          13,000
Council Tax Benefit                           42,000
Housing Benefit                               77,800
Statutory Sick Pay                             6,200
Self Assessment                               14,000

• There is a huge amount of duplication in government online:
   - Many local authority sites repeat the description of the rules for claiming certain benefits, where to claim, what to claim for and so on … and doubtless, every year or so, each of these mentions must be updated with the correct rules (but what if they’re not?)
   - Even “self assessment” only has 4,950 mentions on the Inland Revenue’s own site, but a further 9,000 across the rest of government

And how does .gov look to the consumer?

• The variety of sites shows little in the way of consistency
   - Navigation varies from site to site, sometimes on the left, sometimes tabbed, sometimes graphic, sometimes text
   - “Search” is called different things, is often not on the home page and often returns poor results – despite research showing that consumers who can’t see what they want instantly will use search
   - Accessibility is poor with many sites not attempting to clear even the lowest hurdles
   - Even sites owned by the same parent are confusing, e.g. pensionservice, pensionguide, agepositive, over50 …

The Missing Data

• To complete the picture and allow the proposed plan of action to be fine tuned, the following data is needed:
   - Visitor counts (Hitwise may offer an approximation)
   - Approximate costs to operate (at an infrastructure level including all servers, network equipment, firewalls, software licences etc) – both price bought at and the price for continued operations projected forwards (to allow for annual licence premiums, renewals etc that may be due in the future)
   - Contractual agreements around exit arrangements, renewal dates etc along with whether the contract for web hosting is part of a wider technology outsource agreement (that might, therefore, make it harder to exit)

Proposal For What Next

• Principles
   - Government is in the business of helping citizens by making information easy to find. The total number of websites needs to be rationalised dramatically – from over 3,000 to under 600 in the first stage (including Local Authorities).
   - Government is in the business of presenting information in a way that citizens will understand; it is not in the user interface design business. The range of navigational and interface styles needs to be harmonised to a single core style.
   - Government has already spent significant sums on its online presence, yet government is not a technology leader. The cost of the programme outlined must be absorbed through savings generated in the first year of the programme, making it self-funding.
   - Government buys in cycles and these are likely to be maintained. This cycle will allow work to be completed at a constant pace as contracts come to their natural end, thus incurring no exit penalties.
   - A programme of rationalisation this large will require multiple parallel streams of work – the cost of the overlap slightly reducing the inherent savings but increasing the odds of success through elimination of bottlenecks and delays

DotP versus Everything Else

• Condensing 3,000+ sites to a few hundred is no simple task. It will likely require a variety of approaches and software solutions to ensure that there are no bottlenecks.
• DotP’s primary characteristics are:
   - A managed service model (i.e. hardware, software, network included)
   - A high end content management engine allowing customised workflow, complex information architectures and large numbers of geographically dispersed authors
   - Highly resilient, scalable and secure infrastructure reducing the risk of failure
   - A model to allow changes to sites through configuration, not code customisation
   - A range of features tailored to solve government’s main content problems
• Other content engines usually:
   - Come as a software licence with extensive customisation required
   - Have a range of features that DotP doesn’t have and that have been developed over several product cycles, primarily for commercial customers. Some of these features will be useful for government
   - Will develop competitively no matter what government does
   - But they rarely come as managed services, so hosting and management need to be included

Setting Up The Programme

• Select a core of important websites based on:
   - Total size (aiming to isolate 50% of the content in government)
   - Visitor count (capturing a large chunk of the audience, say 50%)
   - Transaction generation (targeting the bulk of online transactions for both business and citizen)
   - Content management status (looking first for unmanaged systems still based on HTML or those that are not well advanced in terms of a content engine)
• Outline the information architecture as it is, coupled with the target architecture for how it should be – taking each site and fitting it into an overall architecture and design that is consistent across all of them
   - It is assumed that these sites – ranking as the most popular and largest in government – will need rearchitecting to make the most of them (including a new layout, new navigation and so on)
   - This rework will give a good chance to eliminate duplication and inconsistency, as well as remove as much as 30-50% of content as redundant (based on experience with Department of Health).

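The "total size" criterion (isolating 50% of the content) can be read as picking the largest sites until half of all pages are covered. The Python sketch below illustrates that reading only; the site names and page counts are invented, `select_core_sites` is a hypothetical helper, and the other criteria (visitors, transactions, content management status) are not modelled.

```python
def select_core_sites(site_pages, target_share=0.5):
    """Pick the largest sites until they cover `target_share` of all pages.

    Returns the chosen site names (largest first) and the share they cover.
    """
    total = sum(site_pages.values())
    selected = []
    covered = 0
    largest_first = sorted(site_pages.items(), key=lambda item: item[1], reverse=True)
    for name, pages in largest_first:
        if covered >= target_share * total:
            break
        selected.append(name)
        covered += pages
    return selected, covered / total


# Hypothetical sites and page counts, for illustration only.
example_sites = {
    "site-a": 30_000,
    "site-b": 25_000,
    "site-c": 20_000,
    "site-d": 15_000,
    "site-e": 10_000,
}

core, share = select_core_sites(example_sites)
print(core, f"{share:.0%}")
```

In practice the selection would combine several criteria, so a size-only ranking like this is a starting point rather than the final list.
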
Establishing The Target Platforms

• To identify the target platforms, the following is proposed:
   - A “bake off” competition is kicked off where a variety of content management vendors are given an environment (with workspace, hardware and network connectivity).
   - Each vendor is given the same brief – to take an existing, static website – the “challenge site” - with a known information architecture and transfer it to a new target architecture (also provided).
   - The vendors then set up their systems, using templates and guidelines provided by government, to deliver the challenge site under strict timescales – including defining the architecture, implementing the style guidelines, integrating the search engine and migrating the content
   - At the end of the competition, a subset of the vendors who have met previously agreed and published criteria is passed through to the next stage
   - Commercial agreements are then built – using standard templates – with the vendors, allowing for volume discounts on licences to be obtained.
   - Websites in the core population are then allocated across vendors and the implementation task kicked off. Vendors that perform are given more; vendors that don’t perform are gradually eliminated and their work shared across other, more successful vendors

Why a Bake Off?

• Migrating some 3000 websites is a fearsome task. Here is why there should be more than one solution in play:
   - The problem is not one of only technology – the changes required to government editorial processes are enormous. The greater the range of experience thrown at this, the better the result
   - One single system (or even two or three) would result in bottlenecks that would delay rationalisation. Having several “similar” but independent systems will resolve the bottleneck
   - One large system would be high risk – a single outage could take down government’s online presence – spreading the systems will, in the end, reduce risk versus cost.
   - Competition is healthy – a few players working both together (to complete the goal) and against each other (to complete the goal first and therefore win business) will work well
   - But, we need only a few (5, 6, 7?) – too many will bring too high an overhead and risk quality standards

Estimating the Costs

• The costs of migration will include:
   - The initial work to identify candidates
   - The evaluation of target platforms
   - The setting up of migration environments
   - The cost of redesign of some sites to make them consistent with the target standard (e.g. search engine on home page, navigation through tabs, reducing the depth of the site etc)
   - The cost of redesigning pages to fit the new system – e.g. where the site uses custom techniques that are not easily replicable
   - The actual migration of data from one format to another (there are tools that claim to do this, with varying success, or manual methods – these too will need to be assessed)

Integrate … Marriott.com

• One URL
• 13 brands
• Five major redesigns
• 2,600 locations
• 142,000 people

Rationalise … IRS.gov

• 235 sites … to one
• 47% e-filing
• 25 million regular users
• AOL cache data at peaks
• 80% of e-filers do it again
• Accountants starting to charge $35 for those who want to do it on paper

Unfocused and disorganised

Organised and Focused

Inland Revenue Conference
 
Inland Revenue Senior Management Conference 2000
Inland Revenue Senior Management Conference 2000Inland Revenue Senior Management Conference 2000
Inland Revenue Senior Management Conference 2000
 
Autumn leaves
Autumn leavesAutumn leaves
Autumn leaves
 
Inland Revenue Senior Management Conference - February 2000
Inland Revenue Senior Management Conference - February 2000Inland Revenue Senior Management Conference - February 2000
Inland Revenue Senior Management Conference - February 2000
 
Japan slides 16.10.2001
Japan slides   16.10.2001Japan slides   16.10.2001
Japan slides 16.10.2001
 
Autumn com.com 3 151101
Autumn com.com 3 151101Autumn com.com 3 151101
Autumn com.com 3 151101
 
Authentication slides 04.07.2003
Authentication slides   04.07.2003Authentication slides   04.07.2003
Authentication slides 04.07.2003
 
edt dotp-vision_for_transformation_v0.19.-180702
edt dotp-vision_for_transformation_v0.19.-180702edt dotp-vision_for_transformation_v0.19.-180702
edt dotp-vision_for_transformation_v0.19.-180702
 
Portcullis house full - 25.01.2002
Portcullis house   full -  25.01.2002Portcullis house   full -  25.01.2002
Portcullis house full - 25.01.2002
 
Sun live slides 14.03.2006
Sun live slides   14.03.2006Sun live slides   14.03.2006
Sun live slides 14.03.2006
 
Slides gc2004 - 22.06.2004
Slides   gc2004 - 22.06.2004Slides   gc2004 - 22.06.2004
Slides gc2004 - 22.06.2004
 
Sapient e gov slides v2 - 20.03.2003
Sapient e gov slides v2 - 20.03.2003Sapient e gov slides v2 - 20.03.2003
Sapient e gov slides v2 - 20.03.2003
 
Enterprise Architecture
Enterprise ArchitectureEnterprise Architecture
Enterprise Architecture
 
eCrime Conference March 2006
eCrime Conference March 2006eCrime Conference March 2006
eCrime Conference March 2006
 
Dan Jellinek Transform Slides 13.05.2004
Dan Jellinek Transform Slides   13.05.2004Dan Jellinek Transform Slides   13.05.2004
Dan Jellinek Transform Slides 13.05.2004
 

Recently uploaded

Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsAndrey Dotsenko
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfngoud9212
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfjimielynbastida
 

Recently uploaded (20)

Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdf
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdf
 

Too many websites, too little of interest

  • 1. Too many websites, too little of interest
      30.03.2004
      e-Delivery Team
      Alan Mather
  • 2. Pages per site
      - Using Google to spider and count the pages of all 3,233 .gov.uk sites …
          - 90% of sites have fewer than 2,000 pages
          - Fewer than 1% of sites have more than 20,000 pages
      - High count of sites with fewer than 50 pages – many redirects (where the domain has changed or is not active, e.g. direct.gov.uk), also sites that use ASP or frames, making it impossible for Google to spider behind the first page
      - And only a few sites have huge page counts (between 20,000 and 100,000) … including ir.gov.uk, dh, scotland, ons, hmso
      - But, notwithstanding the inability to spider some sites, it looks clear that the vast bulk of .gov sites have fewer than 2,000 pages

      [Chart: Gov.UK - Web Site Page Counts. Site count (0 to 2,000) and %age of total site count (0% to 100%) by page-count band, e.g. 0<x<50, 50<x<1000 etc.]

      Page Count    Site Count    % of total    Cumulative Site Count
      <50                 1891           58%                     1891
      <1000                738           81%                     2629
      <2000                251           89%                     2880
      <3000                120           93%                     3000
      <4000                 63           95%                     3063
      <5000                113           98%                     3176
      <10000                 4           98%                     3180
      <20000                28           99%                     3208
      <100000               25          100%                     3233
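The cumulative columns on this slide follow directly from the per-band site counts. As a minimal sketch (the variable and function names are illustrative, not from the slides), the running totals and rounded percentages can be recomputed like this:

```python
# Site counts per page-count band, copied from the slide's table.
bands = ["<50", "<1000", "<2000", "<3000", "<4000",
         "<5000", "<10000", "<20000", "<100000"]
site_counts = [1891, 738, 251, 120, 63, 113, 4, 28, 25]


def cumulative_share(counts):
    """Return (cumulative count, cumulative % of total) for each band."""
    total = sum(counts)
    running = 0
    rows = []
    for count in counts:
        running += count
        rows.append((running, round(100 * running / total)))
    return rows


rows = cumulative_share(site_counts)
for band, (cumulative, pct) in zip(bands, rows):
    print(f"{band:>8} {cumulative:>5} {pct:>3}%")
```

The last row comes out at 3,233 sites and 100%, and the "<2000" row at 89%, which is the basis for the slide's rounded "90% of sites" claim.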
  • 3. The Google Data - Raw
      - The Google data shows:
          - More than 80% of the content (in pages) is found in around 10% of the total count of sites
          - There are huge numbers of very small sites (per Google), although that may be because Google is unable to spider or does not cover all sites through the entire hierarchy
          - Still, errors in Google indexing are likely to be consistent across the entire population of .gov sites, making the shape of the graph likely ok

      [Chart: Google's site sizes. Site size axis from 0 to 100,000; percentage axis from 0% to 100%.]
  • 4. Counting Servers
      - Checking on the servers operating behind the websites in .gov.uk
          - Over 1,200 running Apache
          - And more than 1,500 running Microsoft IIS
      - These figures don’t include servers that may be configured but not active for, e.g. resilience. They also don’t include servers further down the infrastructure stack, e.g. running content applications or other code
      - Naturally, each of these servers is likely to be accompanied by firewall and storage configurations
      - At a conservative cost of £10,000 per server, the total cost of this infrastructure alone is over £29,000,000

      Server                    Count
      Apache (total)             1209
          Apache                  186
          Apache/1.3.26           274
          Apache/1.3.27           282
          Apache/1.3.28            62
          Apache/1.3.29            99
          Apache/2.0.40            25
          Apache/2.0.45             1
          Apache/2.0.46            32
          other Apache            248
      Microsoft-IIS (total)      1547
          IIS/4.0                 377
          IIS/5.0                1103
          IIS/6.0                  65
          other IIS                 2
      Lotus-Domino                109
      Netscape-Enterprise          74
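The "over £29,000,000" figure can be checked from the server counts on this slide. The sketch below is illustrative only: the dictionary names are made up, and the pairing of the IIS/6.0 and "other IIS" rows with 65 and 2 is an assumption based on the sub-totals adding up to the stated 1,547.

```python
# Per-family server counts as stated on the slide.
server_counts = {
    "Apache": 1209,
    "Microsoft-IIS": 1547,
    "Lotus-Domino": 109,
    "Netscape-Enterprise": 74,
}

# Version breakdowns; the IIS/6.0 = 65 and other IIS = 2 pairing is an
# assumption (it is the reading under which the rows sum to 1547).
apache_breakdown = {
    "Apache": 186, "Apache/1.3.26": 274, "Apache/1.3.27": 282,
    "Apache/1.3.28": 62, "Apache/1.3.29": 99, "Apache/2.0.40": 25,
    "Apache/2.0.45": 1, "Apache/2.0.46": 32, "other Apache": 248,
}
iis_breakdown = {"IIS/4.0": 377, "IIS/5.0": 1103, "IIS/6.0": 65, "other IIS": 2}

COST_PER_SERVER_GBP = 10_000  # the slide's "conservative cost" per server

total_servers = sum(server_counts.values())
total_cost_gbp = total_servers * COST_PER_SERVER_GBP

print(f"{total_servers} servers, roughly £{total_cost_gbp:,}")
```

On these figures the total is 2,939 servers, or about £29.4 million, before any firewall or storage costs are added.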
  • 5. Cost of Websites (Benchmarking)
      - Figures drawn from recent PQ (and, unless stated, include only hosting charges and not development or development support)
      - Not on record: dti, IR, HMCE, Home Office, DEFRA, ODPM

      [Chart: site costs plotted on an axis from 250k to 3.0m. Labels shown: ONS, DH, DfT, Worktrain, Business.gov?, JC+ (development), JC+, DWP, The Pension Service, OFT, TheRegister.com, Worktrain (development), HMT, DfES, Large Quasi-Public Sector (fully loaded).]
  • 6. Characteristics of .gov.uk sites
      - Inconsistent - five different look and feels
      - Unreliable - poor uptime
      - Huge - up to 100,000 pages
      - Complex - nine levels deep
      - Slow - download time more than one minute
      - More than 3200 sites
      - More than three navigation designs
      - 100s of broken links
      - Some parts of the site not linked to others: ‘orphan content’
      - More than 2.5 million documents
      - More than 200 URLs per dept
      - More than 300 authors
  • 7. Looking For The Right Thing?
      - Using Internet search engines in an effort to find “the right thing” can be challenging. The search terms listed below were entered on 30/03/2004, with the results restricted to the “.gov.uk” domain only
      - There is a huge amount of duplication in government online:
          - Many local authority sites repeat the description of the rules for claiming certain benefits, where to claim, what to claim for and so on … and doubtless, every year or so, each of these mentions must be updated with the correct rules (but what if they’re not?)
          - Even “self assessment” only has 4,950 mentions on the Inland Revenue’s own site, but a further 9,000 across the rest of government

      Search term (30/03/2004)       Results
      Disability Living Allowance     14,700
      Child Tax Credit                 5,790
      Carers Allowance                   915
      Working Family Tax Credit          546
      Attendance Allowance            13,000
      Council Tax Benefit             42,000
      Housing Benefit                 77,800
      Statutory Sick Pay               6,200
      Self Assessment                 14,000
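The "self assessment" example gives a rough measure of how much content sits outside the owning department's site. A minimal sketch of that arithmetic, using only the two figures on the slide (the variable names are illustrative):

```python
# Figures from the slide: total .gov.uk results for "Self Assessment" and the
# number of mentions on the Inland Revenue's own site.
total_results = 14_000
own_site_mentions = 4_950

# Mentions elsewhere in government, and the owning site's share of the total.
elsewhere = total_results - own_site_mentions
own_share_pct = 100 * own_site_mentions / total_results

print(f"Elsewhere in .gov.uk: {elsewhere:,}")
print(f"Share on the owning site: {own_share_pct:.0f}%")
```

This gives 9,050 mentions elsewhere, which the slide rounds to "a further 9,000", so only about a third of the mentions are on the site that owns the rules.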
  • 8. And how does .gov look to the consumer?
      - The variety of sites shows little in the way of consistency
      - Navigation varies from site to site: sometimes on the left, sometimes tabbed, sometimes graphic, sometimes text
      - “Search” is called different things, is often not on the home page and often returns poor results – despite research showing that consumers who can’t see what they want instantly will use search
      - Accessibility is poor, with many sites not attempting to clear even the lowest hurdles
      - Even sites owned by the same parent are confusing, e.g. pensionservice, pensionguide, agepositive, over50 …
  • 9. The Missing Data
      - To complete the picture and allow the proposed plan of action to be fine-tuned, the following data is needed:
          - Visitor counts (Hitwise may offer an approximation)
          - Approximate costs to operate (at an infrastructure level including all servers, network equipment, firewalls, software licences etc) – both the price bought at and the price for continued operations projected forwards (to allow for annual licence premiums, renewals etc that may be due in the future)
          - Contractual agreements around exit arrangements, renewal dates etc along with whether the contract for web hosting is part of a wider technology outsource agreement (that might, therefore, make it harder to exit)
  • 10. Proposal For What Next
      - Principles
          - Government is in the business of helping citizens by making information easy to find. The total number of websites needs to be rationalised dramatically – from over 3,000 to under 600 in the first stage (including Local Authorities).
          - Government is in the business of presenting information in a way that citizens will understand; it is not in the user interface design business. The range of navigational and interface styles needs to be harmonised to a single core style.
          - Government has already spent significant sums on its online presence, yet government is not a technology leader. The cost of the programme outlined must be absorbed through savings generated in the first year of the programme, making it self-funding.
          - Government buys in cycles and these are likely to be maintained. This cycle will allow work to be completed at a constant pace as contracts come to their natural end, thus incurring no exit penalties.
          - A programme of rationalisation this large will require multiple parallel streams of work – the cost of the overlap slightly reduces the inherent savings but increases the odds of success by eliminating bottlenecks and delays
  • 11. DotP versus Everything Else
      - Condensing 3,000+ sites to a few hundred is no simple task. It will likely require a variety of approaches and software solutions to ensure that there are no bottlenecks.
      - DotP’s primary characteristics are:
          - A managed service model (i.e. hardware, software, network included)
          - A high end content management engine allowing customised workflow, complex information architectures and large numbers of geographical authors
          - Highly resilient, scalable and secure infrastructure reducing the risk of failure
          - A model to allow changes to sites through configuration, not code customisation
          - A range of features tailored to solve government’s main content problems
      - Other content engines usually:
          - Come as a software licence with extensive customisation required
          - Have a range of features that DotP doesn’t have and that have been developed over several product cycles, primarily for commercial customers. Some of these features will be useful for government
          - Will develop competitively no matter what government does
          - But they rarely come as managed services, necessitating hosting and management to be included
  • 12. Setting Up The Programme
      - Select a core of important websites based on:
          - Total size (aiming to isolate 50% of the content in government)
          - Visitor count (capturing a large chunk of the audience, say 50%)
          - Transaction generation (targeting the bulk of online transactions for both business and citizen)
          - Content management status (looking first for unmanaged systems still based on HTML or those that are not well advanced in terms of a content engine)
      - Outline the information architecture as it is, coupled with the target architecture for how it should be – taking each site and fitting it into an overall architecture and design that is consistent across all of them
      - It is assumed that these sites – ranking as the most popular and largest in government – will need rearchitecting to make the most of them (including a new layout, new navigation and so on)
      - This rework will give a good chance to eliminate duplication and inconsistency, as well as remove as much as 30-50% of content as redundant (based on experience with Department of Health).
  • 13. Establishing The Target Platforms
      - To identify the target platforms, the following is proposed:
          - A “bake off” competition is kicked off where a variety of content management vendors are given an environment (with workspace, hardware and network connectivity).
          - Each vendor is given the same brief – to take an existing, static website – the “challenge site” – with a known information architecture and transfer it to a new target architecture (also provided).
          - The vendors then set up their systems, using templates and guidelines provided by government, to deliver the challenge site under strict timescales – including defining the architecture, implementing the style guidelines, integrating the search engine and migrating the content
          - At the end of the competition, a subset of the vendors who have met previously agreed and published criteria is passed through to the next stage
          - Commercial agreements are then built – using standard templates – with the vendors, allowing for volume discounts on licences to be obtained.
          - Websites in the core population are then allocated across vendors and the implementation task kicked off. Vendors that perform are given more, vendors that don’t perform are gradually eliminated and their work shared across other, more successful vendors
  • 14. Why a Bake Off?
      - Migrating some 3,000 websites is a fearsome task. Here is why there should be more than one solution going:
          - The problem is not one of only technology – the changes required to government editorial processes are enormous. The greater the range of experience thrown at this, the better the result
          - One single system (or even two or three) would result in bottlenecks that would delay rationalisation. Having several “similar” but independent systems will resolve the bottleneck
          - One large system would be high risk – a single outage could take down government’s online presence – spreading the systems will, in the end, reduce risk versus cost.
          - Competition is healthy – a few players working both together (to complete the goal) and against each other (to complete the goal first and therefore win business) will work well
          - But we need only a few (5, 6, 7?) – too many will bring too high an overhead and risk quality standards
  • 15. Estimating the Costs
      - The costs of migration will include:
          - The initial work to identify candidates
          - The evaluation of target platforms
          - The setting up of migration environments
          - The cost of redesign of some sites to make them consistent with the target standard (e.g. search engine on home page, navigation through tabs, reducing the depth of the site etc)
          - The cost of redesigning pages to fit the new system – e.g. where the site uses custom techniques that are not easily replicable
          - The actual migration of data from one format to another (there are tools that claim to do this, with varying success, or manual methods – these too will need to be assessed)
  • 16. Integrate … Marriott.com
      - One URL
      - 13 brands
      - Five major redesigns
      - 2,600 locations
      - 142,000 people
  • 17. Rationalise … IRS.gov
      - 235 sites … to one
      - 47% e-filing
      - 25 million regular users
      - AOL cache data at peaks
      - 80% of e-filers do it again
      - Accountants starting to charge $35 for those who want to do it on paper
  • 18. Unfocused and disorganised
  • 19. Organised and Focused