New tests based on a ‘standard’ job from CMS, and based on hardware at FZK.
Running with xrootd, AFS, Lustre.
Results continue to indicate Lustre offers considerable performance benefits.
RAID Technology Update – End-to-End Data Protection Dr. Ted Pang (Infortrend)
Review of issues surrounding data integrity and protection in modern storage systems.
Errors can occur in any one or more components in the storage system.
Description of existing data integrity technologies;
ECC for each data block on HDD, RAID parity checking, CRC, ECC on data links.
Silent Data Corruption – unreported, unnoticed.
Could be caused by on-wire corruption, incorrect writes, unfinished writes.
Review of methods for detecting data errors: media scans -> rebuild, media error -> mark bad data block -> report error to host, RAID parity check on read. More checking is needed for data blocks.
Protection for data integrity: add 8 bytes to each 512-byte block, written along with the original block and checked as the data is re-read.
DIF format, IEEE P1619
How does ZFS fit into the story?
ZFS is a whole system whereas DIF is at the hardware layer.
What is the performance hit for DIF etc.?
~ 5 to 10% performance and 1.5% space.
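The 8-byte-per-block scheme described above is the T10 DIF tuple: a 16-bit guard CRC, a 16-bit application tag and a 32-bit reference tag holding the low bits of the block address, so both corruption and misplaced writes are caught on re-read. A minimal sketch in Python (the guard CRC polynomial is the T10 one; the exact byte layout here is illustrative):

```python
import struct

T10_DIF_POLY = 0x8BB7  # CRC-16 polynomial specified for the T10 DIF guard tag

def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16 (init 0, no reflection), as used for the guard tag."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ T10_DIF_POLY) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def protect(block: bytes, lba: int, app_tag: int = 0) -> bytes:
    """Write path: append the 8-byte protection field to a 512-byte block."""
    assert len(block) == 512
    guard = crc16_t10dif(block)
    return block + struct.pack(">HHI", guard, app_tag, lba & 0xFFFFFFFF)

def verify(sector: bytes, lba: int) -> bool:
    """Read path: recompute the guard CRC and check the reference tag."""
    block, pi = sector[:512], sector[512:]
    guard, _app, ref = struct.unpack(">HHI", pi)
    return guard == crc16_t10dif(block) and ref == (lba & 0xFFFFFFFF)
```

The 8 extra bytes per 520-byte sector account for the ~1.5% space overhead quoted above.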
Lustre File System news and roadmap Peter Bojanic (Sun)
Reviewed Lustre and developments.
Well used in HEP and used in 6 of the top 10 systems in the latest Top500.
Demonstrated wide scalability and high performance.
Notes on Lustre 1.8, 2.0, 3.0.
Lustre 1.8 has faster recovery in high availability use (adaptive timeouts), OST pools, OSS read cache, client SMP scalability.
Lustre 2.0 is the next ‘big’ release, with Kerberos security, server change logs, replication, commit on share, enhancements to client IO subsystem.
Scaling to 100k clients feasible, looking to a million clients in 5 years.
Increasing OSS numbers.
Investigating increasing MDS to more than 1+1 hot standby (32+).
OST read and write caches, including asynchronous IO speedups.
Looking at Flash Cache to speed up IO – store data bursts and drain out while compute node goes back to work.
End-to-end data integrity with Lustre network checksums, now working on data integrity using ZFS.
Investigating checksum offloading and checksum for ldiskfs.
Extensive news of other future developments.
CASTOR status and plans
New use case: user analysis, based mainly on disk-only use
Architecture changes necessary - chose xrootd as the protocol to give low latency access, bandwidth and simultaneous connections.
New plugin to handle Xrootd.
SRM2 and monitoring projects in CASTOR
Giuseppe Lo Presti
Review of SRM v2 and announcement of a review of the database schemas.
Dashboard for each instance, based on the Lemon/SLS system at CERN
CASTOR Operational Experiences
Miguel Coelho Dos Santos
Increasing Tape Efficiency
Review of post ‘Curran talk’ work on improvements to tape handling
Operating Systems & Applications
CVS Replacement: Getting Subversive
Report on program to replace CVS service at CERN with Subversion
High Performance Cryptographic Computing
Chen-Mou Cheng (National Taiwan University)
Interesting talk on using GPUs on modern Nvidia graphics cards for cryptographic work.
Update on Mail Services at CERN
Rafal Otto (CERN)
Update on Mail handling at CERN including SPAM fighting: two phases.
Phase 1 is to reject high-confidence spam with SMTP standard error mechanism – 96% of SPAM rejected this way.
Phase 2 is to quarantine medium-confidence spam. 50% of messages passing phase 1 test are still spam. Users can tune threshold (low/medium/high). Messages either delivered into spam folder or direct to user’s mailboxes.
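The two-phase handling above amounts to routing each message by its spam score; a sketch in Python (the thresholds are invented for illustration, the real CERN values were not given in the talk):

```python
# Hypothetical thresholds; the actual scoring scale and cut-offs are assumptions.
REJECT_THRESHOLD = 0.95                                     # phase 1 cut-off
USER_THRESHOLDS = {"low": 0.5, "medium": 0.7, "high": 0.9}  # phase 2, user-tunable

def route_message(spam_score: float, user_setting: str = "medium") -> str:
    """Map a spam score in [0, 1] to one of the three outcomes."""
    if spam_score >= REJECT_THRESHOLD:
        return "550 reject"      # refused with a standard SMTP error, sender notified
    if spam_score >= USER_THRESHOLDS[user_setting]:
        return "spam folder"     # medium-confidence quarantine
    return "inbox"               # delivered normally
```

Raising the user threshold to "high" lets more borderline messages through to the inbox; "low" quarantines more aggressively.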
Details of the upgrade to Exchange 2007 for the CERN mailers using public betas in a virtual environment followed by a rollout as hardware is upgraded.
Notes on replacing the mailing list system (SIMBA) using mix of SharePoint, exchange group addresses and e-groups.
Scientific Linux – Troy Dawson (FermiLab)
SL 5.2 releases in June 08 for i386 and x86_64.
Post install XFS, dynamic kernel modules, kdeedu, yum utilities.
Pine replaced by alpine, r1000 is in the kernel, yum-installonlyn now in yum itself.
SL5.2 live released for i386, x86_64.
SL4.7 released September 08.
SL4.7 Live released.
Automatic download and building of RedHat errata using virtual machines for SL3,4, mock for SL5.
Some issues – kernel modules done by hand, 64bit library misplacement on 64bit machines.
Fastbugs for SL4,5 pushed weekly.
Poor choice of RPM names caused a problem with updates – fixed with a plugin to RPM and new RPMs, but still finding ‘bad’ RPMs.
Working on automatic downloading of packages from RedHat.
Need a new logo for SL6 (must be a carbon atom).
Best guesses for next releases:
SL5.3 ~ March 09, SL4.8 ~ June 09, SL6 ~ Jan 2010.
Benchmarking I CPU Benchmarking at GridKa – Update
Manfred Alef (GridKa)
Reviewed new worker nodes available at GridKa (E5345, E5430, 2356), the benchmarks used and environment in which they are run.
Comparison of SPECint_base2000 for i386, x86_64/m32 and x86_64/m64 shows that 64bit on 64bit is best and that the Opterons show a greater improvement.
The differences with SPECint_base2006 are smaller with no special advantages for 64bit/64bit, and that for E5400 series chips, 32bit modes are better because of the overhead of shifting 64bit data.
Comparison of the scaled benchmarks show good scaling for Intel chips and better results for AMD chips if one uses 64bit on 64bit.
Discussion of the GridKa power consumption studies, starting with relevant configuration details of the nodes tested and the method of using Cpuburn.
Comparison of the data shows that overall the more modern chips are better for total power and efficiency (more cores).
Noted that San Clemente chipset (normal RAM) much less power hungry than the chipsets supporting FB DIMMs.
Also reviewed results obtained using the SPECpower benchmark for comparison which seem to show that the power measurements with CPUburn are a better match for our purposes.
Questioned the use of CPUburn rather than HPL, which would also load the RAM.
Suggests that the physics code is less RAM intensive.
CERN uses 50% CPUburn and 50% Lapack for power measurements.
Benchmarking II Power Efficiency of Servers
Reviewed the need for power efficiency (limited power, cost, green-ness), and a survey of the sources of power use in a server.
CERN approach is to treat the server as a unit and measure the AC input power.
Outlined method of adjudicating tenders to take into account power consumption.
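A common way to fold power consumption into tender adjudication is to add the estimated lifetime electricity cost to the bid price; a sketch under assumed figures (tariff, lifetime and cooling overhead are invented here, the actual CERN formula was not given):

```python
def adjudicated_cost(purchase_price: float, measured_power_w: float,
                     lifetime_years: float = 3.0,
                     tariff_per_kwh: float = 0.12,
                     pue: float = 1.5) -> float:
    """Bid price plus estimated lifetime energy cost at the measured AC input
    power; PUE accounts for cooling/distribution overhead. All defaults are
    illustrative assumptions, not CERN's actual parameters."""
    hours = lifetime_years * 365 * 24
    energy_kwh = measured_power_w / 1000 * hours * pue
    return purchase_price + energy_kwh * tariff_per_kwh
```

With these assumed figures, a 300 W server adds roughly 1400 units of energy cost to a 2000-unit bid over three years, so a less efficient but cheaper box can lose the tender.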
Data for generations of systems shows that efficiency is a factor of 9 better since 2004, leading to CERN retiring older systems much more aggressively.
(Intel) L v E chips: can get Es that take less power than the Ls.
Intel says an L is guaranteed to be an L, and an E may be as good as or better than an L.
Redundant PSUs waste a lot of power, so service nodes running idle can be expensive.
For disk servers, same things apply.
Some SATA disks are more efficient than others particularly when idle. Only a single CPU – sufficient performance. Choose the right chipset.
Q/A: Evident that blades are better for infrastructure machines because the redundant PSUs are shared.
Benchmarking III Benchmarking Working Group
WLCG MoUs based on SI2K
SPEC2000 deprecated in favour of SPEC2006 – no longer available or maintained
Find a benchmark accepted by HEP and others as many sites serve different communities
Work now essentially complete
Studied experiment apps, SPEC2006
64bit SL4 with 32bit apps – this is how the experiments currently run
Profiling codes using perfmon on lxbatch and SPEC2006 codes on test units shows that the c++ elements in SPEC2006 best match to experiment codes
Rigorous statistical treatment to be completed
All_cpp benchmark put forward as new benchmark standard – name to be determined
Advantage – must measure – vendors can’t look up the answer on the SPEC site
Measurement must use standard method
Run under prevailing environment at site
Need Site Licence for SPEC2006, everything else is provided
WG will submit a paper to CHEP09 detailing work
New group from WLCG management to take work on to review MoUs and process for revising commitments to new benchmark
Alef, Michelotto, Meinhard from original group
HEPiX group will continue efforts
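SPEC composite figures are geometric means of per-benchmark speed ratios against a fixed reference machine, and a composite such as the proposed All_cpp number would presumably be combined the same way; a sketch:

```python
import math

def spec_style_score(measured_runtimes, reference_runtimes):
    """Geometric mean of reference/measured runtime ratios, the way SPEC
    combines individual benchmark results into one composite figure."""
    ratios = [ref / t for ref, t in zip(reference_runtimes, measured_runtimes)]
    return math.prod(ratios) ** (1.0 / len(ratios))
```

The geometric mean is what makes the composite insensitive to the units of any one benchmark and hard for vendors to game by optimising a single component, which fits the "must measure" point above.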
New possibilities and ideas around Hyper-V virtualisation and Virtual Desktop Infrastructure (VDI):
Rafal Otto (CERN)
Review of virtual services provision at CERN.
Issues include limited performance with Virtual Server 2005, management difficult, lack of support for highly available clusters, lack of connection broker. Some new services not currently available: server consolidation, and a Virtual Desktop infrastructure.
Using Hyper-V as a new host – much better performance, uses hardware support for virtualisation, support for 64bit guests, multiple cores for guests, much bigger RAM for guests.
Notes on performance, details of new service and facilities.
Review of Virtual Desktop facilities and TCO.
Virtualization Experience with Hyper-V on Windows Server 2008
Reinhard Baltrusch (DESY)
Introduction to the virtualisation setup at DESY, based on Hyper-V on Windows Server 2008.
Reviewed past situation and hardware provision, and new work on their current setup.
Bridging Grid and Virtualization
Dr. Fubo Zhang (Platform Computing)
An intro to Platform Computing and its activities.
Described their concept of how the grid and virtualisation can work together.
Described LSF2 product that allows a virtual machine resource to be specified as a job resource parameter.
Networking – IPv4/IPv6 Transition Status and Recommendations
Fred Baker (CISCO)
Chair of the IETF IPv6 Operations Working Group and a past chair of the IETF (1996-2001).
Overview of his position on IPv6 from his period as chair of the IETF and later as an industry commentator, with some comments on the difficulties then anticipated in bringing the Chinese university system online.
Review of the current state of the IP universe and how the various players in the game see the issues and goals differently.
Fred sees the goal as ‘extending the lifetime of the internet with maximized application options and minimized long term operational cost’. The IETF basic transition recommendation is “turn it on in your existing IPv4 network”, and Fred outlined the possibilities of what this means. All solutions should provide both ipv4 and ipv6 transparently to all devices.
RFC 5211 gives an alternative deployment plan from John Curran. In 2009, prepare by bringing up v4+v6 and encourage use of IPv6. 2010, transition phase – raise the price of IPv4-only services to encourage transition. Turn off v4 by 2012, which seems optimistic.
Also reviewed alternative ways forward including translation between the two types of addresses (not enough IPv4 addresses for everyone on the planet, never mind all the sensors we might want).
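The "turn it on in your existing IPv4 network" recommendation relies on applications handling both families transparently, typically by walking the resolver's address list (usually AAAA before A) until a connection succeeds; a minimal sketch:

```python
import socket

def connect_dual_stack(host: str, port: int) -> socket.socket:
    """Try each resolved address in the order the resolver prefers
    (typically IPv6 first), so v4-only and v6-only peers both work."""
    last_err = None
    for family, socktype, proto, _canon, addr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        try:
            s = socket.socket(family, socktype, proto)
        except OSError as err:
            last_err = err
            continue
        try:
            s.connect(addr)
            return s
        except OSError as err:
            s.close()
            last_err = err
    raise last_err if last_err else OSError("no addresses resolved")
```

This is essentially the "happy client" half of dual stack: the application never needs to know which protocol version carried the connection.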
Cyber Security Update: Tony Cass (CERN)
Tony gave a talk about the general state of cyber security, some specific issues affecting CERN and the HEP community, and general trends.
Debian OpenSSL vulnerability
Signing of malicious binary RPMs by the bad guys using RedHat and Fedora keys
Ongoing compromised SSH key attacks.
Notes on the two recent network security threats (DNS poisoning, TCP stack issues), and the problems with many SCADA systems.
In the HEP community, there was a problem with infected USB sticks – remember floppies of old?
CERN will restrict access to external DNS servers and the use of TOR anonymous routing.
Enhanced logging and traceability from logging performed by IT/FIO.
Seeing lots of Web application attacks attempting to take over web servers.
email viruses with rapidly changing signatures
Semi-targeted phishing attacks, usually messages purporting to come from the local ‘IT department’.
Tony reviewed the CMSMON web compromise, and the general consequences of the exposure of much sensitive information because CERN is an ‘open lab’.
Plans for the future include notification of ssh logins from ‘new’ places.
Issues in general cyber security included the same classes of problem cropping up in software again and again:
more sophisticated and malicious viruses
kits for generating malicious software
MBR root kits delivered by drive-by download,
click-jacking, massive botnets, targeted automatic attacks driven by criminal gangs
Selected fragments from Site Reports
Site Reports I
Denise Heagerty stepping down as head of security.
Largest risks: flaws in custom-built apps, socially engineered phishing attacks, zero-day exploits.
Problems with too-open web servers, test accounts with weak passwords.
Skype trials for six months.
Designing new data centre but no official go-ahead (power running out). Can utilise chillers to implement some water-cooled racks.
Procurements: problems with a major supplier going out of business – orders not delivered. 1700 systems no longer covered by warranties.
Site Reports II
Pushing to ISO20000 and ITIL v2 framework.
Moving email and calendaring to Exchange.
University of Tokyo
Used for analysis by Atlas-Japan
Torque/Maui, Quattor, Twiki, Lemon, MRTG
Many sites migrating from national cell to local AFS cells
Description of the TRIP system allowing INFN-wide wireless access authentication based on RADIUS. A similar system was considered for email access, but met local resistance and a wish for local control.
IPV6 working group to evaluate addressing and connectivity to get used to it for when it is necessary. Some issues with IPv6 bypassing firewalls.
Site Reports III
Change of name to the SLAC National Accelerator Laboratory
Various staff changes
Chuck Boheim now at Cornell
Richard Mount now working for Atlas
New Director of Computing to be appointed
Progress with new building
Tier1 migration planning
Change of email addresses: [email_address]
Production Team (Gareth Smith, John Kelly)
Site Reports IV
Using system’s temperature monitoring to profile machine room temperature
Update on Lustre use
Power and cooling shortages
Using water-cooled racks provided an emergency solution – also much quieter
Various problems with sudden loss of cooling
Continued positive experience of Lustre
Site Reports V
Report on problems with MCP55 forcedeth driver:
An Nvidia driver with the same name and version number is available but with different code – there is no indication in the change log that it differs. Switching to it solved the problems.
Moving away from 3ware-based storage to SATA/FC to separate server from storage elements.