Scientific Computing - Hardware
 


Introduction to servers, HPC, and Cloud Computing

  • Lots of overlap between these – could have a single machine that runs all of these services.
  • Answer: we don’t know. Top row could be plugged in under my desk and used only to watch YouTube and edit Word documents. Bottom row could be formally installed in a data center and used to host a website. In reality, there’s not a clear distinction. Most “clients”/PCs actually run some “serving” software such as file or media sharing.
  • Most servers will meet most/all of these criteria. Of course, you could really argue that just about any technical device is a server of something, but this list defines typical usage.
  • See mostly Linux in Academia due to licensing concerns, among other reasons.
  • Only way to access many remote, non-graphical systems. Much more efficient. Many tools only have CLIs. Allows for unique combinations of tools, patched together. Graphical systems are truly easier to work with for many tasks, but when you start getting into bioinformatic analysis, you’ll begin to appreciate the power of the shell. Want to count how many times a particular motif occurs in a sequence file? Can be done in one line by combining two commands on a shell.
  • Makes sense to invest in a server everyone in the group can access and share remotely, rather than buying everyone a more powerful computer.
  • We have a group of 10 researchers. We have been sharing a single server, but have outgrown it. Too many people want to use it and it slows down when we all do. One group member is running a simulation that will take 2 weeks to complete if only run on this server.
  • Requires a lot of manual effort to inspect which server is the most available right now. If somebody starts running a big job on the server you’re on, it will slow down drastically. Still no way for these independent servers to collaborate on big jobs.
  • Not all software supports parallelization. Must be specifically written with that in mind.
  • User 4 logs in to the head node
  • User 4 creates a job.
  • Head node reads the job description and finds that the user will need one node. Sees that node #3 is not being used, assigns the job to node #3.
  • User 9 logs in
  • User 9 submits a job
  • Head node reads the job description and finds that the user will need one node. Sees that node #1 is not being used, assigns the job to node #1.
  • User 2 logs in
  • User 2 submits a job requiring 2 nodes to run collaboratively.
  • Head node checks existing nodes, finds that there aren’t 2 nodes available. Places User 2’s job in the queue.
  • Job #2 finishes
  • Head node receives notification, (optionally) notifies User 9 that his/her job is complete.
  • Head node now checks to see if there are sufficient resources to run the next job. There are, so it initializes the job.

Scientific Computing - Hardware: Presentation Transcript

  • Computing Hardware. Jeff Allen, Quantitative Biomedical Research Center, UT Southwestern Medical Center. BSCI5096 - 3.26.2013
  • Outline• Servers• Clusters• The Cloud
  • Outline• Servers – Concepts & Definitions – Novel properties of servers• Clusters• The Cloud
  • Servers – Concepts & Definitions: “A computer or program that supplies data or resources to other machines on a network.” • File Server • Email Server • Database Server • iTunes Server • Web Server • Computing Server. Source: server. (n.d.). Collins English Dictionary - Complete & Unabridged 10th Edition. Retrieved March 25, 2013, from Dictionary.com website: http://dictionary.reference.com/browse/server
  • Servers – Concepts & Definitions• Same hardware components as your Personal Computer – Processor, Memory, Power Supply, Hard Drive• Often stacked in a rack. Image from: http://www.stealth.com/industrial_rackmounts_sr1501datasheet.htm
  • Servers – Concepts & Definitions• Same hardware components as your Personal Computer – Processor, Memory, Power Supply, Hard Drive• Often stacked in a rack. Image from: http://www.daystarinc.com/hosting-facility
  • Is this a server? Image from: http://www.stealth.com/industrial_rackmounts_sr1501datasheet.htm
  • Is this a server? Image from: http://mediapool.getthespec.com/media.jpg?m=gBLSSTJ6IbHLuZD1JNnmyw%3D%3D&v=HR
  • Is this a server? Image from: http://www.maximumpc.com/articles/reviews/hardware
  • Is this a server? Image from: http://www.phonearena.com/image.php?m=Articles.Images&f=name&id=28259&name=GT-I8520_1.jpg&caption=&title=Image+from+%22UPDATED%3A+Samsung+I8520+is+an+Android+phone+with+built-in+projector%22&kw=&popup=1
  • Is this a server?
  • Common Attributes of a Server• Often runs an Operating System geared towards servers.• Primarily accessed remotely – Often “headless” (no monitor)• Runs 24/7, minimize downtime• May be kept in a data center – Superior cooling, increased security, etc.• Redundancy (Power, Disk Storage)• More powerful and expensive
  • Operating System. Client PCs: • Windows (XP, Vista, 7, 8) • Mac OS • Linux (Ubuntu, Mint, openSUSE). Servers: • Linux (Red Hat, Suse Enterprise, Ubuntu Server) • Windows (Windows Server 2003, 2008, 2012) • Non-Linux Unix (BSD, Solaris, AIX)
  • Remote Access & the Shell• Typically don’t have physical access to the server, must access over a network• Windows is heavily graphical, access using “Remote Desktop” Image from http://www.softsalad.com/software/remote-desktop-control.html
  • Remote Access & the Shell• Typically don’t have physical access to the server, must access over a network• Windows is heavily graphical, access using “Remote Desktop”• Linux is less graphical, access via a “Shell” Image from http://www.softsalad.com/software/remote-desktop-control.html
  • Shell Access: 1. User logs in 2. User types command 3. Computer executes shell command and prints output 4. User types another command 5. … 6. User logs off. Modified from http://software-carpentry.org/4_0/shell/intro.html
  • Shell Comparison: Windows (Graphical) vs. Linux (Shell). Image from http://www.dedoimedo.com/computers/windows-7.html
  • Shell Access• Steep learning curve• Can often be confusing at first, requires a new way of thinking• Ultimately very powerful and efficient• Three reasons to use: 1. It’s your only choice for remote access on some non-graphical systems 2. Many software tools only offer command-line interfaces 3. Allows for powerful new combinations of tools. Modified from http://software-carpentry.org/4_0/shell/intro.html
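  • A minimal sketch of the third reason, combining small steps into one result. On a shell, a motif count might be a two-command pipeline such as `grep -o MOTIF file | wc -l`; the Python below follows a similar idea. The function name `count_motif`, the sample records, and the motif `ACG` are illustrative assumptions, not part of the original slides. It counts non-overlapping matches within each line, skips FASTA-style header lines, and does not detect motifs that span a line break.

```python
def count_motif(lines, motif):
    """Count non-overlapping occurrences of motif across sequence lines."""
    total = 0
    for line in lines:
        line = line.strip()
        # Skip blank lines and FASTA-style headers, which start with ">".
        if not line or line.startswith(">"):
            continue
        total += line.upper().count(motif.upper())
    return total


# Hypothetical sequence records standing in for a real sequence file.
records = [">seq1", "ACGTACGT", ">seq2", "TTACGT"]
motif_hits = count_motif(records, "ACG")
print(motif_hits)
```

For a real file, the same function could be fed an open file handle, since iterating over a file yields its lines.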
  • Data Centers• Redundant, independent power feeds – Diesel generator backup• Redundant Internet connections• Redundant cooling• 24/7/365 staffing, restricted access
  • RAID• “Redundant Array of Independent Disks”• Store information redundantly• Support failure of one or more hard drives without losing data. (Diagram: Disk 1 through Disk 5 combined into one RAID Array)
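  • One common way to survive a single disk failure is XOR parity, the scheme used by RAID 5. The slide does not name a RAID level, so treating the five disks as four data disks plus one parity disk is an assumption made only for this sketch, as are the sample byte strings. Real arrays also spread parity across the disks, which is left out here.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical contents of four data disks (equal-sized blocks).
data_disks = [b"ACGT", b"TTGA", b"CCAT", b"GGTA"]

# The fifth disk stores the XOR of the other four.
parity_disk = xor_blocks(data_disks)

# Simulate losing one data disk, then rebuild it from the survivors and parity.
failed = 2
survivors = [disk for i, disk in enumerate(data_disks) if i != failed]
rebuilt = xor_blocks(survivors + [parity_disk])
print(rebuilt == data_disks[failed])
```

Because XOR is its own inverse, combining the surviving blocks with the parity block leaves exactly the missing block.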
  • Server Computing Power• Often very expensive machines• Hardware designed to support more resources than a PC – May have dozens or hundreds of GB of RAM – Very expensive, powerful processor, or even multiple processors
  • Outline• Servers• Clusters – Motivation & Concept – Job submission – Example• The Cloud
  • Example Problem• Group of 10 researchers• Too many concurrent users, runs slowly• Have some very large jobs
  • Naïve Solution• Buy more independent servers!• Let people connect to whichever server they want• Problems: – Not sure which servers are busiest – Still takes weeks to run big simulations
  • Clustered Solution• Servers are “nodes” in a cluster• Log in via head node• Head node manages requested jobs – Submits them to “worker” or “slave” nodes – Intelligently calculates available resources on each worker node• Multiple nodes can work on a single task
  • Job Submission• Prepare a script to be executed (“myjob.sh”) – Include specifications on resources required • (“-l nodes=2:ppn=4”) – Or what queue it should be submitted to • Different queues have different priorities and permissions• Submit that job to the head node (“qsub myjob.sh”)• Head node will begin executing as soon as it has sufficient resources
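  • A rough sketch of what “myjob.sh” might contain, generated from Python so the pieces are explicit. The `-l nodes=2:ppn=4` syntax and `qsub` suggest a PBS/Torque-style scheduler; that is an assumption here, as are the `#PBS -N` and `#PBS -q` directives, the `$PBS_O_WORKDIR` variable, and the helper names `build_job_script` and `submit`. The `submit` helper is defined but not called, since it only works on a machine where `qsub` is installed.

```python
import subprocess


def build_job_script(command, nodes=2, ppn=4, queue=None, name="myjob"):
    """Return the text of a PBS/Torque-style job script (assumed scheduler)."""
    lines = [
        "#!/bin/bash",
        f"#PBS -N {name}",                  # job name shown in the queue
        f"#PBS -l nodes={nodes}:ppn={ppn}",  # resources required
    ]
    if queue is not None:
        lines.append(f"#PBS -q {queue}")    # optional: which queue to use
    lines.append('cd "$PBS_O_WORKDIR"')      # start in the submission directory
    lines.append(command)
    return "\n".join(lines) + "\n"


def submit(path):
    """Hand the script to the head node with qsub; returns qsub's output."""
    result = subprocess.run(["qsub", path], capture_output=True, text=True, check=True)
    return result.stdout.strip()


# align.sh is the program name used in the example slides.
script = build_job_script("./align.sh")
print(script)
```

Writing `script` to a file named myjob.sh and calling `submit("myjob.sh")` would correspond to the “qsub myjob.sh” step on the slide.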
  • Clustered Solution
  • Clustered Solution. Job #1: User: User4, Nodes Req’d: 1, Program: align.sh
  • Clustered Solution. Job #1: User: User4, Nodes Req’d: 1, Program: align.sh. Job #2: User: User9, Nodes Req’d: 1, Program: simul.sh
  • Clustered Solution. Job #1: User: User4, Nodes Req’d: 1, Program: align.sh. Job #2: User: User9, Nodes Req’d: 1, Program: simul.sh. Job #3: User: User2, Nodes Req’d: 2, Program: splice.sh
  • Clustered Solution. Job #1: User: User4, Nodes Req’d: 1, Program: align.sh. Job #2: User: User9, Nodes Req’d: 1, Program: simul.sh. Queued: Job #3: User: User2, Nodes Req’d: 2, Program: splice.sh
  • Clustered Solution. Job #1: User: User4, Nodes Req’d: 1, Program: align.sh. Queued: Job #3: User: User2, Nodes Req’d: 2, Program: splice.sh
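  • The walkthrough above can be sketched as a tiny first-in, first-out scheduler. This is a simplified illustration, not how any particular cluster manager is implemented: the class name `HeadNode`, the three-node cluster size, and the lowest-numbered-node assignment rule are all assumptions, so the node numbers it picks will not match the slides.

```python
from collections import deque


class HeadNode:
    """Toy head node: starts jobs in submission order when nodes are free."""

    def __init__(self, node_count):
        self.free_nodes = set(range(1, node_count + 1))
        self.running = {}     # job name -> set of node numbers in use
        self.queue = deque()  # (job name, nodes required), oldest first

    def submit(self, name, nodes_required):
        self.queue.append((name, nodes_required))
        self._start_waiting_jobs()

    def finish(self, name):
        # Return the job's nodes to the pool, then re-check the queue.
        self.free_nodes |= self.running.pop(name)
        self._start_waiting_jobs()

    def _start_waiting_jobs(self):
        # Strict FIFO: only the job at the front of the queue may start.
        while self.queue and self.queue[0][1] <= len(self.free_nodes):
            name, needed = self.queue.popleft()
            assigned = set(sorted(self.free_nodes)[:needed])
            self.free_nodes -= assigned
            self.running[name] = assigned


head = HeadNode(node_count=3)           # assumed cluster size
head.submit("Job #1 (align.sh)", 1)     # User4
head.submit("Job #2 (simul.sh)", 1)     # User9
head.submit("Job #3 (splice.sh)", 2)    # User2: only one node free, so it waits
queued_before = [name for name, _ in head.queue]
head.finish("Job #2 (simul.sh)")        # frees a node, so Job #3 can start
print(queued_before, sorted(head.running))
```

Real schedulers add priorities, multiple queues, and per-core accounting on top of this basic loop.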
  • Clusters• Solve problem of sharing resources• Allow multiple nodes to collaborate on a single job – Programs must be specifically designed to run in this fashion• Can solve very large problems by combining hundreds of nodes together – Global weather forecasting, particle collisions at CERN, etc.
  • HPC at UT Southwestern• QBRC manages an 18 node cluster on-campus.• Have access to Texas Advanced Computing Center (TACC) at UT Austin – 6,400 node cluster with > 100k cores – Attracts many users, often a queue before your jobs will run.
  • Outline• Servers• Clusters• The Cloud
  • Cloud Computing• Vendors with access to massive computing resources began leasing their servers out – Amazon, Microsoft, Google, Rackspace – Charge per hour of use, usually just a few cents.
  • Cloud Computing - Advantages• No up-front purchase/cost• No hardware to manage• 100 servers in parallel is the same cost as a single server running for 100 hours – Can get parallel jobs done much more quickly
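  • The cost claim above is simple arithmetic, sketched here under two assumptions: billing is strictly per server-hour, and the job splits perfectly across servers. The rate of 10 cents per server-hour and the function name `rental_cost_cents` are made up for illustration.

```python
def rental_cost_cents(servers, hours, cents_per_server_hour=10):
    """Total cost when every server is billed for every hour it runs."""
    return servers * hours * cents_per_server_hour


# Same 100 server-hours of work, arranged two ways.
parallel = rental_cost_cents(servers=100, hours=1)
serial = rental_cost_cents(servers=1, hours=100)
print(parallel, serial)
```

The bill is the same either way, but the parallel arrangement finishes in 1 hour instead of 100.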
  • Cloud Computing - Disadvantages• Data must be transferred over the Internet – Can take hours to upload a large sequencing experiment.• Can be more expensive than internal clusters