Session3

Software Development Life Cycle (SDLC)

Summary: As in any other engineering discipline, software engineering also has some structured models for software development. This document will provide you with a generic overview about different software development methodologies adopted by contemporary software firms. Read on to know more about the Software Development Life Cycle (SDLC) in detail.

Curtain Raiser

Like any other set of engineering products, software products are also oriented towards the customer. It is either market driven or it drives the market. Customer Satisfaction was the buzzword of the 80s. Customer Delight is today's buzzword and Customer Ecstasy is the buzzword of the new millennium. Products that are not customer or user friendly have no place in the market although they are engineered using the best technology. The interface of the product is as crucial as the internal technology of the product.

Market Research

A market study is made to identify a potential customer's need. This process is also known as market research. Here, the already existing need and the possible and potential needs that are available in a segment of the society are studied carefully. The market study is done based on a lot of assumptions. Assumptions are the crucial factors in the development or inception of a product's development. Unrealistic assumptions can cause a nosedive in the entire venture. Though assumptions are abstract, there should be a move to develop tangible assumptions to come up with a successful product.

Research and Development

Once the Market Research is carried out, the customer's need is given to the Research & Development division (R&D) to conceptualize a cost-effective system that could potentially solve the customer's needs in a manner that is better than the one adopted by the competitors at present. Once the conceptual system is developed and tested in a hypothetical environment, the development team takes control of it.
The development team adopts one of the software development methodologies given below, develops the proposed system, and gives it to the customer.

The Sales & Marketing division starts selling the software to the available customers and simultaneously works to develop a niche segment that could potentially buy the software. In addition, the division also passes the feedback from the customers to the developers and the R&D division to make possible value additions to the product.

While developing software, the company outsources the non-core activities to other companies who specialize in those activities. This accelerates the software development process considerably. Some companies work on tie-ups to bring out a highly matured product in a short period.

Popular Software Development Models

The following are some basic popular models that are adopted by many software development firms:

A. System Development Life Cycle (SDLC) Model
B. Prototyping Model
C. Rapid Application Development Model
D. Component Assembly Model

A. System Development Life Cycle (SDLC) Model

This is also known as the Classic Life Cycle Model, the Linear Sequential Model, or the Waterfall Method. This model has the following activities.

1. System/Information Engineering and Modeling

As software is always part of a larger system (or business), work begins by establishing the requirements for all system elements and then allocating some subset of these requirements to software. This system view is essential when the software must interface with other elements such as hardware, people and other resources. The system is the basic and very critical requirement for the existence of software in any entity. So if the system is not in place, the system should be engineered and put in place. In some cases, to extract the maximum output, the system should be re-engineered and spruced up. Once the ideal system is engineered or tuned, the development team studies the software requirement for the system.

2. Software Requirement Analysis

This process is also known as a feasibility study. In this phase, the development team visits the customer and studies their system. They investigate the need for possible software automation in the given system. By the end of the feasibility study, the team furnishes a document that holds the different specific recommendations for the candidate system. It also includes the personnel assignments, costs, project schedule, target dates, etc. The requirement gathering process is intensified and focused specially on software. To understand the nature of the program(s) to be built, the system engineer or "Analyst" must understand the information domain for the software, as well as required function, behavior, performance and interfacing. The essential purpose of this phase is to find the need and to define the problem that needs to be solved.
3. System Analysis and Design

In this phase, the software development process, the software's overall structure and its nuances are defined. In terms of the client/server technology, the number of tiers needed for the package architecture, the database design, the data structure design, etc. are all defined in this phase. A software development model is thus created. Analysis and Design are very crucial in the whole development cycle. Any glitch in the design phase could be very expensive to solve in the later stage of the software development. Much care is taken during this phase. The logical system of the product is developed in this phase.

4. Code Generation

The design must be translated into a machine-readable form. The code generation step performs this task. If the design is performed in a detailed manner, code generation can be accomplished without much complication. Programming tools like compilers, interpreters, debuggers, etc. are used to generate the code. Different high level programming languages like C, C++, Pascal and Java are used for coding. With respect to the type of application, the right programming language is chosen.

5. Testing

Once the code is generated, the software program testing begins. Different testing tools and methodologies are available to uncover the bugs that were introduced during the previous phases. Some companies build their own testing tools that are tailor made for their own development operations.

6. Maintenance

The software will definitely undergo change once it is delivered to the customer. There can be many reasons for this change to occur. Change could happen because of some unexpected input values into the system. In addition, the changes in the system could directly affect the software operations. The software should be developed to accommodate changes that could happen during the post implementation period.

B. Prototyping Model

This is a cyclic version of the linear model.
In this model, once the requirement analysis is done and the design for a prototype is made, the development process gets started. Once the prototype is created, it is given to the customer for evaluation. The customer tests the package and gives his/her feedback to the developer, who refines the product according to the customer's exact expectation. After a finite number of iterations, the final software package is given to the customer. In this methodology, the software is evolved as a result of periodic shuttling of information between the customer and developer. This is the most popular development model in the contemporary IT industry. Most of the successful software products have been developed using this model, as it is very difficult (even for a whiz kid!) to comprehend all the requirements of a customer in one shot. There are many variations of this model skewed with respect to the project management styles of the companies. New versions of a software product evolve as a result of prototyping.

C. Rapid Application Development (RAD) Model

The RAD model is a linear sequential software development process that emphasizes an extremely short development cycle. The RAD model is a "high speed" adaptation of the linear sequential model in which rapid development is achieved by using a component-based construction approach. Used primarily for information systems applications, the RAD approach encompasses the following phases:

1. Business modeling

The information flow among business functions is modeled in a way that answers the following questions:

What information drives the business process?
What information is generated?
Who generates it?
Where does the information go?
Who processes it?

2. Data modeling

The information flow defined as part of the business modeling phase is refined into a set of data objects that are needed to support the business. The characteristics (called attributes) of each object are identified and the relationships between these objects are defined.

3. Process modeling

The data objects defined in the data-modeling phase are transformed to achieve the information flow necessary to implement a business function. Processing descriptions are created for adding, modifying, deleting, or retrieving a data object.

4. Application generation

The RAD model assumes the use of RAD tools like VB, VC++, Delphi, etc. rather than creating software using conventional third generation programming languages. The RAD model works to reuse existing program components (when possible) or create reusable components (when necessary). In all cases, automated tools are used to facilitate construction of the software.
5. Testing and turnover

Since the RAD process emphasizes reuse, many of the program components have already been tested. This minimizes the testing and development time.

D. Component Assembly Model

Object technologies provide the technical framework for a component-based process model for software engineering. The object oriented paradigm emphasizes the creation of classes that encapsulate both data and the algorithms that are used to manipulate the data. If properly designed and implemented, object oriented classes are reusable across different applications and computer based system architectures. The Component Assembly Model leads to software reusability. The integration/assembly of the already existing software components accelerates the development process. Nowadays many component libraries are available on the Internet. If the right components are chosen, the integration aspect is made much simpler.

Conclusion

All these different software development models have their own advantages and disadvantages. Nevertheless, in the contemporary commercial software development world, a fusion of all these methodologies is used. Timing is very crucial in software development. If a delay happens in the development phase, the market could be taken over by the competitor. Also, if a bug filled product is launched in a short period of time (quicker than the competitors), it may affect the reputation of the company. So, there should be a tradeoff between the development time and the quality of the product. Customers don't expect a bug free product but they expect a user-friendly product. That results in Customer Ecstasy!

Systems Development Life Cycle

From Wikipedia, the free encyclopedia

Systems Development Life Cycle (SDLC), or sometimes just SLC, is defined by the U.S. Department of Justice (DoJ) as a software development process, although it is also a distinct process independent of software or other information technology considerations. It is used by a systems analyst to develop an information system, including requirements, validation, training, and user ownership through investigation, analysis, design, implementation, and maintenance. SDLC is also known as information systems development or application development. An SDLC should result in a high quality system that meets or exceeds customer expectations, within time and cost estimates, works effectively and efficiently in the current and planned information technology infrastructure, and is cheap to maintain and cost-effective to enhance. SDLC is a systematic approach to problem solving and is composed of several phases, each comprised of multiple steps:

• The software concept - identifies and defines a need for the new system
• A requirements analysis - analyzes the information needs of the end users
• The architectural design - creates a blueprint for the design with the necessary specifications for the hardware, software, people and data resources
• Coding and debugging - creates and programs the final system
• System testing - evaluates the system's actual functionality in relation to expected or intended functionality.

1. Implementation 2. Testing 3. Evaluation
or
1. Feasibility Study 2. Analysis 3. Design 4. Development 5. Implementation 6. Maintenance
or
1. Feasibility Study 2. Analysis 3. Design 4. Implementation 5. Maintenance
or
1. Feasibility Study 2. Analysis 3. Design 4. Development 5. Testing 6. Implementation 7. Maintenance
or
1. Analysis (including Feasibility Study) 2. Design 3. Development 4. Implementation 5. Evaluation
or
1. Feasibility Study 2. Analysis 3. Design 4. Implementation 5. Testing 6. Evaluation 7. Maintenance

The last row represents the most commonly used Life Cycle steps (used also in AQA module exams).

The Systems Life Cycle (UK Version)

The SDLC is referred to as the Systems Life Cycle (SLC) in the United Kingdom, whereby the following names are used for each stage:

1. Terms Of Reference: the management will decide what capabilities and objectives they wish the new system to incorporate;
2. Feasibility Study: asks whether the management's concept of their desired new system is actually an achievable, realistic goal, in terms of money, time and end result difference to the original system. Often, it may be decided to simply update an existing system, rather than to completely replace one;
3. Fact Finding and Recording: how is the current system used? Often questionnaires are used here, but just monitoring (watching) the staff to see how they work is better, as people who are merely asked will often be reluctant, through embarrassment, to be entirely honest about the parts of the existing system they have trouble with and find difficult;
4. Analysis: free from any cost or unrealistic constraints, this stage lets minds run wild as wonder systems can be thought up, though all must incorporate everything asked for by the management in the Terms Of Reference section;
5. Design: designers will produce one or more models of what they see a system eventually looking like, with ideas from the analysis section either used or discarded. A document will be produced with a description of the system, but nothing is specific: they might say touchscreen or GUI operating system, but not mention any specific brands;
6. System Specification: having generically decided on which software packages to use and hardware to incorporate, you now have to be very specific, choosing exact models, brands and suppliers for each software application and hardware device;
7. Implementation and Review: set up and install the new system (including writing any custom (bespoke) code required), train staff to use it and then monitor how it operates for initial problems, and then regularly maintain thereafter. During this stage, any old system that was in use will usually be discarded once the new one has proved it is reliable and as usable.
8. Use: obviously the system needs to actually be used by somebody, otherwise the above process would be completely useless.
9. Close: the last step in a system's life cycle is its end, which is most often forgotten when you design the system. The system can be closed, it can be migrated to another (more modern) platform, or its data can be migrated into a replacing system.

Systems Development Life Cycle: Building the System

All methods undertake the seven steps listed under insourcing to different degrees:

Insourcing

Insourcing is defined as having IT specialists within an organization to build the organization’s system by

• Planning – establishing the plans for creating an information system by
  o Defining the system to be developed – based on the systems prioritized according to the organization’s critical success factor (CSF), a system must be identified and chosen
  o Setting the project scope – a high level of system requirements must be defined and put into a project scope document
  o Developing the project plan – all details from tasks to be completed, who completed them and when they were completed must be formalized
  o Managing and monitoring the project plan – this allows the organization to stay on track, creating project milestones and feature creeps which allow you to add to the initial plan
• Analysis – the users and IT specialists collaborate to collect, comprehend, and logistically formalize business requirements by
  o Gathering the business requirements – IT specialists and knowledge workers collaborate in a joint application design (JAD) and discuss which tasks to undertake to make the system most successful
  o Analyzing the requirements – business requirements are prioritized and put in a requirements definition document where the knowledge worker will approve and place their signatures
• Design – this is where the technical blueprint of the system is created by
  o Designing the technical architecture – choosing amongst the architectural designs of telecommunications, hardware and software that will best suit the organization’s system and future needs
  o Designing the systems model – graphically creating a model from graphical user interface (GUI), GUI screen design, and databases, to placement of objects on screen
  o Writing the test conditions – work with the end users to develop the test scripts according to the system requirements
• Development – executing the design into a physical system by
  o Building the technical architecture – purchasing the material needed to build the system
  o Building the database and programs – the IT specialists write programs which will be used on the system
• Testing – testing the developed system
  o Test the system using the established test scripts – test conditions are conducted by comparing expected outcomes to actual outcomes. If these differ, a bug is generated and a backtrack to the development stage must occur.
• Deployment – the systems are placed and used in the actual workforce and
  o The user guide is created
  o Training is provided to the users of the system, usually through workshops or online
• Maintenance – keeping the system up to date with the changes in the organization and ensuring it meets the goals of the organization by
  o Building a help desk to support the system users – having a team available to aid technical difficulties and answer questions
  o Implementing changes to the system when necessary.

Selfsourcing

Selfsourcing is defined as having knowledge workers within an organization build the organization’s system

• Align selfsourcing applications to the goals of the organization – all intentions must be related to the organization’s goals, and time management is key.
• Establish what external assistance will be necessary – this may be where an IT specialist in the organization may assist
• Document and formalize the completed system created for future users
• Provide ongoing support – being able to maintain and make adjustments to the system as the environment changes.

Prototyping
Prototyping is defined as creating a model which displays the necessary characteristics of a proposed system

• Gathering requirements – these requirements will be stated by the knowledge workers as well as become apparent in comparison with the old or existing system
• Create prototype of system – confirm a technically proficient system by using prototypes and create basic screens and reports
• Review by knowledge workers – create a model of the system that will be analyzed, inspected and evaluated by knowledge workers who will propose recommendations to have the system reach its maximum potential
• Revise the prototype – if necessary
• Market the idea of the new system – use the prototype to sell the new system and convince the organization of the advantages of switching to the new system

Outsourcing

Outsourcing is defined as having a third party (outside the organization) build the organization’s system so expert minds can create the highest quality system by

• Outsourcing for development software –
  o Purchasing existing software and paying the publisher to make certain modifications, or paying the publisher for the right to make modifications yourself
  o Outsourcing the development of an entirely new unique system for which no software exists
• Selecting a target system – make sure there is no confidential information critical to the organization that others should not see. If the organization is small enough, consider selfsourcing
• Establish logical requirements – to gather business requirements, IT specialists and knowledge workers collaborate in a joint application design (JAD) and discuss which tasks to undertake to make the system most successful
• Develop a request for proposal – a request for proposal (RFP) is created and formalized. It includes everything the home organization is looking for in the system and can be used as the legal binding contract
• Evaluate request for proposal returns and choose a vendor amongst the many who have replied with different prototypes
• Test and Accept a Solution – the chosen system must be tested by the home organization and a sign-off must be conducted
• Monitor and Reevaluate – keep the system up to date with the changing environment and evaluate the chosen vendor’s ability to accommodate changes and maintain the system

Algorithm
In mathematics, computing, linguistics, and related disciplines, an algorithm is a definite list of well-defined instructions for completing a task; that given an initial state, will proceed through a well-defined series of successive states, eventually terminating in an end-state. The transition from one state to the next is not necessarily deterministic; some algorithms, known as probabilistic algorithms, incorporate randomness.

The concept of an algorithm originated as a means of recording procedures for solving mathematical problems such as finding the common divisor of two numbers or multiplying two numbers. A partial formalization of the concept began with attempts to solve the Entscheidungsproblem (the "decision problem") that David Hilbert posed in 1928. Subsequent formalizations were framed as attempts to define "effective calculability" (cf Kleene 1943:274) or "effective method" (cf Rosser 1939:225); those formalizations included the Gödel-Herbrand-Kleene recursive functions of 1930, 1934 and 1935, Alonzo Church's lambda calculus of 1936, Emil Post's "Formulation I" of 1936, and Alan Turing's Turing machines of 1936-7 and 1939.

Etymology

Al-Khwārizmī, Persian astronomer and mathematician, wrote a treatise in Arabic in 825 AD, On Calculation with Hindu Numerals. (See algorism). It was translated into Latin in the 12th century as Algoritmi de numero Indorum,[1] which title was likely intended to mean "[Book by] Algoritmus on the numbers of the Indians", where "Algoritmi" was the translator's rendition of the author's name in the genitive case; but people misunderstanding the title treated Algoritmi as a Latin plural and this led to the word "algorithm" (Latin algorismus) coming to mean "calculation method". The intrusive "th" is most likely due to a false cognate with the Greek αριθμος (arithmos) meaning "number".

Flowcharts are often used to graphically represent algorithms.

Why algorithms are necessary: an informal definition

No generally accepted formal definition of "algorithm" exists yet. We can, however, derive clues to the issues involved and an informal meaning of the word from the following quotation from Boolos and Jeffrey (1974, 1999):

"No human being can write fast enough, or long enough, or small enough to list all members of an enumerably infinite set by writing out their names, one after another, in some notation. But humans can do something equally useful, in the case of certain enumerably infinite sets: They can give explicit instructions for determining the nth member of the set, for arbitrary finite n. Such instructions are to be given quite explicitly, in a form in which they could be followed by a computing machine, or by a human who is capable of carrying out only very elementary operations on symbols" (boldface added, p. 19).

The words "enumerably infinite" mean "countable using integers perhaps extending to infinity". Thus Boolos and Jeffrey are saying that an algorithm implies instructions for a process that "creates" output integers from an arbitrary "input" integer or integers that, in theory, can be chosen from 0 to infinity.
Thus we might expect an algorithm to be an algebraic equation such as y = m + n: two arbitrary "input variables" m and n that produce an output y. As we see in Algorithm characterizations, the word algorithm implies much more than this, something on the order of (for our addition example):

Precise instructions (in language understood by "the computer") for a "fast, efficient, good" process that specifies the "moves" of "the computer" (machine or human, equipped with the necessary internally-contained information and capabilities) to find, decode, and then munch arbitrary input integers/symbols m and n, symbols + and = ... and (reliably, correctly, "effectively") produce, in a "reasonable" time, output-integer y at a specified place and in a specified format.

The concept of algorithm is also used to define the notion of decidability (logic). That notion is central for explaining how formal systems come into being starting from a small set of axioms and rules. In logic, the time that an algorithm requires to complete cannot be measured, as it is not apparently related with our customary physical dimension. From such uncertainties, that characterize ongoing work, stems the unavailability of a definition of algorithm that suits both concrete (in some sense) and abstract usage of the term.

For a detailed presentation of the various points of view around the definition of "algorithm" see Algorithm characterizations. For examples of simple addition algorithms specified in the detailed manner described in Algorithm characterizations, see Algorithm examples.

Formalization of algorithms

Algorithms are essential to the way computers process information, because a computer program is essentially an algorithm that tells the computer what specific steps to perform (in what specific order) in order to carry out a specified task, such as calculating employees’ paychecks or printing students’ report cards. Thus, an algorithm can be considered to be any sequence of operations that can be performed by a Turing-complete system.
Authors who assert this thesis include Savage (1987) and Gurevich (2000):

"...Turing's informal argument in favor of his thesis justifies a stronger thesis: every algorithm can be simulated by a Turing machine" (Gurevich 2000:1) ...according to Savage [1987], "an algorithm is a computational process defined by a Turing machine." (Gurevich 2000:3)

Typically, when an algorithm is associated with processing information, data are read from an input source or device, written to an output sink or device, and/or stored for further processing. Stored data are regarded as part of the internal state of the entity performing the algorithm. In practice, the state is stored in a data structure, but an algorithm requires the internal data only for specific operation sets called abstract data types.

For any such computational process, the algorithm must be rigorously defined: specified in the way it applies in all possible circumstances that could arise. That is, any conditional steps must be systematically dealt with, case-by-case; the criteria for each case must be clear (and computable).
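
The requirement that every conditional case be dealt with explicitly can be illustrated with a small sketch. The example is not taken from the source: it uses Euclid's algorithm for the greatest common divisor, one classic answer to the "common divisor of two numbers" problem mentioned earlier. The function name gcd_explicit and the decision to reject gcd(0, 0) are assumptions made purely for illustration.

```python
def gcd_explicit(m: int, n: int) -> int:
    """Greatest common divisor via Euclid's algorithm (illustrative sketch)."""
    # Case 1: inputs outside the algorithm's domain are rejected outright.
    if not isinstance(m, int) or not isinstance(n, int):
        raise TypeError("both inputs must be integers")
    # Case 2: negative inputs are reduced to their magnitudes.
    m, n = abs(m), abs(n)
    # Case 3: gcd(0, 0) is treated as undefined (an assumption of this sketch).
    if m == 0 and n == 0:
        raise ValueError("gcd(0, 0) is undefined in this sketch")
    # Main loop: the pair (m, n) is the internal state. Each assignment
    # replaces it with (n, m mod n), so n strictly decreases until it is 0.
    while n != 0:
        m, n = n, m % n
    return m


if __name__ == "__main__":
    print(gcd_explicit(48, 18))   # 6
    print(gcd_explicit(0, 5))     # 5
    print(gcd_explicit(-12, 8))   # 4
```

Each branch has a clear, computable criterion, so the behavior is specified for every possible pair of inputs rather than only for the "usual" positive ones.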
Because an algorithm is a precise list of precise steps, the order of computation will almost always be critical to the functioning of the algorithm. Instructions are usually assumed to be listed explicitly, and are described as starting "from the top" and going "down to the bottom", an idea that is described more formally by flow of control.

So far, this discussion of the formalization of an algorithm has assumed the premises of imperative programming. This is the most common conception, and it attempts to describe a task in discrete, "mechanical" means. Unique to this conception of formalized algorithms is the assignment operation, setting the value of a variable. It derives from the intuition of "memory" as a scratchpad. There is an example below of such an assignment.

For some alternate conceptions of what constitutes an algorithm see functional programming and logic programming.

Termination

Some writers restrict the definition of algorithm to procedures that eventually finish. In such a category Kleene places the "decision procedure or decision method or algorithm for the question" (Kleene 1952:136). Others, including Kleene, include procedures that could run forever without stopping; such a procedure has been called a "computational method" (Knuth 1997:5) or "calculation procedure or algorithm" (Kleene 1952:137); however, Kleene notes that such a method must eventually exhibit "some object" (Kleene 1952:137).

Minsky makes the pertinent observation, in regards to determining whether an algorithm will eventually terminate (from a particular starting state):

"But if the length of the process is not known in advance, then trying it may not be decisive, because if the process does go on forever - then at no time will we ever be sure of the answer" (Minsky 1967:105)

As it happens, no other method can do any better, as was shown by Alan Turing with his celebrated result on the undecidability of the so-called halting problem.
There is no algorithmic procedure for determining of arbitrary algorithms whether or not they terminate from given starting states. The analysis of algorithms for their likelihood of termination is called termination analysis.

In the case of a non-halting computation method (calculation procedure), success can no longer be defined in terms of halting with a meaningful output. Instead, terms of success that allow for unbounded output sequences must be defined. For example, an algorithm that verifies if there are more zeros than ones in an infinite random binary sequence must run forever to be effective. If it is implemented correctly, however, the algorithm's output will be useful: for as long as it examines the sequence, the algorithm will give a positive response while the examined zeros outnumber the ones, and a negative response otherwise. Success for this algorithm could then be defined as eventually outputting only positive responses if there are actually more zeros than ones in the sequence, and in any other case outputting any mixture of positive and negative responses.

See the examples of (im-)"proper" subtraction at partial function for more about what can happen when an algorithm fails for certain of its input numbers, e.g., (i) non-termination, (ii) production of "junk" (output in the wrong format to be considered a number) or no number(s) at all (halt ends the computation with no output), (iii) wrong number(s), or (iv) a combination of these. Kleene proposed that the production of "junk" or failure to produce a number is solved by having the algorithm detect these instances and produce e.g., an error message (he suggested "0"), or preferably, force the algorithm into an endless loop (Kleene 1952:322). Davis does this to his subtraction algorithm: he fixes his algorithm in a second example so that it is proper subtraction (Davis 1958:12-15). Along with the logical outcomes "true" and "false" Kleene also proposes the use of a third logical symbol "u", undecided (Kleene 1952:326); thus an algorithm will always produce something when confronted with a "proposition". The problem of wrong answers must be solved with an independent "proof" of the algorithm e.g., using induction:

"We normally require auxiliary evidence for this (that the algorithm correctly defines a mu recursive function), e.g., in the form of an inductive proof that, for each argument value, the computation terminates with a unique value" (Minsky 1967:186)

Expressing algorithms

Algorithms can be expressed in many kinds of notation, including natural languages, pseudocode, flowcharts, and programming languages. Natural language expressions of algorithms tend to be verbose and ambiguous, and are rarely used for complex or technical algorithms. Pseudocode and flowcharts are structured ways to express algorithms that avoid many of the ambiguities common in natural language statements, while remaining independent of a particular implementation language.
Programming languages are primarily intended for expressing algorithms in a form that can be executed by a computer, but are often used as a way to define or document algorithms.

There is a wide variety of representations possible and one can express a given Turing machine program as a sequence of machine tables (see more at finite state machine and state transition table), as flowcharts (see more at state diagram), or as a form of rudimentary machine code or assembly code called "sets of quadruples" (see more at Turing machine).

Sometimes it is helpful in the description of an algorithm to supplement small "flow charts" (state diagrams) with natural-language and/or arithmetic expressions written inside "block diagrams" to summarize what the "flow charts" are accomplishing.

Representations of algorithms are generally classed into three accepted levels of Turing machine description (Sipser 2006:157):
• 1 High-level description: "...prose to describe an algorithm, ignoring the implementation details. At this level we do not need to mention how the machine manages its tape or head"
• 2 Implementation description: "...prose used to define the way the Turing machine uses its head and the way that it stores data on its tape. At this level we do not give details of states or transition function"
• 3 Formal description: Most detailed, "lowest level", gives the Turing machine's "state table".

For an example of the simple algorithm "Add m+n" described in all three levels see Algorithm examples.

Implementation

Most algorithms are intended to be implemented as computer programs. However, algorithms are also implemented by other means, such as in a biological neural network (for example, the human brain implementing arithmetic or an insect looking for food), in an electrical circuit, or in a mechanical device.

Example

One of the simplest algorithms is to find the largest number in an (unsorted) list of numbers. The solution necessarily requires looking at every number in the list, but only once at each. From this follows a simple algorithm, which can be stated as a high-level description in English prose:

High-level description:
1. Assume the first item is largest.
2. Look at each of the remaining items in the list and if it is larger than the largest item so far, make a note of it.
3. The last noted item is the largest in the list when the process is complete.

(Quasi-)formal description: Written in prose but much closer to the high-level language of a computer program, the following is the more formal coding of the algorithm in pseudocode or pidgin code:

Algorithm LargestNumber
  Input: A non-empty list of numbers L.
  Output: The largest number in the list L.

  largest ← L[0]
  for each item in the list L[1..], do
    if the item > largest, then
      largest ← the item
  return largest

• "←" is a loose shorthand for "changes to". For instance, "largest ← item" means that the value of largest changes to the value of item.
• "return" terminates the algorithm and outputs the value that follows.

For a more complex example of an algorithm, see Euclid's algorithm for the greatest common divisor, one of the earliest algorithms known.

Algorithm analysis

It is often important to know how much of a particular resource (such as time or storage) is required for a given algorithm. Methods have been developed for the analysis of algorithms to obtain such quantitative answers; for example, the algorithm above has a time requirement of O(n), using the big O notation with n as the length of the list. At all times the algorithm only needs to remember two values: the largest number found so far, and its current position in the input list. Therefore it is said to have a space requirement of O(1). [2] (Note that the size of the inputs is not counted as space used by the algorithm.)

Different algorithms may complete the same task with a different set of instructions in less or more time, space, or effort than others. For example, given two different recipes for making potato salad, one may have "peel the potato" before "boil the potato" while the other presents the steps in the reverse order, yet they both call for these steps to be repeated for all potatoes and end when the potato salad is ready to be eaten.

The analysis and study of algorithms is a discipline of computer science, and is often practiced abstractly without the use of a specific programming language or implementation. In this sense, algorithm analysis resembles other mathematical disciplines in that it focuses on the underlying properties of the algorithm and not on the specifics of any particular implementation.
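The LargestNumber pseudocode above, together with its O(n) time and O(1) space costs, can be sketched in a real language as follows. Python is used here only for illustration; the function name `largest_number` and the choice to raise an error on an empty list are assumptions, since the pseudocode simply requires a non-empty input.

```python
from typing import Sequence


def largest_number(numbers: Sequence[float]) -> float:
    """Return the largest item of a non-empty sequence."""
    if not numbers:
        # The pseudocode assumes a non-empty list; this sketch rejects
        # an empty one explicitly instead of failing obscurely.
        raise ValueError("the list must be non-empty")
    largest = numbers[0]                  # largest <- L[0]
    # Walk the remaining positions by index, so the only extra state is
    # the running maximum and the current position: O(1) space.
    for position in range(1, len(numbers)):
        item = numbers[position]
        if item > largest:
            largest = item                # largest <- the item
    return largest
```

Each item is inspected exactly once, which is where the O(n) time bound comes from. Iterating by index rather than over a slice avoids copying the list, so the sketch keeps to the constant extra space described in the analysis.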
Usually pseudocode is used for analysis as it is the simplest and most general representation.

Classes

There are various ways to classify algorithms, each with its own merits.

Classification by implementation

One way to classify algorithms is by implementation means.

• Recursion or iteration: A recursive algorithm is one that invokes (makes reference to) itself repeatedly until a certain condition is met, which is a method common to functional programming. Iterative algorithms use repetitive constructs like loops and sometimes additional data structures like stacks to solve the given problems. Some problems are naturally suited for one implementation or the other. For example, Towers of Hanoi is well understood in recursive implementation. Every recursive version has an equivalent (but possibly more or less complex) iterative version, and vice versa.
• Logical: An algorithm may be viewed as controlled logical deduction. This notion may be expressed as: Algorithm = logic + control. [3] The logic component expresses the axioms that may be used in the computation and the control component determines the way in which deduction is applied to the axioms. This is the basis for the logic programming paradigm. In pure logic programming languages the control component is fixed and algorithms are specified by supplying only the logic component. The appeal of this approach is the elegant semantics: a change in the axioms has a well defined change in the algorithm.
• Serial or parallel or distributed: Algorithms are usually discussed with the assumption that computers execute one instruction of an algorithm at a time. Those computers are sometimes called serial computers. An algorithm designed for such an environment is called a serial algorithm, as opposed to parallel algorithms or distributed algorithms. Parallel algorithms take advantage of computer architectures where several processors can work on a problem at the same time, whereas distributed algorithms utilise multiple machines connected with a network. Parallel or distributed algorithms divide the problem into more symmetrical or asymmetrical subproblems and collect the results back together. The resource consumption in such algorithms is not only processor cycles on each processor but also the communication overhead between the processors. Sorting algorithms can be parallelized efficiently, but their communication overhead is expensive. Iterative algorithms are generally parallelizable.
Some problems have no parallel algorithms, and are called inherently serial problems.
• Deterministic or non-deterministic: Deterministic algorithms solve the problem with an exact decision at every step of the algorithm, whereas non-deterministic algorithms solve problems via guessing, although typical guesses are made more accurate through the use of heuristics.
• Exact or approximate: While many algorithms reach an exact solution, approximation algorithms seek an approximation that is close to the true solution. Approximation may use either a deterministic or a random strategy. Such algorithms have practical value for many hard problems.

Classification by design paradigm

Another way of classifying algorithms is by their design methodology or paradigm. There is a certain number of paradigms, each different from the other. Furthermore, each of these categories will include many different types of algorithms. Some commonly found paradigms include:

• Divide and conquer. A divide and conquer algorithm repeatedly reduces an instance of a problem to one or more smaller instances of the same problem (usually recursively), until the instances are small enough to solve easily. One such example of divide and conquer is merge sorting. The data is divided into segments, each segment is sorted, and the sorted whole is obtained in the conquer phase by merging the segments. A simpler variant of divide and conquer is called the decrease and conquer algorithm, which solves an identical subproblem and uses the solution of this subproblem to solve the bigger problem. Divide and conquer divides the problem into multiple subproblems, so its conquer stage will be more complex than that of decrease and conquer algorithms. An example of a decrease and conquer algorithm is the binary search algorithm.
• Dynamic programming. When a problem shows optimal substructure, meaning the optimal solution to a problem can be constructed from optimal solutions to subproblems, and overlapping subproblems, meaning the same subproblems are used to solve many different problem instances, a quicker approach called dynamic programming avoids recomputing solutions that have already been computed. For example, the shortest path to a goal from a vertex in a weighted graph can be found by using the shortest path to the goal from all adjacent vertices. Dynamic programming and memoization go together. The main difference between dynamic programming and divide and conquer is that subproblems are more or less independent in divide and conquer, whereas subproblems overlap in dynamic programming. The difference between dynamic programming and straightforward recursion is in caching or memoization of recursive calls. When subproblems are independent and there is no repetition, memoization does not help; hence dynamic programming is not a solution for all complex problems.
By using memoization or maintaining a table of subproblems already solved, dynamic programming reduces the exponential nature of many problems to polynomial complexity.
• The greedy method. A greedy algorithm is similar to a dynamic programming algorithm, but the difference is that solutions to the subproblems do not have to be known at each stage; instead a "greedy" choice can be made of what looks best for the moment. The greedy method extends the solution with the best possible decision (not all feasible decisions) at an algorithmic stage based on the current local optimum and the best decision (not all possible decisions) made in the previous stage. It is not exhaustive, and does not give an accurate answer to many problems. But when it works, it will be the fastest method. The most popular greedy algorithm is finding the minimal spanning tree as given by Kruskal.
• Linear programming. When solving a problem using linear programming, specific inequalities involving the inputs are found and then an attempt is made to maximize (or minimize) some linear function of the inputs. Many problems (such as the maximum flow for directed graphs) can be stated in a linear programming way, and then be solved by a generic algorithm such as the simplex algorithm. A more complex variant of linear programming is called integer programming, where the solution space is restricted to the integers.
• Reduction. This technique involves solving a difficult problem by transforming it into a better known problem for which we have (hopefully) asymptotically optimal algorithms. The goal is to find a reducing algorithm whose complexity is not dominated by the resulting reduced algorithms. For example, one selection algorithm for finding the median in an unsorted list involves first sorting the list (the expensive portion) and then pulling out the middle element in the sorted list (the cheap portion). This technique is also known as transform and conquer.
• Search and enumeration. Many problems (such as playing chess) can be modeled as problems on graphs. A graph exploration algorithm specifies rules for moving around a graph and is useful for such problems. This category also includes search algorithms, branch and bound enumeration and backtracking.
• The probabilistic and heuristic paradigm. Algorithms belonging to this class fit the definition of an algorithm more loosely.
1. Probabilistic algorithms are those that make some choices randomly (or pseudo-randomly); for some problems, it can in fact be proven that the fastest solutions must involve some randomness.
2. Genetic algorithms attempt to find solutions to problems by mimicking biological evolutionary processes, with a cycle of random mutations yielding successive generations of "solutions". Thus, they emulate reproduction and "survival of the fittest". In genetic programming, this approach is extended to algorithms, by regarding the algorithm itself as a "solution" to a problem.
3. Heuristic algorithms have the general purpose of finding not an optimal solution but an approximate one where time or resources are limited. They are not practical for finding perfect solutions.
Examples are local search, tabu search, and simulated annealing algorithms, a class of heuristic probabilistic algorithms that vary the solution of a problem by a random amount. The name "simulated annealing" alludes to the metallurgical term meaning the heating and cooling of metal to achieve freedom from defects. The purpose of the random variance is to find close to globally optimal solutions rather than simply locally optimal ones, the idea being that the random element will be decreased as the algorithm settles down to a solution.

Classification by field of study

See also: List of algorithms

Every field of science has its own problems and needs efficient algorithms. Related problems in one field are often studied together. Some example classes are search algorithms, sorting algorithms, merge algorithms, numerical algorithms, graph algorithms, string algorithms, computational geometric algorithms, combinatorial algorithms, machine learning, cryptography, data compression algorithms and parsing techniques.
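Returning to the dynamic programming paradigm described earlier, its shortest-path example can be sketched as a memoized recursion. This is a minimal illustration under stated assumptions: the graph `GRAPH`, its vertex names and weights, and the function name `shortest_to_goal` are all hypothetical, and the graph is assumed to be acyclic (on a graph with cycles this simple recursion would not terminate).

```python
from functools import lru_cache
from typing import Dict, List, Tuple

# Hypothetical weighted directed acyclic graph:
# vertex -> list of (adjacent vertex, edge weight).
GRAPH: Dict[str, List[Tuple[str, int]]] = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 2), ("D", 6)],
    "C": [("D", 3)],
    "D": [],
}


@lru_cache(maxsize=None)  # memoization: each subproblem is solved once
def shortest_to_goal(vertex: str, goal: str = "D") -> float:
    """Length of the shortest path from vertex to goal (inf if none)."""
    if vertex == goal:
        return 0
    # Optimal substructure: the best path from here is the cheapest
    # "edge weight + best path from the adjacent vertex".
    return min(
        (weight + shortest_to_goal(adjacent, goal)
         for adjacent, weight in GRAPH[vertex]),
        default=float("inf"),
    )
```

In this graph the subproblem "shortest path from C" is needed by both A and B, which is the overlap that the cache exploits; without `lru_cache` the same function would still be correct but would recompute it.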
Fields tend to overlap with each other, and algorithm advances in one field may improve those of other, sometimes completely unrelated, fields. For example, dynamic programming was originally invented for optimization of resource consumption in industry, but is now used in solving a broad range of problems in many fields.

Classification by complexity

See also: Complexity class

Algorithms can be classified by the amount of time they need to complete compared to their input size. There is a wide variety: some algorithms complete in linear time relative to input size, some do so in an exponential amount of time or even worse, and some never halt. Additionally, some problems may have multiple algorithms of differing complexity, while other problems might have no algorithms or no known efficient algorithms. There are also mappings from some problems to other problems. Owing to this, it was found to be more suitable to classify the problems themselves instead of the algorithms into equivalence classes based on the complexity of the best possible algorithms for them.

Legal issues

See also: Software patents for a general overview of the patentability of software, including computer-implemented algorithms.

Algorithms, by themselves, are not usually patentable. In the United States, a claim consisting solely of simple manipulations of abstract concepts, numbers, or signals does not constitute a "process" (USPTO 2006) and hence algorithms are not patentable (as in Gottschalk v. Benson). However, practical applications of algorithms are sometimes patentable. For example, in Diamond v. Diehr, the application of a simple feedback algorithm to aid in the curing of synthetic rubber was deemed patentable.
The patenting of software is highly controversial, and there are highly criticized patents involving algorithms, especially data compression algorithms, such as Unisys' LZW patent.

Additionally, some cryptographic algorithms have export restrictions (see export of cryptography).

History: Development of the notion of "algorithm"

Origin of the word

See also: Timeline of algorithms
The word algorithm comes from the name of the 9th century Persian mathematician Abu Abdullah Muhammad ibn Musa al-Khwarizmi whose works introduced Indian numerals and algebraic concepts. He worked in Baghdad at the time when it was the centre of scientific studies and trade. The word algorism originally referred only to the rules of performing arithmetic using Arabic numerals but evolved via European Latin translation of al-Khwarizmi's name into algorithm by the 18th century. The word evolved to include all definite procedures for solving problems or performing tasks.

Discrete and distinguishable symbols

Tally-marks: To keep track of their flocks, their sacks of grain and their money the ancients used tallying: accumulating stones or marks scratched on sticks, or making discrete symbols in clay. Through the Babylonian and Egyptian use of marks and symbols, eventually Roman numerals and the abacus evolved (Dilson, p. 16–41). Tally marks appear prominently in unary numeral system arithmetic used in Turing machine and Post-Turing machine computations.

Manipulation of symbols as "place holders" for numbers: algebra

The work of the ancient Greek geometers, Persian mathematician Al-Khwarizmi (often considered as the "father of algebra"), and Western European mathematicians culminated in Leibniz's notion of the calculus ratiocinator (ca 1680):

"A good century and a half ahead of his time, Leibniz proposed an algebra of logic, an algebra that would specify the rules for manipulating logical concepts in the manner that ordinary algebra specifies the rules for manipulating numbers" (Davis 2000:18).

Mechanical contrivances with discrete states

The clock: Bolter credits the invention of the weight-driven clock as "The key invention [of Europe in the Middle Ages]", in particular the verge escapement (Bolter 1984:24) that provides us with the tick and tock of a mechanical clock.
"The accurate automatic machine" (Bolter 1984:26) led immediately to "mechanical automata" beginning in the thirteenth century and finally to "computational machines": the difference engine and analytical engines of Charles Babbage and Countess Ada Lovelace (Bolter p. 33–34, p. 204–206).

Jacquard loom, Hollerith punch cards, telegraphy and telephony (the electromechanical relay): Bell and Newell (1971) indicate that the Jacquard loom (1801), precursor to Hollerith cards (punch cards, 1887), and "telephone switching technologies" were the roots of a tree leading to the development of the first computers (Bell and Newell diagram p. 39, cf Davis (2000)). By the mid-1800s the telegraph, the precursor of the telephone, was in use throughout the world, its discrete and distinguishable encoding of letters as "dots and dashes" a common sound. By the late 1800s the ticker tape (ca 1870s) was in use, as was the use of Hollerith cards in the 1890 U.S. census. Then came the Teletype (ca 1910) with its punched-paper use of Baudot code on tape.

Telephone-switching networks of electromechanical relays (invented 1835) were behind the work of George Stibitz (1937), the inventor of the digital adding device. As he worked in Bell Laboratories, he observed the "burdensome" use of mechanical calculators with gears. "He went home one evening in 1937 intending to test his idea.... When the tinkering was over, Stibitz had constructed a binary adding device" (Valley News, p. 13).

Davis (2000) observes the particular importance of the electromechanical relay (with its two "binary states" open and closed):

"It was only with the development, beginning in the 1930s, of electromechanical calculators using electrical relays, that machines were built having the scope Babbage had envisioned." (Davis, p. 148)

Mathematics during the 1800s up to the mid-1900s

Symbols and rules: In rapid succession the mathematics of George Boole (1847, 1854), Gottlob Frege (1879), and Giuseppe Peano (1888–1889) reduced arithmetic to a sequence of symbols manipulated by rules. Peano's The principles of arithmetic, presented by a new method (1888) was "the first attempt at an axiomatization of mathematics in a symbolic language" (van Heijenoort:81ff).

But Heijenoort gives Frege (1879) this kudos: Frege's is "perhaps the most important single work ever written in logic. ... in which we see a 'formula language', that is a lingua characterica, a language written with special symbols, 'for pure thought', that is, free from rhetorical embellishments ... constructed from specific symbols that are manipulated according to definite rules" (van Heijenoort:1).
The work of Frege was further simplified and amplified by Alfred North Whitehead and Bertrand Russell in their Principia Mathematica (1910–1913).

The paradoxes: At the same time a number of disturbing paradoxes appeared in the literature, in particular the Burali-Forti paradox (1897), the Russell paradox (1902–03), and the Richard Paradox (1905, Dixon 1906), (cf Kleene 1952:36–40). The resultant considerations led to Kurt Gödel's paper (1931), in which he specifically cites the paradox of the liar, that completely reduces rules of recursion to numbers.

Effective calculability: In an effort to solve the Entscheidungsproblem defined precisely by Hilbert in 1928, mathematicians first set about to define what was meant by an "effective method" or "effective calculation" or "effective calculability" (i.e., a calculation that would succeed). In rapid succession the following appeared:

• Alonzo Church, Stephen Kleene and J.B. Rosser's λ-calculus (cf footnote in Alonzo Church 1936a:90, 1936b:110);
• a finely-honed definition of "general recursion" from the work of Gödel acting on suggestions of Jacques Herbrand (cf Gödel's Princeton lectures of 1934) and subsequent simplifications by Kleene (1935-6:237ff, 1943:255ff);
• Church's proof (Church 1936:88ff) that the Entscheidungsproblem was unsolvable;
• Emil Post's definition of effective calculability as a worker mindlessly following a list of instructions to move left or right through a sequence of rooms and while there either mark or erase a paper or observe the paper and make a yes-no decision about the next instruction (cf his "Formulation I" 1936:289-290);
• Alan Turing's proof that the Entscheidungsproblem was unsolvable by use of his "a- [automatic-] machine" (Turing 1936-7:116ff), in effect almost identical to Post's "formulation";
• J. Barkley Rosser's definition of "effective method" in terms of "a machine" (Rosser 1939:226);
• S. C. Kleene's proposal of a precursor to "Church thesis" that he called "Thesis I" (Kleene 1943:273–274);
• and a few years later Kleene's renaming his Thesis "Church's Thesis" (Kleene 1952:300, 317) and proposing "Turing's Thesis" (Kleene 1952:376).

Emil Post (1936) and Alan Turing (1936-7, 1939)

Here is a remarkable coincidence of two men not knowing each other but describing a process of men-as-computers working on computations, and they yield virtually identical definitions.

Emil Post (1936) described the actions of a "computer" (human being) as follows:

"...two concepts are involved: that of a symbol space in which the work leading from problem to answer is to be carried out, and a fixed unalterable set of directions."

His symbol space would be

"a two way infinite sequence of spaces or boxes... The problem solver or worker is to move and work in this symbol space, being capable of being in, and operating in but one box at a time.... a box is to admit of but two possible conditions, i.e., being empty or unmarked, and having a single mark in it, say a vertical stroke.

"One box is to be singled out and called the starting point. ...a specific problem is to be given in symbolic form by a finite number of boxes [i.e., INPUT] being marked with a stroke.
Likewise the answer [i.e., OUTPUT] is to be given in symbolic form by such a configuration of marked boxes....

"A set of directions applicable to a general problem sets up a deterministic process when applied to each specific problem. This process will terminate only when it comes to the direction of type (C) [i.e., STOP]." (U p. 289–290) See more at Post-Turing machine.

Alan Turing's work (1936–1937, 1939:160) preceded that of Stibitz (1937); it is unknown if Stibitz knew of the work of Turing. Turing's biographer believed that Turing's use of a typewriter-like model derived from a youthful interest: "Alan had dreamt of inventing typewriters as a boy; Mrs. Turing had a typewriter; and he could well have begun by asking himself what was meant by calling a typewriter mechanical" (Hodges, p. 96). Given the prevalence of Morse code and telegraphy, ticker tape machines, and Teletypes we might conjecture that all were influences.

Turing (his model of computation is now called a Turing machine) begins, as did Post, with an analysis of a human computer that he whittles down to a simple set of basic motions and "states of mind". But he continues a step further and creates a machine as a model of computation of numbers (Turing 1936-7:116):

"Computing is normally done by writing certain symbols on paper. We may suppose this paper is divided into squares like a child's arithmetic book....I assume then that the computation is carried out on one-dimensional paper, i.e., on a tape divided into squares. I shall also suppose that the number of symbols which may be printed is finite....

"The behavior of the computer at any moment is determined by the symbols which he is observing, and his "state of mind" at that moment. We may suppose that there is a bound B to the number of symbols or squares which the computer can observe at one moment. If he wishes to observe more, he must use successive observations. We will also suppose that the number of states of mind which need be taken into account is finite...

"Let us imagine the operations performed by the computer to be split up into simple operations which are so elementary that it is not easy to imagine them further divided" (Turing 1936-7:136).

Turing's reduction yields the following:

"The simple operations must therefore include:

"(a) Changes of the symbol on one of the observed squares

"(b) Changes of one of the squares observed to another square within L squares of one of the previously observed squares.

"It may be that some of these changes necessarily involve a change of state of mind.
The most general single operation must therefore be taken to be one of the following:

"(A) A possible change (a) of symbol together with a possible change of state of mind.

"(B) A possible change (b) of observed squares, together with a possible change of state of mind"

"We may now construct a machine to do the work of this computer." (Turing 1936-7:136).

A few years later, Turing expanded his analysis (thesis, definition) with this forceful expression of it:

"A function is said to be "effectively calculable" if its values can be found by some purely mechanical process. Although it is fairly easy to get an intuitive grasp of this idea, it is nevertheless desirable to have some more definite, mathematically expressible definition . . . [he discusses the history of the definition pretty much as presented above with respect to Gödel, Herbrand, Kleene, Church, Turing and Post] . . . We may take this statement literally, understanding by a purely mechanical process one which could be carried out by a machine. It is possible to give a mathematical description, in a certain normal form, of the structures of these machines. The development of these ideas leads to the author's definition of a computable function, and to an identification of computability † with effective calculability . . . .

"† We shall use the expression "computable function" to mean a function calculable by a machine, and we let "effectively calculable" refer to the intuitive idea without particular identification with any one of these definitions." (Turing 1939:160).

J. B. Rosser (1939) and S. C. Kleene (1943)

J. Barkley Rosser boldly defined an "effective [mathematical] method" in the following manner (boldface added):

"Effective method is used here in the rather special sense of a method each step of which is precisely determined and which is certain to produce the answer in a finite number of steps. With this special meaning, three different precise definitions have been given to date. [his footnote #5; see discussion immediately below]. The simplest of these to state (due to Post and Turing) says essentially that an effective method of solving certain sets of problems exists if one can build a machine which will then solve any problem of the set with no human intervention beyond inserting the question and (later) reading the answer. All three definitions are equivalent, so it doesn't matter which one is used. Moreover, the fact that all three are equivalent is a very strong argument for the correctness of any one." (Rosser 1939:225–6)

Rosser's footnote #5 references the work of (1) Church and Kleene and their definition of λ-definability, in particular Church's use of it in his An Unsolvable Problem of Elementary Number Theory (1936); (2) Herbrand and Gödel and their use of recursion, in particular Gödel's use in his famous paper On Formally Undecidable Propositions of Principia Mathematica and Related Systems I (1931); and (3) Post (1936) and Turing (1936-7) in their mechanism-models of computation.

Stephen C. Kleene defined his now-famous "Thesis I", known as "the Church-Turing Thesis". But he did this in the following context (boldface in original):

"12. Algorithmic theories... In setting up a complete algorithmic theory, what we do is to describe a procedure, performable for each set of values of the independent variables, which procedure necessarily terminates and in such manner that from the outcome we can read a definite answer, "yes" or "no," to the question, "is the predicate value true?"" (Kleene 1943:273)
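The machine models of Post and Turing quoted above can be illustrated with a short sketch. It is not a reconstruction of either author's formalism: the rule-table layout, the names `run` and `INCREMENT`, the state names, and the step budget are all assumptions made for the example. It borrows only the ingredients the quotations name: a tape of boxes that are either empty or marked with a stroke, a worker who observes one box at a time, a finite set of states, and a direction that stops the process.

```python
from typing import Dict, Tuple

BLANK = " "  # an empty, unmarked box
MARK = "|"   # a box holding a single vertical stroke

# (state, symbol read) -> (symbol to write, head movement, next state)
Rules = Dict[Tuple[str, str], Tuple[str, int, str]]

# Hypothetical example program: append one stroke to a block of strokes.
INCREMENT: Rules = {
    ("scan", MARK): (MARK, +1, "scan"),  # step right over existing strokes
    ("scan", BLANK): (MARK, 0, "halt"),  # mark the first empty box, then stop
}


def run(rules: Rules, tape_input: str, state: str = "scan",
        max_steps: int = 10_000) -> str:
    """Run the machine from box 0 and return the marked part of the tape."""
    tape = dict(enumerate(tape_input))  # sparse tape, unbounded both ways
    head = 0
    steps = 0
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("step budget exhausted before halting")
        symbol = tape.get(head, BLANK)        # observe one box at a time
        write, move, state = rules[(state, symbol)]
        tape[head] = write                    # change the observed symbol
        head += move                          # move to a neighbouring box
        steps += 1
    if not tape:
        return ""
    cells = range(min(tape), max(tape) + 1)
    return "".join(tape.get(i, BLANK) for i in cells).strip()
```

Calling `run(INCREMENT, "|||")` starts on a tape holding three strokes and leaves four. The step budget is a practical guard only; as the Termination section explains, it cannot decide whether an arbitrary rule table would ever halt.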
History after 1950

A number of efforts have been directed toward further refinement of the definition of "algorithm", and activity is on-going because of issues surrounding, in particular, foundations of mathematics (especially the Church-Turing Thesis) and philosophy of mind (especially arguments around artificial intelligence). For more, see Algorithm characterizations.

Pseudocode (derived from pseudo and code) is a compact and informal high-level description of a computer programming algorithm that uses the structural conventions of programming languages, but omits detailed subroutines, variable declarations, and language-specific syntax. The programming language is augmented with natural language descriptions of the details, where convenient, or with compact mathematical notation. The purpose of using pseudocode rather than the syntax of a real language is that it is easier for humans to read. This is often achieved by making the sample application-independent so more specific items (i/o variables, etc.) can be added later.

Pseudocode resembles, but should not be confused with, skeleton programs including dummy code, which can be compiled without errors. Flowcharts can be thought of as a graphical form of pseudocode.

Syntax

As the name suggests, pseudocode generally does not actually obey the syntax rules of any particular language; there is no systematic standard form, although any particular writer will generally borrow the appearance of a particular language. Popular sources include Pascal, C, Java, BASIC, Lisp, and ALGOL. Details not relevant to the algorithm (such as memory management code) are usually omitted. Blocks of code, for example code contained within a loop, may be described in a one-line natural language sentence.

Depending on the writer, pseudocode may therefore vary widely in style, from a near-exact imitation of a real programming language at one extreme, to a description approaching formatted prose at the other.

Application

Textbooks and scientific publications related to computer science and numerical computation often use pseudocode in description of algorithms, so that all programmers can understand them, even if they do not all know the same programming languages. In textbooks, there is usually an accompanying introduction explaining the particular conventions in use. The level of detail of such languages may in some cases approach that of formalized general-purpose languages. For example, Knuth's seminal textbook The Art of Computer Programming describes algorithms in a fully-specified assembly language for a non-existent microprocessor.

A programmer who needs to implement a specific algorithm, especially an unfamiliar one, will often start with a pseudocode description, and then simply "translate" that description into the target programming language and modify it to interact correctly with the rest of the program. Programmers may also start a project by sketching out the code in pseudocode on paper before writing it in its actual language, as a top-down structuring approach.

Examples of pseudocode

An example of how pseudocode differs from regular code is below.

Regular code (written in PHP):

  <?php
  if (is_valid($cc_number)) {
      execute_transaction($cc_number, $order);
  } else {
      show_failure();
  }
  ?>

Pseudocode:

  if credit card number is valid
      execute transaction based on number and order
  else
      show a generic failure message
  end if

The pseudocode of the Hello world program is particularly simple:

  output Hello World

Mathematical style pseudocode
    • In numerical computation, pseudocode often consists of mathematical notation, typicallyfrom set and matrix theory, mixed with the control structures of a conventionalprogramming language, and perhaps also natural language descriptions. This is a compactand often informal notation that can be understood by a wide range of mathematicallytrained people, and is frequently used as a way to describe mathematical algorithms.Normally non-ASCII typesetting is used for the mathematical equations, for example bymeans of TeX or MathML markup, or proprietary formula editors.Mathematical style pseudocode is sometimes referred to as pidgin code, for examplepidgin ALGOL (the origin of the concept), pidgin Fortran, pidgin BASIC, pidgin Pascal,and pidgin C.[edit] Machine compilation or interpretationIt is often suggested that future programming languages will be more similar topseudocode or natural language than to present-day languages; the idea is that increasingcomputer speeds and advances in compiler technology will permit computers to createprograms from descriptions of algorithms, instead of requiring the details to beimplemented by a human.[edit] Natural language grammar in programming languagesVarious attempts to bring elements of natural language grammar into computerprogramming have produced programming languages such as HyperTalk, Lingo,AppleScript, SQL and Inform. In these languages, parentheses and other specialcharacters are replaced by prepositions, resulting in quite talkative code. This may makeit easier for a person without knowledge about the language to understand the code andperhaps also to learn the language. However, the similarity to natural language is usuallymore cosmetic than genuine. 
The syntax rules are just as strict and formal as inconventional programming, and do not necessarily make development of the programseasier.[edit] Mathematical programming languagesAn alternative to using mathematical pseudocode (involving set theory notation or matrixoperations) for documentation of algorithms is to use a formal mathematicalprogramming language that is a mix of non-ASCII mathematical notation and programcontrol structures. Then the code can be parsed and interpreted by a machine.Several formal specification languages include set theory notation using specialcharacters. Examples are:• Z notation• Vienna Development Method Specification Language (VDM-SL).
    • Some array programming languages include vectorized expressions and matrix operations as non-ASCII formulas, mixed with conventional control structures. Examples are:
• A Programming Language (APL), and its dialects APLX and A+.
• MathCAD.
The process of converting a problem to computer code has five steps, and you may have to repeat some of them when difficulties arise.
Step 1: Define the problem. Before starting, it's important that you completely understand the problem's nature and any assumptions.
Step 2: Plan the solution. Break the problem's solution down into its smallest steps and determine how they are logically linked.
Step 3: Code the program. Translate the logical solution into a programming language the computer understands.
Step 4: Test the program. Check the program logic by hand and then by machine using various test cases.
    • Step 5: Document everything. This is the most important step in many cases. You won't always remember what you did or be able to figure it out.
Translators
To get from your programming language down to the binary steps the computer understands requires some form of translator. Translators come in two general types:
• Compiler: Translates an entire program at one time, then executes it.
o Compiled programs execute much faster.
o Compilation is usually a multi-step process.
o Compilers do not require space in memory when programs run.
• Interpreter: Translates a program one line at a time while executing.
o Interpreted programs are slower because translation takes time.
o Interpretation translates in one step.
o Interpreters must be in memory while a program is running.
Programming Language Hierarchy
There are a variety of programming languages, falling into several classes. These classes range from actual machine code through languages with very English-like structure. There are other trade-offs as shown here.
Language        English-like   Ease of Use   Efficiency
Machine         Not Very       Hard          Very
Assembly
High-level
Nonprocedural   Very           Easy          Not Very
The basic trade-off you have to make is between ease of use and efficiency. Because higher level languages tend to require lots of extra code, they don't use the machine as efficiently as possible. This partially accounts for the need for more powerful hardware to run newer software.
Comp 150 - Algorithms & Pseudo-Code (revised 01/11/2008)
    • Definition of Algorithm (after Al Kho-war-iz-mi, a 9th century Persian mathematician) - an ordered sequence of unambiguous and well-defined instructions that performs some task and halts in finite time
Let's examine the four parts of this definition more closely
1. an ordered sequence means that you can number the steps (it's socks then shoes!)
2. unambiguous and well-defined instructions means that each instruction is clear, do-able, and can be done without difficulty
3. performs some task
4. halts in finite time (algorithms terminate!)
Algorithms can be executed by a computing agent which is not necessarily a computer.
Three Categories of Algorithmic Operations
Algorithmic operations are ordered in that there is a first instruction, a second instruction etc. However, this is not enough. An algorithm must have the ability to alter the order of its instructions. An instruction that alters the order of an algorithm is called a control structure.
Three Categories of Algorithmic Operations:
1. sequential operations - instructions are executed in order
2. conditional ("question asking") operations - a control structure that asks a true/false question and then selects the next instruction based on the answer
3. iterative operations (loops) - a control structure that repeats the execution of a block of instructions
Unfortunately not every problem or task has a "good" algorithmic solution. There are
1. unsolvable problems - no algorithm can exist to solve the problem (Halting Problem)
2. "hard" (intractable) problems - algorithm takes too long to solve the problem (Traveling Salesman Problem)
3. problems with no known algorithmic solution
How to represent algorithms?
1. Use natural languages
o too verbose
o too "context-sensitive" - relies on experience of reader
2. Use formal programming languages
o too low level
    • o requires us to deal with complicated syntax of programming language
3. Pseudo-Code - natural language constructs modeled to look like statements available in many programming languages
Pseudo-Code is simply a numbered list of instructions to perform some task. In this course we will enforce three standards for good pseudo-code
1. Number each instruction. This is to enforce the notion of an ordered sequence of ... operations. Furthermore we introduce a dot notation (e.g. 3.1 comes after 3 but before 4) to number subordinate operations for conditional and iterative operations
2. Each instruction should be unambiguous (that is, the computing agent, in this case the reader, is capable of carrying out the instruction) and effectively computable (do-able).
3. Completeness. Nothing is left out.
Pseudo-code is best understood by looking at examples. Each example below demonstrates one of the control structures used in algorithms: sequential operations, conditional operations, and iterative operations. We also list all variables used at the end of the pseudo-code.
Example #1 - Computing Sales Tax : Pseudo-code the task of computing the final price of an item after figuring in sales tax. Note the three types of instructions: input (get), process/calculate (=) and output (display)
1. get price of item
2. get sales tax rate
3. sales tax = price of item times sales tax rate
4. final price = price of item plus sales tax
5. display final price
6. halt
Variables: price of item, sales tax rate, sales tax, final price
Note that the operations are numbered and each operation is unambiguous and effectively computable. We also extract and list all variables used in our pseudo-code. This will be useful when translating pseudo-code into a programming language.
Example #2 - Computing Weekly Wages: Gross pay depends on the pay rate and the number of hours worked per week. However, if you work more than 40 hours, you get paid time-and-a-half for all hours worked over 40.
Pseudo-code the task of computing gross pay given pay rate and hours worked.
1. get hours worked
    • 2. get pay rate
3. if hours worked ≤ 40 then
    3.1 gross pay = pay rate times hours worked
4. else
    4.1 gross pay = pay rate times 40 plus 1.5 times pay rate times (hours worked minus 40)
5. display gross pay
6. halt
variables: hours worked, pay rate, gross pay
This example introduces the conditional control structure. On the basis of the true/false question asked in line 3, we execute line 3.1 if the answer is True; otherwise if the answer is False we execute the lines subordinate to line 4 (i.e. line 4.1). In both cases we resume the pseudo-code at line 5.
Example #3 - Computing a Quiz Average: Pseudo-code a routine to calculate your quiz average.
1. get number of quizzes
2. sum = 0
3. count = 0
4. while count < number of quizzes
    4.1 get quiz grade
    4.2 sum = sum + quiz grade
    4.3 count = count + 1
5. average = sum / number of quizzes
6. display average
7. halt
variables: number of quizzes, sum, count, quiz grade, average
This example introduces an iterative control statement. As long as the condition in line 4 is True, we execute the subordinate operations 4.1 - 4.3. When the condition becomes False, we resume the pseudo-code at line 5.
This is an example of a top-test or while do iterative control structure. There is also a bottom-test or repeat until iterative control structure which executes a block of statements until the condition tested at the end of the block is True.
Pseudo-code is one important step in the process of writing a program.
Pseudo-code Language Constructions : A Summary
    • Computation/Assignment
set the value of "variable" to: "arithmetic expression" or
"variable" equals "expression" or
"variable" = "expression"
Input/Output
get "variable", "variable", ...
display "variable", "variable", ...
Conditional (dot notation used for numbering subordinate statements)
6. if "condition"
    6.1 (subordinate) statement 1
    6.2 etc ...
7. else
    7.1 (subordinate) statement 2
    7.2 etc ...
Iterative (dot notation used for numbering subordinate statements)
9. while "condition"
    9.1 (subordinate) statement 1
    9.2 etc ...
Determining the Day of the Week from the Date
(see Asgt 06 - Calculating Day of the Week)
03/17/2004
The day of the week for any date can be obtained from the following two data items:
A. The ordinal position of the day within the year (e.g. March 25, 1999 was day 84). We'll call this the year_ordinal.
B. The ordinal position of the day within the week for January 1 of that year (where Sunday is 1, Monday is 2 etc.). We'll call the ordinal position of the day within the week the week_ordinal, and we'll refer to the week_ordinal for January 1 as week_ordinal(1/1). This value depends on the year.
    • Given these two numbers, year_ordinal (the ordinal position of the day within the year) and week_ordinal(1/1) (the ordinal position of January 1 within the week), the week_ordinal for any date is easily found by the formula
((year_ordinal - 1) + (week_ordinal(1/1) - 1)) modulo 7 + 1
Essentially you start at the ordinal position of January 1 within the week and increment it by the ordinal position of the day within the year minus 1 (modulo 7), then add 1 to obtain the ordinal position of the day within the week.
For example, if January 1 of the year was a Wednesday (day of week 4) and you wanted to find the day of the week January 12 fell on (day number 12), then starting at 3 (that is, 4 - 1) you count forward 11 units (modulo 7)
3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6, 0
You end up at 0. Adding 1 to 0 yields 1, so January 12 is a Sunday. You can check that this works using the calendar below by starting at Wednesday the 1st and counting forward 11 days to arrive at Sunday!
Su  M   Tu  W   Th  F   Sa
            1   2   3   4
5   6   7   8   9   10  11
12  13  14  15  16  17  18
19  20  21  22  23  24  25
26  27  28  29  30  31
Calculating Year Ordinal: the ordinal position of the day within the year
Calculating the year_ordinal is easily done if we make use of the following table, which lists the number of days before the 1st of each month
Month   Number of Days before 1st
Jan     0
Feb     31
Mar     59*
Apr     90*
May     120*
Jun     151*
Jul     181*
Aug     212*
Sept    243*
Oct     273*
Nov     304*
Dec     334*
Note: * indicates add 1 for leap years
This table is obtained by summing the days of all prior months.
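The table of days before the 1st of each month can be rebuilt as a running total of month lengths. The sketch below is only an illustration: the names DAYS_IN_MONTH and days_before_month are made up for this example and do not come from the course notes.

```python
from itertools import accumulate

# Days in each month of a common (non-leap) year, January through December.
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def days_before_month(leap_year=False):
    """Return a 12-entry list: days elapsed before the 1st of each month."""
    lengths = list(DAYS_IN_MONTH)
    if leap_year:
        lengths[1] = 29  # February gains a day in a leap year
    # January has no prior months; every later entry sums all prior months.
    return [0] + list(accumulate(lengths[:-1]))


print(days_before_month())
# [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]
```

Passing leap_year=True adds 1 to every entry from March onward, which is what the asterisks in the table stand for.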
    • The year ordinal for any date is obtained by adding the proper value for that month from the table to the day.
Example
June 15, 2003 is day 166 (151 + 15).
Calculating Week_Ordinal (1/1): the ordinal position of January 1
Since 365 is not divisible by 7 but 365 equals 52 times 7 plus 1, the ordinal position of the day within the week for January 1 advances by 1 day from one year to the next, except when the previous year was a leap year, in which case it advances by 2 days.
Example
Since January 1, 1998 fell on a Thursday, January 1, 1999 fell on a Friday since there are 365 days between them.
Since January 1, 2000 fell on a Saturday, January 1, 2001 fell on a Monday since there are 366 days between them.
Given that January 1 advances one day except when going from a leap year, in which case it advances two days, it's not difficult to show that there is a 28 year cycle for determining the day for January 1. Consider the years 1901 through 1928. The cycle starts with Tuesday, January 1 1901. The ordinal position advances by 2 in the year following each leap year, which is the first year in every row after the first (marked in red in the original)
1901 - Tu   1902 - W    1903 - Th   1904 - F
1905 - Su   1906 - M    1907 - Tu   1908 - W
1909 - F    1910 - Sa   1911 - Su   1912 - M
1913 - W    1914 - Th   1915 - F    1916 - Sa
1917 - M    1918 - Tu   1919 - W    1920 - Th
1921 - Sa   1922 - Su   1923 - M    1924 - Tu
1925 - Th   1926 - F    1927 - Sa   1928 - Su
1929 - Tu   1930 - W    1931 - Th   1932 - F
As is shown, the pattern repeats with 1929. The cycle is 28 years long.
Since the cycle repeats every 28 years, if we calculate the difference between the current year and 1901 modulo 28, we will know where we are within the 28 year cycle. Using the formula we obtain
a = (year - 1901) modulo 28
where a is an integer between 0 and 27.
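The 28 year cycle can be checked by machine. The sketch below uses Python's standard datetime module as an independent reference; the helper names jan1_weekday and cycle_offset are assumptions made for this illustration, not part of the course material.

```python
from datetime import date


def jan1_weekday(year):
    """Weekday of January 1, numbered Sunday=1 ... Saturday=7."""
    # isoweekday() numbers Monday=1 ... Sunday=7, so remap it.
    return date(year, 1, 1).isoweekday() % 7 + 1


def cycle_offset(year):
    """The value a = (year - 1901) modulo 28 from the text."""
    return (year - 1901) % 28


# Group every year from 1901 through 2099 by its offset into the cycle
# and collect the weekdays on which January 1 falls for that offset.
weekdays_by_offset = {}
for year in range(1901, 2100):
    weekdays_by_offset.setdefault(cycle_offset(year), set()).add(jan1_weekday(year))

# The cycle holds if each offset maps to exactly one weekday.
cycle_holds = all(len(days) == 1 for days in weekdays_by_offset.values())
print(cycle_holds)  # True
```

The check stops at 2099 on purpose, because the cycle relies on every fourth year being a leap year.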
    • Example : For the year 2000,
(2000 - 1901) modulo 28 equals 15.
and if you count 15 forward from 1901 - Tu (1902 - W is 1, 1903 - Th is 2 etc) you end up at 1916 - Sa. So January 1, 2000 was a Saturday.
There is a trick we can use to calculate the day instead of counting forward on the 1901 - 1928 table. Given any year, the value of a that we calculate is the offset into the 28 year cycle. If we did not have to take into account the effect of leap years, then adding a to the first day value for 1901, modulo 7, would give the first day for the year; that is calculate (3 + a) mod 7.
However this does not take into account the effect of leap years, which push January 1 ahead two days instead of 1. So if we add the number of leap years already passed in the cycle, b where
b = floor (a / 4)
the sum of a plus b modulo 7 tells us how many days we have to advance January 1 from Tuesday. Since modulo 7 returns a value between 0 and 6 and we normally number the days of the week 1 (Sunday) through 7 (Saturday), we have to add 1.
week_ordinal(1/1) = (2 + a + b) modulo 7 + 1
Thus the week_ordinal(1/1) can be found by the three formulas
1. a = (year - 1901) mod 28
2. b = floor(a/4)
3. week_ordinal(1/1) = (2 + a + b) modulo 7 + 1
Summary
To find the ordinal position of the day within the week for any date between Jan 1, 1901 and Dec 31, 2099
1. Find the year_ordinal using the table of days before the first of the month
2. Calculate week_ordinal(1/1) as follows
a = (year - 1901) modulo 28
b = floor (a/4)
week_ordinal(1/1) = (2 + a + b) modulo 7 + 1
3. Day of the Week = ((year_ordinal - 1) + (week_ordinal(1/1) - 1)) modulo 7 + 1
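The three summary steps translate almost line for line into code. The following is a minimal sketch, not the course's official solution: the function names, the DAY_NAMES list, and the year % 4 leap test (valid only because the range is limited to 1901 through 2099) are assumptions made for this illustration.

```python
from datetime import date

# Days before the 1st of each month in a common year (table from the text).
DAYS_BEFORE_MONTH = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]
DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]


def year_ordinal(year, month, day):
    """Step 1: position of the date within its year (January 1 is 1)."""
    ordinal = DAYS_BEFORE_MONTH[month - 1] + day
    # Assumption: between 1901 and 2099 every fourth year is a leap year.
    if year % 4 == 0 and month > 2:
        ordinal += 1  # the "*" entries in the table
    return ordinal


def week_ordinal_jan1(year):
    """Step 2: weekday of January 1, numbered Sunday=1 ... Saturday=7."""
    a = (year - 1901) % 28
    b = a // 4
    return (2 + a + b) % 7 + 1


def day_of_week(year, month, day):
    """Step 3: weekday of any date, numbered Sunday=1 ... Saturday=7."""
    if not 1901 <= year <= 2099:
        raise ValueError("algorithm is only valid for 1901 through 2099")
    return ((year_ordinal(year, month, day) - 1)
            + (week_ordinal_jan1(year) - 1)) % 7 + 1


def reference_day_of_week(year, month, day):
    """Same numbering, computed by the standard library for cross-checking."""
    return date(year, month, day).isoweekday() % 7 + 1


print(DAY_NAMES[day_of_week(2000, 1, 1) - 1])  # Saturday
```

The reference function is there only so the hand-built formula can be compared against a trusted implementation when testing.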
    • Addendum
1. This algorithm only works for dates between 1901 and 2100. The fact that 2100 is not a leap year prevents the 28 year cycle for obtaining the first day from carrying over into the 22nd century.
2. The algorithm presented makes use of modular arithmetic with its use of modulo 28 and modulo 7 calculations. In modular arithmetic it's easier to start counting at 0 instead of 1. Consequently the algorithm could be simplified if we made the following changes
a. Number the days from 0 to 6 with Sunday being day 0 and Saturday being day 6
b. Number the days of the year from 0 to 364 (or 365 for leap year) with January 1 being day 0 etc.
Alternate Algorithm
Number the days of the week 0 - 6 and the days of the year 0 - 364 (or 365 for leap year)
1. Find the year_ordinal using the table of days before the first of the month, then subtract 1 from this value. This numbers the days of the year from 0 to 364 (or 365 for leap years)
2. Calculate week_ordinal(1/1) as follows
a = (year - 1901) modulo 28
b = floor (a/4)
week_ordinal(1/1) = (2 + a + b) modulo 7
We note that Tuesday January 1, 1901 is now day 2 under the new numbering (Sunday is day 0)
3. Day of the Week = (year_ordinal + week_ordinal(1/1)) modulo 7
Example : Find the Day of the Week for March 21, 2004
1. March 21, 2004 is day 60 + 21 - 1 = 80 (2004 is a leap year, so the table value 59 for March becomes 60)
2. a = (2004 - 1901) modulo 28 = 19
b = floor (19/4) = 4
week_ordinal(1/1) = (2 + 19 + 4) modulo 7 = 4 (Thursday)
3. week_ordinal(3/21/2004) = (80 + 4) modulo 7 = 0 (Sunday)
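With zero-based numbering the alternate algorithm fits in one short function. This is a sketch under stated assumptions: the name day_of_week0 is hypothetical, and the year % 4 leap test is only safe because the function rejects years outside 1901 through 2099.

```python
from datetime import date

# Days before the 1st of each month in a common year (table from the text).
DAYS_BEFORE_MONTH = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]


def day_of_week0(year, month, day):
    """Weekday numbered Sunday=0 ... Saturday=6 (alternate algorithm)."""
    if not 1901 <= year <= 2099:
        raise ValueError("algorithm is only valid for 1901 through 2099")
    # Step 1: zero-based day of the year (January 1 is day 0).
    leap_adjust = 1 if (year % 4 == 0 and month > 2) else 0
    year_ordinal = DAYS_BEFORE_MONTH[month - 1] + leap_adjust + day - 1
    # Step 2: zero-based weekday of January 1.
    a = (year - 1901) % 28
    b = a // 4
    week_ordinal_jan1 = (2 + a + b) % 7
    # Step 3: combine the two offsets.
    return (year_ordinal + week_ordinal_jan1) % 7


def reference_day_of_week0(year, month, day):
    """Same numbering from the standard library, for cross-checking."""
    return date(year, month, day).isoweekday() % 7


print(day_of_week0(2004, 3, 21))  # 0, i.e. Sunday, as in the worked example
```

Compared with the one-based version, both the "- 1" terms and the final "+ 1" disappear, which is the simplification the addendum describes.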
    • Pseudocode Examples
Modified 15 December 1999
An algorithm is a procedure for solving a problem in terms of the actions to be executed and the order in which those actions are to be executed. An algorithm is merely the sequence of steps taken to solve a problem. The steps are normally "sequence," "selection," "iteration," and a case-type statement.
In C, "sequence statements" are imperatives. The "selection" is the "if then else" statement, and the iteration is satisfied by a number of statements, such as the "while," "do," and the "for," while the case-type statement is satisfied by the "switch" statement.
Pseudocode is an artificial and informal language that helps programmers develop algorithms. Pseudocode is a "text-based" detail (algorithmic) design tool.
The rules of Pseudocode are reasonably straightforward. All statements showing "dependency" are to be indented. These include while, do, for, if, switch. Examples below will illustrate this notion.
GUIDE TO PSEUDOCODE LEVEL OF DETAIL: Given record/file descriptions, pseudocode should be created in sufficient detail so as to directly support the programming effort. It is the purpose of pseudocode to elaborate on the algorithmic detail and not just cite an abstraction.
Examples:
1.
If student's grade is greater than or equal to 60
    Print "passed"
else
    Print "failed"
endif
2.
    • Set total to zero
Set grade counter to one
While grade counter is less than or equal to ten
    Input the next grade
    Add the grade into the total
    Add one to the grade counter
endwhile
Set the class average to the total divided by ten
Print the class average.
3.
Initialize total to zero
Initialize counter to zero
Input the first grade
while the user has not as yet entered the sentinel
    add this grade into the running total
    add one to the grade counter
    input the next grade (possibly the sentinel)
endwhile
if the counter is not equal to zero
    set the average to the total divided by the counter
    print the average
else
    print 'no grades were entered'
endif
4.
initialize passes to zero
initialize failures to zero
initialize student counter to one
while student counter is less than or equal to ten
    input the next exam result
    if the student passed
        add one to passes
    else
        add one to failures
    endif
    add one to student counter
endwhile
print the number of passes
    • print the number of failures
if eight or more students passed
    print "raise tuition"
endif
5.
Larger example:
NOTE: NEVER ANY DATA DECLARATIONS IN PSEUDOCODE
Print out appropriate heading and make it pretty
While not EOF do:
    Scan over blanks and white space until a char is found
    (get first character on the line)
    set can't-be-ascending flag to 0
    set consec cntr to 1
    set ascending cntr to 1
    putchar first char of string to screen
    set read character to hold character
    While next character read != blanks and white space
        putchar out on screen
        if new char = hold char + 1
            add 1 to consec cntr
            set hold char = new char
            continue
        endif
        if new char >= hold char
            if consec cntr < 3
                set consec cntr to 1
            endif
            set hold char = new char
            continue
        endif
        if new char < hold char
            if consec cntr < 3
                set consec cntr to 1
            endif
            set hold char = new char
            set can't-be-ascending flag to 1
            continue
        endif
    end while
    if consec cntr >= 3
        printf (Appropriate message 1 and skip a line)
        add 1 to consec total
    endif
    if can't-be-ascending flag = 0
        printf (Appropriate message 2 and skip a line)
        add 1 to ascending total
    else
        printf (Sorry message and skip a line)
        add 1 to sorry total
    • endifend WhilePrint out totals: Number of consecs, ascendings, and sorries.StopSome Keywords that should be Used and Additional PointsFor looping and selection, The keywords that are to be used include DoWhile...EndDo; Do Until...Enddo; While .... Endwhile is acceptable.Also, Loop .... endloop is also VERY good and is languageindependent. Case...EndCase; If...Endif; Call ... with (parameters);Call; Return ....; Return; When;Always use scope terminators for loops and iteration.As verbs, use the words Generate, Compute, Process, etc. Words such asset, reset, increment, compute, calculate, add, sum, multiply, ... print,display, input, output, edit, test , etc. with careful indentation tend to fosterdesirable pseudocode. Also, using words such as Set and Initialize, whenassigning values to variables is also desirable.More on Formatting and Conventions in Pseudocoding INDENTATION in pseudocode should be identical to itsimplementation in a programming language. Try to indent at leastfour spaces. As noted above, the pseudocode entries are to be cryptic, ANDSHOULD NOT BE PROSE. NO SENTENCES. No flower boxes (discussed ahead) in your pseudocode. Do not include data declarations in your pseudocode. But do cite variables that are initialized as part of their declarations.E.g. "initialize count to zero" is a good entry.Function Calls, Function Documentation, and Pseudocode Calls to Functions should appear as:Call FunctionName (arguments: field1, field2, etc.) Returns in functions should appear as:
    • Return (field1)
• Function headers should appear as:
FunctionName (parameters: field1, field2, etc. )
• Note that in C, arguments and parameters such as "fieldn" could be written: "pointer to fieldn ...."
• Functions called with addresses should be written as:
Call FunctionName (arguments: pointer to fieldn, pointer to field1, etc.)
• Function headers containing pointers should be indicated as:
FunctionName (parameters: pointer to field1, pointer to field2, ...)
• Returns in functions where a pointer is returned:
Return (pointer to fieldn)
• It would not hurt the appearance of your pseudocode to draw a line or make your function header line "bold" in your pseudocode. Try to set off your functions.
• Try to use scope terminators in your pseudocode and source code too. It really helps the readability of the text.
Source Code
• EVERY function should have a flowerbox PRECEDING IT. This flower box is to include the function's name, the main purpose of the function, parameters it is expecting (number and type), and the type of the data it returns. All of these listed items are to be on separate lines with spaces in between each explanatory item.
• FORMAT of flowerbox should be
********************************************************
Function: ( cryptic text describing single function
    ....... (indented like this)
    .......
Calls: Start listing functions "this" function calls
    Show these functions: one per line, indented
Called by: List of functions that call "this" function
    Show these functions: one per line, indented.
Input Parameters: list, if appropriate; else None
Returns: List, if appropriate.
****************************************************************
• INDENTATION is critically important in Source Code. Follow standard examples given in class. If in doubt, ASK. Always indent statements within IFs, FOR loops, WHILE loops, SWITCH
    • statements, etc. a consistent number of spaces, such as four. Alternatively, use the tab key. One or two spaces is insufficient.
• Use scope terminators at the end of if statements, for statements, while statements, and at the end of functions. It will make your program much more readable.
SPELLING ERRORS ARE NOT ACCEPTABLE
PSEUDOCODE STANDARD
Pseudocode is a kind of structured English for describing algorithms. It allows the designer to focus on the logic of the algorithm without being distracted by details of language syntax. At the same time, the pseudocode needs to be complete. It describes the entire logic of the algorithm so that implementation becomes a rote mechanical task of translating line by line into source code.
In general the vocabulary used in the pseudocode should be the vocabulary of the problem domain, not of the implementation domain. The pseudocode is a narrative for someone who knows the requirements (problem domain) and is trying to learn how the solution is organized. E.g.,
Extract the next word from the line (good)
set word to get next token (poor)
Append the file extension to the name (good)
name = name + extension (poor)
FOR all the characters in the name (good)
FOR character = first to last (ok)
Note that the logic must be decomposed to the level of a single loop or decision. Thus "Search the list and find the customer with highest balance" is too vague because it takes a loop AND a nested decision to implement it. It's okay to use "Find" or "Lookup" if there's a predefined function for it such as String.indexOf().
Each textbook and each individual designer may have their own personal style of pseudocode. Pseudocode is not a rigorous notation, since it is read by other people, not by the computer. There is no universal "standard" for the industry, but for instructional purposes it is helpful if we all follow a similar style. The format below is recommended for expressing your solutions in our class.
    • The "structured" part of pseudocode is a notation for representing six specific structuredprogramming constructs: SEQUENCE, WHILE, IF-THEN-ELSE, REPEAT-UNTIL,FOR, and CASE. Each of these constructs can be embedded inside any other construct.These constructs represent the logic, or flow of control in an algorithm.It has been proven that three basic constructs for flow of control are sufficient toimplement any "proper" algorithm.SEQUENCE is a linear progression where one task is performed sequentially afteranother.WHILE is a loop (repetition) with a simple conditional test at its beginning.IF-THEN-ELSE is a decision (selection) in which a choice is made between twoalternative courses of action.Although these constructs are sufficient, it is often useful to include three moreconstructs:REPEAT-UNTIL is a loop with a simple conditional test at the bottom.CASE is a multiway branch (decision) based on the value of an expression. CASE is ageneralization of IF-THEN-ELSE.FOR is a "counting" loop.SEQUENCESequential control is indicated by writing one action after another, each action on a lineby itself, and all actions aligned with the same indent. The actions are performed in thesequence (top to bottom) that they are written.Example (non-computer)Brush teethWash faceComb hairSmile in mirrorExampleREAD height of rectangleREAD width of rectangleCOMPUTE area as height times widthCommon Action KeywordsSeveral keywords are often used to indicate common input, output, and processingoperations.Input: READ, OBTAIN, GETOutput: PRINT, DISPLAY, SHOWCompute: COMPUTE, CALCULATE, DETERMINE
    • Initialize: SET, INITAdd one: INCREMENT, BUMPIF-THEN-ELSEBinary choice on a given Boolean condition is indicated by the use of four keywords: IF,THEN, ELSE, and ENDIF. The general form is:IF condition THENsequence 1ELSEsequence 2ENDIFThe ELSE keyword and "sequence 2" are optional. If the condition is true, sequence 1 isperformed, otherwise sequence 2 is performed.ExampleIF HoursWorked > NormalMax THENDisplay overtime messageELSEDisplay regular time messageENDIFWHILEThe WHILE construct is used to specify a loop with a test at the top. The beginning andending of the loop are indicated by two keywords WHILE and ENDWHILE. The generalform is:WHILE conditionsequenceENDWHILEThe loop is entered only if the condition is true. The "sequence" is performed for eachiteration. At the conclusion of each iteration, the condition is evaluated and the loopcontinues as long as the condition is true.ExampleWHILE Population < LimitCompute Population as Population + Births - DeathsENDWHILEExampleWHILE employee.type NOT EQUAL manager AND personCount < numEmployeesINCREMENT personCountCALL employeeList.getPerson with personCount RETURNING employeeENDWHILECASE
    • A CASE construct indicates a multiway branch based on conditions that are mutuallyexclusive. Four keywords, CASE, OF, OTHERS, and ENDCASE, and conditions areused to indicate the various alternatives. The general form is:CASE expression OFcondition 1 : sequence 1condition 2 : sequence 2...condition n : sequence nOTHERS:default sequenceENDCASEThe OTHERS clause with its default sequence is optional. Conditions are normallynumbers or charactersindicating the value of "expression", but they can be English statements or some othernotation that specifies the condition under which the given sequence is to be performed.A certain sequence may be associated with more than one condition.ExampleCASE Title OFMr : Print "Mister"Mrs : Print "Missus"Miss : Print "Miss"Ms : Print "Mizz"Dr : Print "Doctor"ENDCASEExampleCASE grade OFA : points = 4B : points = 3C : points = 2D : points = 1F : points = 0ENDCASEREPEAT-UNTILThis loop is similar to the WHILE loop except that the test is performed at the bottom ofthe loop instead of at the top. Two keywords, REPEAT and UNTIL are used. The generalform is:REPEAT
    • sequence
UNTIL condition
The "sequence" in this type of loop is always performed at least once, because the test is performed after the sequence is executed. At the conclusion of each iteration, the condition is evaluated, and the loop repeats if the condition is false. The loop terminates when the condition becomes true.
FOR
This loop is a specialized construct for iterating a specific number of times, often called a "counting" loop. Two keywords, FOR and ENDFOR, are used. The general form is:
FOR iteration bounds
    sequence
ENDFOR
In cases where the loop constraints can be obviously inferred it is best to describe the loop using problem domain vocabulary.
Example
FOR each month of the year (good)
FOR month = 1 to 12 (ok)
FOR each employee in the list (good)
FOR empno = 1 to listsize (ok)
NESTED CONSTRUCTS
The constructs can be embedded within each other, and this is made clear by use of indenting. Nested constructs should be clearly indented from their surrounding constructs.
Example
SET total to zero
REPEAT
    READ Temperature
    IF Temperature > Freezing THEN
        INCREMENT total
    END IF
UNTIL Temperature < zero
Print total
In the above example, the IF construct is nested within the REPEAT construct, and therefore is indented.
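Python has no REPEAT-UNTIL statement, so one common way to express a bottom-test loop is an endless while loop with the exit test placed last. The sketch below translates the nested example under some assumptions: the function name count_above_freezing is hypothetical, a list of readings stands in for READ, the freezing point is taken to be 32 degrees, and the readings are assumed to end with a value below zero.

```python
FREEZING = 32.0  # assumed threshold; the pseudocode does not give a value


def count_above_freezing(readings, freezing=FREEZING):
    """Count readings above freezing, stopping after the first reading below zero."""
    total = 0                          # SET total to zero
    source = iter(readings)
    while True:                        # REPEAT
        temperature = next(source)     # READ Temperature
        if temperature > freezing:     # IF Temperature > Freezing THEN
            total += 1                 # INCREMENT total
        if temperature < 0:            # UNTIL Temperature < zero
            break
    return total


print(count_above_freezing([40, 20, 35, -5]))  # 2
```

Because the test sits at the bottom, the body always runs at least once, and the negative sentinel itself is still examined by the IF before the loop ends.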
    • INVOKING SUBPROCEDURES
Use the CALL keyword. For example:

CALL AvgAge with StudentAges
CALL Swap with CurrentItem and TargetItem
CALL Account.debit with CheckAmount
CALL getBalance RETURNING aBalance
CALL SquareRoot with orbitHeight RETURNING nominalOrbit

EXCEPTION HANDLING

BEGIN
    statements
EXCEPTION
    WHEN exception type
        statements to handle exception
    WHEN another exception type
        statements to handle exception
END

Sample Pseudocode

"Adequate"

FOR X = 1 to 10
    FOR Y = 1 to 10
        IF gameBoard[X][Y] = 0
            Do nothing
        ELSE
            CALL theCall(X, Y) (recursive method)
            increment counter
        END IF
    END FOR
END FOR

"Better"

Set moveCount to 1
FOR each row on the board
    FOR each column on the board
        IF gameBoard position (row, column) is occupied THEN
            CALL findAdjacentTiles with row, column
            INCREMENT moveCount
        END IF
    •     END FOR
END FOR

(Note: the logic is restructured to omit the "do nothing" clause)

"Not So Good"

FOR all the number at the back of the array
    SET Temp equal the addition of each number
    IF > 9 THEN
        get the remainder of the number divided by 10 to that index
        and carry the "1"
    Decrement one
Do it again for numbers before the decimal

"Good Enough (not perfect)"

SET Carry to 0
FOR each DigitPosition in Number from least significant to most significant
    COMPUTE Total as sum of FirstNum[DigitPosition] and SecondNum[DigitPosition] and Carry
    IF Total > 9 THEN
        SET Carry to 1
        SUBTRACT 10 from Total
    ELSE
        SET Carry to 0
    END IF
    STORE Total in Result[DigitPosition]
END FOR
IF Carry = 1 THEN
    RAISE Overflow exception
END IF

"Pretty Good"
This example shows how pseudocode is written as comments in the source file. Note that the double slashes are indented.

public boolean moveRobot (Robot aRobot)
{
    //IF robot has no obstacle in front THEN
    //    Call Move robot
    //    Add the move command to the command history
    •     //    RETURN true
    //ELSE
    //    RETURN false without moving the robot
    //END IF
}

Example Java Implementation
• source code statements are interleaved with pseudocode.
• comments that correspond exactly to source code are removed during coding.

public boolean moveRobot (Robot aRobot)
{
    //IF robot has no obstacle in front THEN
    if (aRobot.isFrontClear())
    {
        // Call Move robot
        aRobot.move();
        // Add the move command to the command history
        cmdHistory.add(RobotAction.MOVE);
        return true;
    }
    else // don't move the robot
    {
        return false;
    }
    //END IF
}

Document History
Date      Author   Change
12/2/03   JD       Added Exception Handling and more examples
2/21/03   JD       Added "problem domain vocabulary" paragraph. Modified FOR loop explanation.

Algorithms for calculating variance
    • Algorithms for calculating variance play a major role in statistical computing. A key problem in the design of good algorithms for this problem is that formulas for the variance may involve sums of squares, which can lead to numerical instability as well as to arithmetic overflow when dealing with large values.

Algorithm I
The formula for calculating the variance of an entire population of size n is:

\sigma^2 = \frac{1}{n}\left(\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\Big(\sum_{i=1}^{n} x_i\Big)^2\right)

The formula for calculating an unbiased estimate of the population variance from a finite sample of n observations is:

s^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\Big(\sum_{i=1}^{n} x_i\Big)^2\right)

Therefore a naive algorithm to calculate the estimated variance is given by the following pseudocode:

n = 0
sum = 0
sum_sqr = 0
foreach x in data:
    n = n + 1
    sum = sum + x
    sum_sqr = sum_sqr + x*x
end for
mean = sum/n
variance = (sum_sqr - sum*mean)/(n - 1)

This algorithm can easily be adapted to compute the variance of a finite population: simply divide by n instead of n − 1 on the last line.

Because sum_sqr and sum * mean can be very similar numbers, the precision of the result can be much less than the inherent precision of the floating-point arithmetic used to
    • perform the computation. This is particularly bad if the variance is small relative to the sum of the numbers.

Algorithm II
An alternate approach, using a different formula for the variance, is given by the following pseudocode:

n = 0
sum1 = 0
foreach x in data:
    n = n + 1
    sum1 = sum1 + x
end for
mean = sum1/n
sum2 = 0
foreach x in data:
    sum2 = sum2 + (x - mean)^2
end for
variance = sum2/(n - 1)

This algorithm is often more numerically reliable than Algorithm I for large sets of data, although it can be worse if much of the data is very close to but not precisely equal to the mean and some are quite far away from it.

The results of both of these simple algorithms (I and II) can depend inordinately on the ordering of the data and can give poor results for very large data sets due to repeated roundoff error in the accumulation of the sums. Techniques such as compensated summation can be used to combat this error to a degree.

Algorithm II (compensated)
The compensated-summation version of the algorithm above reads:

n = 0
sum1 = 0
foreach x in data:
    n = n + 1
    sum1 = sum1 + x
end for
mean = sum1/n
sum2 = 0
sumc = 0
foreach x in data:
    sum2 = sum2 + (x - mean)^2
    sumc = sumc + (x - mean)
end for
variance = (sum2 - sumc^2/n)/(n - 1)
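The compensated two-pass pseudocode above could be written in C roughly as follows. This is a sketch under stated assumptions: the function name variance_two_pass is made up for the example, the data is assumed to be held in an array of double, and returning 0.0 for fewer than two values is an arbitrary convention (the pseudocode leaves that case undefined).

```c
#include <stddef.h>

/* Two-pass sample variance with the correction term of
   "Algorithm II (compensated)". Returns 0.0 when n < 2
   (an assumed convention, not part of the pseudocode). */
double variance_two_pass(const double *data, size_t n)
{
    double sum1 = 0.0, sum2 = 0.0, sumc = 0.0, mean;
    size_t i;

    if (n < 2)
        return 0.0;

    /* First pass: accumulate the sum to obtain the mean. */
    for (i = 0; i < n; i++)
        sum1 += data[i];
    mean = sum1 / (double)n;

    /* Second pass: squared deviations, plus the plain deviations,
       whose sum would be zero in exact arithmetic. */
    for (i = 0; i < n; i++) {
        double d = data[i] - mean;
        sum2 += d * d;
        sumc += d;
    }
    return (sum2 - sumc * sumc / (double)n) / (double)(n - 1);
}
```

Calling it on the array {4, 7, 13, 16} with n = 4 should give a value of 30, matching the hand calculation (squared deviations 36 + 9 + 9 + 36 = 90, divided by 3).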
    • Algorithm III
The following formulas can be used to update the mean and (estimated) variance of the sequence, for an additional element x_new. Here, m denotes the estimate of the population mean, s^2_{n-1} the estimate of the population variance, s^2_n the estimate of the sample variance, and n the number of elements in the sequence before the addition.

m_{new} = \frac{n\,m_{old} + x_{new}}{n+1} = m_{old} + \frac{x_{new} - m_{old}}{n+1}

s^2_{n-1,new} = \frac{(n-1)\,s^2_{n-1,old} + (x_{new} - m_{new})(x_{new} - m_{old})}{n}, \quad n > 0

s^2_{n,new} = \frac{n\,s^2_{n,old} + (x_{new} - m_{new})(x_{new} - m_{old})}{n+1}

A numerically stable algorithm is given below. It also computes the mean. This algorithm is due to Knuth,[1] who cites Welford.[2]

n = 0
mean = 0
S = 0
foreach x in data:
    n = n + 1
    delta = x - mean
    mean = mean + delta/n
    S = S + delta*(x - mean)    // This expression uses the new value of mean
end for
variance = S/(n - 1)

This algorithm is much less prone to loss of precision due to massive cancellation, but might not be as efficient because of the division operation inside the loop. For a particularly robust two-pass algorithm for computing the variance, first compute and subtract an estimate of the mean, and then use this algorithm on the residuals.

Algorithm IV
When the observations are weighted, West (1979) [3] suggests this incremental algorithm:

n = 0
foreach x in the data:
    if n = 0 then
        n = 1
        mean = x
        S = 0
        sumweight = weight
    else
        n = n + 1
        temp = weight + sumweight
        S = S + sumweight*weight*(x-mean)^2 / temp
        mean = mean + (x-mean)*weight / temp
        sumweight = temp
    end if
end for
Variance = S * n / ((n-1) * sumweight)    // if sample is the population, omit n/(n-1)
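The single-pass update of Algorithm III above lends itself to a small accumulator. The following C sketch is one possible arrangement; the running_stats type and the three function names are hypothetical names chosen for illustration, and returning 0.0 before two values have been seen is an assumed convention.

```c
#include <stddef.h>

/* Running state for the single-pass update of Algorithm III. */
typedef struct {
    size_t n;     /* number of values seen so far */
    double mean;  /* running mean */
    double S;     /* running sum of squared deviations from the mean */
} running_stats;

void running_stats_init(running_stats *rs)
{
    rs->n = 0;
    rs->mean = 0.0;
    rs->S = 0.0;
}

/* Fold one more observation into the running mean and S. */
void running_stats_add(running_stats *rs, double x)
{
    double delta;

    rs->n += 1;
    delta = x - rs->mean;
    rs->mean += delta / (double)rs->n;
    rs->S += delta * (x - rs->mean);   /* uses the updated mean */
}

/* Unbiased variance estimate; 0.0 if fewer than two values (assumed). */
double running_stats_variance(const running_stats *rs)
{
    return rs->n > 1 ? rs->S / (double)(rs->n - 1) : 0.0;
}
```

Because the state is only three numbers, values can be fed in as they arrive without storing the data set, which is the main practical attraction of this form.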
    • Example
Assume that all floating point operations use the standard IEEE 754 double-precision arithmetic. Consider the sample (4, 7, 13, 16) from an infinite population. Based on this sample, the estimated population mean is 10, and the unbiased estimate of population variance is 30. Both Algorithm I and Algorithm II compute these values correctly. Next consider the sample (10^8 + 4, 10^8 + 7, 10^8 + 13, 10^8 + 16), which gives rise to the same estimated variance as the first sample. Algorithm II computes this variance estimate correctly, but Algorithm I returns 29.333333333333332 instead of 30. While this loss of precision may be tolerable and viewed as a minor flaw of Algorithm I, it is easy to find data that reveal a major flaw in the naive algorithm: take the sample to be (10^9 + 4, 10^9 + 7, 10^9 + 13, 10^9 + 16). Again the estimated population variance of 30 is computed correctly by Algorithm II, but the naive algorithm now computes it as -170.66666666666666. This is a serious problem with Algorithm I because by definition the variance can never be negative.

Not to be confused with Euclidean geometry.

In number theory, the Euclidean algorithm (also called Euclid's algorithm) is an algorithm to determine the greatest common divisor (GCD) of two elements of any Euclidean domain (for example, the integers). Its major significance is that it does not require factoring the two integers, and it is also significant in that it is one of the oldest algorithms known, dating back to the ancient Greeks.
    • History of the Euclidean algorithm
The Euclidean algorithm is one of the oldest algorithms known, since it appeared in Euclid's Elements around 300 BC. Euclid originally formulated the problem geometrically, as the problem of finding a common "measure" for two line lengths, and his algorithm proceeded by repeated subtraction of the shorter from the longer segment. However, the algorithm was probably not discovered by Euclid and it may have been known up to 200 years earlier. It was almost certainly known by Eudoxus of Cnidus (about 375 BC), and Aristotle (about 330 BC) hinted at it in his Topics, 158b, 29-35.

Description of the algorithm
Given two natural numbers a and b, not both equal to zero: check if b is zero; if yes, a is the gcd. If not, repeat the process using, respectively, b, and the remainder after dividing a by b. The remainder after dividing a by b is usually written as a mod b.

These algorithms can be used in any context where division with remainder is possible. This includes rings of polynomials over a field as well as the ring of Gaussian integers, and in general all Euclidean domains. Applying the algorithm to the more general case other than natural numbers will be discussed in more detail later in the article.

Using recursion
Using recursion, the algorithm can be expressed:

function gcd(a, b)
    if b = 0 return a
    else return gcd(b, a mod b)

Using iteration
An efficient, iterative method, for compilers that don't optimize tail recursion:

function gcd(a, b)
    while b ≠ 0
        t := b
        b := a mod b
        a := t
    return a

The extended Euclidean algorithm
    • By keeping track of the quotients occurring during the algorithm, one can also determine integers p and q with ap + bq = gcd(a, b). This is known as the extended Euclidean algorithm.

Original algorithm
The original algorithm as described by Euclid treated the problem geometrically, using repeated subtraction rather than mod (remainder).

function gcd(a, b)
    while b ≠ 0
        if a > b
            a := a - b
        else
            b := b - a
    return a

Notice that while b can be zero, a cannot be zero; otherwise, this becomes an infinite loop. This algorithm does not require an extra variable t.

An example
As an example, consider computing the gcd of 1071 and 1029, which is 21. Recall that "mod" means "the remainder after dividing."

With the recursive algorithm:

                        Explanation
  gcd(1071, 1029)       The initial arguments
= gcd(1029, 42)         The second argument is 1071 mod 1029
= gcd(42, 21)           The second argument is 1029 mod 42
= gcd(21, 0)            The second argument is 42 mod 21
= 21                    Since b = 0, we return a
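For comparison, the remainder-based and subtraction-based formulations described above might look like this in C. This is a sketch, assuming non-negative inputs held in unsigned long; the names gcd_mod and gcd_sub are illustrative, and the a == 0 guard in gcd_sub is an addition that the pseudocode does not contain.

```c
/* Iterative form using the remainder operation. */
unsigned long gcd_mod(unsigned long a, unsigned long b)
{
    while (b != 0) {
        unsigned long t = b;
        b = a % b;
        a = t;
    }
    return a;
}

/* Subtraction-based form. The a == 0 guard is an addition:
   without it the loop would never terminate for a == 0, b != 0. */
unsigned long gcd_sub(unsigned long a, unsigned long b)
{
    if (a == 0)
        return b;
    while (b != 0) {
        if (a > b)
            a -= b;
        else
            b -= a;
    }
    return a;
}
```

Both functions should return 21 for the inputs 1071 and 1029 used in the worked example; the subtraction version simply takes many more loop iterations to get there.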
    • With the iterative algorithm:

a       b       Explanation
1071    1029    Step 1: The initial inputs.
1029    42      Step 2: The remainder of 1071 divided by 1029 is 42, which is put on the right, and the divisor 1029 is put on the left.
42      21      Step 3: We repeat the loop, dividing 1029 by 42, and get 21 as remainder.
21      0       Step 4: Repeat the loop again; since 42 is divisible by 21, we get 0 as remainder, and the algorithm terminates. The number on the left, that is 21, is the gcd as required.

Observe that a ≥ b in each call. If initially b > a, there is no problem; the first iteration effectively swaps the two values.

Proof
Suppose a and b are the natural numbers whose gcd has to be determined. Now, suppose b > 0, and the remainder of the division of a by b is r. Therefore a = qb + r where q is the quotient of the division.

Any common divisor of a and b is also a divisor of r. To see why this is true, consider that r can be written as r = a − qb. Now, if there is a common divisor d of a and b such that a = sd and b = td, then r = (s−qt)d. Since all these numbers, including s−qt, are whole numbers, it can be seen that r is divisible by d.

The above analysis is true for any divisor d. Conversely, since a = qb + r, any common divisor of b and r also divides a, so the pairs (a, b) and (b, r) have exactly the same common divisors. Thus, the greatest common divisor of a and b is also the greatest common divisor of b and r. Therefore it is enough if we continue searching for the greatest common divisor with the numbers b and r. Since r is smaller in absolute value than b, we will reach r = 0 after finitely many steps.

Running time
    • [Figure: plot of the running time for gcd(x, y). Red indicates a fast computation, while successively bluer points indicate slower computations.]

When analyzing the running time of Euclid's algorithm, it turns out that the inputs requiring the most divisions are two successive Fibonacci numbers (because their ratios are the convergents in the slowest continued fraction expansion to converge, that of the golden ratio) as proved by Gabriel Lamé, and the worst case requires O(n) divisions, where n is the number of digits in the input. However, the divisions themselves are not constant time operations; the actual time complexity of the algorithm is O(n^2). The reason is that division of two n-bit numbers takes time O(n(m + 1)), where m is the length of the quotient. Consider the computation of gcd(a, b) where a and b have at most n bits, let a_0, ..., a_k be the sequence of numbers produced by the algorithm, and let n_0, ..., n_k be their lengths. Then k = O(n), and the running time is bounded by

O\Big(\sum_{i<k} n_i (n_i - n_{i+1} + 2)\Big) \subseteq O\Big(n \sum_{i<k} (n_i - n_{i+1} + 2)\Big) \subseteq O(n(n_0 + 2k)) \subseteq O(n^2).

This is considerably better than Euclid's original algorithm, in which the modulus operation is effectively performed using repeated subtraction in O(2^n) steps. Consequently, that version of the algorithm requires O(2^n n) time for n-digit numbers, or O(m log m) time for the number m.

Euclid's algorithm is widely used in practice, especially for small numbers, due to its simplicity. An alternative algorithm, the binary GCD algorithm, exploits the binary representation used by computers to avoid divisions and thereby increase efficiency, although it too is O(n^2); it merely shrinks the constant hidden by the big-O notation on many real machines.

There are more complex algorithms that can reduce the running time to O(n (log n)^2 (log log n)). See Computational complexity of mathematical operations for more details.

Relation with continued fractions
The quotients that appear when the Euclidean algorithm is applied to the inputs a and b are precisely the numbers occurring in the continued fraction representation of a/b. 
Take for instance the example of a = 1071 and b = 1029 used above. Here is the calculation; the quotients are 1, 24 and 2:

1071 = 1029 × 1 + 42
1029 = 42 × 24 + 21
42 = 21 × 2 + 0

Consequently,

\frac{1071}{1029} = 1 + \cfrac{1}{24 + \cfrac{1}{2}}.
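To make the link between the algorithm and the continued fraction concrete, the quotients can be collected as the loop runs. The C sketch below assumes non-negative integer inputs with b nonzero; the function name cf_quotients and the caller-supplied output buffer are choices made for this illustration only.

```c
#include <stddef.h>

/* Stores the successive quotients of the Euclidean algorithm applied to
   a and b in out[], writing at most max entries, and returns how many
   quotients the algorithm produced in total. */
size_t cf_quotients(unsigned long a, unsigned long b,
                    unsigned long *out, size_t max)
{
    size_t count = 0;

    while (b != 0) {
        unsigned long q = a / b;   /* next continued fraction term */
        unsigned long r = a % b;   /* remainder carried to the next step */

        if (count < max)
            out[count] = q;
        count++;
        a = b;
        b = r;
    }
    return count;
}
```

For a = 1071 and b = 1029 the buffer should end up holding 1, 24 and 2, the same three quotients that appear in the calculation above.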
    • This method applies to arbitrary real inputs a and nonzero b; if a/b is irrational, then the Euclidean algorithm does not terminate, but the computed sequence of quotients still represents the (now infinite) continued fraction representation of a/b.

The quotients 1, 24, 2 count certain squares nested within a rectangle R having length 1071 and width 1029, in the following manner:

(1) there is 1 1029×1029 square in R whose removal leaves a 42×1029 rectangle, R1;
(2) there are 24 42×42 squares in R1 whose removal leaves a 21×42 rectangle, R2;
(3) there are 2 21×21 squares in R2 whose removal leaves nothing.

The "visual Euclidean algorithm" of nested squares applies to an arbitrary rectangle R. If the (length)/(width) of R is an irrational number, then the visual Euclidean algorithm extends to a visual continued fraction.

Generalization to Euclidean domains
The Euclidean algorithm can be applied to some rings, not just the integers. The most general context in which the algorithm terminates with the greatest common divisor is in a Euclidean domain. For instance, the Gaussian integers and polynomial rings over a field are both Euclidean domains.

As an example, consider the ring of polynomials with rational coefficients. In this ring, division with remainder is carried out using long division, also known as synthetic division. The resulting polynomials are then made monic by factoring out the leading coefficient.

We calculate the greatest common divisor of

x^4 − 4x^3 + 4x^2 − 3x + 14 = (x^2 − 5x + 7)(x^2 + x + 2)

and

x^4 + 8x^3 + 12x^2 + 17x + 6 = (x^2 + 7x + 3)(x^2 + x + 2).

Following the algorithm gives these values:

a                                   b
x^4 + 8x^3 + 12x^2 + 17x + 6        x^4 − 4x^3 + 4x^2 − 3x + 14
    • x^4 − 4x^3 + 4x^2 − 3x + 14         x^2 + x + 2
x^2 + x + 2                         0

This agrees with the explicit factorization. For general Euclidean domains, the proof of correctness is by induction on some size function. For the integers, this size function is just the identity. For rings of polynomials over a field, it is the degree of the polynomial (note that each step in the above table reduces the degree of b).

References
• Donald Knuth. The Art of Computer Programming, Volume 2: Seminumerical Algorithms, Third Edition. Addison-Wesley, 1997. ISBN 0-201-89684-2. Sections 4.5.2–4.5.3, pp. 333–379.
• Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7. Section 31.2: Greatest common divisor, pp. 856–862.
• Clark Kimberling. "A Visual Euclidean Algorithm," Mathematics Teacher 76 (1983) 108-109.
    • A flowchart (also spelled flow-chart and flow chart) is a schematic representation of an algorithm or a process.

A flowchart is one of the seven basic tools of quality control, which also include the histogram, Pareto chart, check sheet, control chart, cause-and-effect diagram, and scatter diagram (see Quality Management Glossary). Flowcharts are commonly used in business/economic presentations to help the audience visualize the content better, or to find flaws in the process. Alternatively, one can use Nassi-Shneiderman diagrams.

A flowchart is described as "cross-functional" when the page is divided into different "lanes" describing the control of different organizational units. A symbol appearing in a particular "lane" is within the control of that organizational unit. This technique allows the analyst to locate the responsibility for performing an action or making a decision correctly, and shows the relationship between different organizational units that share responsibility for a single process.

History
The first structured method for documenting process flow, the flow process chart, was introduced by Frank Gilbreth to members of ASME in 1921 as the presentation "Process Charts—First Steps in Finding the One Best Way". Gilbreth's tools quickly found their way into industrial engineering curricula. In the early 1930s, an industrial engineer, Allan H. Mogensen, began training business people in the use of some of the tools of industrial engineering at his Work Simplification Conferences in Lake Placid, New York.

A 1944 graduate of Mogensen's class, Art Spinanger, took the tools back to Procter and Gamble where he developed their Deliberate Methods Change Program. Another 1944 graduate, Ben S. Graham, Director of Formcraft Engineering at Standard Register Corporation, adapted the flow process chart to information
    • processing with his development of the multi-flow process chart to display multiple documents and their relationships. In 1947, ASME adopted a symbol set derived from Gilbreth's original work as the ASME Standard for Process Charts.

According to Herman Goldstine, he developed flowcharts with John von Neumann at Princeton University in late 1946 and early 1947.[1]

Software

Manual
Any vector-based drawing program can be used to create flowcharts. Some tools offer special support for flowcharts, e.g., ConceptDraw and SmartDraw.

Automatic
Many software packages exist that can create flowcharts automatically, either directly from source code, or from a flowchart description language.

For example, Graph::Easy, a Perl package, takes a textual description of the graph, and uses the description to generate various output formats including HTML, ASCII or SVG. The example flowchart was generated from the text shown below.

[Figure: a simple flowchart, created automatically.]

graph { flow: south; }
node.start { shape: rounded; fill: #ffbfc9; }
node.question { shape: diamond; fill: #ffff8a; }
node.action { shape: rounded; fill: #8bef91; }

[ Lamp doesn't work ] { class: start }
  --> [ Lamp\n plugged in? ] { class: question; }
  -- No --> [ Plug in lamp ] { class: action; }

[ Lamp\n plugged in? ]
  --> [ Bulb\n burned out? ] { class: question; }
  -- Yes --> [ Replace bulb ] { class: action; }

[ Bulb\n burned out? ]
  -- No --> [ Buy new lamp ] { class: action; }

There also exist various MediaWiki Extensions to incorporate flowchart descriptions directly into wiki articles.
    • Examples

[Figure: a simple flowchart for computing factorial N (N!).]

A flowchart for computing factorial N (N!), where N! = 1 * 2 * 3 * ... * N. This flowchart would be difficult to program directly into a computer programming language since the flowchart represents "a loop and a half", a situation discussed in introductory programming textbooks that requires either a duplication of a component (to be both inside and outside the loop) or the component to be put inside a branch in the loop.

Since computer programming languages do not contain all of the constructs that can be created by the drawing of flowcharts, flowcharts do not often help new programmers learn the concepts of logical flow and program structure. To try writing flowcharts for computer programs, an on-line applet for iconic programming is available that limits the flowchart components and connections to those that can be directly converted into any programming language.

Symbols
A typical flowchart from older Computer Science textbooks may have the following kinds of symbols:

Start and end symbols, represented as lozenges, ovals or rounded rectangles, usually containing the word "Start" or "End", or another phrase signaling the start or end of a process, such as "submit enquiry" or "receive product".

Arrows, showing what's called "flow of control" in computer science. An arrow coming from one symbol and ending at another symbol represents that control passes to the symbol the arrow points to.

Processing steps, represented as rectangles. Examples: "Add 1 to X"; "replace identified part"; "save changes" or similar.

Input/Output, represented as a parallelogram. Examples: Get X from the user; display X.

Conditional (or decision), represented as a diamond (rhombus). These typically contain a Yes/No question or True/False test. 
This symbol is unique in that it has two arrows coming out of it, usually from the bottom point and right point, one corresponding to Yes or True, and one corresponding to No or False. The arrows should always be labeled. More than two arrows can be used, but this is normally a clear indicator that a complex decision is being taken, in which case it may need to be broken down further, or replaced with the "pre-defined process" symbol.

A number of other symbols that have less universal currency, such as:

A Document represented as a rectangle with a wavy base;
    • A Manual input represented by a rectangle, with the top irregularly sloping up from left to right. An example would be to signify data-entry from a form;

A Manual operation represented by a trapezoid with the longest parallel side at the top, to represent an operation or adjustment to process that can only be made manually.

A Data File represented by a cylinder.

Note: All process symbols within a flowchart should be numbered. Normally a number is inserted inside the top of the shape to indicate which step the process is within the flowchart.

Flowcharts may contain other symbols, such as connectors, usually represented as circles, to represent converging paths in the flow chart. Circles will have more than one arrow coming into them but only one going out. Some flow charts may just have an arrow point to another arrow instead. These are useful to represent an iterative process (what in Computer Science is called a loop). A loop may, for example, consist of a connector where control first enters, processing steps, a conditional with one arrow exiting the loop, and one going back to the connector. Off-page connectors are often used to signify a connection to a (part of another) process held on another sheet or screen. It is important to remember to keep these connections logical in order. All processes should flow from top to bottom and left to right.

C Data types.

Variable declaration
C has a concept of data types which are used to declare a variable before its use. The declaration of a variable will assign storage for the variable and define the type of data that will be held in the location.

So what data types are available?

int    float    double    char    void    enum

Please note that there is not a boolean data type. C does not have the traditional view about logical comparison, but that's another story.

int - data type
int is used to define integer numbers.

{
    int Count;
    Count = 5;
}

float - data type
float is used to define floating point numbers.
    • {
    float Miles;
    Miles = 5.6;
}

double - data type
double is used to define BIG floating point numbers. It reserves twice the storage for the number. On PCs this is likely to be 8 bytes.

{
    double Atoms;
    Atoms = 2500000;
}

char - data type
char defines characters.

{
    char Letter;
    Letter = 'x';
}

Modifiers
The data types above have the following modifiers.

short
long
signed
unsigned

The modifiers define the amount of storage allocated to the variable. The amount of storage allocated is not cast in stone. ANSI has the following rules:

short int <= int <= long int

What this means is that a short int should be assigned less than or the same amount of storage as an int, and an int should be assigned less than or the same amount of storage as a long int. What this means in the real world is:

short int     - 2 bytes (16 bits)
int           - 2 bytes (16 bits)
long int      - 4 bytes (32 bits)
signed char   - 1 byte  (Range -128 ... +127)
unsigned char - 1 byte  (Range 0 ... 255)
float         - 4 bytes
double        - 8 bytes
long double   - 8 bytes

These figures applied to the PC compilers of the time this text was written. Mainframes, midrange machines and later PC compilers may use different figures (int is commonly 4 bytes on current systems), but would still comply with the rule above.

You can find out how much storage is allocated to a data type by using the sizeof operator.

Qualifiers
    • const
volatile

The const qualifier is used to tell C that the variable value cannot change after initialisation.

const float pi = 3.14159;

pi cannot be changed at a later time within the program.

Another way to define constants is with the #define preprocessor directive, which has the advantage that it does not use any storage (but who counts bytes these days?).

See also: data type conversion, storage classes, cast, typedef keyword.

I have had several suggestions on how to describe volatile. If you have any input please mail me.

The volatile keyword acts as a data type qualifier. The qualifier alters the default way in which the compiler handles the variable, and the compiler does not attempt to optimize the storage referenced by it. (Martin Leslie)

volatile means the storage is likely to change at any time and be changed by something outside the control of the user program. This means that if you reference the variable, the program should always check the physical address (i.e. a mapped input FIFO), and not use it in a cached way. (Stephen Hunt)

Here is an example for the usage of the volatile keyword:

/* Base address of the data input latch */
volatile unsigned char *baseAddr;

/* read parts of output latch */
lsb    = *handle->baseAddr;
middle = *handle->baseAddr;
msb    = *handle->baseAddr;

Between reads the bytes are changed in the latch.
    • Without the volatile, the compiler optimises this to a single assignment:

lsb = middle = msb = *handle->baseAddr;

(Tim Potter)

A volatile variable is for dynamic use, e.g. for data that is to be passed to an I/O port. Here is an example.

#define TTYPORT 0x17755U

volatile char *port17 = (volatile char *)TTYPORT;
*port17 = 'o';
*port17 = 'N';

Without the volatile modifier, the compiler would think that the statement *port17 = 'o'; is redundant and would remove it from the object code. The volatile qualifier prevents the compiler optimisation.

Data type conversion
An operator must have operands of the same type before it can carry out the operation. Because of this, C will perform some automatic conversion of data types. These are the general rules for binary operators (* + / % etc):

If either operand is long double the other is converted to long double.
Otherwise, if either operand is double the other is converted to double.
Otherwise, if either operand is float the other is converted to float.
Otherwise, convert char and short to int.
Then, if an operand is long convert the other to long.

C Storage Classes.
C has a concept of storage classes which are used to define the scope (visibility) and lifetime of variables and/or functions.

So what storage classes are available?

auto    register    static    extern    typedef

auto - storage class
auto is the default storage class for local variables.

{
    int Count;
    •     auto int Month;
}

The example above defines two variables with the same storage class. auto can only be used within functions, i.e. local variables.

register - Storage Class
register is used to define local variables that should be stored in a register instead of RAM. This means that the variable has a maximum size equal to the register size (usually one word) and can't have the unary & operator applied to it (as it does not have a memory location).

{
    register int Miles;
}

register should only be used for variables that require quick access, such as counters. It should also be noted that defining register does not mean that the variable will be stored in a register. It means that it MIGHT be stored in a register, depending on hardware and implementation restrictions.

static - Storage Class
static is the default storage class for global variables. The two variables below (Count and Road) both have a static storage class.

static int Count;
int Road;

{
    printf("%d\n", Road);
}

static variables can be seen within all functions in this source file. At link time, variables declared with the static keyword (Count here) will not be seen by the object modules that are brought in.

static can also be defined within a function! If this is done the variable is initialised only once and is not reinitialised each time the function is called. This is serious stuff; tread with care.

{
    static int Count = 1;
}

There is one very important use for static. Consider this bit of code.

char * func(void);

int main(void)
{
    char *Text1;
    Text1 = func();
    • }

char * func(void)
{
    char Text2[10] = "martin";
    return(Text2);
}

Now, func returns a pointer to the memory location where Text2 starts BUT Text2 has a storage class of auto and will disappear when we exit the function and could be overwritten by something else. The answer is to specify

static char Text2[10] = "martin";

The storage assigned to Text2 will remain reserved for the duration of the program.

extern - storage Class
extern defines a global variable that is visible to ALL object modules. When you use extern the variable cannot be initialised, as all it does is point the variable name at a storage location that has been previously defined.

Source 1                                 Source 2
--------                                 --------
extern int count;                        int count = 5;

write()                                  main()
{                                        {
    printf("count is %d\n", count);          write();
}                                        }

count in source 1 will have a value of 5. If source 1 changes the value of count, source 2 will see the new value.

The compile command will look something like:

gcc source1.c source2.c -o program

8.4. Const and volatile
These are new in Standard C, although the idea of const has been borrowed from C++. Let us get one thing straight: the concepts of const and volatile are completely independent. A common misconception is to imagine that somehow const is the opposite of volatile and vice versa. They are unrelated and you should remember the fact.
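One way to see that the two qualifiers are independent is that both can apply to the same object at once. The following C sketch imagines a tick counter that the program may only read but that changes behind its back. Everything here is an assumption for illustration: the names tick_register, clock_view, hardware_tick and read_clock are invented, and an ordinary variable stands in for what would be a fixed device address on real hardware.

```c
/* A stand-in for a hardware tick counter. On a real system this would
   live at a fixed device address; an ordinary variable is used here so
   the sketch can run anywhere (an assumption for illustration). */
static volatile unsigned long tick_register = 0;

/* const:    this view promises not to write through the pointer.
   volatile: every read must fetch the value again, because it may
             have been changed by something outside this code. */
static const volatile unsigned long *const clock_view = &tick_register;

/* Simulates the outside world advancing the counter. */
void hardware_tick(void)
{
    tick_register++;
}

/* Reads the current value through the read-only, volatile view. */
unsigned long read_clock(void)
{
    return *clock_view;
}
```

The const part restricts what this code may do; the volatile part describes what other agents may do. Neither implies or cancels the other.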
    • Since const declarations are the simpler, we'll look at them first, but only after we have seen where both of these type qualifiers may be used. The complete list of relevant keywords is

    char     long       float     volatile
    short    signed     double    void
    int      unsigned   const

In that list, const and volatile are type qualifiers, the rest are type specifiers. Various combinations of type specifiers are permitted:

    char, signed char, unsigned char
    int, signed int, unsigned int
    short int, signed short int, unsigned short int
    long int, signed long int, unsigned long int
    float
    double
    long double

A few points should be noted. All declarations to do with an int will be signed anyway, so signed is redundant in that context. If any other type specifier or qualifier is present, then the int part may be dropped, as that is the default.

The keywords const and volatile can be applied to any declaration, including those of structures, unions, enumerated types or typedef names. Applying them to a declaration is called qualifying the declaration; that's why const and volatile are called type qualifiers, rather than type specifiers. Here are a few representative examples:

    volatile i;
    volatile int j;
    const long q;
    const volatile unsigned long int rt_clk;
    struct{
        const long int li;
        signed char sc;
    }volatile vs;

Don't be put off; some of them are deliberately complicated: what they mean will be explained later. Remember that they could also be further complicated by introducing storage class specifications as well! In fact, the truly spectacular

    extern const volatile unsigned long int rt_clk;

is a strong possibility in some real-time operating system kernels.

8.4.1. Const

Let's look at what is meant when const is used. It's really quite simple: const means that something is not modifiable, so a data object that is declared with const as a part of its type specification must not be assigned to in any way during the run of a program. It is very likely that the definition of the object will contain an initializer (otherwise, since you can't assign to it, how would it ever get a value?), but this is not always the case. For example, if you were accessing a hardware port at a fixed memory address and promised only to read from it, then it would be declared to be const but not initialized.

Taking the address of a data object of a type which isn't const and putting it into a pointer to the const-qualified version of the same type is both safe and explicitly permitted; you will be able to use the pointer to inspect the object, but not modify it. Putting the address of a const type into a pointer to the unqualified type is much more dangerous and consequently prohibited (although you can get around this by using a cast). Here is an example:

#include <stdio.h>
#include <stdlib.h>

main(){
    int i;
    const int ci = 123;

    /* declare a pointer to a const.. */
    const int *cpi;

    /* ordinary pointer to a non-const */
    int *ncpi;

    cpi = &ci;
    ncpi = &i;

    /*
     * this is allowed
     */
    cpi = ncpi;

    /*
     * this needs a cast
     * because it is usually a big mistake,
     * see what it permits below.
     */
    ncpi = (int *)cpi;

    /*
     * now to get undefined behaviour...
     * modify a const through a pointer
     */
    *ncpi = 0;

    exit(EXIT_SUCCESS);
}
    • Example 8.3

As the example shows, it is possible to take the address of a constant object, generate a pointer to a non-constant, then use the new pointer. This is an error in your program and results in undefined behaviour.

The main intention of introducing const objects was to allow them to be put into read-only store, and to permit compilers to do extra consistency checking in a program. Unless you defeat the intent by doing naughty things with pointers, a compiler is able to check that const objects are not modified explicitly by the user.

An interesting extra feature pops up now. What does this mean?

    char c;
    char *const cp = &c;

It's simple really; cp is a pointer to a char, which is exactly what it would be if the const weren't there. The const means that cp is not to be modified, although whatever it points to can be: the pointer is constant, not the thing that it points to. The other way round is

    const char *cp;

which means that now cp is an ordinary, modifiable pointer, but the thing that it points to must not be modified. So, depending on what you choose to do, both the pointer and the thing it points to may be modifiable or not; just choose the appropriate declaration.

8.4.2. Volatile

After const, we treat volatile. The reason for having this type qualifier is mainly to do with the problems that are encountered in real-time or embedded systems programming using C. Imagine that you are writing code that controls a hardware device by placing appropriate values in hardware registers at known absolute addresses.

Let's imagine that the device has two registers, each 16 bits long, at ascending memory addresses; the first one is the control and status register (csr) and the second is a data port. The traditional way of accessing such a device is like this:

/* Standard C example but without const or volatile */
/*
 * Declare the device registers
 * Whether to use int or short
 * is implementation dependent
 */

struct devregs{
    unsigned short csr;     /* control & status */
    unsigned short data;    /* data port */
};

/* bit patterns in the csr */
#define ERROR 0x1
#define READY 0x2
#define RESET 0x4

/* absolute address of the device */
#define DEVADDR ((struct devregs *)0xffff0004)

/* number of such devices in system */
#define NDEVS 4

/*
 * Busy-wait function to read a byte from device n.
 * check range of device number.
 * Wait until READY or ERROR
 * if no error, read byte, return it
 * otherwise reset error, return 0xffff
 */
unsigned int read_dev(unsigned devno){

    struct devregs *dvp = DEVADDR + devno;

    if(devno >= NDEVS)
        return(0xffff);

    while((dvp->csr & (READY | ERROR)) == 0)
        ; /* NULL - wait till done */

    if(dvp->csr & ERROR){
        dvp->csr = RESET;
        return(0xffff);
    }

    return((dvp->data) & 0xff);
}

Example 8.4

The technique of using a structure declaration to describe the device register layout and names is very common practice. Notice that there aren't actually any objects of that type defined, so the declaration simply indicates the structure without using up any store.

To access the device registers, an appropriately cast constant is used as if it were pointing to such a structure, but of course it points to memory addresses instead.
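The structure-overlay idea can be tried out without real hardware. The following is only a sketch: `fake_bank`, `dev_byte_offset` and `csr_offset` are hypothetical names, and an ordinary array stands in for the absolute address used in the example, since dereferencing 0xffff0004 on a hosted system would not be meaningful.

```c
#include <stddef.h>

struct devregs {
    unsigned short csr;    /* control & status */
    unsigned short data;   /* data port */
};

/* Ordinary memory standing in for the four devices' registers (assumption). */
static struct devregs fake_bank[4];

/*
 * Byte distance from the registers of device 0 to those of device n.
 * n must be no greater than 4 so the pointer stays within (or one past) the array.
 */
size_t dev_byte_offset(unsigned n)
{
    struct devregs *base = fake_bank;
    return (size_t)((char *)(base + n) - (char *)base);
}

/* Offset of the csr member within the structure; the first member is always at 0. */
size_t csr_offset(void)
{
    return offsetof(struct devregs, csr);
}
```

This shows why `DEVADDR + devno` in the example selects the registers of device `devno`: pointer arithmetic on a `struct devregs *` advances in whole-structure steps.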
    • However, a major problem with previous C compilers would be in the while loop which tests the status register and waits for the ERROR or READY bit to come on. Any self-respecting optimizing compiler would notice that the loop tests the same memory address over and over again. It would almost certainly arrange to reference memory once only, and copy the value into a hardware register, thus speeding up the loop. This is, of course, exactly what we don't want; this is one of the few places where we must look at the place where the pointer points, every time around the loop.

Because of this problem, most C compilers have been unable to make that sort of optimization in the past. To remove the problem (and other similar ones to do with when to write to where a pointer points), the keyword volatile was introduced. It tells the compiler that the object is subject to sudden change for reasons which cannot be predicted from a study of the program itself, and forces every reference to such an object to be a genuine reference.

Here is how you would rewrite the example, making use of const and volatile to get what you want.

/*
 * Declare the device registers
 * Whether to use int or short
 * is implementation dependent
 */

struct devregs{
    unsigned short volatile csr;
    unsigned short const volatile data;
};

/* bit patterns in the csr */
#define ERROR 0x1
#define READY 0x2
#define RESET 0x4

/* absolute address of the device */
#define DEVADDR ((struct devregs *)0xffff0004)

/* number of such devices in system */
#define NDEVS 4

/*
 * Busy-wait function to read a byte from device n.
 * check range of device number.
 * Wait until READY or ERROR
 * if no error, read byte, return it
 * otherwise reset error, return 0xffff
 */
unsigned int read_dev(unsigned devno){

    struct devregs * const dvp = DEVADDR + devno;

    if(devno >= NDEVS)
        return(0xffff);

    while((dvp->csr & (READY | ERROR)) == 0)
        ; /* NULL - wait till done */

    if(dvp->csr & ERROR){
        dvp->csr = RESET;
        return(0xffff);
    }

    return((dvp->data) & 0xff);
}

Example 8.5

The rules about mixing volatile and regular types resemble those for const. A pointer to a volatile object can be assigned the address of a regular object with safety, but it is dangerous (and needs a cast) to take the address of a volatile object and put it into a pointer to a regular object. Using such a derived pointer results in undefined behaviour.

If an array, union or structure is declared with const or volatile attributes, then all of the members take on that attribute too. This makes sense when you think about it: how could a member of a const structure be modifiable?

That means that an alternative rewrite of the last example would be possible. Instead of declaring the device registers to be volatile in the structure, the pointer could have been declared to point to a volatile structure instead, like this:

    struct devregs{
        unsigned short csr;     /* control & status */
        unsigned short data;    /* data port */
    };
    volatile struct devregs *const dvp=DEVADDR+devno;

Since dvp points to a volatile object, the compiler is not permitted to optimize references through the pointer. Our feeling is that, although this would work, it is bad style. The volatile declaration belongs in the structure: it is the device registers which are volatile and that is where the information should be kept; it reinforces the fact for a human reader.

So, for any object likely to be subject to modification either by hardware or asynchronous interrupt service routines, the volatile type qualifier is important.
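To experiment with the qualified structure on an ordinary machine, the busy-wait routine can be pointed at simulated registers. This is a minimal sketch under stated assumptions: `fake_devs`, `read_dev_from` and `read_fake_dev` are hypothetical names, the register contents are made up, and a static array replaces the absolute device address. The range check is also moved ahead of the pointer arithmetic.

```c
/* bit patterns in the csr */
#define ERROR 0x1
#define READY 0x2
#define RESET 0x4

/* number of simulated devices (assumption for this sketch) */
#define NDEVS 2

struct devregs {
    unsigned short volatile csr;          /* control & status */
    unsigned short const volatile data;   /* data port, read-only to the program */
};

/* Simulated registers: device 0 is READY with data 0x1234, device 1 reports ERROR. */
static struct devregs fake_devs[NDEVS] = {
    { READY, 0x1234 },
    { ERROR, 0 }
};

/* Same logic as read_dev, but the register base is passed in as a parameter. */
unsigned int read_dev_from(struct devregs *base, unsigned devno)
{
    struct devregs *dvp;

    if (devno >= NDEVS)
        return 0xffff;

    dvp = base + devno;

    while ((dvp->csr & (READY | ERROR)) == 0)
        ; /* wait till done; every test re-reads csr because it is volatile */

    if (dvp->csr & ERROR) {
        dvp->csr = RESET;
        return 0xffff;
    }

    return dvp->data & 0xff;
}

/* Convenience wrapper that reads from the simulated devices. */
unsigned int read_fake_dev(unsigned devno)
{
    return read_dev_from(fake_devs, devno);
}
```

Calling `read_fake_dev(0)` yields the low byte of the simulated data port, while `read_fake_dev(1)` takes the error path and writes RESET into the simulated csr. The const member can still be given a value by the initializer, which matches the earlier remark that const objects usually get their value at definition.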
    • Now, just when you thought that you understood all that, here comes the final twist. A declaration like this:

    volatile struct devregs{
        /* stuff */
    }v_decl;

declares the type struct devregs and also a volatile-qualified object of that type, called v_decl. A later declaration like this

    struct devregs nv_decl;

declares nv_decl which is not qualified with volatile! The qualification is not part of the type of struct devregs but applies only to the declaration of v_decl. Look at it this way round, which perhaps makes the situation more clear (the two declarations are the same in their effect):

    struct devregs{
        /* stuff */
    }volatile v_decl;

If you do want to get a shorthand way of attaching a qualifier to another type, you can use typedef to do it:

    struct x{
        int a;
    };
    typedef const struct x csx;

    csx const_sx;
    struct x non_const_sx = {1};

    const_sx = non_const_sx; /* error - attempt to modify a const */

8.4.2.1. Indivisible Operations

Those of you who are familiar with techniques that involve hardware interrupts and other ‘real time’ aspects of programming will recognise the need for volatile types. Related to this area is the need to ensure that accesses to data objects are ‘atomic’, or uninterruptable. To discuss this in any depth would take us beyond the scope of this book, but we can at least outline some of the issues.

Be careful not to assume that any operations written in C are uninterruptable. For example,

    extern const volatile unsigned long real_time_clock;

could be a counter which is updated by a clock interrupt routine. It is essential to make it volatile because of the asynchronous updates to it, and it is marked const because it should not be changed by anything other than the interrupt routine. If the program accesses it like this:

    unsigned long int time_of_day;

    time_of_day = real_time_clock;

there may be a problem. What if, to copy one long into another, it takes several machine instructions to copy the two words making up real_time_clock and time_of_day? It is possible that an interrupt will occur in the middle of the assignment and that in the worst case, when the low-order word of real_time_clock is 0xffff and the high-order word is 0x0000, then the low-order word of time_of_day will receive 0xffff. The interrupt arrives and increments the low-order word of real_time_clock to 0x0 and then the high-order word to 0x1, then returns. The rest of the assignment then completes, with time_of_day ending up containing 0x0001ffff and real_time_clock containing the correct value, 0x00010000.

This whole class of problem is what is known as a critical region, and is well understood by those who regularly work in asynchronous environments. It should be understood that Standard C takes no special precautions to avoid these problems, and that the usual techniques should be employed.

The header ‘signal.h’ declares a type called sig_atomic_t which is guaranteed to be modifiable safely in the presence of asynchronous events. This means only that it can be modified by assigning a value to it; incrementing or decrementing it, or anything else which produces a new value depending on its previous value, is not safe.

8.2. Declarations, Definitions and Accessibility

Chapter 4 introduced the concepts of scope and linkage, showing how they can be combined to control the accessibility of things throughout a program. We deliberately gave a vague description of exactly what constitutes a definition on the grounds that it would give you more pain than gain at that stage. Eventually it has to be spelled out in detail, which we do in this chapter. Just to make things interesting, we need to throw in storage class too.

You'll probably find the interactions between these various elements to be both complex and confusing: that's because they are! We try to eliminate some of the confusion and give some useful rules of thumb in Section 8.2.6 below, but to understand them, you still need to read the stuff in between at least once.

For a full understanding, you need a good grasp of three distinct but related concepts. The Standard calls them:

    duration
    scope
    linkage
    • and describes what they mean in a fairly readable way (for a standard). Scope and linkage have already been described in Chapter 4, although we do present a review of them below.

8.2.1. Storage class specifiers

There are five keywords under the category of storage class specifiers, although one of them, typedef, is there more out of convenience than utility; it has its own section later since it doesn't really belong here. The ones remaining are auto, extern, register, and static.

Storage class specifiers help you to specify the type of storage used for data objects. Only one storage class specifier is permitted in a declaration (this makes sense, as there is only one way of storing things) and if you omit the storage class specifier in a declaration, a default is chosen. The default depends on whether the declaration is made outside a function (external declarations) or inside a function (internal declarations). For external declarations the default storage class specifier will be extern and for internal declarations it will be auto. The only exception to this rule is the declaration of functions, whose default storage class specifier is always extern.

The positioning of a declaration, the storage class specifiers used (or their defaults) and, in some cases, preceding declarations of the same name, can all affect the linkage of a name, although fortunately not its scope or duration. We will investigate the easier items first.

8.2.1.1. Duration

The duration of an object describes whether its storage is allocated once only, at program start-up, or is more transient in its nature, being allocated and freed as necessary.

There are only two types of duration of objects: static duration and automatic duration. Static duration means that the object has its storage allocated permanently, automatic means that the storage is allocated and freed as necessary.
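The difference between the two durations can be observed directly. The sketch below uses a hypothetical function, `count_calls`, with one object of each duration; the way the two values are combined into one return value is only a device for making both visible.

```c
/*
 * Returns calls * 10 + scratch, so the tens digit shows the static
 * object and the units digit shows the automatic one.
 */
int count_calls(void)
{
    static int calls = 0;   /* static duration: initialised once, value persists between calls */
    int scratch = 0;        /* automatic duration: created and initialised afresh on every call */

    calls++;
    scratch++;

    return calls * 10 + scratch;
}
```

Successive calls return 11, 21 and 31: `calls` keeps counting because its storage lasts for the whole run of the program, while `scratch` starts from 0 each time.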
It's easy to tell which is which: you only get automatic duration if

- the declaration is inside a function
- and the declaration does not contain the static or extern keywords
- and the declaration is not the declaration of a function

(if you work through the rules, you'll find that the formal parameters of a function always meet all three requirements: they are always ‘automatic’).

Although the presence of static in a declaration unambiguously ensures that it has static duration, it's interesting to see that it is by no means the only way. This is a notorious source of confusion, but we just have to accept it.

Data objects declared inside functions are given the default storage class specifier of auto unless some other storage class specifier is used. In the vast majority of cases, you don't want these objects to be accessible from outside the function, so you want them to have no linkage. Either the default, auto, or the explicit register storage class specifier results in an object with no linkage and automatic duration. Neither auto nor register can be applied to a declaration that occurs outside a function.

The register storage class is quite interesting, although it is tending to fall into disuse nowadays. It suggests to the compiler that it would be a good idea to store the object in one or more hardware registers in the interests of speed. The compiler does not have to take any notice of this, but to make things easy for it, register variables do not have an address (the & address-of operator is forbidden) because some computers don't support the idea of addressable registers. Declaring too many register objects may slow the program down, rather than speed it up, because the compiler may either have to save more registers on entrance to a function, often a slow process, or there won't be enough registers remaining to be used for intermediate calculations. Determining when to use registers will be a machine-specific choice and should only be taken when detailed measurements show that a particular function needs to be speeded up. Then you will have to experiment. In our opinion, you should never declare register variables during program development. Get the program working first, then measure it, then, maybe, judicious use of registers will give a useful increase in performance. But that work will have to be repeated for every type of processor you move the program to; even within one family of processors the characteristics are often different.

A final note on register variables: this is the only storage class specifier that may be used in a function prototype or function definition. In a function prototype, the storage class specifier is simply ignored, in a function definition it is a hint that the actual parameter should be stored in a register if possible.
This example shows how it might be used:

#include <stdio.h>
#include <stdlib.h>

void func(register int arg1, double arg2);

main(){
    func(5, 2);
    exit(EXIT_SUCCESS);
}

/*
 * Function illustrating that formal parameters
 * may be declared to have register storage class.
 */
void func(register int arg1, double arg2){

    /*
     * Illustrative only - nobody would do this
     * in this context.
     * Cannot take address of arg1, even if you want to
     */
    double *fp = &arg2;

    while(arg1){
        printf("res = %f\n", arg1 * (*fp));
        arg1--;
    }
}

Example 8.1

So, the duration of an object depends on the storage class specifier used, whether it's a data object or function, and the position (block or file scope) of the declaration concerned. The linkage is also dependent on the storage class specifier, what kind of object it is and the scope of the declaration. Table 8.1 and Table 8.2 show the resulting storage duration and apparent linkage for the various combinations of storage class specifiers and location of the declaration. The actual linkage of objects with static duration is a bit more complicated, so use these tables only as a guide to the simple cases and take a look at what we say later about definitions.

Storage Class Specifier   Function or Data Object   Linkage             Duration
static                    either                    internal            static
extern                    either                    probably external   static
none                      function                  probably external   static
none                      data object               external            static

Table 8.1. External declarations (outside a function)

The table above omits the register and auto storage class specifiers because they are not permitted in file-scope (external) declarations.

Storage Class Specifier   Function or Data Object   Linkage             Duration
register                  data object only          none                automatic
auto                      data object only          none                automatic
static                    data object only          none                static
extern                    either                    probably external   static
none                      data object               none                automatic
none                      function                  probably external   static

Table 8.2. Internal declarations

Internal static variables retain their values between calls of the function that contains them, which is useful in certain circumstances (see Chapter 4).

8.2.2. Scope

Now we must look again at the scope of the names of objects, which defines when and where a given name has a particular meaning. The different types of scope are the following:

function scope
    • file scope
block scope
function prototype scope

The easiest is function scope. This only applies to labels, whose names are visible throughout the function where they are declared, irrespective of the block structure. No two labels in the same function may have the same name, but because the name only has function scope, the same name can be used for labels in every function. Labels are not objects: they have no storage associated with them and the concepts of linkage and duration have no meaning for them.

Any name declared outside a function has file scope, which means that the name is usable at any point from the declaration on to the end of the source code file containing the declaration. Of course it is possible for these names to be temporarily hidden by declarations within compound statements. As we know, function definitions must be outside other functions, so the name introduced by any function definition will always have file scope.

A name declared inside a compound statement, or as a formal parameter to a function, has block scope and is usable up to the end of the associated } which closes the compound statement. Any declaration of a name within a compound statement hides any outer declaration of the same name until the end of the compound statement.

A special and rather trivial example of scope is function prototype scope where a declaration of a name extends only to the end of the function prototype. That means simply that this is wrong (same name used twice):

    void func(int i, int i);

and this is all right:

    void func(int i, int j);

The names declared inside the parentheses disappear outside them.

The scope of a name is completely independent of any storage class specifier that may be used in its declaration.

8.2.3. Linkage

We will briefly review the subject of linkage here, too. Linkage is used to determine what makes the same name declared in different scopes refer to the same thing. An object only ever has one name, but in many cases we would like to be able to refer to the same object from different scopes. A typical example is the wish to be able to call printf from several different places in a program, even if those places are not all in the same source file.

The Standard warns that declarations which refer to the same thing must all have compatible type, or the behaviour of the program will be undefined. A full description of compatible type is given later; for the moment you can take it to mean that, except for the
    • use of the storage class specifier, the declarations must be identical. It's the responsibility of the programmer to get this right, though there will probably be tools available to help you check this out.

The three different types of linkage are:

    external linkage
    internal linkage
    no linkage

In an entire program, built up perhaps from a number of source files and libraries, if a name has external linkage, then every instance of that name refers to the same object throughout the program.

For something which has internal linkage, it is only within a given source code file that instances of the same name will refer to the same thing.

Finally, names with no linkage refer to separate things.

8.2.4. Linkage and definitions

Every data object or function that is actually used in a program (except as the operand of a sizeof operator) must have one and only one corresponding definition. This is actually very important, although we haven't really come across it yet because most of our examples have used only data objects with automatic duration, whose declarations are axiomatically definitions, or functions which we have defined by providing their bodies.

This ‘exactly one’ rule means that for objects with external linkage there must be exactly one definition in the whole program; for things with internal linkage (confined to one source code file) there must be exactly one definition in the file where it is declared; for things with no linkage, whose declaration is always a definition, there is exactly one definition as well.

Now we try to draw everything together. The real questions are

    How do I get the sort of linkage that I want?
    What actually constitutes a definition?

We need to look into linkage first, then definitions.

How do you get the appropriate linkage for a particular name? The rules are a little complicated.

1. A declaration outside a function (file scope) which contains the static storage class specifier results in internal linkage for that name. (The Standard requires that function declarations which contain static must be at file scope, outside any block.)
2. If a declaration contains the extern storage class specifier, or is the declaration of a function with no storage class specifier (or both), then:
   - if there is already a visible declaration of that identifier with file scope, the resulting linkage is the same as that of the visible declaration;
   - otherwise the result is external linkage.
3. If a file scope declaration is neither the declaration of a function nor contains an explicit storage class specifier, then the result is external linkage.
4. Any other form of declaration results in no linkage.
5. In any one source code file, if a given identifier has both internal and external linkage then the result is undefined.

These rules were used to derive the ‘linkage’ columns of Table 8.1 and Table 8.2, without the full application of rule 2, hence the use of the ‘probably external’ term. Rule 2 allows you to determine the precise linkage in those cases.

What makes a declaration into a definition?

1. Declarations that result in no linkage are also definitions.
2. Declarations that include an initializer are always definitions; this includes the ‘initialization’ of functions by providing their body. Declarations with block scope may only have initializers if they also have no linkage.
3. Otherwise, the declaration of a name with file scope and with either no storage class specifier or with the static storage class specifier is a tentative definition. If a source code file contains one or more tentative definitions for an object, then if that file contains no actual definitions, a default definition is provided for that object as if it had an initializer of 0. (Structures and arrays have all their elements initialized to 0.) Functions do not have tentative definitions.

A consequence of the foregoing is that unless you also provide an initializer, declarations that explicitly include the extern storage class specifier do not result in a definition.

8.2.5. Realistic use of linkage and definitions

The rules that determine the linkage and definition associated with declarations look quite complicated.
The combinations used in practice are nothing like as bad; so let's investigate the usual cases.

The three types of accessibility that you will want of data objects or functions are:

    throughout the entire program,
    restricted to one source file,
    restricted to one function (or perhaps a single compound statement).

For the three cases above, you will want external linkage, internal linkage, and no linkage respectively. The recommended practice for the first two cases is to declare all of the names in each of the relevant source files before you define any functions. The recommended layout of a source file would be as shown in Figure 8.1.

Figure 8.1. Layout of a source file

The external linkage declarations would be prefixed with extern, the internal linkage declarations with static. Here's an example.

/* example of a single source file layout */
#include <stdio.h>

/* Things with external linkage:
 * accessible throughout program.
 * These are declarations, not definitions, so
 * we assume their definition is somewhere else.
 */

extern int important_variable;
extern int library_func(double, int);

/*
 * Definitions with external linkage.
 */

extern int ext_int_def = 0;     /* explicit definition */
int tent_ext_int_def;           /* tentative definition */

/*
 * Things with internal linkage:
 * only accessible inside this file.
 * The use of static means that they are also
 * tentative definitions.
 */

static int less_important_variable;
static struct{
    int member_1;
    int member_2;
}local_struct;

/*
 * Also with internal linkage, but not a tentative
 * definition because this is a function.
 */
static void lf(void);

/*
 * Definition with internal linkage.
 */
static float int_link_f_def = 5.3;

/*
 * Finally definitions of functions within this file
 */

/*
 * This function has external linkage and can be called
 * from anywhere in the program.
 */
void f1(int a){}

/*
 * These two functions can only be invoked by name from
 * within this file.
 */
static int local_function(int a1, int a2){
    return(a1 * a2);
}

static void lf(void){
    /*
     * A static variable with no linkage,
     * so usable only within this function.
     * Also a definition (because of no linkage)
     */
    static int count;
    /*
     * Automatic variable with no linkage but
     * an initializer
     */
    int i = 1;

    printf("lf called for time no %d\n", ++count);
}

/*
 * Actual definitions are implicitly provided for
 * all remaining tentative definitions at the end of
 * the file
 */

Example 8.2

We suggest that you re-read the preceding sections to see how the rules have been applied in Example 8.2.

Use of volatile

The compilation system tries to reduce code size and execution time on all machines by optimizing code. It is the programmer's responsibility to inform the compilation system that certain code (or a certain variable) should not be optimized. Many programmers do not understand when to use volatile to inform the compilation system that certain code should not be
    • optimized or what it does. Although (no doubt) their programs work, they do not exploit the full power of the language.

A variable should be declared volatile whenever its value can be changed by something beyond the control of the program in which it appears, such as a concurrently executing thread. volatile can appear only once in a declaration, with any type specifier; however, it cannot appear after the first comma in a multiple-item declaration. For example, the following declarations are legal:

    /* Let T denote some data type */
    typedef volatile T i;
    volatile T i;
    T volatile i;

And the following declaration is illegal:

    T i, volatile vi;

Volatile qualifiers can be used to change the behaviour of a type. For example,

    volatile int p = 3;

declares and initializes an object with type volatile int whose value will always be read from memory.

Syntax

The keyword volatile can be placed before or after the data type in the variable definition. For example, the following declarations are identical:

    volatile T a = 3;
    T volatile a = 3;

Each declares and initializes an object with type volatile T whose value will always be read from memory.

Pointer declaration

    volatile T *ptr;
    T volatile *ptr;

Here ptr is a pointer to a volatile T. A volatile pointer to a volatile variable is declared as:

    int volatile * volatile ptr;

volatile can be applied to derived types such as an array type; in that case, the element is qualified, not the array type. When applied to a struct type, the entire contents of the struct become volatile. You can also apply the volatile qualifier to the individual members of the struct. Type qualifiers are relevant only when accessing identifiers as l-values in expressions. volatile does not affect the range of values or arithmetic properties of the object.

To declare the item pointed to by the pointer as volatile, use a declaration of the form:

    volatile T *vptr;

To declare the value of the pointer (that is, the actual address stored in the pointer) as volatile, use a declaration of the form:

    T* volatile ptrv;
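The three pointer forms can be placed side by side in one small function. This is only a sketch: `slot_a`, `slot_b` and `pointer_forms` are hypothetical names, and ordinary variables stand in for whatever the pointers would refer to in a real program.

```c
/* Ordinary objects used as pointer targets (assumption for this sketch). */
static int slot_a = 1;
static int slot_b = 2;

/* Exercises each placement of volatile in a pointer declaration. */
int pointer_forms(void)
{
    volatile int *vptr = &slot_a;           /* pointer to volatile int: accesses through *vptr are volatile */
    int *volatile ptrv = &slot_a;           /* volatile pointer to plain int: the stored address is volatile */
    int volatile *volatile both = &slot_b;  /* volatile pointer to volatile int */

    ptrv = &slot_b;   /* the pointer itself may still be changed; it is volatile, not const */
    *vptr = 10;       /* writes slot_a through the pointer to volatile */

    return *vptr + *ptrv + *both;   /* slot_a + slot_b + slot_b */
}
```

Assigning the address of a non-volatile int to `vptr` is permitted without a cast, because adding a qualifier to the pointed-to type is always safe.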
    • Use of volatile

- An object that is a memory-mapped I/O port
- An object that is shared between multiple concurrent processes
- An object that is modified by an interrupt service routine
- An automatic object declared in a function that calls setjmp and whose value is changed between the call to setjmp and a corresponding call to longjmp

# An object that is a memory-mapped I/O port

An 8-bit, memory-mapped I/O port at physical address 0x15 can be declared as

    char * const ptr = (char *)0x15;

*ptr accesses the port. Consider the following code fragment, which sets the third bit of the output port at periodic intervals:

    *ptr = 0;
    while(*ptr){
        *ptr = 4;
        *ptr = 0;
    }

The above code may be optimized to

    *ptr = 0;
    while(0) {}

because *ptr is assigned 0 before the value 4 is ever used, and the value of *ptr never appears to change (the same constant is always assigned to it). The volatile keyword is used to suppress these optimizations: the compiler then assumes that the value can change at any time, even if no explicit code modifies it. To suppress all such optimization, declare ptr as

    volatile char * const ptr = (volatile char *)0x15;

This declaration says that the object at which ptr points can change without notice, but that ptr itself is a constant whose value never changes.

# An object that is shared between multiple concurrent processes

If two threads or tasks concurrently assign distinct values to the same shared non-volatile variable, a subsequent use of that variable may obtain a value that is not equal to either of the assigned values, but some implementation-dependent mixture of the two. So when a global variable is referenced by multiple threads, programmers must declare the shared variable as volatile.

/* header names were lost in the source; these are the ones the calls below need */
#include <stdio.h>
#include <unistd.h>
#include <thread.h>

volatile int num;

void *foo(void *arg)
{
    while(1) {
        ++num;
        sleep(1000);
    }
}

main()
{
    int p;
    void *targ = NULL;
    thread_t id;

    num = 0;
    p = thr_create((void *)NULL, 0, foo, targ, 0, &id);
    if(p != 0)      /* thr_create returns 0 on success */
        printf(" can not create thread ");
    while(1)
    {
        printf("%d", num);
    }
}

Without volatile, the compiler may use a register to store the num variable in the main thread, so the changes made to the value by the second thread are ignored. The volatile modifier is a way of telling the compiler that no such optimization should be applied to the variable: its value must not be kept in a register, and it may change through outside influence during evaluation.

# An automatic object declared in a function that calls setjmp and whose value is changed between the call to setjmp and a corresponding call to longjmp

Variables local to functions that call setjmp are the most common case. When an automatic object in a function that calls setjmp is declared volatile, the compilation system knows that it has to produce code that exactly matches what the programmer wrote. Therefore, the most recent value for such an automatic object will always be in memory (not just in a register) and as such will be guaranteed to be up-to-date when longjmp is called. Without volatile, the value of the variable is left undefined by the standard; whether the result one sees differs with or without optimization simply reflects that fact. Consider the following code:

/* header names were lost in the source; these are the ones the calls below need */
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

static jmp_buf buf;

main()
{
    volatile int b;

    b = 3;
    if(setjmp(buf) != 0) {
        printf("%d ", b);
        exit(0);
    }
    b = 5;
    longjmp(buf, 1);
}
    • A volatile variable isn't affected by the optimization, so the value of b after the longjmp is the last value assigned to it. Without volatile, b may or may not be restored to its last value when the longjmp occurs. For a clearer understanding, the assembly listings produced for the above program by the cc compiler are given below.

/* Listing 1: Assembly code fragment of the above program,
   produced by the cc compiler when volatile is used */
        .file "vol.c"
main:
        pushl %ebp
        movl %esp, %ebp
        subl $20, %esp
        movl $3, -4(%ebp)
        pushl $buf
        call _setjmp
        addl $16, %esp
        testl %eax, %eax
        je .L18
        subl $8, %esp
        movl -4(%ebp), %eax
        pushl %eax
        pushl $.LC0
        call printf
        movl $0, (%esp)
        call exit
        .p2align 2
.L18:
        movl $5, -4(%ebp)
        subl $8, %esp
        pushl $1
        pushl $buf
        call longjmp

/* Listing 2: Assembly code fragment of the above program,
   produced by the cc compiler without the volatile keyword */
        .file "wvol.c"
main:
        pushl %ebp
        movl %esp, %ebp
        subl $20, %esp
        pushl $buf
        call _setjmp
        addl $16, %esp
        testl %eax, %eax
        je .L18
        subl $8, %esp
        pushl $3
        pushl $.LC0
        call printf
        movl $0, (%esp)
        call exit
        .p2align 2
.L18:
        subl $8, %esp
        pushl $1
        pushl $buf
        call longjmp

/* Listing 3: difference between Listing 1 and Listing 2 */
< .file "vol.c"
---
> .file "wvol.c"
< movl $3, -4(%ebp)
< movl -4(%ebp), %eax
< pushl %eax
> pushl $3
< movl $5, -4(%ebp) /* store in stack */

From Listing 3 it is clear that when you use setjmp and longjmp, the only automatic variables guaranteed to remain valid are those declared volatile.

# An object modified by an interrupt service routine

Interrupt service routines often set variables that are tested in mainline code. One problem that arises as soon as you use interrupts is that interrupt routines need to communicate with the rest of the code. An interrupt may update a variable num that is used in the main line of code. An incorrect implementation of this might be:

    static long int num;

    void interrupt update(void)
    {
        ++num;
    }

    main()
    {
        long val;

        val = num;
        while (val != num)
            val = num;
        return val;
    }

When the compilation system processes the while statement, the optimizer may notice that it has already read the value of num once and that the value is still in a register. Instead of rereading the value from memory, the compiler may produce code that uses the (possibly stale) value in the register, defeating the purpose of the original C program. Some compilers may optimize away the entire while loop, assuming that since the value of num was just assigned to val, the two must be equal and the condition in the while statement will therefore always be false. To avoid this, you have to declare num to be volatile, which warns the compiler that the variable may change because of interrupt routines.

    static volatile long int num;

With the volatile keyword in the declaration, the compiler knows that the value of num must be read from memory every time it is referenced.

Conclusion

When exact semantics are necessary, volatile should be used. If you are given a piece of touchy code such as the above, it is a good idea in any case to look at the compiler's assembly-language output listing to be sure that the compiler produces code that makes sense.

About the author

Ashok K. Pathak is a member (Research Staff) at Bharat Electronics Limited (CRL), Ghaziabad. He has been developing embedded applications for the past five years. Ashok holds an M.E. in computer science and engineering. He recently completed a book, "Advanced Test in C and Embedded System Programming", published by BPB, ND. He can be reached at pathak@indiya.com.

6.38. Variables in Specified Registers

GNU C allows you to put a few global variables into specified hardware registers. You can also specify the register in which an ordinary register variable should be allocated.

Global register variables reserve registers throughout the program. This may be useful in programs such as programming language interpreters which have a couple of global variables that are accessed very often.

Local register variables in specific registers do not reserve the registers.
The compiler's data flow analysis is capable of determining where the specified registers contain live values, and where they are available for other uses. Stores into local register variables may be deleted when they appear to be dead according to dataflow analysis. References to local register variables may be deleted or moved or simplified.

These local variables are sometimes convenient for use with the extended asm feature (Section 6.35 Assembler Instructions with C Expression Operands), if you want to write one output of the assembler instruction directly into a particular register. (This will work provided the register you specify fits the constraints specified for that operand in the asm.)

6.38.1. Defining Global Register Variables

You can define a global register variable in GNU C like this:

    register int *foo asm ("a5");

Here a5 is the name of the register which should be used. Choose a register which is normally saved and restored by function calls on your machine, so that library routines will not clobber it.

Naturally the register name is CPU-dependent, so you would need to conditionalize your program according to CPU type. The register a5 would be a good choice on a 68000 for a variable of pointer type. On machines with register windows, be sure to choose a "global" register that is not affected magically by the function call mechanism.

In addition, operating systems on one type of CPU may differ in how they name the registers; then you would need additional conditionals. For example, some 68000 operating systems call this register %a5.

Eventually there may be a way of asking the compiler to choose a register automatically, but first we need to figure out how it should choose and how to enable you to guide the choice. No solution is evident.

Defining a global register variable in a certain register reserves that register entirely for this use, at least within the current compilation. The register will not be allocated for any other purpose in the functions in the current compilation. The register will not be saved and restored by these functions.
Stores into this register are never deleted even if they would appear to be dead, but references may be deleted or moved or simplified.

It is not safe to access the global register variables from signal handlers, or from more than one thread of control, because the system library routines may temporarily use the register for other things (unless you recompile them specially for the task at hand).

It is not safe for one function that uses a global register variable to call another such function foo by way of a third function lose that was compiled without knowledge of this variable (i.e. in a different source file in which the variable wasn't declared). This is because lose might save the register and put some other value there. For example, you can't expect a global register variable to be available in the comparison function that you pass to qsort, since qsort might have put something else in that register. (If you are prepared to recompile qsort with the same global register variable, you can solve this problem.)

If you want to recompile qsort or other source files which do not actually use your global register variable, so that they will not use that register for any other purpose, then it suffices to specify the compiler option -ffixed-reg. You need not actually add a global register declaration to their source code.

A function which can alter the value of a global register variable cannot safely be called from a function compiled without this variable, because it could clobber the value the caller expects to find there on return. Therefore, the function which is the entry point into the part of the program that uses the global register variable must explicitly save and restore the value which belongs to its caller.

On most machines, longjmp will restore to each global register variable the value it had at the time of the setjmp. On some machines, however, longjmp will not change the value of global register variables. To be portable, the function that called setjmp should make other arrangements to save the values of the global register variables, and to restore them in a longjmp. This way, the same thing will happen regardless of what longjmp does.

All global register variable declarations must precede all function definitions. If such a declaration could appear after function definitions, the declaration would be too late to prevent the register from being used for other purposes in the preceding functions.

Global register variables may not have initial values, because an executable file has no means to supply initial contents for a register.

On the SPARC, there are reports that g3 … g7 are suitable registers, but certain library functions, such as getwd, as well as the subroutines for division and remainder, modify g3 and g4. g1 and g2 are local temporaries.

The register storage class specifier

The register storage class specifier indicates to the compiler that the object should be stored in a machine register. The register storage class specifier is typically specified for heavily used variables, such as a loop control variable, in the hope of enhancing performance by minimizing access time. However, the compiler is not required to honor this request.
Because of the limited size and number of registers available on most systems, few variables can actually be put in registers. If the compiler does not allocate a machine register for a register object, the object is treated as having the storage class specifier auto.

An object having the register storage class specifier must be defined within a block or declared as a parameter to a function.

The following restrictions apply to the register storage class specifier:
- You cannot use pointers to reference objects that have the register storage class specifier.
- You cannot use the register storage class specifier when declaring objects in global scope.
- A register does not have an address. Therefore, you cannot apply the address operator (&) to a register variable.
- You cannot use the register storage class specifier when declaring objects in namespace scope.

C++ only

Unlike C, C++ lets you take the address of an object with the register storage class. For example:

    register int i;
    int* b = &i; // valid in C++, but not in C

End of C++ only

Storage duration of register variables

Objects with the register storage class specifier have automatic storage duration. Each time a block is entered, storage for register objects defined in that block is made available. When the block is exited, the objects are no longer available for use.

If a register object is defined within a function that is recursively invoked, memory is allocated for the variable at each invocation of the block.

Linkage of register variables

Since a register object is treated as the equivalent of an object of the auto storage class, it has no linkage.

Related information
- Initialization and storage classes
- Block/local scope
- References (C++ only)

IBM extension

Global variables in specified registers (C only)

You can specify that a particular hardware register is dedicated to a global variable by using an assembly global register variable declaration. Stores into the reserved register are never deleted. This language extension is provided for compatibility with GNU C.

Global register variable declaration syntax

    >>-register--variable_declaration------------------------------->

    >--+-asm-----+--("register_specifier")-------------------------><
       +-__asm__-+
       '-__asm---'

The register_specifier is a string representing a hardware register. The register name is CPU-specific. XL C supports the following register names on the PowerPC:
    • r0 to r31 -- general purpose registers
f0 to f31 -- floating point registers
v0 to v31 -- vector registers

Note: You cannot use registers r30 or r14 for register variable declarations, as these are reserved for use by the XL C++ runtime environment.

The following are the rules of use for register variables:
- General purpose registers can only be reserved for variables of integer or pointer type.
- Floating point registers can only be reserved for variables of float, double, or 64-bit long double type.
- Vector registers can only be reserved for variables of vector type.
- Variables of long long type cannot reserve registers.
- A global register variable cannot be initialized.
- The register dedicated to a global register variable should not be a volatile register, or the value stored into the global variable might not be preserved across a function call.
- A global register variable can only reserve a register that is not already reserved by another global register variable.
- The same global register variable cannot reserve more than one register.
- A register variable should not be used in an OpenMP clause or OpenMP parallel or work-sharing region.
- The register specified in the global register declaration is reserved for the declared variable only in the compilation unit in which the register declaration is specified. The register is not reserved in other compilation units unless you place the global register declaration in a common header file, or use the -qreserved_reg compiler option.

Lesson 3: The use and declaration of constants.

This lesson will show you how to:
- Use constants in C
- Declare constants in C
- the #define directive
- the keyword const

A constant is similar to a variable in the sense that it represents a memory location (or simply, a value). It is different from a normal variable, as I am sure you can guess, in that it cannot change its value in the program: it must stay constant.
In general, constants are useful because they can prevent program bugs and logical errors (errors are explained later). Unintended modifications are prevented from occurring. The compiler will catch attempts to reassign new values to constants.

Using #define
    • There are three techniques used to define constants in C. First, constants may be defined using the preprocessor directive #define (the preprocessor directives are explained in summary guide 1 in the advanced user category; don't be scared, they are very easy to understand). The preprocessor is a program that modifies your source file prior to compilation. Common preprocessor directives are #include, which is used to include additional code in your source file, and #define, which is used to define a constant. The convention is that constants are always in uppercase whereas normal variables are in lowercase. You don't have to follow this convention; it just makes things easier. Remember that this statement must not contain a semicolon, and must not contain the assignment operator (in English, the equals sign, =). The #define directive is used as follows:

    #define ID_NO 12345

Wherever the constant appears in your source file, the preprocessor replaces it by its value. So, for instance, if "pi" is defined as 3.1415, every "pi" in your source code will be replaced by 3.1415. The compiler will only see the value 3.1415 in your code, not "pi". Here is a simple program illustrating the preprocessor directive #define.

    #include <stdio.h>

    #define MONDAY 1
    #define TUESDAY 2
    #define WEDNESDAY 3
    #define THURSDAY 4
    #define FRIDAY 5
    #define SATURDAY 6
    #define SUNDAY 7

    int main(void)
    {
        int today = MONDAY;

        if ((today == SATURDAY) || (today == SUNDAY))
        {
            printf("Weekend\n");
        }
        else
        {
            printf("Go to work or school\n");
        }
        return 0;
    }
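To make the text substitution described above concrete, here is a minimal sketch. It assumes the names PI, circumference, is_known_id and define_self_check, which are hypothetical and not part of the lesson. Only ID_NO and the value 3.1415 come from the text.

```c
#include <assert.h>

#define ID_NO 12345      /* no semicolon and no '=' in a #define */
#define PI 3.1415        /* hypothetical name for the "pi" example */

/* After preprocessing, the body reads: return 2 * 3.1415 * radius; */
double circumference(double radius)
{
    return 2 * PI * radius;
}

/* ID_NO is replaced by the literal 12345 before compilation starts. */
int is_known_id(int id)
{
    return id == ID_NO;
}

/* Small self-check that can be called from main(). */
void define_self_check(void)
{
    assert(is_known_id(12345));
    assert(!is_known_id(1));
    assert(circumference(1.0) > 6.28 && circumference(1.0) < 6.29);
}
```

The compiler never sees the names PI or ID_NO, only the values that replaced them. For that reason a mistake inside a #define is usually reported at the line that uses the constant, not at the definition.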
    • Don't worry if you don't understand the weekday program above; just look at its basic structure, that's all. All will be revealed. Good things come to those who wait!

Using const variables

The second technique is to use the keyword const when defining a variable. When it is used, the compiler will catch attempts to modify variables that have been declared const.

    const float pi = 3.1415;
    const int id_no = 12345;

There are two main advantages over the first technique. First, the type of the constant is defined: "pi" is float and "id_no" is int. This allows some type checking by the compiler. Second, these constants are variables with a definite scope. The scope of a variable relates to the parts of your program in which it is defined. Some variables may exist only in certain functions or in certain blocks of code. You may want to use "id_no" in one function and a completely unrelated "id_no" in your main program. Sorry if this is confusing; the scope of variables will be covered in a later lesson.

Teach Yourself C in 21 Days
- 21 -
Advanced Compiler Use

Programming with Multiple Source Files
Advantages of Modular Programming
Modular Programming Techniques
Module Components
External Variables and Modular Programming
Using .OBJ Files
Using the make Utility
The C Preprocessor
The #define Preprocessor Directive
The #include Directive
Using #if, #elif, #else, and #endif
Using #if...#endif to Help Debug
Avoiding Multiple Inclusions of Header Files
The #undef Directive
Predefined Macros
Using Command-Line Arguments
Summary
Q&A
Workshop
Quiz
Exercises

This chapter, the last of your 21 days, covers some additional features of the C compiler. Today you will learn about
- Programming with multiple source-code files
- Using the C preprocessor
- Using command-line arguments

Programming with Multiple Source Files

Until now, all your C programs have consisted of a single source-code file, exclusive of header files. A single source-code file is often all you need, particularly for small programs, but you can also divide the source code for a single program among two or more files, a practice called modular programming. Why would you want to do this? The following sections explain.

Advantages of Modular Programming

The primary reason to use modular programming is closely related to structured programming and its reliance on functions. As you become a more experienced programmer, you develop more general-purpose functions that you can use, not only in the program for which they were originally written, but in other programs as well. For example, you might write a collection of general-purpose functions for displaying information on-screen. By keeping these functions in a separate file, you can use them again in different programs that also display information on-screen. When you write a program that consists of multiple source-code files, each source file is called a module.

Modular Programming Techniques

A C program can have only one main() function. The module that contains the main() function is called the main module, and other modules are called secondary modules. A separate header file is usually associated with each secondary module (you'll learn why later in this chapter). For now, look at a few simple examples that illustrate the basics of multiple module programming. Listings 21.1, 21.2, and 21.3 show the main module, the secondary module, and the header file, respectively, for a program that inputs a number from the user and displays its square.

Listing 21.1. 
LIST21_1.C: the main module.

    /* Inputs a number and displays its square. */

    #include <stdio.h>
    #include "calc.h"

    int main(void)
    {
        int x;

        printf("Enter an integer value: ");
        scanf("%d", &x);
        printf("\nThe square of %d is %ld.\n", x, sqr(x));
        return(0);
    }

Listing 21.2. CALC.C: the secondary module.

    /* Module containing calculation functions. */

    #include "calc.h"

    long sqr(int x)
    {
        return ((long)x * x);
    }

Listing 21.3. CALC.H: the header file for CALC.C.

    /* CALC.H: header file for CALC.C. */

    long sqr(int x);

    /* end of CALC.H */

Output:

    Enter an integer value: 100
    The square of 100 is 10000.

ANALYSIS: Let's look at the components of these three files in greater detail. The header file, CALC.H, contains the prototype for the sqr() function in CALC.C. Because any module that uses sqr() needs to know sqr()'s prototype, the module must include CALC.H.

The secondary module file, CALC.C, contains the definition of the sqr() function. The #include directive is used to include the header file, CALC.H. Note that the header filename is enclosed in quotation marks rather than angle brackets. (You'll learn the reason for this later in this chapter.)

The main module, LIST21_1.C, contains the main() function. This module also includes the header file, CALC.H.

After you use your editor to create these three files, how do you compile and link the final executable program? Your compiler controls this for you. At the command line, enter

    xxx list21_1.c calc.c
    • where xxx is your compiler's command. This directs the compiler's components to perform the following tasks:

1. Compile LIST21_1.C, creating LIST21_1.OBJ (or LIST21_1.O on a UNIX system). If it encounters any errors, the compiler displays descriptive error messages.
2. Compile CALC.C, creating CALC.OBJ (or CALC.O on a UNIX system). Again, error messages appear if necessary.
3. Link LIST21_1.OBJ, CALC.OBJ, and any needed functions from the standard library to create the final executable program LIST21_1.EXE.

Module Components

As you can see, the mechanics of compiling and linking a multiple-module program are quite simple. The only real question is what to put in each file. This section gives you some general guidelines.

The secondary module should contain general utility functions--that is, functions that you might want to use in other programs. A common practice is to create one secondary module for each type of function--for example, KEYBOARD.C for your keyboard functions, SCREEN.C for your screen display functions, and so on. To compile and link more than two modules, list all source files on the command line:

    tcc mainmod.c screen.c keyboard.c

The main module should contain main(), of course, and any other functions that are program-specific (meaning that they have no general utility).

There is usually one header file for each secondary module. Each file has the same name as the associated module, with an .H extension. In the header file, put
- Prototypes for functions in the secondary module
- #define directives for any symbolic constants and macros used in the module
- Definitions of any structures or external variables used in the module

Because this header file might be included in more than one source file, you want to prevent portions of it from compiling more than once.
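One common way to keep header contents from compiling twice is an include guard built from #ifndef, #define, and #endif. The sketch below simulates including CALC.H twice by pasting its guarded contents two times into a single file. The guard name CALC_H, the struct calc_result, and the helpers sqr_result and calc_self_check are assumptions made for illustration; only sqr() comes from the listings above.

```c
#include <assert.h>

/* ---- first "inclusion" of the header contents ---- */
#ifndef CALC_H
#define CALC_H
struct calc_result { long value; };   /* an error if defined twice */
long sqr(int x);
#endif

/* ---- second "inclusion": skipped, because CALC_H is now defined ---- */
#ifndef CALC_H
#define CALC_H
struct calc_result { long value; };
long sqr(int x);
#endif

/* Definition that would normally live in CALC.C. */
long sqr(int x)
{
    return (long)x * x;
}

/* Hypothetical helper that uses the structure from the header. */
struct calc_result sqr_result(int x)
{
    struct calc_result r = { sqr(x) };
    return r;
}

/* Small self-check that can be called from main(). */
void calc_self_check(void)
{
    assert(sqr(100) == 10000L);
    assert(sqr_result(-3).value == 9L);
}
```

Without the guard, the second copy of the struct definition would be rejected as a redefinition. With it, the preprocessor drops everything between the second #ifndef and its #endif.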
You can do this by using the preprocessor directives for conditional compilation (discussed later in this chapter).

External Variables and Modular Programming

In many cases, the only data communication between the main module and the secondary module is through arguments passed to and returned from the functions. In this case, you don't need to take special steps regarding data visibility, but what about an external variable that needs to be visible in both modules?

Recall from Day 12, "Understanding Variable Scope," that an external variable is one declared outside of any function. An external variable is visible throughout the entire source code file in which it is declared. However, it is not automatically visible in other modules. To make it visible, you must declare the variable in each module, using the extern keyword. For example, if you have an external variable declared in the main module as

    float interest_rate;

you make interest_rate visible in a secondary module by including the following declaration in that module (outside of any function):

    extern float interest_rate;

The extern keyword tells the compiler that the original declaration of interest_rate (the one that set aside storage space for it) is located elsewhere, but that the variable should be made visible in this module. All extern variables have static duration and are visible to all functions in the module. Figure 21.1 illustrates the use of the extern keyword in a multiple-module program.

Figure 21.1. Using the extern keyword to make an external variable visible across modules.

In Figure 21.1, the variable x is visible throughout all three modules. In contrast, y is visible only in the main module and secondary module 1.

Using .OBJ Files

After you've written and thoroughly debugged a secondary module, you don't need to recompile it every time you use it in a program.
Once you have the object file for the module code, all you need to do is link it with each program that uses the functions in the module.

When you compile a program, the compiler creates an object file that has the same name as the C source code file and an .OBJ extension (or an .O extension on UNIX systems). For example, suppose you're developing a module called KEYBOARD.C and compiling it, along with the main module DATABASE.C, using the following command:

    tcc database.c keyboard.c

The KEYBOARD.OBJ file is also on your disk. Once you know that the functions in KEYBOARD.C work properly, you can stop compiling it every time you recompile DATABASE.C (or any other program that uses it), linking the existing object file instead. To do this, use the command

    tcc database.c keyboard.obj

The compiler then compiles DATABASE.C and links the resulting object file DATABASE.OBJ with KEYBOARD.OBJ to create the final executable file DATABASE.EXE. This saves time, because the compiler doesn't have to recompile the code in KEYBOARD.C. However, if you modify the code in KEYBOARD.C, you must recompile it. In addition, if you modify a header file, you must recompile all the modules that use it.

DON'T try to compile multiple source files together if more than one module contains a main() function. You can have only one main().

DO create generic functions in their own source files. This way, they can be linked into any other programs that need them.

DON'T always use the C source files when compiling multiple files together. If you compile a source file into an object file, recompile only when the file changes. This saves a great deal of time.

Using the make Utility

Almost all C compilers come with a make utility that can simplify and speed the task of working with multiple source-code files. This utility, which is usually called NMAKE.EXE, lets you write a so-called make file that defines the dependencies between your various program components. What does dependency mean?

Imagine a project that has a main module named PROGRAM.C and a secondary module named SECOND.C. There are also two header files, PROGRAM.H and SECOND.H. PROGRAM.C includes both of the header files, whereas SECOND.C includes only SECOND.H. Code in PROGRAM.C calls functions in SECOND.C.

PROGRAM.C is dependent on the two header files because it includes them both. If you make a change to either header file, you must recompile PROGRAM.C so that it will include those changes. In contrast, SECOND.C is dependent on SECOND.H but not on PROGRAM.H.
If you change PROGRAM.H, there is no need to recompile SECOND.C--you can just link the existing object file SECOND.OBJ that was created when SECOND.C was last compiled.

A make file describes the dependencies, such as those just discussed, that exist in your project. Each time you edit one or more of your source code files, you use the NMAKE utility to "run" the make file. This utility examines the time and date stamps on the source code and object files and, based on the dependencies you have defined, instructs the compiler to recompile only those files that are dependent on the modified file(s). The result is that no unnecessary compilation is done, and you can work at the maximum efficiency.

For projects that involve one or two source code files, it usually isn't worth the trouble of defining a make file. For larger projects, however, it's a real benefit. Refer to your compiler documentation for information on how to use its NMAKE utility.

The C Preprocessor

The preprocessor is a part of all C compiler packages. When you compile a C program, the preprocessor is the first compiler component that processes your program. In most C compilers, the preprocessor is part of the compiler program. When you run the compiler, it automatically runs the preprocessor.

The preprocessor changes your source code based on instructions, or preprocessor directives, in the source code. The output of the preprocessor is a modified source code file that is then used as the input for the next compilation step. Normally you never see this file, because the compiler deletes it after it's used. However, later in this chapter you'll learn how to look at this intermediate file. First, you need to learn about the preprocessor directives, all of which begin with the # symbol.

The #define Preprocessor Directive

The #define preprocessor directive has two uses: creating symbolic constants and creating macros.

Simple Substitution Macros Using #define

You learned about substitution macros on Day 3, "Storing Data: Variables and Constants," although the term used to describe them in that chapter was symbolic constants. You create a substitution macro by using #define to replace text with other text. For example, to replace text1 with text2, you write

    #define text1 text2

This directive causes the preprocessor to go through the entire source code file, replacing every occurrence of text1 with text2. The only exception occurs if text1 is found within double quotation marks, in which case no change is made.

The most frequent use for substitution macros is to create symbolic constants, as explained on Day 3.
For example, if your program contains the following lines:

    #define MAX 1000
    x = y * MAX;
    z = MAX - 12;
    • during preprocessing, the source code is changed to read as follows:

    x = y * 1000;
    z = 1000 - 12;

The effect is the same as using your editor's search-and-replace feature in order to change every occurrence of MAX to 1000. Your original source code file isn't changed, of course. Instead, a temporary copy is created with the changes. Note that #define isn't limited to creating symbolic numeric constants. For example, you could write

    #define ZINGBOFFLE printf
    ZINGBOFFLE("Hello, world.");

although there is little reason to do so. You should also be aware that some authors refer to symbolic constants defined with #define as being macros themselves. (Symbolic constants are also called manifest constants.) However, in this book, the word macro is reserved for the type of construction described next.

Creating Function Macros with #define

You can use the #define directive also to create function macros. A function macro is a type of shorthand, using something simple to represent something more complicated. The reason for the "function" name is that this type of macro can accept arguments, just like a real C function does. One advantage of function macros is that their arguments aren't type-sensitive. Therefore, you can pass any numeric variable type to a function macro that expects a numeric argument.

Let's look at an example. The preprocessor directive

    #define HALFOF(value) ((value)/2)

defines a macro named HALFOF that takes a parameter named value. Whenever the preprocessor encounters the text HALFOF(value) in the source code, it replaces it with the definition text and inserts the argument as needed. Thus, the source code line

    result = HALFOF(10);

is replaced by this line:

    result = ((10)/2);

Likewise, the program line

    printf("%f", HALFOF(x[1] + y[2]));

is replaced by this line:
    • printf("%f", ((x[1] + y[2])/2));

A macro can have more than one parameter, and each parameter can be used more than once in the replacement text. For example, the following macro, which calculates the average of five values, has five parameters:

    #define AVG5(v, w, x, y, z) (((v)+(w)+(x)+(y)+(z))/5)

The following macro, in which the conditional operator determines the larger of two values, uses each of its parameters twice. (You learned about the conditional operator on Day 4, "Statements, Expressions, and Operators.")

    #define LARGER(x, y) ((x) > (y) ? (x) : (y))

A macro can have as many parameters as needed, and every parameter in the list should be used in the substitution string. For example, the macro definition

    #define ADD(x, y, z) ((x) + (y))

is accepted by a standard C preprocessor, but the parameter z is not used in the substitution string, which is misleading and usually indicates a mistake. Also, when you invoke the macro, you must pass it the correct number of arguments.

When you write a macro definition, the opening parenthesis must immediately follow the macro name; there can be no white space. The opening parenthesis tells the preprocessor that a function macro is being defined and that this isn't a simple symbolic constant type substitution. Look at the following definition:

    #define SUM (x, y, z) ((x)+(y)+(z))

Because of the space between SUM and (, the preprocessor treats this like a simple substitution macro. Every occurrence of SUM in the source code is replaced with (x, y, z) ((x)+(y)+(z)), clearly not what you wanted.

Also note that in the substitution string, each parameter is enclosed in parentheses. This is necessary to avoid unwanted side effects when passing expressions as arguments to the macro. Look at the following example of a macro defined without parentheses:

    #define SQUARE(x) x*x

If you invoke this macro with a simple variable as an argument, there's no problem. But what if you pass an expression as an argument?

    result = SQUARE(x + y);

The resulting macro expansion is as follows, which doesn't give the proper result:
    • result = x + y * x + y;

If you use parentheses, you can avoid the problem, as shown in this example:

    #define SQUARE(x) (x)*(x)

This definition expands to the following line, which does give the proper result:

    result = (x + y) * (x + y);

You can obtain additional flexibility in macro definitions by using the stringizing operator (#) (sometimes called the string-literal operator). When a macro parameter is preceded by # in the substitution string, the argument is converted into a quoted string when the macro is expanded. Thus, if you define a macro as

    #define OUT(x) printf(#x)

and you invoke it with the statement

    OUT(Hello Mom);

it expands to this statement:

    printf("Hello Mom");

The conversion performed by the stringizing operator takes special characters into account. Thus, if a character in the argument normally requires an escape character, the # operator inserts a backslash before the character. Continuing with the example, the invocation

    OUT("Hello Mom");

expands to

    printf("\"Hello Mom\"");

The # operator is demonstrated in Listing 21.4. First, you need to look at one other operator used in macros, the concatenation operator (##). This operator concatenates, or joins, two strings in the macro expansion. It doesn't include quotation marks or special treatment of escape characters. Its main use is to create sequences of C source code. For example, if you define and invoke a macro as

    #define CHOP(x) func ## x
    salad = CHOP(3)(q, w);
    • the macro invoked in the second line is expanded to

    salad = func3 (q, w);

You can see that by using the ## operator, you determine which function is called. You have actually modified the C source code.

Listing 21.4 shows an example of one way to use the # operator.

Listing 21.4. Using the # operator in macro expansion.

1: /* Demonstrates the # operator in macro expansion. */
2:
3: #include <stdio.h>
4:
5: #define OUT(x) printf(#x " is equal to %d.\n", x)
6:
7: int main(void)
8: {
9:     int value = 123;
10:     OUT(value);
11:     return(0);
12: }

value is equal to 123.

ANALYSIS: By using the # operator on line 5, the call to the macro expands with the variable name value as a quoted string passed to the printf() function. After expansion on line 10, the macro OUT looks like this:

    printf("value" " is equal to %d.\n", value);

Macros Versus Functions

You have seen that function macros can be used in place of real functions, at least in situations where the resulting code is relatively short. Function macros can extend beyond one line but usually become impractical beyond a few lines. When you can use either a function or a macro, which should you use? It's a trade-off between program speed and program size.

A macro's definition is expanded into the code each time the macro is encountered in the source code. If your program invokes a macro 100 times, 100 copies of the expanded macro code are in the final program. In contrast, a function's code exists only as a single copy. Therefore, in terms of program size, the better choice is a true function.

When a program calls a function, a certain amount of processing overhead is required in order to pass execution to the function code and then return execution to the calling program. There is no processing overhead in "calling" a macro, because the code is right there in the program. In terms of speed, a function macro has the advantage.
    • These size/speed considerations aren't usually of much concern to the beginning programmer. Only with large, time-critical applications do they become important.

Viewing Macro Expansion

At times, you might want to see what your expanded macros look like, particularly when they aren't working properly. To see the expanded macros, you instruct the compiler to create a file that includes macro expansion after the compiler's first pass through the code. You might not be able to do this if your C compiler uses an Integrated Development Environment (IDE); you might have to work from the command prompt. Most compilers have a flag that should be set during compilation. This flag is passed to the compiler as a command-line parameter.

For example, to precompile a program named PROGRAM.C with the Microsoft compiler, you would enter

    cl /E program.c

On a UNIX compiler, you would enter

    cc -E program.c

The preprocessor makes the first pass through your source code. All header files are included, #define macros are expanded, and other preprocessor directives are carried out. Depending on your compiler, the output goes either to stdout (that is, the screen) or to a disk file with the program name and a special extension. The Microsoft compiler sends the preprocessed output to stdout. Unfortunately, it's not at all useful to have the processed code whip by on your screen! You can use the redirection command to send this output to a file, as in this example:

    cl /E program.c > program.pre

You can then load the file into your editor for printing or viewing.

DO use #defines, especially for symbolic constants. Symbolic constants make your code much easier to read. Examples of things to put into defined constants are colors, true/false, yes/no, the keyboard keys, and maximum values. Symbolic constants are used throughout this book.

DON'T overuse macro functions.
Use them where needed, but be sure they are a better choice than a normal function.

The #include Directive

You have already learned how to use the #include preprocessor directive to include header files in your program. When it encounters an #include directive, the preprocessor
    • reads the specified file and inserts it at the location of the directive. You can't use the * or ? wildcards to read in a group of files with one #include directive. You can, however, nest #include directives. In other words, an included file can contain #include directives, which can contain #include directives, and so on. Most compilers limit the number of levels deep that you can nest, but you usually can nest up to 10 levels.

There are two ways to specify the filename for an #include directive. If the filename is enclosed in angle brackets, such as #include <stdio.h> (as you have seen throughout this book), the preprocessor first looks for the file in the standard directory. If the file isn't found, or no standard directory is specified, the preprocessor looks for the file in the current directory.

"What is the standard directory?" you might be asking. In DOS, it's the directory or directories specified by the DOS INCLUDE environment variable. Your DOS documentation contains complete information on the DOS environment. To summarize, however, you set an environment variable with a SET command (usually, but not necessarily, in your AUTOEXEC.BAT file). Most compilers automatically set the INCLUDE variable in the AUTOEXEC.BAT file when the compiler is installed.

The second method of specifying the file to be included is enclosing the filename in double quotation marks: #include "myfile.h". In this case, the preprocessor doesn't search the standard directories; instead, it looks in the directory containing the source code file being compiled. Generally speaking, header files that you write should be kept in the same directory as the C source code files, and they are included by using double quotation marks. The standard directory is reserved for header files supplied with your compiler.

Using #if, #elif, #else, and #endif

These four preprocessor directives control conditional compilation.
The term conditional compilation means that blocks of C source code are compiled only if certain conditions are met. In many ways, the #if family of preprocessor directives operates like the C language's if statement. The difference is that if controls whether certain statements are executed, whereas #if controls whether they are compiled.

The structure of an #if block is as follows:

    #if condition_1
    statement_block_1
    #elif condition_2
    statement_block_2
    ...
    #elif condition_n
    statement_block_n
    #else
    default_statement_block
    • #endif

The test expression that #if uses can be almost any expression that evaluates to a constant. You can't use the sizeof() operator, typecasts, or the float type. Most often you use #if to test symbolic constants created with the #define directive.

Each statement_block consists of one or more C statements of any type, including preprocessor directives. They don't need to be enclosed in braces, although they can be.

The #if and #endif directives are required, but #elif and #else are optional. You can have as many #elif directives as you want, but only one #else. When the compiler reaches an #if directive, it tests the associated condition. If it evaluates to TRUE (nonzero), the statements following the #if are compiled. If it evaluates to FALSE (zero), the compiler tests, in order, the conditions associated with each #elif directive. The statements associated with the first TRUE #elif are compiled. If none of the conditions evaluates as TRUE, the statements following the #else directive are compiled.

Note that, at most, a single block of statements within the #if...#endif construction is compiled. If the compiler finds no #else directive, it might not compile any statements.

The possible uses of these conditional compilation directives are limited only by your imagination. Here's one example. Suppose you're writing a program that uses a great deal of country-specific information. This information is contained in a header file for each country. When you compile the program for use in different countries, you can use an #if...#endif construction as follows:

    #if ENGLAND == 1
    #include "england.h"
    #elif FRANCE == 1
    #include "france.h"
    #elif ITALY == 1
    #include "italy.h"
    #else
    #include "usa.h"
    #endif

Then, by using #define to define the appropriate symbolic constant, you can control which header file is included during compilation.

Using #if...#endif to Help Debug

Another common use of #if...#endif is to include conditional debugging code in the program.
You could define a DEBUG symbolic constant set to either 1 or 0. Throughout the program, you can insert debugging code as follows:

    #if DEBUG == 1
    • debugging code here
    #endif

During program development, if you define DEBUG as 1, the debugging code is included to help track down any bugs. After the program is working properly, you can redefine DEBUG as 0 and recompile the program without the debugging code.

The defined() operator is useful when you write conditional compilation directives. This operator tests to see whether a particular name is defined. Thus, the expression

    defined( NAME )

evaluates to TRUE or FALSE, depending on whether NAME is defined. By using defined(), you can control compilation, based on previous definitions, without regard to the specific value of a name. Referring to the previous debugging code example, you could rewrite the #if...#endif section as follows:

    #if defined( DEBUG )
    debugging code here
    #endif

You can also use defined() to assign a definition to a name only if it hasn't been previously defined. Use the NOT operator (!) as follows:

    #if !defined( TRUE ) /* if TRUE is not defined. */
    #define TRUE 1
    #endif

Notice that the defined() operator doesn't require that a name be defined as anything in particular. For example, after the following program line, the name RED is defined, but not as anything in particular:

    #define RED

Even so, the expression defined( RED ) still evaluates as TRUE. Of course, occurrences of RED in the source code are removed and not replaced with anything, so you must use caution.

Avoiding Multiple Inclusions of Header Files

As programs grow, or as you use header files more often, you run the risk of accidentally including a header file more than once. This can cause the compiler to balk in confusion. Using the directives you've learned, you can easily avoid this problem. Look at the example shown in Listing 21.5.

Listing 21.5. Using preprocessor directives with header files.
    • 1: /* PROG.H - A header file with a check to prevent multiple includes! */
2:
3: #if defined( PROG_H )
4: /* the file has been included already */
5: #else
6: #define PROG_H
7:
8: /* Header file information goes here... */
9:
10:
11:
12: #endif

ANALYSIS: Examine what this header file does. On line 3, it checks whether PROG_H is defined. Notice that PROG_H is similar to the name of the header file. If PROG_H is defined, a comment is included on line 4, and the program looks for the #endif at the end of the header file. This means that nothing more is done.

How does PROG_H get defined? It is defined on line 6. The first time this header is included, the preprocessor checks whether PROG_H is defined. It won't be, so control goes to the #else statement. The first thing done after the #else is to define PROG_H so that any other inclusions of this file skip the body of the file. Lines 7 through 11 can contain any number of commands or declarations.

The #undef Directive

The #undef directive is the opposite of #define--it removes the definition from a name. Here's an example:

    #define DEBUG 1
    /* In this section of the program, occurrences of DEBUG */
    /* are replaced with 1, and the expression defined( DEBUG ) */
    /* evaluates to TRUE. */
    #undef DEBUG
    /* In this section of the program, occurrences of DEBUG */
    /* are not replaced, and the expression defined( DEBUG ) */
    /* evaluates to FALSE. */

You can use #undef and #define to create a name that is defined only in parts of your source code. You can use this in combination with the #if directive, as explained earlier, for more control over conditional compilations.

Predefined Macros

Most compilers have a number of predefined macros. The most useful of these are __DATE__, __TIME__, __LINE__, and __FILE__. Notice that each of these is
    • preceded and followed by double underscores. This is done to prevent you from redefining them, on the theory that programmers are unlikely to create their own definitions with leading and trailing underscores.

These macros work just like the macros described earlier in this chapter. When the precompiler encounters one of these macros, it replaces the macro with the macro's code. __DATE__ and __TIME__ are replaced with the current date and time. This is the date and time the source file is precompiled. This can be useful information as you're working with different versions of a program. By having a program display its compilation date and time, you can tell whether you're running the latest version of the program or an earlier one.

The other two macros are even more valuable. __LINE__ is replaced by the current source-file line number. __FILE__ is replaced with the current source-code filename. These two macros are best used when you're trying to debug a program or deal with errors. Consider the following printf() statement:

31:
32: printf( "Program %s: (%d) Error opening file ", __FILE__, __LINE__ );
33:

If these lines were part of a program called MYPROG.C, they would print

    Program MYPROG.C: (32) Error opening file

This might not seem important at this point, but as your programs grow and spread across multiple source files, finding errors becomes more difficult. Using __LINE__ and __FILE__ makes debugging much easier.

DO use the __LINE__ and __FILE__ macros to make your error messages more helpful.

DON'T forget the #endif when using the #if statement.

DO put parentheses around the value to be passed to a macro. This prevents errors. For example, use this:

    #define CUBE(x) (x)*(x)*(x)

instead of this:

    #define CUBE(x) x*x*x

Using Command-Line Arguments

Your C program can access arguments passed to the program on the command line. This refers to information entered after the program name when you start the program.
If you start a program named PROGNAME from the C:\> prompt, for example, you could enter
    • C:\>progname smith jones

The two command-line arguments smith and jones can be retrieved by the program during execution. You can think of this information as arguments passed to the program's main() function. Such command-line arguments permit information to be passed to the program at startup rather than during execution, which can be convenient at times. You can pass as many command-line arguments as you like. Note that command-line arguments can be retrieved only within main(). To do so, declare main() as follows:

    int main(int argc, char *argv[])
    {
        /* Statements go here */
    }

The first parameter, argc, is an integer giving the number of command-line arguments available. This value is always at least 1, because the program name is counted as the first argument. The parameter argv[] is an array of pointers to strings. The valid subscripts for this array are 0 through argc - 1. The pointer argv[0] points to the program name (including path information), argv[1] points to the first argument that follows the program name, and so on. Note that the names argc and argv[] aren't required--you can use any valid C variable names you like to receive the command-line arguments. However, these two names are traditionally used for this purpose, so you should probably stick with them.

The command line is divided into discrete arguments by any white space. If you need to pass an argument that includes a space, enclose the entire argument in double quotation marks. For example, if you enter

    C:\>progname smith "and jones"

smith is the first argument (pointed to by argv[1]), and and jones is the second (pointed to by argv[2]). Listing 21.6 demonstrates how to access command-line arguments.

Listing 21.6. Passing command-line arguments to main().

1: /* Accessing command-line arguments. */
2:
3: #include <stdio.h>
4:
5: int main(int argc, char *argv[])
6: {
7:     int count;
8:
9:     printf("Program name: %s\n", argv[0]);
10:
11:     if (argc > 1)
12:     {
    • 13:         for (count = 1; count < argc; count++)
14:             printf("Argument %d: %s\n", count, argv[count]);
15:     }
16:     else
17:         puts("No command line arguments entered.");
18:     return(0);
19: }

list21_6
Program name: C:\LIST21_6.EXE
No command line arguments entered.

list21_6 first second "3 4"
Program name: C:\LIST21_6.EXE
Argument 1: first
Argument 2: second
Argument 3: 3 4

ANALYSIS: This program does no more than print the command-line parameters entered by the user. Notice that line 5 uses the argc and argv parameters shown previously. Line 9 prints the one command-line parameter that you always have, the program name. Notice this is argv[0]. Line 11 checks to see whether there is more than one command-line parameter. Why more than one and not more than zero? Because there is always at least one--the program name. If there are additional arguments, a for loop prints each to the screen (lines 13 and 14). Otherwise, an appropriate message is printed (line 17).

Command-line arguments generally fall into two categories: those that are required because the program can't operate without them, and those that are optional, such as flags that instruct the program to act in a certain way. For example, imagine a program that sorts the data in a file. If you write the program to receive the input filename from the command line, the name is required information. If the user forgets to enter the input filename on the command line, the program must somehow deal with the situation. The program could also look for the argument /r, which signals a reverse-order sort.
This argument isn't required; the program looks for it and behaves one way if it's found and another way if it isn't.

DO use argc and argv as the variable names for the command-line arguments for main(). Most C programmers are familiar with these names.

DON'T assume that users will enter the correct number of command-line parameters. Check to be sure they did, and if not, display a message explaining the arguments they should enter.

Summary

This chapter covered some of the more advanced programming tools available with C compilers. You learned how to write a program that has source code divided among multiple files or modules. This practice, called modular programming, makes it easy to reuse general-purpose functions in more than one program. You saw how you can use
    • preprocessor directives to create function macros, for conditional compilation, and other tasks. Finally, you saw that the compiler provides some predefined macros for you.

Q&A

Q When compiling multiple files, how does the compiler know which filename to use for the executable file?

A You might think the compiler uses the name of the file containing the main() function; however, this isn't usually the case. When compiling from the command line, the first file listed is used to determine the name. For example, if you compiled the following with Borland's Turbo C, the executable would be called FILE1.EXE:

    tcc file1.c main.c prog.c

Q Do header files need to have an .H extension?

A No. You can give a header file any name you want. It is standard practice to use the .H extension.

Q When including header files, can I use an explicit path?

A Yes. If you want to state the path where a file to be included is, you can. In such a case, you put the name of the include file between quotation marks.

Q Are all the predefined macros and preprocessor directives presented in this chapter?

A No. The predefined macros and directives presented in this chapter are ones common to most compilers. However, most compilers also have additional macros and constants.

Q Is the following header also acceptable when using main() with command-line parameters?

    main( int argc, char **argv);

A You can probably answer this one on your own. This declaration uses a pointer to a character pointer instead of an array of character pointers. Because an array parameter is treated as a pointer to its first element, this declaration is equivalent to the one presented in this chapter. This declaration is also commonly used. (See Day 8, "Using Numeric Arrays," and Day 10, "Characters and Strings," for more details.)

Workshop

The Workshop provides quiz questions to help you solidify your understanding of the material covered and exercises to provide you with experience in using what you've learned.

Quiz

1. What does the term modular programming mean?
    • 2. In modular programming, what is the main module?
3. When you define a macro, why should each argument be enclosed in parentheses?
4. What are the pros and cons of using a macro in place of a regular function?
5. What does the defined() operator do?
6. What must always be used if #if is used?
7. What extension do compiled C files have? (Assume that they have not been linked.)
8. What does #include do?
9. What is the difference between this line of code:
    #include <myfile.h>
and this line of code:
    #include "myfile.h"
10. What is __DATE__ used for?
11. What does argv[0] point to?

Exercises

Because many solutions are possible for the following exercises, answers are not provided.

1. Use your compiler to compile multiple source files into a single executable file. (You can use Listings 21.1, 21.2, and 21.3 or your own listings.)
2. Write an error routine that receives an error number, line number, and module name. The routine should print a formatted error message and then exit the program. Use the predefined macros for the line number and module name (pass the line number and module name from the location where the error occurs). Here's a possible example of a formatted error:
    module.c (Line ##): Error number ##
3. Modify exercise 2 to make the error message more descriptive. Create a text file with your editor that contains an error number and message. Call this file ERRORS.TXT. It could contain information such as the following:
    1 Error number 1
    2 Error number 2
    90 Error opening file
    • 100 Error reading file
Have your error routine search this file and display the appropriate error message based on a number passed to it.
4. Some header files might be included more than once when you're writing a modular program. Use preprocessor directives to write the skeleton of a header file that compiles only the first time it is encountered during compilation.
5. Write a program that takes two filenames as command-line parameters. The program should copy the first file into the second file. (See Day 16, "Using Disk Files," if you need help working with files.)
6. This is the last exercise of the book, and its content is up to you. Select a programming task of interest to you that also meets a real need you have. For example, you could write programs to catalog your compact disc collection, keep track of your checkbook, or calculate financial figures related to a planned house purchase. There's no substitute for tackling a real-world programming problem in order to sharpen your programming skills and help you remember all the things you learned in this book.

© Copyright, Macmillan Computer Publishing. All rights reserved.