Operant conditioning, also known as Type R conditioning, emphasizes the response. It is a type of learning that modifies behavior through the consequences imposed on it. Two general principles are associated with Type R conditioning. First, any response that is followed by a reinforcing stimulus tends to be repeated. Second, a reinforcing stimulus is anything that increases the rate at which the operant response occurs (Olson & Hergenhahn, 2013).

Reinforcement is anything that has an effect on a behavior. Reinforcement is contingent, or dependent, when it is delivered only after the organism produces a certain response. To modify behavior, the psychologist has to determine what is reinforcing to the organism, wait until the desired behavior occurs, and then reinforce the organism immediately (Olson & Hergenhahn, 2013). When the desired behavior happens more often, learning has occurred.

The Skinner box is a device that B. F. Skinner originally created to test animals. It was arranged so that when an animal depresses a lever inside the box, a small pellet of food is released for the animal to consume (Olson & Hergenhahn, 2013). Skinner used a device he called the cumulative recorder to track the animal's behavior. The Skinner box taught the animal in the experiment to perform certain skills in response to specific stimuli, such as a light turning on. When the animal correctly performs the behavior, it is rewarded; the animal has learned from the conditioning how to receive food from the machine.

A schedule of reinforcement is an important part of the learning process because when and how often behavior is reinforced can change the strength of the response. Behavior might be reinforced every time it is observed, which is known as a continuous reinforcement schedule. A fixed-interval reinforcement schedule reinforces the animal for a response made after a set interval of time has elapsed.
The behavior of an animal under this schedule resembles the way a person behaves when a deadline approaches and activity increases (Olson & Hergenhahn, 2013). A fixed-ratio reinforcement schedule reinforces an animal only after it has responded a fixed number of times. A variable-interval reinforcement schedule reinforces responses made at the end of time intervals of varying length (Olson & Hergenhahn, 2013). A continuous schedule is good for initial teaching, but as learning progresses a partial schedule is often preferable.
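The schedule rules above can be expressed as simple predicates. The sketch below (the function names and the FR-5 and 30-second parameters are illustrative assumptions, not values from the literature) decides whether a given response earns reinforcement under a fixed-ratio or a fixed-interval schedule.

```python
def fixed_ratio_reinforces(response_count, ratio=5):
    """FR schedule: reinforce only every `ratio`-th response (e.g., FR-5)."""
    return response_count % ratio == 0

def fixed_interval_reinforces(now, last_reinforcement, interval=30.0):
    """FI schedule: reinforce the first response made after `interval`
    seconds have elapsed since the last reinforcement."""
    return now - last_reinforcement >= interval

print(fixed_ratio_reinforces(10, ratio=5))    # True: the 10th response on FR-5
print(fixed_interval_reinforces(45.0, 10.0))  # True: 35 s elapsed >= 30 s
```

A continuous schedule is just the special case `ratio=1`: every response is reinforced.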
Radical behaviorism is the philosophy underlying the approach to psychology known as the experimental analysis of behavior, developed by B. F. Skinner. Radical behaviorism is concerned with the behavior of organisms, not with internal processing; in this respect it is a form of methodological behaviorism. Finally, radical behaviorism understands behavior as a reflection of frequency effects among stimuli (response conditioning), which makes it a form of psychological behaviorism (Rapanelli, Frick, & Zanutto, 2011).

Shaping is the process of teaching a complex behavior by rewarding closer and closer approximations of the desired behavior. Differential reinforcement means some responses are reinforced and others are not, whereas successive approximation refers to the fact that only those responses that become increasingly similar to the one the experimenter wants are reinforced (Olson & Hergenhahn, 2013).

Positive reinforcement is the process whereby presentation of a stimulus (a reward or payoff) after a behavior makes the behavior likely to occur again. Although positive reinforcement usually leads to adaptive responding, nothing guarantees that organisms will make the 'right' connections between behaviors and consequences (Peter, Paul, & Walter, 2011). Negative reinforcement is the elimination of an aversive consequence: an unpleasant stimulus that strengthens behavior by its removal. Negative reinforcement occurs in both escape learning and avoidance learning.

Schedules of reinforcement describe the different patterns of frequency and timing of reinforcement following desired behavior.
For example, a continuous reinforcement schedule reinforces a behavior every time it occurs, whereas a partial (intermittent) reinforcement schedule reinforces a behavior some, but not all, of the time (Nargeot & Simmers, 2011).

Just as humans and other animals can develop phobias by forming idiosyncratic associations, they can erroneously associate an operant and an environmental event, a phenomenon Skinner (1948) labeled superstitious behavior (Olson & Hergenhahn, 2013). Skinner believed our major concern regarding education should be the basic relationship between classes of stimuli and classes of responses. Moreover, Skinner felt strongly about the need for technology in and out of the classroom (Peter, Paul, & Walter, 2011).
Behaviorism generated a type of therapy known as behavior therapy (see Rimm & Masters, 1974; Erwin, 1978). It developed behavior-management techniques for autistic children (see Lovaas & Newsom, 1976) and token economies for the management of chronic schizophrenics (see Stahl & Leitenberg, 1976). It fueled discussions of how best to understand the behavior of nonhuman animals, the relevance of laboratory study to behavior as it occurs in the natural environment, and whether there is a built-in associative bias in learning (Rapanelli, Frick, & Zanutto, 2011).

Several techniques used in schools draw on operant conditioning. For example, positive or desirable changes in behavior at school are often rewarded with prizes such as gold stars, extra recess time, small toys, and extra computer time. A token economy, most common in elementary school, rewards desired behavior. Time-out teaches a child who hates to sit still that there is a negative payoff for poor behavior (Olson & Hergenhahn, 2013).

According to Winters and Wallace (1970, p. 41), consumers watched one commercial more actively than another when the groups were similar in age and sex, and less actively when they were dissimilar. Additionally, the relationship of response rates to magnitude of reinforcement involves choice between commercials and print ads. A body of data indicates that operant conditioning is a reliable and valid measure of attention and interest, and that CONPAAD scores explain the wear-out of attention and recall of television commercials (Winters & Wallace, 1970).
Early behavior therapy shaped behavior, not thought. Successive generations of behavior therapy have relaxed those conceptual restrictions, and advocates now refer to themselves as cognitive behavior therapists (Montague & Berns, 2002). Clients' behavior problems are described by referring to their beliefs, desires, intentions, memories, and so on. Even the language of self-reflexive thought and belief (so-called 'meta-cognition') figures in some accounts of behavioral difficulties and interventions. One goal of such language is to encourage clients to monitor and self-reinforce their own behavior; self-reinforcement is an essential feature of behavioral self-control (Rapanelli, Frick, & Zanutto, 2011).

The brain mechanisms underlying reinforcement also form the centerpiece of one of the most active research programs in current neuroscience, so-called neuroeconomics, which weds the study of the brain's reward systems with models of valuation and economic decision making (Montague & Berns, 2002).

Skinner began the redesign of classroom instruction by presenting material in small steps and allowing students to learn at their own pace. Programmed instruction and the Personalized System of Instruction (PSI) were the beginnings of the computer-based instruction (CBI) format now referred to as online learning (Olson & Hergenhahn, 2013).
The core of B. F. Skinner's system was operant conditioning. He firmly believed that each individual operates in a certain environment and is continually affected by its consequences. Every behavior a person enacts is followed by a consequence, and these consequences affect the individual's tendency to repeat that behavior. Positive reinforcement leads to repeated behavior; however, if the reinforcing stimulus subsides, the likelihood of the behavior occurring again decreases (Boeree, 2006). Radical behaviorism, often known as stimulus-response psychology, includes four elements: stimulus, response, reinforcement, and an implied state of deprivation (Baum, 2011). Skinner believed the choices an individual makes stem not only from the environment and how it evolves, but also from genetics and natural selection (Baum, 2011).
Stimulus sampling theory is an attempt to determine how learning happens. William Estes believed that, at its simplest level, learning is a response attached to a stimulus in a single trial (Olson & Hergenhahn, 2013). Upon further investigation, stimulus sampling theory becomes much more complex. For instance, although a person might learn a response to a single stimulus, the overall learning process involves many responses to many different stimuli, such as a person's responses to the temperature in the room, noise inside or outside the room, and other aspects of the environment. Using statistics and algebraic formulas, Estes developed a model that effectively deals with this complexity of learning (Olson & Hergenhahn, 2013).

Probability matching is traditionally studied in experiments involving a signal light that is followed by one of two other lights (Olson & Hergenhahn, 2013). When the organism sees the signal light, it must guess which of the two lights will turn on. The experimenter can choose any pattern for the two lights; for example, the right-side light may be programmed to turn on 30% of the time and the left-side light 70% of the time. The interesting thing about this experiment is that participants usually end up guessing each light at about the rate it actually occurs, matching the programmed pattern. Estes developed a mathematical formula that predicts this probability matching by the subject.

Estes performed a number of experiments showing that learning occurs in an all-or-none fashion. This rapid change from an unlearned state to a learned state is similar to a Markov process (Olson & Hergenhahn, 2013). Markov was a Russian mathematician who showed how the future values of random variables can be statistically determined; in a Markov process, change occurs as an abrupt, stepwise shift in response probabilities rather than a relatively slow, gradual change from trial to trial.
Estes's results supported the idea that when something is learned, it is completely learned; if it is not completely learned, it is not learned at all (Olson & Hergenhahn, 2013).
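The 70/30 light experiment can be sketched as a short simulation. The matching prediction is that the subject's long-run guess rate for each light approaches that light's actual probability. This sketch uses a simple linear-operator update rule; the function name, the step size `theta`, and the update rule itself are illustrative assumptions, not Estes's exact formulation.

```python
import random

def simulate_probability_matching(pi=0.7, theta=0.05, trials=20000, seed=1):
    """Simulate a subject guessing which of two lights will turn on.
    After each trial the subject's probability of guessing 'left' moves a
    small step theta toward the outcome just observed (1 = left lit)."""
    rng = random.Random(seed)
    p_left = 0.5  # start indifferent between the two lights
    history = []
    for _ in range(trials):
        outcome = 1 if rng.random() < pi else 0
        p_left += theta * (outcome - p_left)  # stepwise update toward outcome
        history.append(p_left)
    # average guess probability over the last quarter of trials
    return sum(history[-(trials // 4):]) / (trials // 4)

print(round(simulate_probability_matching(), 2))  # close to pi = 0.70
```

The asymptote of the update rule is the event probability itself, which is why the simulated guess rate settles near 0.70 rather than the reward-maximizing strategy of always guessing the more frequent light.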
Estes was never a reinforcement (S–R) theorist. He saw reinforcement as preventing the unlearning of associations between stimuli (S) and responses (R). He believed reinforcements were themselves learned as the outcome of some response. Reinforcement and punishment are performance variables because they determine how material already learned will manifest itself in behavior. This line of thinking makes a clear distinction between 'learning' and 'performance'. He called this process 'learning to learn' (Olson & Hergenhahn, 2013, p. 224).

Estes defined response probability theoretically as the proportion of conditioned stimulus elements in the sample. This is certainly the case in the experiments published in The Psychological Record (2002). Today the matching principle and paired-associate learning are assisting physicists with quantum mechanics (Green & Kemeny, 2002). State of the system refers to the fact that correct responses occur reliably only after they are attached to stimuli in the experimental context. His all-or-nothing theory holds that performance is probabilistic, hence a statistical learning theory (Olson & Hergenhahn, 2013). In paired-associate learning, Estes (1950) theorized that people learn pairs of items so that when shown the first member of a pair they can respond with the other.
Mathematical models of learning have helped make psychological research more scientific and moved psychology toward a more cognitive orientation. Metacognition provides the path to the most effective learning through knowing yourself, your capacity to learn, the processes you have successfully used, and your interest in, and knowledge of, the subject you wish to learn (Estes, 1972).

Comprehensive school reform builds on concrete products and formative evaluation. Every LTL (Learning to Learn) activity produces a visible student product, which permits ongoing formative evaluation of students' skill use. Teachers are trained in effective classroom-management practices, reinforced through consistently applied school-wide policies and parental-involvement activities (Estes, 1950).

The calculation and reporting of depreciation is based on two principles: the cost principle, which is based on the original cost of the asset, and the matching principle. By assigning a portion of the asset's cost to each period in which the asset is used, it is hoped the asset's cost is matched with the revenues earned by using the asset. Examples are Advertising Expense and Research and Development Expense. This is taught in every Accounting 101 class (Green & Kemeny, 2002).

Memory plays an important role in learning. According to Estes (1972), behavior is produced by the interaction of current stimulation with an accumulation of memories of previous experiences. CONPAAD presents results of an experiment with one commercial inserted early in a show and one later. With these stimuli presented one after the other, there is a significant increase in the rate of reinforcement (Winters & Wallace, 1970).
William K. Estes used the statistical Markov model, which allowed him to study the process of learning in more detail. The Markov family of models includes the Markov chain, the hidden Markov model, the Markov decision process, and the partially observable Markov decision process. The Markov chain is the simplest of these: a random process that transitions from one state to another among a finite set of states (Grinstead, 1991). The next state depends only on the current state, not on the sequence of events that preceded it. The process moves from one state to another, and each move is called a step (Grinstead, 1991). The main idea of the Markov model is that the outcome of one experiment can affect the outcome of the next. The processes in this model can be discrete, and they are useful when a decision problem involves risks that unfold over a period of time, or when important events can happen more than once. The model offers a more accurate representation in clinical settings because of its ability to represent repetitive events and the time dependence of each probability (Grinstead, 1991).

The scanning model of decision making describes an individual figuring out what responses are possible after examining a situation, and using the best response to achieve the best outcome. An individual learns the value of each response and stores this learned knowledge in memory (Olson & Hergenhahn, 2013). After the information is stored, the individual determines which responses are possible and can then recall which response will give the most desired outcome. "In general, the model claims that in any decision-making situation, an organism will utilize whatever information it has stored in memory concerning response-outcome relationships and will respond in such a way as to produce the most beneficial outcome" (Olson & Hergenhahn, 2013, p. 229).

The array model is used to understand behaviors of classifying and categorizing.
This model differs from stimulus sampling theory (SST) in that both the stimulus characteristics and the categories are stored in a memory set, or memory array (Olson & Hergenhahn, 2013). New stimuli are compared with these memory sets to determine where they fit. The array model looks at the current classification of different events: comparisons concern the present and what may happen in the future rather than only stimulus-response associations made in the past (Olson & Hergenhahn, 2013).
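Estes's all-or-none account can be illustrated as a two-state Markov chain in which an item jumps abruptly from an Unlearned state to a Learned (absorbing) state with some fixed probability on each study trial. The parameter name `c` and the value 0.3 used here are illustrative assumptions, not values from Estes's experiments.

```python
import random

def all_or_none_trial_of_learning(c=0.3, max_trials=1000, seed=None):
    """Two-state Markov chain: on each study trial an Unlearned item becomes
    Learned with probability c; once Learned it stays Learned (absorbing).
    Returns the trial on which the abrupt, stepwise transition occurred."""
    rng = random.Random(seed)
    for trial in range(1, max_trials + 1):
        if rng.random() < c:
            return trial  # no partial learning: the item flips in one step
    return None  # never learned within max_trials (vanishingly rare here)

# The trial of learning is geometrically distributed with mean 1/c.
trials = [all_or_none_trial_of_learning(c=0.3, seed=s) for s in range(5000)]
print(round(sum(trials) / len(trials), 1))  # near 1/0.3, i.e. about 3.3
```

Before the jump the item's correct-response probability is at chance; after it, the item is completely learned, which is exactly the all-or-none pattern described above.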
Skinner and Estes
Chris Bucci, Abri Harrison, Jody Marvin,
and Audrey Triche
September 16, 2013
Skinner: Operant Conditioning
Stimulus Statistical Theory
Skinner: Radical Behaviorism
Estes: Markov Model of Learning
C. Major Concepts
Skinner: Type R (Response) Conditioning
Response Outcome (R – O)
D. Today’s Relevancy
Skinner: Behavior Analysis
B.F. Skinner picture, retrieved from http://www.nndb.com/people/297/000022231/
The Skinner Box
Schedule of Reinforcement
Radical Behaviorism: Type R Conditioning
Shaping: Differential Reinforcement and Successive Approximation
Positive and Negative Reinforcement
Schedules of Reinforcement
Programmed Learning: Behavioral Technology
CONPAAD: Conjugate Programmed Analysis of
Computer Based Learning
Aggression and Media
Image from https://wikispaces.psu.edu/display/PSYCH484/3.+Reinforcement+Theory
Picture of William K. Estes retrieved from
- Stimulus Sampling Theory
- Probability Matching
- Markov Process
Response Outcome Theory: R – O
Probability Matching: Subjects Predicting
State of the System
Learning to Learn: Learning Sets
SST: Mathematical Models of Memory
Learning to Learn: School Reform
Introduction to Depreciation
Goals Influence Behavior
Memory and Advertisement
Markov Model of Learning
Scanning Model of Decision Making
B. F. Skinner adopted and developed the scientific philosophy known as radical behaviorism. This scientific orientation rejects scientific language and interpretations that refer to mentalistic events. In contrast, W. K. Estes built a statistical sampling system that led to the development of quantitative cognitive science. At this stage of knowledge about the nature of learning, both approaches are necessary.
Baum, W. M. (2011). What is radical behaviorism? A review of Jay Moore's Conceptual foundations of radical behaviorism. Journal of the Experimental Analysis of Behavior, 95(1), 119-126.

Boeree, C. G. (2006). B. F. Skinner. Retrieved from

Estes, W. K. (1950). Toward a statistical theory of learning. Psychological Review, 57, 94-107.

Estes, W. K. (1972). Memory and conditioning. In F. J. McGuigan & D. B. Lumsden (Eds.), Contemporary approaches to conditioning and learning.

Green, E., & Kemeny, J. (2002). The matching principle revisited. The Psychological Record, 52(3).

Heiman, M. (2013). Learning to learn: A breakthrough in learning. Retrieved September 19, 2013, from http://learningtolearn.com/whoweare.html

Montague, R., & Berns, G. (2002). Neural economics and the biological substrates of valuation. Neuron, 36, 265-284.

Nargeot, R., & Simmers, J. (2011). Neural mechanisms of operant conditioning and learning-induced behavioral plasticity in Aplysia. Cellular and Molecular Life Sciences, 68(5), 803-816. doi:10.1007/s00018-010-0570-9

Olson, M. H., & Hergenhahn, B. R. (2013). An introduction to theories of learning (9th ed.). Upper Saddle River, NJ: Pearson.

Peter, J., Paul, N., & Walter, R. (2011). A clarification and extension of operant conditioning principles in marketing. Journal of Marketing, 46(3), 102-107.

Winters, L., & Wallace, W. (1970). On operant conditioning techniques. Journal of Advertising Research, 9(2). Retrieved September 21, 2013, from http://search.ebscohost.com.ezproxy.apollolibrary.com/