Operant conditioning, so named by psychologist B. F. Skinner, is the modification of behavior (the actions of animals) brought about by the consequences that follow the behavior. In simple terms, behavior operates on the environment, producing various effects. The phrase 'operant conditioning' draws out a crucial distinction from Pavlovian conditioning, which Skinner termed 'respondent' - namely that respondent behavior, like the dog's salivation or the knee-jerk, neither has much effect on the environment nor is changed in its occurrence by its effectiveness or ineffectiveness in the environment. (This opens up an often-missed parallel with the distinction between involuntary behavior, or reflexes, and voluntary behavior, or acts. The former occur essentially no matter what, given some stimulus, and nothing ensures that they act on the rest of the world, while the latter are affected by how well or poorly they work and hence are much more likely to do work for the animal in the world.)
Operant conditioning, sometimes called "instrumental conditioning" or "instrumental learning", was first extensively studied by Edward L. Thorndike (1874-1949). Thorndike's most famous work investigated the behavior of cats trying to escape from various homemade puzzle boxes. When first confined in the boxes, the cats took a long time to escape from each. With experience, however, ineffective responses occurred less frequently and successful responses occurred more quickly, enabling the cats to escape in less and less time over successive trials. In his law of effect, Thorndike theorized that successful responses, those producing satisfying consequences, were "stamped in" by the experience and thus occurred more frequently. Unsuccessful responses, those producing annoying consequences, were "stamped out" and subsequently occurred less frequently. In short, some consequences strengthened behavior and some consequences weakened behavior. This effect was (and sometimes still is) described as involving a strengthening of the association between the response and its effect, suggesting some kind of parallel to Pavlovian conditioning.
The idea behind the law of effect is described in Skinner's terms by the notion of reinforcers. These are simply whatever events strengthen a response (i.e., events whose rate controls the rate of that response). This neatly sidestepped Thorndike's notion of satisfaction, resulting in a term which was less theoretical and more simply descriptive: any event whose presence or absence controls how often a response occurs is by definition a reinforcer for that response. The problem became not what 'satisfying' meant, but the better-defined question of what events would reinforce which responses of which animals in which conditions. Skinner also innovated by redefining stimulus and response so that they, too, were fitted to the behavior actually observed. To Skinner, the discriminative stimulus (SD) was not a single physically defined kind of event, but an entire class of events (possibly quite physically different) which set the occasion for the same response. (In contrast with the reflex notion of a stimulus, which elicits its response, a discriminative stimulus was held to increase the probability of the response.) Skinner's notion of the operant-conditioning response, called an operant, was similarly distinct from the physiologically defined reflex and from classically conditioned responses, being a class of responses which shared a consequence - e.g., depressing a lever, which rats commonly do in several distinct but functionally equivalent ways. The relation between the discriminative stimulus, the operant response, and the reinforcer has often been called the 'three-term contingency' - under these (functional) conditions, this (functional) response will yield this reinforcer.
Skinner is best known for his methodological advances and his laboratory inventions. In contrast to Thorndike's puzzle boxes, Skinner introduced the free-operant technique through the use of the operant conditioning chamber. In this apparatus (as contrasted with earlier apparatus, which tended to emphasize experimenter-determined trials), rats could respond for food at their own pace. Skinner made much theoretical use of the new variable this provided, response rate, which was recorded by a cumulative recorder - paper on a slowly rotating drum in contact with a pen which stepped across the paper automatically each time the lever was pressed. Automatic recording eliminated much labor and loss of precision.
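The way a cumulative record turns lever presses into a readable response rate can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of Skinner's apparatus: the function names `cumulative_record` and `response_rate`, the sampling step, and the press times are all hypothetical.

```python
def cumulative_record(press_times, duration, step=1.0):
    """Sample the running total of presses at regular time steps.

    Mimics the pen trace: time advances steadily (the rotating drum)
    and the count steps up by one for every lever press.
    """
    presses = sorted(press_times)
    record = []
    count = 0
    i = 0
    n_steps = int(duration / step)
    for k in range(n_steps + 1):
        t = k * step
        # Count every press that has occurred up to and including time t.
        while i < len(presses) and presses[i] <= t:
            count += 1
            i += 1
        record.append((t, count))
    return record


def response_rate(press_times, start, end):
    """Average slope of the record over (start, end]: presses per unit time."""
    n = sum(1 for p in press_times if start < p <= end)
    return n / (end - start)


if __name__ == "__main__":
    # Hypothetical press times in seconds: a fast burst, then slow responding.
    presses = [0.5, 1.2, 1.8, 2.1, 6.0, 9.5]
    for t, total in cumulative_record(presses, duration=10):
        print(t, total)
    print(response_rate(presses, 0, 3), response_rate(presses, 3, 10))
```

A steep stretch of the record corresponds to rapid responding and a flat stretch to a pause, which is why the slope could be read directly as response rate.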
An accident led Skinner to one of his most important contributions, the intermittent reinforcement schedule. Initially, the free operant procedure involved the delivery of one food pellet per press of the lever. However, the food dispenser often broke down, so that lever presses occurred without being followed by food. Skinner found that the animals would continue working for some time before stopping. Intermittent reinforcement was then exploited both to save food pellets (which Skinner made himself at the time), and later to uncover now well-known properties of behavior under different schedules of reinforcement. These are commonly classified as interval or ratio and fixed or variable schedules - with interval schedules only giving out reinforcers upon the first response after some period of time, ratio schedules only giving out reinforcers every so many responses, fixed schedules having the same interval or ratio throughout, and variable schedules enforcing different intervals or numbers of responses between successive reinforcers. Skinner's initiation of this area of research, and his surprisingly varied findings on the effects of these different schedules on the rate of a response, have led to broad advances in our understanding of behavior like gambling, drug use, piecework, or waiting for the bus. Likewise, knowledge of reinforcement schedules is essential for animal training and forms a crucial part of behavior therapies.
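The four schedule types can be made concrete with a minimal Python sketch. It is an illustration rather than a standard implementation: the class names, the `respond` method, and the `run` helper are hypothetical, and several details are assumptions not given in the text above, namely that intervals are timed from the previous reinforcer (or from time zero) and that variable schedules draw their requirement from a uniform distribution around the stated mean.

```python
import random


class FixedRatio:
    """Reinforce every n-th response."""
    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after a varying number of responses averaging mean_n."""
    def __init__(self, mean_n, rng):
        self.mean_n = mean_n
        self.rng = rng
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform on 1 .. 2*mean_n - 1, whose mean is mean_n.
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """Reinforce the first response made once `interval` has elapsed."""
    def __init__(self, interval):
        self.interval = interval
        self.available_at = interval

    def respond(self, t):
        if t >= self.available_at:
            # Assumption: the next interval is timed from this reinforcer.
            self.available_at = t + self.interval
            return True
        return False


class VariableInterval:
    """Like FixedInterval, but each interval varies around mean_interval."""
    def __init__(self, mean_interval, rng):
        self.mean_interval = mean_interval
        self.rng = rng
        self.available_at = self._draw()

    def _draw(self):
        # Assumption: uniform on 0 .. 2*mean_interval.
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self._draw()
            return True
        return False


def run(schedule, response_times):
    """Return the times of the responses that earned a reinforcer."""
    return [t for t in response_times if schedule.respond(t)]


if __name__ == "__main__":
    steady = range(1, 26)  # one response per second for 25 seconds
    print(run(FixedRatio(3), steady))
    print(run(FixedInterval(10), steady))
    print(run(VariableRatio(5, random.Random(0)), steady))
    print(run(VariableInterval(10.0, random.Random(0)), steady))
```

The sketch shows the structural difference between the schedule families: on ratio schedules responding faster earns reinforcers faster, whereas on interval schedules extra responses before the interval has elapsed earn nothing.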