Pick a Number

The Pick a Number model19 is a modification of a common way to choose who gets to go first in childhood games. The contestants pick a number between one and ten and the winner is the person whose number falls closest to the number selected by a third party. I have extended this game by removing the exogenous arbiter—the person who thought of the number the others were trying to guess. Instead the agents in the simulation must pick a number between zero and one hundred in an attempt to match the group outcome, defined as the average (arithmetic mean) of the choices from the entire group.20

In the simulations presented below, there are ten agents. The goal for each of these agents is to pick a number close to the entire group's choice—to act appropriately given the social context that is the average of the individual predictions. The tools available to the agents for making their predictions are simple rules. In the simulations presented here the agents have available to them a universe of seven rules. The rules themselves are simply random choices drawn from a uniform distribution of integers within specified boundaries:21

Rule    Boundaries
1       0 and 10
2       15 and 25
3       30 and 40
4       45 and 55
5       60 and 70
6       75 and 85
7       90 and 100

Each agent is initially randomly assigned three of these rules (without repeats) and therefore has three ideas about the appropriate number. Out of the universe of seven rules, each agent thus has only three available at any one time when deciding which number to pick.22
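The rule universe and the initial assignment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `RULES`, `assign_repertoire`, and `predict` are hypothetical, the boundaries are treated as inclusive, and the seed is fixed purely for reproducibility.

```python
import random

# Boundaries of the seven rules as listed in the text (assumed inclusive).
RULES = {
    1: (0, 10),
    2: (15, 25),
    3: (30, 40),
    4: (45, 55),
    5: (60, 70),
    6: (75, 85),
    7: (90, 100),
}

def assign_repertoire(rng, size=3):
    """Give an agent `size` distinct rule ids drawn from the universe."""
    return rng.sample(sorted(RULES), size)

def predict(rule_id, rng):
    """Draw a uniform random integer within the rule's boundaries."""
    low, high = RULES[rule_id]
    return rng.randint(low, high)

rng = random.Random(0)  # seed chosen only for reproducibility
repertoires = [assign_repertoire(rng) for _ in range(10)]  # ten agents
```

Each entry of `repertoires` holds three distinct rule ids, and calling `predict(4, rng)` returns an integer between 45 and 55.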

The agent uses one of these three rules—her operating rule—to pick the number that is sent to the entire group. Each agent determines which rule is her operating rule by keeping track of scores for each rule in her repertoire. Each rule starts with a score of 100. This score tracks the usefulness of the rule—the score rises and falls depending on how close its predictions have been to the group outcome—and the rule with the currently highest score is the operating rule.23 Once the group outcome is known, agents compare their three predictions (one operating, two potential) with the outcome and reward (add 1 to the score) or penalize (subtract 1) their rules depending on the closeness of the prediction.

"Close enough" is governed by a precision parameter. In most runs of the model, precision is set at 5 percent. Rules that predict the group outcome within plus or minus 5 are rewarded and others are punished. For example, if the group outcome was 45, then rules predicting between 40 and 50 would be rewarded. Potential rules can become operating when their score exceeds that of the current operating rule. For instance, if nine of the agents have rule 5 as their operating rule and one agent is using rule 2 but holds rule 5 as a potential rule, then eventually that agent's operating rule will be punished, and the potential rule 5 rewarded, often enough to elevate rule 5 to operating rule.
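The scoring step described above might look like the following sketch. It makes several assumptions that the text does not settle: the window of plus or minus 5 is treated as inclusive, the current operating rule is kept when scores tie (consistent with a potential rule needing to exceed it), and the function names are hypothetical.

```python
INITIAL_SCORE = 100
PRECISION = 5  # reward predictions within plus or minus 5 of the outcome

def update_scores(scores, predictions, outcome, precision=PRECISION):
    """Add 1 to rules whose prediction is close enough, subtract 1 otherwise."""
    for rule_id, guess in predictions.items():
        if abs(guess - outcome) <= precision:  # inclusive window (assumption)
            scores[rule_id] += 1
        else:
            scores[rule_id] -= 1
    return scores

def operating_rule(scores, current=None):
    """Return the highest-scoring rule; keep `current` on ties (assumption)."""
    best = max(scores.values())
    if current is not None and scores.get(current) == best:
        return current
    return max(scores, key=scores.get)

# One agent holding rules 2 (operating), 5, and 7, facing a group outcome of 65.
scores = {2: INITIAL_SCORE, 5: INITIAL_SCORE, 7: INITIAL_SCORE}
predictions = {2: 20, 5: 63, 7: 95}
update_scores(scores, predictions, outcome=65)
new_operating = operating_rule(scores, current=2)
```

In this single round rule 5 is rewarded and rules 2 and 7 are punished, so rule 5 overtakes rule 2 as the operating rule.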

In order to facilitate adaptation and change over time, at set intervals each agent discards a poorly performing rule and is randomly assigned a new rule from the universe of rules.24 The new rule starts with a fresh score of 100. This can be thought of as akin to a change in domestic politics (if the agents are states) or an internal policy entrepreneur. Internally, someone or some organization has convinced the agent to change one of the rules it could possibly use to make a prediction.
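A rough sketch of this replacement step follows. The text does not say here how the "poorly performing" rule is identified or whether the draw excludes rules already held, so the sketch assumes the lowest-scoring rule is dropped and the new rule is drawn from rules the agent does not currently hold. The name `replace_worst_rule` is hypothetical.

```python
import random

RULE_IDS = range(1, 8)   # the universe of seven rules
INITIAL_SCORE = 100      # fresh score for a newly assigned rule

def replace_worst_rule(scores, rng):
    """Drop the lowest-scoring rule and add a randomly drawn replacement."""
    worst = min(scores, key=scores.get)   # assumption: "poorly performing" = lowest score
    del scores[worst]
    # Assumption: draw only from rules the agent does not currently hold.
    candidates = [r for r in RULE_IDS if r not in scores]
    new_rule = rng.choice(candidates)
    scores[new_rule] = INITIAL_SCORE
    return worst, new_rule

rng = random.Random(1)  # seed chosen only for reproducibility
scores = {2: 93, 5: 104, 7: 99}
dropped, added = replace_worst_rule(scores, rng)
```

After the call the agent still holds three rules: rules 5 and 7 keep their scores, and the newcomer starts at 100.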

Following constructivist thought, the model assumes that intersubjective agreement is positive for the agents. The reward/punishment procedure is drawn from the constructivist assumption that agents attempt to behave appropriately, which is defined as the intersubjectively agreed-upon behavior. The punishments can be interpreted as either internal (conscience or cognitive dissonance from breaking a norm) or external costs associated with acting inappropriately,25 while the rewards are the benefits from acting appropriately.

This scoring mechanism could be viewed as individualist and based on the logic of consequences in violation of constructivist tenets. Fortunately, this is not the case for three reasons. First, the agents' decisions are entirely socially conditioned because the only information they have is social. Their entire world consists of a drive to behave appropriately and an ability to recognize their social context.

Second, this model does not simulate coordination in the rationalist sense of agents getting a higher payoff if they choose the same strategy (though the model could be modified to explore these issues).26 Remember that only the rules, not the agents, are rewarded or punished; the agents themselves gather no utility from matching the group outcome. There is no competition to be the best "matcher." Instead, the rules help the agents reach their goal of acting appropriately. Change in the rules (altering the operating rule and dropping poorly performing rules) is quantified in this model, but the quantification is merely a simplification and representation of the multiple mechanisms that constructivists have identified for norm compliance and behavioral/normative choice.

Finally, in practical and functional terms, a decision based on a logic of appropriateness looks very similar to a decision based on consequentialist logic; that is, they are similarly processed on the computer. The agents assess whether their rules produce behavior that matches the social outcome. If not, they change those rules. In order to program this on the computer, I have chosen to quantify how well the rules are doing and to reward and punish the rules depending on their performance. The model does not explicitly address how the agents recognize their social context (outcome). The automatic recognition of the social context in the model is general enough to accommodate the multiple, debated notions of how agents come to recognize their social context: persuasion, argumentation, socialization, social learning. These mechanisms also act to devalue some rules and increase the value of others. I do not model these mechanisms explicitly; I am not attempting to simulate social learning or persuasion. Instead, by using a general mechanism for rule valuation, I can discern the fundamental effects of norm entrepreneurship.

The social context these agents find themselves in and adapt to is the other important component of the model, and it is wholly produced by the combined actions of the population of agents. It is a very limited world, where agents only perceive the group outcome (average prediction). The catch for the agents is that it is a noisy world. While the true outcome is exactly the average of the predictions from the population, the outcome that each agent perceives is obscured by noise.27 The noise is a measure of what I call social complexity. It can be conceived of as the ambiguity of the social environment: the higher the noise levels, the less clear agents are on what the appropriate group outcome should be. Thus, social complexity represents how easy or difficult it is to ascertain the appropriateness of the group outcome. Essentially, the agents are given varied degrees of blurred vision. While the true outcome flows from the agents' actions, because of the noise (or a complex social environment), the agents do not "see" or cannot judge the outcome directly. For instance, think of this in terms of contributing to a collective good. Even if all the players individually feel that 75 is the appropriate level of contribution (i.e., they all predicted 75), there may be some uncertainty in the correctness of the outcome, or conversely the outcome may be communicated poorly to the players. Either case is noise.
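The gap between the true and the perceived outcome can be illustrated with a short sketch. The form of the noise is not specified in this passage, so the code assumes an additive disturbance drawn uniformly from plus or minus a noise level; `perceived_outcome` and `noise_level` are hypothetical names, and whether the disturbance is drawn once per round or once per agent is left open.

```python
import random

def true_outcome(predictions):
    """The group outcome: the arithmetic mean of all predictions."""
    return sum(predictions) / len(predictions)

def perceived_outcome(predictions, noise_level, rng):
    """True mean plus a uniform disturbance in [-noise_level, +noise_level].

    The uniform, additive form is an assumption made for illustration.
    """
    return true_outcome(predictions) + rng.uniform(-noise_level, noise_level)

rng = random.Random(2)  # seed chosen only for reproducibility
predictions = [75] * 10  # every player contributes 75, as in the example above
clear_view = perceived_outcome(predictions, noise_level=0, rng=rng)
blurred_view = perceived_outcome(predictions, noise_level=10, rng=rng)
```

With no noise the agents see exactly 75. With a noise level of 10 the same unanimous choice may be perceived anywhere between 65 and 85, which can be enough to punish a rule that predicted the true outcome correctly.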

An additional aspect of the social context is the existence of a natural attractor in this system—a preordained norm that is intrinsically attractive given the dynamics of the model. Rule 4, which produces predictions between 45 and 55, is a natural norm in this system.28 Averaging random numbers between 0 and 100 will produce a mean of around 50 in the long run. The results demonstrate that when the noise in the system is low enough, the agents learn to gravitate toward rule 4. When the noise is high, the agents have a difficult time finding this rule—the intrinsic worth of rule 4 is being obscured by a complex social environment.
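The arithmetic behind the natural attractor can be checked directly. The sketch below idealizes by assuming operating rules are spread evenly across the seven rules, so that the expected group outcome is the mean of the rule midpoints. The variable names are hypothetical.

```python
# Boundaries of the seven rules as listed in the text.
RULES = {
    1: (0, 10),
    2: (15, 25),
    3: (30, 40),
    4: (45, 55),
    5: (60, 70),
    6: (75, 85),
    7: (90, 100),
}

# Expected prediction of each rule is the midpoint of its range.
midpoints = {rule: (low + high) / 2 for rule, (low, high) in RULES.items()}

# Idealization: every rule is equally likely to be an agent's operating rule.
expected_outcome = sum(midpoints.values()) / len(midpoints)

# Rules whose range contains the expected outcome.
attractors = [rule for rule, (low, high) in RULES.items()
              if low <= expected_outcome <= high]
```

The midpoints are 5, 20, 35, 50, 65, 80, and 95, which average to 50, and only rule 4 has a range containing that value.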

This baseline model contains rule-following agents with limited computational abilities; they are goal seekers, but not strategic agents. Driven by a logic of appropriateness, they want to match their predictions with the group outcome. The behavior is socially driven: the agents only know social facts, though they do have an individual understanding of them. In addition, the agents are adaptive. They change their active rule when it fails to help them meet their goal (i.e., when they are acting out of step with the social context) and they keep it when it performs well. In the baseline model the agents act in an uncoordinated fashion, trying to reach consensus on rule use in a noisy environment. Norm emergence requires intersubjective agreement, that is, common use of a rule, and once it emerges the norm is self-reinforcing in that norm-following behavior is rewarded and norm breaking is punished. Norm change can only occur when agents begin to follow a different rule. Figure 4.1 is a model schematic.

The baseline model explores the conditions under which the agents can find the natural attractor—the focal point or natural norm in the system—by themselves through uncoordinated, adaptive behavior in situations lacking a norm entrepreneur. From there, the real test of the norm life cycle begins and norm entrepreneurs are introduced into the model. Norm entrepreneurs suggest a rule to the agents at specified intervals (every fifty rounds in all the simulations presented). Each agent replaces her currently worst performing rule with the norm entrepreneur's suggestion. The suggested rule starts with a score of 100. Sometimes the injection of a suggestion produces a cascade of behavior that leads to norm emergence or change.

In the base version of the model this stylized entrepreneur is able to reach all agents simultaneously and automatically convinces all the agents in the simulation to add the suggestion to their repertoire of rules. Crucially, the agents will only use the suggested rule if their other rules have been weakened through past punishments—that is, just because a new idea about appropriate behavior is presented, that does not mean it will automatically influence behavior.
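A sketch of the entrepreneur's intervention shows why a suggestion does not automatically change behavior. The fifty-round interval and the starting score of 100 come from the text. The handling of an agent who already holds the suggested rule is an assumption (nothing changes), and the function names are hypothetical.

```python
INITIAL_SCORE = 100
SUGGESTION_INTERVAL = 50  # rounds between suggestions, as stated in the text

def suggestion_due(round_number):
    """True on rounds 50, 100, 150, and so on."""
    return round_number > 0 and round_number % SUGGESTION_INTERVAL == 0

def adopt_suggestion(scores, suggested_rule):
    """Replace the agent's worst-performing rule with the suggested rule."""
    if suggested_rule in scores:
        return scores  # assumption: an already-held rule is left untouched
    worst = min(scores, key=scores.get)
    del scores[worst]
    scores[suggested_rule] = INITIAL_SCORE
    return scores

def top_rule(scores):
    """The rule with the highest score, which acts as the operating rule."""
    return max(scores, key=scores.get)

# An agent whose operating rule has done well ignores the suggestion...
confident = adopt_suggestion({1: 120, 3: 90, 6: 95}, suggested_rule=4)
# ...while an agent whose rules have all been punished switches to it.
weakened = adopt_suggestion({1: 97, 3: 90, 6: 95}, suggested_rule=4)
```

Both agents add rule 4 to their repertoire, but only the second one acts on it: `top_rule(confident)` is still rule 1, whereas `top_rule(weakened)` is rule 4.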

In Figure 4.1, the octagons represent individual agents. In each round, each agent presents her prediction to the group as a whole. The predictions are averaged and noise is incorporated. The agents then observe the group outcome and evaluate their rules. The +/- 1 shown in the hexagons on the right represents the evaluation of each agent's operating rule.

Figure 4.1 Schematic of the Model
