Game Theory: Stanford Game Theory, Week 5.1

2024-02-01 04:30



title: Game Theory: Stanford Game Theory, Week 5-1
tags: note
notebook: 6- English courses-15-game theory
---

Game Theory: Stanford Game Theory, Week 5-1

Exercises

Question 1

Two players play the following normal form game.

| Player 1 \ Player 2 | Left | Middle | Right |
| --- | --- | --- | --- |
| Left | 4,2 | 3,3 | 1,2 |
| Middle | 3,3 | 5,5 | 2,6 |
| Right | 2,1 | 6,2 | 3,3 |

Which is the pure strategy Nash equilibrium of this stage game (if it is played only once)?

a) (Left, Left);

b) (Left, Middle);

c) (Left, Right);

d) (Middle, Left);

e) (Middle, Middle);

f) (Middle, Right);

g) (Right, Left);

h) (Right, Middle);

i) (Right, Right).

Correct 
(i) is the unique Nash equilibrium of the stage game. (Right, Right) is a Nash equilibrium because Right is each player's best response when the other player is playing Right.
It is also unique: in every other action profile, at least one player has an incentive to deviate.
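
The deviation check described above can be automated. The following is a minimal sketch, assuming the payoff matrix is stored as a nested list with player 1 choosing the row; the helper names `pure_nash_equilibria` and `named_equilibria` are illustrative and not part of the course material.

```python
from itertools import product

ACTIONS = ["Left", "Middle", "Right"]

# PAYOFFS[row][col] = (player 1 payoff, player 2 payoff); player 1 picks the row.
PAYOFFS = [
    [(4, 2), (3, 3), (1, 2)],
    [(3, 3), (5, 5), (2, 6)],
    [(2, 1), (6, 2), (3, 3)],
]


def pure_nash_equilibria(payoffs):
    """Return all (row, col) pairs where neither player gains by deviating alone."""
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    equilibria = []
    for r, c in product(range(n_rows), range(n_cols)):
        u1, u2 = payoffs[r][c]
        # Player 1 compares rows within the fixed column c.
        row_ok = all(payoffs[r2][c][0] <= u1 for r2 in range(n_rows))
        # Player 2 compares columns within the fixed row r.
        col_ok = all(payoffs[r][c2][1] <= u2 for c2 in range(n_cols))
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


def named_equilibria(payoffs, actions=ACTIONS):
    """Same result as pure_nash_equilibria, with action names instead of indices."""
    return [(actions[r], actions[c]) for r, c in pure_nash_equilibria(payoffs)]


print(named_equilibria(PAYOFFS))  # [('Right', 'Right')]
```

Swapping in a different `PAYOFFS` matrix runs the same check on any other two-player game in these exercises.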

Question 2

Two players play the following normal form game.

| Player 1 \ Player 2 | Left | Middle | Right |
| --- | --- | --- | --- |
| Left | 4,2 | 3,3 | 1,2 |
| Middle | 3,3 | 5,5 | 2,6 |
| Right | 2,1 | 6,2 | 3,3 |

Suppose that the game is repeated for two periods. What is the outcome of the subgame perfect Nash equilibrium of the whole game?

a) (Left, Left) is played in both periods.

b) (Right, Right) is played in both periods.

Correct 
(b) is true.

The stage game has a unique Nash equilibrium.
In the second period, (Right, Right) must be played regardless of the outcome obtained in the first period.
Since first-period play cannot change what happens in the second period, each player simply maximizes his or her first-period payoff, so (Right, Right) is played in the first period as well.

c) (Middle, Middle) is played in the first period, followed by (Left, Left)

d) (Middle, Middle) is played in the first period, followed by (Right, Right)

Question 3

Two players play the following normal form game.

| Player 1 \ Player 2 | Left | Middle | Right |
| --- | --- | --- | --- |
| Left | 4,2 | 3,3 | 1,2 |
| Middle | 3,3 | 5,5 | 2,6 |
| Right | 2,1 | 6,2 | 3,3 |

Suppose that there is a probability p that the game continues next period and a probability (1−p) that it ends. What is the threshold p∗ such that when p≥p∗ (Middle, Middle) is sustainable as a subgame perfect equilibrium by grim trigger strategies, but when p<p∗ playing Middle in all periods is not a best response? [Here the grim strategy is: play Middle if the play in all previous periods was (Middle, Middle); play Right otherwise.]

a) 1/2;

b) 1/3;

c) 1/4;

This should not be selected

d) 2/5.
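
The notes record only that 1/4 is wrong, so here is a hedged sketch of how the threshold can be computed. It assumes the standard grim trigger comparison: cooperating forever pays 5 per period, the best one-shot deviation from (Middle, Middle) is Right and pays 6 for either player, and the punishment (Right, Right) pays 3 in every later period. The function names are illustrative.

```python
from fractions import Fraction


def grim_trigger_threshold(cooperate, deviate, punish):
    """Smallest continuation probability p at which cooperating beats deviating once.

    Cooperating forever is worth cooperate / (1 - p). Deviating once and being
    punished forever is worth deviate + p * punish / (1 - p). Setting the two
    equal and solving for p gives (deviate - cooperate) / (deviate - punish).
    """
    return Fraction(deviate - cooperate, deviate - punish)


def cooperation_value(cooperate, p):
    """Expected total payoff from cooperating in every period."""
    return cooperate / (1 - p)


def deviation_value(deviate, punish, p):
    """Expected total payoff from deviating now and being punished afterwards."""
    return deviate + p * punish / (1 - p)


p_star = grim_trigger_threshold(cooperate=5, deviate=6, punish=3)
print(p_star)  # 1/3
```

Under these assumptions the two values coincide at p = 1/3, which corresponds to option (b); at p = 1/4 the deviation is worth 7 against 20/3 for cooperating.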

Question 4

Consider the following game:

| Player 1 \ Player 2 | Left | Middle | Right |
| --- | --- | --- | --- |
| Left | 1,1 | 5,0 | 0,0 |
| Middle | 0,5 | 4,4 | 0,0 |
| Right | 0,0 | 0,0 | 3,3 |

Which are the pure strategy Nash equilibria of this stage game? There can be more than one.

a) (Left, Right);

Un-selected is correct 

b) (Left, Left);

Correct 
(b) and (f) are pure strategy Nash equilibria of the stage game. (Left, Left) and (Right, Right) are Nash equilibria because Left is the best response when the other player is playing Left, and Right is the best response when the other player is playing Right.
There are no other pure strategy Nash equilibria. To see this, check that in all other cases at least one player has an incentive to deviate.

c) (Left, Middle);

Un-selected is correct 

d) (Middle, Right);

Un-selected is correct 

e) (Middle, Left);

Un-selected is correct 

f) (Right, Right).

Correct 

g) (Right, Middle);

Un-selected is correct 

h) (Right, Left);

Un-selected is correct 

i) (Middle, Middle);

Un-selected is correct 

Question 5

Consider the following game:

| Player 1 \ Player 2 | Left | Middle | Right |
| --- | --- | --- | --- |
| Left | 1,1 | 5,0 | 0,0 |
| Middle | 0,5 | 4,4 | 0,0 |
| Right | 0,0 | 0,0 | 3,3 |

Suppose that the game is repeated for two periods. Which of the following outcomes could occur in some subgame perfect equilibrium? (There might be more than one).

a) (Middle, Middle) is played in the first period, followed by (Right, Right)

Correct 

b) (Left, Left) is played in both periods.

Correct 

c) (Right, Right) is played in both periods.

Correct 
(a), (b) and (c) are all correct. Recall that playing a Nash equilibrium of the stage game in each period forms a subgame perfect Nash equilibrium of the whole game. Hence (b) and (c) are subgame perfect Nash equilibrium outcomes.
Outcome (a) can be obtained when both players play the following strategy:
Play Middle in the first period.
If the outcome in the first period was (Middle, Middle), play Right in the second period; otherwise play Left.
It is easy to check that this grim strategy forms a subgame perfect Nash equilibrium:
Suppose that player 1 plays this strategy.
If player 2 plays the same strategy, he/she will receive a total payoff of 4+3=7 (assume no discounting).
If player 2 deviates to Left in the first period, player 1 plays Left in the second period, so player 2's best reply is Left and he/she receives a total payoff of 5+1=6 (which is lower than the payoff of following the grim strategy).
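
The payoff comparison for outcome (a) can be checked by brute force. This is a small sketch, assuming no discounting and that player 1 follows the strategy described in the explanation; it enumerates every two-period plan for player 2, and the helper names are illustrative.

```python
ACTIONS = ["Left", "Middle", "Right"]
L, M, R = 0, 1, 2

# PAYOFFS[row][col] = (player 1 payoff, player 2 payoff); player 1 picks the row.
PAYOFFS = [
    [(1, 1), (5, 0), (0, 0)],
    [(0, 5), (4, 4), (0, 0)],
    [(0, 0), (0, 0), (3, 3)],
]


def player1_second_period(first_outcome):
    """Player 1 rewards (Middle, Middle) with Right and punishes anything else with Left."""
    return R if first_outcome == (M, M) else L


def player2_total(first_action, second_action):
    """Player 2's undiscounted two-period payoff when player 1 opens with Middle."""
    first_outcome = (M, first_action)
    a1 = player1_second_period(first_outcome)
    return PAYOFFS[M][first_action][1] + PAYOFFS[a1][second_action][1]


# Total payoff for each (first-period action, second-period action) plan of player 2.
totals = {
    (ACTIONS[a], ACTIONS[b]): player2_total(a, b)
    for a in range(3)
    for b in range(3)
}
best = max(totals.values())
print(best, [plan for plan, total in totals.items() if total == best])
# 7 [('Middle', 'Right')]
```

Because player 1's first-period action is fixed, player 2's first action determines the history, so enumerating plans covers every deviation. The second-period play is a stage Nash equilibrium after every history, which is what subgame perfection requires.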

Question 6

Consider the following trust game:

There is a probability p that the game continues next period and a probability (1−p) that it ends. The game is repeated indefinitely. Which statement is true? [Grim trigger in (c) and (d) is player 1 playing Not play and player 2 playing Distrust forever after a deviation from ((Play,Share), (Trust)).]

a) There exists a pure strategy Nash equilibrium in the one-shot game with player 2 playing Trust.

b) There exists a pure strategy subgame perfect equilibrium with player 2 playing Trust in any period in the finitely repeated game.

This should not be selected 

c) ((Play,Share), (Trust)) is sustainable as a subgame perfect equilibrium by grim trigger in the indefinitely repeated game with a probability of continuation of p≥5/9.

Question 7

In an infinitely repeated Prisoner's Dilemma, a version of what is known as a "tit for tat" strategy of a player i is described as follows:

There are two "statuses" that player i might be in during any period: "normal" and "revenge";

In a normal status player i cooperates;

In a revenge status player i defects;

From a normal status, player i switches to the revenge status in the next period only if the other player defects in this period;

From a revenge status player i automatically switches back to the normal status in the next period regardless of the other player's action in this period.

Consider an infinitely repeated game in which, after each period, the game continues to the next period with probability p and ends with probability (1−p).

| Player 1 \ Player 2 | Cooperate (C) | Defect (D) |
| --- | --- | --- |
| Cooperate (C) | 4,4 | 0,5 |
| Defect (D) | 5,0 | 1,1 |

True or False:

When player 1 uses the above-described "tit for tat" strategy and starts the first period in a revenge status (thus plays defect for sure), in any payoff-maximizing strategy for the infinitely repeated game, player 2 plays defect in the first period.

True.

Correct 
True. If player 1 uses the "tit for tat" strategy and starts in a revenge status, player 2's first-period payoff is higher from defection than from cooperation.
Moreover, the action played by 2 in the first period when 1 begins in revenge status doesn't affect the remaining periods, since 1 switches to normal status in the second period regardless of what player 2 does in the first period.

False.

Question 8

In an infinitely repeated Prisoner's Dilemma, a version of what is known as a "tit for tat" strategy of a player i is described as follows:

There are two "statuses" that player i might be in during any period: "normal" and "revenge";

In a normal status player i cooperates;

In a revenge status player i defects;

From a normal status, player i switches to the revenge status in the next period only if the other player defects in this period;

From a revenge status player i automatically switches back to the normal status in the next period regardless of the other player's action in this period.

Consider an infinitely repeated game in which, after each period, the game continues to the next period with probability p and ends with probability (1−p).

| Player 1 \ Player 2 | Cooperate (C) | Defect (D) |
| --- | --- | --- |
| Cooperate (C) | 4,4 | 0,5 |
| Defect (D) | 5,0 | 1,1 |

What is the payoff for player 2 from always cooperating when player 1 uses this tit for tat strategy and begins in a normal status? How about always defecting when 1 begins in a normal status?

a) 4+4p+4p²+4p³+… ; 5+p+p²+p³+…

b) 4+4p+4p²+4p³+… ; 5+p+5p²+p³+…

Correct 
(b) is true. If 2 always cooperates, then 1 stays "normal" and always cooperates as well, so each player earns 4 in every period.
If 2 always defects, then 1 is normal in odd periods and switches to revenge in even periods (because 2 defects). 1 cooperates in odd periods and defects in even periods, thus 2 earns 5 in odd periods and 1 in even periods.

c) 5+4p+4p²+4p³+… ; 4+4p+4p²+4p³+…

d) 5+4p+4p²+4p³+… ; 5+p+p²+p³+…
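
The two payoff streams can be reproduced by simulating the two-status strategy directly. This is a minimal sketch, assuming player 1 is the tit for tat player and player 2's actions are supplied as a finite list; `player2_stream` and `expected_payoff` are illustrative names.

```python
C, D = "C", "D"

# PAYOFF[(a1, a2)] = (player 1 payoff, player 2 payoff)
PAYOFF = {(C, C): (4, 4), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}


def player2_stream(player2_actions, start_status="normal"):
    """Player 2's per-period payoffs against player 1's two-status tit for tat."""
    status = start_status
    stream = []
    for a2 in player2_actions:
        a1 = C if status == "normal" else D
        stream.append(PAYOFF[(a1, a2)][1])
        # Normal turns to revenge only after a defection; revenge always resets.
        if status == "normal" and a2 == D:
            status = "revenge"
        else:
            status = "normal"
    return stream


def expected_payoff(stream, p):
    """Truncated expected total: the payoff in period t is weighted by p**t."""
    return sum(u * p**t for t, u in enumerate(stream))


print(player2_stream([C] * 4))  # [4, 4, 4, 4]
print(player2_stream([D] * 4))  # [5, 1, 5, 1]
```

Passing `start_status="revenge"` reproduces the situation in Question 7: player 2 earns 1 from defecting and 0 from cooperating in the first period, and player 1 is back in normal status either way.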

Question 9

In an infinitely repeated Prisoner's Dilemma, a version of what is known as a "tit for tat" strategy of a player i is described as follows:

There are two "statuses" that player i might be in during any period: "normal" and "revenge";

In a normal status player i cooperates;

In a revenge status player i defects;

From a normal status, player i switches to the revenge status in the next period only if the other player defects in this period;

From a revenge status player i automatically switches back to the normal status in the next period regardless of the other player's action in this period.

Consider an infinitely repeated game in which, after each period, the game continues to the next period with probability p and ends with probability (1−p).

| Player 1 \ Player 2 | Cooperate (C) | Defect (D) |
| --- | --- | --- |
| Cooperate (C) | 4,4 | 0,5 |
| Defect (D) | 5,0 | 1,1 |

What is the threshold p∗ such that when p≥p∗ always cooperating by player 2 is a best response to player 1 playing tit for tat and starting in a normal status, but when p<p∗ always cooperating is not a best response?

a) 1/2

b) 1/3

Correct 
(b) is true. From Question 8, in order to sustain cooperation, we need 4+4p+4p²+4p³+… ≥ 5+p+5p²+p³+… , which reduces to 4+4p≥5+p, thus p≥1/3.
p∗ = 1/3.
Note that this just checks always cooperating against always defecting. However, you can easily check that if player 2 wants to defect in the first period, then s/he should also do so in the second period (our answer from Question 7). Then the third period looks just like we are starting the game over, so player 2 would want to defect again...

c) 1/4

d) 1/5
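
The step from the two infinite sums to 4+4p≥5+p can be spelled out with geometric series. This is a sketch of the algebra, assuming 0 ≤ p < 1 so that both series converge and 1−p² is positive.

```latex
\begin{align*}
4 + 4p + 4p^2 + 4p^3 + \dots &= \frac{4}{1-p}, \\
5 + p + 5p^2 + p^3 + \dots &= (5 + p)\,(1 + p^2 + p^4 + \dots) = \frac{5+p}{1-p^2}, \\
\frac{4}{1-p} \ge \frac{5+p}{(1-p)(1+p)}
  &\iff 4(1+p) \ge 5 + p
  \iff 3p \ge 1
  \iff p \ge \tfrac{1}{3}.
\end{align*}
```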

Reposted from: https://www.cnblogs.com/zangzelin/p/8598970.html
