Origin of perseveration in the trade-off between reward and complexity

Cognition. 2020 Jul;204:104394. S0010-0277(20)30213-4. doi: 10.1016/j.cognition.2020.104394. Epub 2020-07-14.

Origin of perseveration in the trade-off between reward and complexity.

Samuel J Gershman

PMID: 32679270 DOI: 10.1016/j.cognition.2020.104394.

Abstract

When humans and other animals make repeated choices, they tend to repeat previously chosen actions independently of their reward history. This paper locates the origin of perseveration in a trade-off between two computational goals: maximizing rewards and minimizing the complexity of the action policy. We develop an information-theoretic formalization of policy complexity and show how optimizing the trade-off leads to perseveration. Analysis of two data sets reveals that people attain close to optimal trade-offs. Parameter estimation and model comparison supports the claim that perseveration quantitatively agrees with the theoretically predicted functional form (a softmax function with a frequency-dependent action bias).
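
The functional form named in the abstract, a softmax with a frequency-dependent action bias, can be sketched as follows. This is a minimal illustration and not the paper's code. It assumes the bias enters as the log of the marginal action frequency, log P(a), added to the value scaled by an inverse temperature, which is the usual form when a reward-complexity trade-off is solved with mutual information as the complexity cost. The names `perseverative_softmax`, `q_values`, `action_freqs`, and `beta` are hypothetical.

```python
import math


def perseverative_softmax(q_values, action_freqs, beta):
    """Choice probabilities proportional to exp(beta * Q(a) + log P(a)).

    q_values:     estimated reward value of each action (assumed input)
    action_freqs: marginal frequency with which each action has been chosen;
                  assumed strictly positive, since log(0) is undefined
    beta:         inverse temperature trading reward against complexity (assumed)
    """
    logits = [beta * q + math.log(p) for q, p in zip(q_values, action_freqs)]
    # Subtract the maximum logit before exponentiating for numerical stability.
    top = max(logits)
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical example: two equally valued actions, with action 0 chosen
# 80% of the time in the past.
probs = perseverative_softmax([1.0, 1.0], [0.8, 0.2], beta=2.0)
print(probs)  # approximately [0.8, 0.2]
```

With equal values, the choice probabilities simply mirror past choice frequencies, which is perseveration independent of reward history. With uniform frequencies, the bias term cancels and the function reduces to an ordinary softmax over values.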