Designing Hybrid Crowd+AI Prediction Markets for Estimating Scientific Replicability 

Tatiana Chakravorti1 *, Sarah Rajtmajer1

1 Pennsylvania State University, State College, Pennsylvania, USA

* Presenting author

Despite high-profile successes in the field of artificial intelligence (AI) [1-4], machine-driven solutions still suffer from important limitations, particularly on complex tasks that require creativity, common sense, intuition, or learning from limited data [5-8]. Both the promises and the challenges of AI have motivated work exploring frameworks for human-machine collaboration [9-13]. The hope is that we can eventually develop hybrid systems that bring together human intuition and machine rationality to tackle today's grand challenges effectively and efficiently.

In this talk, we will overview ongoing research to develop and test hybrid prediction markets for crowd+AI collaboration. This builds on our own and others' prior work developing fully artificial prediction markets as a novel machine learning algorithm and demonstrating the success of this approach on benchmark classification tasks [14,15]. In an artificial prediction market, algorithmic agents (or bot traders) buy and sell outcomes of future events. Classification decisions can be framed as outcomes of future events, and accordingly, the price of an asset corresponding to a given classification outcome can be taken as a proxy for the system's confidence in that decision.
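To make the price-as-confidence idea concrete, the following is a minimal sketch of a binary market run by a logarithmic market scoring rule (LMSR) market maker. This is an illustrative assumption, not the specific market mechanism of [14] or [15]: the class name, the liquidity parameter, and the bot confidences are all hypothetical. The two asset prices always sum to 1, so the price of the "positive" asset can be read directly as the market's probability estimate for that classification outcome.

```python
import math

class LMSRMarket:
    """Binary prediction market with a logarithmic market scoring rule.

    The two asset prices sum to 1, so price(1) can be read as the
    market's confidence in outcome 1 (e.g., "will replicate").
    """

    def __init__(self, liquidity=10.0):
        self.b = liquidity          # higher b => prices move more slowly per trade
        self.shares = [0.0, 0.0]    # outstanding shares for outcomes 0 and 1

    def _cost(self):
        # LMSR cost function: C(q) = b * log(sum_i exp(q_i / b))
        return self.b * math.log(sum(math.exp(q / self.b) for q in self.shares))

    def price(self, outcome):
        # Instantaneous price = softmax of outstanding shares
        exps = [math.exp(q / self.b) for q in self.shares]
        return exps[outcome] / sum(exps)

    def buy(self, outcome, n_shares):
        """Buy n_shares of an outcome; returns the cost charged to the trader."""
        before = self._cost()
        self.shares[outcome] += n_shares
        return self._cost() - before

market = LMSRMarket()
# Three hypothetical bot traders bet on "will replicate" (outcome 1),
# sizing their purchases by their own confidence.
for bot_confidence in [0.8, 0.7, 0.9]:
    market.buy(1, 5 * bot_confidence)
final_confidence = market.price(1)  # aggregate confidence that the finding replicates
```

In a hybrid market, human participants would simply call `buy` through a trading interface alongside the bots; the market maker aggregates all trades into a single price without needing to distinguish who placed them.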

The most exciting opportunity artificial prediction markets afford, we suggest, is the ability to integrate human inputs more meaningfully than is currently possible with existing machine learning algorithms. Human traders can participate alongside algorithmic agents during both training and testing, and the efficient markets hypothesis [16] holds that the market price reflects the aggregate information available to participants (humans and agents) at least as well as any competing aggregation method.

We have designed and piloted hybrid prediction markets for the task of estimating scientific replicability (replication markets; see, e.g., [17]). Assets in the market represent “will replicate” and “will not replicate” outcomes of real replication studies for published findings in the social and behavioral sciences. We present the outcomes of our initial experiments with human subjects, compare the performance of artificial, human-alone, and hybrid scenarios, and lay out a research agenda that builds upon these ideas. 


[1] He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. “Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification.” In Proceedings of the IEEE International Conference on Computer Vision, pp. 1026-1034. 2015. 

[2] Brown, Noam, and Tuomas Sandholm. “Superhuman AI for multiplayer poker.” Science 365, no. 6456 (2019): 885-890. 

[3] Kleinberg, Jon, Himabindu Lakkaraju, Jure Leskovec, Jens Ludwig, and Sendhil Mullainathan. “Human decisions and machine predictions.” The Quarterly Journal of Economics 133, no. 1 (2018): 237-293. 

[4] Zhu, Meixin, Xuesong Wang, and Yinhai Wang. “Human-like autonomous car-following model with deep reinforcement learning.” Transportation Research Part C: Emerging Technologies 97 (2018): 348-368. 

[5] Lai, Vivian, and Chenhao Tan. “On human predictions with explanations and predictions of machine learning models: A case study on deception detection.” In Proceedings of the Conference on Fairness, Accountability, and Transparency, pp. 29-38. 2019. 

[6] Green, Ben, and Yiling Chen. “The principles and limits of algorithm-in-the-loop decision making.” Proceedings of the ACM on Human-Computer Interaction 3, no. CSCW (2019): 1-24. 

[7] Li, Guoliang, Jiannan Wang, Yudian Zheng, and Michael J. Franklin. “Crowdsourced data management: A survey.” IEEE Transactions on Knowledge and Data Engineering 28, no. 9 (2016): 2296-2319. 

[8] Kamar, Ece. “Directions in Hybrid Intelligence: Complementing AI Systems with Human Intelligence.” In IJCAI, pp. 4070-4073. 2016. 

[9] Dellermann, Dominik, Adrian Calma, Nikolaus Lipusch, Thorsten Weber, Sascha Weigel, and Philipp Ebel. “The future of human-AI collaboration: a taxonomy of design knowledge for hybrid intelligence systems.” arXiv preprint arXiv:2105.03354 (2021). 

[10] Wang, Dakuo, Justin D. Weisz, Michael Muller, Parikshit Ram, Werner Geyer, Casey Dugan, Yla Tausczik, Horst Samulowitz, and Alexander Gray. “Human-AI collaboration in data science: Exploring data scientists’ perceptions of automated AI.” Proceedings of the ACM on Human-Computer Interaction 3, no. CSCW (2019): 1-24. 

[11] Nunes, David Sousa, Pei Zhang, and Jorge Sá Silva. “A survey on human-in-the-loop applications towards an internet of all.” IEEE Communications Surveys & Tutorials 17, no. 2 (2015): 944-965. 

[12] Puig, Xavier, Tianmin Shu, Shuang Li, Zilin Wang, Yuan-Hong Liao, Joshua B. Tenenbaum, Sanja Fidler, and Antonio Torralba. “Watch-and-help: A challenge for social perception and human-AI collaboration.” arXiv preprint arXiv:2010.09890 (2020). 

[13] Wu, Xingjiao, Luwei Xiao, Yixuan Sun, Junhang Zhang, Tianlong Ma, and Liang He. “A survey of human-in-the-loop for machine learning.” Future Generation Computer Systems (2022). 

[14] Barbu, Adrian, Nathan Lay, and Shie Mannor. “An Introduction to Artificial Prediction Markets for Classification.” Journal of Machine Learning Research 13, no. 7 (2012). 

[15] Rajtmajer, Sarah, Christopher Griffin, Jian Wu, Robert Fraleigh, Laxmaan Balaji, Anna Squicciarini, Anthony Kwasnica et al. “A synthetic prediction market for estimating confidence in published work.” In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 11, pp. 13218-13220. 2022. 

[16] Fama, Eugene F. “Efficient capital markets: A review of theory and empirical work.” The Journal of Finance 25, no. 2 (1970): 383-417. 

[17] Smith, Vernon L. “Constructivist and ecological rationality in economics.” American Economic Review 93, no. 3 (2003): 465-508.