BIGDOCS: an open dataset for training multimodal models on document and code tasks

Juan Rodriguez, Xiangru Jian, Siba Smarak Panigrahi, Tianyu Zhang, Aarash Feizi, Abhay Puri, Akshay Kalkunte, Francois Savard, Ahmed Masry, Shravan Nayak, Rabiul Awal, Mahsa Massoud, Amirhossein Abaskohi, Zichao Li, Suyuchen Wang, Pierre-Andre Noel, Mats Leon Richter, Saverio Vadacchino, Shubham Agarwal, Sanket Biswas, Sara Shanian, Ying Zhang, Noah Bolger, Kurt MacDonald, Simon Fauvel, Sathwik Tejaswi, Srinivas Sunkara, Joao Monteiro, Krishnamurthy D. J. Dvijotham, Torsten Scholak, Nicolas Chapados, Sepideh Kharagani, Sean Hughes, M. Özsu, Siva Reddy, Marco Pedersoli, Yoshua Bengio, Christopher J. Pal, Issam Laradji, Spandana Gella, Perouz Taslakian, David Vazquez and Sai Rajeswar

Paper (2025)

Department: Department of Computer Engineering and Software Engineering
PolyPublie URL: https://publications.polymtl.ca/66809/
Conference Title: 13th International Conference on Learning Representations (ICLR 2025)
Conference Location: Singapore, Singapore
Conference Date(s): 2025-04-24 - 2025-04-28
Official URL: https://proceedings.iclr.cc/paper_files/paper/2025...
Date Deposited: 28 Jul 2025 15:41
Last Modified: 28 Jul 2025 15:41
Cite in APA 7: Rodriguez, J., Jian, X., Panigrahi, S. S., Zhang, T., Feizi, A., Puri, A., Kalkunte, A., Savard, F., Masry, A., Nayak, S., Awal, R., Massoud, M., Abaskohi, A., Li, Z., Wang, S., Noel, P.-A., Richter, M. L., Vadacchino, S., Agarwal, S., ... Rajeswar, S. (2025, April). BIGDOCS: an open dataset for training multimodal models on document and code tasks [Paper]. 13th International Conference on Learning Representations (ICLR 2025), Singapore, Singapore. https://proceedings.iclr.cc/paper_files/paper/2025/hash/c4659191ae1e89faa09864c23fa91f31-Abstract-Conference.html
