[Corpora-List] Final CFP - SIGIR eCom'21 - Coveo Data Challenge

Surya Kallumadi kallumadisteja at gmail.com
Wed Jun 2 16:02:34 CEST 2021


*Call for Participation:*

The 2021 SIGIR workshop on eCommerce is hosting the Coveo Data Challenge for “*In-session prediction for purchase intent and recommendations*”. The Challenge comprises two tasks on a new large dataset provided by Coveo, including fine-grained behavioral, search and catalog data. SIGIR eCom is a full day workshop taking place on Thursday, July 15th, 2021 in conjunction with SIGIR 2021, which is virtual this year. Challenge participants will have the opportunity to present their work at the workshop.

*Challenge website*: https://sigir-ecom.github.io/data-task.html

*Repository*: https://github.com/coveooss/SIGIR-ecom-data-challenge

*Challenge paper* (pre-print): https://arxiv.org/abs/2104.09423

-----------------------------------------------------

*Important Dates:*

*Registration ends: June 5th, 2021*

Final leaderboard - June 15th, 2021

SIGIR eCom Full day Workshop - July 15th, 2021

-----------------------------------------------------

*Task Description:*

This challenge addresses the growing need for reliable predictions within the boundaries of a shopping session, as customer intentions can be different depending on the occasion. In the context of e-commerce technology, the feedback loop determined by behavioral signals spans from hours to a few seconds and machine learning models need to adapt as fast as possible to the continuously changing nature of the customer journey.

The need for efficient procedures for personalization is even clearer if we consider the e-commerce landscape more broadly: outside of giant digital retailers, the constraints of the problem are stricter, due to smaller user bases and the realization that most users are not frequently returning customers.

We release a new session-based dataset including more than 30M fine-grained browsing events (product detail, add, purchase), enriched by linguistic behavior (queries made by shoppers, with items clicked and items not clicked after the query) and catalog meta-data (images, text, pricing information). On this dataset, we ask participants to showcase innovative solutions for two open problems:

- a recommendation task, where a model is shown k events at the start of a session, and it is asked to predict future product interactions in the same session;

- an intent prediction task, where a model is shown a session containing an add-to-cart event, and it is asked to predict whether the item will be bought before the end of the session.
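
As a rough illustration of how the two tasks above frame a session, here is a minimal Python sketch. The event layout (a dict with "product" and "action" keys) and both function names are assumptions made for this example only; the actual file schema, labels and evaluation rules are defined in the Challenge repository.

```python
from typing import Dict, List, Tuple

# Hypothetical event layout, not the dataset's real schema:
# {"product": "sku_a", "action": "detail" | "add" | "purchase"}
Event = Dict[str, str]


def split_for_recommendation(session: List[Event], k: int) -> Tuple[List[Event], List[str]]:
    """Return the first k events (model input) and the products
    interacted with later in the same session (prediction targets)."""
    seen = session[:k]
    future = [e["product"] for e in session[k:] if e.get("product")]
    return seen, future


def purchase_label(session: List[Event], product: str) -> bool:
    """Return True if `product` is purchased after being added to the
    cart within this session, False otherwise."""
    added = False
    for event in session:
        if event.get("product") != product:
            continue
        if event.get("action") == "add":
            added = True
        elif event.get("action") == "purchase" and added:
            return True
    return False


# Toy session, invented for illustration.
session = [
    {"product": "sku_a", "action": "detail"},
    {"product": "sku_a", "action": "add"},
    {"product": "sku_b", "action": "detail"},
    {"product": "sku_a", "action": "purchase"},
]
seen, future = split_for_recommendation(session, k=2)
label = purchase_label(session, "sku_a")
```

In this toy session the model would see the first two events, be asked to predict the later interactions with sku_b and sku_a, and the add-to-cart of sku_a would carry a positive purchase label.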

The Challenge web page will maintain up-to-date leaderboards for the tasks: please refer to the public repository for details on rules, evaluations and everything related to the dataset. While we recognize the importance of a standardized quantitative evaluation to have some measure of progress, we also agree with recent criticism of “leaderboard chasing”, as there are many practical constraints not captured by metrics which are crucial for the real-world success of a model.

To encourage a deeper understanding of the underlying business problems and foster healthy competition among models with broad applicability, we solicit submissions that provide new insights by mixing quantitative results with industry-relevant qualitative discussion. Please refer to the pre-print for some examples of interesting topics, and a broader context on recent literature and previous work.

-----------------------------------------------------

*Participation and Data:*

The data challenge is open to everyone: data is freely available for download under a research-friendly license (usage of the data implies acceptance of the T&C). The training dataset comprises three large text files: one for behavioral data, one for query data, and one for catalog meta-data. To submit, upload a version of the test file received at sign-up, enriched with labels (that is, the predictions of your model). Evaluation metrics follow industry standards, as discussed at length in the accompanying paper.

Details about important dates and sign-up can be found at the website: https://sigir-ecom.github.io/data-task.html

Details about the dataset, utility scripts for data checks and data upload, and extensive documentation on the files can be found in the Challenge repository: https://github.com/coveooss/SIGIR-ecom-data-challenge


