Generative AI models, such as large language models (LLMs) and text-to-image/video models, have revolutionized the field of AI with their remarkable capabilities in understanding and generating natural language and visual content. LLMs such as ChatGPT are widely used in a variety of applications, including question answering, personal assistants, and recommender systems. Similarly, text-to-image models like DALL-E and text-to-video models like Sora are transforming visual content creation, enabling the generation of high-quality images and videos from textual descriptions.

Despite their advanced capabilities, these powerful models also present significant challenges for safe and ethical deployment. Issues such as algorithmic bias, privacy breaches, limited explainability, and lack of transparency can undermine trust and limit their usefulness. Therefore, the Generative AI Day will discuss the open issues and challenges associated with building trustworthy AI models. By focusing on these concerns, we aim to brainstorm and build robust frameworks to ensure that advanced AI models are developed and deployed in a responsible manner.

As demonstrated in frontier LLMs, “data quality is an important factor for highly-performing models” and “data quality and diversity are crucial for building effective LLMs”. These statements highlight the important role that data mining research plays in building powerful models. There remain numerous opportunities to develop fundamental principles and algorithms to fully understand and mine the data and its connection with the “intelligence” level of the AI models trained on it. These processes are critical for the effective pre-training and alignment of language, vision, and multi-modal models. The challenges include not only gathering large volumes of data but also ensuring that the data is representative, unbiased, and of high fidelity and quality.

To this end, the Generative AI Day will invite both AI/LLM and data mining speakers as well as the KDD audience to discuss the challenges and opportunities that data mining researchers face in the era of generative AI. The goal is to explore advanced data strategies and directions that could drive the next round of innovations in building advanced AI models. The event aims to provide a platform for academic researchers and industry practitioners to discuss the latest advances and open problems in the field of generative AI. The program of the Generative AI Day will be shared and updated at https://bigmodel.ai/aigc-kdd24/.


Dates and Location

  • Date: Sunday 25 August 2024 – Thursday 29 August 2024
  • Location: Barcelona, Spain


Yuxiao Dong, Tsinghua University
Jie Tang, Tsinghua University
Michalis Vazirgiannis, Ecole Polytechnique

Contact: LLMDay2024@kdd.org