The next [Séminaire IRILL] session is on Thursday, November 12, 2021 at 14:00, online on Jitsi: https://meet.jit.si/IrillTalkMiningDevChats
Speaker: Preetha Chatterjee
Title: Mining Information from Developer Chat Conversations Towards Building Software Maintenance Tools
Software developers are increasingly having conversations about software development via online chat services. In particular, developers are turning to public chat communities hosted on services such as Slack, IRC, and Gitter to discuss specific programming languages or technologies. The emerging trend of increased participation in developer chats motivated us to investigate and develop techniques to extract useful knowledge available in developers’ chat communication channels.
In a preliminary study, we found several new opportunities in mining chat conversations. Chats contain valuable information, such as descriptions of code snippets and specific APIs, good programming practices, and causes of common errors and exceptions. We also observed that developers use chats to share opinions on best practices, APIs, and tools, whereas Q&A forums such as Stack Overflow explicitly forbid opinions on their sites. The availability of this information in chats may lead to new mining opportunities for software tools.
Unlike many other sources of software development-related communication, information in chat forums is shared in an unstructured, informal, and asynchronous manner. There is no predefined delineation of conversations in chats; multiple questions are discussed and answered in parallel by different participants. Therefore, a technique is required to separate, or disentangle, the conversations for analysis by researchers or automatic mining tools. Understanding the quality of the information in the mining source is also essential for building effective data-driven software tools, yet chat platforms currently lack a formal mechanism for quality assessment. Thus, in this talk I will focus on automatic techniques for chat disentanglement, quality assessment, and extraction of opinions from developer chats.