December 21–22, 2024, Sydney, Australia
Sourabh Khot1, Venkata Duvvuri1, Heejae Roh1, and Anish Mangipudi2, 1College of Professional Studies, Northeastern University, 2Langley High School, McLean, Virginia
Email will remain a vital marketing tool in 2024. Email marketing involves sending commercial emails to a targeted audience, and it currently produces a significant ROI (return on investment) in the marketing sector [1]. This research paper presents a comprehensive study on predicting email open rates, focusing specifically on the influence of subject lines. The open-rate prediction algorithm SLk relies on the semantic features of subject lines, utilizing a seed dataset of 4,500 anonymized subject lines from diverse business sectors. The algorithm integrates data preprocessing, tokenization, and a custom-built repository of power words and negative words to enhance prediction accuracy. In our experiments, the actual open-rate margin of error tracked closely to the allowed input error, giving confidence that SLk can be used directionally to optimize subject-line performance without prior history. The findings suggest that precise manipulation of subject-line features can significantly improve the efficacy of email campaigns.
Email Marketing, Open Rate Prediction, Subject Line Analysis, Machine Learning, Natural Language Processing
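The feature pipeline the abstract describes, tokenization plus a custom repository of power words and negative words, can be sketched roughly as follows. The word lists and feature set here are illustrative placeholders, not the paper's actual repository:

```python
import re

# Hypothetical word lists standing in for the paper's custom repository.
POWER_WORDS = {"free", "exclusive", "new", "save"}
NEGATIVE_WORDS = {"spam", "junk", "urgent"}

def subject_line_features(subject: str) -> dict:
    """Tokenize a subject line and count semantic features of the kind
    SLk is described as using (word lists are illustrative only)."""
    tokens = re.findall(r"[a-z']+", subject.lower())
    return {
        "length": len(tokens),
        "power_words": sum(t in POWER_WORDS for t in tokens),
        "negative_words": sum(t in NEGATIVE_WORDS for t in tokens),
        "has_question": "?" in subject,
    }

features = subject_line_features("Exclusive offer: save 20% today")
```

A downstream model would consume such feature vectors to score candidate subject lines before sending.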
Thanh Vu, Sara Keretna, Richi Nayak and Thiru, Telstra Group Limited and Queensland University of Technology, Australia
This study investigates the practical deployment of AI-based Text-to-SQL (T2S) models on a real-world telecommunication dataset, aiming to enhance employee productivity. Our experiment addresses the unique challenges in telecommunication datasets not explored in previous works using annotated datasets. Leveraging advanced retrieval augmented generative (RAG) models like Vanna AI and Llamaindex, we benchmark their performance on synthetic datasets such as SPIDER and BIRD with different LLM backbones and subsequently compare the best-performing model to human performance on our proprietary dataset. We propose the Productivity Gain Index (PGI) to quantify the dual aspects of productivity improvement—time efficiency and accuracy—by comparing AI performance with human analysts across various SQL tasks. Results indicate significant productivity gains, with AI-based tools demonstrating superior query processing and accuracy performance. This prominent gap signals the potential of AI-based tool applications in the actual company domain for improved productivity.
Text-to-SQL, Large Language Models, Productivity Gain Index, Retrieval-Augmented Generation, Artificial Intelligence Evaluation.
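One plausible form of a Productivity Gain Index combining the two aspects the abstract names, time efficiency and accuracy, is the product of a time ratio and an accuracy ratio; the paper's exact definition may differ, so treat this as a sketch:

```python
def productivity_gain_index(human_time, ai_time, human_acc, ai_acc):
    """Illustrative PGI: product of a time-efficiency ratio and an
    accuracy ratio (the paper's exact formula may differ)."""
    time_gain = human_time / ai_time      # > 1 when the AI is faster
    accuracy_gain = ai_acc / human_acc    # > 1 when the AI is more accurate
    return time_gain * accuracy_gain

# Hypothetical numbers: a human takes 600 s at 90% accuracy,
# the AI tool takes 30 s at 85% accuracy.
pgi = productivity_gain_index(human_time=600, ai_time=30,
                              human_acc=0.90, ai_acc=0.85)
```

Under this form, an index above 1 indicates a net productivity gain even when the AI's raw accuracy is slightly lower than the human's.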
Peng Zhang1 and Pingqing Liu2, 1Faculty of Science, Kunming University of Science and Technology, Kunming, China, 2School of Management and Economics, Kunming University of Science and Technology, Kunming, China
In data analytics, privacy preservation is receiving increasing attention, as privacy concerns result in the formation of "data silos". Federated learning can accomplish integrated data analysis while protecting data privacy, and it is currently an effective way to break the "data silo" dilemma. In this paper, we build a federated learning framework based on differential privacy. First, for each local dataset, the summary statistics of the parameter estimates and the maximum L2 norm of the coefficient vector of the polynomial function used to approximate the individual log-likelihood function are computed and transmitted to the trust center. Second, at the trust center, Gaussian noise is added to the coefficients of the polynomial function that approximates the full log-likelihood function, and the private parameter estimates are obtained from the noisy objective function; the resulting estimator satisfies (ε, δ)-DP. In addition, theoretical guarantees are provided for both the privacy and the statistical utility of the proposed method. Finally, we verify the utility of the method using numerical simulations and apply it to a study of salary impact factors.
Differential Privacy, Federated Learning, Gauss Function Mechanism, Summary Statistics.
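The trust-center step adds Gaussian noise calibrated to a sensitivity bound. A minimal sketch using the standard analytic Gaussian-mechanism calibration (valid for ε < 1; the paper's exact calibration for polynomial coefficients may differ):

```python
import math
import random

def gaussian_mechanism(value, sensitivity, epsilon, delta,
                       rng=random.Random(0)):
    """Release value + Gaussian noise with scale calibrated to the L2
    sensitivity, so the release satisfies (epsilon, delta)-DP under the
    classical bound sigma = Delta * sqrt(2 ln(1.25/delta)) / epsilon."""
    sigma = sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / epsilon
    return value + rng.gauss(0.0, sigma)
```

In the framework described above, such noise would be applied to each coefficient of the aggregated polynomial approximation before the private estimates are computed.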
Kenta Kawai, Wu Yuxiao, Yutaka Matubara, and Hiroaki Takada, Graduate School of Informatics, Nagoya University, Aichi 464-8601
The booming of IoT devices has attracted significant interest in data integration platforms that enable seamless utilization and control of sensor data across various applications. However, most existing platforms adopt a centralized structure, aggregating data on specific companies' servers. This centralization raises privacy concerns and imposes limitations on data sharing with third parties. To address these challenges, this paper proposes a decentralized demand-supply matching system for IoT device data distribution using blockchain technology. The paper details the requirements for the entire matching system, covering both users and IoT devices, and introduces a system concept alongside a practical implementation. Evaluation experiments conducted on a prototype system demonstrate the feasibility and effectiveness of the proposed approach.
Blockchain, Data Marketplace, Demand-Supply Matching, IoT Data.
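The core of such a marketplace is matching data demands against device supplies. A simplified off-chain sketch of one possible matching rule (first fit by data type and price ceiling; the paper's on-chain logic is likely richer):

```python
def match_demands(supplies, demands):
    """Greedily match each demand to the first unclaimed supply of the
    same data type whose price is within the demand's ceiling.
    Illustrative stand-in for the paper's matching logic."""
    matches, taken = [], set()
    for d in demands:
        for i, s in enumerate(supplies):
            if i in taken:
                continue
            if s["type"] == d["type"] and s["price"] <= d["max_price"]:
                matches.append((d["id"], s["id"]))
                taken.add(i)
                break
    return matches

supplies = [{"id": "s1", "type": "temperature", "price": 5},
            {"id": "s2", "type": "humidity", "price": 3}]
demands = [{"id": "d1", "type": "temperature", "max_price": 10},
           {"id": "d2", "type": "humidity", "max_price": 2}]
pairs = match_demands(supplies, demands)
```

On a blockchain, the same rule would run inside a smart contract so that no single company's server mediates the match.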
E. Aili1, 2, H. Yilahun1, 2, S. Imam1, 3, and A. Hamdulla1, 2, 1School of Computer Science and Technology, Xinjiang University, Urumqi 830017, China, 2Xinjiang Key Laboratory of Multilingual Information Technology, Urumqi 830017, China, 3School of National Security Studies, Xinjiang University, Urumqi 830017, China
Knowledge Graph Completion (KGC) is a popular topic in knowledge graph construction and related applications, aiming to complete the structure of a knowledge graph by predicting missing entities or relations and mining unknown facts. In the KGC task, graph neural network (GNN)-based methods have achieved remarkable results due to their advantage of effectively capturing complex relations among entities and generating more accurate and richer entity representations by aggregating information from neighboring nodes. However, these methods mainly focus on the representation of entities, while the representation of relations is obtained using simple dimensional transformations or initial embeddings. This treatment ignores the diversity and complex semantics of relations and restricts the efficiency of the model in utilizing relational information during reasoning. In this work, we propose the relational representation augmented graph attention network (RRA-GAT), which effectively identifies and weights neighboring relations that actually contribute to the target relation by filtering out irrelevant information through an attention function based on the information and spatial domains. Furthermore, we capture complex patterns and features in the relational embedding by means of a feed-forward network consisting of a series of linear transformations and nonlinear activation functions. Experiments demonstrate the strong performance of RRA-GAT on the link prediction task on the standard datasets FB15k-237 and WN18RR (e.g., improving the MRR metric on WN18RR by 7.8%).
Knowledge Graph Completion, Knowledge Graph Embedding, Graph Neural Networks.
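The attention step the abstract describes, weighting neighboring relations by their contribution to a target relation, can be sketched as softmax attention over relation embeddings. This is a simplified stand-in, not RRA-GAT's actual attention function:

```python
import math

def attend_relations(target, neighbors):
    """Aggregate neighboring relation embeddings with softmax attention,
    scoring each neighbor by its dot product with the target relation.
    Simplified sketch; RRA-GAT's attention function differs in detail."""
    scores = [sum(t * n for t, n in zip(target, emb)) for emb in neighbors]
    m = max(scores)                       # stabilize the softmax
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    weights = [w / z for w in weights]
    dim = len(target)
    return [sum(w * emb[i] for w, emb in zip(weights, neighbors))
            for i in range(dim)]
```

Neighbors that align poorly with the target relation receive near-zero weight, which is the filtering effect the abstract attributes to the attention function.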
Qiuyan Ji1, 2, H. Yilahun1, 2, S. Imam1, 3, and A. Hamdulla1, 2, 1School of Computer Science and Technology, Xinjiang University, Urumqi 830017, China, 2Xinjiang Key Laboratory of Multilingual Information Technology, Urumqi 830017, China, 3School of National Security Studies, Xinjiang University, Urumqi 830017, China
Named entity recognition (NER) in the military domain is crucial for information extraction and knowledge graph construction. However, military NER faces challenges such as fuzzy entity boundaries and a lack of public corpora. These problems make existing NER methods ineffective when dealing with short texts and social media content. To address these challenges, we construct a military news dataset containing 11,892 Chinese military news sentences, with a total of 69,569 named entities annotated. We also propose a Robust Dilated-W squared NER (RDWS) model based on adversarial training and deep multi-granularity dilated convolution. The model first uses BERT-base-Chinese to extract character-level features and then applies the fast gradient method (FGM) for adversarial training. Contextual features are captured by a BiLSTM layer, and these features are further processed using deep multi-granularity dilated convolution layers to better capture complex inter-lexical interactions. Experimental results show that the proposed method performs well on multiple datasets.
Named Entity Recognition, Adversarial Training, Chinese Military News, Convolution.
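The FGM step the abstract mentions perturbs embeddings along the L2-normalized gradient direction, r_adv = ε · g / ‖g‖₂. A minimal sketch of that perturbation (the training loop and BERT features are omitted):

```python
import math

def fgm_perturb(embedding, grad, eps=1.0):
    """Fast gradient method (FGM): add an adversarial perturbation
    r_adv = eps * g / ||g||_2 to an embedding vector."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0:
        return list(embedding)  # no gradient signal, leave unchanged
    return [e + eps * g / norm for e, g in zip(embedding, grad)]
```

During training, the model would compute the loss once more on the perturbed embeddings and backpropagate both losses, which is what makes the learned representations robust.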