Heterogeneous graphs organize data as nodes and edges of multiple types, and have been widely used in various graph-centric applications.
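As a minimal illustration (not from the snippet's source), a heterogeneous graph can be stored as typed node sets plus edge lists keyed by a (source type, relation, destination type) triple; the class and example names below are hypothetical:

```python
from collections import defaultdict

# Minimal sketch of a heterogeneous graph: nodes grouped by type,
# edges grouped by (src_type, relation, dst_type) triples.
class HeteroGraph:
    def __init__(self):
        self.nodes = defaultdict(set)    # node_type -> {node_id, ...}
        self.edges = defaultdict(list)   # (src_type, rel, dst_type) -> [(src, dst), ...]

    def add_node(self, node_type, node_id):
        self.nodes[node_type].add(node_id)

    def add_edge(self, src_type, rel, dst_type, src, dst):
        self.add_node(src_type, src)
        self.add_node(dst_type, dst)
        self.edges[(src_type, rel, dst_type)].append((src, dst))

# Example: an author-writes-paper graph with two node types.
g = HeteroGraph()
g.add_edge("author", "writes", "paper", "a1", "p1")
print(g.edges[("author", "writes", "paper")])  # [('a1', 'p1')]
```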
If GenAI is going to go mainstream and not just be a bubble that helps prop up the global economy for a couple of years, AI ...
In this work, we propose HC-SMoE (Hierarchical Clustering for Sparsely activated Mixture of Experts), a task-agnostic expert merging framework that reduces SMoE model parameters without retraining.
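The snippet does not give HC-SMoE's actual procedure; as a hedged sketch of the general idea only, one could cluster a layer's experts hierarchically by weight similarity and average each cluster into a single merged expert, shrinking the parameter count without retraining. The function name and the cosine/average-linkage choices below are illustrative assumptions, not the paper's specification:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def merge_experts(expert_weights: np.ndarray, n_merged: int) -> np.ndarray:
    """Hierarchically cluster experts and average each cluster.

    expert_weights: (num_experts, dim) flattened per-expert parameters.
    Returns an array of shape (<= n_merged, dim) of merged experts.
    """
    # Agglomerative clustering on expert weight vectors.
    Z = linkage(expert_weights, method="average", metric="cosine")
    labels = fcluster(Z, t=n_merged, criterion="maxclust")
    # Replace each cluster of experts with their parameter average.
    return np.stack([
        expert_weights[labels == c].mean(axis=0)
        for c in np.unique(labels)
    ])

# Example: merge 8 experts of one SMoE layer down to 3.
experts = np.random.randn(8, 1024)
print(merge_experts(experts, 3).shape)  # (3, 1024)
```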
Is this training speed normal? Am I making some big mistakes? I am experiencing slow pretraining performance while training a Mixture-of-Experts (MoE) LLM using Megatron-LM on an HPC cluster with ...
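Before suspecting Megatron-LM itself, a quick sanity check is to compute per-GPU token throughput and compare it against published MoE numbers for similar hardware. A generic sketch follows; the step time, batch, sequence, and GPU figures are made up for illustration:

```python
# Generic throughput check (not Megatron-LM-specific): tokens processed
# per second per GPU, the usual unit for comparing pretraining speed.
def tokens_per_second_per_gpu(step_time_s, global_batch, seq_len, n_gpus):
    return global_batch * seq_len / (step_time_s * n_gpus)

# Hypothetical numbers for illustration only.
print(tokens_per_second_per_gpu(step_time_s=4.2,
                                global_batch=256,
                                seq_len=4096,
                                n_gpus=64))  # ~3900 tokens/s/GPU
```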
Abstract: Radar signal sorting is one of the crucial techniques in radar reconnaissance. However, as the electromagnetic environment becomes increasingly complex and the density of radar pulses surges, the ...
As large language models (LLMs) scale in size and capability, the choice of pretraining data remains a critical determinant of downstream performance. Most LLMs are trained on large, web-scale ...
Ubiquitylation is involved in various physiological processes, such as signaling and ...
Abstract: Clustering of diabetic multimorbidity data from EHRs is challenging due to patient heterogeneity, high-dimensional variables, sensitivity to parameter settings, and high computational ...
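The abstract does not state the paper's method; for orientation only, a common baseline for clustering high-dimensional records is standardize, reduce dimensionality, then cluster. Everything below (the feature matrix, component count, and cluster count) is a made-up illustration, not the paper's pipeline:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.pipeline import make_pipeline

# Hypothetical high-dimensional patient feature matrix: 500 patients x 200 variables.
X = np.random.rand(500, 200)

# Baseline pipeline: standardize features, project to 20 components, cluster.
pipe = make_pipeline(
    StandardScaler(),
    PCA(n_components=20),
    KMeans(n_clusters=5, n_init=10),
)
labels = pipe.fit_predict(X)
print(labels[:10])  # cluster assignment for the first 10 patients
```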