A major debate in genome biology is whether genomic activities are indicative of function or are simply background noise. Transcription illustrates this debate: long noncoding RNAs have been interpreted both as a vast network of functional transcripts and as an outcome of background noise. Similarly, alternative splicing of introns has been interpreted both as a way to functionally diversify the proteome and as an outcome of background noise. To discriminate between these opposing viewpoints, we are determining transcriptional output baselines using random DNA introduced into human cells. Here we used a human cell line with ~40% of the Arabidopsis thaliana (At) genome integrated into one chromosome. The At DNA is transcribed, but at sites largely uncorrelated with At genes, indicating mostly random transcription initiation. Transcription in At DNA is sparser than in the native human genome, particularly when considering only polyA transcripts, suggesting that many native transcripts result from selection. In contrast, most splicing occurs at canonical At introns, although some occurs at splice sites that appear by chance. Surprisingly, we found that the length distribution of ‘chance’ introns closely follows that of human introns but is completely different from the At intron distribution, suggesting that intron splicing is determined by both intron length and splice sites. Together, our results suggest that human transcriptional activities are a mix of background noise and function. Our work also shows that the At-human hybrid line provides rich out-of-distribution data, which may help AI models learn true genomic rules. To this end, we are using machine learning to determine whether noise and function can be accurately distinguished.