Density dichotomy in random words
DOI:
https://doi.org/10.11575/cdm.v13i1.62788
Keywords:
Density, Free words, Limit theory
Abstract
Word $W$ is said to encounter word $V$ provided there is a homomorphism $\phi$ mapping letters to nonempty words so that $\phi(V)$ is a substring of $W$. For example, taking $\phi$ such that $\phi(h)=c$ and $\phi(u)=ien$, we see that ``science'' encounters ``huh'' since $cienc=\phi(huh)$. The density of $V$ in $W$, $\delta(V,W)$, is the proportion of substrings of $W$ that are homomorphic images of $V$. So the density of ``huh'' in ``science'' is $2/{8 \choose 2}$. A word is doubled if every letter that appears in the word appears at least twice.
The dichotomy: Let $V$ be a word over any alphabet, $\Sigma$ a finite alphabet with at least 2 letters, and $W_n \in \Sigma^n$ chosen uniformly at random. Word $V$ is doubled if and only if $\mathbb{E}(\delta(V,W_n)) \rightarrow 0$ as $n \rightarrow \infty$.
We further explore convergence for nondoubled words and concentration of the limit distribution for doubled words around its mean.
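The definitions in the abstract can be checked by brute force on small words. The following is a minimal sketch, not the authors' method: the function names `is_image`, `density`, and `is_doubled` are hypothetical, and substrings are assumed to be counted by position, which matches the denominator ${8 \choose 2}$ used for the 7-letter word ``science''.

```python
from fractions import Fraction
from math import comb


def is_image(pattern, text, assignment=None):
    """Return True if text == phi(pattern) for some homomorphism phi
    that maps each letter of pattern to a nonempty word."""
    if assignment is None:
        assignment = {}
    if not pattern:
        return not text  # both exhausted at the same time
    letter, rest = pattern[0], pattern[1:]
    if letter in assignment:
        block = assignment[letter]
        return text.startswith(block) and is_image(rest, text[len(block):], assignment)
    # Letter not yet assigned: try every nonempty prefix as its image.
    for end in range(1, len(text) + 1):
        extended = {**assignment, letter: text[:end]}
        if is_image(rest, text[end:], extended):
            return True
    return False


def density(pattern, word):
    """Proportion of substrings of word (counted by position) that are
    homomorphic images of pattern."""
    n = len(word)
    hits = sum(
        is_image(pattern, word[i:j])
        for i in range(n)
        for j in range(i + 1, n + 1)
    )
    return Fraction(hits, comb(n + 1, 2))


def is_doubled(word):
    """A word is doubled if every letter in it appears at least twice."""
    return all(word.count(letter) >= 2 for letter in set(word))


print(density("huh", "science"))  # 1/14, i.e. 2/28
print(is_doubled("huh"))          # False: 'u' appears only once
```

The two substrings counted for ``huh'' in ``science'' are ``cienc'' and ``ence''. The search is exponential in the worst case, so this sketch is only suitable for short words.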