### An Excerpt from Reviewed Papers

**Complexity of Language and LLMs**

- Xin Du and Kumiko Tanaka-Ishii. Correlation Dimension of Autoregressive Large Language Models. [_NeurIPS 2025_](https://arxiv.org/abs/2510.21258)
- Xin Du and Kumiko Tanaka-Ishii. Correlation Dimension of Natural Language in A Statistical Manifold. [_Physical Review Research. 2024_](https://journals.aps.org/prresearch/abstract/10.1103/PhysRevResearch.6.L022028)
- Xin Du and Kumiko Tanaka-Ishii. FIRE: Semantic Field of Words Represented as Nonlinear Functions. [_NeurIPS 2022_](https://proceedings.neurips.cc/paper_files/paper/2022/hash/f08223bc8d177df6807811c32f5acfed-Abstract-Conference.html)

**LLM Retrieval & Clustering**

- Xin Du and Kumiko Tanaka-Ishii. Information-Theoretic Generative Clustering of Documents. [_AAAI 2025_](https://arxiv.org/abs/2412.13534)
- Xin Du, Lixin Xiu, and Kumiko Tanaka-Ishii. Bottleneck-Minimal Indexing for Generative Document Retrieval. [_ICML 2024 (Oral)_](https://dl.acm.org/doi/abs/10.5555/3692070.3692542) [\[arXiv\]](https://arxiv.org/abs/2405.10974)

**Financial-Language Complex Systems**

- Xin Du, Kai Moriyama, Kumiko Tanaka-Ishii. Co-Training Realized Volatility Prediction Model with Neural Distributional Transformation. [_ICAIF 2023_](https://arxiv.org/abs/2310.14536)
- Xin Du and Kumiko Tanaka-Ishii. Stock portfolio selection balancing variance and tail risk via stock vector representation acquired from price data and texts. [_Knowledge-Based Systems_](https://www.sciencedirect.com/science/article/abs/pii/S0950705122004397)
- Xin Du and Kumiko Tanaka-Ishii. Stock embeddings acquired from news articles and price history, and an application to portfolio optimization. [_ACL 2020_](https://aclanthology.org/2020.acl-main.307/)