id: felbur-crosslingusitic-2022
author: felbur
title: felbur-crosslingusitic-2022
date: 2022
pages: 14
extension: .pdf
mime: application/pdf
words: 8223
sentence: 359
flesch: 53
summary: While much effort is currently being invested in attempts to develop tools that will segment Chinese texts into words (some of them specifically designed to segment Buddhist materials, e.g. Wang, 2020), these tools remain unusable to us, since the underlying models themselves are often not openly released, and the training data used to create them is often not available. We then define Tibetan texts parallel to the Chinese sūtras as the ‘target.’
cache: cache/felbur-crosslingusitic-2022.pdf
txt: txt/felbur-crosslingusitic-2022.txt