A Database of Person Marking in South-Central Trans-Himalayan

Sandra Auderset; Hunter L. Brown; Jonathan Reich; Pascal Gerber; Muhammad Zakaria; Linda Konnerth

doi:10.5334/johd.505

Journal of Open Humanities Data

Volume 12 (2026): Issue 1

Open Access

Apr 2026

Figures & Tables

(1)	a.	ka-se
		1sg-go
		‘I am going’
	b.	se-mak-ung
		go-neg-1sg
		‘I am not going’	Ranglong

(2)	a.	kaː-tà-tì-nʉ́
		1:p-touch-2-nfut
		‘you (SG) touch me’	Anal Naga
	b.	m̩̀-m̥ú-náː-tʃɘ̀
		inv-see-ipfv:tr-2
		‘you (SG) saw me’	Monsang

(3)	a.	a-t-déé
		2-inv-see
		‘you (SG) see me’
	b.	m-t-déé
		1-inv-see
		‘you (SG) see me’	Lamkang

Table 1

Languages included in the first release of PMST, with identifiers, group affiliation, collaborators and sources. Languages are ordered by group. Within Northwestern, languages are ordered by how closely related they are assumed to be following the impressionistic subgroupings in Konnerth (2022).

LANGUAGE	GLOTTOCODE	ISOCODE¹	GROUP	COLLABORATORS & SOURCES
Ranglong	rang1271	(rnl)	Northwestern	Hunter Brown, Jessi Tara
Chiru	chir1283	cdf	Northwestern	Mechek Sampar Awan; Awan (2019)
Anal Naga	anal1239	anm	Northwestern	Pavel Ozerov; Thotson Langhu; Ozerov (2019)
Monsang	mons1234	nmh	Northwestern	Linda Konnerth, Koninglee Wanglar
Lamkang	lamk1238	lmk	Northwestern	Shobhana Chelliah, Rex Rengpu Khullar; Chelliah et al. (2019)
Hmar	hmar1241	hmr	Northwestern	Marina Infimate
Pangkhua	pank1249	pkh	Central	Mohammed Zahid Akter; Akter (2024)
Hyow	khya1239	(csh)	Southern	Muhammad Zakaria; Zakaria (in press)

Figure 1

Location and group affiliation of the sample languages. The inset shows the location of the detailed map within South(east) Asia.

Table 2

Overview of tags used to annotate variation.

TAG	CATEGORY OF TAG	DESCRIPTION
default	paradigm_tag	unmarked form (most general, most frequent, etc.) or form that has no other tag
pragm_marked	paradigm_tag	pragmatically conditioned variant
hort	paradigm_tag	form is a hortative
emph	paradigm_tag	form is from an emphatic paradigm
unspec_var	paradigm_tag	variant of (yet) unspecified distribution
generic_nf	tense_tag	generic non-future form
non_generic_nf	tense_tag	non-generic non-future form
past	tense_tag	past tense form
optional_plural	overabundance_tag	form that does not contain a marker for plural
optional_third	overabundance_tag	form that does not contain a marker for third person
optional_future	overabundance_tag	form that does not contain a marker for future tense
variable_order	order_tag	form contains morphemes whose relative order can vary
special_stem	morphanalysis_tag	form has a special stem form in particular cells of a paradigm
tone_alt_stem	morphanalysis_tag	form exhibits a tone alternation triggered by the stem
morphophon	morphanalysis_tag	form exhibits morphophonological process(es)
copy_v	phonanalysis_tag	form has a copy vowel in at least one morpheme
dialect_var	variants_tag	form from other dialect
sociolect_var	variants_tag	form from other sociolect
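
As a minimal sketch of how the tag inventory in Table 2 could be used when processing annotated forms, the snippet below transcribes the tag-to-category mapping from the table and groups a list of tags by category. The function name `group_tags` and the assumption that tags arrive as a plain list of strings are illustrative only; the source does not state how tags are stored in the datasets.

```python
# Tag-to-category mapping transcribed from Table 2.
TAG_CATEGORIES = {
    "default": "paradigm_tag",
    "pragm_marked": "paradigm_tag",
    "hort": "paradigm_tag",
    "emph": "paradigm_tag",
    "unspec_var": "paradigm_tag",
    "generic_nf": "tense_tag",
    "non_generic_nf": "tense_tag",
    "past": "tense_tag",
    "optional_plural": "overabundance_tag",
    "optional_third": "overabundance_tag",
    "optional_future": "overabundance_tag",
    "variable_order": "order_tag",
    "special_stem": "morphanalysis_tag",
    "tone_alt_stem": "morphanalysis_tag",
    "morphophon": "morphanalysis_tag",
    "copy_v": "phonanalysis_tag",
    "dialect_var": "variants_tag",
    "sociolect_var": "variants_tag",
}


def group_tags(tags):
    """Group tags by their Table 2 category (hypothetical helper).

    Raises ValueError for any tag that is not listed in Table 2.
    """
    grouped = {}
    for tag in tags:
        if tag not in TAG_CATEGORIES:
            raise ValueError(f"unknown tag: {tag!r}")
        grouped.setdefault(TAG_CATEGORIES[tag], []).append(tag)
    return grouped


# Example input: an assumed list of tags attached to a single form.
example = group_tags(["default", "past", "copy_v"])
```

Rejecting unknown tags rather than silently skipping them is one possible design choice; it makes typos in the annotation visible early.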

Figure 2

Schematic overview of workflow and the connection between the working versions of the datasets on GitHub and the published versions on Zenodo. A, B, C represent individual languages.

Table 3

Dataset description.

Repository name	Zenodo
Object name	PMST-Database
Repository location	All PMST datasets can be found at https://zenodo.org/communities/pmst/. For DOIs of individual datasets, please consult Table 4.
Format names	csv, json, md, yml
Creation dates	2023-12-27 to 2025-12-10
Publication date	The datasets pertaining to the first release of PMST were published between 2025-12-01 and 2025-12-10.
License	CC-BY-SA 4.0

Table 4

Languages (=datasets) included in the first release of PMST, with the number of forms, the number of scenarios,⁶ and their DOI.

LANGUAGE	FORMS	SCENARIOS	ZENODO DOI
Anal Naga	311	184	10.5281/zenodo.17881855
Chiru	674	165	10.5281/zenodo.17779437
Hmar	267	163	10.5281/zenodo.17779055
Hyow	917	352	10.5281/zenodo.17788529
Lamkang	298	158	10.5281/zenodo.17780049
Monsang	336	163	10.5281/zenodo.17865713
Pangkhua	149	145	10.5281/zenodo.17866617
Ranglong	255	142	10.5281/zenodo.17778036
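
The figures in Table 4 can be summarized with a few lines of code. The sketch below transcribes the table, computes totals and a forms-per-scenario ratio for each language, and builds resolvable links by prefixing each DOI with the standard `https://doi.org/` resolver. The variable names are illustrative, and the ratio is only a rough indicator of how much variation is recorded per scenario, not a measure defined in the source.

```python
# (forms, scenarios, Zenodo DOI) per language, transcribed from Table 4.
DATASETS = {
    "Anal Naga": (311, 184, "10.5281/zenodo.17881855"),
    "Chiru": (674, 165, "10.5281/zenodo.17779437"),
    "Hmar": (267, 163, "10.5281/zenodo.17779055"),
    "Hyow": (917, 352, "10.5281/zenodo.17788529"),
    "Lamkang": (298, 158, "10.5281/zenodo.17780049"),
    "Monsang": (336, 163, "10.5281/zenodo.17865713"),
    "Pangkhua": (149, 145, "10.5281/zenodo.17866617"),
    "Ranglong": (255, 142, "10.5281/zenodo.17778036"),
}

# Totals across the first release.
total_forms = sum(forms for forms, _, _ in DATASETS.values())
total_scenarios = sum(scenarios for _, scenarios, _ in DATASETS.values())

# Average number of recorded forms per scenario, by language.
forms_per_scenario = {
    language: forms / scenarios
    for language, (forms, scenarios, _) in DATASETS.items()
}

# Resolvable DOI links via the doi.org resolver.
doi_urls = {
    language: f"https://doi.org/{doi}"
    for language, (_, _, doi) in DATASETS.items()
}

for language in sorted(DATASETS):
    print(f"{language}: {forms_per_scenario[language]:.2f} forms per scenario")
```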

Figure 3

Overview of database modules and their relations. Two-way arrows indicate direct links between files, e.g., the forms file can be joined with the cells file via the cell/cell identifier which appears in both files. One-way arrows indicate subset relations, e.g., each phoneme in the phon_form columns appears in the sound file separately.
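
The join between the forms file and the cells file described in this caption could look roughly like the following sketch, which uses only the Python standard library. The column names (`form`, `cell`, `cell_id`, `person`, `polarity`), the cell identifiers `c1` and `c2`, and the inline CSV snippets are assumptions made for illustration; the actual file layout should be checked against the published datasets. The two forms are the Ranglong forms from example (1).

```python
import csv
import io

# Hypothetical excerpt of a forms file: each form points to a cell.
FORMS_CSV = """form,cell
ka-se,c1
se-mak-ung,c2
"""

# Hypothetical excerpt of a cells file: each cell has an identifier
# and a set of feature values.
CELLS_CSV = """cell_id,person,polarity
c1,1sg,affirmative
c2,1sg,negative
"""


def join_forms_with_cells(forms_text, cells_text):
    """Attach the cell description to every form via the shared identifier."""
    cells = {
        row["cell_id"]: row
        for row in csv.DictReader(io.StringIO(cells_text))
    }
    joined = []
    for form in csv.DictReader(io.StringIO(forms_text)):
        # A missing identifier raises KeyError, which exposes broken links.
        cell = cells[form["cell"]]
        joined.append({**form, **cell})
    return joined


joined = join_forms_with_cells(FORMS_CSV, CELLS_CSV)
for row in joined:
    print(row["form"], row["person"], row["polarity"])
```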

Figure 4

Distributional profile of morphs in the Ranglong dataset. The top panel shows the distribution across tense-aspect and polarity values. The middle panel shows the distribution across person configurations. The bottom panel shows the distribution across number categories. Stripes are used for elements appearing before the verb stem and circles for those appearing after.

Figure 5

Length of verb forms (minus the lexical stem) in phonemes of transitive affirmative scenarios aggregated per scenario and language. The dot indicates the average; the whiskers show the range. Languages are arranged by subgroup and relatedness (cf. Table 1).
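
The aggregation behind a plot of this kind (one average and one range per scenario and language) can be sketched as follows. The language labels, scenario labels, and length values below are invented placeholders, not data from PMST, and the function name `summarize` is hypothetical; the sketch assumes that the stem has already been removed and that lengths have been counted in phonemes.

```python
from collections import defaultdict
from statistics import mean

# Placeholder records: (language, scenario, length in phonemes without stem).
records = [
    ("LangA", "1>2", 3),
    ("LangA", "1>2", 5),
    ("LangA", "2>1", 4),
    ("LangB", "1>2", 2),
]


def summarize(records):
    """Return mean, minimum, and maximum length per (language, scenario)."""
    groups = defaultdict(list)
    for language, scenario, length in records:
        groups[(language, scenario)].append(length)
    return {
        key: {"mean": mean(values), "min": min(values), "max": max(values)}
        for key, values in groups.items()
    }


summary = summarize(records)
for (language, scenario), stats in sorted(summary.items()):
    print(language, scenario, stats)
```

Here the mean corresponds to the dot and the minimum and maximum to the ends of the whiskers.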

1	first person
2	second person
3	third person
A	A (actor) argument of a transitive predicate
INV	inverse
IPFV	imperfective
NEG	negation
NFUT	non-future
P	P (undergoer) argument of a transitive predicate
S	sole argument of an intransitive predicate
SAP	speech-act participant (first and second person)
SC	South-Central (branch of Trans-Himalayan)
SG	singular
TR	transitive

DOI: https://doi.org/10.5334/johd.505 | Journal eISSN: 2059-481X

Language: English

Submitted on: Dec 19, 2025

Accepted on: Feb 16, 2026

Published on: Apr 7, 2026

Published by: Ubiquity Press

In partnership with: Paradigm Publishing Services

Publication frequency: 1 issue per year

Keywords: person marking, paradigms, South-Central Trans-Himalayan, microtypology

© 2026 Sandra Auderset, Hunter L. Brown, Jonathan Reich, Pascal Gerber, Muhammad Zakaria, Linda Konnerth, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.
