Thursday, March 28, 2013

New instances of progressive aspect in Colonial Valley Zapotec

I was very pleased to find some more instances of the progressive aspect marker in Colonial Valley Zapotec texts this morning, since it is not generally known that this morpheme was present in the language at this stage.  Smith Stark (2008), for example, doesn't include any mention of it in an otherwise comprehensive overview of the Colonial Valley Zapotec aspect morphology.

These new examples involve the verbs 'do' and 'be sitting'.  It's interesting that with the first verb, the form in the text is

ca-g-oni

where the /g/ looks like a reflex of the potential aspect.  Thom Smith Stark argued in a 2004 paper that the progressive evolved from a construction that involved a verb of position followed by a verb in the potential.


Saturday, March 23, 2013

Central Zapotec languages

The tree below is the current version of something that I am working on.  It's intended to show major branches of Central Zapotec (according to the classification of Smith Stark 2007), and it also shows (in brackets and italics) where we currently have Colonial Valley Zapotec documents in our FLEx corpus.

I know that there are many other Colonial Valley Zapotec documents out there, so the tree will get 'bushier' over time, but I found this a useful graphic view of where the documents come from in terms of distribution.


Friday, March 8, 2013

The function of hua- in Colonial Valley Zapotec

In Colonial Valley Zapotec texts, we often see an aspect marker that is spelled something like hue or hua before the verb.  The function of this marker is a bit uncertain, and in the modern Valley Zapotec languages that I am most familiar with, it is no longer in use.

The following passages, from an 1823 Catechism, however, seem to show that hua is not strictly speaking an aspect marker, but another kind of morpheme that precedes the aspect prefix /r-/.  





In modern Valley Zapotec /r-/ is habitual, but in Colonial Valley Zapotec, /r-/ had a wider range of uses.  Perhaps at this point in the 19th century, the combination huari was the normal way to indicate repeated actions?

Contrast the following where the /r-/ appears without the hua.  Here we seem to have more of an eternal 'gnomic' style reading of the clauses:


Monday, March 4, 2013

A verb of position as a diachronic source for continuous aspect in Zapotec?

This morning I was reading a bit of

Bybee, Joan, Revere Perkins, and William Pagliuca (1994). The Evolution of Grammar: Tense, Aspect, and Modality in the Languages of the World. Chicago: The University of Chicago Press.

where the authors discuss the evolution of progressive aspect in languages around the world.  In many languages, it originates in some construction that involves a locative expression, either with a verb of position or something like 'be in X'.

This got me to wondering about the Continuous aspect in modern Valley Zapotec languages.   This is not a well-attested aspect in Colonial Valley Zapotec, but I find some examples of incipient grammaticalization, all in the construction 'I say', where the verb nni is used with a preceding ca- prefix that looks very much like the continuous aspect ca- found in modern Valley Zapotec.

A possible connection to a verb of position is the word cáá 'hang, be located (in a high place)'.  The lexical entry for this word in San Dionisio Ocotepec Zapotec in the current draft of my own dictionary looks like the following:


In the Colonial Valley Zapotec examples that I have, only 'I speak' is used with a preceding ca:




It is probably also significant that both of these examples are accompanied by the adverb anna 'now', as would be appropriate for a continuous aspect marker.

But I do want to emphasize that continuous aspect marking is incipient in Colonial Valley Zapotec, and I haven't identified any other examples in the texts.  In modern Valley Zapotec languages, you can use the continuous for a wide range of verbs; in CVZ only the verb 'say' is attested so far.

Sunday, March 3, 2013

An expression of judgment with chiba in Colonial Zapotec

The following is my best guess at the analysis of a passage from Feria's Doctrina in Colonial Valley Zapotec.  What is of some interest is trying to work out the right analysis of the word chiba here.  It probably is related to the word 'put', which is a causative of 'sit':


The word that follows is xihui, which means 'sin'.  So the whole phrase, we know from the translation, means 'condemned by Pontius Pilate'.  But what is the literal Zapotec? Chiba xihui seems like 'place sin', so is 'condemn' = 'place sin', with =ni serving as the subject of this verb?

Still, it's hard to understand the syntax of the part that follows this... justicia ni pettogo xihui xitichani Iuez nila Poncio Pylato.   From looking at other examples of 'judge'  it seems that the verb  ttogo is always followed by a form of ticha 'word', so that this is a quasi-idiomatic expression.  (The possessor of the word seems to be the person who is judged.)  "Judge a person's sins" seems to involve putting the word  xihui between these two parts.

Saturday, March 2, 2013

More examples of regular expressions in FLEx

This regular expression helps me separate out the right allomorph of the habitual prefix of the Colonial Valley Zapotec verb.  The habitual aspect comes in a few allomorphic variants, which are usually written <te, to, ti> in the orthography.  I want to strip this from the citation form and put it in a separate field called Cordova habitual form.

The procedure is to copy the whole citation form to the Cordova habitual form, then use a regular expression in bulk edit to remove everything except the prefix.   The following search and replace comes fairly close to what I want:


This bulk replace operation looks for the beginning of the record (^), then any number of characters (.*), then t followed by one character (\w).   I put the sequence t plus one character inside the capturing parentheses, because I want to be able to refer to it in the Replace operation on the next line.  After the parenthesis, I have any number of characters (.*) and the end of the record ($).

This is replaced by the captured sequence, that is, t plus whatever letter followed it ($1), and a hyphen.

So if the first line finds   blah blah ti+capaya blah blah  it will be replaced by ti-.

This mostly works -- except that the way the original Cordova entries came to me, there are sometimes several Zapotec verbs listed together in a single entry, possibly with differential potential prefixes.  So I have to give the results a visual inspection to make sure nothing has gone wrong before hitting the Apply button in Bulk Edit.  If there is a record where this will give the wrong result, I can just uncheck it, and edit it manually.
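
For readers who want to try this outside FLEx, here is a minimal sketch of the same search and replace in Python.  It assumes the Find pattern was something like ^.*(t\w).*$, which I have reconstructed from the description above; the function name is made up for illustration.  Python writes the capture as \1 where the FLEx Replace line uses $1.

```python
import re

# Assumed reconstruction of the Find pattern described above:
# start of record, anything, then "t + one word character" captured,
# then anything up to the end of the record.
PREFIX_ONLY = re.compile(r"^.*(t\w).*$")

def habitual_prefix(record):
    """Reduce a record to the captured 't + character' plus a hyphen.

    Records that do not match the pattern are returned unchanged.
    """
    return PREFIX_ONLY.sub(r"\1-", record)

print(habitual_prefix("blah blah ti+capaya blah blah"))  # ti-

# "to+xyz" is a made-up second item, not a real Zapotec form.
# The first .* is greedy, so the LAST "t + character" in the record wins.
print(habitual_prefix("ti+capaya, to+xyz"))  # to-
```

The second call shows one way an entry with several verbs can go wrong in this Python version: the prefix of the last item is kept, not the first.  That is the kind of record that needs to be unchecked and edited by hand.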


The counterpart to this regular expression is the one that takes the citation form, strips off the habitual prefix and returns just the portion minus that prefix.   The search and replace that will do that is the following:



The Find expression looks for a t followed by one word-forming character and a literal (+) at the beginning of a record (^).  It starts to capture everything from here to the end of the record ($) and stores it in the variable $1.   When this search and replace is applied, the effect is to take a word like

ti+capaya

and replace it with capaya.
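
The same operation can be sketched in Python, assuming the Find pattern was something like ^t\w\+(.*)$ (again reconstructed from the description, with a made-up function name).  The \+ is a literal plus sign, and \1 in Python corresponds to $1 in FLEx.

```python
import re

# Assumed reconstruction: at the start of the record, "t", one word
# character and a literal "+", then capture everything up to the end.
STRIP_PREFIX = re.compile(r"^t\w\+(.*)$")

def strip_habitual(citation):
    """Return the citation form without its habitual prefix and '+'.

    Forms that do not start with 't + character + plus sign' are
    returned unchanged.
    """
    return STRIP_PREFIX.sub(r"\1", citation)

print(strip_habitual("ti+capaya"))  # capaya
```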

Friday, March 1, 2013

More sophisticated regular expression searching in the Cordova

I am not a computational linguist, by any means, but I have slowly been learning enough about regular expressions to be able to do some useful things with FLEx.  One thing I just learned is how to use regular expressions in search and replace operations in FLEx.  (The FLEx help menus are not really very explicit on this.)

In a FLEx search and replace function — in Bulk Edit, for example — each thing that is enclosed in parentheses will return some set of results, called a capture.  You can refer to this capture with the variable $.  So the material in the first set of parentheses is $1.  The material in the second capture is $2, and so on.   Here is an example of how I used this information in the Colonial Valley Zapotec database.


Córdova normally cites a verb in the 1st person habitual.  Depending on the allomorph of the verb, the habitual of the verb might be /ti, to, te/.  The verb root will usually be four to eight letters long.  And the first person will end in /a/.

So if the form cited is tichapa, I would like it to be segmented ti+chap-a.

The "Find what" on the first line sets up a first capture group, which is the prefix, made up of t plus either e, i, or o. (Elements between square brackets are options.)  Because this whole first unit is between parentheses, it is capture group one, which I can refer to as $1 in the "Replace With" line below.

I want to replace it with the same thing, followed by a + to show the boundary.

The next capture group is a group of letters (shown by \w — meaning any wordforming character), and I have shown the number as between 4 and 8.  (On second thought, perhaps the lower number should have been three…)

Since this is the second capture group, I can refer to it by $2, in the "Replace With" line, and this time I replace it with the same thing, followed by a hyphen.
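
Here is a rough Python sketch of the segmentation just described.  The exact pattern is an assumption: I have written it as ^(t[eio])(\w{4,8})a$, with the anchors and the final a added so that the first person ending is split off; the function name and the test form tixyza are made up for illustration.  Python uses \1 and \2 where FLEx uses $1 and $2.

```python
import re

# Assumed pattern: group 1 is the prefix (t + e, i or o), group 2 is a
# root of 4 to 8 word characters, followed by the first person ending a.
SEGMENT = re.compile(r"^(t[eio])(\w{4,8})a$")

# Variant with the lower bound set to three, as suggested above.
SEGMENT_MIN3 = re.compile(r"^(t[eio])(\w{3,8})a$")

def segment(form, pattern=SEGMENT):
    """Insert '+' after the prefix and '-' before the final a.

    Forms that do not match the pattern are returned unchanged.
    """
    return pattern.sub(r"\1+\2-a", form)

print(segment("tichapa"))               # ti+chap-a

# "tixyza" is a made-up form with a three-letter root.
print(segment("tixyza"))                # tixyza (no match with {4,8})
print(segment("tixyza", SEGMENT_MIN3))  # ti+xyz-a
```

The last two lines show why the lower number matters: with a minimum of four, a form whose root has only three letters is left unsegmented.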

This is a first attempt at using the regular expressions with FLEx, but I think I can already see how they are going to make it possible to accomplish more sophisticated data manipulation as we try to get the Córdova diccionario into a format that we can understand.