Wednesday, February 29, 2012

Condition C in Triqui

As I mentioned in a previous post, I am collecting condition C violations in Copala Triqui text.  Here is another one, from a folktale published by Hollenbach.


Choctaw dictionary in FLEx and orthographic choices

My friend Jack is working with Oklahoma Choctaws to produce an audio dictionary that will use their preferred spelling system.   I'm working this morning on a quick review of the entries.  Here is a screenshot:


One big regret of my own work with Choctaw is that I didn't use the traditional orthography enough, which alienates some Choctaw communities who are attached to it.

In my defense I have to say -- the traditional orthography is terrible.  Choctaw has a vowel length distinction not present in the traditional orthography, and the orthography also uses two different symbols for each of the three phonemic vowels /a, i, o/, which are written according to how the 19th century missionaries perceived the vowel quality.  Instances of /a/ that sounded more like schwa were written with the symbol <ʋ>.

But something I understand now that I did not understand as well when I was a grad student is that orthographies don't have to be good to be useful to native speakers.  My advice to someone starting to work with a community would be this: if there is any orthography that speakers can use, you should stick with it.

Monday, February 27, 2012

Two indefinites in Triqui

A little puzzle that I am trying to work out.  The word for 'one' is either yo'o or 'o in Copala Triqui, but I do not understand the difference.  B. Hollenbach's dictionary lists one as a variant of the other.

It seems to me that possibly the two forms have a different syntactic distribution.  Yo'o appears in contexts that mostly seem like existentials, especially introducing a new noun phrase.  The following three examples seem typical:

1 Cor 5:1
1 Cor 10:16

Acts 5:1
'o has a somewhat different distribution, which I will explore in a subsequent post.

Saturday, February 25, 2012

More on negative fronting


In a previous post here, I talked about the negative fronting construction.   The following two (sexist!) examples show that in the texts a positive contrast with the particle ro'  often seems to be paired with a corresponding negative contrast:



Another interesting note -- previously, I was aware that negative fronting involved the following pattern:

Nuveé {NP|PP} [S...]   "It is not {NP|PP} that ...."


I can now add CP to the list of fronted elements, due to examples like the following:


1 Cor 4:14


1 Cor 7:35


So the revised constructional template for negative fronting is


Nuveé {NP|PP|CP} [S...]   "It is not {NP|PP|CP} that ...."

Thursday, February 23, 2012

Boosting corpus size for endangered languages

I've written several previous posts about methods for wrangling Bible translation material into Translation Editor and then into FLEx.  For an endangered language where there is an existing Bible translation, the amount of corpus material that can be added is fairly astonishing.

I ran the stats on the Copala Triqui project today, and I think the numbers tell the story.   With about 1/2 of the New Testament 'wrangled' into FLEx, the current project has around 100,000 words in it.


If we don't want to look at the translated material, we can filter it out by the Choose Texts menu:


(BTW, I made some mistake in genre type for the first few bits of Matthew that I experimented with, so they don't show up in the Bible genre.  I could easily uncheck these as well, and at some point I will figure out what I have done wrong :-) )

If you filter out all the Bible material, the corpus is about 9,000 words:


A brief note -- Triqui is written in two orthographies (a practical one and a phonetic one).  The stats keep track of how many words are in each orthography as well as the total overall.

Wednesday, February 22, 2012

More options for using Bible translations for language study

I have been intrigued for a while now by the possibility of using available Bible translations as a way to study languages.  For many Native American languages, Bible translations are the longest available texts, and they have lots of valuable information about the lexicon and syntax.  (Of course we have to consider their origin as translations.)

I've been exploring the websites that make some of these materials available, and recently investigated an option that was new to me -- a piece of Bible study software called the Word.  Various testaments are available in a format compatible with the Word, as in this screenshot:


I downloaded the free software the Word, and installed it, along with New Testaments in Copala Triqui and Itunyoso Triqui.  The software lets you put any number of translations side by side:


One potential advantage of using this program as an intermediary between online Bible materials and FLEx -- the Word has fairly sophisticated ways to manipulate the Bible text.  For example, it can display or omit the verse numbers, paragraph breaks, footnotes, chapter numbers, etc.

For adding text to Translation Editor (and then looking at it in FLEx), the best option seems to be verse numbers, but no paragraph breaks.  On the File | Copy Verses menu, this is option #10:


After you click the Copy button, you can paste the text in this format anywhere you like.  You can get the English translation into the same format and paste it into the Back Translation column of Translation Editor:

Then there is only one more step before you are ready for analysis in FLEx.  The numbers in both the Triqui and the translation need to be in Verse Number format, and we want a paragraph break before each verse number in the Triqui:
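The paragraph-break part of this step could also be done before pasting, with a few lines of Python. This is only a rough sketch: it assumes the copied text is plain text in which each verse starts with a bare number followed by a space, and the function name and the sample string (placeholder letters, not real Triqui) are made up for illustration. A number that occurs inside a verse would also trigger a break, so the result needs checking by eye, and the Verse Number formatting itself still has to be applied in Translation Editor.

```python
import re

def break_before_verse_numbers(text: str) -> str:
    """Put each verse on its own line, assuming verses start with a bare number."""
    # Collapse existing whitespace so the result does not depend on how the text was pasted.
    flat = " ".join(text.split())
    # Replace the whitespace before every standalone number with a newline.
    # The first verse number has no whitespace before it, so it is left alone.
    return re.sub(r"\s+(?=\d+\s)", "\n", flat)

# Placeholder text standing in for three pasted verses (not real Triqui).
sample = "1 aaa bbb 2 ccc ddd 3 eee"
print(break_before_verse_numbers(sample))
```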

Once in this form, you can look at the text in FLEx (via Tools | Back translation | Use Interlinear Text Tool) and begin ordinary analysis.

Tuesday, February 21, 2012

Itunyoso Triqui text on-line

I was pleased to see that there is some Itunyoso Triqui available online at this site, a bit of the translation of the book of James from the New Testament.  (I see that the whole Itunyoso Triqui testament is available as a PDF from some other sites, but they don't work for me, at least tonight.)  Because the same thing has been translated into Copala Triqui, I was curious to look at the similarities and differences.

Here is a bit of James 2 in Itunyoso and Copala Triqui:
Copala Triqui

Itunyoso Triqui


In a future post, I'll try to work out some sentence by sentence comparison.

Thinking in Triqui, part 3

In this previous post and this previous post, I discussed the unusual thinking construction in Copala Triqui.  The examples for today show the combination of a question and the thinking construction.   In Triqui, ga is the usual question particle for wh-questions, and na' is for yes-no questions.  Note that in these examples, the choice of question particle is sensitive to the main question -- not the 'do you think' part of the sentence.

That seems to me to imply that the constituency has the rá NP portion as a parenthetical, and the question particle goes with the main sentence, roughly

[abc [rá NP] question particle]


where abc represents the clause that is being thought about.




Sunday, February 19, 2012

Causative constructions in Copala Triqui

The causative construction shows an unusual word order in Copala Triqui.  The normal order for these constructions is

[caused event]  'yaj SUBJ

Although the normal word order for Triqui is VSO, I do not think that causative constructions are ever in VSO order in natural text.  (The caused event clause, however, is in normal VSO order.)

The following passage shows a characteristic example.  (Note also that the aspect of the cause verb 'yaj typically matches the aspect of the verb(s) in the caused event.)


Another similar example:


Far less common is the reversed pattern:

SUBJ 'yaj [caused event]

The examples I have seem to involve a causer subject that is either fronted for some other reason, such as negation, or missing due to relativization:



Another open question is the type of constituent the caused event may be.  In the examples above, it seems like a simple VSO clause.   But it is also possible for the caused event to contain some fronted material:


Here the caused event is [there:is:no illness [hit me]], where the noun chi'ii 'illness' has undergone the typical fronting to preverbal position found with negative constituents.  (Maybe that is some sort of Focus Phrase?)


Saturday, February 18, 2012

Condition C violations in Triqui

The following example is nice for showing multiple condition C violations in Copala Triqui.   Similar examples are found in many languages, but I wanted to have some clear examples of this in Triqui text.



Negative construction in Copala Triqui

The contrastive negative construction in Copala Triqui has the form

Nuveé {NP|PP} [S...]   "It is not {NP|PP} that ...."


For contrasting the object of a preposition, both preposition stranding and pied-piping of the PP are possible.  This example shows pied-piping:


Stranded prepositions in Copala Triqui, continued

In this previous post and this previous post, I talked about stranded material in Copala Triqui.

In the texts, the richest natural source of stranding is found in relative clauses.  Consider the following complicated passage, which has two strandings of rihaan 'to':


Another context for naturally occurring strandings is the contrastive negation construction Nuveé NP V... 'It is not NP that...'


The last clause here is literally 'It is not God who they give that meat to'.

Thursday, February 16, 2012

SayMore to FLEx

I've been experimenting today with SayMore, which is a way of organizing language documentation materials.

I think I'm probably typical of many people doing this work in that my materials are scattered across several locations -- on the hard drive at my office, on my laptop, in Dropbox -- and they are in several different formats (sound files, text files, some images, etc.).  One of the potential advantages of SayMore would be using it to organize links to all these things and keep the metadata about speakers, permissions, dates, formats, in a single place.  This screen, for example, has information about the various contributors and their roles.  There is a central repository for these, where you can keep background info (age, native language, contact information, permission form).  In this screen, Román Vidal López is the speaker in this audio clip.




I've also been interested in the new Annotation feature (present in the Alpha release), which provides a way to do first-pass transcription of audio and then export the annotations into a format that FLEx can read, for further analysis and correction.

To test this out today, I got an audio portion of the Address to the Triqui people and added it to SayMore.  (I did something wrong because the file comes out with the name NewEvent...)


When you have this in a format that you like, you can export it:


It ends up in a format called FLEx Interlinear XML.

You can now import this into a FLEx project.  Here is what the screen looks like after File | Import | FLExText Interlinear.


You browse to wherever the export file was located.  One little thing is that the default extension for an import file is .flextext, so at first you won't see your file.  You need to click on the FileType button and select .xml.


After you do that, (if everything is working right), you should see your transcription as an Interlinear Text in FLEx:


Overall, I was fairly pleased with the whole process.  There were one or two little glitches that may improve in the later releases:

a.) Before you do the annotation, you need to do a segmentation of the audio into little chunks.  You can then listen to these chunks at any speed you like, and they repeat until you've filled in the Transcription line to your satisfaction.   However -- if you start doing the transcription and discover you've accidentally put the segment boundary in a bad place (perhaps you cut off the last vowel), it does not seem like you can ever edit the segmentation.  At least, it wasn't obvious to me how to do this.

b.) To transcribe the Triqui properly, I needed to be able to use the underscore diacritic (for low register tone).  I have a Keyman keyboard that allows me to do this properly in FLEx, but it did not seem that it would work in SayMore.  Possibly I missed some dialogue that would allow me to set the font to a Unicode font that supports this diacritic?  Or something that allows me to pick my keyboard?

Still, these are fairly minor problems, and I think I could easily see using this for my next text transcription session.

Compared to ELAN, Praat, or Transcriber, the SayMore annotation tool is less elaborate.  (In my view ELAN is way too elaborate and unwieldy for my needs.  Praat is a tool more suited to a phonetician's needs, and Transcriber works fine but its output does not import into FLEx in any straightforward way.)

Since my working style relies heavily on FLEx, anything that imports smoothly into that program has a huge advantage.  It is possible that if I were more focused on phonological, intonational, or gestural properties of the texts, it would be worth spending the time on something like Praat or ELAN.  But since my interest is more keyed to morphosyntax and lexicon, I like a fairly light transcription tool that will help me do first-pass transcription, and SayMore looks promising for that.

Wednesday, February 15, 2012

More examples of the Triqui thinking construction

In this post and  this post, I talked about the strange syntax of the 'thinking construction'  in Copala Triqui, where some long clause [abc] is followed by rá NP, and the effect is to give NP's expectations, thoughts, hopes, desires about the situation described in [abc].

The following example is another strange one, and like a previous example, it might involve condition D violations if we try to interpret rá NP as the superordinate clause:


Here it is very hard to know if the material within the scope of rá includes the 'Herod thought...' part of the sentence.  Maybe not?

I don't think the aspect of the verbs tells us anything, since both completive and potential verbs can be in the scope.

Tuesday, February 14, 2012

Triqui topics and the reversed copula construction

Another environment for the ro' topic marker in Copala Triqui is what I'm calling the reversed copula construction.


Regular copulas have the normal order

 Predicate copula Subject

 as in examples like the following:




The reversed construction instead has the order

  Subject copula Predicate


and the ro' topic marker often shows up on the subject here, as in the following examples:




Triqui topic marking

In a previous post, I talked about uses of the Copala Triqui morpheme ro' .  One frequent context is when two phrases are being compared with each other. More searching through the corpus shows that the words ase vaa 'like' often precede and daj or danj 'thus, likewise'  is often between the two phrases.
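A search of this kind can be roughed out with a regular expression over a plain-text export of the corpus. The sketch below is only an illustration under several assumptions: the sentences are available as plain strings in the practical orthography, the glottal stop is typed as a straight apostrophe, and the three elements occur in the order ase vaa ... ro' ... daj/danj within one sentence. The function name is hypothetical, and the sample strings use placeholder tokens (W1, W2, ...) rather than real Triqui sentences.

```python
import re

# Assumed pattern: "ase vaa", later "ro'", later "daj" or "danj", all in one sentence.
COMPARISON = re.compile(r"\base vaa\b.*?\bro'.*?\b(?:daj|danj)\b", re.IGNORECASE)

def find_comparisons(sentences):
    """Return the sentences that contain the assumed comparison pattern."""
    return [s for s in sentences if COMPARISON.search(s)]

# Placeholder sentences: W1, W2, ... stand in for real words.
sentences = [
    "ase vaa W1 W2 ro' danj W3 W4",   # all three elements, in order
    "W1 W2 ro' W3",                   # ro' alone
    "ase vaa W1 W2",                  # ase vaa alone
    "Ase vaa W1 ro' daj W2",          # sentence-initial capital
]
for hit in find_comparisons(sentences):
    print(hit)
```

A pattern like this will overgenerate, so each hit still has to be checked in context before it counts as an instance of the comparison use.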

The following example shows two clauses that are compared: "just as [Christ rose] ro', so [he will descend to the world]":


(We also see here that ro'  doesn't have to be attached to noun phrases.)

The next example compares what he (Cornelius) and his family did:


(We see here also that ro' is not necessarily preverbal.)

One more example of this from our Address to the Triqui People text:



A rather different use of ro' seems to be involved in a structure which is NP-ro' [V pronoun ...], where the pronoun is resumptive to the NP:


Here the element with ro'  "young people" precedes the verb "listen"  and is resumed with the pronoun nij so'.

(This example is also rather neat with its initial cleft structure; the pre-cleft element is the object of 'listen (to)', which is embedded under 'have the obligation'.)

But the following two adjacent sentences show that a phrase with ro'  does not necessarily require a resumptive pronoun after the verb.  It is hard for me to say why one of these sentences has a resumptive and the other does not -- a speculation is that most of the phrases with ro'  are definite and 'each one' is not very definite.  So the daa 'o 'o determiner doesn't usually show up on phrases with ro' + resumptive.  However 'each one of the authorities together with all of their councils' is considerably more definite.  So perhaps definiteness accounts for the difference between sentences (23) and (24) below?


Monday, February 13, 2012

Topic "gappiness" and group cohesion in English

In some of our work for the IARPA program, we looked at the relationship between the structure of topic chains and group cohesiveness (as judged by external evaluators).  We found a nice correlation that works like this:
a.) identify the noun phrases that have the most references (pronominal, synonyms, repetitions) and call them the meso-topics of the discourse.
b.) count the number of "gaps" in the chain of references to the meso-topic, where a gap is a turn without a mention.
c.) plot the rank of the meso-topic chain against the gappiness of the topic.
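The three steps above can be sketched in a few lines of Python. This is not the code used in the project, just a minimal illustration under stated assumptions: reference resolution has already been done, so each turn is a list of topic labels (one label per reference); gaps are counted only between the first and last mention of a topic; and the function names and the toy dialogue are invented for the example. The plotting step is left out; the loop at the end prints the (rank, gappiness) pairs that would be plotted.

```python
from collections import Counter

def meso_topics(turns, top_n=3):
    """Return the top_n topics ranked by total number of references."""
    counts = Counter(topic for turn in turns for topic in turn)
    return [topic for topic, _ in counts.most_common(top_n)]

def gappiness(turns, topic):
    """Count turns without a mention between the first and last mention of topic."""
    hits = [i for i, turn in enumerate(turns) if topic in turn]
    if not hits:
        return 0
    span = hits[-1] - hits[0] + 1   # number of turns the chain stretches over
    return span - len(hits)         # turns inside that stretch with no mention

# Invented toy dialogue: each inner list holds the topics referred to in one turn.
turns = [
    ["budget", "budget"],
    ["budget", "schedule"],
    ["schedule"],
    ["budget"],
    [],
    ["budget", "venue"],
]
for rank, topic in enumerate(meso_topics(turns), start=1):
    print(rank, topic, gappiness(turns, topic))
```

Counting gaps only inside the span of a chain is one possible reading of step (b); counting every turn without a mention across the whole conversation would be another, and would penalize topics that are introduced late.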

We found a neat correlation like the following:


More on special discourse function constructions in Triqui

Copala Triqui has a lot of constructions that involve discourse-prominent constituents, but I do not have a very good understanding of what they all do and how they differ from each other.  I have recently been looking at the morpheme ro', which seems to be a kind of topic marker.  The constituent marked with ro' often precedes a cleft with me se.


In the first example, we have a sequence of two sentences.  The first tells the addressee (a man named Cornelius) to go find Simon Peter.  (In this context Simon Peter must be new, since the addressee has never met him.)  The second sentence refers to Simon Peter as so' 'he', followed by the ro' topic marker.  So ro' seems to refer to a previously introduced topic, perhaps highlighting it as a continuing topic for the following discourse.


The second example shows another possible use of ro', to contrast members of a set with each other.  So in this passage, the discussion is about two representatives --  "One of them is.... and the other one is ....".   In this context ro'  appears on both contrasted noun phrases.


Saturday, February 11, 2012

Fronted phrases with maan in Triqui

I'm looking tonight at phrases with maan, which is rather inadequately glossed 'only' in my current lexicon.

The phrase preceded by maan is always (I think) clause initial and has some special discourse function -- maybe focus.  (I need to work out how it is different from the ro' topics, clefts with me se, etc...)

This topical phrase is sometimes a time adverb (either orá or tio, two words borrowed from Spanish, that mean 'time'):



It can also be a noun phrase.  In the following example, the cactus fruit with spines contrasts with the nice cactus fruits without spines which the rabbit threw in the previous episodes.


(This example also has preposition stranding in the fronted phrase.)