SPANISH AND PORTUGUESE
An important part of the project to
dramatically expand and improve the Spanish and
Portuguese corpora will be to
correct the part-of-speech tags and lemmatization for the new corpora. For
example, you might see (on successive pages) the Spanish words gorrión
(noun), electrica (adj), hip-hop (noun), or
reduciéndose (verb) (there would be similar data for Portuguese).
You would indicate that gorrión is in fact a noun; that
electrica should have the lemma eléctrico;
that hip-hop is definitely an Anglicism; and that reduciéndose
is actually a form of the lemma reducir. (By the way, if you help with this project, you'll be using an interface in English,
although you can enter your notes or comments in Spanish or Portuguese.)
If you think that you might be interested
in helping to "crowdsource" this data, please enter your name below.
(Or let us know if you think that your (advanced) students might be
interested; it might make a nice project for them as part of a class
that you teach.) You can do as much or as little as you want (even just
5-10 minutes a week would help).
We'll contact you in a few months, but there is absolutely no obligation
for you to actually help with the project at that time, if you'd rather
not. If you are able to help, though, we'll definitely include you in the
list of contributors when the corpora are completed, so that others can
see your involvement. Thanks!