C-rule: Difference between revisions
From UNLwiki
imported>Martins No edit summary
Revision as of 12:35, 23 March 2010
Compounding, or composition, is the word-formation process that creates compounds by combining lexemes.
Expressing compounds in the UNLarium
In the UNLarium framework, compounds are treated as ordinary simple words, except for discontinuous multi-word expressions and multi-word expressions with infixation (such as "give in" or "take into account"). In these cases, the lemma differs from the base form (BF), and the compound-formation process is expected to be defined through specific rules:
- coffee house (multi-word expression without infixation: "coffee house">"coffee houses"): BF=lemma="coffee house"
- give in (multi-word expression with infixation: "give in">"gave in"): BF="give" ≠ lemma="give in"
- behind one's back (discontinuous multi-word expression without infixation: "behind my back", "behind his back", etc.): BF="behind" ≠ lemma="behind <person>'s back"
- take into account (discontinuous multi-word LRU with infixation: "take it into account", "took that into account"): BF="take" ≠ lemma="take into account"
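The four cases above can be sketched as a small Python check. This is only an illustration, not part of the UNLarium itself: the `Entry` class and the `needs_compound_rule` method are hypothetical names introduced here, and the sketch assumes that a compound rule is needed exactly when the lemma differs from the base form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical dictionary entry holding a lemma and its base form (BF)."""
    lemma: str
    base_form: str

    def needs_compound_rule(self) -> bool:
        # Assumption: a compound rule is required only when lemma != BF.
        return self.lemma != self.base_form


# The four examples from the list above.
entries = [
    Entry("coffee house", "coffee house"),
    Entry("give in", "give"),
    Entry("behind <person>'s back", "behind"),
    Entry("take into account", "take"),
]

for entry in entries:
    status = "rule needed" if entry.needs_compound_rule() else "simple word"
    print(f"{entry.lemma}: {status}")
```

Under this assumption, only "coffee house" is handled as an ordinary simple word; the other three entries would each require a compound rule.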
Examples
| Lemma | BF | Compound | Description |
|---|---|---|---|
| give in | give | VH([in]) | "in" is to be added to the base form as part of the head of the verb (VH) |
| take into account | take | VA("into account") | "into account" is to be added to the base form as an adjunct to the verb (VA) |
| throw <person> to the lions | throw | VA("to the lions"), VC(NP) | "to the lions" is to be added to the base form as an adjunct to the verb (VA) and a noun phrase (NP) is to be added as a complement to the verb (VC) |
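To make the notation in the Compound column more concrete, the following Python sketch splits a rule into its individual operations. It is an illustration under stated assumptions, not the UNLarium's own parser: `Operation` and `parse_compound` are hypothetical names, and the regular expression only covers the three operand shapes that appear in the table (a lemma in `[ ]`, a string in `""`, and a bare category such as `NP`).

```python
import re
from typing import List, NamedTuple


class Operation(NamedTuple):
    """One operation of a compound rule (hypothetical structure)."""
    role: str   # e.g. VH, VA, VC
    kind: str   # "lemma", "string" or "category"
    value: str  # the operand without its delimiters


# ROLE( [lemma] | "string" | CATEGORY )
_OPERATION = re.compile(r'(\w+)\(\s*(?:\[([^\]]+)\]|"([^"]+)"|(\w+))\s*\)')


def parse_compound(rule: str) -> List[Operation]:
    """Split a compound rule such as 'VA("to the lions"), VC(NP)' into operations."""
    operations = []
    for role, lemma, string, category in _OPERATION.findall(rule):
        if lemma:
            operations.append(Operation(role, "lemma", lemma))
        elif string:
            operations.append(Operation(role, "string", string))
        else:
            operations.append(Operation(role, "category", category))
    return operations


for rule in ['VH([in])', 'VA("into account")', 'VA("to the lions"), VC(NP)']:
    print(rule, "->", parse_compound(rule))
```

Keeping the operand kind separate from its value preserves the distinction between lemmas and strings that the notation encodes through its delimiters.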
Observation
- Phrasal verbs
- Particles of phrasal verbs must be represented as part of the head if they are non-separable, and as adjuncts otherwise:
- give in = VH([in]); ("give in something" but not "give something in")
- give back = VA([back]); ("give back something" or "give something back")
- Strings and lemmas
- In the compound-formation process, the UNLarium distinguishes between strings (written between double quotes, "") and lemmas (written between square brackets, [ ]). The difference lies in their dictionary status: lemmas, but not strings, are expected to be defined as dictionary entries:
- VA("into account"); (add the string "into account" as a verbal adjunct, take > take into account)
- VC([love]); (add the lemma "love" as a verbal complement, such as in make > make love)
In the examples above, "into account" is unlikely to exist as a dictionary entry of its own, whereas "love" is probably already defined.
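The practical consequence of this distinction can be sketched in Python as a check for lemmas that still lack a dictionary entry. The function `missing_lemmas` and the toy dictionary are hypothetical and serve only as an illustration; the sketch assumes that anything inside `[ ]` must be a dictionary entry and that quoted strings carry no such requirement.

```python
import re
from typing import List, Set

# Matches lemma operands, i.e. anything written between square brackets.
_LEMMA = re.compile(r'\[([^\]]+)\]')


def missing_lemmas(rule: str, dictionary: Set[str]) -> List[str]:
    """Return the lemmas referenced in a rule that are not dictionary entries."""
    return [lemma for lemma in _LEMMA.findall(rule) if lemma not in dictionary]


# Toy dictionary, assumed for illustration only.
dictionary = {"take", "make", "love", "give"}

print(missing_lemmas('VA("into account")', dictionary))  # strings are never checked
print(missing_lemmas('VC([love])', dictionary))          # "love" is already an entry
print(missing_lemmas('VH([in])', dictionary))            # "in" is not in the toy dictionary
```

In this sketch, a rule that uses a string never triggers a dictionary lookup, while a rule that uses a lemma reports any entry that would still have to be created.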
Syntax
Compounds may be explicitly expressed by S-rules, a formalism for describing the syntactic structure of phrases.