dc.creator | Asmussen, Jørg |
dc.creator | Halskov, Jakob |
dc.date.accessioned | 2018-09-24T14:34:44Z |
dc.date.available | 2018-09-24T14:34:44Z |
dc.date.issued | 2011 |
dc.identifier.uri | http://hdl.handle.net/20.500.12115/36 |
dc.description | The DK-CLARIN Reference Corpus of General Danish was collected as part of the DK-CLARIN project (WP2.1), 2008-2011. All texts are in TEI P5 XML format (TEIP5DKCLARIN-format), with tokenisation, ePOS-tagging, sentence and paragraph segmentation, and lemmatisation. The corpus comprises 45,113,245 words. |
dc.language.iso | dan |
dc.publisher | Society for Danish Language and Literature, DSL |
dc.rights | CLARIN-ACA-NC |
dc.rights.uri | https://kitwiki.csc.fi/twiki/bin/view/FinCLARIN/ClarinEulaAca?ID=1&AFFIL=EDU&BY=1&NC=1&NORED=1 |
dc.rights.label | ACA |
dc.source.uri | https://korpus.dsl.dk/clarin/ |
dc.subject | LGP - Language for General Purposes |
dc.title | DK-CLARIN Reference Corpus of General Danish |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
has.files | yes |
branding | CLARIN-DK |
contact.person | Jørg; Asmussen; korpus@dsl.dk; Society for Danish Language and Literature, DSL |
sponsor | n/a; n/a; DK-CLARIN; nationalFunds; |
size.info | 45113245; words |
files.size | 1559785675 |
files.count | 11 |
annotationInfo.annotationType | tokenization |
annotationInfo.annotationType | sentence and paragraph segmentation |
annotationInfo.annotationType | POS-tagging |
annotationInfo.annotationType | lemmatization |