
 
dc.creator Olsen, Sussi
dc.creator Braasch, Anna
dc.creator Hansen, Dorte Haltrup
dc.creator Halskov, Jakob
dc.date.accessioned 2018-04-24T14:40:28Z
dc.date.available 2018-04-24T14:40:28Z
dc.date.issued 2011
dc.identifier.uri http://hdl.handle.net/20.500.12115/9
dc.description The texts in the Construction domain come from Statens Byggeforskningsinstitut, Erhvervs- og byggestyrelsen and Murerfagets Oplysningsråd and were collected in the DK-CLARIN project, WP2.2, 2008-2011. The corpus consists of 577,392 words in 35 files. Communicative setting (number of files): expert->expert (18), expert->advanced (6), expert->basic (11). All texts are in TEI P5 XML format (TEIP5DKCLARIN format), with tokenisation, sentence and paragraph segmentation, POS tagging, lemmatisation and termhood annotation placed in separate text-external span groups. "DK-CLARIN LSP Corpus - Construction domain" is part of the Danish DK-CLARIN LSP corpus, which consists of seven sub-corpora from the following subject domains: Agriculture, Construction, Economics, Environment, Health, IT and Nanotechnology.
dc.language.iso dan
dc.publisher Centre for Language Technology, NorS, University of Copenhagen
dc.publisher The Danish Language Council
dc.rights CLARIN-ACA-NC
dc.rights.uri https://kitwiki.csc.fi/twiki/bin/view/FinCLARIN/ClarinEulaAca?ID=1&AFFIL=EDU&BY=1&NC=1&NORED=1
dc.rights.label ACA
dc.subject Construction
dc.title DK-CLARIN LSP Corpus - Construction domain
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding CLARIN-DK
contact.person Administrator; CLARIN-DK; info@clarin.dk; Centre for Language Technology, NorS, University of Copenhagen
sponsor n/a; n/a; DK-CLARIN; nationalFunds;
size.info 35; files
size.info 577,392; tokens
files.size 24852076
files.count 11
annotationInfo.annotationType tokenizer
annotationInfo.annotationType lemmatization
annotationInfo.annotationType POS-tagging
annotationInfo.annotationType sentence and paragraph segmentation
annotationInfo.annotationType termhood scoring


 Files in this item (23.7 MB in total)

This item is for Academic Use and licensed under CLARIN-ACA-NC (Attribution Required, Noncommercial).
Name                                                 Size       Format           Description    MD5
erhvervsOgByggestyrelsen_1.zip                       3.72 MB    application/zip  Corpus 1       dea50e92118686c9dad9e20238d8adaf
erhvervsOgByggestyrelsen_2.zip                       2.02 MB    application/zip  Corpus 2       3767db815f1b24a2b15b2186ed7eff79
Muro.zip                                             3.62 MB    application/zip  Corpus 3       6a40bd1d3ce17983d4b2d7fe3881a01b
SBI.zip                                              13.3 MB    application/zip  Corpus 4       d48c1d676cc4d1a4990d36832f0c0e7c
DKCLARIN_fagsprogligt_korpus_dokumentation_2011.pdf  361.81 KB  PDF              Documentation  e1752deaa6888e2f856811c8d933e655
text-format.pdf                                      111.77 KB  PDF              Documentation  c4c4b5f1cd83ff232c44bc7692621da7
text-header.pdf                                      375.79 KB  PDF              Documentation  47825d0010a398bf10ce1564da2a15f0
README_construction.txt                              3.04 KB    Text file        README         11ed9229c6f8755eaa19d8bc15673ef2
dkclarin-LSPConstruction-cmdi_textCorpus.xml         16.3 KB    XML              CMDI metadata  3e436cf3b457d8d132fb24fd5d671e20
teiHeader.xsd                                        59.88 KB   XML              TEI schema     9fc5374ad34319278f437b963454f972
textCorpusProfile.xsd                                142.26 KB  XML              CMDI schema    7d6b452b88175041133ea8020e453cd8
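
The MD5 values listed for each file can be used to check that a download arrived intact. The following is a minimal sketch, not part of the record itself: it assumes the files have been saved under their listed names in a single local directory, and the names `EXPECTED_MD5`, `md5_of` and `verify_downloads` are hypothetical helpers introduced here for illustration. Only Python's standard `hashlib` and `pathlib` modules are used.

```python
import hashlib
from pathlib import Path

# MD5 checksums copied from the file listing of this item.
EXPECTED_MD5 = {
    "erhvervsOgByggestyrelsen_1.zip": "dea50e92118686c9dad9e20238d8adaf",
    "erhvervsOgByggestyrelsen_2.zip": "3767db815f1b24a2b15b2186ed7eff79",
    "Muro.zip": "6a40bd1d3ce17983d4b2d7fe3881a01b",
    "SBI.zip": "d48c1d676cc4d1a4990d36832f0c0e7c",
    "DKCLARIN_fagsprogligt_korpus_dokumentation_2011.pdf": "e1752deaa6888e2f856811c8d933e655",
    "text-format.pdf": "c4c4b5f1cd83ff232c44bc7692621da7",
    "text-header.pdf": "47825d0010a398bf10ce1564da2a15f0",
    "README_construction.txt": "11ed9229c6f8755eaa19d8bc15673ef2",
    "dkclarin-LSPConstruction-cmdi_textCorpus.xml": "3e436cf3b457d8d132fb24fd5d671e20",
    "teiHeader.xsd": "9fc5374ad34319278f437b963454f972",
    "textCorpusProfile.xsd": "7d6b452b88175041133ea8020e453cd8",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory):
    """Map each listed file name to 'ok', 'mismatch' or 'missing'."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of(path) == expected:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    # Assumption: the downloads sit in the current working directory.
    for name, status in verify_downloads(".").items():
        print(f"{status:8}  {name}")
```

A file reported as `mismatch` was most likely truncated or altered in transit and is worth downloading again. MD5 is adequate for detecting accidental corruption like this, but it is not a safeguard against deliberate tampering.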
