33 bytes removed ,  11:03, 1 March 2016
This article is part of the [[Advanced User's Guide]]. It lists statistics on various XML database instances that have been created with BaseX, with value and full-text indexes turned off. The URLs of the original sources, if available or public, are listed below.

[[Databases]] in BaseX are light-weight. If a database limit is reached, you can distribute your documents across multiple database instances and access all of them with a single XQuery expression.
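
As a rough illustration of when such a distribution becomes necessary, the following Python sketch estimates a lower bound on the number of database instances a collection needs. It assumes only the two per-database limits from the ''Limits'' row of the table below (2^29 documents, 2^31 nodes). The function name and the sample collection are hypothetical, and other limits (file size, names, namespaces) are ignored.

```python
# Per-database limits, taken from the "Limits" row of the statistics table.
MAX_FILES = 2**29   # 536'870'912 documents
MAX_NODES = 2**31   # 2'147'483'648 XML nodes


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division (avoids floating-point rounding)."""
    return -(-a // b)


def instances_needed(total_files: int, total_nodes: int) -> int:
    """Lower bound on the number of database instances (hypothetical helper).

    A real distribution works per document, so more instances may be
    required than this estimate suggests.
    """
    return max(1, ceil_div(total_files, MAX_FILES), ceil_div(total_nodes, MAX_NODES))


# Hypothetical collection: 10 million documents with 5 billion nodes in total.
print(instances_needed(10_000_000, 5_000_000_000))  # prints 3
```

In this example the node count is the binding constraint: 5 billion nodes do not fit into two instances of 2^31 nodes each, so at least three are needed.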
== Databases ==
 
{| class="wikitable sortable"
|-
!Instances
!FileSize
!#Files
!DbSize
!#Nodes
!#Attr
!#ENames
!#ANames
!#URIs
!Height
|-
| '''Limits'''
|'''512 GiB'''<br/>(2^39 Bytes)
|'''536'870'912'''<br/>(2^29)
|''no limit''<br/>&nbsp;
|'''2'147'483'648'''<br/>(2^31)
|''no limit''<br/>&nbsp;
|'''32768'''<br/>(2^15)
|'''32768'''<br/>(2^15)
|'''256'''<br/>(2^8)
|''no limit''<br/>&nbsp;
|-
| RuWikiHist
| 1
| 416 GiB
| 324,848,508
| 3
| 21
| 1
| 120 GiB
| 179,199,662
| 3
| 21
| 1
| 75 GiB
| 134,380,393
| 3
| 21
| 1
| 64 GiB
| 1,615,071,348
| 2
| 74
| 1
| 52 GiB
| 401,456,348
| 3
| 21
| 379
| 36 GiB
| 1,623,764,254
| 2
| 84
| 1
| 37 GiB
| 1,631,218,984
| 3
| 245
| 9
|-
| Inex2009
| 31 GiB
| 2,666,500
| 34 GiB
| 1,336,110,639
| 15
| 28,034
| 451
| 1
| CoPhIR
| 29 GiB
| 10,000,000
| 31 GiB
| 1,104,623,376
| 10
| 42
| 1
| 25 GiB
| 198,546,747
| 3
| 24
| 1
| 26 GiB
| 645,997,965
| 2
| 74
| 1
| 19 GiB
| 860,304,235
| 5
| 7
| 1
| 13 GiB
| 432,628,105
| 12
| 26
| NewYorkTimes
| 12 GiB
| 1,855,659
| 13 GiB
| 280,407,005
| 5
| 41
| 1
| 14 GiB
| 589,650,535
| 8
| 47
| 1
| 13 GiB
| 323,083,409
| 2
| 74
| IntAct
| 7973 MiB
| 25,624
| 6717 MiB
| 297,478,392
| 7
| 64
| 1
| 10 GiB
| 443,627,994
| 8
| 61
| 1
| 8028 MiB
| 395,871,872
| 2
| 22
| 1
| 5171 MiB
| 6,910,669
| 3
| 19
| 1
| 5422 MiB
| 241,274,406
| 8
| 70
| 1
| 5532 MiB
| 167,328,039
| 23
| 186
| Wikicorpus
| 4492 MiB
| 659,338
| 4432 MiB
| 157,948,561
| 12
| 1,257
| 2,687
| 2
| 50
| 1
| 3537 MiB
| 98,433,194
| 1
| 11
| CoPhIR
| 2695 MiB
| 1,000,000
| 2882 MiB
| 101,638,857
| 10
| 42
| 1
| 2410 MiB
| 104,845,819
| 3
| 6
| 1
| 2462 MiB
| 102,901,519
| 2
| 7
| 1
| 1303 MiB
| 32,298,989
| 2
| 74
| 1
| 850 MiB
| 44,821,506
| 4
| 3
| 1
| 918 MiB
| 46,401,941
| 3
| 23
| Twitter
| 736 MiB
| 1,177,495
| 767 MiB
| 15,309,015
| 0
| 8
| Organizations
| 733 MiB
| 1,019,132
| 724 MiB
| 33,112,392
| 3
| 38
| 1
| 944 MiB
| 36,878,181
| 4
| 35
| Feeds
| 692 MiB
| 444,014
| 604 MiB
| 5,933,713
| 0
| 8
| 1
| 407 MiB
| 21,602,141
| 5
| 55
| 38
| 273 MiB
| 14,512,851
| 1
| 111
| 1
| 195 MiB
| 10,401,847
| 5
| 66
| ZDNET
| 130 MiB
| 95,663
| 133 MiB
| 3,060,186
| 21
| 40
| 1
| 171 MiB
| 8,592,666
| 0
| 10
| 1
| 130 MiB
| 3,221,926
| 2
| 74
| 1
| 86 MiB
| 3,832,028
| 1
| 58
| 1
| 93 MiB
| 4,842,638
| 4
| 3
| 1
| 92 MiB
| 3,829,513
| 1
| 250
| DBLP2
| 80 MiB
| 170,843
| 102 MiB
| 4,044,649
| 4
| 35
| 3
| 39 MiB
| 2,070,157
| 7
| 104
| 1
| 68 MiB
| 3,784,285
| 0
| 60
| 6
| 66 MiB
| 3,468,606
| 1
| 28
| 1
| 45 MiB
| 1,619,443
| 3
| 21
| HCIBIB2
| 32 MiB
| 26,390
| 33 MiB
| 617,023
| 1
| 39
| 1
| 25 MiB
| 845,805
| 2
| 61
| 1
| 19 MiB
| 868,980
| 6
| 7
| 1
| 18 MiB
| 917,833
| 3
| 27
| 1
| 13 MiB
| 324,274
| 2
| 74
| 1
| 9854 KiB
| 327,170
| 0
| 59
| 1
| 7106 KiB
| 363,560
| 7
| 4
| 1
| 4088 KiB
| 201,798
| 7
| 33
| 17
| 2942 KiB
| 171,400
| 8
| 179
| BibDBPub
| 2292 KiB
| 3,465
| 2359 KiB
| 80,178
| 1
| 54
| 1
| 1560 KiB
| 77,315
| 16
| 23
| 1
| 1334 KiB
| 33,056
| 2
| 74
| 13
|}
 
This is the meaning of the attributes:
 
* ''FileSize'' is the original size of the input documents
* ''#Files'' indicates the number of stored XML documents
* ''DbSize'' is the size of the resulting database (excluding the [[Indexes#Value Indexes|value index structures]])
* ''#Nodes'' represents the number of XML nodes (elements, attributes, texts, etc.) stored in the database
* ''#Attr'' indicates the maximum number of attributes stored for a single element
* ''#ENames'' and ''#ANames'' reflect the number of distinct element and attribute names
* ''#URIs'' represents the number of distinct namespace URIs
* ''Height'' indicates the maximum level depth of the stored nodes
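
The ''Limits'' row can be checked against any instance with a few lines of arithmetic. The Python sketch below encodes the limits as powers of two and tests the figures read from the Inex2009 row. The dictionary layout and the function name are illustrative assumptions, the file size is the rounded value from the table, and treating a value equal to the limit as still valid is also an assumption.

```python
GIB = 2**30  # bytes per GiB

# Limits row of the table; columns without a limit are omitted.
LIMITS = {
    "FileSize": 2**39,  # bytes, i.e. 512 GiB
    "#Files": 2**29,
    "#Nodes": 2**31,
    "#ENames": 2**15,
    "#ANames": 2**15,
    "#URIs": 2**8,
}


def exceeded(stats: dict) -> list:
    """Return the names of all limits that the given statistics exceed."""
    return [name for name, limit in LIMITS.items() if stats.get(name, 0) > limit]


# Values as read from the Inex2009 row (file size rounded to whole GiB).
inex2009 = {
    "FileSize": 31 * GIB,
    "#Files": 2_666_500,
    "#Nodes": 1_336_110_639,
    "#ENames": 28_034,
    "#ANames": 451,
    "#URIs": 1,
}

print(LIMITS["FileSize"] // GIB)  # prints 512
print(exceeded(inex2009))         # prints []
```

With 28,034 distinct element names against a limit of 32,768, the name limit is the one this instance comes closest to.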
== Sources ==
{| class="wikitable sortable"
|-
! Instances
! Source
|-
| AirBase
| http://air-climate.eionet.europa.eu/databases/airbase/airbasexml
|-
| Alfred
| http://alfred.med.yale.edu/alfred/alfredWithDescription.zip
|-
| BibDBPub
| http://inex.is.informatik.uni-duisburg.de/2005/
|-
| CoPhIR
| http://cophir.isti.cnr.it/
|-
| DBLP
| http://dblp.uni-trier.de/xml
|-
| DBLP2
| http://inex.is.informatik.uni-duisburg.de/2005/
|-
| DDI
| http://tools.ddialliance.org/
|-
| EnWikiMeta
| http://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-meta-current.xml.bz2
|-
| EnWikipedia
| http://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2
|-
| EnWikiRDF
| http://www.xml-benchmark.org/ &nbsp; generated with xmlgen
|-
| EnWiktionary
| http://dumps.wikimedia.org/enwiktionary/latest/enwiktionary-latest-pages-meta-history.xml.7z
|-
| EURLex
| http://www.epsiplatform.eu/
|-
| Factbook
| http://www.cs.washington.edu/research/xmldatasets/www/repository.html
|-
| Freebase
| http://download.freebase.com/wex
|-
| FreeDB
| http://www.xmldatabases.org/radio/xmlDatabases/projects/FreeDBtoXML
|-
| Freshmeat
| http://freshmeat.net/articles/freshmeat-xml-rpc-api-available
|-
| Genome1
| ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/XML/ds_ch1.xml.gz
|-
| HCIBIB2
| http://inex.is.informatik.uni-duisburg.de/2005/
|-
| Inex2009
| http://www.mpi-inf.mpg.de/departments/d5/software/inex
|-
| IntAct
| ftp://ftp.ebi.ac.uk/pub/databases/intact/current/index.html
|-
| InterPro
| ftp://ftp.bio.net/biomirror/interpro/match_complete.xml.gz
|-
| iProClass
| ftp://ftp.pir.georgetown.edu/pir_databases/iproclass/iproclass.xml.gz
|-
| JMNEdict
| ftp://ftp.monash.edu.au/pub/nihongo/enamdict_doc.html
|-
| KanjiDic2
| http://www.csse.monash.edu.au/~jwb/kanjidic2
|-
| MedLine
| http://www.nlm.nih.gov/bsd
|-
| MeSH
| http://www.nlm.nih.gov/mesh/xmlmesh.html
|-
| MovieDB
| http://eagereyes.org/InfoVisContest2007Data.html
|-
| MusicXML
| http://www.recordare.com/xml/samples.html
|-
| Nasa
| http://www.cs.washington.edu/research/xmldatasets/www/repository.html
|-
| NewYorkTimes
| http://www.nytimes.com/ref/membercenter/nytarchive.html
|-
| OpenStreetMap
| http://dump.wiki.openstreetmap.org/osmwiki-latest-files.tar.gz
|-
| Organizations
| http://www.data.gov/raw/1358
|-
| RuWikiHist
| http://dumps.wikimedia.org/ruwiki/latest/ruwiki-latest-pages-meta-history.xml.7z
|-
| SDMX
| http://www.metadatatechnology.com/
|-
| Shakespeare
| http://www.cafeconleche.org/examples/shakespeare
|-
| SwissProt
| ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase
|-
| Thesaurus
| http://www.drze.de/BELIT/thesaurus
|-
| Treebank
| http://www.cs.washington.edu/research/xmldatasets
|-
| TreeOfLife
| http://tolweb.org/data/tolskeletaldump.xml
|-
| TrEMBL
| ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase
|-
| Wikicorpus
| http://www-connex.lip6.fr/~denoyer/wikipediaXML
|-
| XMark
| http://www.xml-benchmark.org/ &nbsp; generated with xmlgen
|-
| ZDNET
| http://inex.is.informatik.uni-duisburg.de/2005/
|-
| ZhWikiHist
| http://dumps.wikimedia.org/zhwiki/latest/zhwiki-latest-pages-meta-history.xml.7z
|-
| LibraryUKN
| generated from university library data
|-
| MediaUKN
| generated from university library data
|-
| DeepFS
| generated from filesystem structure
|-
| University
| generated from students test data
|-
| Feeds
| compiled from news feeds
|-
| Twitter
| compiled from Twitter feeds
|}