8,725 bytes removed ,  11:03, 1 March 2016
This article is part of the [[Advanced User's Guide]]. It lists statistics on various XML database instances that have been created with BaseX, with value and full-text indexes turned off. The URLs to the original sources, if available or public, are listed below.

[[Databases]] in BaseX are light-weight. If a database limit is reached, you can distribute your documents across multiple database instances and access all of them with a single XQuery expression.
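
As a rough illustration of the distribution idea, the following Python sketch estimates how many database instances a collection would need. The limit values are taken from the ''Limits'' row of the table in this article; the function name `databases_needed`, the assumption that documents can be spread evenly, and the assumption that a limit may be reached exactly but not exceeded are all hypothetical and not part of BaseX.

```python
# Limits copied from the "Limits" row of the statistics table.
MAX_FILES = 2**29   # 536'870'912 documents per database
MAX_NODES = 2**31   # 2'147'483'648 nodes per database


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division without floating-point rounding."""
    return -(-a // b)


def databases_needed(total_nodes: int, total_files: int) -> int:
    """Estimate the number of database instances for a collection.

    Assumption (not from the source): documents can be spread evenly,
    so only the node and file limits determine the result.
    """
    by_nodes = ceil_div(total_nodes, MAX_NODES)
    by_files = ceil_div(total_files, MAX_FILES)
    return max(1, by_nodes, by_files)


if __name__ == "__main__":
    # RuWikiHist from the table: 324'848'508 nodes in a single document.
    print(databases_needed(324_848_508, 1))
    # A hypothetical collection with five billion nodes in ten documents.
    print(databases_needed(5_000_000_000, 10))
```

In practice a single document cannot be split across databases, so a real distribution may need more instances than this estimate suggests.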
== Databases ==
{| class="wikitable sortable"
|-
! Instances !! FileSize !! #Files !! DbSize !! #Nodes !! #Attr !! #ENames !! #ANames !! #URIs !! Height
|-
| '''Limits''' || '''512 GiB'''<br/>(2^39 Bytes) || '''536'870'912'''<br/>(2^29) || ''no limit''<br/>&nbsp; || '''2'147'483'648'''<br/>(2^31) || ''no limit''<br/>&nbsp; || '''32768'''<br/>(2^15) || '''32768'''<br/>(2^15) || '''256'''<br/>(2^8) || ''no limit''<br/>&nbsp;
|-
| RuWikiHist || 421 GiB || 1 || 416 GiB || 324'848'508 || 3 || 21 || 6 || 2 || 6
|-
| ZhWikiHist || 126 GiB || 1 || 120 GiB || 179'199'662 || 3 || 21 || 6 || 2 || 6
|-
| EnWiktionary || 79 GiB || 1 || 75 GiB || 134'380'393 || 3 || 21 || 6 || 2 || 6
|-
| XMark || 55 GiB || 1 || 64 GiB || 1'615'071'348 || 2 || 74 || 9 || 0 || 13
|-
| EnWikiMeta || 54 GiB || 1 || 52 GiB || 401'456'348 || 3 || 21 || 6 || 2 || 6
|-
| MedLine || 38 GiB || 379 || 36 GiB || 1'623'764'254 || 2 || 84 || 6 || 0 || 9
|-
| iProClass || 36 GiB || 1 || 37 GiB || 1'631'218'984 || 3 || 245 || 4 || 2 || 9
|-
| Inex2009 || 31 GiB || 2'666'500 || 34 GiB || 1'336'110'639 || 15 || 28'034 || 451 || 1 || 37
|-
| CoPhIR || 29 GiB || 10'000'000 || 31 GiB || 1'104'623'376 || 10 || 42 || 42 || 0 || 8
|-
| EnWikipedia || 26 GiB || 1 || 25 GiB || 198'546'747 || 3 || 24 || 21 || 2 || 6
|-
| XMark || 22 GiB || 1 || 26 GiB || 645'997'965 || 2 || 74 || 9 || 0 || 13
|-
| InterPro || 14 GiB || 1 || 19 GiB || 860'304'235 || 5 || 7 || 15 || 0 || 4
|-
| Genome1 || 13 GiB || 1 || 13 GiB || 432'628'105 || 12 || 26 || 101 || 2 || 6
|-
| NewYorkTimes || 12 GiB || 1'855'659 || 13 GiB || 280'407'005 || 5 || 41 || 33 || 0 || 6
|-
| TrEMBL || 11 GiB || 1 || 14 GiB || 589'650'535 || 8 || 47 || 30 || 2 || 7
|-
| XMark || 11 GiB || 1 || 13 GiB || 323'083'409 || 2 || 74 || 9 || 0 || 13
|-
| IntAct || 7973 MiB || 25'624 || 6717 MiB || 297'478'392 || 7 || 64 || 22 || 2 || 14
|-
| Freebase || 7366 MiB || 1 || 10 GiB || 443'627'994 || 8 || 61 || 283 || 1 || 93
|-
| SDMX || 6356 MiB || 1 || 8028 MiB || 395'871'872 || 2 || 22 || 6 || 3 || 7
|-
| OpenStreetMap || 5312 MiB || 1 || 5171 MiB || 6'910'669 || 3 || 19 || 5 || 2 || 6
|-
| SwissProt || 4604 MiB || 1 || 5422 MiB || 241'274'406 || 8 || 70 || 39 || 2 || 7
|-
| EURLex || 4815 MiB || 1 || 5532 MiB || 167'328'039 || 23 || 186 || 46 || 1 || 12
|-
| Wikicorpus || 4492 MiB || 659'338 || 4432 MiB || 157'948'561 || 12 || 1'257 || 2'687 || 2 || 50
|-
| EnWikiRDF || 3679 MiB || 1 || 3537 MiB || 98'433'194 || 1 || 11 || 2 || 11 || 4
|-
| CoPhIR || 2695 MiB || 1'000'000 || 2882 MiB || 101'638'857 || 10 || 42 || 42 || 0 || 8
|-
| MeSH || 2091 MiB || 1 || 2410 MiB || 104'845'819 || 3 || 6 || 5 || 2 || 5
|-
| FreeDB || 1723 MiB || 1 || 2462 MiB || 102'901'519 || 2 || 7 || 3 || 0 || 4
|-
| XMark || 1134 MiB || 1 || 1303 MiB || 32'298'989 || 2 || 74 || 9 || 0 || 13
|-
| DeepFS || 810 MiB || 1 || 850 MiB || 44'821'506 || 4 || 3 || 6 || 0 || 24
|-
| LibraryUKN || 760 MiB || 1 || 918 MiB || 46'401'941 || 3 || 23 || 3 || 0 || 5
|-
| Twitter || 736 MiB || 1'177'495 || 767 MiB || 15'309'015 || 0 || 8 || 0 || 0 || 3
|-
| Organizations || 733 MiB || 1'019'132 || 724 MiB || 33'112'392 || 3 || 38 || 9 || 0 || 7
|-
| DBLP || 694 MiB || 1 || 944 MiB || 36'878'181 || 4 || 35 || 6 || 0 || 7
|-
| Feeds || 692 MiB || 444'014 || 604 MiB || 5'933'713 || 0 || 8 || 0 || 0 || 3
|-
| MedLineSupp || 477 MiB || 1 || 407 MiB || 21'602'141 || 5 || 55 || 7 || 0 || 9
|-
| AirBase || 449 MiB || 38 || 273 MiB || 14'512'851 || 1 || 111 || 5 || 0 || 11
|-
| MedLineDesc || 260 MiB || 1 || 195 MiB || 10'401'847 || 5 || 66 || 8 || 0 || 9
|-
| ZDNET || 130 MiB || 95'663 || 133 MiB || 3'060'186 || 21 || 40 || 90 || 0 || 13
|-
| JMNEdict || 124 MiB || 1 || 171 MiB || 8'592'666 || 0 || 10 || 0 || 0 || 5
|-
| XMark || 111 MiB || 1 || 130 MiB || 3'221'926 || 2 || 74 || 9 || 0 || 13
|-
| Freshmeat || 105 MiB || 1 || 86 MiB || 3'832'028 || 1 || 58 || 1 || 0 || 6
|-
| DeepFS || 83 MiB || 1 || 93 MiB || 4'842'638 || 4 || 3 || 6 || 0 || 21
|-
| Treebank || 82 MiB || 1 || 92 MiB || 3'829'513 || 1 || 250 || 1 || 0 || 37
|-
| DBLP2 || 80 MiB || 170'843 || 102 MiB || 4'044'649 || 4 || 35 || 6 || 0 || 6
|-
| DDI || 76 MiB || 3 || 39 MiB || 2'070'157 || 7 || 104 || 16 || 21 || 11
|-
| Alfred || 75 MiB || 1 || 68 MiB || 3'784'285 || 0 || 60 || 0 || 0 || 6
|-
| University || 56 MiB || 6 || 66 MiB || 3'468'606 || 1 || 28 || 4 || 0 || 5
|-
| MediaUKN || 38 MiB || 1 || 45 MiB || 1'619'443 || 3 || 21 || 3 || 0 || 5
|-
| HCIBIB2 || 32 MiB || 26'390 || 33 MiB || 617'023 || 1 || 39 || 1 || 0 || 4
|-
| Nasa || 24 MiB || 1 || 25 MiB || 845'805 || 2 || 61 || 8 || 1 || 9
|-
| MovieDB || 16 MiB || 1 || 19 MiB || 868'980 || 6 || 7 || 8 || 0 || 4
|-
| KanjiDic2 || 13 MiB || 1 || 18 MiB || 917'833 || 3 || 27 || 10 || 0 || 6
|-
| XMark || 11 MiB || 1 || 13 MiB || 324'274 || 2 || 74 || 9 || 0 || 13
|-
| Shakespeare || 7711 KiB || 1 || 9854 KiB || 327'170 || 0 || 59 || 0 || 0 || 9
|-
| TreeOfLife || 5425 KiB || 1 || 7106 KiB || 363'560 || 7 || 4 || 7 || 0 || 243
|-
| Thesaurus || 4288 KiB || 1 || 4088 KiB || 201'798 || 7 || 33 || 9 || 0 || 7
|-
| MusicXML || 3155 KiB || 17 || 2942 KiB || 171'400 || 8 || 179 || 56 || 0 || 8
|-
| BibDBPub || 2292 KiB || 3'465 || 2359 KiB || 80'178 || 1 || 54 || 1 || 0 || 4
|-
| Factbook || 1743 KiB || 1 || 1560 KiB || 77'315 || 16 || 23 || 32 || 0 || 6
|-
| XMark || 1134 KiB || 1 || 1334 KiB || 33'056 || 2 || 74 || 9 || 0 || 13
|}

This is the meaning of the attributes:

* ''FileSize'' is the original size of the input documents
* ''#Files'' indicates the number of stored XML documents
* ''DbSize'' is the size of the resulting database (excluding the [[Indexes#Value Indexes|value index structures]])
* ''#Nodes'' represents the number of XML nodes (elements, attributes, texts, etc.) stored in the database
* ''#Attr'' indicates the maximum number of attributes stored for a single element
* ''#ENames'' and ''#ANames'' reflect the number of distinct element and attribute names
* ''#URIs'' represents the number of distinct namespace URIs
* ''Height'' indicates the maximum level depth of the stored nodes

== Sources ==
{| class="wikitable sortable"
|-
! Instances !! Source
|-
| AirBase || http://air-climate.eionet.europa.eu/databases/airbase/airbasexml
|-
| Alfred || http://alfred.med.yale.edu/alfred/alfredWithDescription.zip
|-
| BibDBPub || http://inex.is.informatik.uni-duisburg.de/2005/
|-
| CoPhIR || http://cophir.isti.cnr.it/
|-
| DBLP || http://dblp.uni-trier.de/xml
|-
| DBLP2 || http://inex.is.informatik.uni-duisburg.de/2005/
|-
| DDI || http://tools.ddialliance.org/
|-
| EnWikiMeta || http://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-meta-current.xml.bz2
|-
| EnWikipedia || http://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2
|-
| EnWikiRDF || http://www.xml-benchmark.org/ generated with xmlgen
|-
| EnWiktionary || http://dumps.wikimedia.org/enwiktionary/latest/enwiktionary-latest-pages-meta-history.xml.7z
|-
| EURLex || http://www.epsiplatform.eu/
|-
| Factbook || http://www.cs.washington.edu/research/xmldatasets/www/repository.html
|-
| Freebase || http://download.freebase.com/wex
|-
| FreeDB || http://www.xmldatabases.org/radio/xmlDatabases/projects/FreeDBtoXML
|-
| Freshmeat || http://freshmeat.net/articles/freshmeat-xml-rpc-api-available
|-
| Genome1 || ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/XML/ds_ch1.xml.gz
|-
| HCIBIB2 || http://inex.is.informatik.uni-duisburg.de/2005/
|-
| Inex2009 || http://www.mpi-inf.mpg.de/departments/d5/software/inex
|-
| IntAct || ftp://ftp.ebi.ac.uk/pub/databases/intact/current/index.html
|-
| InterPro || ftp://ftp.bio.net/biomirror/interpro/match_complete.xml.gz
|-
| iProClass || ftp://ftp.pir.georgetown.edu/pir_databases/iproclass/iproclass.xml.gz
|-
| JMNEdict || ftp://ftp.monash.edu.au/pub/nihongo/enamdict_doc.html
|-
| KanjiDic2 || http://www.csse.monash.edu.au/~jwb/kanjidic2
|-
| MedLine || http://www.nlm.nih.gov/bsd
|-
| MeSH || http://www.nlm.nih.gov/mesh/xmlmesh.html
|-
| MovieDB || http://eagereyes.org/InfoVisContest2007Data.html
|-
| MusicXML || http://www.recordare.com/xml/samples.html
|-
| Nasa || http://www.cs.washington.edu/research/xmldatasets/www/repository.html
|-
| NewYorkTimes || http://www.nytimes.com/ref/membercenter/nytarchive.html
|-
| OpenStreetMap || http://dump.wiki.openstreetmap.org/osmwiki-latest-files.tar.gz
|-
| Organizations || http://www.data.gov/raw/1358
|-
| RuWikiHist || http://dumps.wikimedia.org/ruwiki/latest/ruwiki-latest-pages-meta-history.xml.7z
|-
| SDMX || http://www.metadatatechnology.com/
|-
| Shakespeare || http://www.cafeconleche.org/examples/shakespeare
|-
| SwissProt || ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase
|-
| Thesaurus || http://www.drze.de/BELIT/thesaurus
|-
| Treebank || http://www.cs.washington.edu/research/xmldatasets
|-
| TreeOfLife || http://tolweb.org/data/tolskeletaldump.xml
|-
| TrEMBL || ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase
|-
| Wikicorpus || http://www-connex.lip6.fr/~denoyer/wikipediaXML
|-
| XMark || http://www.xml-benchmark.org/ generated with xmlgen
|-
| ZDNET || http://inex.is.informatik.uni-duisburg.de/2005/
|-
| ZhWikiHist || http://dumps.wikimedia.org/zhwiki/latest/zhwiki-latest-pages-meta-history.xml.7z
|-
| LibraryUKN || generated from university library data
|-
| MediaUKN || generated from university library data
|-
| DeepFS || generated from filesystem structure
|-
| University || generated from students test data
|-
| Feeds || compiled from news feeds
|-
| Twitter || compiled from Twitter feeds
|}
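
The figures in the Databases table can also be compared programmatically. The sketch below is only an illustration: the helper names `parse_size` and `parse_count` are hypothetical, and the three sample rows are copied from the table. It converts the size strings and apostrophe-grouped counts to integers and derives the ratio of database size to input size. Because the table values are rounded, the results are approximate.

```python
# Binary unit multipliers used in the table (KiB, MiB, GiB).
UNITS = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30}


def parse_size(text: str) -> int:
    """Convert a size such as '421 GiB' to a number of bytes."""
    value, unit = text.split()
    return int(value) * UNITS[unit]


def parse_count(text: str) -> int:
    """Convert an apostrophe-grouped count such as 324'848'508 to an int."""
    return int(text.replace("'", ""))


# (instance, FileSize, DbSize, #Nodes) copied from the table.
ROWS = [
    ("RuWikiHist", "421 GiB", "416 GiB", "324'848'508"),
    ("Freebase", "7366 MiB", "10 GiB", "443'627'994"),
    ("AirBase", "449 MiB", "273 MiB", "14'512'851"),
]

if __name__ == "__main__":
    for name, file_size, db_size, nodes in ROWS:
        ratio = parse_size(db_size) / parse_size(file_size)
        per_node = parse_size(db_size) / parse_count(nodes)
        print(f"{name}: DbSize/FileSize = {ratio:.2f}, "
              f"about {per_node:.0f} bytes per node")
```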